* [PATCH 0/6] Change ->mkdir() and vfs_mkdir() to return a dentry.
From: NeilBrown @ 2025-02-20 23:36 UTC
To: Alexander Viro, Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li, Ilya Dryomov, Richard Weinberger, Anton Ivanov, Johannes Berg, Trond Myklebust, Anna Schumaker, Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, Sergey Senozhatsky
Cc: linux-fsdevel, linux-kernel, linux-cifs, linux-nfs, linux-um, ceph-devel, netfs

I'm posting this to a wider audience now as I think it is close to its final form. I have not included every fs maintainer explicitly (though this patch touches every writable FS) but hope that fsdevel will catch enough of those. I have included the affected clients of vfs_mkdir: nfsd, smb/server, cachefiles, and the filesystems with non-trivial changes: nfs, cephfs, hostfs, fuse.

mkdir is unique among object creation interfaces as there can only be one dentry for a directory inode. There is a possibility of races which could result in the inode created by mkdir already having a dentry when mkdir comes to attach one. To cope with this, three users of vfs_mkdir() sometimes do a lookup to find the correct dentry when the one that was passed in wasn't used. This lookup is clumsy and racy.

This patch set changes the mkdir interface so that the filesystem can provide the correct dentry. Sometimes this still requires a lookup, which can be racy, but having the filesystem do it limits this to only when it is absolutely necessary.

So this series changes ->mkdir and vfs_mkdir() to allow a dentry to be returned, changes a few filesystems to actually return a dentry sometimes, and changes the callers of vfs_mkdir() to use the returned dentry.

I think it best if this could all land through the VFS tree, and I ask the maintainers of:
  cachefiles nfsd smb/server hostfs ceph nfs fuse
to provide a Reviewed-by.

Thanks,
NeilBrown

* [PATCH 1/6] Change inode_operations.mkdir to return struct dentry *
From: NeilBrown @ 2025-02-20 23:36 UTC
To: Alexander Viro, Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li, Ilya Dryomov, Richard Weinberger, Anton Ivanov, Johannes Berg, Trond Myklebust, Anna Schumaker, Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, Sergey Senozhatsky
Cc: linux-fsdevel, linux-kernel, linux-cifs, linux-nfs, linux-um, ceph-devel, netfs

Some filesystems, such as NFS, cifs, ceph, and fuse, do not have complete control of sequencing on the actual filesystem (e.g. on a different server) and may find that the inode created for a mkdir request already exists in the icache and dcache by the time the mkdir request returns. For example, if the filesystem is mounted twice the directory could be visible on the other mount before it is on the original mount, and a pair of name_to_handle_at(), open_by_handle_at() calls could instantiate the directory inode with an IS_ROOT() dentry before the first mkdir returns.

This means that the dentry passed to ->mkdir() may not be the one that is associated with the inode after the ->mkdir() completes. Some callers need to interact with the inode after the ->mkdir completes and they currently need to perform a lookup in the (rare) case that the dentry is no longer hashed.

This lookup-after-mkdir requires that the directory remains locked to avoid races. Planned future patches to lock the dentry rather than the directory will mean that this lookup cannot be performed atomically with the mkdir.

To remove this barrier, this patch changes ->mkdir to return the resulting dentry if it is different from the one passed in. Possible returns are:
  NULL - the directory was created and no other dentry was used
  ERR_PTR() - an error occurred
  non-NULL - this other dentry was spliced in

This patch only changes file-systems to return "ERR_PTR(err)" instead of "err" or equivalent transformations. Subsequent patches will make further changes to some file-systems to return a correct dentry.

Not all filesystems reliably result in a positive hashed dentry:

- NFS, cifs, hostfs will sometimes need to perform a lookup of the name to get inode information. Races could result in this returning something different. Note that this lookup is non-atomic which is what we are trying to avoid. Placing the lookup in filesystem code means it only happens when the filesystem has no other option.
- kernfs and tracefs leave the dentry negative and the ->revalidate operation ensures that lookup will be called to correctly populate the dentry. This could be fixed but I don't think it is important to any of the users of vfs_mkdir() which look at the dentry.

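[Editorial illustration, not part of the patch.] The three-way return convention listed above can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: ERR_PTR(), PTR_ERR() and IS_ERR() are simplified stand-ins for the kernel helpers of the same names, struct dentry is reduced to a name, and mkdir_tail() and mkdir_result() are hypothetical helpers invented for this example. Reference counting and locking are ignored entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

/* Hypothetical, heavily reduced model of a dentry. */
struct dentry {
	const char *name;
};

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses encode negative errno values. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical tail of a converted ->mkdir(): "return err" becomes
 * "return ERR_PTR(err)".  When err is 0 this yields NULL, which is the
 * "directory created, passed-in dentry used" case.
 */
static inline struct dentry *mkdir_tail(int err)
{
	return ERR_PTR(err);
}

/*
 * Hypothetical caller-side helper: given the dentry passed to ->mkdir()
 * and the value it returned, pick the dentry that now names the new
 * directory.  On failure, returns NULL and stores the errno in *err.
 */
static inline struct dentry *mkdir_result(struct dentry *passed,
					  struct dentry *ret, int *err)
{
	*err = 0;
	if (IS_ERR(ret)) {
		*err = (int)PTR_ERR(ret);	/* ERR_PTR(): an error occurred */
		return NULL;
	}
	if (ret)
		return ret;			/* non-NULL: another dentry was spliced in */
	return passed;				/* NULL: the passed-in dentry was used */
}
```

The point of the model is that the mechanical "return ERR_PTR(err)" conversion needs no separate success path, because ERR_PTR(0) is NULL; only filesystems that may splice in a different dentry need to return anything else.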
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz> (VFS, ext2, ext4, ocfs2, udf)
Signed-off-by: NeilBrown <neilb@suse.de>
---
 Documentation/filesystems/locking.rst |  2 +-
 Documentation/filesystems/porting.rst | 19 +++++++++++++++++++
 Documentation/filesystems/vfs.rst     | 23 +++++++++++++++++++++--
 fs/9p/vfs_inode.c                     |  7 +++----
 fs/9p/vfs_inode_dotl.c                |  8 ++++----
 fs/affs/affs.h                        |  2 +-
 fs/affs/namei.c                       |  8 ++++----
 fs/afs/dir.c                          | 12 ++++++------
 fs/autofs/root.c                      | 14 +++++++-------
 fs/bad_inode.c                        |  6 +++---
 fs/bcachefs/fs.c                      |  6 +++---
 fs/btrfs/inode.c                      |  8 ++++----
 fs/ceph/dir.c                         |  8 ++++----
 fs/coda/dir.c                         | 14 +++++++-------
 fs/configfs/dir.c                     |  6 +++---
 fs/ecryptfs/inode.c                   |  6 +++---
 fs/exfat/namei.c                      |  8 ++++----
 fs/ext2/namei.c                       |  9 +++++----
 fs/ext4/namei.c                       | 10 +++++-----
 fs/f2fs/namei.c                       | 14 +++++++-------
 fs/fat/namei_msdos.c                  |  8 ++++----
 fs/fat/namei_vfat.c                   |  8 ++++----
 fs/fuse/dir.c                         |  6 +++---
 fs/gfs2/inode.c                       |  9 +++++----
 fs/hfs/dir.c                          | 10 +++++-----
 fs/hfsplus/dir.c                      |  6 +++---
 fs/hostfs/hostfs_kern.c               |  8 ++++----
 fs/hpfs/namei.c                       | 10 +++++-----
 fs/hugetlbfs/inode.c                  |  6 +++---
 fs/jffs2/dir.c                        | 18 +++++++++---------
 fs/jfs/namei.c                        |  8 ++++----
 fs/kernfs/dir.c                       | 12 ++++++------
 fs/minix/namei.c                      |  8 ++++----
 fs/namei.c                            | 15 ++++++++++++---
 fs/nfs/dir.c                          |  8 ++++----
 fs/nfs/internal.h                     |  4 ++--
 fs/nilfs2/namei.c                     |  8 ++++----
 fs/ntfs3/namei.c                      |  8 ++++----
 fs/ocfs2/dlmfs/dlmfs.c                | 10 +++++-----
 fs/ocfs2/namei.c                      | 10 +++++-----
 fs/omfs/dir.c                         |  6 +++---
 fs/orangefs/namei.c                   |  8 ++++----
 fs/overlayfs/dir.c                    |  9 +++++----
 fs/ramfs/inode.c                      |  6 +++---
 fs/smb/client/cifsfs.h                |  4 ++--
 fs/smb/client/inode.c                 | 10 +++++-----
 fs/sysv/namei.c                       |  8 ++++----
 fs/tracefs/inode.c                    | 10 +++++-----
 fs/ubifs/dir.c                        | 10 +++++-----
 fs/udf/namei.c                        | 12 ++++++------
 fs/ufs/namei.c                        |  8 ++++----
 fs/vboxsf/dir.c                       |  8 ++++----
 fs/xfs/xfs_iops.c                     |  4 ++--
 include/linux/fs.h                    |  4 ++--
 kernel/bpf/inode.c                    |  8 ++++----
 mm/shmem.c                            |  8 ++++----
 security/apparmor/apparmorfs.c        |  8 ++++----
 57 files changed, 275 insertions(+), 226 deletions(-)

diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index d20a32b77b60..0ec0bb6eb0fb 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -66,7 +66,7 @@ prototypes::
 	int (*link) (struct dentry *,struct inode *,struct dentry *);
 	int (*unlink) (struct inode *,struct dentry *);
 	int (*symlink) (struct mnt_idmap *, struct inode *,struct dentry *,const char *);
-	int (*mkdir) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t);
+	struct dentry *(*mkdir) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t);
 	int (*rmdir) (struct inode *,struct dentry *);
 	int (*mknod) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t,dev_t);
 	int (*rename) (struct mnt_idmap *, struct inode *, struct dentry *,
diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst
index 3ed3f39ecf71..d7171057aa3d 100644
--- a/Documentation/filesystems/porting.rst
+++ b/Documentation/filesystems/porting.rst
@@ -1178,3 +1178,22 @@ these conditions don't require explicit checks:
 
 LOOKUP_EXCL now means "target must not exist". It can be combined with
 LOOK_CREATE or LOOKUP_RENAME_TARGET.
+
+---
+
+** mandatory**
+
+->mkdir() now returns a 'struct dentry *'. If the created inode is
+found to already be in cache and have a dentry (often IS_ROOT), it will
+need to be spliced into the given name in place of the given dentry.
+That dentry now needs to be returned. If the original dentry is used,
+NULL should be returned. Any error should be returned with
+ERR_PTR().
+
+In general, filesystems which use d_instantiate_new() to install the new
+inode can safely return NULL. Filesystems which may not have an I_NEW inode
+should use d_drop();d_splice_alias() and return the result of the latter.
+
+If a positive dentry cannot be returned for some reason, in-kernel
+clients such as cachefiles, nfsd, smb/server may not perform ideally but
+will fail-safe.
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index 31eea688609a..ae79c30b6c0c 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -495,7 +495,7 @@ As of kernel 2.6.22, the following members are defined:
 		int (*link) (struct dentry *,struct inode *,struct dentry *);
 		int (*unlink) (struct inode *,struct dentry *);
 		int (*symlink) (struct mnt_idmap *, struct inode *,struct dentry *,const char *);
-		int (*mkdir) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t);
+		struct dentry *(*mkdir) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t);
 		int (*rmdir) (struct inode *,struct dentry *);
 		int (*mknod) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t,dev_t);
 		int (*rename) (struct mnt_idmap *, struct inode *, struct dentry *,
@@ -562,7 +562,26 @@ otherwise noted.
 ``mkdir``
 	called by the mkdir(2) system call. Only required if you want
 	to support creating subdirectories. You will probably need to
-	call d_instantiate() just as you would in the create() method
+	call d_instantiate_new() just as you would in the create() method.
+
+	If d_instantiate_new() is not used and if the fh_to_dentry()
+	export operation is provided, or if the storage might be
+	accessible by another path (e.g. with a network filesystem)
+	then more care may be needed. Importantly d_instantate()
+	should not be used with an inode that is no longer I_NEW if there
+	any chance that the inode could already be attached to a dentry.
+	This is because of a hard rule in the VFS that a directory must
+	only ever have one dentry.
+
+	For example, if an NFS filesystem is mounted twice the new directory
+	could be visible on the other mount before it is on the original
+	mount, and a pair of name_to_handle_at(), open_by_handle_at()
+	calls could instantiate the directory inode with an IS_ROOT()
+	dentry before the first mkdir returns.
+
+	If there is any chance this could happen, then the new inode
+	should be d_drop()ed and attached with d_splice_alias(). The
+	returned dentry (if any) should be returned by ->mkdir().
 
 ``rmdir``
 	called by the rmdir(2) system call. Only required if you want
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 3e68521f4e2f..399d455d50d6 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -669,8 +669,8 @@ v9fs_vfs_create(struct mnt_idmap *idmap, struct inode *dir,
  *
  */
 
-static int v9fs_vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
-			  struct dentry *dentry, umode_t mode)
+static struct dentry *v9fs_vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
+				     struct dentry *dentry, umode_t mode)
 {
 	int err;
 	u32 perm;
@@ -692,8 +692,7 @@ static int v9fs_vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 
 	if (fid)
 		p9_fid_put(fid);
-
-	return err;
+	return ERR_PTR(err);
 }
 
 /**
diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c
index 143ac03b7425..cc2007be2173 100644
--- a/fs/9p/vfs_inode_dotl.c
+++ b/fs/9p/vfs_inode_dotl.c
@@ -350,9 +350,9 @@ v9fs_vfs_atomic_open_dotl(struct inode *dir, struct dentry *dentry,
  *
  */
 
-static int v9fs_vfs_mkdir_dotl(struct mnt_idmap *idmap,
-			       struct inode *dir, struct dentry *dentry,
-			       umode_t omode)
+static struct dentry *v9fs_vfs_mkdir_dotl(struct mnt_idmap *idmap,
+					  struct inode *dir, struct dentry *dentry,
+					  umode_t omode)
 {
 	int err;
 	struct v9fs_session_info *v9ses;
@@ -417,7 +417,7 @@ static int v9fs_vfs_mkdir_dotl(struct mnt_idmap *idmap,
 	p9_fid_put(fid);
 	v9fs_put_acl(dacl, pacl);
 	p9_fid_put(dfid);
-	return err;
+	return ERR_PTR(err);
 }
 
 static int
diff --git a/fs/affs/affs.h b/fs/affs/affs.h
index
e8c2c4535cb3..ac4e9a02910b 100644 --- a/fs/affs/affs.h +++ b/fs/affs/affs.h @@ -168,7 +168,7 @@ extern struct dentry *affs_lookup(struct inode *dir, struct dentry *dentry, unsi extern int affs_unlink(struct inode *dir, struct dentry *dentry); extern int affs_create(struct mnt_idmap *idmap, struct inode *dir, struct dentry *dentry, umode_t mode, bool); -extern int affs_mkdir(struct mnt_idmap *idmap, struct inode *dir, +extern struct dentry *affs_mkdir(struct mnt_idmap *idmap, struct inode *dir, struct dentry *dentry, umode_t mode); extern int affs_rmdir(struct inode *dir, struct dentry *dentry); extern int affs_link(struct dentry *olddentry, struct inode *dir, diff --git a/fs/affs/namei.c b/fs/affs/namei.c index 8c154490a2d6..f883be50db12 100644 --- a/fs/affs/namei.c +++ b/fs/affs/namei.c @@ -273,7 +273,7 @@ affs_create(struct mnt_idmap *idmap, struct inode *dir, return 0; } -int +struct dentry * affs_mkdir(struct mnt_idmap *idmap, struct inode *dir, struct dentry *dentry, umode_t mode) { @@ -285,7 +285,7 @@ affs_mkdir(struct mnt_idmap *idmap, struct inode *dir, inode = affs_new_inode(dir); if (!inode) - return -ENOSPC; + return ERR_PTR(-ENOSPC); inode->i_mode = S_IFDIR | mode; affs_mode_to_prot(inode); @@ -298,9 +298,9 @@ affs_mkdir(struct mnt_idmap *idmap, struct inode *dir, clear_nlink(inode); mark_inode_dirty(inode); iput(inode); - return error; + return ERR_PTR(error); } - return 0; + return NULL; } int diff --git a/fs/afs/dir.c b/fs/afs/dir.c index 02cbf38e1a77..5bddcc20786e 100644 --- a/fs/afs/dir.c +++ b/fs/afs/dir.c @@ -33,8 +33,8 @@ static bool afs_lookup_filldir(struct dir_context *ctx, const char *name, int nl loff_t fpos, u64 ino, unsigned dtype); static int afs_create(struct mnt_idmap *idmap, struct inode *dir, struct dentry *dentry, umode_t mode, bool excl); -static int afs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode); +static struct dentry *afs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct 
dentry *dentry, umode_t mode); static int afs_rmdir(struct inode *dir, struct dentry *dentry); static int afs_unlink(struct inode *dir, struct dentry *dentry); static int afs_link(struct dentry *from, struct inode *dir, @@ -1315,8 +1315,8 @@ static const struct afs_operation_ops afs_mkdir_operation = { /* * create a directory on an AFS filesystem */ -static int afs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *afs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct afs_operation *op; struct afs_vnode *dvnode = AFS_FS_I(dir); @@ -1328,7 +1328,7 @@ static int afs_mkdir(struct mnt_idmap *idmap, struct inode *dir, op = afs_alloc_operation(NULL, dvnode->volume); if (IS_ERR(op)) { d_drop(dentry); - return PTR_ERR(op); + return ERR_CAST(op); } fscache_use_cookie(afs_vnode_cache(dvnode), true); @@ -1344,7 +1344,7 @@ static int afs_mkdir(struct mnt_idmap *idmap, struct inode *dir, op->ops = &afs_mkdir_operation; ret = afs_do_sync_operation(op); afs_dir_unuse_cookie(dvnode, ret); - return ret; + return ERR_PTR(ret); } /* diff --git a/fs/autofs/root.c b/fs/autofs/root.c index 530d18827e35..174c7205fee4 100644 --- a/fs/autofs/root.c +++ b/fs/autofs/root.c @@ -15,8 +15,8 @@ static int autofs_dir_symlink(struct mnt_idmap *, struct inode *, struct dentry *, const char *); static int autofs_dir_unlink(struct inode *, struct dentry *); static int autofs_dir_rmdir(struct inode *, struct dentry *); -static int autofs_dir_mkdir(struct mnt_idmap *, struct inode *, - struct dentry *, umode_t); +static struct dentry *autofs_dir_mkdir(struct mnt_idmap *, struct inode *, + struct dentry *, umode_t); static long autofs_root_ioctl(struct file *, unsigned int, unsigned long); #ifdef CONFIG_COMPAT static long autofs_root_compat_ioctl(struct file *, @@ -720,9 +720,9 @@ static int autofs_dir_rmdir(struct inode *dir, struct dentry *dentry) return 0; } -static int autofs_dir_mkdir(struct 
mnt_idmap *idmap, - struct inode *dir, struct dentry *dentry, - umode_t mode) +static struct dentry *autofs_dir_mkdir(struct mnt_idmap *idmap, + struct inode *dir, struct dentry *dentry, + umode_t mode) { struct autofs_sb_info *sbi = autofs_sbi(dir->i_sb); struct autofs_info *ino = autofs_dentry_ino(dentry); @@ -739,7 +739,7 @@ static int autofs_dir_mkdir(struct mnt_idmap *idmap, inode = autofs_get_inode(dir->i_sb, S_IFDIR | mode); if (!inode) - return -ENOMEM; + return ERR_PTR(-ENOMEM); d_add(dentry, inode); if (sbi->version < 5) @@ -751,7 +751,7 @@ static int autofs_dir_mkdir(struct mnt_idmap *idmap, inc_nlink(dir); inode_set_mtime_to_ts(dir, inode_set_ctime_current(dir)); - return 0; + return NULL; } /* Get/set timeout ioctl() operation */ diff --git a/fs/bad_inode.c b/fs/bad_inode.c index 316d88da2ce1..0ef9bcb744dd 100644 --- a/fs/bad_inode.c +++ b/fs/bad_inode.c @@ -58,10 +58,10 @@ static int bad_inode_symlink(struct mnt_idmap *idmap, return -EIO; } -static int bad_inode_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *bad_inode_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { - return -EIO; + return ERR_PTR(-EIO); } static int bad_inode_rmdir (struct inode *dir, struct dentry *dentry) diff --git a/fs/bcachefs/fs.c b/fs/bcachefs/fs.c index 90ade8f648d9..1c94a680fcce 100644 --- a/fs/bcachefs/fs.c +++ b/fs/bcachefs/fs.c @@ -858,10 +858,10 @@ static int bch2_symlink(struct mnt_idmap *idmap, return bch2_err_class(ret); } -static int bch2_mkdir(struct mnt_idmap *idmap, - struct inode *vdir, struct dentry *dentry, umode_t mode) +static struct dentry *bch2_mkdir(struct mnt_idmap *idmap, + struct inode *vdir, struct dentry *dentry, umode_t mode) { - return bch2_mknod(idmap, vdir, dentry, mode|S_IFDIR, 0); + return ERR_PTR(bch2_mknod(idmap, vdir, dentry, mode|S_IFDIR, 0)); } static int bch2_rename2(struct mnt_idmap *idmap, diff --git a/fs/btrfs/inode.c 
b/fs/btrfs/inode.c index a9322601ab5c..851d3e8a06a7 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -6739,18 +6739,18 @@ static int btrfs_link(struct dentry *old_dentry, struct inode *dir, return err; } -static int btrfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *btrfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct inode *inode; inode = new_inode(dir->i_sb); if (!inode) - return -ENOMEM; + return ERR_PTR(-ENOMEM); inode_init_owner(idmap, inode, dir, S_IFDIR | mode); inode->i_op = &btrfs_dir_inode_operations; inode->i_fop = &btrfs_dir_file_operations; - return btrfs_create_common(dir, dentry, inode); + return ERR_PTR(btrfs_create_common(dir, dentry, inode)); } static noinline int uncompress_inline(struct btrfs_path *path, diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index 62e99e65250d..39e0f240de06 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -1092,8 +1092,8 @@ static int ceph_symlink(struct mnt_idmap *idmap, struct inode *dir, return err; } -static int ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dir->i_sb); struct ceph_client *cl = mdsc->fsc->client; @@ -1104,7 +1104,7 @@ static int ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir, err = ceph_wait_on_conflict_unlink(dentry); if (err) - return err; + return ERR_PTR(err); if (ceph_snap(dir) == CEPH_SNAPDIR) { /* mkdir .snap/foo is a MKSNAP */ @@ -1173,7 +1173,7 @@ static int ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir, else d_drop(dentry); ceph_release_acl_sec_ctx(&as_ctx); - return err; + return ERR_PTR(err); } static int ceph_link(struct dentry *old_dentry, struct inode *dir, diff --git a/fs/coda/dir.c b/fs/coda/dir.c index a3e2dfeedfbf..ab69d8f0cec2 100644 
--- a/fs/coda/dir.c +++ b/fs/coda/dir.c @@ -166,8 +166,8 @@ static int coda_create(struct mnt_idmap *idmap, struct inode *dir, return error; } -static int coda_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *de, umode_t mode) +static struct dentry *coda_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *de, umode_t mode) { struct inode *inode; struct coda_vattr attrs; @@ -177,14 +177,14 @@ static int coda_mkdir(struct mnt_idmap *idmap, struct inode *dir, struct CodaFid newfid; if (is_root_inode(dir) && coda_iscontrol(name, len)) - return -EPERM; + return ERR_PTR(-EPERM); attrs.va_mode = mode; - error = venus_mkdir(dir->i_sb, coda_i2f(dir), + error = venus_mkdir(dir->i_sb, coda_i2f(dir), name, len, &newfid, &attrs); if (error) goto err_out; - + inode = coda_iget(dir->i_sb, &newfid, &attrs); if (IS_ERR(inode)) { error = PTR_ERR(inode); @@ -195,10 +195,10 @@ static int coda_mkdir(struct mnt_idmap *idmap, struct inode *dir, coda_dir_inc_nlink(dir); coda_dir_update_mtime(dir); d_instantiate(de, inode); - return 0; + return NULL; err_out: d_drop(de); - return error; + return ERR_PTR(error); } /* try to make de an entry in dir_inodde linked to source_de */ diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c index 7d10278db30d..5568cb74b322 100644 --- a/fs/configfs/dir.c +++ b/fs/configfs/dir.c @@ -1280,8 +1280,8 @@ int configfs_depend_item_unlocked(struct configfs_subsystem *caller_subsys, } EXPORT_SYMBOL(configfs_depend_item_unlocked); -static int configfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *configfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { int ret = 0; int module_got = 0; @@ -1461,7 +1461,7 @@ static int configfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, put_fragment(frag); out: - return ret; + return ERR_PTR(ret); } static int configfs_rmdir(struct inode *dir, struct dentry *dentry) diff --git 
a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c index a9819ddb1ab8..6315dd194228 100644 --- a/fs/ecryptfs/inode.c +++ b/fs/ecryptfs/inode.c @@ -503,8 +503,8 @@ static int ecryptfs_symlink(struct mnt_idmap *idmap, return rc; } -static int ecryptfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *ecryptfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { int rc; struct dentry *lower_dentry; @@ -526,7 +526,7 @@ static int ecryptfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, inode_unlock(lower_dir); if (d_really_is_negative(dentry)) d_drop(dentry); - return rc; + return ERR_PTR(rc); } static int ecryptfs_rmdir(struct inode *dir, struct dentry *dentry) diff --git a/fs/exfat/namei.c b/fs/exfat/namei.c index 691dd77b6ab5..1660c9bbcfa9 100644 --- a/fs/exfat/namei.c +++ b/fs/exfat/namei.c @@ -835,8 +835,8 @@ static int exfat_unlink(struct inode *dir, struct dentry *dentry) return err; } -static int exfat_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *exfat_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct super_block *sb = dir->i_sb; struct inode *inode; @@ -846,7 +846,7 @@ static int exfat_mkdir(struct mnt_idmap *idmap, struct inode *dir, loff_t size = i_size_read(dir); if (unlikely(exfat_forced_shutdown(sb))) - return -EIO; + return ERR_PTR(-EIO); mutex_lock(&EXFAT_SB(sb)->s_lock); exfat_set_volume_dirty(sb); @@ -877,7 +877,7 @@ static int exfat_mkdir(struct mnt_idmap *idmap, struct inode *dir, unlock: mutex_unlock(&EXFAT_SB(sb)->s_lock); - return err; + return ERR_PTR(err); } static int exfat_check_dir_empty(struct super_block *sb, diff --git a/fs/ext2/namei.c b/fs/ext2/namei.c index 8346ab9534c1..bde617a66cec 100644 --- a/fs/ext2/namei.c +++ b/fs/ext2/namei.c @@ -225,15 +225,16 @@ static int ext2_link (struct dentry * old_dentry, struct inode * dir, 
return err; } -static int ext2_mkdir(struct mnt_idmap * idmap, - struct inode * dir, struct dentry * dentry, umode_t mode) +static struct dentry *ext2_mkdir(struct mnt_idmap * idmap, + struct inode * dir, struct dentry * dentry, + umode_t mode) { struct inode * inode; int err; err = dquot_initialize(dir); if (err) - return err; + return ERR_PTR(err); inode_inc_link_count(dir); @@ -258,7 +259,7 @@ static int ext2_mkdir(struct mnt_idmap * idmap, d_instantiate_new(dentry, inode); out: - return err; + return ERR_PTR(err); out_fail: inode_dec_link_count(inode); diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index 536d56d15072..716cc6096870 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -3004,19 +3004,19 @@ int ext4_init_new_dir(handle_t *handle, struct inode *dir, return err; } -static int ext4_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *ext4_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { handle_t *handle; struct inode *inode; int err, err2 = 0, credits, retries = 0; if (EXT4_DIR_LINK_MAX(dir)) - return -EMLINK; + return ERR_PTR(-EMLINK); err = dquot_initialize(dir); if (err) - return err; + return ERR_PTR(err); credits = (EXT4_DATA_TRANS_BLOCKS(dir->i_sb) + EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3); @@ -3066,7 +3066,7 @@ static int ext4_mkdir(struct mnt_idmap *idmap, struct inode *dir, out_retry: if (err == -ENOSPC && ext4_should_retry_alloc(dir->i_sb, &retries)) goto retry; - return err; + return ERR_PTR(err); } /* diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c index a278c7da8177..24dca4dc85a9 100644 --- a/fs/f2fs/namei.c +++ b/fs/f2fs/namei.c @@ -684,23 +684,23 @@ static int f2fs_symlink(struct mnt_idmap *idmap, struct inode *dir, return err; } -static int f2fs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *f2fs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t 
mode) { struct f2fs_sb_info *sbi = F2FS_I_SB(dir); struct inode *inode; int err; if (unlikely(f2fs_cp_error(sbi))) - return -EIO; + return ERR_PTR(-EIO); err = f2fs_dquot_initialize(dir); if (err) - return err; + return ERR_PTR(err); inode = f2fs_new_inode(idmap, dir, S_IFDIR | mode, NULL); if (IS_ERR(inode)) - return PTR_ERR(inode); + return ERR_CAST(inode); inode->i_op = &f2fs_dir_inode_operations; inode->i_fop = &f2fs_dir_operations; @@ -722,12 +722,12 @@ static int f2fs_mkdir(struct mnt_idmap *idmap, struct inode *dir, f2fs_sync_fs(sbi->sb, 1); f2fs_balance_fs(sbi, true); - return 0; + return NULL; out_fail: clear_inode_flag(inode, FI_INC_LINK); f2fs_handle_failed_inode(inode); - return err; + return ERR_PTR(err); } static int f2fs_rmdir(struct inode *dir, struct dentry *dentry) diff --git a/fs/fat/namei_msdos.c b/fs/fat/namei_msdos.c index f06f6ba643cc..23e9b9371ec3 100644 --- a/fs/fat/namei_msdos.c +++ b/fs/fat/namei_msdos.c @@ -339,8 +339,8 @@ static int msdos_rmdir(struct inode *dir, struct dentry *dentry) } /***** Make a directory */ -static int msdos_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *msdos_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct super_block *sb = dir->i_sb; struct fat_slot_info sinfo; @@ -389,13 +389,13 @@ static int msdos_mkdir(struct mnt_idmap *idmap, struct inode *dir, mutex_unlock(&MSDOS_SB(sb)->s_lock); fat_flush_inodes(sb, dir, inode); - return 0; + return NULL; out_free: fat_free_clusters(dir, cluster); out: mutex_unlock(&MSDOS_SB(sb)->s_lock); - return err; + return ERR_PTR(err); } /***** Unlink a file */ diff --git a/fs/fat/namei_vfat.c b/fs/fat/namei_vfat.c index 926c26e90ef8..dd910edd2404 100644 --- a/fs/fat/namei_vfat.c +++ b/fs/fat/namei_vfat.c @@ -841,8 +841,8 @@ static int vfat_unlink(struct inode *dir, struct dentry *dentry) return err; } -static int vfat_mkdir(struct mnt_idmap *idmap, struct inode *dir, - 
struct dentry *dentry, umode_t mode) +static struct dentry *vfat_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct super_block *sb = dir->i_sb; struct inode *inode; @@ -877,13 +877,13 @@ static int vfat_mkdir(struct mnt_idmap *idmap, struct inode *dir, d_instantiate(dentry, inode); mutex_unlock(&MSDOS_SB(sb)->s_lock); - return 0; + return NULL; out_free: fat_free_clusters(dir, cluster); out: mutex_unlock(&MSDOS_SB(sb)->s_lock); - return err; + return ERR_PTR(err); } static int vfat_get_dotdot_de(struct inode *inode, struct buffer_head **bh, diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c index 198862b086ff..5bb65f38bfb8 100644 --- a/fs/fuse/dir.c +++ b/fs/fuse/dir.c @@ -898,8 +898,8 @@ static int fuse_tmpfile(struct mnt_idmap *idmap, struct inode *dir, return err; } -static int fuse_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *entry, umode_t mode) +static struct dentry *fuse_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *entry, umode_t mode) { struct fuse_mkdir_in inarg; struct fuse_mount *fm = get_fuse_mount(dir); @@ -917,7 +917,7 @@ static int fuse_mkdir(struct mnt_idmap *idmap, struct inode *dir, args.in_args[0].value = &inarg; args.in_args[1].size = entry->d_name.len + 1; args.in_args[1].value = entry->d_name.name; - return create_new_entry(idmap, fm, &args, dir, entry, S_IFDIR); + return ERR_PTR(create_new_entry(idmap, fm, &args, dir, entry, S_IFDIR)); } static int fuse_symlink(struct mnt_idmap *idmap, struct inode *dir, diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c index 6fbbaaad1cd0..198a8cbaf5e5 100644 --- a/fs/gfs2/inode.c +++ b/fs/gfs2/inode.c @@ -1248,14 +1248,15 @@ static int gfs2_symlink(struct mnt_idmap *idmap, struct inode *dir, * @dentry: The dentry of the new directory * @mode: The mode of the new directory * - * Returns: errno + * Returns: the dentry, or ERR_PTR(errno) */ -static int gfs2_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, 
umode_t mode) +static struct dentry *gfs2_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { unsigned dsize = gfs2_max_stuffed_size(GFS2_I(dir)); - return gfs2_create_inode(dir, dentry, NULL, S_IFDIR | mode, 0, NULL, dsize, 0); + + return ERR_PTR(gfs2_create_inode(dir, dentry, NULL, S_IFDIR | mode, 0, NULL, dsize, 0)); } /** diff --git a/fs/hfs/dir.c b/fs/hfs/dir.c index b75c26045df4..86a6b317b474 100644 --- a/fs/hfs/dir.c +++ b/fs/hfs/dir.c @@ -219,26 +219,26 @@ static int hfs_create(struct mnt_idmap *idmap, struct inode *dir, * in a directory, given the inode for the parent directory and the * name (and its length) of the new directory. */ -static int hfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *hfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct inode *inode; int res; inode = hfs_new_inode(dir, &dentry->d_name, S_IFDIR | mode); if (!inode) - return -ENOMEM; + return ERR_PTR(-ENOMEM); res = hfs_cat_create(inode->i_ino, dir, &dentry->d_name, inode); if (res) { clear_nlink(inode); hfs_delete_inode(inode); iput(inode); - return res; + return ERR_PTR(res); } d_instantiate(dentry, inode); mark_inode_dirty(inode); - return 0; + return NULL; } /* diff --git a/fs/hfsplus/dir.c b/fs/hfsplus/dir.c index f5c4b3e31a1c..876bbb80fb4d 100644 --- a/fs/hfsplus/dir.c +++ b/fs/hfsplus/dir.c @@ -523,10 +523,10 @@ static int hfsplus_create(struct mnt_idmap *idmap, struct inode *dir, return hfsplus_mknod(&nop_mnt_idmap, dir, dentry, mode, 0); } -static int hfsplus_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *hfsplus_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { - return hfsplus_mknod(&nop_mnt_idmap, dir, dentry, mode | S_IFDIR, 0); + return ERR_PTR(hfsplus_mknod(&nop_mnt_idmap, dir, dentry, mode | S_IFDIR, 0)); } 
static int hfsplus_rename(struct mnt_idmap *idmap, diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c index e0741e468956..ccbb48fe830d 100644 --- a/fs/hostfs/hostfs_kern.c +++ b/fs/hostfs/hostfs_kern.c @@ -679,17 +679,17 @@ static int hostfs_symlink(struct mnt_idmap *idmap, struct inode *ino, return err; } -static int hostfs_mkdir(struct mnt_idmap *idmap, struct inode *ino, - struct dentry *dentry, umode_t mode) +static struct dentry *hostfs_mkdir(struct mnt_idmap *idmap, struct inode *ino, + struct dentry *dentry, umode_t mode) { char *file; int err; if ((file = dentry_name(dentry)) == NULL) - return -ENOMEM; + return ERR_PTR(-ENOMEM); err = do_mkdir(file, mode); __putname(file); - return err; + return ERR_PTR(err); } static int hostfs_rmdir(struct inode *ino, struct dentry *dentry) diff --git a/fs/hpfs/namei.c b/fs/hpfs/namei.c index d0edf9ed33b6..e3cdc421dfba 100644 --- a/fs/hpfs/namei.c +++ b/fs/hpfs/namei.c @@ -19,8 +19,8 @@ static void hpfs_update_directory_times(struct inode *dir) hpfs_write_inode_nolock(dir); } -static int hpfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *hpfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { const unsigned char *name = dentry->d_name.name; unsigned len = dentry->d_name.len; @@ -35,7 +35,7 @@ static int hpfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, int r; struct hpfs_dirent dee; int err; - if ((err = hpfs_chk_name(name, &len))) return err==-ENOENT ? -EINVAL : err; + if ((err = hpfs_chk_name(name, &len))) return ERR_PTR(err==-ENOENT ? 
-EINVAL : err); hpfs_lock(dir->i_sb); err = -ENOSPC; fnode = hpfs_alloc_fnode(dir->i_sb, hpfs_i(dir)->i_dno, &fno, &bh); @@ -112,7 +112,7 @@ static int hpfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, hpfs_update_directory_times(dir); d_instantiate(dentry, result); hpfs_unlock(dir->i_sb); - return 0; + return NULL; bail3: iput(result); bail2: @@ -123,7 +123,7 @@ static int hpfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, hpfs_free_sectors(dir->i_sb, fno, 1); bail: hpfs_unlock(dir->i_sb); - return err; + return ERR_PTR(err); } static int hpfs_create(struct mnt_idmap *idmap, struct inode *dir, diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index 0fc179a59830..d98caedbb723 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -991,14 +991,14 @@ static int hugetlbfs_mknod(struct mnt_idmap *idmap, struct inode *dir, return 0; } -static int hugetlbfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *hugetlbfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { int retval = hugetlbfs_mknod(idmap, dir, dentry, mode | S_IFDIR, 0); if (!retval) inc_nlink(dir); - return retval; + return ERR_PTR(retval); } static int hugetlbfs_create(struct mnt_idmap *idmap, diff --git a/fs/jffs2/dir.c b/fs/jffs2/dir.c index 2b2938970da3..dd91f725ded6 100644 --- a/fs/jffs2/dir.c +++ b/fs/jffs2/dir.c @@ -32,8 +32,8 @@ static int jffs2_link (struct dentry *,struct inode *,struct dentry *); static int jffs2_unlink (struct inode *,struct dentry *); static int jffs2_symlink (struct mnt_idmap *, struct inode *, struct dentry *, const char *); -static int jffs2_mkdir (struct mnt_idmap *, struct inode *,struct dentry *, - umode_t); +static struct dentry *jffs2_mkdir (struct mnt_idmap *, struct inode *,struct dentry *, + umode_t); static int jffs2_rmdir (struct inode *,struct dentry *); static int jffs2_mknod (struct mnt_idmap *, struct inode *,struct dentry *, 
umode_t,dev_t); @@ -446,8 +446,8 @@ static int jffs2_symlink (struct mnt_idmap *idmap, struct inode *dir_i, } -static int jffs2_mkdir (struct mnt_idmap *idmap, struct inode *dir_i, - struct dentry *dentry, umode_t mode) +static struct dentry *jffs2_mkdir (struct mnt_idmap *idmap, struct inode *dir_i, + struct dentry *dentry, umode_t mode) { struct jffs2_inode_info *f, *dir_f; struct jffs2_sb_info *c; @@ -464,7 +464,7 @@ static int jffs2_mkdir (struct mnt_idmap *idmap, struct inode *dir_i, ri = jffs2_alloc_raw_inode(); if (!ri) - return -ENOMEM; + return ERR_PTR(-ENOMEM); c = JFFS2_SB_INFO(dir_i->i_sb); @@ -477,7 +477,7 @@ static int jffs2_mkdir (struct mnt_idmap *idmap, struct inode *dir_i, if (ret) { jffs2_free_raw_inode(ri); - return ret; + return ERR_PTR(ret); } inode = jffs2_new_inode(dir_i, mode, ri); @@ -485,7 +485,7 @@ static int jffs2_mkdir (struct mnt_idmap *idmap, struct inode *dir_i, if (IS_ERR(inode)) { jffs2_free_raw_inode(ri); jffs2_complete_reservation(c); - return PTR_ERR(inode); + return ERR_CAST(inode); } inode->i_op = &jffs2_dir_inode_operations; @@ -584,11 +584,11 @@ static int jffs2_mkdir (struct mnt_idmap *idmap, struct inode *dir_i, jffs2_complete_reservation(c); d_instantiate_new(dentry, inode); - return 0; + return NULL; fail: iget_failed(inode); - return ret; + return ERR_PTR(ret); } static int jffs2_rmdir (struct inode *dir_i, struct dentry *dentry) diff --git a/fs/jfs/namei.c b/fs/jfs/namei.c index fc8ede43afde..65a218eba8fa 100644 --- a/fs/jfs/namei.c +++ b/fs/jfs/namei.c @@ -187,13 +187,13 @@ static int jfs_create(struct mnt_idmap *idmap, struct inode *dip, * dentry - dentry of child directory * mode - create mode (rwxrwxrwx). * - * RETURN: Errors from subroutines + * RETURN: ERR_PTR() of errors from subroutines. 
* * note: * EACCES: user needs search+write permission on the parent directory */ -static int jfs_mkdir(struct mnt_idmap *idmap, struct inode *dip, - struct dentry *dentry, umode_t mode) +static struct dentry *jfs_mkdir(struct mnt_idmap *idmap, struct inode *dip, + struct dentry *dentry, umode_t mode) { int rc = 0; tid_t tid; /* transaction id */ @@ -308,7 +308,7 @@ static int jfs_mkdir(struct mnt_idmap *idmap, struct inode *dip, out1: jfs_info("jfs_mkdir: rc:%d", rc); - return rc; + return ERR_PTR(rc); } /* diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c index 5f0f8b95f44c..d296aad70800 100644 --- a/fs/kernfs/dir.c +++ b/fs/kernfs/dir.c @@ -1230,24 +1230,24 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir, return d_splice_alias(inode, dentry); } -static int kernfs_iop_mkdir(struct mnt_idmap *idmap, - struct inode *dir, struct dentry *dentry, - umode_t mode) +static struct dentry *kernfs_iop_mkdir(struct mnt_idmap *idmap, + struct inode *dir, struct dentry *dentry, + umode_t mode) { struct kernfs_node *parent = dir->i_private; struct kernfs_syscall_ops *scops = kernfs_root(parent)->syscall_ops; int ret; if (!scops || !scops->mkdir) - return -EPERM; + return ERR_PTR(-EPERM); if (!kernfs_get_active(parent)) - return -ENODEV; + return ERR_PTR(-ENODEV); ret = scops->mkdir(parent, dentry->d_name.name, mode); kernfs_put_active(parent); - return ret; + return ERR_PTR(ret); } static int kernfs_iop_rmdir(struct inode *dir, struct dentry *dentry) diff --git a/fs/minix/namei.c b/fs/minix/namei.c index 5d9c1406fe27..8938536d8d3c 100644 --- a/fs/minix/namei.c +++ b/fs/minix/namei.c @@ -104,15 +104,15 @@ static int minix_link(struct dentry * old_dentry, struct inode * dir, return add_nondir(dentry, inode); } -static int minix_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *minix_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct inode * inode; int err; inode 
= minix_new_inode(dir, S_IFDIR | mode); if (IS_ERR(inode)) - return PTR_ERR(inode); + return ERR_CAST(inode); inode_inc_link_count(dir); minix_set_inode(inode, 0); @@ -128,7 +128,7 @@ static int minix_mkdir(struct mnt_idmap *idmap, struct inode *dir, d_instantiate(dentry, inode); out: - return err; + return ERR_PTR(err); out_fail: inode_dec_link_count(inode); diff --git a/fs/namei.c b/fs/namei.c index 4677d86f9758..63fe4dc29c23 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -4290,6 +4290,7 @@ int vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, { int error; unsigned max_links = dir->i_sb->s_max_links; + struct dentry *de; error = may_create(idmap, dir, dentry); if (error) @@ -4306,10 +4307,18 @@ int vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, if (max_links && dir->i_nlink >= max_links) return -EMLINK; - error = dir->i_op->mkdir(idmap, dir, dentry, mode); - if (!error) + de = dir->i_op->mkdir(idmap, dir, dentry, mode); + if (IS_ERR(de)) + return PTR_ERR(de); + if (de) { + fsnotify_mkdir(dir, de); + /* Cannot return de yet */ + dput(de); + } else { fsnotify_mkdir(dir, dentry); - return error; + } + + return 0; } EXPORT_SYMBOL(vfs_mkdir); diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 56cf16a72334..101b1098e87b 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -2422,8 +2422,8 @@ EXPORT_SYMBOL_GPL(nfs_mknod); /* * See comments for nfs_proc_create regarding failed operations. 
*/ -int nfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +struct dentry *nfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct iattr attr; int error; @@ -2439,10 +2439,10 @@ int nfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, trace_nfs_mkdir_exit(dir, dentry, error); if (error != 0) goto out_err; - return 0; + return NULL; out_err: d_drop(dentry); - return error; + return ERR_PTR(error); } EXPORT_SYMBOL_GPL(nfs_mkdir); diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h index fae2c7ae4acc..1ac1d3eec517 100644 --- a/fs/nfs/internal.h +++ b/fs/nfs/internal.h @@ -400,8 +400,8 @@ struct dentry *nfs_lookup(struct inode *, struct dentry *, unsigned int); void nfs_d_prune_case_insensitive_aliases(struct inode *inode); int nfs_create(struct mnt_idmap *, struct inode *, struct dentry *, umode_t, bool); -int nfs_mkdir(struct mnt_idmap *, struct inode *, struct dentry *, - umode_t); +struct dentry *nfs_mkdir(struct mnt_idmap *, struct inode *, struct dentry *, + umode_t); int nfs_rmdir(struct inode *, struct dentry *); int nfs_unlink(struct inode *, struct dentry *); int nfs_symlink(struct mnt_idmap *, struct inode *, struct dentry *, diff --git a/fs/nilfs2/namei.c b/fs/nilfs2/namei.c index 953fbd5f0851..40f4b1a28705 100644 --- a/fs/nilfs2/namei.c +++ b/fs/nilfs2/namei.c @@ -218,8 +218,8 @@ static int nilfs_link(struct dentry *old_dentry, struct inode *dir, return err; } -static int nilfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *nilfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct inode *inode; struct nilfs_transaction_info ti; @@ -227,7 +227,7 @@ static int nilfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, err = nilfs_transaction_begin(dir->i_sb, &ti, 1); if (err) - return err; + return ERR_PTR(err); inc_nlink(dir); @@ -258,7 +258,7 @@ static int 
nilfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, else nilfs_transaction_abort(dir->i_sb); - return err; + return ERR_PTR(err); out_fail: drop_nlink(inode); diff --git a/fs/ntfs3/namei.c b/fs/ntfs3/namei.c index abf7e81584a9..652735a0b0c4 100644 --- a/fs/ntfs3/namei.c +++ b/fs/ntfs3/namei.c @@ -201,11 +201,11 @@ static int ntfs_symlink(struct mnt_idmap *idmap, struct inode *dir, /* * ntfs_mkdir- inode_operations::mkdir */ -static int ntfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *ntfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { - return ntfs_create_inode(idmap, dir, dentry, NULL, S_IFDIR | mode, 0, - NULL, 0, NULL); + return ERR_PTR(ntfs_create_inode(idmap, dir, dentry, NULL, S_IFDIR | mode, 0, + NULL, 0, NULL)); } /* diff --git a/fs/ocfs2/dlmfs/dlmfs.c b/fs/ocfs2/dlmfs/dlmfs.c index 2a7f36643895..5130ec44e5e1 100644 --- a/fs/ocfs2/dlmfs/dlmfs.c +++ b/fs/ocfs2/dlmfs/dlmfs.c @@ -402,10 +402,10 @@ static struct inode *dlmfs_get_inode(struct inode *parent, * File creation. Allocate an inode, and we're done.. 
*/ /* SMP-safe */ -static int dlmfs_mkdir(struct mnt_idmap * idmap, - struct inode * dir, - struct dentry * dentry, - umode_t mode) +static struct dentry *dlmfs_mkdir(struct mnt_idmap * idmap, + struct inode * dir, + struct dentry * dentry, + umode_t mode) { int status; struct inode *inode = NULL; @@ -448,7 +448,7 @@ static int dlmfs_mkdir(struct mnt_idmap * idmap, bail: if (status < 0) iput(inode); - return status; + return ERR_PTR(status); } static int dlmfs_create(struct mnt_idmap *idmap, diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c index 0ec63a1a94b8..99278c8f0e24 100644 --- a/fs/ocfs2/namei.c +++ b/fs/ocfs2/namei.c @@ -644,10 +644,10 @@ static int ocfs2_mknod_locked(struct ocfs2_super *osb, suballoc_loc, suballoc_bit); } -static int ocfs2_mkdir(struct mnt_idmap *idmap, - struct inode *dir, - struct dentry *dentry, - umode_t mode) +static struct dentry *ocfs2_mkdir(struct mnt_idmap *idmap, + struct inode *dir, + struct dentry *dentry, + umode_t mode) { int ret; @@ -657,7 +657,7 @@ static int ocfs2_mkdir(struct mnt_idmap *idmap, if (ret) mlog_errno(ret); - return ret; + return ERR_PTR(ret); } static int ocfs2_create(struct mnt_idmap *idmap, diff --git a/fs/omfs/dir.c b/fs/omfs/dir.c index 6bda275826d6..2ed541fccf33 100644 --- a/fs/omfs/dir.c +++ b/fs/omfs/dir.c @@ -279,10 +279,10 @@ static int omfs_add_node(struct inode *dir, struct dentry *dentry, umode_t mode) return err; } -static int omfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *omfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { - return omfs_add_node(dir, dentry, mode | S_IFDIR); + return ERR_PTR(omfs_add_node(dir, dentry, mode | S_IFDIR)); } static int omfs_create(struct mnt_idmap *idmap, struct inode *dir, diff --git a/fs/orangefs/namei.c b/fs/orangefs/namei.c index 200558ec72f0..82395fe2b956 100644 --- a/fs/orangefs/namei.c +++ b/fs/orangefs/namei.c @@ -300,8 +300,8 @@ static int 
orangefs_symlink(struct mnt_idmap *idmap, return ret; } -static int orangefs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *orangefs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct orangefs_inode_s *parent = ORANGEFS_I(dir); struct orangefs_kernel_op_s *new_op; @@ -312,7 +312,7 @@ static int orangefs_mkdir(struct mnt_idmap *idmap, struct inode *dir, new_op = op_alloc(ORANGEFS_VFS_OP_MKDIR); if (!new_op) - return -ENOMEM; + return ERR_PTR(-ENOMEM); new_op->upcall.req.mkdir.parent_refn = parent->refn; @@ -366,7 +366,7 @@ static int orangefs_mkdir(struct mnt_idmap *idmap, struct inode *dir, __orangefs_setattr(dir, &iattr); out: op_release(new_op); - return ret; + return ERR_PTR(ret); } static int orangefs_rename(struct mnt_idmap *idmap, diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c index c9993ff66fc2..21c3aaf7b274 100644 --- a/fs/overlayfs/dir.c +++ b/fs/overlayfs/dir.c @@ -282,7 +282,8 @@ static int ovl_instantiate(struct dentry *dentry, struct inode *inode, * XXX: if we ever use ovl_obtain_alias() to decode directory * file handles, need to use ovl_get_inode_locked() and * d_instantiate_new() here to prevent from creating two - * hashed directory inode aliases. + * hashed directory inode aliases. We then need to return + * the obtained alias to ovl_mkdir(). 
*/ inode = ovl_get_inode(dentry->d_sb, &oip); if (IS_ERR(inode)) @@ -687,10 +688,10 @@ static int ovl_create(struct mnt_idmap *idmap, struct inode *dir, return ovl_create_object(dentry, (mode & 07777) | S_IFREG, 0, NULL); } -static int ovl_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *ovl_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { - return ovl_create_object(dentry, (mode & 07777) | S_IFDIR, 0, NULL); + return ERR_PTR(ovl_create_object(dentry, (mode & 07777) | S_IFDIR, 0, NULL)); } static int ovl_mknod(struct mnt_idmap *idmap, struct inode *dir, diff --git a/fs/ramfs/inode.c b/fs/ramfs/inode.c index 8006faaaf0ec..775fa905fda0 100644 --- a/fs/ramfs/inode.c +++ b/fs/ramfs/inode.c @@ -119,13 +119,13 @@ ramfs_mknod(struct mnt_idmap *idmap, struct inode *dir, return error; } -static int ramfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *ramfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { int retval = ramfs_mknod(&nop_mnt_idmap, dir, dentry, mode | S_IFDIR, 0); if (!retval) inc_nlink(dir); - return retval; + return ERR_PTR(retval); } static int ramfs_create(struct mnt_idmap *idmap, struct inode *dir, diff --git a/fs/smb/client/cifsfs.h b/fs/smb/client/cifsfs.h index 831fee962c4d..8dea0cf3a8de 100644 --- a/fs/smb/client/cifsfs.h +++ b/fs/smb/client/cifsfs.h @@ -59,8 +59,8 @@ extern int cifs_unlink(struct inode *dir, struct dentry *dentry); extern int cifs_hardlink(struct dentry *, struct inode *, struct dentry *); extern int cifs_mknod(struct mnt_idmap *, struct inode *, struct dentry *, umode_t, dev_t); -extern int cifs_mkdir(struct mnt_idmap *, struct inode *, struct dentry *, - umode_t); +extern struct dentry *cifs_mkdir(struct mnt_idmap *, struct inode *, struct dentry *, + umode_t); extern int cifs_rmdir(struct inode *, struct dentry *); extern 
int cifs_rename2(struct mnt_idmap *, struct inode *, struct dentry *, struct inode *, struct dentry *, diff --git a/fs/smb/client/inode.c b/fs/smb/client/inode.c index 9cc31cf6ebd0..685a176f7f7e 100644 --- a/fs/smb/client/inode.c +++ b/fs/smb/client/inode.c @@ -2194,8 +2194,8 @@ cifs_posix_mkdir(struct inode *inode, struct dentry *dentry, umode_t mode, } #endif /* CONFIG_CIFS_ALLOW_INSECURE_LEGACY */ -int cifs_mkdir(struct mnt_idmap *idmap, struct inode *inode, - struct dentry *direntry, umode_t mode) +struct dentry *cifs_mkdir(struct mnt_idmap *idmap, struct inode *inode, + struct dentry *direntry, umode_t mode) { int rc = 0; unsigned int xid; @@ -2211,10 +2211,10 @@ int cifs_mkdir(struct mnt_idmap *idmap, struct inode *inode, cifs_sb = CIFS_SB(inode->i_sb); if (unlikely(cifs_forced_shutdown(cifs_sb))) - return -EIO; + return ERR_PTR(-EIO); tlink = cifs_sb_tlink(cifs_sb); if (IS_ERR(tlink)) - return PTR_ERR(tlink); + return ERR_CAST(tlink); tcon = tlink_tcon(tlink); xid = get_xid(); @@ -2270,7 +2270,7 @@ int cifs_mkdir(struct mnt_idmap *idmap, struct inode *inode, free_dentry_path(page); free_xid(xid); cifs_put_tlink(tlink); - return rc; + return ERR_PTR(rc); } int cifs_rmdir(struct inode *inode, struct dentry *direntry) diff --git a/fs/sysv/namei.c b/fs/sysv/namei.c index fb8bd8437872..ba037727c1e6 100644 --- a/fs/sysv/namei.c +++ b/fs/sysv/namei.c @@ -110,8 +110,8 @@ static int sysv_link(struct dentry * old_dentry, struct inode * dir, return add_nondir(dentry, inode); } -static int sysv_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *sysv_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct inode * inode; int err; @@ -135,9 +135,9 @@ static int sysv_mkdir(struct mnt_idmap *idmap, struct inode *dir, if (err) goto out_fail; - d_instantiate(dentry, inode); + d_instantiate(dentry, inode); out: - return err; + return ERR_PTR(err); out_fail: 
inode_dec_link_count(inode); diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c index 53214499e384..cb1af30b49f5 100644 --- a/fs/tracefs/inode.c +++ b/fs/tracefs/inode.c @@ -109,9 +109,9 @@ static char *get_dname(struct dentry *dentry) return name; } -static int tracefs_syscall_mkdir(struct mnt_idmap *idmap, - struct inode *inode, struct dentry *dentry, - umode_t mode) +static struct dentry *tracefs_syscall_mkdir(struct mnt_idmap *idmap, + struct inode *inode, struct dentry *dentry, + umode_t mode) { struct tracefs_inode *ti; char *name; @@ -119,7 +119,7 @@ static int tracefs_syscall_mkdir(struct mnt_idmap *idmap, name = get_dname(dentry); if (!name) - return -ENOMEM; + return ERR_PTR(-ENOMEM); /* * This is a new directory that does not take the default of @@ -141,7 +141,7 @@ static int tracefs_syscall_mkdir(struct mnt_idmap *idmap, kfree(name); - return ret; + return ERR_PTR(ret); } static int tracefs_syscall_rmdir(struct inode *inode, struct dentry *dentry) diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index fda82f3e16e8..3c3d3ad4fa6c 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -1002,8 +1002,8 @@ static int ubifs_rmdir(struct inode *dir, struct dentry *dentry) return err; } -static int ubifs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *ubifs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct inode *inode; struct ubifs_inode *dir_ui = ubifs_inode(dir); @@ -1023,7 +1023,7 @@ static int ubifs_mkdir(struct mnt_idmap *idmap, struct inode *dir, err = ubifs_budget_space(c, &req); if (err) - return err; + return ERR_PTR(err); err = ubifs_prepare_create(dir, dentry, &nm); if (err) @@ -1060,7 +1060,7 @@ static int ubifs_mkdir(struct mnt_idmap *idmap, struct inode *dir, ubifs_release_budget(c, &req); d_instantiate(dentry, inode); fscrypt_free_filename(&nm); - return 0; + return NULL; out_cancel: dir->i_size -= sz_change; @@ -1074,7 +1074,7 @@ static int 
ubifs_mkdir(struct mnt_idmap *idmap, struct inode *dir, fscrypt_free_filename(&nm); out_budg: ubifs_release_budget(c, &req); - return err; + return ERR_PTR(err); } static int ubifs_mknod(struct mnt_idmap *idmap, struct inode *dir, diff --git a/fs/udf/namei.c b/fs/udf/namei.c index 2cb49b6b0716..5f2e9a892bff 100644 --- a/fs/udf/namei.c +++ b/fs/udf/namei.c @@ -419,8 +419,8 @@ static int udf_mknod(struct mnt_idmap *idmap, struct inode *dir, return udf_add_nondir(dentry, inode); } -static int udf_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *udf_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct inode *inode; struct udf_fileident_iter iter; @@ -430,7 +430,7 @@ static int udf_mkdir(struct mnt_idmap *idmap, struct inode *dir, inode = udf_new_inode(dir, S_IFDIR | mode); if (IS_ERR(inode)) - return PTR_ERR(inode); + return ERR_CAST(inode); iinfo = UDF_I(inode); inode->i_op = &udf_dir_inode_operations; @@ -439,7 +439,7 @@ static int udf_mkdir(struct mnt_idmap *idmap, struct inode *dir, if (err) { clear_nlink(inode); discard_new_inode(inode); - return err; + return ERR_PTR(err); } set_nlink(inode, 2); iter.fi.icb.extLength = cpu_to_le32(inode->i_sb->s_blocksize); @@ -456,7 +456,7 @@ static int udf_mkdir(struct mnt_idmap *idmap, struct inode *dir, if (err) { clear_nlink(inode); discard_new_inode(inode); - return err; + return ERR_PTR(err); } iter.fi.icb.extLength = cpu_to_le32(inode->i_sb->s_blocksize); iter.fi.icb.extLocation = cpu_to_lelb(iinfo->i_location); @@ -471,7 +471,7 @@ static int udf_mkdir(struct mnt_idmap *idmap, struct inode *dir, mark_inode_dirty(dir); d_instantiate_new(dentry, inode); - return 0; + return NULL; } static int empty_dir(struct inode *dir) diff --git a/fs/ufs/namei.c b/fs/ufs/namei.c index 38a024c8cccd..5b3c85c93242 100644 --- a/fs/ufs/namei.c +++ b/fs/ufs/namei.c @@ -166,8 +166,8 @@ static int ufs_link (struct dentry * old_dentry, 
struct inode * dir, return error; } -static int ufs_mkdir(struct mnt_idmap * idmap, struct inode * dir, - struct dentry * dentry, umode_t mode) +static struct dentry *ufs_mkdir(struct mnt_idmap * idmap, struct inode * dir, + struct dentry * dentry, umode_t mode) { struct inode * inode; int err; @@ -194,7 +194,7 @@ static int ufs_mkdir(struct mnt_idmap * idmap, struct inode * dir, goto out_fail; d_instantiate_new(dentry, inode); - return 0; + return NULL; out_fail: inode_dec_link_count(inode); @@ -202,7 +202,7 @@ static int ufs_mkdir(struct mnt_idmap * idmap, struct inode * dir, discard_new_inode(inode); out_dir: inode_dec_link_count(dir); - return err; + return ERR_PTR(err); } static int ufs_unlink(struct inode *dir, struct dentry *dentry) diff --git a/fs/vboxsf/dir.c b/fs/vboxsf/dir.c index a859ac9b74ba..770e29ec3557 100644 --- a/fs/vboxsf/dir.c +++ b/fs/vboxsf/dir.c @@ -303,11 +303,11 @@ static int vboxsf_dir_mkfile(struct mnt_idmap *idmap, return vboxsf_dir_create(parent, dentry, mode, false, excl, NULL); } -static int vboxsf_dir_mkdir(struct mnt_idmap *idmap, - struct inode *parent, struct dentry *dentry, - umode_t mode) +static struct dentry *vboxsf_dir_mkdir(struct mnt_idmap *idmap, + struct inode *parent, struct dentry *dentry, + umode_t mode) { - return vboxsf_dir_create(parent, dentry, mode, true, true, NULL); + return ERR_PTR(vboxsf_dir_create(parent, dentry, mode, true, true, NULL)); } static int vboxsf_dir_atomic_open(struct inode *parent, struct dentry *dentry, diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index 40289fe6f5b2..a4480098d2bf 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -298,14 +298,14 @@ xfs_vn_create( return xfs_generic_create(idmap, dir, dentry, mode, 0, NULL); } -STATIC int +STATIC struct dentry * xfs_vn_mkdir( struct mnt_idmap *idmap, struct inode *dir, struct dentry *dentry, umode_t mode) { - return xfs_generic_create(idmap, dir, dentry, mode | S_IFDIR, 0, NULL); + return ERR_PTR(xfs_generic_create(idmap, dir, 
dentry, mode | S_IFDIR, 0, NULL)); } STATIC struct dentry * diff --git a/include/linux/fs.h b/include/linux/fs.h index ac5d699e3aab..8f4fbecd40fc 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2201,8 +2201,8 @@ struct inode_operations { int (*unlink) (struct inode *,struct dentry *); int (*symlink) (struct mnt_idmap *, struct inode *,struct dentry *, const char *); - int (*mkdir) (struct mnt_idmap *, struct inode *,struct dentry *, - umode_t); + struct dentry *(*mkdir) (struct mnt_idmap *, struct inode *, + struct dentry *, umode_t); int (*rmdir) (struct inode *,struct dentry *); int (*mknod) (struct mnt_idmap *, struct inode *,struct dentry *, umode_t,dev_t); diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c index 9aaf5124648b..dc3aa91a6ba0 100644 --- a/kernel/bpf/inode.c +++ b/kernel/bpf/inode.c @@ -150,14 +150,14 @@ static void bpf_dentry_finalize(struct dentry *dentry, struct inode *inode, inode_set_mtime_to_ts(dir, inode_set_ctime_current(dir)); } -static int bpf_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *bpf_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct inode *inode; inode = bpf_get_inode(dir->i_sb, dir, mode | S_IFDIR); if (IS_ERR(inode)) - return PTR_ERR(inode); + return ERR_CAST(inode); inode->i_op = &bpf_dir_iops; inode->i_fop = &simple_dir_operations; @@ -166,7 +166,7 @@ static int bpf_mkdir(struct mnt_idmap *idmap, struct inode *dir, inc_nlink(dir); bpf_dentry_finalize(dentry, inode, dir); - return 0; + return NULL; } struct map_iter { diff --git a/mm/shmem.c b/mm/shmem.c index 4ea6109a8043..00ae0146e768 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -3889,16 +3889,16 @@ shmem_tmpfile(struct mnt_idmap *idmap, struct inode *dir, return error; } -static int shmem_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *shmem_mkdir(struct mnt_idmap *idmap, struct inode 
*dir, + struct dentry *dentry, umode_t mode) { int error; error = shmem_mknod(idmap, dir, dentry, mode | S_IFDIR, 0); if (error) - return error; + return ERR_PTR(error); inc_nlink(dir); - return 0; + return NULL; } static int shmem_create(struct mnt_idmap *idmap, struct inode *dir, diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c index c07d150685d7..6039afae4bfc 100644 --- a/security/apparmor/apparmorfs.c +++ b/security/apparmor/apparmorfs.c @@ -1795,8 +1795,8 @@ int __aafs_profile_mkdir(struct aa_profile *profile, struct dentry *parent) return error; } -static int ns_mkdir_op(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *ns_mkdir_op(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct aa_ns *ns, *parent; /* TODO: improve permission check */ @@ -1808,7 +1808,7 @@ static int ns_mkdir_op(struct mnt_idmap *idmap, struct inode *dir, AA_MAY_LOAD_POLICY); end_current_label_crit_section(label); if (error) - return error; + return ERR_PTR(error); parent = aa_get_ns(dir->i_private); AA_BUG(d_inode(ns_subns_dir(parent)) != dir); @@ -1843,7 +1843,7 @@ static int ns_mkdir_op(struct mnt_idmap *idmap, struct inode *dir, mutex_unlock(&parent->lock); aa_put_ns(parent); - return error; + return ERR_PTR(error); } static int ns_rmdir_op(struct inode *dir, struct dentry *dentry) -- 2.47.1 ^ permalink raw reply related [flat|nested] 36+ messages in thread
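Many of the conversions in this patch simply wrap an existing int result as `return ERR_PTR(err);`. This works for the success case too, because ERR_PTR(0) is a NULL pointer, and NULL is the "no error, the passed-in dentry was used" return. The following is a minimal userspace sketch of that convention, not kernel code: the ERR_PTR()/PTR_ERR()/IS_ERR() definitions are simplified stand-ins for the helpers in <linux/err.h>, and toy_mknod(), toy_mkdir() and toy_demo() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

struct dentry;	/* opaque here; only pointers are passed around */

static void *ERR_PTR(intptr_t error) { return (void *)error; }
static intptr_t PTR_ERR(const void *ptr) { return (intptr_t)ptr; }
static int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical int-returning helper: 0 on success, -errno on failure. */
static int toy_mknod(int result) { return result; }

/* A converted ->mkdir in the style of the one-line conversions in the patch. */
static struct dentry *toy_mkdir(int result)
{
	/* ERR_PTR(0) is NULL, so success and failure share one return path. */
	return ERR_PTR(toy_mknod(result));
}

/* Caller side: what the checks see for success and for failure. */
static void toy_demo(void)
{
	struct dentry *de = toy_mkdir(0);

	assert(de == NULL && !IS_ERR(de));

	de = toy_mkdir(-ENOSPC);
	assert(IS_ERR(de) && PTR_ERR(de) == -ENOSPC);
}
```

Calling toy_demo() from a main() exercises both paths. The same reasoning is why the vfs_mkdir() hunk can test IS_ERR(de) first and then treat a NULL de as success.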
* Re: [PATCH 1/6] Change inode_operations.mkdir to return struct dentry *
  2025-02-20 23:36 ` [PATCH 1/6] Change inode_operations.mkdir to return struct dentry * NeilBrown
@ 2025-02-22  4:19   ` Al Viro
  2025-02-24  1:34     ` NeilBrown
  2025-02-22  4:56   ` Al Viro
  1 sibling, 1 reply; 36+ messages in thread
From: Al Viro @ 2025-02-22 4:19 UTC (permalink / raw)
  To: NeilBrown
  Cc: Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li, Ilya Dryomov,
      Richard Weinberger, Anton Ivanov, Johannes Berg, Trond Myklebust,
      Anna Schumaker, Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo,
      Tom Talpey, Sergey Senozhatsky, linux-fsdevel, linux-kernel,
      linux-cifs, linux-nfs, linux-um, ceph-devel, netfs

On Fri, Feb 21, 2025 at 10:36:30AM +1100, NeilBrown wrote:

> +In general, filesystems which use d_instantiate_new() to install the new
> +inode can safely return NULL. Filesystems which may not have an I_NEW inode
> +should use d_drop();d_splice_alias() and return the result of the latter.

IMO that's a bad pattern, _especially_ if you want to go for "in-update"
kind of stuff later.

That's pretty much the same thing as d_drop()/d_rehash() window.

We'd be better off dropping that BUG_ON() in d_splice_alias() and teaching
__d_add() to handle the "it's a hashed negative" case.

^ permalink raw reply [flat|nested] 36+ messages in thread
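The documentation text quoted in this message describes three possible results from ->mkdir: NULL when the passed-in dentry was instantiated, a different dentry when the inode already had one, and an ERR_PTR() on failure. The following userspace sketch models only that three-way return and how a caller might choose the dentry to continue with. It is an illustration under stated assumptions, not the kernel implementation: struct dentry is a toy type, reference counting and locking are omitted, the err.h helpers are simplified stand-ins, and toy_mkdir() and toy_vfs_mkdir() are hypothetical names. In patch 1/6 itself vfs_mkdir() still returns int; returning a dentry to the caller, as modelled here, is what the cover letter describes for the rest of the series.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

/* Toy dentry: just enough state to see which one received the inode. */
struct dentry {
	int has_inode;
};

static void *ERR_PTR(intptr_t error) { return (void *)error; }
static intptr_t PTR_ERR(const void *ptr) { return (intptr_t)ptr; }
static int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical ->mkdir.  'alias' models an inode that already has a dentry
 * (the race described in the cover letter); 'err' models a failure.
 */
static struct dentry *toy_mkdir(struct dentry *dentry, struct dentry *alias,
				int err)
{
	if (err)
		return ERR_PTR(err);
	if (alias)
		return alias;		/* a different dentry was used */
	dentry->has_inode = 1;		/* stands in for d_instantiate_new() */
	return NULL;			/* the passed-in dentry was used */
}

/* Caller: pick the dentry that now names the directory, or pass the error on. */
static struct dentry *toy_vfs_mkdir(struct dentry *dentry,
				    struct dentry *alias, int err)
{
	struct dentry *de = toy_mkdir(dentry, alias, err);

	if (IS_ERR(de))
		return de;
	return de ? de : dentry;
}
```

With this shape the caller never needs a second lookup to find the dentry that was actually used. How the filesystem obtains the alias, which is the point under discussion in this message, is deliberately left out of the sketch.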
* Re: [PATCH 1/6] Change inode_operations.mkdir to return struct dentry *
  2025-02-22  4:19   ` Al Viro
@ 2025-02-24  1:34     ` NeilBrown
  2025-02-24  2:09       ` Al Viro
  0 siblings, 1 reply; 36+ messages in thread
From: NeilBrown @ 2025-02-24 1:34 UTC (permalink / raw)
  To: Al Viro
  Cc: Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li, Ilya Dryomov,
      Richard Weinberger, Anton Ivanov, Johannes Berg, Trond Myklebust,
      Anna Schumaker, Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo,
      Tom Talpey, Sergey Senozhatsky, linux-fsdevel, linux-kernel,
      linux-cifs, linux-nfs, linux-um, ceph-devel, netfs

On Sat, 22 Feb 2025, Al Viro wrote:
> On Fri, Feb 21, 2025 at 10:36:30AM +1100, NeilBrown wrote:
>
> > +In general, filesystems which use d_instantiate_new() to install the new
> > +inode can safely return NULL. Filesystems which may not have an I_NEW inode
> > +should use d_drop();d_splice_alias() and return the result of the latter.
>
> IMO that's a bad pattern, _especially_ if you want to go for "in-update"
> kind of stuff later.

Agreed. I have a draft patch to change d_splice_alias() and
d_exact_alias() to work on hashed dentries. I thought it should go after
these mkdir patches rather than before.

Thanks,
NeilBrown

>
> That's pretty much the same thing as d_drop()/d_rehash() window.
>
> We'd be better off dropping that BUG_ON() in d_splice_alias() and teaching
> __d_add() to handle the "it's a hashed negative" case.
>

^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 1/6] Change inode_operations.mkdir to return struct dentry *
  2025-02-24  1:34     ` NeilBrown
@ 2025-02-24  2:09       ` Al Viro
  2025-02-24  3:09         ` NeilBrown
  0 siblings, 1 reply; 36+ messages in thread
From: Al Viro @ 2025-02-24 2:09 UTC (permalink / raw)
  To: NeilBrown
  Cc: Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li, Ilya Dryomov,
      Richard Weinberger, Anton Ivanov, Johannes Berg, Trond Myklebust,
      Anna Schumaker, Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo,
      Tom Talpey, Sergey Senozhatsky, linux-fsdevel, linux-kernel,
      linux-cifs, linux-nfs, linux-um, ceph-devel, netfs

On Mon, Feb 24, 2025 at 12:34:06PM +1100, NeilBrown wrote:
> On Sat, 22 Feb 2025, Al Viro wrote:
> > On Fri, Feb 21, 2025 at 10:36:30AM +1100, NeilBrown wrote:
> >
> > > +In general, filesystems which use d_instantiate_new() to install the new
> > > +inode can safely return NULL. Filesystems which may not have an I_NEW inode
> > > +should use d_drop();d_splice_alias() and return the result of the latter.
> >
> > IMO that's a bad pattern, _especially_ if you want to go for "in-update"
> > kind of stuff later.
>
> Agreed. I have a draft patch to change d_splice_alias() and
> d_exact_alias() to work on hashed dentries. I thought it should go after
> these mkdir patches rather than before.

Could you give a braindump on the things d_exact_alias() is needed for?
It's a recurring headache when doing ->d_name/->d_parent audits; see e.g.
https://lore.kernel.org/all/20241213080023.GI3387508@ZenIV/ for related
mini-rant from the latest iteration.

Proof of correctness is bloody awful; it feels like the primitive itself
is wrong, but I'd never been able to write anything concise regarding
the things we really want there ;-/

^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 1/6] Change inode_operations.mkdir to return struct dentry *
From: NeilBrown @ 2025-02-24 3:09 UTC
To: Al Viro, Trond Myklebust
Cc: Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li, Ilya Dryomov,
  Richard Weinberger, Anton Ivanov, Johannes Berg, Trond Myklebust,
  Anna Schumaker, Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo,
  Tom Talpey, Sergey Senozhatsky, linux-fsdevel, linux-kernel, linux-cifs,
  linux-nfs, linux-um, ceph-devel, netfs

On Mon, 24 Feb 2025, Al Viro wrote:
> Could you give a braindump on the things d_exact_alias() is needed for?
> It's a recurring headache when doing ->d_name/->d_parent audits; see e.g.
> https://lore.kernel.org/all/20241213080023.GI3387508@ZenIV/ for related
> mini-rant from the latest iteration.
>
> Proof of correctness is bloody awful; it feels like the primitive itself
> is wrong, but I'd never been able to write anything concise regarding
> the things we really want there ;-/
>

As I understand it, it is needed (or wanted) to handle the possibility
of an inode becoming "stale" and then recovering. This could happen,
for example, with a temporarily misconfigured NFS server.

If ->d_revalidate gets an NFSERR_STALE from the server it will return '0'
so lookup_fast() and others will call d_invalidate() which will d_drop()
the dentry. There are other paths on which -ESTALE can result in
d_drop().

If a subsequent attempt to "open" the name successfully finds the same
inode we want to reuse the old dentry rather than create a new one.

I don't really understand why. This code was added 20 years ago, before
git.
It was introduced by

commit 89a45174b6b32596ea98fa3f89a243e2c1188a01
Author: Trond Myklebust <trond.myklebust@fys.uio.no>
Date:   Tue Jan 4 21:41:37 2005 +0100

    VFS: Avoid dentry aliasing problems in filesystems like NFS, where
    inodes may be marked as stale in one instance (causing the dentry
    to be dropped) then re-enabled in the next instance.

    Signed-off-by: Trond Myklebust <trond.myklebust@fys.uio.no>

in history.git

Trond: do you have any memory of this? Can you explain what the symptom
was that you wanted to fix?

The original patch used d_add_unique() for lookup and atomic_open and
readdir prime-dcache. We now only use it for v4 atomic_open. Maybe we
don't need it at all? Or maybe we need to restore it to those other
callers?

Thanks,
NeilBrown
* Re: [PATCH 1/6] Change inode_operations.mkdir to return struct dentry *
From: Trond Myklebust @ 2025-02-24 15:56 UTC
To: neilb@suse.de, viro@zeniv.linux.org.uk
Cc: brauner@kernel.org, xiubli@redhat.com, idryomov@gmail.com,
  okorniev@redhat.com, linux-cifs@vger.kernel.org, Dai.Ngo@oracle.com,
  linux-kernel@vger.kernel.org, johannes@sipsolutions.net,
  chuck.lever@oracle.com, jlayton@kernel.org, anna@kernel.org,
  miklos@szeredi.hu, anton.ivanov@cambridgegreys.com, jack@suse.cz,
  tom@talpey.com, richard@nod.at, linux-um@lists.infradead.org,
  linux-fsdevel@vger.kernel.org, netfs@lists.linux.dev,
  linux-nfs@vger.kernel.org, ceph-devel@vger.kernel.org,
  senozhatsky@chromium.org

On Mon, 2025-02-24 at 14:09 +1100, NeilBrown wrote:
> As I understand it, it is needed (or wanted) to handle the possibility
> of an inode becoming "stale" and then recovering. This could happen,
> for example, with a temporarily misconfigured NFS server.
>
> If ->d_revalidate gets an NFSERR_STALE from the server it will return '0'
> so lookup_fast() and others will call d_invalidate() which will d_drop()
> the dentry. There are other paths on which -ESTALE can result in
> d_drop().
>
> If a subsequent attempt to "open" the name successfully finds the same
> inode we want to reuse the old dentry rather than create a new one.
>
> I don't really understand why. This code was added 20 years ago, before
> git.
> It was introduced by
>
> commit 89a45174b6b32596ea98fa3f89a243e2c1188a01
> Author: Trond Myklebust <trond.myklebust@fys.uio.no>
> Date:   Tue Jan 4 21:41:37 2005 +0100
>
>     VFS: Avoid dentry aliasing problems in filesystems like NFS, where
>     inodes may be marked as stale in one instance (causing the dentry
>     to be dropped) then re-enabled in the next instance.
>
>     Signed-off-by: Trond Myklebust <trond.myklebust@fys.uio.no>
>
> in history.git
>
> Trond: do you have any memory of this? Can you explain what the symptom
> was that you wanted to fix?
>
> The original patch used d_add_unique() for lookup and atomic_open and
> readdir prime-dcache. We now only use it for v4 atomic_open. Maybe we
> don't need it at all? Or maybe we need to restore it to those other
> callers?
>

2005? That looks like it was trying to deal with the userspace NFS
server. I can't remember when it was given the ability to use the inode
generation counter, but I'm pretty sure that in 2005 there were plenty
of setups out there that had the older version that reused filehandles
(due to inode number reuse). So you would get spurious ESTALE errors
sometimes due to inode number reuse, sometimes because the filehandle
fell out of the userspace NFS server's cache.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
* Re: [PATCH 1/6] Change inode_operations.mkdir to return struct dentry *
From: NeilBrown @ 2025-02-26 2:09 UTC
To: Trond Myklebust
Cc: viro@zeniv.linux.org.uk, brauner@kernel.org, xiubli@redhat.com,
  idryomov@gmail.com, okorniev@redhat.com, linux-cifs@vger.kernel.org,
  Dai.Ngo@oracle.com, linux-kernel@vger.kernel.org,
  johannes@sipsolutions.net, chuck.lever@oracle.com, jlayton@kernel.org,
  anna@kernel.org, miklos@szeredi.hu, anton.ivanov@cambridgegreys.com,
  jack@suse.cz, tom@talpey.com, richard@nod.at,
  linux-um@lists.infradead.org, linux-fsdevel@vger.kernel.org,
  netfs@lists.linux.dev, linux-nfs@vger.kernel.org,
  ceph-devel@vger.kernel.org, senozhatsky@chromium.org

On Tue, 25 Feb 2025, Trond Myklebust wrote:
> 2005? That looks like it was trying to deal with the userspace NFS
> server. I can't remember when it was given the ability to use the inode
> generation counter, but I'm pretty sure that in 2005 there were plenty
> of setups out there that had the older version that reused filehandles
> (due to inode number reuse). So you would get spurious ESTALE errors
> sometimes due to inode number reuse, sometimes because the filehandle
> fell out of the userspace NFS server's cache.

So this was likely done to work around known weaknesses in NFS servers
at the time.

The original d_add_unique() was used in nfs_lookup(), nfs_atomic_lookup()
and nfs_readdir_lookup(), but the current descendant d_exact_alias() is
only used in _nfs4_open_and_get_state(), called only by nfs4_do_open(),
which is only used in nfs4_atomic_open() and nfs4_proc_create().

So the usage in 'lookup' and 'readdir' has fallen by the wayside with
no apparent negative consequences.
The old NFS servers have probably been fixed.

So do you have any concerns with us discarding d_exact_alias() and only
using d_splice_alias() in _nfs4_open_and_get_state()?

Thanks,
NeilBrown
* Re: [PATCH 1/6] Change inode_operations.mkdir to return struct dentry *
From: Trond Myklebust @ 2025-02-26 2:34 UTC
To: neilb@suse.de
Cc: xiubli@redhat.com, brauner@kernel.org, idryomov@gmail.com,
  okorniev@redhat.com, linux-cifs@vger.kernel.org, Dai.Ngo@oracle.com,
  linux-kernel@vger.kernel.org, johannes@sipsolutions.net,
  chuck.lever@oracle.com, jlayton@kernel.org, anna@kernel.org,
  miklos@szeredi.hu, anton.ivanov@cambridgegreys.com,
  viro@zeniv.linux.org.uk, jack@suse.cz, tom@talpey.com, richard@nod.at,
  linux-um@lists.infradead.org, linux-fsdevel@vger.kernel.org,
  netfs@lists.linux.dev, linux-nfs@vger.kernel.org,
  ceph-devel@vger.kernel.org, senozhatsky@chromium.org

On Wed, 2025-02-26 at 13:09 +1100, NeilBrown wrote:
> So this was likely done to work around known weaknesses in NFS servers
> at the time.
>
> The original d_add_unique() was used in nfs_lookup(), nfs_atomic_lookup()
> and nfs_readdir_lookup(), but the current descendant d_exact_alias() is
> only used in _nfs4_open_and_get_state(), called only by nfs4_do_open(),
> which is only used in nfs4_atomic_open() and nfs4_proc_create().
>
> So the usage in 'lookup' and 'readdir' has fallen by the wayside with
> no apparent negative consequences.
> The old NFS servers have probably been fixed.
>
> So do you have any concerns with us discarding d_exact_alias() and only
> using d_splice_alias() in _nfs4_open_and_get_state()?
>

AFAIK, there were never any NFSv4 servers in public use that mimicked
the quirks of the userspace NFSv2/NFSv3 server. So I'm thinking it
should be safe to retire d_exact_alias.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
* Re: [PATCH 1/6] Change inode_operations.mkdir to return struct dentry *
From: NeilBrown @ 2025-02-26 3:18 UTC
To: Trond Myklebust
Cc: xiubli@redhat.com, brauner@kernel.org, idryomov@gmail.com,
  okorniev@redhat.com, linux-cifs@vger.kernel.org, Dai.Ngo@oracle.com,
  linux-kernel@vger.kernel.org, johannes@sipsolutions.net,
  chuck.lever@oracle.com, jlayton@kernel.org, anna@kernel.org,
  miklos@szeredi.hu, anton.ivanov@cambridgegreys.com,
  viro@zeniv.linux.org.uk, jack@suse.cz, tom@talpey.com, richard@nod.at,
  linux-um@lists.infradead.org, linux-fsdevel@vger.kernel.org,
  netfs@lists.linux.dev, linux-nfs@vger.kernel.org,
  ceph-devel@vger.kernel.org, senozhatsky@chromium.org

On Wed, 26 Feb 2025, Trond Myklebust wrote:
> AFAIK, there were never any NFSv4 servers in public use that mimicked
> the quirks of the userspace NFSv2/NFSv3 server. So I'm thinking it
> should be safe to retire d_exact_alias.

Thanks. I'll submit a patch through the VFS tree as I have other VFS
patches in the works that will depend on that so having them together
would be good.

Thanks,
NeilBrown
* Re: [PATCH 1/6] Change inode_operations.mkdir to return struct dentry *
From: Al Viro @ 2025-02-26 3:35 UTC
To: NeilBrown
Cc: Trond Myklebust, xiubli@redhat.com, brauner@kernel.org,
  idryomov@gmail.com, okorniev@redhat.com, linux-cifs@vger.kernel.org,
  Dai.Ngo@oracle.com, linux-kernel@vger.kernel.org,
  johannes@sipsolutions.net, chuck.lever@oracle.com, jlayton@kernel.org,
  anna@kernel.org, miklos@szeredi.hu, anton.ivanov@cambridgegreys.com,
  jack@suse.cz, tom@talpey.com, richard@nod.at,
  linux-um@lists.infradead.org, linux-fsdevel@vger.kernel.org,
  netfs@lists.linux.dev, linux-nfs@vger.kernel.org,
  ceph-devel@vger.kernel.org, senozhatsky@chromium.org

On Wed, Feb 26, 2025 at 02:18:01PM +1100, NeilBrown wrote:

> Thanks. I'll submit a patch through the VFS tree as I have other VFS
> patches in the works that will depend on that so having them together
> would be good.

Do it on top of mainline, please (say, -rc4) and let's put it into
a separate branch - easier that way to pull it into other branches
without causing headache.
* Re: [PATCH 1/6] Change inode_operations.mkdir to return struct dentry *
From: Al Viro @ 2025-02-22 4:56 UTC
To: NeilBrown
Cc: Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li, Ilya Dryomov,
  Richard Weinberger, Anton Ivanov, Johannes Berg, Trond Myklebust,
  Anna Schumaker, Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo,
  Tom Talpey, Sergey Senozhatsky, linux-fsdevel, linux-kernel, linux-cifs,
  linux-nfs, linux-um, ceph-devel, netfs

On Fri, Feb 21, 2025 at 10:36:30AM +1100, NeilBrown wrote:

> Not all filesystems reliably result in a positive hashed dentry:
>
> - NFS, cifs, hostfs will sometimes need to perform a lookup of
>   the name to get inode information. Races could result in this
>   returning something different. Note that this lookup is
>   non-atomic which is what we are trying to avoid. Placing the
>   lookup in filesystem code means it only happens when the filesystem
>   has no other option.

At least in case of cifs I don't see that lookup anywhere in your
series. Have I missed it, or...?
* [PATCH 2/6] hostfs: store inode in dentry after mkdir if possible.
From: NeilBrown @ 2025-02-20 23:36 UTC
To: Alexander Viro, Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li,
  Ilya Dryomov, Richard Weinberger, Anton Ivanov, Johannes Berg,
  Trond Myklebust, Anna Schumaker, Chuck Lever, Jeff Layton,
  Olga Kornievskaia, Dai Ngo, Tom Talpey, Sergey Senozhatsky
Cc: linux-fsdevel, linux-kernel, linux-cifs, linux-nfs, linux-um,
  ceph-devel, netfs

After handling a mkdir, get the inode for the name and use
d_splice_alias() to store the correct dentry in the dcache.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/hostfs/hostfs_kern.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c
index ccbb48fe830d..a2c6b9051c5b 100644
--- a/fs/hostfs/hostfs_kern.c
+++ b/fs/hostfs/hostfs_kern.c
@@ -682,14 +682,22 @@ static int hostfs_symlink(struct mnt_idmap *idmap, struct inode *ino,
 static struct dentry *hostfs_mkdir(struct mnt_idmap *idmap, struct inode *ino,
 				   struct dentry *dentry, umode_t mode)
 {
+	struct inode *inode;
 	char *file;
 	int err;
 
 	if ((file = dentry_name(dentry)) == NULL)
 		return ERR_PTR(-ENOMEM);
 	err = do_mkdir(file, mode);
+	if (err) {
+		dentry = ERR_PTR(err);
+	} else {
+		inode = hostfs_iget(dentry->d_sb, file);
+		d_drop(dentry);
+		dentry = d_splice_alias(inode, dentry);
+	}
 	__putname(file);
-	return ERR_PTR(err);
+	return dentry;
 }
 
 static int hostfs_rmdir(struct inode *ino, struct dentry *dentry)
-- 
2.47.1
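The return convention that the hostfs patch relies on can be modelled in user space. The following is only a minimal sketch, not kernel code: `toy_dentry`, `toy_mkdir()` and `toy_pick_dentry()` are hypothetical names, and the `toy_err_ptr()` helpers are simplified stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(). It assumes the convention described in this series: an error pointer on failure, NULL when the dentry that was passed in has been used, and a different dentry otherwise.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel's ERR_PTR helpers. */
#define TOY_MAX_ERRNO 4095

static inline void *toy_err_ptr(long err) { return (void *)err; }
static inline long toy_ptr_err(const void *p) { return (long)p; }
static inline int toy_is_err(const void *p)
{
	/* The top TOY_MAX_ERRNO addresses are reserved for error codes. */
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* Hypothetical, heavily reduced dentry. */
struct toy_dentry {
	const char *name;
};

/*
 * Hypothetical ->mkdir(): a non-zero 'err' simulates a failure, and a
 * non-NULL 'alias' simulates the filesystem finding that the new
 * directory already has a different dentry attached.
 */
static struct toy_dentry *toy_mkdir(struct toy_dentry *dentry, int err,
				    struct toy_dentry *alias)
{
	(void)dentry;
	if (err)
		return toy_err_ptr(-err);
	return alias;	/* NULL means "the dentry you passed in was used" */
}

/* Caller side: pick the dentry that now names the new directory. */
static struct toy_dentry *toy_pick_dentry(struct toy_dentry *dentry,
					  struct toy_dentry *ret)
{
	if (toy_is_err(ret))
		return ret;		/* propagate the error pointer */
	return ret ? ret : dentry;	/* NULL: keep the original dentry */
}
```

The point of the three-way result is that a caller no longer needs its own lookup to find out which dentry ended up attached to the new directory; it only has to distinguish error, NULL and "some other dentry".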
* Re: [PATCH 2/6] hostfs: store inode in dentry after mkdir if possible.
From: Jeff Layton @ 2025-02-21 13:17 UTC
To: NeilBrown, Alexander Viro, Christian Brauner, Jan Kara, Miklos Szeredi,
  Xiubo Li, Ilya Dryomov, Richard Weinberger, Anton Ivanov, Johannes Berg,
  Trond Myklebust, Anna Schumaker, Chuck Lever, Olga Kornievskaia, Dai Ngo,
  Tom Talpey, Sergey Senozhatsky
Cc: linux-fsdevel, linux-kernel, linux-cifs, linux-nfs, linux-um,
  ceph-devel, netfs

On Fri, 2025-02-21 at 10:36 +1100, NeilBrown wrote:
> After handling a mkdir, get the inode for the name and use
> d_splice_alias() to store the correct dentry in the dcache.
>
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  fs/hostfs/hostfs_kern.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c
> index ccbb48fe830d..a2c6b9051c5b 100644
> --- a/fs/hostfs/hostfs_kern.c
> +++ b/fs/hostfs/hostfs_kern.c
> @@ -682,14 +682,22 @@ static int hostfs_symlink(struct mnt_idmap *idmap, struct inode *ino,
>  static struct dentry *hostfs_mkdir(struct mnt_idmap *idmap, struct inode *ino,
>  				   struct dentry *dentry, umode_t mode)
>  {
> +	struct inode *inode;
>  	char *file;
>  	int err;
>  
>  	if ((file = dentry_name(dentry)) == NULL)
>  		return ERR_PTR(-ENOMEM);
>  	err = do_mkdir(file, mode);
> +	if (err) {
> +		dentry = ERR_PTR(err);
> +	} else {
> +		inode = hostfs_iget(dentry->d_sb, file);
> +		d_drop(dentry);
> +		dentry = d_splice_alias(inode, dentry);
> +	}
>  	__putname(file);
> -	return ERR_PTR(err);
> +	return dentry;
>  }
>  
>  static int hostfs_rmdir(struct inode *ino, struct dentry *dentry)

Reviewed-by: Jeff Layton <jlayton@kernel.org>
* [PATCH 3/6] ceph: return the correct dentry on mkdir
From: NeilBrown @ 2025-02-20 23:36 UTC
To: Alexander Viro, Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li,
  Ilya Dryomov, Richard Weinberger, Anton Ivanov, Johannes Berg,
  Trond Myklebust, Anna Schumaker, Chuck Lever, Jeff Layton,
  Olga Kornievskaia, Dai Ngo, Tom Talpey, Sergey Senozhatsky
Cc: linux-fsdevel, linux-kernel, linux-cifs, linux-nfs, linux-um,
  ceph-devel, netfs

ceph already splices the correct dentry (in splice_dentry()) from the
result of mkdir but does nothing more with it.

Now that ->mkdir can return a dentry, return the correct dentry.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/ceph/dir.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 39e0f240de06..c1a1c168bb27 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -1099,6 +1099,7 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 	struct ceph_client *cl = mdsc->fsc->client;
 	struct ceph_mds_request *req;
 	struct ceph_acl_sec_ctx as_ctx = {};
+	struct dentry *ret = NULL;
 	int err;
 	int op;
 
@@ -1166,14 +1167,20 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 	    !req->r_reply_info.head->is_dentry)
 		err = ceph_handle_notrace_create(dir, dentry);
 out_req:
+	if (!err && req->r_dentry != dentry)
+		/* Some other dentry was spliced in */
+		ret = dget(req->r_dentry);
 	ceph_mdsc_put_request(req);
 out:
 	if (!err)
+		/* Should this use 'ret' ?? */
 		ceph_init_inode_acls(d_inode(dentry), &as_ctx);
 	else
 		d_drop(dentry);
 	ceph_release_acl_sec_ctx(&as_ctx);
-	return ERR_PTR(err);
+	if (err)
+		return ERR_PTR(err);
+	return ret;
 }
 
 static int ceph_link(struct dentry *old_dentry, struct inode *dir,
-- 
2.47.1
* Re: [PATCH 3/6] ceph: return the correct dentry on mkdir
  2025-02-20 23:36 ` [PATCH 3/6] ceph: return the correct dentry on mkdir NeilBrown
@ 2025-02-21 1:48   ` Viacheslav Dubeyko
  2025-02-24 2:15     ` NeilBrown
  2025-02-21 13:31   ` Jeff Layton
  1 sibling, 1 reply; 36+ messages in thread
From: Viacheslav Dubeyko @ 2025-02-21 1:48 UTC (permalink / raw)
  To: brauner@kernel.org, neilb@suse.de, idryomov@gmail.com, Xiubo Li,
	Olga Kornievskaia, Dai.Ngo@oracle.com, johannes@sipsolutions.net,
	chuck.lever@oracle.com, anna@kernel.org, jlayton@kernel.org,
	miklos@szeredi.hu, trondmy@kernel.org,
	anton.ivanov@cambridgegreys.com, jack@suse.cz, richard@nod.at,
	viro@zeniv.linux.org.uk, tom@talpey.com, senozhatsky@chromium.org
  Cc: ceph-devel@vger.kernel.org, netfs@lists.linux.dev,
	linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-um@lists.infradead.org,
	linux-cifs@vger.kernel.org

On Fri, 2025-02-21 at 10:36 +1100, NeilBrown wrote:
> ceph already splices the correct dentry (in splice_dentry()) from the
> result of mkdir but does nothing more with it.
> 
> Now that ->mkdir can return a dentry, return the correct dentry.
> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  fs/ceph/dir.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
> index 39e0f240de06..c1a1c168bb27 100644
> --- a/fs/ceph/dir.c
> +++ b/fs/ceph/dir.c
> @@ -1099,6 +1099,7 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
>  	struct ceph_client *cl = mdsc->fsc->client;
>  	struct ceph_mds_request *req;
>  	struct ceph_acl_sec_ctx as_ctx = {};
> +	struct dentry *ret = NULL;

I believe that it makes sense to initialize pointer by error here and always
return ret as output. If something goes wrong in the logic, then we already have
error.

>  	int err;
>  	int op;
>  
> @@ -1166,14 +1167,20 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
>  	    !req->r_reply_info.head->is_dentry)
>  		err = ceph_handle_notrace_create(dir, dentry);
>  out_req:
> +	if (!err && req->r_dentry != dentry)
> +		/* Some other dentry was spliced in */
> +		ret = dget(req->r_dentry);
>  	ceph_mdsc_put_request(req);
>  out:
>  	if (!err)
> +		/* Should this use 'ret' ?? */

Could we make a decision should or shouldn't? :)
It looks not good to leave this comment instead of proper implementation. Do we
have some obstacles to make this decision?

>  		ceph_init_inode_acls(d_inode(dentry), &as_ctx);
>  	else
>  		d_drop(dentry);
>  	ceph_release_acl_sec_ctx(&as_ctx);
> -	return ERR_PTR(err);
> +	if (err)
> +		return ERR_PTR(err);
> +	return ret;

What's about this?

return err ? ERR_PTR(err) : ret;

Thanks,
Slava.

>  }
>  
>  static int ceph_link(struct dentry *old_dentry, struct inode *dir,

^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 3/6] ceph: return the correct dentry on mkdir
  2025-02-21 1:48   ` Viacheslav Dubeyko
@ 2025-02-24 2:15     ` NeilBrown
  2025-02-24 22:09       ` Viacheslav Dubeyko
  0 siblings, 1 reply; 36+ messages in thread
From: NeilBrown @ 2025-02-24 2:15 UTC (permalink / raw)
  To: Viacheslav Dubeyko
  Cc: brauner@kernel.org, idryomov@gmail.com, Xiubo Li,
	Olga Kornievskaia, Dai.Ngo@oracle.com, johannes@sipsolutions.net,
	chuck.lever@oracle.com, anna@kernel.org, jlayton@kernel.org,
	miklos@szeredi.hu, trondmy@kernel.org,
	anton.ivanov@cambridgegreys.com, jack@suse.cz, richard@nod.at,
	viro@zeniv.linux.org.uk, tom@talpey.com, senozhatsky@chromium.org,
	ceph-devel@vger.kernel.org, netfs@lists.linux.dev,
	linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-um@lists.infradead.org,
	linux-cifs@vger.kernel.org

On Fri, 21 Feb 2025, Viacheslav Dubeyko wrote:
> On Fri, 2025-02-21 at 10:36 +1100, NeilBrown wrote:
> > ceph already splices the correct dentry (in splice_dentry()) from the
> > result of mkdir but does nothing more with it.
> > 
> > Now that ->mkdir can return a dentry, return the correct dentry.
> > 
> > Signed-off-by: NeilBrown <neilb@suse.de>
> > ---
> >  fs/ceph/dir.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
> > index 39e0f240de06..c1a1c168bb27 100644
> > --- a/fs/ceph/dir.c
> > +++ b/fs/ceph/dir.c
> > @@ -1099,6 +1099,7 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> >  	struct ceph_client *cl = mdsc->fsc->client;
> >  	struct ceph_mds_request *req;
> >  	struct ceph_acl_sec_ctx as_ctx = {};
> > +	struct dentry *ret = NULL;
> 
> I believe that it makes sense to initialize pointer by error here and always
> return ret as output. If something goes wrong in the logic, then we already have
> error.

I'm not certain that I understand, but I have made a change which seems
to be consistent with the above and included it below. Please let me
know if it is what you intended.

> 
> >  	int err;
> >  	int op;
> >  
> > @@ -1166,14 +1167,20 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> >  	    !req->r_reply_info.head->is_dentry)
> >  		err = ceph_handle_notrace_create(dir, dentry);
> >  out_req:
> > +	if (!err && req->r_dentry != dentry)
> > +		/* Some other dentry was spliced in */
> > +		ret = dget(req->r_dentry);
> >  	ceph_mdsc_put_request(req);
> >  out:
> >  	if (!err)
> > +		/* Should this use 'ret' ?? */
> 
> Could we make a decision should or shouldn't? :)
> It looks not good to leave this comment instead of proper implementation. Do we
> have some obstacles to make this decision?

I suspect we should use ret, but I didn't want to make a change which
wasn't directly required by my needed. So I highlighted this which
looks to me like a possible bug, hoping that someone more familiar with
the code would give an opinion. Do you agree that 'ret' (i.e.
->r_dentry) should be used when ret is not NULL?

> 
> >  		ceph_init_inode_acls(d_inode(dentry), &as_ctx);
> >  	else
> >  		d_drop(dentry);
> >  	ceph_release_acl_sec_ctx(&as_ctx);
> > -	return ERR_PTR(err);
> > +	if (err)
> > +		return ERR_PTR(err);
> > +	return ret;
> 
> What's about this?
> 
> return err ? ERR_PTR(err) : ret;

We could do that, but you said above that you thought we should always
return 'ret' - which does make some sense.

What do you think of the following alternate patch?

Thanks,
NeilBrown

diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 39e0f240de06..d2e5c557df83 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -1099,6 +1099,7 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 	struct ceph_client *cl = mdsc->fsc->client;
 	struct ceph_mds_request *req;
 	struct ceph_acl_sec_ctx as_ctx = {};
+	struct dentry *ret;
 	int err;
 	int op;
 
@@ -1116,32 +1117,32 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 		      ceph_vinop(dir), dentry, dentry, mode);
 		op = CEPH_MDS_OP_MKDIR;
 	} else {
-		err = -EROFS;
+		ret = ERR_PTR(-EROFS);
 		goto out;
 	}
 
 	if (op == CEPH_MDS_OP_MKDIR &&
 	    ceph_quota_is_max_files_exceeded(dir)) {
-		err = -EDQUOT;
+		ret = ERR_PTR(-EDQUOT);
 		goto out;
 	}
 	if ((op == CEPH_MDS_OP_MKSNAP) && IS_ENCRYPTED(dir) &&
 	    !fscrypt_has_encryption_key(dir)) {
-		err = -ENOKEY;
+		ret = ERR_PTR(-ENOKEY);
 		goto out;
 	}
 
 
 	req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
 	if (IS_ERR(req)) {
-		err = PTR_ERR(req);
+		ret = ERR_CAST(req);
 		goto out;
 	}
 
 	mode |= S_IFDIR;
 	req->r_new_inode = ceph_new_inode(dir, dentry, &mode, &as_ctx);
 	if (IS_ERR(req->r_new_inode)) {
-		err = PTR_ERR(req->r_new_inode);
+		ret = ERR_CAST(req->r_new_inode);
 		req->r_new_inode = NULL;
 		goto out_req;
 	}
@@ -1165,15 +1166,23 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 	    !req->r_reply_info.head->is_target &&
 	    !req->r_reply_info.head->is_dentry)
 		err = ceph_handle_notrace_create(dir, dentry);
+	ret = ERR_PTR(err);
 out_req:
+	if (!IS_ERR(ret) && req->r_dentry != dentry)
+		/* Some other dentry was spliced in */
+		ret = dget(req->r_dentry);
 	ceph_mdsc_put_request(req);
 out:
-	if (!err)
-		ceph_init_inode_acls(d_inode(dentry), &as_ctx);
-	else
+	if (!IS_ERR(ret)) {
+		if (ret)
+			ceph_init_inode_acls(d_inode(ret), &as_ctx);
+		else
+			ceph_init_inode_acls(d_inode(dentry), &as_ctx);
+	} else {
 		d_drop(dentry);
+	}
 	ceph_release_acl_sec_ctx(&as_ctx);
-	return ERR_PTR(err);
+	return ret;
 }
 
 static int ceph_link(struct dentry *old_dentry, struct inode *dir,

^ permalink raw reply related [flat|nested] 36+ messages in thread
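The alternate patch in the message above relies on the kernel's error-pointer helpers (ERR_PTR(), PTR_ERR(), IS_ERR(), ERR_CAST() from <linux/err.h>) to carry either an errno or a dentry in a single return value. What follows is only a minimal userspace sketch of that encoding, not the kernel's definitions: every `model_`-prefixed name is a hypothetical stand-in, and `model_finish()` is an assumed simplification of the tail of ceph_mkdir(). It is meant to show one detail the patch depends on: encoding an error of 0 yields NULL, and NULL does not count as an error.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; the kernel's real helpers are in <linux/err.h>. */
#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno (or 0) as a pointer value. */
static inline void *model_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno from an encoded pointer. */
static inline long model_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for the top MODEL_MAX_ERRNO values of the address space. */
static inline int model_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/*
 * Assumed simplification of the end of the alternate patch:
 * start from the encoded error, and only on success substitute
 * the spliced-in object if there is one.
 */
static inline void *model_finish(long err, void *spliced)
{
	void *ret = model_err_ptr(err);	/* err == 0 gives NULL */

	if (!model_is_err(ret) && spliced)
		ret = spliced;
	return ret;
}
```

With this encoding a function can return NULL for plain success, a valid pointer for success with a replacement object, and an encoded errno for failure, all without a separate `err` variable.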
* RE: [PATCH 3/6] ceph: return the correct dentry on mkdir
  2025-02-24 2:15     ` NeilBrown
@ 2025-02-24 22:09       ` Viacheslav Dubeyko
  2025-02-24 22:53         ` Jeff Layton
  2025-02-24 23:29         ` NeilBrown
  0 siblings, 2 replies; 36+ messages in thread
From: Viacheslav Dubeyko @ 2025-02-24 22:09 UTC (permalink / raw)
  To: neilb@suse.de
  Cc: brauner@kernel.org, Xiubo Li, idryomov@gmail.com,
	Olga Kornievskaia, linux-cifs@vger.kernel.org, Dai.Ngo@oracle.com,
	linux-um@lists.infradead.org, linux-kernel@vger.kernel.org,
	johannes@sipsolutions.net, chuck.lever@oracle.com,
	jlayton@kernel.org, anna@kernel.org, miklos@szeredi.hu,
	trondmy@kernel.org, viro@zeniv.linux.org.uk, jack@suse.cz,
	tom@talpey.com, richard@nod.at, anton.ivanov@cambridgegreys.com,
	linux-fsdevel@vger.kernel.org, netfs@lists.linux.dev,
	linux-nfs@vger.kernel.org, ceph-devel@vger.kernel.org,
	senozhatsky@chromium.org

On Mon, 2025-02-24 at 13:15 +1100, NeilBrown wrote:
> On Fri, 21 Feb 2025, Viacheslav Dubeyko wrote:
> > On Fri, 2025-02-21 at 10:36 +1100, NeilBrown wrote:
> > > ceph already splices the correct dentry (in splice_dentry()) from the
> > > result of mkdir but does nothing more with it.
> > > 
> > > Now that ->mkdir can return a dentry, return the correct dentry.
> > > 
> > > Signed-off-by: NeilBrown <neilb@suse.de>
> > > ---
> > >  fs/ceph/dir.c | 9 ++++++++-
> > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
> > > index 39e0f240de06..c1a1c168bb27 100644
> > > --- a/fs/ceph/dir.c
> > > +++ b/fs/ceph/dir.c
> > > @@ -1099,6 +1099,7 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> > >  	struct ceph_client *cl = mdsc->fsc->client;
> > >  	struct ceph_mds_request *req;
> > >  	struct ceph_acl_sec_ctx as_ctx = {};
> > > +	struct dentry *ret = NULL;
> > 
> > I believe that it makes sense to initialize pointer by error here and always
> > return ret as output. If something goes wrong in the logic, then we already have
> > error.
> 
> I'm not certain that I understand, but I have made a change which seems
> to be consistent with the above and included it below. Please let me
> know if it is what you intended.
> 
> > 
> > >  	int err;
> > >  	int op;
> > >  
> > > @@ -1166,14 +1167,20 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> > >  	    !req->r_reply_info.head->is_dentry)
> > >  		err = ceph_handle_notrace_create(dir, dentry);
> > >  out_req:
> > > +	if (!err && req->r_dentry != dentry)
> > > +		/* Some other dentry was spliced in */
> > > +		ret = dget(req->r_dentry);
> > >  	ceph_mdsc_put_request(req);
> > >  out:
> > >  	if (!err)
> > > +		/* Should this use 'ret' ?? */
> > 
> > Could we make a decision should or shouldn't? :)
> > It looks not good to leave this comment instead of proper implementation. Do we
> > have some obstacles to make this decision?
> 
> I suspect we should use ret, but I didn't want to make a change which
> wasn't directly required by my needed. So I highlighted this which
> looks to me like a possible bug, hoping that someone more familiar with
> the code would give an opinion. Do you agree that 'ret' (i.e.
> ->r_dentry) should be used when ret is not NULL?
> 

I think if we are going to return ret as a dentry, then it makes sense to call
the ceph_init_inode_acls() for d_inode(ret). I don't see the point to call
ceph_init_inode_acls() for d_inode(dentry) then.

> > 
> > >  		ceph_init_inode_acls(d_inode(dentry), &as_ctx);
> > >  	else
> > >  		d_drop(dentry);
> > >  	ceph_release_acl_sec_ctx(&as_ctx);
> > > -	return ERR_PTR(err);
> > > +	if (err)
> > > +		return ERR_PTR(err);
> > > +	return ret;
> > 
> > What's about this?
> > 
> > return err ? ERR_PTR(err) : ret;
> 
> We could do that, but you said above that you thought we should always
> return 'ret' - which does make some sense.
> 
> What do you think of the following alternate patch?
> 

Patch looks good to me. Thanks.

Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>

> Thanks,
> NeilBrown
> 
> diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
> index 39e0f240de06..d2e5c557df83 100644
> --- a/fs/ceph/dir.c
> +++ b/fs/ceph/dir.c
> @@ -1099,6 +1099,7 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
>  	struct ceph_client *cl = mdsc->fsc->client;
>  	struct ceph_mds_request *req;
>  	struct ceph_acl_sec_ctx as_ctx = {};
> +	struct dentry *ret;
>  	int err;
>  	int op;
>  
> @@ -1116,32 +1117,32 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
>  		      ceph_vinop(dir), dentry, dentry, mode);
>  		op = CEPH_MDS_OP_MKDIR;
>  	} else {
> -		err = -EROFS;
> +		ret = ERR_PTR(-EROFS);
>  		goto out;
>  	}
>  
>  	if (op == CEPH_MDS_OP_MKDIR &&
>  	    ceph_quota_is_max_files_exceeded(dir)) {
> -		err = -EDQUOT;
> +		ret = ERR_PTR(-EDQUOT);
>  		goto out;
>  	}
>  	if ((op == CEPH_MDS_OP_MKSNAP) && IS_ENCRYPTED(dir) &&
>  	    !fscrypt_has_encryption_key(dir)) {
> -		err = -ENOKEY;
> +		ret = ERR_PTR(-ENOKEY);
>  		goto out;
>  	}
>  
>  
>  	req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
>  	if (IS_ERR(req)) {
> -		err = PTR_ERR(req);
> +		ret = ERR_CAST(req);
>  		goto out;
>  	}
>  
>  	mode |= S_IFDIR;
>  	req->r_new_inode = ceph_new_inode(dir, dentry, &mode, &as_ctx);
>  	if (IS_ERR(req->r_new_inode)) {
> -		err = PTR_ERR(req->r_new_inode);
> +		ret = ERR_CAST(req->r_new_inode);
>  		req->r_new_inode = NULL;
>  		goto out_req;
>  	}
> @@ -1165,15 +1166,23 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
>  	    !req->r_reply_info.head->is_target &&
>  	    !req->r_reply_info.head->is_dentry)
>  		err = ceph_handle_notrace_create(dir, dentry);
> +	ret = ERR_PTR(err);
>  out_req:
> +	if (!IS_ERR(ret) && req->r_dentry != dentry)
> +		/* Some other dentry was spliced in */
> +		ret = dget(req->r_dentry);
>  	ceph_mdsc_put_request(req);
>  out:
> -	if (!err)
> -		ceph_init_inode_acls(d_inode(dentry), &as_ctx);
> -	else
> +	if (!IS_ERR(ret)) {
> +		if (ret)
> +			ceph_init_inode_acls(d_inode(ret), &as_ctx);
> +		else
> +			ceph_init_inode_acls(d_inode(dentry), &as_ctx);
> +	} else {
>  		d_drop(dentry);
> +	}
>  	ceph_release_acl_sec_ctx(&as_ctx);
> -	return ERR_PTR(err);
> +	return ret;
>  }
>  
>  static int ceph_link(struct dentry *old_dentry, struct inode *dir,
> 

Thanks,
Slava.

^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 3/6] ceph: return the correct dentry on mkdir
  2025-02-24 22:09       ` Viacheslav Dubeyko
@ 2025-02-24 22:53         ` Jeff Layton
  2025-02-24 23:29         ` NeilBrown
  1 sibling, 0 replies; 36+ messages in thread
From: Jeff Layton @ 2025-02-24 22:53 UTC (permalink / raw)
  To: Viacheslav Dubeyko, neilb@suse.de
  Cc: brauner@kernel.org, Xiubo Li, idryomov@gmail.com,
	Olga Kornievskaia, linux-cifs@vger.kernel.org, Dai.Ngo@oracle.com,
	linux-um@lists.infradead.org, linux-kernel@vger.kernel.org,
	johannes@sipsolutions.net, chuck.lever@oracle.com,
	anna@kernel.org, miklos@szeredi.hu, trondmy@kernel.org,
	viro@zeniv.linux.org.uk, jack@suse.cz, tom@talpey.com,
	richard@nod.at, anton.ivanov@cambridgegreys.com,
	linux-fsdevel@vger.kernel.org, netfs@lists.linux.dev,
	linux-nfs@vger.kernel.org, ceph-devel@vger.kernel.org,
	senozhatsky@chromium.org

On Mon, 2025-02-24 at 22:09 +0000, Viacheslav Dubeyko wrote:
> On Mon, 2025-02-24 at 13:15 +1100, NeilBrown wrote:
> > On Fri, 21 Feb 2025, Viacheslav Dubeyko wrote:
> > > On Fri, 2025-02-21 at 10:36 +1100, NeilBrown wrote:
> > > > ceph already splices the correct dentry (in splice_dentry()) from the
> > > > result of mkdir but does nothing more with it.
> > > > 
> > > > Now that ->mkdir can return a dentry, return the correct dentry.
> > > > 
> > > > Signed-off-by: NeilBrown <neilb@suse.de>
> > > > ---
> > > >  fs/ceph/dir.c | 9 ++++++++-
> > > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
> > > > index 39e0f240de06..c1a1c168bb27 100644
> > > > --- a/fs/ceph/dir.c
> > > > +++ b/fs/ceph/dir.c
> > > > @@ -1099,6 +1099,7 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> > > >  	struct ceph_client *cl = mdsc->fsc->client;
> > > >  	struct ceph_mds_request *req;
> > > >  	struct ceph_acl_sec_ctx as_ctx = {};
> > > > +	struct dentry *ret = NULL;
> > > 
> > > I believe that it makes sense to initialize pointer by error here and always
> > > return ret as output. If something goes wrong in the logic, then we already have
> > > error.
> > 
> > I'm not certain that I understand, but I have made a change which seems
> > to be consistent with the above and included it below. Please let me
> > know if it is what you intended.
> > 
> > > 
> > > >  	int err;
> > > >  	int op;
> > > >  
> > > > @@ -1166,14 +1167,20 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> > > >  	    !req->r_reply_info.head->is_dentry)
> > > >  		err = ceph_handle_notrace_create(dir, dentry);
> > > >  out_req:
> > > > +	if (!err && req->r_dentry != dentry)
> > > > +		/* Some other dentry was spliced in */
> > > > +		ret = dget(req->r_dentry);
> > > >  	ceph_mdsc_put_request(req);
> > > >  out:
> > > >  	if (!err)
> > > > +		/* Should this use 'ret' ?? */
> > > 
> > > Could we make a decision should or shouldn't? :)
> > > It looks not good to leave this comment instead of proper implementation. Do we
> > > have some obstacles to make this decision?
> > 
> > I suspect we should use ret, but I didn't want to make a change which
> > wasn't directly required by my needed. So I highlighted this which
> > looks to me like a possible bug, hoping that someone more familiar with
> > the code would give an opinion. Do you agree that 'ret' (i.e.
> > ->r_dentry) should be used when ret is not NULL?
> > 
> 
> I think if we are going to return ret as a dentry, then it makes sense to call
> the ceph_init_inode_acls() for d_inode(ret). I don't see the point to call
> ceph_init_inode_acls() for d_inode(dentry) then.
> 

My assumption when looking at this was that they should point to the same
inode. That said, working with d_inode(ret) after that point is less
confusing to the casual reader.

> > > 
> > > >  		ceph_init_inode_acls(d_inode(dentry), &as_ctx);
> > > >  	else
> > > >  		d_drop(dentry);
> > > >  	ceph_release_acl_sec_ctx(&as_ctx);
> > > > -	return ERR_PTR(err);
> > > > +	if (err)
> > > > +		return ERR_PTR(err);
> > > > +	return ret;
> > > 
> > > What's about this?
> > > 
> > > return err ? ERR_PTR(err) : ret;
> > 
> > We could do that, but you said above that you thought we should always
> > return 'ret' - which does make some sense.
> > 
> > What do you think of the following alternate patch?
> > 
> 
> Patch looks good to me. Thanks.
> 
> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
> 
> > Thanks,
> > NeilBrown
> > 
> > diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
> > index 39e0f240de06..d2e5c557df83 100644
> > --- a/fs/ceph/dir.c
> > +++ b/fs/ceph/dir.c
> > @@ -1099,6 +1099,7 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> >  	struct ceph_client *cl = mdsc->fsc->client;
> >  	struct ceph_mds_request *req;
> >  	struct ceph_acl_sec_ctx as_ctx = {};
> > +	struct dentry *ret;
> >  	int err;
> >  	int op;
> >  
> > @@ -1116,32 +1117,32 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> >  		      ceph_vinop(dir), dentry, dentry, mode);
> >  		op = CEPH_MDS_OP_MKDIR;
> >  	} else {
> > -		err = -EROFS;
> > +		ret = ERR_PTR(-EROFS);
> >  		goto out;
> >  	}
> >  
> >  	if (op == CEPH_MDS_OP_MKDIR &&
> >  	    ceph_quota_is_max_files_exceeded(dir)) {
> > -		err = -EDQUOT;
> > +		ret = ERR_PTR(-EDQUOT);
> >  		goto out;
> >  	}
> >  	if ((op == CEPH_MDS_OP_MKSNAP) && IS_ENCRYPTED(dir) &&
> >  	    !fscrypt_has_encryption_key(dir)) {
> > -		err = -ENOKEY;
> > +		ret = ERR_PTR(-ENOKEY);
> >  		goto out;
> >  	}
> >  
> >  
> >  	req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
> >  	if (IS_ERR(req)) {
> > -		err = PTR_ERR(req);
> > +		ret = ERR_CAST(req);
> >  		goto out;
> >  	}
> >  
> >  	mode |= S_IFDIR;
> >  	req->r_new_inode = ceph_new_inode(dir, dentry, &mode, &as_ctx);
> >  	if (IS_ERR(req->r_new_inode)) {
> > -		err = PTR_ERR(req->r_new_inode);
> > +		ret = ERR_CAST(req->r_new_inode);
> >  		req->r_new_inode = NULL;
> >  		goto out_req;
> >  	}
> > @@ -1165,15 +1166,23 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> >  	    !req->r_reply_info.head->is_target &&
> >  	    !req->r_reply_info.head->is_dentry)
> >  		err = ceph_handle_notrace_create(dir, dentry);
> > +	ret = ERR_PTR(err);
> >  out_req:
> > +	if (!IS_ERR(ret) && req->r_dentry != dentry)
> > +		/* Some other dentry was spliced in */
> > +		ret = dget(req->r_dentry);
> >  	ceph_mdsc_put_request(req);
> >  out:
> > -	if (!err)
> > -		ceph_init_inode_acls(d_inode(dentry), &as_ctx);
> > -	else
> > +	if (!IS_ERR(ret)) {
> > +		if (ret)
> > +			ceph_init_inode_acls(d_inode(ret), &as_ctx);
> > +		else
> > +			ceph_init_inode_acls(d_inode(dentry), &as_ctx);
> > +	} else {
> >  		d_drop(dentry);
> > +	}
> >  	ceph_release_acl_sec_ctx(&as_ctx);
> > -	return ERR_PTR(err);
> > +	return ret;
> >  }
> >  
> >  static int ceph_link(struct dentry *old_dentry, struct inode *dir,
> > 
> 
> Thanks,
> Slava.
> 

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH 3/6] ceph: return the correct dentry on mkdir
  2025-02-24 22:09       ` Viacheslav Dubeyko
  2025-02-24 22:53         ` Jeff Layton
@ 2025-02-24 23:29         ` NeilBrown
  1 sibling, 0 replies; 36+ messages in thread
From: NeilBrown @ 2025-02-24 23:29 UTC (permalink / raw)
  To: Viacheslav Dubeyko
  Cc: brauner@kernel.org, Xiubo Li, idryomov@gmail.com,
	Olga Kornievskaia, linux-cifs@vger.kernel.org, Dai.Ngo@oracle.com,
	linux-um@lists.infradead.org, linux-kernel@vger.kernel.org,
	johannes@sipsolutions.net, chuck.lever@oracle.com,
	jlayton@kernel.org, anna@kernel.org, miklos@szeredi.hu,
	trondmy@kernel.org, viro@zeniv.linux.org.uk, jack@suse.cz,
	tom@talpey.com, richard@nod.at, anton.ivanov@cambridgegreys.com,
	linux-fsdevel@vger.kernel.org, netfs@lists.linux.dev,
	linux-nfs@vger.kernel.org, ceph-devel@vger.kernel.org,
	senozhatsky@chromium.org

On Tue, 25 Feb 2025, Viacheslav Dubeyko wrote:
> On Mon, 2025-02-24 at 13:15 +1100, NeilBrown wrote:
> > On Fri, 21 Feb 2025, Viacheslav Dubeyko wrote:
> > > On Fri, 2025-02-21 at 10:36 +1100, NeilBrown wrote:
> > > > ceph already splices the correct dentry (in splice_dentry()) from the
> > > > result of mkdir but does nothing more with it.
> > > > 
> > > > Now that ->mkdir can return a dentry, return the correct dentry.
> > > > 
> > > > Signed-off-by: NeilBrown <neilb@suse.de>
> > > > ---
> > > >  fs/ceph/dir.c | 9 ++++++++-
> > > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
> > > > index 39e0f240de06..c1a1c168bb27 100644
> > > > --- a/fs/ceph/dir.c
> > > > +++ b/fs/ceph/dir.c
> > > > @@ -1099,6 +1099,7 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> > > >  	struct ceph_client *cl = mdsc->fsc->client;
> > > >  	struct ceph_mds_request *req;
> > > >  	struct ceph_acl_sec_ctx as_ctx = {};
> > > > +	struct dentry *ret = NULL;
> > > 
> > > I believe that it makes sense to initialize pointer by error here and always
> > > return ret as output. If something goes wrong in the logic, then we already have
> > > error.
> > 
> > I'm not certain that I understand, but I have made a change which seems
> > to be consistent with the above and included it below. Please let me
> > know if it is what you intended.
> > 
> > > 
> > > >  	int err;
> > > >  	int op;
> > > >  
> > > > @@ -1166,14 +1167,20 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> > > >  	    !req->r_reply_info.head->is_dentry)
> > > >  		err = ceph_handle_notrace_create(dir, dentry);
> > > >  out_req:
> > > > +	if (!err && req->r_dentry != dentry)
> > > > +		/* Some other dentry was spliced in */
> > > > +		ret = dget(req->r_dentry);
> > > >  	ceph_mdsc_put_request(req);
> > > >  out:
> > > >  	if (!err)
> > > > +		/* Should this use 'ret' ?? */
> > > 
> > > Could we make a decision should or shouldn't? :)
> > > It looks not good to leave this comment instead of proper implementation. Do we
> > > have some obstacles to make this decision?
> > 
> > I suspect we should use ret, but I didn't want to make a change which
> > wasn't directly required by my needed. So I highlighted this which
> > looks to me like a possible bug, hoping that someone more familiar with
> > the code would give an opinion. Do you agree that 'ret' (i.e.
> > ->r_dentry) should be used when ret is not NULL?
> > 
> 
> I think if we are going to return ret as a dentry, then it makes sense to call
> the ceph_init_inode_acls() for d_inode(ret). I don't see the point to call
> ceph_init_inode_acls() for d_inode(dentry) then.

If the mkdir used the original dentry, then ->mkdir returns NULL so ret
is NULL. If the mkdir used a different dentry it returns that, so ret
is not NULL.

I'll try to re-organise the code so that "dentry" is the correct dentry
on success, and "ret" is the returned dentry, which might be NULL.

Thanks,
NeilBrown

> 
> > > 
> > > >  		ceph_init_inode_acls(d_inode(dentry), &as_ctx);
> > > >  	else
> > > >  		d_drop(dentry);
> > > >  	ceph_release_acl_sec_ctx(&as_ctx);
> > > > -	return ERR_PTR(err);
> > > > +	if (err)
> > > > +		return ERR_PTR(err);
> > > > +	return ret;
> > > 
> > > What's about this?
> > > 
> > > return err ? ERR_PTR(err) : ret;
> > 
> > We could do that, but you said above that you thought we should always
> > return 'ret' - which does make some sense.
> > 
> > What do you think of the following alternate patch?
> > 
> 
> Patch looks good to me. Thanks.
> 
> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
> 
> > Thanks,
> > NeilBrown
> > 
> > diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
> > index 39e0f240de06..d2e5c557df83 100644
> > --- a/fs/ceph/dir.c
> > +++ b/fs/ceph/dir.c
> > @@ -1099,6 +1099,7 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> >  	struct ceph_client *cl = mdsc->fsc->client;
> >  	struct ceph_mds_request *req;
> >  	struct ceph_acl_sec_ctx as_ctx = {};
> > +	struct dentry *ret;
> >  	int err;
> >  	int op;
> >  
> > @@ -1116,32 +1117,32 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> >  		      ceph_vinop(dir), dentry, dentry, mode);
> >  		op = CEPH_MDS_OP_MKDIR;
> >  	} else {
> > -		err = -EROFS;
> > +		ret = ERR_PTR(-EROFS);
> >  		goto out;
> >  	}
> >  
> >  	if (op == CEPH_MDS_OP_MKDIR &&
> >  	    ceph_quota_is_max_files_exceeded(dir)) {
> > -		err = -EDQUOT;
> > +		ret = ERR_PTR(-EDQUOT);
> >  		goto out;
> >  	}
> >  	if ((op == CEPH_MDS_OP_MKSNAP) && IS_ENCRYPTED(dir) &&
> >  	    !fscrypt_has_encryption_key(dir)) {
> > -		err = -ENOKEY;
> > +		ret = ERR_PTR(-ENOKEY);
> >  		goto out;
> >  	}
> >  
> >  
> >  	req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
> >  	if (IS_ERR(req)) {
> > -		err = PTR_ERR(req);
> > +		ret = ERR_CAST(req);
> >  		goto out;
> >  	}
> >  
> >  	mode |= S_IFDIR;
> >  	req->r_new_inode = ceph_new_inode(dir, dentry, &mode, &as_ctx);
> >  	if (IS_ERR(req->r_new_inode)) {
> > -		err = PTR_ERR(req->r_new_inode);
> > +		ret = ERR_CAST(req->r_new_inode);
> >  		req->r_new_inode = NULL;
> >  		goto out_req;
> >  	}
> > @@ -1165,15 +1166,23 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> >  	    !req->r_reply_info.head->is_target &&
> >  	    !req->r_reply_info.head->is_dentry)
> >  		err = ceph_handle_notrace_create(dir, dentry);
> > +	ret = ERR_PTR(err);
> >  out_req:
> > +	if (!IS_ERR(ret) && req->r_dentry != dentry)
> > +		/* Some other dentry was spliced in */
> > +		ret = dget(req->r_dentry);
> >  	ceph_mdsc_put_request(req);
> >  out:
> > -	if (!err)
> > -		ceph_init_inode_acls(d_inode(dentry), &as_ctx);
> > -	else
> > +	if (!IS_ERR(ret)) {
> > +		if (ret)
> > +			ceph_init_inode_acls(d_inode(ret), &as_ctx);
> > +		else
> > +			ceph_init_inode_acls(d_inode(dentry), &as_ctx);
> > +	} else {
> >  		d_drop(dentry);
> > +	}
> >  	ceph_release_acl_sec_ctx(&as_ctx);
> > -	return ERR_PTR(err);
> > +	return ret;
> >  }
> >  
> >  static int ceph_link(struct dentry *old_dentry, struct inode *dir,
> > 
> 
> Thanks,
> Slava.
> 
> 

^ permalink raw reply [flat|nested] 36+ messages in thread
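The reply above states the convention for the new ->mkdir return value: NULL when the dentry that was passed in was used, a different dentry when one was spliced in, and an error pointer on failure. A minimal userspace sketch of how a caller might interpret those three cases follows. It rests on assumptions: `struct model_dentry`, `model_is_err()` and `model_mkdir_result()` are hypothetical names invented for illustration, and the sketch ignores the reference counting (dget()/dput()) that real callers have to handle.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct dentry; only what the sketch needs. */
struct model_dentry {
	const char *name;
};

/* Same range test the kernel's IS_ERR() performs, restated for userspace. */
static inline int model_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-4095;
}

/*
 * Interpret a ->mkdir style return value:
 *   error pointer -> return the negative errno, *used is set to NULL
 *   NULL          -> the dentry that was passed in was used
 *   other pointer -> a different dentry was spliced in; use that one
 */
static inline int model_mkdir_result(struct model_dentry *passed_in,
				     struct model_dentry *ret,
				     struct model_dentry **used)
{
	if (model_is_err(ret)) {
		*used = NULL;
		return (int)(intptr_t)ret;
	}
	*used = ret ? ret : passed_in;
	return 0;
}
```

Folding the "which dentry" question into one helper like this is roughly the re-organisation the reply describes: after the call, a single variable names the correct dentry on success.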
* Re: [PATCH 3/6] ceph: return the correct dentry on mkdir
  2025-02-20 23:36 ` [PATCH 3/6] ceph: return the correct dentry on mkdir NeilBrown
  2025-02-21 1:48   ` Viacheslav Dubeyko
@ 2025-02-21 13:31   ` Jeff Layton
  1 sibling, 0 replies; 36+ messages in thread
From: Jeff Layton @ 2025-02-21 13:31 UTC (permalink / raw)
  To: NeilBrown, Alexander Viro, Christian Brauner, Jan Kara,
	Miklos Szeredi, Xiubo Li, Ilya Dryomov, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Trond Myklebust, Anna Schumaker,
	Chuck Lever, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Sergey Senozhatsky
  Cc: linux-fsdevel, linux-kernel, linux-cifs, linux-nfs, linux-um,
	ceph-devel, netfs

On Fri, 2025-02-21 at 10:36 +1100, NeilBrown wrote:
> ceph already splices the correct dentry (in splice_dentry()) from the
> result of mkdir but does nothing more with it.
> 
> Now that ->mkdir can return a dentry, return the correct dentry.
> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  fs/ceph/dir.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
> index 39e0f240de06..c1a1c168bb27 100644
> --- a/fs/ceph/dir.c
> +++ b/fs/ceph/dir.c
> @@ -1099,6 +1099,7 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
>  	struct ceph_client *cl = mdsc->fsc->client;
>  	struct ceph_mds_request *req;
>  	struct ceph_acl_sec_ctx as_ctx = {};
> +	struct dentry *ret = NULL;
>  	int err;
>  	int op;
>  
> @@ -1166,14 +1167,20 @@ static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
>  	    !req->r_reply_info.head->is_dentry)
>  		err = ceph_handle_notrace_create(dir, dentry);
>  out_req:
> +	if (!err && req->r_dentry != dentry)
> +		/* Some other dentry was spliced in */
> +		ret = dget(req->r_dentry);
>  	ceph_mdsc_put_request(req);
>  out:
>  	if (!err)
> +		/* Should this use 'ret' ?? */

Probably? Is there a guarantee that "dentry" will even have an inode
attached if it got replaced by an disconnected one in the dcache?

>  		ceph_init_inode_acls(d_inode(dentry), &as_ctx);
>  	else
>  		d_drop(dentry);
>  	ceph_release_acl_sec_ctx(&as_ctx);
> -	return ERR_PTR(err);
> +	if (err)
> +		return ERR_PTR(err);
> +	return ret;
>  }
>  
>  static int ceph_link(struct dentry *old_dentry, struct inode *dir,

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH 4/6] fuse: return correct dentry for ->mkdir
  2025-02-20 23:36 [PATCH 0/6] Change ->mkdir() and vfs_mkdir() to return a dentry NeilBrown
                   ` (2 preceding siblings ...)
  2025-02-20 23:36 ` [PATCH 3/6] ceph: return the correct dentry on mkdir NeilBrown
@ 2025-02-20 23:36 ` NeilBrown
  2025-02-21 13:39   ` Jeff Layton
  2025-02-22 4:24   ` Al Viro
  2025-02-20 23:36 ` [PATCH 5/6] nfs: change mkdir inode_operation to return alternate dentry if needed NeilBrown
  2025-02-20 23:36 ` [PATCH 6/6] VFS: Change vfs_mkdir() to return the dentry NeilBrown
  5 siblings, 2 replies; 36+ messages in thread
From: NeilBrown @ 2025-02-20 23:36 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Miklos Szeredi,
	Xiubo Li, Ilya Dryomov, Richard Weinberger, Anton Ivanov,
	Johannes Berg, Trond Myklebust, Anna Schumaker, Chuck Lever,
	Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Sergey Senozhatsky
  Cc: linux-fsdevel, linux-kernel, linux-cifs, linux-nfs, linux-um,
	ceph-devel, netfs

fuse already uses d_splice_alias() to ensure an appropriate dentry is
found for a newly created dentry. Now that ->mkdir can return that
dentry we do so.

This requires changing create_new_entry() to return a dentry and
handling that change in all callers.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/fuse/dir.c | 55 +++++++++++++++++++++++++++++++--------------------
 1 file changed, 34 insertions(+), 21 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 5bb65f38bfb8..8c44c9c73c38 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -781,9 +781,9 @@ static int fuse_atomic_open(struct inode *dir, struct dentry *entry,
 /*
  * Code shared between mknod, mkdir, symlink and link
  */
-static int create_new_entry(struct mnt_idmap *idmap, struct fuse_mount *fm,
-			    struct fuse_args *args, struct inode *dir,
-			    struct dentry *entry, umode_t mode)
+static struct dentry *create_new_entry(struct mnt_idmap *idmap, struct fuse_mount *fm,
+				       struct fuse_args *args, struct inode *dir,
+				       struct dentry *entry, umode_t mode)
 {
 	struct fuse_entry_out outarg;
 	struct inode *inode;
@@ -792,11 +792,11 @@ static int create_new_entry(struct mnt_idmap *idmap, struct fuse_mount *fm,
 	struct fuse_forget_link *forget;
 
 	if (fuse_is_bad(dir))
-		return -EIO;
+		return ERR_PTR(-EIO);
 
 	forget = fuse_alloc_forget();
 	if (!forget)
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 
 	memset(&outarg, 0, sizeof(outarg));
 	args->nodeid = get_node_id(dir);
@@ -826,29 +826,27 @@ static int create_new_entry(struct mnt_idmap *idmap, struct fuse_mount *fm,
 			  &outarg.attr, ATTR_TIMEOUT(&outarg), 0, 0);
 	if (!inode) {
 		fuse_queue_forget(fm->fc, forget, outarg.nodeid, 1);
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 	}
 	kfree(forget);
 
 	d_drop(entry);
 	d = d_splice_alias(inode, entry);
 	if (IS_ERR(d))
-		return PTR_ERR(d);
+		return d;
 
-	if (d) {
+	if (d)
 		fuse_change_entry_timeout(d, &outarg);
-		dput(d);
-	} else {
+	else
 		fuse_change_entry_timeout(entry, &outarg);
-	}
 	fuse_dir_changed(dir);
-	return 0;
+	return d;
 
  out_put_forget_req:
 	if (err == -EEXIST)
 		fuse_invalidate_entry(entry);
 	kfree(forget);
-	return err;
+	return ERR_PTR(err);
 }
 
 static int fuse_mknod(struct mnt_idmap *idmap, struct inode *dir,
@@ -856,6 +854,7 @@ static int fuse_mknod(struct mnt_idmap *idmap, struct inode *dir,
 {
 	struct fuse_mknod_in inarg;
 	struct fuse_mount *fm = get_fuse_mount(dir);
+	struct dentry *de;
 	FUSE_ARGS(args);
 
 	if (!fm->fc->dont_mask)
@@ -871,7 +870,12 @@ static int fuse_mknod(struct mnt_idmap *idmap, struct inode *dir,
 	args.in_args[0].value = &inarg;
 	args.in_args[1].size = entry->d_name.len + 1;
 	args.in_args[1].value = entry->d_name.name;
-	return create_new_entry(idmap, fm, &args, dir, entry, mode);
+	de = create_new_entry(idmap, fm, &args, dir, entry, mode);
+	if (IS_ERR(de))
+		return PTR_ERR(de);
+	if (de)
+		dput(de);
+	return 0;
 }
 
 static int fuse_create(struct mnt_idmap *idmap, struct inode *dir,
@@ -917,7 +921,7 @@ static struct dentry *fuse_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 	args.in_args[0].value = &inarg;
 	args.in_args[1].size = entry->d_name.len + 1;
 	args.in_args[1].value = entry->d_name.name;
-	return ERR_PTR(create_new_entry(idmap, fm, &args, dir, entry, S_IFDIR));
+	return create_new_entry(idmap, fm, &args, dir, entry, S_IFDIR);
 }
 
 static int fuse_symlink(struct mnt_idmap *idmap, struct inode *dir,
@@ -925,6 +929,7 @@ static int fuse_symlink(struct mnt_idmap *idmap, struct inode *dir,
 {
 	struct fuse_mount *fm = get_fuse_mount(dir);
 	unsigned len = strlen(link) + 1;
+	struct dentry *de;
 	FUSE_ARGS(args);
 
 	args.opcode = FUSE_SYMLINK;
@@ -934,7 +939,12 @@ static int fuse_symlink(struct mnt_idmap *idmap, struct inode *dir,
 	args.in_args[1].value = entry->d_name.name;
 	args.in_args[2].size = len;
 	args.in_args[2].value = link;
-	return create_new_entry(idmap, fm, &args, dir, entry, S_IFLNK);
+	de = create_new_entry(idmap, fm, &args, dir, entry, S_IFLNK);
+	if (IS_ERR(de))
+		return PTR_ERR(de);
+	if (de)
+		dput(de);
+	return 0;
 }
 
 void fuse_flush_time_update(struct inode *inode)
@@ -1117,7 +1127,7 @@ static int fuse_rename2(struct mnt_idmap *idmap, struct inode *olddir,
 static int fuse_link(struct dentry *entry, struct inode *newdir,
 		     struct dentry *newent)
 {
-	int err;
+	struct dentry *de;
 	struct fuse_link_in inarg;
 	struct inode *inode =
d_inode(entry); struct fuse_mount *fm = get_fuse_mount(inode); @@ -1131,13 +1141,16 @@ static int fuse_link(struct dentry *entry, struct inode *newdir, args.in_args[0].value = &inarg; args.in_args[1].size = newent->d_name.len + 1; args.in_args[1].value = newent->d_name.name; - err = create_new_entry(&invalid_mnt_idmap, fm, &args, newdir, newent, inode->i_mode); - if (!err) + de = create_new_entry(&invalid_mnt_idmap, fm, &args, newdir, newent, inode->i_mode); + if (!IS_ERR(de)) { + if (de) + dput(de); + de = NULL; fuse_update_ctime_in_cache(inode); - else if (err == -EINTR) + } else if (PTR_ERR(de) == -EINTR) fuse_invalidate_attr(inode); - return err; + return PTR_ERR(de); } static void fuse_fillattr(struct mnt_idmap *idmap, struct inode *inode, -- 2.47.1 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH 4/6] fuse: return correct dentry for ->mkdir 2025-02-20 23:36 ` [PATCH 4/6] fuse: return correct dentry for ->mkdir NeilBrown @ 2025-02-21 13:39 ` Jeff Layton 2025-02-22 4:24 ` Al Viro 1 sibling, 0 replies; 36+ messages in thread From: Jeff Layton @ 2025-02-21 13:39 UTC (permalink / raw) To: NeilBrown, Alexander Viro, Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li, Ilya Dryomov, Richard Weinberger, Anton Ivanov, Johannes Berg, Trond Myklebust, Anna Schumaker, Chuck Lever, Olga Kornievskaia, Dai Ngo, Tom Talpey, Sergey Senozhatsky Cc: linux-fsdevel, linux-kernel, linux-cifs, linux-nfs, linux-um, ceph-devel, netfs On Fri, 2025-02-21 at 10:36 +1100, NeilBrown wrote: > fuse already uses d_splice_alias() to ensure an appropriate dentry is > found for a newly created dentry. Now that ->mkdir can return that > dentry we do so. > > This requires changing create_new_entry() to return a dentry and > handling that change in all callers. > > Signed-off-by: NeilBrown <neilb@suse.de> > --- > fs/fuse/dir.c | 55 +++++++++++++++++++++++++++++++-------------------- > 1 file changed, 34 insertions(+), 21 deletions(-) > > diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c > index 5bb65f38bfb8..8c44c9c73c38 100644 > --- a/fs/fuse/dir.c > +++ b/fs/fuse/dir.c > @@ -781,9 +781,9 @@ static int fuse_atomic_open(struct inode *dir, struct dentry *entry, > /* > * Code shared between mknod, mkdir, symlink and link > */ > -static int create_new_entry(struct mnt_idmap *idmap, struct fuse_mount *fm, > - struct fuse_args *args, struct inode *dir, > - struct dentry *entry, umode_t mode) > +static struct dentry *create_new_entry(struct mnt_idmap *idmap, struct fuse_mount *fm, > + struct fuse_args *args, struct inode *dir, > + struct dentry *entry, umode_t mode) > { > struct fuse_entry_out outarg; > struct inode *inode; > @@ -792,11 +792,11 @@ static int create_new_entry(struct mnt_idmap *idmap, struct fuse_mount *fm, > struct fuse_forget_link *forget; > > if (fuse_is_bad(dir)) > - return 
-EIO; > + return ERR_PTR(-EIO); > > forget = fuse_alloc_forget(); > if (!forget) > - return -ENOMEM; > + return ERR_PTR(-ENOMEM); > > memset(&outarg, 0, sizeof(outarg)); > args->nodeid = get_node_id(dir); > @@ -826,29 +826,27 @@ static int create_new_entry(struct mnt_idmap *idmap, struct fuse_mount *fm, > &outarg.attr, ATTR_TIMEOUT(&outarg), 0, 0); > if (!inode) { > fuse_queue_forget(fm->fc, forget, outarg.nodeid, 1); > - return -ENOMEM; > + return ERR_PTR(-ENOMEM); > } > kfree(forget); > > d_drop(entry); > d = d_splice_alias(inode, entry); > if (IS_ERR(d)) > - return PTR_ERR(d); > + return d; > > - if (d) { > + if (d) > fuse_change_entry_timeout(d, &outarg); > - dput(d); > - } else { > + else > fuse_change_entry_timeout(entry, &outarg); > - } > fuse_dir_changed(dir); > - return 0; > + return d; > > out_put_forget_req: > if (err == -EEXIST) > fuse_invalidate_entry(entry); > kfree(forget); > - return err; > + return ERR_PTR(err); > } > > static int fuse_mknod(struct mnt_idmap *idmap, struct inode *dir, > @@ -856,6 +854,7 @@ static int fuse_mknod(struct mnt_idmap *idmap, struct inode *dir, > { > struct fuse_mknod_in inarg; > struct fuse_mount *fm = get_fuse_mount(dir); > + struct dentry *de; > FUSE_ARGS(args); > > if (!fm->fc->dont_mask) > @@ -871,7 +870,12 @@ static int fuse_mknod(struct mnt_idmap *idmap, struct inode *dir, > args.in_args[0].value = &inarg; > args.in_args[1].size = entry->d_name.len + 1; > args.in_args[1].value = entry->d_name.name; > - return create_new_entry(idmap, fm, &args, dir, entry, mode); > + de = create_new_entry(idmap, fm, &args, dir, entry, mode); > + if (IS_ERR(de)) > + return PTR_ERR(de); > + if (de) > + dput(de); > + return 0; > } > > static int fuse_create(struct mnt_idmap *idmap, struct inode *dir, > @@ -917,7 +921,7 @@ static struct dentry *fuse_mkdir(struct mnt_idmap *idmap, struct inode *dir, > args.in_args[0].value = &inarg; > args.in_args[1].size = entry->d_name.len + 1; > args.in_args[1].value = entry->d_name.name; > - return 
ERR_PTR(create_new_entry(idmap, fm, &args, dir, entry, S_IFDIR)); > + return create_new_entry(idmap, fm, &args, dir, entry, S_IFDIR); > } > > static int fuse_symlink(struct mnt_idmap *idmap, struct inode *dir, > @@ -925,6 +929,7 @@ static int fuse_symlink(struct mnt_idmap *idmap, struct inode *dir, > { > struct fuse_mount *fm = get_fuse_mount(dir); > unsigned len = strlen(link) + 1; > + struct dentry *de; > FUSE_ARGS(args); > > args.opcode = FUSE_SYMLINK; > @@ -934,7 +939,12 @@ static int fuse_symlink(struct mnt_idmap *idmap, struct inode *dir, > args.in_args[1].value = entry->d_name.name; > args.in_args[2].size = len; > args.in_args[2].value = link; > - return create_new_entry(idmap, fm, &args, dir, entry, S_IFLNK); > + de = create_new_entry(idmap, fm, &args, dir, entry, S_IFLNK); > + if (IS_ERR(de)) > + return PTR_ERR(de); > + if (de) > + dput(de); > + return 0; > } > > void fuse_flush_time_update(struct inode *inode) > @@ -1117,7 +1127,7 @@ static int fuse_rename2(struct mnt_idmap *idmap, struct inode *olddir, > static int fuse_link(struct dentry *entry, struct inode *newdir, > struct dentry *newent) > { > - int err; > + struct dentry *de; > struct fuse_link_in inarg; > struct inode *inode = d_inode(entry); > struct fuse_mount *fm = get_fuse_mount(inode); > @@ -1131,13 +1141,16 @@ static int fuse_link(struct dentry *entry, struct inode *newdir, > args.in_args[0].value = &inarg; > args.in_args[1].size = newent->d_name.len + 1; > args.in_args[1].value = newent->d_name.name; > - err = create_new_entry(&invalid_mnt_idmap, fm, &args, newdir, newent, inode->i_mode); > - if (!err) > + de = create_new_entry(&invalid_mnt_idmap, fm, &args, newdir, newent, inode->i_mode); > + if (!IS_ERR(de)) { > + if (de) > + dput(de); > + de = NULL; > fuse_update_ctime_in_cache(inode); > - else if (err == -EINTR) > + } else if (PTR_ERR(de) == -EINTR) > fuse_invalidate_attr(inode); > > - return err; > + return PTR_ERR(de); > } > > static void fuse_fillattr(struct mnt_idmap *idmap, struct 
inode *inode, Pretty straightforward. Reviewed-by: Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 4/6] fuse: return correct dentry for ->mkdir
  2025-02-20 23:36 ` [PATCH 4/6] fuse: return correct dentry for ->mkdir NeilBrown
  2025-02-21 13:39   ` Jeff Layton
@ 2025-02-22 4:24   ` Al Viro
  2025-02-24 2:26     ` NeilBrown
  1 sibling, 1 reply; 36+ messages in thread
From: Al Viro @ 2025-02-22 4:24 UTC
To: NeilBrown
Cc: Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li, Ilya Dryomov,
    Richard Weinberger, Anton Ivanov, Johannes Berg, Trond Myklebust,
    Anna Schumaker, Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo,
    Tom Talpey, Sergey Senozhatsky, linux-fsdevel, linux-kernel, linux-cifs,
    linux-nfs, linux-um, ceph-devel, netfs

On Fri, Feb 21, 2025 at 10:36:33AM +1100, NeilBrown wrote:

> @@ -871,7 +870,12 @@ static int fuse_mknod(struct mnt_idmap *idmap, struct inode *dir,
>  	args.in_args[0].value = &inarg;
>  	args.in_args[1].size = entry->d_name.len + 1;
>  	args.in_args[1].value = entry->d_name.name;
> -	return create_new_entry(idmap, fm, &args, dir, entry, mode);
> +	de = create_new_entry(idmap, fm, &args, dir, entry, mode);
> +	if (IS_ERR(de))
> +		return PTR_ERR(de);
> +	if (de)
> +		dput(de);
> +	return 0;

Can that really happen?

> @@ -934,7 +939,12 @@ static int fuse_symlink(struct mnt_idmap *idmap, struct inode *dir,
>  	args.in_args[1].value = entry->d_name.name;
>  	args.in_args[2].size = len;
>  	args.in_args[2].value = link;
> -	return create_new_entry(idmap, fm, &args, dir, entry, S_IFLNK);
> +	de = create_new_entry(idmap, fm, &args, dir, entry, S_IFLNK);
> +	if (IS_ERR(de))
> +		return PTR_ERR(de);
> +	if (de)
> +		dput(de);
> +	return 0;

Same question.

> +	de = create_new_entry(&invalid_mnt_idmap, fm, &args, newdir, newent, inode->i_mode);
> +	if (!IS_ERR(de)) {
> +		if (de)
> +			dput(de);
> +		de = NULL;

Whoa... Details, please. What's going on here?
* Re: [PATCH 4/6] fuse: return correct dentry for ->mkdir
  2025-02-22 4:24 ` Al Viro
@ 2025-02-24 2:26   ` NeilBrown
  2025-02-24 2:53     ` Al Viro
  0 siblings, 1 reply; 36+ messages in thread
From: NeilBrown @ 2025-02-24 2:26 UTC
To: Al Viro
Cc: Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li, Ilya Dryomov,
    Richard Weinberger, Anton Ivanov, Johannes Berg, Trond Myklebust,
    Anna Schumaker, Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo,
    Tom Talpey, Sergey Senozhatsky, linux-fsdevel, linux-kernel, linux-cifs,
    linux-nfs, linux-um, ceph-devel, netfs

On Sat, 22 Feb 2025, Al Viro wrote:
> On Fri, Feb 21, 2025 at 10:36:33AM +1100, NeilBrown wrote:
>
> > @@ -871,7 +870,12 @@ static int fuse_mknod(struct mnt_idmap *idmap, struct inode *dir,
> >  	args.in_args[0].value = &inarg;
> >  	args.in_args[1].size = entry->d_name.len + 1;
> >  	args.in_args[1].value = entry->d_name.name;
> > -	return create_new_entry(idmap, fm, &args, dir, entry, mode);
> > +	de = create_new_entry(idmap, fm, &args, dir, entry, mode);
> > +	if (IS_ERR(de))
> > +		return PTR_ERR(de);
> > +	if (de)
> > +		dput(de);
> > +	return 0;
>
> Can that really happen?

Probably not. It would require S_IFDIR to be passed in the mode to
vfs_mknod(). I don't think any current callers do that, but I don't see
any code in vfs_mknod() to prevent it.

>
> > @@ -934,7 +939,12 @@ static int fuse_symlink(struct mnt_idmap *idmap, struct inode *dir,
> >  	args.in_args[1].value = entry->d_name.name;
> >  	args.in_args[2].size = len;
> >  	args.in_args[2].value = link;
> > -	return create_new_entry(idmap, fm, &args, dir, entry, S_IFLNK);
> > +	de = create_new_entry(idmap, fm, &args, dir, entry, S_IFLNK);
> > +	if (IS_ERR(de))
> > +		return PTR_ERR(de);
> > +	if (de)
> > +		dput(de);
> > +	return 0;
>
> Same question.

That definitely cannot happen, because we *know* that d_splice_alias()
never returns a dentry for any but an S_IFDIR inode (how might we explain
that to the Rust type system, I wonder :-).

I was going for "obviously correct" without trying to optimise, but you
are correct that testing for a non-NULL non-ERR dentry should be
optimised away as impossible in all cases except mkdir.

Thanks,
NeilBrown

>
> > +	de = create_new_entry(&invalid_mnt_idmap, fm, &args, newdir, newent, inode->i_mode);
> > +	if (!IS_ERR(de)) {
> > +		if (de)
> > +			dput(de);
> > +		de = NULL;
>
> Whoa... Details, please. What's going on here?
>
* Re: [PATCH 4/6] fuse: return correct dentry for ->mkdir
  2025-02-24 2:26 ` NeilBrown
@ 2025-02-24 2:53   ` Al Viro
  0 siblings, 0 replies; 36+ messages in thread
From: Al Viro @ 2025-02-24 2:53 UTC
To: NeilBrown
Cc: Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li, Ilya Dryomov,
    Richard Weinberger, Anton Ivanov, Johannes Berg, Trond Myklebust,
    Anna Schumaker, Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo,
    Tom Talpey, Sergey Senozhatsky, linux-fsdevel, linux-kernel, linux-cifs,
    linux-nfs, linux-um, ceph-devel, netfs

On Mon, Feb 24, 2025 at 01:26:18PM +1100, NeilBrown wrote:

> Probably not. It would require S_IFDIR to be passed in the mode to
> vfs_mknod(). I don't think any current callers do that, but I don't see
> any code in vfs_mknod() to prevent it.

Not allowed (and that's the caller's responsibility to enforce). Local
filesystems would break horribly if that ever happened. A ->mknod()
instance _may_ be a convenient helper for ->mkdir() et al. to call, but
even for ramfs it won't coincide with ->mkdir() (wrong i_nlink, for one
thing).

If that's not documented, it really should be. vfs_mknod() may be called
for block devices, character devices, FIFOs and sockets. Nothing else is
allowed.
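The return convention discussed in this sub-thread (an error pointer, NULL for "use the dentry you passed in", or a referenced alias) can be sketched in userspace. This is a minimal illustration under stated assumptions, not the fuse code: `ERR_PTR()`, `PTR_ERR()`, `IS_ERR()` and `dput()` below are local stand-ins named after the kernel helpers, while `struct dentry`, `enum outcome`, `create_entry_sketch()` and `create_as_int()` are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errnos. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Toy dentry: only a reference count. */
struct dentry { int refcount; };

/* As with the kernel's dput(), passing NULL is a no-op. */
static void dput(struct dentry *d)
{
	if (d)
		d->refcount--;
}

/* Hypothetical selector for the three possible results. */
enum outcome { OUT_ERROR, OUT_ORIGINAL, OUT_ALIAS };

/* Three-way result, modelled loosely on the patched create_new_entry(). */
static struct dentry *create_entry_sketch(enum outcome o, struct dentry *alias)
{
	switch (o) {
	case OUT_ERROR:
		return ERR_PTR(-ENOMEM);
	case OUT_ALIAS:
		alias->refcount++;	/* caller now owns a reference */
		return alias;
	default:
		return NULL;		/* original dentry was used */
	}
}

/* A caller that only needs an int, in the style of mknod/symlink. */
static int create_as_int(enum outcome o, struct dentry *alias)
{
	struct dentry *de = create_entry_sketch(o, alias);

	if (IS_ERR(de))
		return (int)PTR_ERR(de);
	dput(de);			/* harmless when de is NULL */
	return 0;
}
```

Because the stand-in `dput()` tolerates NULL, the int-returning caller needs no separate `if (de)` test. Per the discussion above, an alias is only expected for directories, so in the mknod and symlink cases the `dput()` would in practice always see NULL.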
* [PATCH 5/6] nfs: change mkdir inode_operation to return alternate dentry if needed. 2025-02-20 23:36 [PATCH 0/6] Change ->mkdir() and vfs_mkdir() to return a dentry NeilBrown ` (3 preceding siblings ...) 2025-02-20 23:36 ` [PATCH 4/6] fuse: return correct dentry for ->mkdir NeilBrown @ 2025-02-20 23:36 ` NeilBrown 2025-02-22 4:41 ` Al Viro 2025-02-20 23:36 ` [PATCH 6/6] VFS: Change vfs_mkdir() to return the dentry NeilBrown 5 siblings, 1 reply; 36+ messages in thread From: NeilBrown @ 2025-02-20 23:36 UTC (permalink / raw) To: Alexander Viro, Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li, Ilya Dryomov, Richard Weinberger, Anton Ivanov, Johannes Berg, Trond Myklebust, Anna Schumaker, Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, Sergey Senozhatsky Cc: linux-fsdevel, linux-kernel, linux-cifs, linux-nfs, linux-um, ceph-devel, netfs mkdir now allows a different dentry to be returned which is sometimes relevant for nfs. This patch changes the nfs_rpc_ops mkdir op to return a dentry, and passes that back to the caller. The mkdir nfs_rpc_op will return NULL if the original dentry should be used. This matches the mkdir inode_operation. 
Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> --- fs/nfs/dir.c | 13 ++++--------- fs/nfs/nfs3proc.c | 9 ++++++--- fs/nfs/nfs4proc.c | 43 +++++++++++++++++++++++++++++------------ fs/nfs/proc.c | 12 ++++++++---- include/linux/nfs_xdr.h | 2 +- 5 files changed, 50 insertions(+), 29 deletions(-) diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 101b1098e87b..bc957487f6ec 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -2426,7 +2426,7 @@ struct dentry *nfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, struct dentry *dentry, umode_t mode) { struct iattr attr; - int error; + struct dentry *ret; dfprintk(VFS, "NFS: mkdir(%s/%lu), %pd\n", dir->i_sb->s_id, dir->i_ino, dentry); @@ -2435,14 +2435,9 @@ struct dentry *nfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, attr.ia_mode = mode | S_IFDIR; trace_nfs_mkdir_enter(dir, dentry); - error = NFS_PROTO(dir)->mkdir(dir, dentry, &attr); - trace_nfs_mkdir_exit(dir, dentry, error); - if (error != 0) - goto out_err; - return NULL; -out_err: - d_drop(dentry); - return ERR_PTR(error); + ret = NFS_PROTO(dir)->mkdir(dir, dentry, &attr); + trace_nfs_mkdir_exit(dir, dentry, PTR_ERR_OR_ZERO(ret)); + return ret; } EXPORT_SYMBOL_GPL(nfs_mkdir); diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c index 0c3bc98cd999..dfb3fafc9d4f 100644 --- a/fs/nfs/nfs3proc.c +++ b/fs/nfs/nfs3proc.c @@ -578,7 +578,7 @@ nfs3_proc_symlink(struct inode *dir, struct dentry *dentry, struct folio *folio, return status; } -static int +static struct dentry * nfs3_proc_mkdir(struct inode *dir, struct dentry *dentry, struct iattr *sattr) { struct posix_acl *default_acl, *acl; @@ -612,15 +612,18 @@ nfs3_proc_mkdir(struct inode *dir, struct dentry *dentry, struct iattr *sattr) dentry = d_alias; status = nfs3_proc_setacls(d_inode(dentry), acl, default_acl); + if (status && d_alias) + dput(d_alias); - dput(d_alias); out_release_acls: posix_acl_release(acl); posix_acl_release(default_acl); out: nfs3_free_createdata(data); 
dprintk("NFS reply mkdir: %d\n", status); - return status; + if (status) + return ERR_PTR(status); + return d_alias; } static int diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index df9669d4ded7..164c9f3f36c8 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -5135,9 +5135,6 @@ static int nfs4_do_create(struct inode *dir, struct dentry *dentry, struct nfs4_ &data->arg.seq_args, &data->res.seq_res, 1); if (status == 0) { spin_lock(&dir->i_lock); - /* Creating a directory bumps nlink in the parent */ - if (data->arg.ftype == NF4DIR) - nfs4_inc_nlink_locked(dir); nfs4_update_changeattr_locked(dir, &data->res.dir_cinfo, data->res.fattr->time_start, NFS_INO_INVALID_DATA); @@ -5147,6 +5144,25 @@ static int nfs4_do_create(struct inode *dir, struct dentry *dentry, struct nfs4_ return status; } +static struct dentry *nfs4_do_mkdir(struct inode *dir, struct dentry *dentry, + struct nfs4_createdata *data) +{ + int status = nfs4_call_sync(NFS_SERVER(dir)->client, NFS_SERVER(dir), &data->msg, + &data->arg.seq_args, &data->res.seq_res, 1); + + if (status) + return ERR_PTR(status); + + spin_lock(&dir->i_lock); + /* Creating a directory bumps nlink in the parent */ + nfs4_inc_nlink_locked(dir); + nfs4_update_changeattr_locked(dir, &data->res.dir_cinfo, + data->res.fattr->time_start, + NFS_INO_INVALID_DATA); + spin_unlock(&dir->i_lock); + return nfs_add_or_obtain(dentry, data->res.fh, data->res.fattr); +} + static void nfs4_free_createdata(struct nfs4_createdata *data) { nfs4_label_free(data->fattr.label); @@ -5203,32 +5219,34 @@ static int nfs4_proc_symlink(struct inode *dir, struct dentry *dentry, return err; } -static int _nfs4_proc_mkdir(struct inode *dir, struct dentry *dentry, - struct iattr *sattr, struct nfs4_label *label) +static struct dentry *_nfs4_proc_mkdir(struct inode *dir, struct dentry *dentry, + struct iattr *sattr, + struct nfs4_label *label) { struct nfs4_createdata *data; - int status = -ENOMEM; + struct dentry *ret = ERR_PTR(-ENOMEM); data = 
nfs4_alloc_createdata(dir, &dentry->d_name, sattr, NF4DIR); if (data == NULL) goto out; data->arg.label = label; - status = nfs4_do_create(dir, dentry, data); + ret = nfs4_do_mkdir(dir, dentry, data); nfs4_free_createdata(data); out: - return status; + return ret; } -static int nfs4_proc_mkdir(struct inode *dir, struct dentry *dentry, - struct iattr *sattr) +static struct dentry *nfs4_proc_mkdir(struct inode *dir, struct dentry *dentry, + struct iattr *sattr) { struct nfs_server *server = NFS_SERVER(dir); struct nfs4_exception exception = { .interruptible = true, }; struct nfs4_label l, *label; + struct dentry *alias; int err; label = nfs4_label_init_security(dir, dentry, sattr, &l); @@ -5236,14 +5254,15 @@ static int nfs4_proc_mkdir(struct inode *dir, struct dentry *dentry, if (!(server->attr_bitmask[2] & FATTR4_WORD2_MODE_UMASK)) sattr->ia_mode &= ~current_umask(); do { - err = _nfs4_proc_mkdir(dir, dentry, sattr, label); + alias = _nfs4_proc_mkdir(dir, dentry, sattr, label); + err = PTR_ERR_OR_ZERO(alias); trace_nfs4_mkdir(dir, &dentry->d_name, err); err = nfs4_handle_exception(NFS_SERVER(dir), err, &exception); } while (exception.retry); nfs4_label_release_security(label); - return err; + return alias; } static int _nfs4_proc_readdir(struct nfs_readdir_arg *nr_arg, diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c index 77920a2e3cef..63e71310b9f6 100644 --- a/fs/nfs/proc.c +++ b/fs/nfs/proc.c @@ -446,13 +446,14 @@ nfs_proc_symlink(struct inode *dir, struct dentry *dentry, struct folio *folio, return status; } -static int +static struct dentry * nfs_proc_mkdir(struct inode *dir, struct dentry *dentry, struct iattr *sattr) { struct nfs_createdata *data; struct rpc_message msg = { .rpc_proc = &nfs_procedures[NFSPROC_MKDIR], }; + struct dentry *alias = NULL; int status = -ENOMEM; dprintk("NFS call mkdir %pd\n", dentry); @@ -464,12 +465,15 @@ nfs_proc_mkdir(struct inode *dir, struct dentry *dentry, struct iattr *sattr) status = rpc_call_sync(NFS_CLIENT(dir), &msg, 0); 
nfs_mark_for_revalidate(dir); - if (status == 0) - status = nfs_instantiate(dentry, data->res.fh, data->res.fattr); + if (status == 0) { + alias = nfs_add_or_obtain(dentry, data->res.fh, data->res.fattr); + status = PTR_ERR_OR_ZERO(alias); + } else + alias = ERR_PTR(status); nfs_free_createdata(data); out: dprintk("NFS reply mkdir: %d\n", status); - return status; + return alias; } static int diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h index 9155a6ffc370..d66c61cbbd1d 100644 --- a/include/linux/nfs_xdr.h +++ b/include/linux/nfs_xdr.h @@ -1802,7 +1802,7 @@ struct nfs_rpc_ops { int (*link) (struct inode *, struct inode *, const struct qstr *); int (*symlink) (struct inode *, struct dentry *, struct folio *, unsigned int, struct iattr *); - int (*mkdir) (struct inode *, struct dentry *, struct iattr *); + struct dentry *(*mkdir) (struct inode *, struct dentry *, struct iattr *); int (*rmdir) (struct inode *, const struct qstr *); int (*readdir) (struct nfs_readdir_arg *, struct nfs_readdir_res *); int (*mknod) (struct inode *, struct dentry *, struct iattr *, -- 2.47.1 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH 5/6] nfs: change mkdir inode_operation to return alternate dentry if needed.
  2025-02-20 23:36 ` [PATCH 5/6] nfs: change mkdir inode_operation to return alternate dentry if needed NeilBrown
@ 2025-02-22 4:41   ` Al Viro
  2025-02-24 2:41     ` NeilBrown
  0 siblings, 1 reply; 36+ messages in thread
From: Al Viro @ 2025-02-22 4:41 UTC
To: NeilBrown
Cc: Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li, Ilya Dryomov,
    Richard Weinberger, Anton Ivanov, Johannes Berg, Trond Myklebust,
    Anna Schumaker, Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo,
    Tom Talpey, Sergey Senozhatsky, linux-fsdevel, linux-kernel, linux-cifs,
    linux-nfs, linux-um, ceph-devel, netfs

On Fri, Feb 21, 2025 at 10:36:34AM +1100, NeilBrown wrote:

>  nfs3_proc_mkdir(struct inode *dir, struct dentry *dentry, struct iattr *sattr)
>  {
>  	struct posix_acl *default_acl, *acl;
> @@ -612,15 +612,18 @@ nfs3_proc_mkdir(struct inode *dir, struct dentry *dentry, struct iattr *sattr)
>  		dentry = d_alias;
>
>  	status = nfs3_proc_setacls(d_inode(dentry), acl, default_acl);
> +	if (status && d_alias)
> +		dput(d_alias);
>
> -	dput(d_alias);
>  out_release_acls:
>  	posix_acl_release(acl);
>  	posix_acl_release(default_acl);
>  out:
>  	nfs3_free_createdata(data);
>  	dprintk("NFS reply mkdir: %d\n", status);
> -	return status;
> +	if (status)
> +		return ERR_PTR(status);
> +	return d_alias;

Ugh... That's really hard to follow - you are leaving a dangling
reference in d_alias textually upstream of using that variable.
The only reason it's not a bug is that dput() is reachable only
with status && d_alias and that guarantees that we'll
actually go away on if (status) return ERR_PTR(status).

Worse, you can reach 'out:' with d_alias uninitialized. Yes,
all such branches happen with status either still unmodified
since its initialization (which is non-zero) or under
if (status), so again, that return d_alias; is unreachable.

So the code is correct, but it's really asking for trouble down
the road.

BTW, dput(NULL) is guaranteed to be a no-op...
* Re: [PATCH 5/6] nfs: change mkdir inode_operation to return alternate dentry if needed. 2025-02-22 4:41 ` Al Viro @ 2025-02-24 2:41 ` NeilBrown 0 siblings, 0 replies; 36+ messages in thread From: NeilBrown @ 2025-02-24 2:41 UTC (permalink / raw) To: Al Viro Cc: Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li, Ilya Dryomov, Richard Weinberger, Anton Ivanov, Johannes Berg, Trond Myklebust, Anna Schumaker, Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, Sergey Senozhatsky, linux-fsdevel, linux-kernel, linux-cifs, linux-nfs, linux-um, ceph-devel, netfs On Sat, 22 Feb 2025, Al Viro wrote: > On Fri, Feb 21, 2025 at 10:36:34AM +1100, NeilBrown wrote: > > > nfs3_proc_mkdir(struct inode *dir, struct dentry *dentry, struct iattr *sattr) > > { > > struct posix_acl *default_acl, *acl; > > @@ -612,15 +612,18 @@ nfs3_proc_mkdir(struct inode *dir, struct dentry *dentry, struct iattr *sattr) > > dentry = d_alias; > > > > status = nfs3_proc_setacls(d_inode(dentry), acl, default_acl); > > + if (status && d_alias) > > + dput(d_alias); > > > > - dput(d_alias); > > out_release_acls: > > posix_acl_release(acl); > > posix_acl_release(default_acl); > > out: > > nfs3_free_createdata(data); > > dprintk("NFS reply mkdir: %d\n", status); > > - return status; > > + if (status) > > + return ERR_PTR(status); > > + return d_alias; > > Ugh... That's really hard to follow - you are leaving a dangling > reference in d_alias textually upstream of using that variable. > The only reason it's not a bug is that dput() is reachable only > with status && d_alias and that guarantees that we'll > actually go away on if (status) return ERR_PTR(status). > > Worse, you can reach 'out:' with d_alias uninitialized. Yes, > all such branches happen with status either still unmodified > since it's initialization (which is non-zero) or under > if (status), so again, that return d_alias; is unreachable. > > So the code is correct, but it's really asking for trouble down > the road. 
> > BTW, dput(NULL) is guaranteed to be a no-op... > Thanks for that. I've minimised the use of status and mostly stored errors in d_alias - which I've renamed to 'ret'. I think that answers your concerns. Thanks, NeilBrown --- a/fs/nfs/nfs3proc.c +++ b/fs/nfs/nfs3proc.c @@ -578,13 +578,13 @@ nfs3_proc_symlink(struct inode *dir, struct dentry *dentry, struct folio *folio, return status; } -static int +static struct dentry * nfs3_proc_mkdir(struct inode *dir, struct dentry *dentry, struct iattr *sattr) { struct posix_acl *default_acl, *acl; struct nfs3_createdata *data; - struct dentry *d_alias; - int status = -ENOMEM; + struct dentry *ret = ERR_PTR(-ENOMEM); + int status; dprintk("NFS call mkdir %pd\n", dentry); @@ -592,8 +592,9 @@ nfs3_proc_mkdir(struct inode *dir, struct dentry *dentry, struct iattr *sattr) if (data == NULL) goto out; - status = posix_acl_create(dir, &sattr->ia_mode, &default_acl, &acl); - if (status) + ret = ERR_PTR(posix_acl_create(dir, &sattr->ia_mode, + &default_acl, &acl)); + if (IS_ERR(ret)) goto out; data->msg.rpc_proc = &nfs3_procedures[NFS3PROC_MKDIR]; @@ -602,25 +603,27 @@ nfs3_proc_mkdir(struct inode *dir, struct dentry *dentry, struct iattr *sattr) data->arg.mkdir.len = dentry->d_name.len; data->arg.mkdir.sattr = sattr; - d_alias = nfs3_do_create(dir, dentry, data); - status = PTR_ERR_OR_ZERO(d_alias); + ret = nfs3_do_create(dir, dentry, data); - if (status != 0) + if (IS_ERR(ret)) goto out_release_acls; - if (d_alias) - dentry = d_alias; + if (ret) + dentry = ret; status = nfs3_proc_setacls(d_inode(dentry), acl, default_acl); + if (status) { + dput(ret); + ret = ERR_PTR(status); + } - dput(d_alias); out_release_acls: posix_acl_release(acl); posix_acl_release(default_acl); out: nfs3_free_createdata(data); - dprintk("NFS reply mkdir: %d\n", status); - return status; + dprintk("NFS reply mkdir: %d\n", PTR_ERR_OR_ZERO(ret)); + return ret; } static int ^ permalink raw reply [flat|nested] 36+ messages in thread
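The shape of the revised function (one result pointer that starts out as an error and is only ever an error, NULL, or a dentry with a live reference) can be sketched in userspace as follows. This is an illustration under assumptions, not the NFS code: the error-pointer helpers and `dput()` are local stand-ins named after their kernel counterparts, and `struct dentry`, `mkdir_sketch()` and its parameters are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Toy dentry: only a reference count. */
struct dentry { int refcount; };

/* As with the kernel's dput(), passing NULL is a no-op. */
static void dput(struct dentry *d)
{
	if (d)
		d->refcount--;
}

/*
 * Hypothetical mkdir skeleton.
 *   alloc_ok   - whether the up-front allocation succeeded
 *   created    - result of the create step: ERR_PTR, NULL, or an alias
 *                on which the caller already holds a reference
 *   acl_status - result of the follow-up ACL step: 0 or a negative errno
 */
static struct dentry *mkdir_sketch(int alloc_ok, struct dentry *created,
				   int acl_status)
{
	struct dentry *ret = ERR_PTR(-ENOMEM);

	if (!alloc_ok)
		goto out;

	ret = created;
	if (IS_ERR(ret))
		goto out;

	if (acl_status) {
		dput(ret);			/* drop the alias, if any */
		ret = ERR_PTR(acl_status);	/* and forget the pointer */
	}
out:
	/* ret is always initialised and never a dropped reference. */
	return ret;
}
```

Overwriting `ret` immediately after the `dput()` is what removes the dangling reference pointed out in the review, and initialising it at declaration covers every early `goto out`.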
* [PATCH 6/6] VFS: Change vfs_mkdir() to return the dentry.
  2025-02-20 23:36 [PATCH 0/6] Change ->mkdir() and vfs_mkdir() to return a dentry NeilBrown
  ` (4 preceding siblings ...)
  2025-02-20 23:36 ` [PATCH 5/6] nfs: change mkdir inode_operation to return alternate dentry if needed NeilBrown
@ 2025-02-20 23:36 ` NeilBrown
  2025-02-21 14:25   ` Jeff Layton
  2025-02-22 0:32   ` Chuck Lever
  5 siblings, 2 replies; 36+ messages in thread
From: NeilBrown @ 2025-02-20 23:36 UTC
To: Alexander Viro, Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li,
    Ilya Dryomov, Richard Weinberger, Anton Ivanov, Johannes Berg,
    Trond Myklebust, Anna Schumaker, Chuck Lever, Jeff Layton,
    Olga Kornievskaia, Dai Ngo, Tom Talpey, Sergey Senozhatsky
Cc: linux-fsdevel, linux-kernel, linux-cifs, linux-nfs, linux-um,
    ceph-devel, netfs

vfs_mkdir() does not guarantee to leave the child dentry hashed or make
it positive on success, and in many such cases the filesystem had to use
a different dentry which it can now return.

This patch changes vfs_mkdir() to return the dentry provided by the
filesystem, which is hashed and positive when provided. This reduces the
number of cases where the resulting dentry is not positive to a handful
which don't deserve extra effort.

The only callers of vfs_mkdir() which are interested in the resulting
inode are in-kernel filesystem clients: cachefiles, nfsd, smb/server.
The only filesystems that don't reliably provide the inode are:
- kernfs, tracefs which these clients are unlikely to be interested in
- cifs in some configurations would need to do a lookup to find the
  created inode, but doesn't. cifs cannot be exported via NFS, is
  unlikely to be used by cachefiles, and smb/server only has a soft
  requirement for the inode, so this is unlikely to be a problem in
  practice.
- hostfs, nfs, cifs may need to do a lookup (rarely for NFS) and it is
  possible for a race to make that lookup fail. Actual failure is
  unlikely and, provided callers handle negative dentries gracefully,
  they will fail safe.

So this patch removes the lookup code in nfsd and smb/server and adjusts
them to fail safe if a negative dentry is provided:
- cachefiles already fails safe by restarting the task from the top - it
  still does with this change, though it no longer calls
  cachefiles_put_directory() as that will crash if the dentry is
  negative.
- nfsd reports "Server-fault" which is what it used to do if the lookup
  failed. This will never happen on any filesystem that it can actually
  export, so this is of no consequence. I removed the fh_update() call
  as that is not needed and out of place. A subsequent
  nfsd_create_setattr() call will call fh_update() when needed.
- smb/server only wants the inode to call ksmbd_smb_inherit_owner()
  which updates ->i_uid (without calling notify_change() or similar)
  which can be safely skipped on cifs (I hope).

If a different dentry is returned, the first one is put. If necessary
the fact that it is new can be determined by comparing pointers. A new
dentry will certainly have a new pointer (as the old is put after the
new is obtained).
Similarly if an error is returned (via ERR_PTR()) the original dentry is
put.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 drivers/base/devtmpfs.c  |  7 +++---
 fs/cachefiles/namei.c    | 16 ++++++++------
 fs/ecryptfs/inode.c      | 14 ++++++++----
 fs/init.c                |  7 ++++--
 fs/namei.c               | 46 ++++++++++++++++++++++++++--------------
 fs/nfsd/nfs4recover.c    |  7 ++++--
 fs/nfsd/vfs.c            | 34 ++++++++++-------------------
 fs/overlayfs/dir.c       | 37 ++++----------------------------
 fs/overlayfs/overlayfs.h | 15 ++++++-------
 fs/overlayfs/super.c     |  7 +++---
 fs/smb/server/vfs.c      | 32 +++++++++-------------------
 fs/xfs/scrub/orphanage.c |  9 ++++----
 include/linux/fs.h       |  4 ++--
 13 files changed, 105 insertions(+), 130 deletions(-)

diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
index 7a101009bee7..6dd1a8860f1c 100644
--- a/drivers/base/devtmpfs.c
+++ b/drivers/base/devtmpfs.c
@@ -175,18 +175,17 @@ static int dev_mkdir(const char *name, umode_t mode)
 {
 	struct dentry *dentry;
 	struct path path;
-	int err;

 	dentry = kern_path_create(AT_FDCWD, name, &path, LOOKUP_DIRECTORY);
 	if (IS_ERR(dentry))
 		return PTR_ERR(dentry);

-	err = vfs_mkdir(&nop_mnt_idmap, d_inode(path.dentry), dentry, mode);
-	if (!err)
+	dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(path.dentry), dentry, mode);
+	if (!IS_ERR(dentry))
 		/* mark as kernel-created inode */
 		d_inode(dentry)->i_private = &thread;
 	done_path_create(&path, dentry);
-	return err;
+	return PTR_ERR_OR_ZERO(dentry);
 }

 static int create_path(const char *nodepath)
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 7cf59713f0f7..83a60126de0f 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -128,18 +128,19 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 		ret = security_path_mkdir(&path, subdir, 0700);
 		if (ret < 0)
 			goto mkdir_error;
-		ret = cachefiles_inject_write_error();
-		if (ret == 0)
-			ret = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700);
-		if (ret < 0) {
+		subdir = ERR_PTR(cachefiles_inject_write_error());
+		if (!IS_ERR(subdir))
+			subdir = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700);
+		ret = PTR_ERR(subdir);
+		if (IS_ERR(subdir)) {
 			trace_cachefiles_vfs_error(NULL, d_inode(dir), ret,
 						   cachefiles_trace_mkdir_error);
 			goto mkdir_error;
 		}
 		trace_cachefiles_mkdir(dir, subdir);

-		if (unlikely(d_unhashed(subdir))) {
-			cachefiles_put_directory(subdir);
+		if (unlikely(d_unhashed(subdir) || d_is_negative(subdir))) {
+			dput(subdir);
 			goto retry;
 		}
 		ASSERT(d_backing_inode(subdir));
@@ -195,7 +196,8 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,

 mkdir_error:
 	inode_unlock(d_inode(dir));
-	dput(subdir);
+	if (!IS_ERR(subdir))
+		dput(subdir);
 	pr_err("mkdir %s failed with error %d\n", dirname, ret);
 	return ERR_PTR(ret);

diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 6315dd194228..51a5c54eb740 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -511,10 +511,16 @@ static struct dentry *ecryptfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 	struct inode *lower_dir;

 	rc = lock_parent(dentry, &lower_dentry, &lower_dir);
-	if (!rc)
-		rc = vfs_mkdir(&nop_mnt_idmap, lower_dir,
-			       lower_dentry, mode);
-	if (rc || d_really_is_negative(lower_dentry))
+	if (rc)
+		goto out;
+
+	lower_dentry = vfs_mkdir(&nop_mnt_idmap, lower_dir,
+				 lower_dentry, mode);
+	rc = PTR_ERR(lower_dentry);
+	if (IS_ERR(lower_dentry))
+		goto out;
+	rc = 0;
+	if (d_unhashed(lower_dentry))
 		goto out;
 	rc = ecryptfs_interpose(lower_dentry, dentry, dir->i_sb);
 	if (rc)
diff --git a/fs/init.c b/fs/init.c
index e9387b6c4f30..eef5124885e3 100644
--- a/fs/init.c
+++ b/fs/init.c
@@ -230,9 +230,12 @@ int __init init_mkdir(const char *pathname, umode_t mode)
 		return PTR_ERR(dentry);
 	mode = mode_strip_umask(d_inode(path.dentry), mode);
 	error = security_path_mkdir(&path, dentry, mode);
-	if (!error)
-		error = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
+	if (!error) {
+		dentry = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
 				  dentry, mode);
+		if (IS_ERR(dentry))
+			error = PTR_ERR(dentry);
+	}
 	done_path_create(&path, dentry);
 	return error;
 }
diff --git a/fs/namei.c b/fs/namei.c
index 63fe4dc29c23..bd5eec2c0af4 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4125,7 +4125,8 @@ EXPORT_SYMBOL(kern_path_create);

 void done_path_create(struct path *path, struct dentry *dentry)
 {
-	dput(dentry);
+	if (!IS_ERR(dentry))
+		dput(dentry);
 	inode_unlock(path->dentry->d_inode);
 	mnt_drop_write(path->mnt);
 	path_put(path);
@@ -4271,7 +4272,7 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d
 }

 /**
- * vfs_mkdir - create directory
+ * vfs_mkdir - create directory returning correct dentry if possible
  * @idmap:	idmap of the mount the inode was found from
  * @dir:	inode of the parent directory
  * @dentry:	dentry of the child directory
@@ -4284,9 +4285,15 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d
  * care to map the inode according to @idmap before checking permissions.
  * On non-idmapped mounts or if permission checking is to be performed on the
  * raw inode simply pass @nop_mnt_idmap.
+ *
+ * In the event that the filesystem does not use the *@dentry but leaves it
+ * negative or unhashes it and possibly splices a different one returning it,
+ * the original dentry is dput() and the alternate is returned.
+ *
+ * In case of an error the dentry is dput() and an ERR_PTR() is returned.
  */
-int vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
-	      struct dentry *dentry, umode_t mode)
+struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
+			 struct dentry *dentry, umode_t mode)
 {
 	int error;
 	unsigned max_links = dir->i_sb->s_max_links;
@@ -4294,31 +4301,36 @@ int vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,

 	error = may_create(idmap, dir, dentry);
 	if (error)
-		return error;
+		goto err;

+	error = -EPERM;
 	if (!dir->i_op->mkdir)
-		return -EPERM;
+		goto err;

 	mode = vfs_prepare_mode(idmap, dir, mode, S_IRWXUGO | S_ISVTX, 0);
 	error = security_inode_mkdir(dir, dentry, mode);
 	if (error)
-		return error;
+		goto err;

+	error = -EMLINK;
 	if (max_links && dir->i_nlink >= max_links)
-		return -EMLINK;
+		goto err;

 	de = dir->i_op->mkdir(idmap, dir, dentry, mode);
+	error = PTR_ERR(de);
 	if (IS_ERR(de))
-		return PTR_ERR(de);
+		goto err;
 	if (de) {
-		fsnotify_mkdir(dir, de);
-		/* Cannot return de yet */
-		dput(de);
-	} else {
-		fsnotify_mkdir(dir, dentry);
+		dput(dentry);
+		dentry = de;
 	}
+	fsnotify_mkdir(dir, dentry);
+	return dentry;

-	return 0;
+err:
+	dput(dentry);
+
+	return ERR_PTR(error);
 }
 EXPORT_SYMBOL(vfs_mkdir);

@@ -4338,8 +4350,10 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode)
 	error = security_path_mkdir(&path, dentry,
 			mode_strip_umask(path.dentry->d_inode, mode));
 	if (!error) {
-		error = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
+		dentry = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
 				  dentry, mode);
+		if (IS_ERR(dentry))
+			error = PTR_ERR(dentry);
 	}
 	done_path_create(&path, dentry);
 	if (retry_estale(error, lookup_flags)) {
diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 28f4d5311c40..c1d9bd07285f 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -233,9 +233,12 @@ nfsd4_create_clid_dir(struct nfs4_client *clp)
 		 * as well be forgiving and just succeed silently.
 		 */
 		goto out_put;
-	status = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, S_IRWXU);
+	dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, S_IRWXU);
+	if (IS_ERR(dentry))
+		status = PTR_ERR(dentry);
 out_put:
-	dput(dentry);
+	if (!status)
+		dput(dentry);
 out_unlock:
 	inode_unlock(d_inode(dir));
 	if (status == 0) {
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 29cb7b812d71..34d7aa531662 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1461,7 +1461,7 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	struct inode	*dirp;
 	struct iattr	*iap = attrs->na_iattr;
 	__be32		err;
-	int		host_err;
+	int		host_err = 0;

 	dentry = fhp->fh_dentry;
 	dirp = d_inode(dentry);
@@ -1488,28 +1488,15 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
 			nfsd_check_ignore_resizing(iap);
 		break;
 	case S_IFDIR:
-		host_err = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode);
-		if (!host_err && unlikely(d_unhashed(dchild))) {
-			struct dentry *d;
-			d = lookup_one_len(dchild->d_name.name,
-					   dchild->d_parent,
-					   dchild->d_name.len);
-			if (IS_ERR(d)) {
-				host_err = PTR_ERR(d);
-				break;
-			}
-			if (unlikely(d_is_negative(d))) {
-				dput(d);
-				err = nfserr_serverfault;
-				goto out;
-			}
+		dchild = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode);
+		if (IS_ERR(dchild)) {
+			host_err = PTR_ERR(dchild);
+		} else if (d_is_negative(dchild)) {
+			err = nfserr_serverfault;
+			goto out;
+		} else if (unlikely(dchild != resfhp->fh_dentry)) {
 			dput(resfhp->fh_dentry);
-			resfhp->fh_dentry = dget(d);
-			err = fh_update(resfhp);
-			dput(dchild);
-			dchild = d;
-			if (err)
-				goto out;
+			resfhp->fh_dentry = dget(dchild);
 		}
 		break;
 	case S_IFCHR:
@@ -1530,7 +1517,8 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	err = nfsd_create_setattr(rqstp, fhp, resfhp, attrs);

 out:
-	dput(dchild);
+	if (!IS_ERR(dchild))
+		dput(dchild);
 	return err;

 out_nfserr:
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 21c3aaf7b274..fe493f3ed6b6 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -138,37 +138,6 @@ int ovl_cleanup_and_whiteout(struct ovl_fs *ofs, struct inode *dir,
 	goto out;
 }

-int ovl_mkdir_real(struct ovl_fs *ofs, struct inode *dir,
-		   struct dentry **newdentry, umode_t mode)
-{
-	int err;
-	struct dentry *d, *dentry = *newdentry;
-
-	err = ovl_do_mkdir(ofs, dir, dentry, mode);
-	if (err)
-		return err;
-
-	if (likely(!d_unhashed(dentry)))
-		return 0;
-
-	/*
-	 * vfs_mkdir() may succeed and leave the dentry passed
-	 * to it unhashed and negative. If that happens, try to
-	 * lookup a new hashed and positive dentry.
-	 */
-	d = ovl_lookup_upper(ofs, dentry->d_name.name, dentry->d_parent,
-			     dentry->d_name.len);
-	if (IS_ERR(d)) {
-		pr_warn("failed lookup after mkdir (%pd2, err=%i).\n",
-			dentry, err);
-		return PTR_ERR(d);
-	}
-	dput(dentry);
-	*newdentry = d;
-
-	return 0;
-}
-
 struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir,
 			       struct dentry *newdentry, struct ovl_cattr *attr)
 {
@@ -191,7 +160,8 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir,

 		case S_IFDIR:
 			/* mkdir is special... */
-			err =  ovl_mkdir_real(ofs, dir, &newdentry, attr->mode);
+			newdentry = ovl_do_mkdir(ofs, dir, newdentry, attr->mode);
+			err = PTR_ERR_OR_ZERO(newdentry);
 			break;

 		case S_IFCHR:
@@ -219,7 +189,8 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir,
 	}
 out:
 	if (err) {
-		dput(newdentry);
+		if (!IS_ERR(newdentry))
+			dput(newdentry);
 		return ERR_PTR(err);
 	}
 	return newdentry;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 0021e2025020..6f2f8f4cfbbc 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -241,13 +241,14 @@ static inline int ovl_do_create(struct ovl_fs *ofs,
 	return err;
 }

-static inline int ovl_do_mkdir(struct ovl_fs *ofs,
-			       struct inode *dir, struct dentry *dentry,
-			       umode_t mode)
+static inline struct dentry *ovl_do_mkdir(struct ovl_fs *ofs,
+					  struct inode *dir,
+					  struct dentry *dentry,
+					  umode_t mode)
 {
-	int err = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode);
-	pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, err);
-	return err;
+	dentry = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode);
+	pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, PTR_ERR_OR_ZERO(dentry));
+	return dentry;
 }

 static inline int ovl_do_mknod(struct ovl_fs *ofs,
@@ -838,8 +839,6 @@ struct ovl_cattr {

 #define OVL_CATTR(m) (&(struct ovl_cattr) { .mode = (m) })

-int ovl_mkdir_real(struct ovl_fs *ofs, struct inode *dir,
-		   struct dentry **newdentry, umode_t mode);
 struct dentry *ovl_create_real(struct ovl_fs *ofs,
 			       struct inode *dir, struct dentry *newdentry,
 			       struct ovl_cattr *attr);
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 61e21c3129e8..b63474d1b064 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -327,9 +327,10 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
 			goto retry;
 		}

-		err = ovl_mkdir_real(ofs, dir, &work, attr.ia_mode);
-		if (err)
-			goto out_dput;
+		work = ovl_do_mkdir(ofs, dir, work, attr.ia_mode);
+		err = PTR_ERR(work);
+		if (IS_ERR(work))
+			goto out_err;

 		/* Weird filesystem returning with hashed negative (kernfs)? */
 		err = -EINVAL;
diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c
index fe29acef5872..8554aa5a1059 100644
--- a/fs/smb/server/vfs.c
+++ b/fs/smb/server/vfs.c
@@ -206,8 +206,8 @@ int ksmbd_vfs_mkdir(struct ksmbd_work *work, const char *name, umode_t mode)
 {
 	struct mnt_idmap *idmap;
 	struct path path;
-	struct dentry *dentry;
-	int err;
+	struct dentry *dentry, *d;
+	int err = 0;

 	dentry = ksmbd_vfs_kern_path_create(work, name,
 					    LOOKUP_NO_SYMLINKS | LOOKUP_DIRECTORY,
@@ -222,27 +222,15 @@ int ksmbd_vfs_mkdir(struct ksmbd_work *work, const char *name, umode_t mode)

 	idmap = mnt_idmap(path.mnt);
 	mode |= S_IFDIR;
-	err = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode);
-	if (!err && d_unhashed(dentry)) {
-		struct dentry *d;
-
-		d = lookup_one(idmap, dentry->d_name.name, dentry->d_parent,
-			       dentry->d_name.len);
-		if (IS_ERR(d)) {
-			err = PTR_ERR(d);
-			goto out_err;
-		}
-		if (unlikely(d_is_negative(d))) {
-			dput(d);
-			err = -ENOENT;
-			goto out_err;
-		}
-
-		ksmbd_vfs_inherit_owner(work, d_inode(path.dentry), d_inode(d));
-		dput(d);
-	}
+	d = dentry;
+	dentry = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode);
+	if (IS_ERR(dentry))
+		err = PTR_ERR(dentry);
+	else if (d_is_negative(dentry))
+		err = -ENOENT;
+	if (!err && dentry != d)
+		ksmbd_vfs_inherit_owner(work, d_inode(path.dentry), d_inode(dentry));

-out_err:
 	done_path_create(&path, dentry);
 	if (err)
 		pr_err("mkdir(%s): creation failed (err:%d)\n", name, err);
diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c
index c287c755f2c5..3537f3cca6d5 100644
--- a/fs/xfs/scrub/orphanage.c
+++ b/fs/xfs/scrub/orphanage.c
@@ -167,10 +167,11 @@ xrep_orphanage_create(
 	 * directory to control access to a file we put in here.
 	 */
 	if (d_really_is_negative(orphanage_dentry)) {
-		error = vfs_mkdir(&nop_mnt_idmap, root_inode, orphanage_dentry,
-				0750);
-		if (error)
-			goto out_dput_orphanage;
+		orphanage_dentry = vfs_mkdir(&nop_mnt_idmap, root_inode,
+					     orphanage_dentry, 0750);
+		error = PTR_ERR(orphanage_dentry);
+		if (IS_ERR(orphanage_dentry))
+			goto out_unlock_root;
 	}

 	/* Not a directory?  Bail out. */
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8f4fbecd40fc..eaad8e31c0d4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1971,8 +1971,8 @@ bool inode_owner_or_capable(struct mnt_idmap *idmap,
  */
 int vfs_create(struct mnt_idmap *, struct inode *,
 	       struct dentry *, umode_t, bool);
-int vfs_mkdir(struct mnt_idmap *, struct inode *,
-	      struct dentry *, umode_t);
+struct dentry *vfs_mkdir(struct mnt_idmap *, struct inode *,
+			 struct dentry *, umode_t);
 int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *,
               umode_t, dev_t);
 int vfs_symlink(struct mnt_idmap *, struct inode *,
--
2.47.1

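
The reference-ownership rules described in the commit message above can
be sketched in userspace as follows. This is only a toy model under
stated assumptions: toy_dentry, toy_dput(), toy_vfs_mkdir(), the three
op_* "filesystems" and toy_caller() are hypothetical names, the
reference count is a plain int, and the ERR_PTR helpers are stand-ins
modelled on include/linux/err.h. It is not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins modelled on include/linux/err.h. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Toy dentry: only a reference count and a "has an inode" flag. */
struct toy_dentry {
	int refs;
	int positive;
};

static void toy_dput(struct toy_dentry *d)
{
	if (d)		/* like dput(), a NULL argument is a no-op */
		d->refs--;
}

/*
 * Toy ->mkdir(): returns NULL when it used the dentry it was given,
 * an alternate dentry carrying its own reference, or ERR_PTR().
 */
typedef struct toy_dentry *(*toy_mkdir_op)(struct toy_dentry *d);

/* Follows the ownership rules the patch documents for vfs_mkdir(). */
static struct toy_dentry *toy_vfs_mkdir(toy_mkdir_op op,
					struct toy_dentry *dentry)
{
	struct toy_dentry *de = op(dentry);

	if (IS_ERR(de)) {
		toy_dput(dentry);	/* error: caller's reference is consumed */
		return de;
	}
	if (de) {
		toy_dput(dentry);	/* alternate: the original is put ... */
		dentry = de;		/* ... and the alternate is returned */
	}
	return dentry;
}

/* Three toy filesystems, one per outcome. */
static struct toy_dentry toy_alias = { .refs = 0, .positive = 1 };

static struct toy_dentry *op_in_place(struct toy_dentry *d)
{
	d->positive = 1;
	return NULL;
}

static struct toy_dentry *op_alternate(struct toy_dentry *d)
{
	(void)d;
	toy_alias.refs++;
	return &toy_alias;
}

static struct toy_dentry *op_fail(struct toy_dentry *d)
{
	(void)d;
	return ERR_PTR(-EMLINK);
}

/* Caller pattern: reassign, test IS_ERR(), put only a real dentry. */
static int toy_caller(toy_mkdir_op op, struct toy_dentry *dentry)
{
	dentry = toy_vfs_mkdir(op, dentry);
	if (IS_ERR(dentry))
		return (int)PTR_ERR(dentry);	/* nothing left to put */
	toy_dput(dentry);
	return 0;
}
```

In this model every path leaves the caller holding exactly one thing to
release, or nothing at all on error, which is why the converted callers
guard their final dput() with !IS_ERR().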
* Re: [PATCH 6/6] VFS: Change vfs_mkdir() to return the dentry.
  2025-02-20 23:36 ` [PATCH 6/6] VFS: Change vfs_mkdir() to return the dentry NeilBrown
@ 2025-02-21 14:25   ` Jeff Layton
  2025-02-22  0:32   ` Chuck Lever
  1 sibling, 0 replies; 36+ messages in thread
From: Jeff Layton @ 2025-02-21 14:25 UTC
  To: NeilBrown, Alexander Viro, Christian Brauner, Jan Kara,
	Miklos Szeredi, Xiubo Li, Ilya Dryomov, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Trond Myklebust, Anna Schumaker,
	Chuck Lever, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Sergey Senozhatsky
  Cc: linux-fsdevel, linux-kernel, linux-cifs, linux-nfs, linux-um,
	ceph-devel, netfs

On Fri, 2025-02-21 at 10:36 +1100, NeilBrown wrote:
> vfs_mkdir() does not guarantee to leave the child dentry hashed or make
> it positive on success, and in many such cases the filesystem had to use
> a different dentry which it can now return.
>
> This patch changes vfs_mkdir() to return the dentry provided by the
> filesystem, which is hashed and positive when provided. This reduces
> the number of cases where the resulting dentry is not positive to a
> handful which don't deserve extra effort.
>
> The only callers of vfs_mkdir() which are interested in the resulting
> inode are in-kernel filesystem clients: cachefiles, nfsd, smb/server.
> The only filesystems that don't reliably provide the inode are:
> - kernfs, tracefs which these clients are unlikely to be interested in
> - cifs in some configurations would need to do a lookup to find the
>   created inode, but doesn't. cifs cannot be exported via NFS, is
>   unlikely to be used by cachefiles, and smb/server only has a soft
>   requirement for the inode, so this is unlikely to be a problem in
>   practice.
> - hostfs, nfs, cifs may need to do a lookup (rarely for NFS) and it is
>   possible for a race to make that lookup fail. Actual failure
>   is unlikely and, provided callers handle negative dentries gracefully,
>   they will fail safe.
>
> So this patch removes the lookup code in nfsd and smb/server and adjusts
> them to fail safe if a negative dentry is provided:
> - cachefiles already fails safe by restarting the task from the
>   top - it still does with this change, though it no longer calls
>   cachefiles_put_directory() as that will crash if the dentry is
>   negative.
> - nfsd reports "Server-fault" which is what it used to do if the lookup
>   failed. This will never happen on any filesystem that it can actually
>   export, so this is of no consequence. I removed the fh_update()
>   call as that is not needed and out-of-place. A subsequent
>   nfsd_create_setattr() call will call fh_update() when needed.
> - smb/server only wants the inode to call ksmbd_vfs_inherit_owner()
>   which updates ->i_uid (without calling notify_change() or similar)

That looks like a bug. ksmbd should really be using notify_change().
There is no guarantee that that uid will eventually be persisted. It
could get overwritten if the exported filesystem is something like Ceph
or NFS.

I see no reason why it can't use that either, as it's not in a weird
context at that point. It would probably be ideal though to make it
create the dir with the right ownership in the first place, possibly by
manipulating the task creds?

In any case, that's not directly related to your patch.

>   which can be safely skipped on cifs (I hope).
>
> If a different dentry is returned, the first one is put. If necessary
> the fact that it is new can be determined by comparing pointers. A new
> dentry will certainly have a new pointer (as the old is put after the
> new is obtained).
> Similarly if an error is returned (via ERR_PTR()) the original dentry is
> put.
> > Signed-off-by: NeilBrown <neilb@suse.de> > --- > drivers/base/devtmpfs.c | 7 +++--- > fs/cachefiles/namei.c | 16 ++++++++------ > fs/ecryptfs/inode.c | 14 ++++++++---- > fs/init.c | 7 ++++-- > fs/namei.c | 46 ++++++++++++++++++++++++++-------------- > fs/nfsd/nfs4recover.c | 7 ++++-- > fs/nfsd/vfs.c | 34 ++++++++++------------------- > fs/overlayfs/dir.c | 37 ++++---------------------------- > fs/overlayfs/overlayfs.h | 15 ++++++------- > fs/overlayfs/super.c | 7 +++--- > fs/smb/server/vfs.c | 32 +++++++++------------------- > fs/xfs/scrub/orphanage.c | 9 ++++---- > include/linux/fs.h | 4 ++-- > 13 files changed, 105 insertions(+), 130 deletions(-) > > diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c > index 7a101009bee7..6dd1a8860f1c 100644 > --- a/drivers/base/devtmpfs.c > +++ b/drivers/base/devtmpfs.c > @@ -175,18 +175,17 @@ static int dev_mkdir(const char *name, umode_t mode) > { > struct dentry *dentry; > struct path path; > - int err; > > dentry = kern_path_create(AT_FDCWD, name, &path, LOOKUP_DIRECTORY); > if (IS_ERR(dentry)) > return PTR_ERR(dentry); > > - err = vfs_mkdir(&nop_mnt_idmap, d_inode(path.dentry), dentry, mode); > - if (!err) > + dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(path.dentry), dentry, mode); > + if (!IS_ERR(dentry)) > /* mark as kernel-created inode */ > d_inode(dentry)->i_private = &thread; > done_path_create(&path, dentry); > - return err; > + return PTR_ERR_OR_ZERO(dentry); > } > > static int create_path(const char *nodepath) > diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c > index 7cf59713f0f7..83a60126de0f 100644 > --- a/fs/cachefiles/namei.c > +++ b/fs/cachefiles/namei.c > @@ -128,18 +128,19 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache, > ret = security_path_mkdir(&path, subdir, 0700); > if (ret < 0) > goto mkdir_error; > - ret = cachefiles_inject_write_error(); > - if (ret == 0) > - ret = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700); > - if (ret < 0) { > + subdir 
= ERR_PTR(cachefiles_inject_write_error()); > + if (!IS_ERR(subdir)) > + subdir = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700); > + ret = PTR_ERR(subdir); > + if (IS_ERR(subdir)) { > trace_cachefiles_vfs_error(NULL, d_inode(dir), ret, > cachefiles_trace_mkdir_error); > goto mkdir_error; > } > trace_cachefiles_mkdir(dir, subdir); > > - if (unlikely(d_unhashed(subdir))) { > - cachefiles_put_directory(subdir); > + if (unlikely(d_unhashed(subdir) || d_is_negative(subdir))) { > + dput(subdir); > goto retry; > } > ASSERT(d_backing_inode(subdir)); > @@ -195,7 +196,8 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache, > > mkdir_error: > inode_unlock(d_inode(dir)); > - dput(subdir); > + if (!IS_ERR(subdir)) > + dput(subdir); > pr_err("mkdir %s failed with error %d\n", dirname, ret); > return ERR_PTR(ret); > > diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c > index 6315dd194228..51a5c54eb740 100644 > --- a/fs/ecryptfs/inode.c > +++ b/fs/ecryptfs/inode.c > @@ -511,10 +511,16 @@ static struct dentry *ecryptfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, > struct inode *lower_dir; > > rc = lock_parent(dentry, &lower_dentry, &lower_dir); > - if (!rc) > - rc = vfs_mkdir(&nop_mnt_idmap, lower_dir, > - lower_dentry, mode); > - if (rc || d_really_is_negative(lower_dentry)) > + if (rc) > + goto out; > + > + lower_dentry = vfs_mkdir(&nop_mnt_idmap, lower_dir, > + lower_dentry, mode); > + rc = PTR_ERR(lower_dentry); > + if (IS_ERR(lower_dentry)) > + goto out; > + rc = 0; > + if (d_unhashed(lower_dentry)) > goto out; > rc = ecryptfs_interpose(lower_dentry, dentry, dir->i_sb); > if (rc) > diff --git a/fs/init.c b/fs/init.c > index e9387b6c4f30..eef5124885e3 100644 > --- a/fs/init.c > +++ b/fs/init.c > @@ -230,9 +230,12 @@ int __init init_mkdir(const char *pathname, umode_t mode) > return PTR_ERR(dentry); > mode = mode_strip_umask(d_inode(path.dentry), mode); > error = security_path_mkdir(&path, dentry, mode); > - if (!error) > - error = 
vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode, > + if (!error) { > + dentry = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode, > dentry, mode); > + if (IS_ERR(dentry)) > + error = PTR_ERR(dentry); > + } > done_path_create(&path, dentry); > return error; > } > diff --git a/fs/namei.c b/fs/namei.c > index 63fe4dc29c23..bd5eec2c0af4 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -4125,7 +4125,8 @@ EXPORT_SYMBOL(kern_path_create); > > void done_path_create(struct path *path, struct dentry *dentry) > { > - dput(dentry); > + if (!IS_ERR(dentry)) > + dput(dentry); > inode_unlock(path->dentry->d_inode); > mnt_drop_write(path->mnt); > path_put(path); > @@ -4271,7 +4272,7 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d > } > > /** > - * vfs_mkdir - create directory > + * vfs_mkdir - create directory returning correct dentry if possible > * @idmap: idmap of the mount the inode was found from > * @dir: inode of the parent directory > * @dentry: dentry of the child directory > @@ -4284,9 +4285,15 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d > * care to map the inode according to @idmap before checking permissions. > * On non-idmapped mounts or if permission checking is to be performed on the > * raw inode simply pass @nop_mnt_idmap. > + * > + * In the event that the filesystem does not use the *@dentry but leaves it > + * negative or unhashes it and possibly splices a different one returning it, > + * the original dentry is dput() and the alternate is returned. > + * > + * In case of an error the dentry is dput() and an ERR_PTR() is returned. 
> */ > -int vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, > - struct dentry *dentry, umode_t mode) > +struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, > + struct dentry *dentry, umode_t mode) > { > int error; > unsigned max_links = dir->i_sb->s_max_links; > @@ -4294,31 +4301,36 @@ int vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, > > error = may_create(idmap, dir, dentry); > if (error) > - return error; > + goto err; > > + error = -EPERM; > if (!dir->i_op->mkdir) > - return -EPERM; > + goto err; > > mode = vfs_prepare_mode(idmap, dir, mode, S_IRWXUGO | S_ISVTX, 0); > error = security_inode_mkdir(dir, dentry, mode); > if (error) > - return error; > + goto err; > > + error = -EMLINK; > if (max_links && dir->i_nlink >= max_links) > - return -EMLINK; > + goto err; > > de = dir->i_op->mkdir(idmap, dir, dentry, mode); > + error = PTR_ERR(de); > if (IS_ERR(de)) > - return PTR_ERR(de); > + goto err; > if (de) { > - fsnotify_mkdir(dir, de); > - /* Cannot return de yet */ > - dput(de); > - } else { > - fsnotify_mkdir(dir, dentry); > + dput(dentry); > + dentry = de; > } > + fsnotify_mkdir(dir, dentry); > + return dentry; > > - return 0; > +err: > + dput(dentry); > + > + return ERR_PTR(error); > } > EXPORT_SYMBOL(vfs_mkdir); > > @@ -4338,8 +4350,10 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode) > error = security_path_mkdir(&path, dentry, > mode_strip_umask(path.dentry->d_inode, mode)); > if (!error) { > - error = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode, > + dentry = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode, > dentry, mode); > + if (IS_ERR(dentry)) > + error = PTR_ERR(dentry); > } > done_path_create(&path, dentry); > if (retry_estale(error, lookup_flags)) { > diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c > index 28f4d5311c40..c1d9bd07285f 100644 > --- a/fs/nfsd/nfs4recover.c > +++ b/fs/nfsd/nfs4recover.c > @@ -233,9 +233,12 @@ nfsd4_create_clid_dir(struct nfs4_client *clp) > * as well be 
forgiving and just succeed silently. > */ > goto out_put; > - status = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, S_IRWXU); > + dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, S_IRWXU); > + if (IS_ERR(dentry)) > + status = PTR_ERR(dentry); > out_put: > - dput(dentry); > + if (!status) > + dput(dentry); > out_unlock: > inode_unlock(d_inode(dir)); > if (status == 0) { > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c > index 29cb7b812d71..34d7aa531662 100644 > --- a/fs/nfsd/vfs.c > +++ b/fs/nfsd/vfs.c > @@ -1461,7 +1461,7 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp, > struct inode *dirp; > struct iattr *iap = attrs->na_iattr; > __be32 err; > - int host_err; > + int host_err = 0; > > dentry = fhp->fh_dentry; > dirp = d_inode(dentry); > @@ -1488,28 +1488,15 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp, > nfsd_check_ignore_resizing(iap); > break; > case S_IFDIR: > - host_err = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode); > - if (!host_err && unlikely(d_unhashed(dchild))) { > - struct dentry *d; > - d = lookup_one_len(dchild->d_name.name, > - dchild->d_parent, > - dchild->d_name.len); > - if (IS_ERR(d)) { > - host_err = PTR_ERR(d); > - break; > - } > - if (unlikely(d_is_negative(d))) { > - dput(d); > - err = nfserr_serverfault; > - goto out; > - } > + dchild = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode); > + if (IS_ERR(dchild)) { > + host_err = PTR_ERR(dchild); > + } else if (d_is_negative(dchild)) { > + err = nfserr_serverfault; > + goto out; > + } else if (unlikely(dchild != resfhp->fh_dentry)) { > dput(resfhp->fh_dentry); > - resfhp->fh_dentry = dget(d); > - err = fh_update(resfhp); > - dput(dchild); > - dchild = d; > - if (err) > - goto out; > + resfhp->fh_dentry = dget(dchild); > } > break; > case S_IFCHR: > @@ -1530,7 +1517,8 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp, > err = nfsd_create_setattr(rqstp, fhp, resfhp, attrs); > > out: > - dput(dchild); > + if (!IS_ERR(dchild)) 
> + dput(dchild); > return err; > > out_nfserr: > diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c > index 21c3aaf7b274..fe493f3ed6b6 100644 > --- a/fs/overlayfs/dir.c > +++ b/fs/overlayfs/dir.c > @@ -138,37 +138,6 @@ int ovl_cleanup_and_whiteout(struct ovl_fs *ofs, struct inode *dir, > goto out; > } > > -int ovl_mkdir_real(struct ovl_fs *ofs, struct inode *dir, > - struct dentry **newdentry, umode_t mode) > -{ > - int err; > - struct dentry *d, *dentry = *newdentry; > - > - err = ovl_do_mkdir(ofs, dir, dentry, mode); > - if (err) > - return err; > - > - if (likely(!d_unhashed(dentry))) > - return 0; > - > - /* > - * vfs_mkdir() may succeed and leave the dentry passed > - * to it unhashed and negative. If that happens, try to > - * lookup a new hashed and positive dentry. > - */ > - d = ovl_lookup_upper(ofs, dentry->d_name.name, dentry->d_parent, > - dentry->d_name.len); > - if (IS_ERR(d)) { > - pr_warn("failed lookup after mkdir (%pd2, err=%i).\n", > - dentry, err); > - return PTR_ERR(d); > - } > - dput(dentry); > - *newdentry = d; > - > - return 0; > -} > - > struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir, > struct dentry *newdentry, struct ovl_cattr *attr) > { > @@ -191,7 +160,8 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir, > > case S_IFDIR: > /* mkdir is special... 
*/ > - err = ovl_mkdir_real(ofs, dir, &newdentry, attr->mode); > + newdentry = ovl_do_mkdir(ofs, dir, newdentry, attr->mode); > + err = PTR_ERR_OR_ZERO(newdentry); > break; > > case S_IFCHR: > @@ -219,7 +189,8 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir, > } > out: > if (err) { > - dput(newdentry); > + if (!IS_ERR(newdentry)) > + dput(newdentry); > return ERR_PTR(err); > } > return newdentry; > diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h > index 0021e2025020..6f2f8f4cfbbc 100644 > --- a/fs/overlayfs/overlayfs.h > +++ b/fs/overlayfs/overlayfs.h > @@ -241,13 +241,14 @@ static inline int ovl_do_create(struct ovl_fs *ofs, > return err; > } > > -static inline int ovl_do_mkdir(struct ovl_fs *ofs, > - struct inode *dir, struct dentry *dentry, > - umode_t mode) > +static inline struct dentry *ovl_do_mkdir(struct ovl_fs *ofs, > + struct inode *dir, > + struct dentry *dentry, > + umode_t mode) > { > - int err = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode); > - pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, err); > - return err; > + dentry = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode); > + pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, PTR_ERR_OR_ZERO(dentry)); > + return dentry; > } > > static inline int ovl_do_mknod(struct ovl_fs *ofs, > @@ -838,8 +839,6 @@ struct ovl_cattr { > > #define OVL_CATTR(m) (&(struct ovl_cattr) { .mode = (m) }) > > -int ovl_mkdir_real(struct ovl_fs *ofs, struct inode *dir, > - struct dentry **newdentry, umode_t mode); > struct dentry *ovl_create_real(struct ovl_fs *ofs, > struct inode *dir, struct dentry *newdentry, > struct ovl_cattr *attr); > diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c > index 61e21c3129e8..b63474d1b064 100644 > --- a/fs/overlayfs/super.c > +++ b/fs/overlayfs/super.c > @@ -327,9 +327,10 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs, > goto retry; > } > > - err = ovl_mkdir_real(ofs, dir, &work, attr.ia_mode); > - if (err) > - 
goto out_dput; > + work = ovl_do_mkdir(ofs, dir, work, attr.ia_mode); > + err = PTR_ERR(work); > + if (IS_ERR(work)) > + goto out_err; > > /* Weird filesystem returning with hashed negative (kernfs)? */ > err = -EINVAL; > diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c > index fe29acef5872..8554aa5a1059 100644 > --- a/fs/smb/server/vfs.c > +++ b/fs/smb/server/vfs.c > @@ -206,8 +206,8 @@ int ksmbd_vfs_mkdir(struct ksmbd_work *work, const char *name, umode_t mode) > { > struct mnt_idmap *idmap; > struct path path; > - struct dentry *dentry; > - int err; > + struct dentry *dentry, *d; > + int err = 0; > > dentry = ksmbd_vfs_kern_path_create(work, name, > LOOKUP_NO_SYMLINKS | LOOKUP_DIRECTORY, > @@ -222,27 +222,15 @@ int ksmbd_vfs_mkdir(struct ksmbd_work *work, const char *name, umode_t mode) > > idmap = mnt_idmap(path.mnt); > mode |= S_IFDIR; > - err = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode); > - if (!err && d_unhashed(dentry)) { > - struct dentry *d; > - > - d = lookup_one(idmap, dentry->d_name.name, dentry->d_parent, > - dentry->d_name.len); > - if (IS_ERR(d)) { > - err = PTR_ERR(d); > - goto out_err; > - } > - if (unlikely(d_is_negative(d))) { > - dput(d); > - err = -ENOENT; > - goto out_err; > - } > - > - ksmbd_vfs_inherit_owner(work, d_inode(path.dentry), d_inode(d)); > - dput(d); > - } > + d = dentry; > + dentry = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode); > + if (IS_ERR(dentry)) > + err = PTR_ERR(dentry); > + else if (d_is_negative(dentry)) > + err = -ENOENT; > + if (!err && dentry != d) > + ksmbd_vfs_inherit_owner(work, d_inode(path.dentry), d_inode(dentry)); > > -out_err: > done_path_create(&path, dentry); > if (err) > pr_err("mkdir(%s): creation failed (err:%d)\n", name, err); > diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c > index c287c755f2c5..3537f3cca6d5 100644 > --- a/fs/xfs/scrub/orphanage.c > +++ b/fs/xfs/scrub/orphanage.c > @@ -167,10 +167,11 @@ xrep_orphanage_create( > * directory to control access 
to a file we put in here. > */ > if (d_really_is_negative(orphanage_dentry)) { > - error = vfs_mkdir(&nop_mnt_idmap, root_inode, orphanage_dentry, > - 0750); > - if (error) > - goto out_dput_orphanage; > + orphanage_dentry = vfs_mkdir(&nop_mnt_idmap, root_inode, > + orphanage_dentry, 0750); > + error = PTR_ERR(orphanage_dentry); > + if (IS_ERR(orphanage_dentry)) > + goto out_unlock_root; > } > > /* Not a directory? Bail out. */ > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 8f4fbecd40fc..eaad8e31c0d4 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -1971,8 +1971,8 @@ bool inode_owner_or_capable(struct mnt_idmap *idmap, > */ > int vfs_create(struct mnt_idmap *, struct inode *, > struct dentry *, umode_t, bool); > -int vfs_mkdir(struct mnt_idmap *, struct inode *, > - struct dentry *, umode_t); > +struct dentry *vfs_mkdir(struct mnt_idmap *, struct inode *, > + struct dentry *, umode_t); > int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *, > umode_t, dev_t); > int vfs_symlink(struct mnt_idmap *, struct inode *,

Nice cleanup in the vfs_mkdir() callers.

Reviewed-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply [flat|nested] 36+ messages in thread
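The converted callers in the hunks quoted above recover an errno from the pointer that vfs_mkdir() now returns in two ways: `err = PTR_ERR(p)` followed by an `IS_ERR(p)` check, or a single `PTR_ERR_OR_ZERO(p)`. The following is only a userspace sketch of why the two agree. The helpers are simplified stand-ins for the kernel's <linux/err.h> definitions, and `toy_mkdir()` and `errno_two_step()` are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errnos. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline int PTR_ERR_OR_ZERO(const void *ptr)
{
	return IS_ERR(ptr) ? (int)PTR_ERR(ptr) : 0;
}

/* Hypothetical pointer-returning operation: fails with -errnum if errnum != 0. */
static int toy_object;

static inline void *toy_mkdir(int errnum)
{
	assert(errnum >= 0 && errnum <= MAX_ERRNO);
	return errnum ? ERR_PTR(-errnum) : (void *)&toy_object;
}

/* Two-step idiom: take PTR_ERR() first, then decide whether it is valid. */
static inline int errno_two_step(const void *p)
{
	int err = (int)PTR_ERR(p);	/* meaningless unless IS_ERR(p) ... */

	if (!IS_ERR(p))
		err = 0;		/* ... so the success path must reset it */
	return err;
}
```

In this model `errno_two_step(p)` and `PTR_ERR_OR_ZERO(p)` return the same value for every `p` that `toy_mkdir()` can produce. The same encoding is why the cleanup paths in the patch guard `dput()` with `!IS_ERR()`: an encoded errno is not a dentry.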
* Re: [PATCH 6/6] VFS: Change vfs_mkdir() to return the dentry.
  2025-02-20 23:36 ` [PATCH 6/6] VFS: Change vfs_mkdir() to return the dentry NeilBrown
  2025-02-21 14:25   ` Jeff Layton
@ 2025-02-22  0:32   ` Chuck Lever
  2025-02-24  2:51     ` NeilBrown
  1 sibling, 1 reply; 36+ messages in thread
From: Chuck Lever @ 2025-02-22  0:32 UTC (permalink / raw)
  To: NeilBrown, Alexander Viro, Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li, Ilya Dryomov, Richard Weinberger, Anton Ivanov, Johannes Berg, Trond Myklebust, Anna Schumaker, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, Sergey Senozhatsky
  Cc: linux-fsdevel, linux-kernel, linux-cifs, linux-nfs, linux-um, ceph-devel, netfs

On 2/20/25 6:36 PM, NeilBrown wrote:
> vfs_mkdir() does not guarantee to leave the child dentry hashed or make
> it positive on success, and in many such cases the filesystem had to use
> a different dentry which it can now return.
>
> This patch changes vfs_mkdir() to return the dentry provided by the
> filesystem, which is hashed and positive when provided. This reduces
> the number of cases where the resulting dentry is not positive to a
> handful which don't deserve extra effort.
>
> The only callers of vfs_mkdir() which are interested in the resulting
> inode are in-kernel filesystem clients: cachefiles, nfsd, smb/server.
> The only filesystems that don't reliably provide the inode are:
> - kernfs, tracefs which these clients are unlikely to be interested in
> - cifs in some configurations would need to do a lookup to find the
>   created inode, but doesn't. cifs cannot be exported via NFS, is
>   unlikely to be used by cachefiles, and smb/server only has a soft
>   requirement for the inode, so this is unlikely to be a problem in
>   practice.
> - hostfs, nfs, cifs may need to do a lookup (rarely for NFS) and it is
>   possible for a race to make that lookup fail. Actual failure
>   is unlikely and providing callers handle negative dentries gracefully
>   they will fail safe.
>
> So this patch removes the lookup code in nfsd and smb/server and adjusts
> them to fail safe if a negative dentry is provided:
> - cachefiles already fails safe by restarting the task from the
>   top - it still does with this change, though it no longer calls
>   cachefiles_put_directory() as that will crash if the dentry is
>   negative.
> - nfsd reports "Server-fault" which is what it used to do if the lookup
>   failed. This will never happen on any filesystem that it can actually
>   export, so this is of no consequence. I removed the fh_update()
>   call as that is not needed and out-of-place. A subsequent
>   nfsd_create_setattr() call will call fh_update() when needed.
> - smb/server only wants the inode to call ksmbd_vfs_inherit_owner()
>   which updates ->i_uid (without calling notify_change() or similar)
>   which can be safely skipped on cifs (I hope).
>
> If a different dentry is returned, the first one is put. If necessary
> the fact that it is new can be determined by comparing pointers. A new
> dentry will certainly have a new pointer (as the old is put after the
> new is obtained).
> Similarly if an error is returned (via ERR_PTR()) the original dentry is
> put.
> > Signed-off-by: NeilBrown <neilb@suse.de> > --- > drivers/base/devtmpfs.c | 7 +++--- > fs/cachefiles/namei.c | 16 ++++++++------ > fs/ecryptfs/inode.c | 14 ++++++++---- > fs/init.c | 7 ++++-- > fs/namei.c | 46 ++++++++++++++++++++++++++-------------- > fs/nfsd/nfs4recover.c | 7 ++++-- > fs/nfsd/vfs.c | 34 ++++++++++------------------- > fs/overlayfs/dir.c | 37 ++++---------------------------- > fs/overlayfs/overlayfs.h | 15 ++++++------- > fs/overlayfs/super.c | 7 +++--- > fs/smb/server/vfs.c | 32 +++++++++------------------- > fs/xfs/scrub/orphanage.c | 9 ++++---- > include/linux/fs.h | 4 ++-- > 13 files changed, 105 insertions(+), 130 deletions(-) > > diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c > index 7a101009bee7..6dd1a8860f1c 100644 > --- a/drivers/base/devtmpfs.c > +++ b/drivers/base/devtmpfs.c > @@ -175,18 +175,17 @@ static int dev_mkdir(const char *name, umode_t mode) > { > struct dentry *dentry; > struct path path; > - int err; > > dentry = kern_path_create(AT_FDCWD, name, &path, LOOKUP_DIRECTORY); > if (IS_ERR(dentry)) > return PTR_ERR(dentry); > > - err = vfs_mkdir(&nop_mnt_idmap, d_inode(path.dentry), dentry, mode); > - if (!err) > + dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(path.dentry), dentry, mode); > + if (!IS_ERR(dentry)) > /* mark as kernel-created inode */ > d_inode(dentry)->i_private = &thread; > done_path_create(&path, dentry); > - return err; > + return PTR_ERR_OR_ZERO(dentry); > } > > static int create_path(const char *nodepath) > diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c > index 7cf59713f0f7..83a60126de0f 100644 > --- a/fs/cachefiles/namei.c > +++ b/fs/cachefiles/namei.c > @@ -128,18 +128,19 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache, > ret = security_path_mkdir(&path, subdir, 0700); > if (ret < 0) > goto mkdir_error; > - ret = cachefiles_inject_write_error(); > - if (ret == 0) > - ret = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700); > - if (ret < 0) { > + subdir 
= ERR_PTR(cachefiles_inject_write_error()); > + if (!IS_ERR(subdir)) > + subdir = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700); > + ret = PTR_ERR(subdir); > + if (IS_ERR(subdir)) { > trace_cachefiles_vfs_error(NULL, d_inode(dir), ret, > cachefiles_trace_mkdir_error); > goto mkdir_error; > } > trace_cachefiles_mkdir(dir, subdir); > > - if (unlikely(d_unhashed(subdir))) { > - cachefiles_put_directory(subdir); > + if (unlikely(d_unhashed(subdir) || d_is_negative(subdir))) { > + dput(subdir); > goto retry; > } > ASSERT(d_backing_inode(subdir)); > @@ -195,7 +196,8 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache, > > mkdir_error: > inode_unlock(d_inode(dir)); > - dput(subdir); > + if (!IS_ERR(subdir)) > + dput(subdir); > pr_err("mkdir %s failed with error %d\n", dirname, ret); > return ERR_PTR(ret); > > diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c > index 6315dd194228..51a5c54eb740 100644 > --- a/fs/ecryptfs/inode.c > +++ b/fs/ecryptfs/inode.c > @@ -511,10 +511,16 @@ static struct dentry *ecryptfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, > struct inode *lower_dir; > > rc = lock_parent(dentry, &lower_dentry, &lower_dir); > - if (!rc) > - rc = vfs_mkdir(&nop_mnt_idmap, lower_dir, > - lower_dentry, mode); > - if (rc || d_really_is_negative(lower_dentry)) > + if (rc) > + goto out; > + > + lower_dentry = vfs_mkdir(&nop_mnt_idmap, lower_dir, > + lower_dentry, mode); > + rc = PTR_ERR(lower_dentry); > + if (IS_ERR(lower_dentry)) > + goto out; > + rc = 0; > + if (d_unhashed(lower_dentry)) > goto out; > rc = ecryptfs_interpose(lower_dentry, dentry, dir->i_sb); > if (rc) > diff --git a/fs/init.c b/fs/init.c > index e9387b6c4f30..eef5124885e3 100644 > --- a/fs/init.c > +++ b/fs/init.c > @@ -230,9 +230,12 @@ int __init init_mkdir(const char *pathname, umode_t mode) > return PTR_ERR(dentry); > mode = mode_strip_umask(d_inode(path.dentry), mode); > error = security_path_mkdir(&path, dentry, mode); > - if (!error) > - error = 
vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode, > + if (!error) { > + dentry = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode, > dentry, mode); > + if (IS_ERR(dentry)) > + error = PTR_ERR(dentry); > + } > done_path_create(&path, dentry); > return error; > } > diff --git a/fs/namei.c b/fs/namei.c > index 63fe4dc29c23..bd5eec2c0af4 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -4125,7 +4125,8 @@ EXPORT_SYMBOL(kern_path_create); > > void done_path_create(struct path *path, struct dentry *dentry) > { > - dput(dentry); > + if (!IS_ERR(dentry)) > + dput(dentry); > inode_unlock(path->dentry->d_inode); > mnt_drop_write(path->mnt); > path_put(path); > @@ -4271,7 +4272,7 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d > } > > /** > - * vfs_mkdir - create directory > + * vfs_mkdir - create directory returning correct dentry if possible > * @idmap: idmap of the mount the inode was found from > * @dir: inode of the parent directory > * @dentry: dentry of the child directory > @@ -4284,9 +4285,15 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d > * care to map the inode according to @idmap before checking permissions. > * On non-idmapped mounts or if permission checking is to be performed on the > * raw inode simply pass @nop_mnt_idmap. > + * > + * In the event that the filesystem does not use the *@dentry but leaves it > + * negative or unhashes it and possibly splices a different one returning it, > + * the original dentry is dput() and the alternate is returned. > + * > + * In case of an error the dentry is dput() and an ERR_PTR() is returned. 
> */ > -int vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, > - struct dentry *dentry, umode_t mode) > +struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, > + struct dentry *dentry, umode_t mode) > { > int error; > unsigned max_links = dir->i_sb->s_max_links; > @@ -4294,31 +4301,36 @@ int vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, > > error = may_create(idmap, dir, dentry); > if (error) > - return error; > + goto err; > > + error = -EPERM; > if (!dir->i_op->mkdir) > - return -EPERM; > + goto err; > > mode = vfs_prepare_mode(idmap, dir, mode, S_IRWXUGO | S_ISVTX, 0); > error = security_inode_mkdir(dir, dentry, mode); > if (error) > - return error; > + goto err; > > + error = -EMLINK; > if (max_links && dir->i_nlink >= max_links) > - return -EMLINK; > + goto err; > > de = dir->i_op->mkdir(idmap, dir, dentry, mode); > + error = PTR_ERR(de); > if (IS_ERR(de)) > - return PTR_ERR(de); > + goto err; > if (de) { > - fsnotify_mkdir(dir, de); > - /* Cannot return de yet */ > - dput(de); > - } else { > - fsnotify_mkdir(dir, dentry); > + dput(dentry); > + dentry = de; > } > + fsnotify_mkdir(dir, dentry); > + return dentry; > > - return 0; > +err: > + dput(dentry); > + > + return ERR_PTR(error); > } > EXPORT_SYMBOL(vfs_mkdir); > > @@ -4338,8 +4350,10 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode) > error = security_path_mkdir(&path, dentry, > mode_strip_umask(path.dentry->d_inode, mode)); > if (!error) { > - error = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode, > + dentry = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode, > dentry, mode); > + if (IS_ERR(dentry)) > + error = PTR_ERR(dentry); > } > done_path_create(&path, dentry); > if (retry_estale(error, lookup_flags)) { > diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c > index 28f4d5311c40..c1d9bd07285f 100644 > --- a/fs/nfsd/nfs4recover.c > +++ b/fs/nfsd/nfs4recover.c > @@ -233,9 +233,12 @@ nfsd4_create_clid_dir(struct nfs4_client *clp) > * as well be 
forgiving and just succeed silently. > */ > goto out_put; > - status = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, S_IRWXU); > + dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, S_IRWXU); > + if (IS_ERR(dentry)) > + status = PTR_ERR(dentry); > out_put: > - dput(dentry); > + if (!status) > + dput(dentry); > out_unlock: > inode_unlock(d_inode(dir)); > if (status == 0) { > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c > index 29cb7b812d71..34d7aa531662 100644 > --- a/fs/nfsd/vfs.c > +++ b/fs/nfsd/vfs.c > @@ -1461,7 +1461,7 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp, > struct inode *dirp; > struct iattr *iap = attrs->na_iattr; > __be32 err; > - int host_err; > + int host_err = 0; > > dentry = fhp->fh_dentry; > dirp = d_inode(dentry); > @@ -1488,28 +1488,15 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp, > nfsd_check_ignore_resizing(iap); > break; > case S_IFDIR: > - host_err = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode); > - if (!host_err && unlikely(d_unhashed(dchild))) { > - struct dentry *d; > - d = lookup_one_len(dchild->d_name.name, > - dchild->d_parent, > - dchild->d_name.len); > - if (IS_ERR(d)) { > - host_err = PTR_ERR(d); > - break; > - } > - if (unlikely(d_is_negative(d))) { > - dput(d); > - err = nfserr_serverfault; > - goto out; > - } > + dchild = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode); > + if (IS_ERR(dchild)) { > + host_err = PTR_ERR(dchild); > + } else if (d_is_negative(dchild)) { > + err = nfserr_serverfault; > + goto out; > + } else if (unlikely(dchild != resfhp->fh_dentry)) { > dput(resfhp->fh_dentry); > - resfhp->fh_dentry = dget(d); > - err = fh_update(resfhp);

Hi Neil, why is this fh_update() call no longer necessary?

> - dput(dchild); > - dchild = d; > - if (err) > - goto out; > + resfhp->fh_dentry = dget(dchild); > } > break; > case S_IFCHR: > @@ -1530,7 +1517,8 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp, > err = nfsd_create_setattr(rqstp, fhp, resfhp, attrs); > > out: > - dput(dchild); > + if (!IS_ERR(dchild)) > + dput(dchild); > return err; > > out_nfserr: > diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c > index 21c3aaf7b274..fe493f3ed6b6 100644 > --- a/fs/overlayfs/dir.c > +++ b/fs/overlayfs/dir.c > @@ -138,37 +138,6 @@ int ovl_cleanup_and_whiteout(struct ovl_fs *ofs, struct inode *dir, > goto out; > } > > -int ovl_mkdir_real(struct ovl_fs *ofs, struct inode *dir, > - struct dentry **newdentry, umode_t mode) > -{ > - int err; > - struct dentry *d, *dentry = *newdentry; > - > - err = ovl_do_mkdir(ofs, dir, dentry, mode); > - if (err) > - return err; > - > - if (likely(!d_unhashed(dentry))) > - return 0; > - > - /* > - * vfs_mkdir() may succeed and leave the dentry passed > - * to it unhashed and negative. If that happens, try to > - * lookup a new hashed and positive dentry. > - */ > - d = ovl_lookup_upper(ofs, dentry->d_name.name, dentry->d_parent, > - dentry->d_name.len); > - if (IS_ERR(d)) { > - pr_warn("failed lookup after mkdir (%pd2, err=%i).\n", > - dentry, err); > - return PTR_ERR(d); > - } > - dput(dentry); > - *newdentry = d; > - > - return 0; > -} > - > struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir, > struct dentry *newdentry, struct ovl_cattr *attr) > { > @@ -191,7 +160,8 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir, > > case S_IFDIR: > /* mkdir is special... 
*/ > - err = ovl_mkdir_real(ofs, dir, &newdentry, attr->mode); > + newdentry = ovl_do_mkdir(ofs, dir, newdentry, attr->mode); > + err = PTR_ERR_OR_ZERO(newdentry); > break; > > case S_IFCHR: > @@ -219,7 +189,8 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir, > } > out: > if (err) { > - dput(newdentry); > + if (!IS_ERR(newdentry)) > + dput(newdentry); > return ERR_PTR(err); > } > return newdentry; > diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h > index 0021e2025020..6f2f8f4cfbbc 100644 > --- a/fs/overlayfs/overlayfs.h > +++ b/fs/overlayfs/overlayfs.h > @@ -241,13 +241,14 @@ static inline int ovl_do_create(struct ovl_fs *ofs, > return err; > } > > -static inline int ovl_do_mkdir(struct ovl_fs *ofs, > - struct inode *dir, struct dentry *dentry, > - umode_t mode) > +static inline struct dentry *ovl_do_mkdir(struct ovl_fs *ofs, > + struct inode *dir, > + struct dentry *dentry, > + umode_t mode) > { > - int err = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode); > - pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, err); > - return err; > + dentry = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode); > + pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, PTR_ERR_OR_ZERO(dentry)); > + return dentry; > } > > static inline int ovl_do_mknod(struct ovl_fs *ofs, > @@ -838,8 +839,6 @@ struct ovl_cattr { > > #define OVL_CATTR(m) (&(struct ovl_cattr) { .mode = (m) }) > > -int ovl_mkdir_real(struct ovl_fs *ofs, struct inode *dir, > - struct dentry **newdentry, umode_t mode); > struct dentry *ovl_create_real(struct ovl_fs *ofs, > struct inode *dir, struct dentry *newdentry, > struct ovl_cattr *attr); > diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c > index 61e21c3129e8..b63474d1b064 100644 > --- a/fs/overlayfs/super.c > +++ b/fs/overlayfs/super.c > @@ -327,9 +327,10 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs, > goto retry; > } > > - err = ovl_mkdir_real(ofs, dir, &work, attr.ia_mode); > - if (err) > - 
goto out_dput; > + work = ovl_do_mkdir(ofs, dir, work, attr.ia_mode); > + err = PTR_ERR(work); > + if (IS_ERR(work)) > + goto out_err; > > /* Weird filesystem returning with hashed negative (kernfs)? */ > err = -EINVAL; > diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c > index fe29acef5872..8554aa5a1059 100644 > --- a/fs/smb/server/vfs.c > +++ b/fs/smb/server/vfs.c > @@ -206,8 +206,8 @@ int ksmbd_vfs_mkdir(struct ksmbd_work *work, const char *name, umode_t mode) > { > struct mnt_idmap *idmap; > struct path path; > - struct dentry *dentry; > - int err; > + struct dentry *dentry, *d; > + int err = 0; > > dentry = ksmbd_vfs_kern_path_create(work, name, > LOOKUP_NO_SYMLINKS | LOOKUP_DIRECTORY, > @@ -222,27 +222,15 @@ int ksmbd_vfs_mkdir(struct ksmbd_work *work, const char *name, umode_t mode) > > idmap = mnt_idmap(path.mnt); > mode |= S_IFDIR; > - err = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode); > - if (!err && d_unhashed(dentry)) { > - struct dentry *d; > - > - d = lookup_one(idmap, dentry->d_name.name, dentry->d_parent, > - dentry->d_name.len); > - if (IS_ERR(d)) { > - err = PTR_ERR(d); > - goto out_err; > - } > - if (unlikely(d_is_negative(d))) { > - dput(d); > - err = -ENOENT; > - goto out_err; > - } > - > - ksmbd_vfs_inherit_owner(work, d_inode(path.dentry), d_inode(d)); > - dput(d); > - } > + d = dentry; > + dentry = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode); > + if (IS_ERR(dentry)) > + err = PTR_ERR(dentry); > + else if (d_is_negative(dentry)) > + err = -ENOENT; > + if (!err && dentry != d) > + ksmbd_vfs_inherit_owner(work, d_inode(path.dentry), d_inode(dentry)); > > -out_err: > done_path_create(&path, dentry); > if (err) > pr_err("mkdir(%s): creation failed (err:%d)\n", name, err); > diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c > index c287c755f2c5..3537f3cca6d5 100644 > --- a/fs/xfs/scrub/orphanage.c > +++ b/fs/xfs/scrub/orphanage.c > @@ -167,10 +167,11 @@ xrep_orphanage_create( > * directory to control access 
to a file we put in here. > */ > if (d_really_is_negative(orphanage_dentry)) { > - error = vfs_mkdir(&nop_mnt_idmap, root_inode, orphanage_dentry, > - 0750); > - if (error) > - goto out_dput_orphanage; > + orphanage_dentry = vfs_mkdir(&nop_mnt_idmap, root_inode, > + orphanage_dentry, 0750); > + error = PTR_ERR(orphanage_dentry); > + if (IS_ERR(orphanage_dentry)) > + goto out_unlock_root; > } > > /* Not a directory? Bail out. */ > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 8f4fbecd40fc..eaad8e31c0d4 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -1971,8 +1971,8 @@ bool inode_owner_or_capable(struct mnt_idmap *idmap, > */ > int vfs_create(struct mnt_idmap *, struct inode *, > struct dentry *, umode_t, bool); > -int vfs_mkdir(struct mnt_idmap *, struct inode *, > - struct dentry *, umode_t); > +struct dentry *vfs_mkdir(struct mnt_idmap *, struct inode *, > + struct dentry *, umode_t); > int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *, > umode_t, dev_t); > int vfs_symlink(struct mnt_idmap *, struct inode *,

--
Chuck Lever

^ permalink raw reply [flat|nested] 36+ messages in thread
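The ownership rules stated in the commit message quoted above (the passed-in dentry is put when a different one is returned or when an error is returned, and a caller can detect a replacement by comparing pointers) can be modelled in userspace. This is a sketch under stated assumptions, not kernel code: `struct toy_dentry`, `toy_dget()`, `toy_dput()`, `toy_fs_mkdir()`, `toy_vfs_mkdir()` and `toy_caller()` are hypothetical names, reference counting is reduced to a plain integer, and the error helpers are simplified stand-ins for <linux/err.h>.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Toy dentry (hypothetical): a reference count and a "has an inode" flag. */
struct toy_dentry {
	int refcount;
	int positive;
};

static inline struct toy_dentry *toy_dget(struct toy_dentry *d)
{
	d->refcount++;
	return d;
}

static inline void toy_dput(struct toy_dentry *d)
{
	assert(d->refcount > 0);	/* catches a double put in the model */
	d->refcount--;
}

enum toy_outcome { TOY_USED_ARG, TOY_SPLICED_OTHER, TOY_FAILED };

/*
 * Toy ->mkdir(): NULL means "the dentry passed in was used", a non-NULL
 * pointer is a different, referenced dentry, and ERR_PTR() is a failure.
 */
static inline struct toy_dentry *toy_fs_mkdir(struct toy_dentry *dentry,
					      struct toy_dentry *other,
					      enum toy_outcome outcome)
{
	switch (outcome) {
	case TOY_USED_ARG:
		dentry->positive = 1;
		return NULL;
	case TOY_SPLICED_OTHER:
		other->positive = 1;
		return toy_dget(other);
	default:
		return ERR_PTR(-EIO);
	}
}

/* Models the tail of the patched vfs_mkdir(): the argument is consumed. */
static inline struct toy_dentry *toy_vfs_mkdir(struct toy_dentry *dentry,
					       struct toy_dentry *other,
					       enum toy_outcome outcome)
{
	struct toy_dentry *de = toy_fs_mkdir(dentry, other, outcome);

	if (IS_ERR(de)) {
		toy_dput(dentry);	/* error: original reference dropped */
		return de;
	}
	if (de) {
		toy_dput(dentry);	/* replaced: drop the original ... */
		dentry = de;		/* ... and return the new one */
	}
	return dentry;
}

/*
 * Caller in the style of the patched ksmbd_vfs_mkdir(). Returns a negative
 * errno, 0 if the original dentry came back, or 1 if a different one did.
 */
static inline int toy_caller(struct toy_dentry *dentry,
			     struct toy_dentry *other,
			     enum toy_outcome outcome)
{
	struct toy_dentry *d = dentry;	/* remember the original pointer */
	int replaced;

	dentry = toy_vfs_mkdir(dentry, other, outcome);
	if (IS_ERR(dentry))
		return (int)PTR_ERR(dentry);	/* nothing left to put */
	replaced = (dentry != d);
	toy_dput(dentry);		/* exactly one reference to drop */
	return replaced;
}
```

In all three outcomes the caller ends up holding at most one reference and never puts the dentry it originally passed in, which appears to be the property that lets the nfsd, smb/server and overlayfs callers drop their own post-mkdir lookups.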
* Re: [PATCH 6/6] VFS: Change vfs_mkdir() to return the dentry.
  2025-02-22  0:32 ` Chuck Lever
@ 2025-02-24  2:51 ` NeilBrown
  2025-02-24 14:22   ` Chuck Lever
  0 siblings, 1 reply; 36+ messages in thread
From: NeilBrown @ 2025-02-24  2:51 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li, Ilya Dryomov, Richard Weinberger, Anton Ivanov, Johannes Berg, Trond Myklebust, Anna Schumaker, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, Sergey Senozhatsky, linux-fsdevel, linux-kernel, linux-cifs, linux-nfs, linux-um, ceph-devel, netfs

On Sat, 22 Feb 2025, Chuck Lever wrote:
> On 2/20/25 6:36 PM, NeilBrown wrote:
...
> > +		dchild = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode);
> > +		if (IS_ERR(dchild)) {
> > +			host_err = PTR_ERR(dchild);
> > +		} else if (d_is_negative(dchild)) {
> > +			err = nfserr_serverfault;
> > +			goto out;
> > +		} else if (unlikely(dchild != resfhp->fh_dentry)) {
> >  			dput(resfhp->fh_dentry);
> > -			resfhp->fh_dentry = dget(d);
> > -			err = fh_update(resfhp);
>
> Hi Neil, why is this fh_update() call no longer necessary?
>

I tried to explain that in the commit message:

   I removed the fh_update()
   call as that is not needed and out-of-place. A subsequent
   nfsd_create_setattr() call will call fh_update() when needed.

I don't think the fh_update() was needed even when first added in
Commit 3819bb0d79f5 ("nfsd: vfs_mkdir() might succeed leaving dentry negative unhashed")
as there was already an fh_update() call later in the function.

Thanks,
NeilBrown

^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 6/6] VFS: Change vfs_mkdir() to return the dentry.
  2025-02-24  2:51 ` NeilBrown
@ 2025-02-24 14:22 ` Chuck Lever
  0 siblings, 0 replies; 36+ messages in thread
From: Chuck Lever @ 2025-02-24 14:22 UTC (permalink / raw)
  To: NeilBrown
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Miklos Szeredi, Xiubo Li, Ilya Dryomov, Richard Weinberger, Anton Ivanov, Johannes Berg, Trond Myklebust, Anna Schumaker, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, Sergey Senozhatsky, linux-fsdevel, linux-kernel, linux-cifs, linux-nfs, linux-um, ceph-devel, netfs

On 2/23/25 9:51 PM, NeilBrown wrote:
> On Sat, 22 Feb 2025, Chuck Lever wrote:
>> On 2/20/25 6:36 PM, NeilBrown wrote:
> ...
>>> +		dchild = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode);
>>> +		if (IS_ERR(dchild)) {
>>> +			host_err = PTR_ERR(dchild);
>>> +		} else if (d_is_negative(dchild)) {
>>> +			err = nfserr_serverfault;
>>> +			goto out;
>>> +		} else if (unlikely(dchild != resfhp->fh_dentry)) {
>>>  			dput(resfhp->fh_dentry);
>>> -			resfhp->fh_dentry = dget(d);
>>> -			err = fh_update(resfhp);
>>
>> Hi Neil, why is this fh_update() call no longer necessary?
>>
>
> I tried to explain that in the commit message:
>
>    I removed the fh_update()
>    call as that is not needed and out-of-place. A subsequent
>    nfsd_create_setattr() call will call fh_update() when needed.
>
> I don't think the fh_update() was needed even when first added in
> Commit 3819bb0d79f5 ("nfsd: vfs_mkdir() might succeed leaving dentry negative unhashed")
>
> as there was already an fh_update() call later in the function.

Thanks for the patch description verbiage, and sorry I missed it.
Even so, IMHO this belongs in a separate patch instead of buried in
this unrelated API change. This doesn't fix a bug nor is it necessary
for changing the return value of vfs_mkdir() AFAICT. At the very
least, a separate patch makes it possible to include a sensible
reference to 3819bb0d79f5, which is helpful.
IME these tiny weird looking warts often have a purpose that is revealed only once the code is made to look reasonable. Make the fh_update() removal a pre-requisite clean-up to this patch, maybe? > Thanks, > NeilBrown > > > >> >>> - dput(dchild); >>> - dchild = d; >>> - if (err) >>> - goto out; >>> + resfhp->fh_dentry = dget(dchild); >>> } >>> break; >>> case S_IFCHR: >>> @@ -1530,7 +1517,8 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp, >>> err = nfsd_create_setattr(rqstp, fhp, resfhp, attrs); >>> >>> out: >>> - dput(dchild); >>> + if (!IS_ERR(dchild)) >>> + dput(dchild); >>> return err; >>> >>> out_nfserr: >>> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c >>> index 21c3aaf7b274..fe493f3ed6b6 100644 >>> --- a/fs/overlayfs/dir.c >>> +++ b/fs/overlayfs/dir.c >>> @@ -138,37 +138,6 @@ int ovl_cleanup_and_whiteout(struct ovl_fs *ofs, struct inode *dir, >>> goto out; >>> } >>> >>> -int ovl_mkdir_real(struct ovl_fs *ofs, struct inode *dir, >>> - struct dentry **newdentry, umode_t mode) >>> -{ >>> - int err; >>> - struct dentry *d, *dentry = *newdentry; >>> - >>> - err = ovl_do_mkdir(ofs, dir, dentry, mode); >>> - if (err) >>> - return err; >>> - >>> - if (likely(!d_unhashed(dentry))) >>> - return 0; >>> - >>> - /* >>> - * vfs_mkdir() may succeed and leave the dentry passed >>> - * to it unhashed and negative. If that happens, try to >>> - * lookup a new hashed and positive dentry. 
>>> - */ >>> - d = ovl_lookup_upper(ofs, dentry->d_name.name, dentry->d_parent, >>> - dentry->d_name.len); >>> - if (IS_ERR(d)) { >>> - pr_warn("failed lookup after mkdir (%pd2, err=%i).\n", >>> - dentry, err); >>> - return PTR_ERR(d); >>> - } >>> - dput(dentry); >>> - *newdentry = d; >>> - >>> - return 0; >>> -} >>> - >>> struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir, >>> struct dentry *newdentry, struct ovl_cattr *attr) >>> { >>> @@ -191,7 +160,8 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir, >>> >>> case S_IFDIR: >>> /* mkdir is special... */ >>> - err = ovl_mkdir_real(ofs, dir, &newdentry, attr->mode); >>> + newdentry = ovl_do_mkdir(ofs, dir, newdentry, attr->mode); >>> + err = PTR_ERR_OR_ZERO(newdentry); >>> break; >>> >>> case S_IFCHR: >>> @@ -219,7 +189,8 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir, >>> } >>> out: >>> if (err) { >>> - dput(newdentry); >>> + if (!IS_ERR(newdentry)) >>> + dput(newdentry); >>> return ERR_PTR(err); >>> } >>> return newdentry; >>> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h >>> index 0021e2025020..6f2f8f4cfbbc 100644 >>> --- a/fs/overlayfs/overlayfs.h >>> +++ b/fs/overlayfs/overlayfs.h >>> @@ -241,13 +241,14 @@ static inline int ovl_do_create(struct ovl_fs *ofs, >>> return err; >>> } >>> >>> -static inline int ovl_do_mkdir(struct ovl_fs *ofs, >>> - struct inode *dir, struct dentry *dentry, >>> - umode_t mode) >>> +static inline struct dentry *ovl_do_mkdir(struct ovl_fs *ofs, >>> + struct inode *dir, >>> + struct dentry *dentry, >>> + umode_t mode) >>> { >>> - int err = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode); >>> - pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, err); >>> - return err; >>> + dentry = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode); >>> + pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, PTR_ERR_OR_ZERO(dentry)); >>> + return dentry; >>> } >>> >>> static inline int ovl_do_mknod(struct ovl_fs 
*ofs, >>> @@ -838,8 +839,6 @@ struct ovl_cattr { >>> >>> #define OVL_CATTR(m) (&(struct ovl_cattr) { .mode = (m) }) >>> >>> -int ovl_mkdir_real(struct ovl_fs *ofs, struct inode *dir, >>> - struct dentry **newdentry, umode_t mode); >>> struct dentry *ovl_create_real(struct ovl_fs *ofs, >>> struct inode *dir, struct dentry *newdentry, >>> struct ovl_cattr *attr); >>> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c >>> index 61e21c3129e8..b63474d1b064 100644 >>> --- a/fs/overlayfs/super.c >>> +++ b/fs/overlayfs/super.c >>> @@ -327,9 +327,10 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs, >>> goto retry; >>> } >>> >>> - err = ovl_mkdir_real(ofs, dir, &work, attr.ia_mode); >>> - if (err) >>> - goto out_dput; >>> + work = ovl_do_mkdir(ofs, dir, work, attr.ia_mode); >>> + err = PTR_ERR(work); >>> + if (IS_ERR(work)) >>> + goto out_err; >>> >>> /* Weird filesystem returning with hashed negative (kernfs)? */ >>> err = -EINVAL; >>> diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c >>> index fe29acef5872..8554aa5a1059 100644 >>> --- a/fs/smb/server/vfs.c >>> +++ b/fs/smb/server/vfs.c >>> @@ -206,8 +206,8 @@ int ksmbd_vfs_mkdir(struct ksmbd_work *work, const char *name, umode_t mode) >>> { >>> struct mnt_idmap *idmap; >>> struct path path; >>> - struct dentry *dentry; >>> - int err; >>> + struct dentry *dentry, *d; >>> + int err = 0; >>> >>> dentry = ksmbd_vfs_kern_path_create(work, name, >>> LOOKUP_NO_SYMLINKS | LOOKUP_DIRECTORY, >>> @@ -222,27 +222,15 @@ int ksmbd_vfs_mkdir(struct ksmbd_work *work, const char *name, umode_t mode) >>> >>> idmap = mnt_idmap(path.mnt); >>> mode |= S_IFDIR; >>> - err = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode); >>> - if (!err && d_unhashed(dentry)) { >>> - struct dentry *d; >>> - >>> - d = lookup_one(idmap, dentry->d_name.name, dentry->d_parent, >>> - dentry->d_name.len); >>> - if (IS_ERR(d)) { >>> - err = PTR_ERR(d); >>> - goto out_err; >>> - } >>> - if (unlikely(d_is_negative(d))) { >>> - dput(d); >>> - 
err = -ENOENT; >>> - goto out_err; >>> - } >>> - >>> - ksmbd_vfs_inherit_owner(work, d_inode(path.dentry), d_inode(d)); >>> - dput(d); >>> - } >>> + d = dentry; >>> + dentry = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode); >>> + if (IS_ERR(dentry)) >>> + err = PTR_ERR(dentry); >>> + else if (d_is_negative(dentry)) >>> + err = -ENOENT; >>> + if (!err && dentry != d) >>> + ksmbd_vfs_inherit_owner(work, d_inode(path.dentry), d_inode(dentry)); >>> >>> -out_err: >>> done_path_create(&path, dentry); >>> if (err) >>> pr_err("mkdir(%s): creation failed (err:%d)\n", name, err); >>> diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c >>> index c287c755f2c5..3537f3cca6d5 100644 >>> --- a/fs/xfs/scrub/orphanage.c >>> +++ b/fs/xfs/scrub/orphanage.c >>> @@ -167,10 +167,11 @@ xrep_orphanage_create( >>> * directory to control access to a file we put in here. >>> */ >>> if (d_really_is_negative(orphanage_dentry)) { >>> - error = vfs_mkdir(&nop_mnt_idmap, root_inode, orphanage_dentry, >>> - 0750); >>> - if (error) >>> - goto out_dput_orphanage; >>> + orphanage_dentry = vfs_mkdir(&nop_mnt_idmap, root_inode, >>> + orphanage_dentry, 0750); >>> + error = PTR_ERR(orphanage_dentry); >>> + if (IS_ERR(orphanage_dentry)) >>> + goto out_unlock_root; >>> } >>> >>> /* Not a directory? Bail out. 
*/ >>> diff --git a/include/linux/fs.h b/include/linux/fs.h >>> index 8f4fbecd40fc..eaad8e31c0d4 100644 >>> --- a/include/linux/fs.h >>> +++ b/include/linux/fs.h >>> @@ -1971,8 +1971,8 @@ bool inode_owner_or_capable(struct mnt_idmap *idmap, >>> */ >>> int vfs_create(struct mnt_idmap *, struct inode *, >>> struct dentry *, umode_t, bool); >>> -int vfs_mkdir(struct mnt_idmap *, struct inode *, >>> - struct dentry *, umode_t); >>> +struct dentry *vfs_mkdir(struct mnt_idmap *, struct inode *, >>> + struct dentry *, umode_t); >>> int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *, >>> umode_t, dev_t); >>> int vfs_symlink(struct mnt_idmap *, struct inode *, >> >> >> -- >> Chuck Lever >> > -- Chuck Lever ^ permalink raw reply [flat|nested] 36+ messages in thread
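For anyone following the caller-side hunks quoted above (nfsd, ksmbd, xfs scrub), here is a minimal userspace sketch of the pattern they converge on. It is not kernel code: struct dentry, dput(), dget(), model_vfs_mkdir() and make_dir() are hypothetical stand-ins, and the ERR_PTR()/IS_ERR()/PTR_ERR() helpers are simplified copies of the kernel idiom. The assumption, inferred only from the hunks (dput() skipped when IS_ERR(), xfs jumping to out_unlock_root), is that the new vfs_mkdir() consumes the reference that was passed in whenever it returns an error or a different dentry.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Toy dentry: only a reference count and a "has an inode" flag. */
struct dentry {
	int refcount;
	int positive;
};

/* Simplified userspace copies of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long err) { return (void *)err; }
static inline long PTR_ERR(const void *p) { return (long)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static void dput(struct dentry *d) { d->refcount--; }
static struct dentry *dget(struct dentry *d) { d->refcount++; return d; }

/* Outcomes the model can simulate (hypothetical, for illustration only). */
enum outcome { MKDIR_OK_SAME, MKDIR_OK_OTHER, MKDIR_OK_NEGATIVE, MKDIR_FAIL };

/* Stands in for a dentry that was spliced in by the filesystem. */
static struct dentry spliced = { .refcount = 0, .positive = 1 };

/*
 * Hypothetical model of the new vfs_mkdir() contract (an assumption):
 * on error, or when another dentry is returned, the reference on the
 * passed-in dentry is dropped here rather than by the caller.
 */
static struct dentry *model_vfs_mkdir(struct dentry *dentry, enum outcome how)
{
	switch (how) {
	case MKDIR_OK_SAME:
		dentry->positive = 1;
		return dentry;
	case MKDIR_OK_NEGATIVE:
		return dentry;		/* success, but left negative */
	case MKDIR_OK_OTHER:
		dput(dentry);
		return dget(&spliced);
	default:
		dput(dentry);
		return ERR_PTR(-EIO);
	}
}

/* Caller pattern: check IS_ERR() first, then negativity, then use it. */
static int make_dir(struct dentry *dentry, enum outcome how,
		    struct dentry **result)
{
	dentry = model_vfs_mkdir(dentry, how);
	if (IS_ERR(dentry))
		return (int)PTR_ERR(dentry);	/* nothing left to dput() */
	if (!dentry->positive) {
		dput(dentry);
		return -ENOENT;
	}
	*result = dentry;	/* caller now owns this reference */
	return 0;
}
```

The point of the sketch is only the ordering of the checks: once the return value is an ERR_PTR() there is no dentry left to release, which is why the quoted hunks guard dput() with !IS_ERR().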
* [PATCH 0/6 v2] Change ->mkdir() and vfs_mkdir() to return a dentry @ 2025-02-27 1:32 NeilBrown 2025-02-27 1:32 ` [PATCH 1/6] Change inode_operations.mkdir to return struct dentry * NeilBrown 0 siblings, 1 reply; 36+ messages in thread From: NeilBrown @ 2025-02-27 1:32 UTC (permalink / raw) To: Alexander Viro, Christian Brauner, Jan Kara Cc: Chuck Lever, Jeff Layton, Trond Myklebust, Anna Schumaker, linux-nfs, Ilya Dryomov, Xiubo Li, ceph-devel, Miklos Szeredi, linux-fsdevel, Richard Weinberger, Anton Ivanov, Johannes Berg, linux-um, linux-kernel This revised series contains a few clean-ups as requested by various people but no substantial changes. It is based on vfs/vfs-6.15.async.dir plus vfs/vfs-6.15.sysv: I dropped the change to sysv as it seemed pointless to preserve it. I reviewed the mkdir functions in many (all?) filesystems and found a few that use d_instantiate() on an unlocked inode (after unlock_new_inode()) and also support export_operations. These could potentially call d_instantiate() on a directory inode which is already attached to a dentry, though making that happen would usually require guessing the filehandle correctly. I haven't tried to address those here (this patch set doesn't make that situation any worse), but I may in the future. Thanks, NeilBrown [PATCH 1/6] Change inode_operations.mkdir to return struct dentry * [PATCH 2/6] hostfs: store inode in dentry after mkdir if possible. [PATCH 3/6] ceph: return the correct dentry on mkdir [PATCH 4/6] fuse: return correct dentry for ->mkdir [PATCH 5/6] nfs: change mkdir inode_operation to return alternate [PATCH 6/6] VFS: Change vfs_mkdir() to return the dentry. ^ permalink raw reply [flat|nested] 36+ messages in thread
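As a reading aid for the series summarised above, here is a small userspace sketch of the ->mkdir() return convention that the patch 1/6 description spells out (NULL, an ERR_PTR(), or a different dentry). It is only a model: struct dentry, toy_mkdir() and pick_dentry() are hypothetical names, and the helpers are simplified copies of the kernel's error-pointer macros. It is meant to show why the mechanical "return ERR_PTR(err)" conversion used for most filesystems works even on the success path: ERR_PTR(0) is simply NULL. That pick_dentry() mirrors what vfs_mkdir() does with the result is my assumption, not something the cover letter states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Toy dentry; the name is only there to tell instances apart. */
struct dentry {
	const char *name;
};

/* Simplified userspace copies of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long err) { return (void *)err; }
static inline long PTR_ERR(const void *p) { return (long)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical ->mkdir() of a filesystem that never splices in another
 * dentry: it just converts its old int result. With err == 0 this
 * yields NULL, i.e. "created, and the passed-in dentry was used".
 */
static struct dentry *toy_mkdir(int simulated_err)
{
	assert(simulated_err <= 0 && simulated_err >= -MAX_ERRNO);
	return ERR_PTR(simulated_err);
}

/*
 * Hypothetical VFS-side interpretation of the three possible returns:
 * errors pass through, NULL means the original dentry, anything else
 * is the dentry the filesystem spliced in.
 */
static struct dentry *pick_dentry(struct dentry *passed, struct dentry *ret)
{
	if (IS_ERR(ret))
		return ret;
	return ret ? ret : passed;
}
```

A filesystem that can find the inode already attached elsewhere would return that other dentry instead of NULL; the sketch passes such a dentry to pick_dentry() directly rather than modelling d_splice_alias().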
* [PATCH 1/6] Change inode_operations.mkdir to return struct dentry * 2025-02-27 1:32 [PATCH 0/6 v2] Change ->mkdir() and vfs_mkdir() to return a dentry NeilBrown @ 2025-02-27 1:32 ` NeilBrown 2025-02-27 11:34 ` Christian Brauner 0 siblings, 1 reply; 36+ messages in thread From: NeilBrown @ 2025-02-27 1:32 UTC (permalink / raw) To: Alexander Viro, Christian Brauner, Jan Kara Cc: Chuck Lever, Jeff Layton, Trond Myklebust, Anna Schumaker, linux-nfs, Ilya Dryomov, Xiubo Li, ceph-devel, Miklos Szeredi, linux-fsdevel, Richard Weinberger, Anton Ivanov, Johannes Berg, linux-um, linux-kernel Some filesystems, such as NFS, cifs, ceph, and fuse, do not have complete control of sequencing on the actual filesystem (e.g. on a different server) and may find that the inode created for a mkdir request already exists in the icache and dcache by the time the mkdir request returns. For example, if the filesystem is mounted twice the directory could be visible on the other mount before it is on the original mount, and a pair of name_to_handle_at(), open_by_handle_at() calls could instantiate the directory inode with an IS_ROOT() dentry before the first mkdir returns. This means that the dentry passed to ->mkdir() may not be the one that is associated with the inode after the ->mkdir() completes. Some callers need to interact with the inode after the ->mkdir completes and they currently need to perform a lookup in the (rare) case that the dentry is no longer hashed. This lookup-after-mkdir requires that the directory remains locked to avoid races. Planned future patches to lock the dentry rather than the directory will mean that this lookup cannot be performed atomically with the mkdir. To remove this barrier, this patch changes ->mkdir to return the resulting dentry if it is different from the one passed in. 
Possible returns are: NULL - the directory was created and no other dentry was used ERR_PTR() - an error occurred non-NULL - this other dentry was spliced in This patch only changes file-systems to return "ERR_PTR(err)" instead of "err" or equivalent transformations. Subsequent patches will make further changes to some file-systems to return a correct dentry. Not all filesystems reliably result in a positive hashed dentry: - NFS, cifs, hostfs will sometimes need to perform a lookup of the name to get inode information. Races could result in this returning something different. Note that this lookup is non-atomic which is what we are trying to avoid. Placing the lookup in filesystem code means it only happens when the filesystem has no other option. - kernfs and tracefs leave the dentry negative and the ->revalidate operation ensures that lookup will be called to correctly populate the dentry. This could be fixed but I don't think it is important to any of the users of vfs_mkdir() which look at the dentry. The recommendation to use d_drop();d_splice_alias() is ugly but fits with current practice. A planned future patch will change this. 
Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> (VFS, ext2, ext4, ocfs2, udf) Signed-off-by: NeilBrown <neilb@suse.de> --- Documentation/filesystems/locking.rst | 2 +- Documentation/filesystems/porting.rst | 19 +++++++++++++++++++ Documentation/filesystems/vfs.rst | 23 +++++++++++++++++++++-- fs/9p/vfs_inode.c | 7 +++---- fs/9p/vfs_inode_dotl.c | 8 ++++---- fs/affs/affs.h | 2 +- fs/affs/namei.c | 8 ++++---- fs/afs/dir.c | 12 ++++++------ fs/autofs/root.c | 14 +++++++------- fs/bad_inode.c | 6 +++--- fs/bcachefs/fs.c | 6 +++--- fs/btrfs/inode.c | 8 ++++---- fs/ceph/dir.c | 8 ++++---- fs/coda/dir.c | 14 +++++++------- fs/configfs/dir.c | 6 +++--- fs/ecryptfs/inode.c | 6 +++--- fs/exfat/namei.c | 8 ++++---- fs/ext2/namei.c | 9 +++++---- fs/ext4/namei.c | 10 +++++----- fs/f2fs/namei.c | 14 +++++++------- fs/fat/namei_msdos.c | 8 ++++---- fs/fat/namei_vfat.c | 8 ++++---- fs/fuse/dir.c | 6 +++--- fs/gfs2/inode.c | 9 +++++---- fs/hfs/dir.c | 10 +++++----- fs/hfsplus/dir.c | 6 +++--- fs/hostfs/hostfs_kern.c | 8 ++++---- fs/hpfs/namei.c | 10 +++++----- fs/hugetlbfs/inode.c | 6 +++--- fs/jffs2/dir.c | 18 +++++++++--------- fs/jfs/namei.c | 8 ++++---- fs/kernfs/dir.c | 12 ++++++------ fs/minix/namei.c | 8 ++++---- fs/namei.c | 15 ++++++++++++--- fs/nfs/dir.c | 8 ++++---- fs/nfs/internal.h | 4 ++-- fs/nilfs2/namei.c | 8 ++++---- fs/ntfs3/namei.c | 8 ++++---- fs/ocfs2/dlmfs/dlmfs.c | 10 +++++----- fs/ocfs2/namei.c | 10 +++++----- fs/omfs/dir.c | 6 +++--- fs/orangefs/namei.c | 8 ++++---- fs/overlayfs/dir.c | 9 +++++---- fs/ramfs/inode.c | 6 +++--- fs/smb/client/cifsfs.h | 4 ++-- fs/smb/client/inode.c | 10 +++++----- fs/tracefs/inode.c | 10 +++++----- fs/ubifs/dir.c | 10 +++++----- fs/udf/namei.c | 12 ++++++------ fs/ufs/namei.c | 8 ++++---- fs/vboxsf/dir.c | 8 ++++---- fs/xfs/xfs_iops.c | 4 ++-- include/linux/fs.h | 4 ++-- kernel/bpf/inode.c | 8 ++++---- mm/shmem.c | 8 ++++---- security/apparmor/apparmorfs.c | 8 ++++---- 56 files changed, 271 
insertions(+), 222 deletions(-) diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst index d20a32b77b60..0ec0bb6eb0fb 100644 --- a/Documentation/filesystems/locking.rst +++ b/Documentation/filesystems/locking.rst @@ -66,7 +66,7 @@ prototypes:: int (*link) (struct dentry *,struct inode *,struct dentry *); int (*unlink) (struct inode *,struct dentry *); int (*symlink) (struct mnt_idmap *, struct inode *,struct dentry *,const char *); - int (*mkdir) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t); + struct dentry *(*mkdir) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t); int (*rmdir) (struct inode *,struct dentry *); int (*mknod) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t,dev_t); int (*rename) (struct mnt_idmap *, struct inode *, struct dentry *, diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst index 3ed3f39ecf71..fe0581271d5b 100644 --- a/Documentation/filesystems/porting.rst +++ b/Documentation/filesystems/porting.rst @@ -1178,3 +1178,22 @@ these conditions don't require explicit checks: LOOKUP_EXCL now means "target must not exist". It can be combined with LOOK_CREATE or LOOKUP_RENAME_TARGET. + +--- + +** mandatory** + +->mkdir() now returns a 'struct dentry *'. If the created inode is +found to already be in cache and have a dentry (often IS_ROOT()), it will +need to be spliced into the given name in place of the given dentry. +That dentry now needs to be returned. If the original dentry is used, +NULL should be returned. Any error should be returned with +ERR_PTR(). + +In general, filesystems which use d_instantiate_new() to install the new +inode can safely return NULL. Filesystems which may not have an I_NEW inode +should use d_drop();d_splice_alias() and return the result of the latter. 
+ +If a positive dentry cannot be returned for some reason, in-kernel +clients such as cachefiles, nfsd, smb/server may not perform ideally but +will fail-safe. diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst index 31eea688609a..ae79c30b6c0c 100644 --- a/Documentation/filesystems/vfs.rst +++ b/Documentation/filesystems/vfs.rst @@ -495,7 +495,7 @@ As of kernel 2.6.22, the following members are defined: int (*link) (struct dentry *,struct inode *,struct dentry *); int (*unlink) (struct inode *,struct dentry *); int (*symlink) (struct mnt_idmap *, struct inode *,struct dentry *,const char *); - int (*mkdir) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t); + struct dentry *(*mkdir) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t); int (*rmdir) (struct inode *,struct dentry *); int (*mknod) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t,dev_t); int (*rename) (struct mnt_idmap *, struct inode *, struct dentry *, @@ -562,7 +562,26 @@ otherwise noted. ``mkdir`` called by the mkdir(2) system call. Only required if you want to support creating subdirectories. You will probably need to - call d_instantiate() just as you would in the create() method + call d_instantiate_new() just as you would in the create() method. + + If d_instantiate_new() is not used and if the fh_to_dentry() + export operation is provided, or if the storage might be + accessible by another path (e.g. with a network filesystem) + then more care may be needed. Importantly d_instantate() + should not be used with an inode that is no longer I_NEW if there + any chance that the inode could already be attached to a dentry. + This is because of a hard rule in the VFS that a directory must + only ever have one dentry. 
+ + For example, if an NFS filesystem is mounted twice the new directory + could be visible on the other mount before it is on the original + mount, and a pair of name_to_handle_at(), open_by_handle_at() + calls could instantiate the directory inode with an IS_ROOT() + dentry before the first mkdir returns. + + If there is any chance this could happen, then the new inode + should be d_drop()ed and attached with d_splice_alias(). The + returned dentry (if any) should be returned by ->mkdir(). ``rmdir`` called by the rmdir(2) system call. Only required if you want diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c index 3e68521f4e2f..399d455d50d6 100644 --- a/fs/9p/vfs_inode.c +++ b/fs/9p/vfs_inode.c @@ -669,8 +669,8 @@ v9fs_vfs_create(struct mnt_idmap *idmap, struct inode *dir, * */ -static int v9fs_vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *v9fs_vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { int err; u32 perm; @@ -692,8 +692,7 @@ static int v9fs_vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, if (fid) p9_fid_put(fid); - - return err; + return ERR_PTR(err); } /** diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c index 143ac03b7425..cc2007be2173 100644 --- a/fs/9p/vfs_inode_dotl.c +++ b/fs/9p/vfs_inode_dotl.c @@ -350,9 +350,9 @@ v9fs_vfs_atomic_open_dotl(struct inode *dir, struct dentry *dentry, * */ -static int v9fs_vfs_mkdir_dotl(struct mnt_idmap *idmap, - struct inode *dir, struct dentry *dentry, - umode_t omode) +static struct dentry *v9fs_vfs_mkdir_dotl(struct mnt_idmap *idmap, + struct inode *dir, struct dentry *dentry, + umode_t omode) { int err; struct v9fs_session_info *v9ses; @@ -417,7 +417,7 @@ static int v9fs_vfs_mkdir_dotl(struct mnt_idmap *idmap, p9_fid_put(fid); v9fs_put_acl(dacl, pacl); p9_fid_put(dfid); - return err; + return ERR_PTR(err); } static int diff --git a/fs/affs/affs.h b/fs/affs/affs.h index 
e8c2c4535cb3..ac4e9a02910b 100644 --- a/fs/affs/affs.h +++ b/fs/affs/affs.h @@ -168,7 +168,7 @@ extern struct dentry *affs_lookup(struct inode *dir, struct dentry *dentry, unsi extern int affs_unlink(struct inode *dir, struct dentry *dentry); extern int affs_create(struct mnt_idmap *idmap, struct inode *dir, struct dentry *dentry, umode_t mode, bool); -extern int affs_mkdir(struct mnt_idmap *idmap, struct inode *dir, +extern struct dentry *affs_mkdir(struct mnt_idmap *idmap, struct inode *dir, struct dentry *dentry, umode_t mode); extern int affs_rmdir(struct inode *dir, struct dentry *dentry); extern int affs_link(struct dentry *olddentry, struct inode *dir, diff --git a/fs/affs/namei.c b/fs/affs/namei.c index 8c154490a2d6..f883be50db12 100644 --- a/fs/affs/namei.c +++ b/fs/affs/namei.c @@ -273,7 +273,7 @@ affs_create(struct mnt_idmap *idmap, struct inode *dir, return 0; } -int +struct dentry * affs_mkdir(struct mnt_idmap *idmap, struct inode *dir, struct dentry *dentry, umode_t mode) { @@ -285,7 +285,7 @@ affs_mkdir(struct mnt_idmap *idmap, struct inode *dir, inode = affs_new_inode(dir); if (!inode) - return -ENOSPC; + return ERR_PTR(-ENOSPC); inode->i_mode = S_IFDIR | mode; affs_mode_to_prot(inode); @@ -298,9 +298,9 @@ affs_mkdir(struct mnt_idmap *idmap, struct inode *dir, clear_nlink(inode); mark_inode_dirty(inode); iput(inode); - return error; + return ERR_PTR(error); } - return 0; + return NULL; } int diff --git a/fs/afs/dir.c b/fs/afs/dir.c index 02cbf38e1a77..5bddcc20786e 100644 --- a/fs/afs/dir.c +++ b/fs/afs/dir.c @@ -33,8 +33,8 @@ static bool afs_lookup_filldir(struct dir_context *ctx, const char *name, int nl loff_t fpos, u64 ino, unsigned dtype); static int afs_create(struct mnt_idmap *idmap, struct inode *dir, struct dentry *dentry, umode_t mode, bool excl); -static int afs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode); +static struct dentry *afs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct 
dentry *dentry, umode_t mode); static int afs_rmdir(struct inode *dir, struct dentry *dentry); static int afs_unlink(struct inode *dir, struct dentry *dentry); static int afs_link(struct dentry *from, struct inode *dir, @@ -1315,8 +1315,8 @@ static const struct afs_operation_ops afs_mkdir_operation = { /* * create a directory on an AFS filesystem */ -static int afs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *afs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct afs_operation *op; struct afs_vnode *dvnode = AFS_FS_I(dir); @@ -1328,7 +1328,7 @@ static int afs_mkdir(struct mnt_idmap *idmap, struct inode *dir, op = afs_alloc_operation(NULL, dvnode->volume); if (IS_ERR(op)) { d_drop(dentry); - return PTR_ERR(op); + return ERR_CAST(op); } fscache_use_cookie(afs_vnode_cache(dvnode), true); @@ -1344,7 +1344,7 @@ static int afs_mkdir(struct mnt_idmap *idmap, struct inode *dir, op->ops = &afs_mkdir_operation; ret = afs_do_sync_operation(op); afs_dir_unuse_cookie(dvnode, ret); - return ret; + return ERR_PTR(ret); } /* diff --git a/fs/autofs/root.c b/fs/autofs/root.c index 530d18827e35..174c7205fee4 100644 --- a/fs/autofs/root.c +++ b/fs/autofs/root.c @@ -15,8 +15,8 @@ static int autofs_dir_symlink(struct mnt_idmap *, struct inode *, struct dentry *, const char *); static int autofs_dir_unlink(struct inode *, struct dentry *); static int autofs_dir_rmdir(struct inode *, struct dentry *); -static int autofs_dir_mkdir(struct mnt_idmap *, struct inode *, - struct dentry *, umode_t); +static struct dentry *autofs_dir_mkdir(struct mnt_idmap *, struct inode *, + struct dentry *, umode_t); static long autofs_root_ioctl(struct file *, unsigned int, unsigned long); #ifdef CONFIG_COMPAT static long autofs_root_compat_ioctl(struct file *, @@ -720,9 +720,9 @@ static int autofs_dir_rmdir(struct inode *dir, struct dentry *dentry) return 0; } -static int autofs_dir_mkdir(struct 
mnt_idmap *idmap, - struct inode *dir, struct dentry *dentry, - umode_t mode) +static struct dentry *autofs_dir_mkdir(struct mnt_idmap *idmap, + struct inode *dir, struct dentry *dentry, + umode_t mode) { struct autofs_sb_info *sbi = autofs_sbi(dir->i_sb); struct autofs_info *ino = autofs_dentry_ino(dentry); @@ -739,7 +739,7 @@ static int autofs_dir_mkdir(struct mnt_idmap *idmap, inode = autofs_get_inode(dir->i_sb, S_IFDIR | mode); if (!inode) - return -ENOMEM; + return ERR_PTR(-ENOMEM); d_add(dentry, inode); if (sbi->version < 5) @@ -751,7 +751,7 @@ static int autofs_dir_mkdir(struct mnt_idmap *idmap, inc_nlink(dir); inode_set_mtime_to_ts(dir, inode_set_ctime_current(dir)); - return 0; + return NULL; } /* Get/set timeout ioctl() operation */ diff --git a/fs/bad_inode.c b/fs/bad_inode.c index 316d88da2ce1..0ef9bcb744dd 100644 --- a/fs/bad_inode.c +++ b/fs/bad_inode.c @@ -58,10 +58,10 @@ static int bad_inode_symlink(struct mnt_idmap *idmap, return -EIO; } -static int bad_inode_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *bad_inode_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { - return -EIO; + return ERR_PTR(-EIO); } static int bad_inode_rmdir (struct inode *dir, struct dentry *dentry) diff --git a/fs/bcachefs/fs.c b/fs/bcachefs/fs.c index 90ade8f648d9..1c94a680fcce 100644 --- a/fs/bcachefs/fs.c +++ b/fs/bcachefs/fs.c @@ -858,10 +858,10 @@ static int bch2_symlink(struct mnt_idmap *idmap, return bch2_err_class(ret); } -static int bch2_mkdir(struct mnt_idmap *idmap, - struct inode *vdir, struct dentry *dentry, umode_t mode) +static struct dentry *bch2_mkdir(struct mnt_idmap *idmap, + struct inode *vdir, struct dentry *dentry, umode_t mode) { - return bch2_mknod(idmap, vdir, dentry, mode|S_IFDIR, 0); + return ERR_PTR(bch2_mknod(idmap, vdir, dentry, mode|S_IFDIR, 0)); } static int bch2_rename2(struct mnt_idmap *idmap, diff --git a/fs/btrfs/inode.c 
b/fs/btrfs/inode.c index a9322601ab5c..851d3e8a06a7 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -6739,18 +6739,18 @@ static int btrfs_link(struct dentry *old_dentry, struct inode *dir, return err; } -static int btrfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *btrfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct inode *inode; inode = new_inode(dir->i_sb); if (!inode) - return -ENOMEM; + return ERR_PTR(-ENOMEM); inode_init_owner(idmap, inode, dir, S_IFDIR | mode); inode->i_op = &btrfs_dir_inode_operations; inode->i_fop = &btrfs_dir_file_operations; - return btrfs_create_common(dir, dentry, inode); + return ERR_PTR(btrfs_create_common(dir, dentry, inode)); } static noinline int uncompress_inline(struct btrfs_path *path, diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index 62e99e65250d..39e0f240de06 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -1092,8 +1092,8 @@ static int ceph_symlink(struct mnt_idmap *idmap, struct inode *dir, return err; } -static int ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dir->i_sb); struct ceph_client *cl = mdsc->fsc->client; @@ -1104,7 +1104,7 @@ static int ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir, err = ceph_wait_on_conflict_unlink(dentry); if (err) - return err; + return ERR_PTR(err); if (ceph_snap(dir) == CEPH_SNAPDIR) { /* mkdir .snap/foo is a MKSNAP */ @@ -1173,7 +1173,7 @@ static int ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir, else d_drop(dentry); ceph_release_acl_sec_ctx(&as_ctx); - return err; + return ERR_PTR(err); } static int ceph_link(struct dentry *old_dentry, struct inode *dir, diff --git a/fs/coda/dir.c b/fs/coda/dir.c index a3e2dfeedfbf..ab69d8f0cec2 100644 
--- a/fs/coda/dir.c +++ b/fs/coda/dir.c @@ -166,8 +166,8 @@ static int coda_create(struct mnt_idmap *idmap, struct inode *dir, return error; } -static int coda_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *de, umode_t mode) +static struct dentry *coda_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *de, umode_t mode) { struct inode *inode; struct coda_vattr attrs; @@ -177,14 +177,14 @@ static int coda_mkdir(struct mnt_idmap *idmap, struct inode *dir, struct CodaFid newfid; if (is_root_inode(dir) && coda_iscontrol(name, len)) - return -EPERM; + return ERR_PTR(-EPERM); attrs.va_mode = mode; - error = venus_mkdir(dir->i_sb, coda_i2f(dir), + error = venus_mkdir(dir->i_sb, coda_i2f(dir), name, len, &newfid, &attrs); if (error) goto err_out; - + inode = coda_iget(dir->i_sb, &newfid, &attrs); if (IS_ERR(inode)) { error = PTR_ERR(inode); @@ -195,10 +195,10 @@ static int coda_mkdir(struct mnt_idmap *idmap, struct inode *dir, coda_dir_inc_nlink(dir); coda_dir_update_mtime(dir); d_instantiate(de, inode); - return 0; + return NULL; err_out: d_drop(de); - return error; + return ERR_PTR(error); } /* try to make de an entry in dir_inodde linked to source_de */ diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c index 7d10278db30d..5568cb74b322 100644 --- a/fs/configfs/dir.c +++ b/fs/configfs/dir.c @@ -1280,8 +1280,8 @@ int configfs_depend_item_unlocked(struct configfs_subsystem *caller_subsys, } EXPORT_SYMBOL(configfs_depend_item_unlocked); -static int configfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *configfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { int ret = 0; int module_got = 0; @@ -1461,7 +1461,7 @@ static int configfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, put_fragment(frag); out: - return ret; + return ERR_PTR(ret); } static int configfs_rmdir(struct inode *dir, struct dentry *dentry) diff --git 
a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c index a9819ddb1ab8..6315dd194228 100644 --- a/fs/ecryptfs/inode.c +++ b/fs/ecryptfs/inode.c @@ -503,8 +503,8 @@ static int ecryptfs_symlink(struct mnt_idmap *idmap, return rc; } -static int ecryptfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *ecryptfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { int rc; struct dentry *lower_dentry; @@ -526,7 +526,7 @@ static int ecryptfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, inode_unlock(lower_dir); if (d_really_is_negative(dentry)) d_drop(dentry); - return rc; + return ERR_PTR(rc); } static int ecryptfs_rmdir(struct inode *dir, struct dentry *dentry) diff --git a/fs/exfat/namei.c b/fs/exfat/namei.c index 691dd77b6ab5..1660c9bbcfa9 100644 --- a/fs/exfat/namei.c +++ b/fs/exfat/namei.c @@ -835,8 +835,8 @@ static int exfat_unlink(struct inode *dir, struct dentry *dentry) return err; } -static int exfat_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *exfat_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct super_block *sb = dir->i_sb; struct inode *inode; @@ -846,7 +846,7 @@ static int exfat_mkdir(struct mnt_idmap *idmap, struct inode *dir, loff_t size = i_size_read(dir); if (unlikely(exfat_forced_shutdown(sb))) - return -EIO; + return ERR_PTR(-EIO); mutex_lock(&EXFAT_SB(sb)->s_lock); exfat_set_volume_dirty(sb); @@ -877,7 +877,7 @@ static int exfat_mkdir(struct mnt_idmap *idmap, struct inode *dir, unlock: mutex_unlock(&EXFAT_SB(sb)->s_lock); - return err; + return ERR_PTR(err); } static int exfat_check_dir_empty(struct super_block *sb, diff --git a/fs/ext2/namei.c b/fs/ext2/namei.c index 8346ab9534c1..bde617a66cec 100644 --- a/fs/ext2/namei.c +++ b/fs/ext2/namei.c @@ -225,15 +225,16 @@ static int ext2_link (struct dentry * old_dentry, struct inode * dir, 
return err; } -static int ext2_mkdir(struct mnt_idmap * idmap, - struct inode * dir, struct dentry * dentry, umode_t mode) +static struct dentry *ext2_mkdir(struct mnt_idmap * idmap, + struct inode * dir, struct dentry * dentry, + umode_t mode) { struct inode * inode; int err; err = dquot_initialize(dir); if (err) - return err; + return ERR_PTR(err); inode_inc_link_count(dir); @@ -258,7 +259,7 @@ static int ext2_mkdir(struct mnt_idmap * idmap, d_instantiate_new(dentry, inode); out: - return err; + return ERR_PTR(err); out_fail: inode_dec_link_count(inode); diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index 536d56d15072..716cc6096870 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -3004,19 +3004,19 @@ int ext4_init_new_dir(handle_t *handle, struct inode *dir, return err; } -static int ext4_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *ext4_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { handle_t *handle; struct inode *inode; int err, err2 = 0, credits, retries = 0; if (EXT4_DIR_LINK_MAX(dir)) - return -EMLINK; + return ERR_PTR(-EMLINK); err = dquot_initialize(dir); if (err) - return err; + return ERR_PTR(err); credits = (EXT4_DATA_TRANS_BLOCKS(dir->i_sb) + EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3); @@ -3066,7 +3066,7 @@ static int ext4_mkdir(struct mnt_idmap *idmap, struct inode *dir, out_retry: if (err == -ENOSPC && ext4_should_retry_alloc(dir->i_sb, &retries)) goto retry; - return err; + return ERR_PTR(err); } /* diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c index a278c7da8177..24dca4dc85a9 100644 --- a/fs/f2fs/namei.c +++ b/fs/f2fs/namei.c @@ -684,23 +684,23 @@ static int f2fs_symlink(struct mnt_idmap *idmap, struct inode *dir, return err; } -static int f2fs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *f2fs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t 
mode) { struct f2fs_sb_info *sbi = F2FS_I_SB(dir); struct inode *inode; int err; if (unlikely(f2fs_cp_error(sbi))) - return -EIO; + return ERR_PTR(-EIO); err = f2fs_dquot_initialize(dir); if (err) - return err; + return ERR_PTR(err); inode = f2fs_new_inode(idmap, dir, S_IFDIR | mode, NULL); if (IS_ERR(inode)) - return PTR_ERR(inode); + return ERR_CAST(inode); inode->i_op = &f2fs_dir_inode_operations; inode->i_fop = &f2fs_dir_operations; @@ -722,12 +722,12 @@ static int f2fs_mkdir(struct mnt_idmap *idmap, struct inode *dir, f2fs_sync_fs(sbi->sb, 1); f2fs_balance_fs(sbi, true); - return 0; + return NULL; out_fail: clear_inode_flag(inode, FI_INC_LINK); f2fs_handle_failed_inode(inode); - return err; + return ERR_PTR(err); } static int f2fs_rmdir(struct inode *dir, struct dentry *dentry) diff --git a/fs/fat/namei_msdos.c b/fs/fat/namei_msdos.c index f06f6ba643cc..23e9b9371ec3 100644 --- a/fs/fat/namei_msdos.c +++ b/fs/fat/namei_msdos.c @@ -339,8 +339,8 @@ static int msdos_rmdir(struct inode *dir, struct dentry *dentry) } /***** Make a directory */ -static int msdos_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *msdos_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct super_block *sb = dir->i_sb; struct fat_slot_info sinfo; @@ -389,13 +389,13 @@ static int msdos_mkdir(struct mnt_idmap *idmap, struct inode *dir, mutex_unlock(&MSDOS_SB(sb)->s_lock); fat_flush_inodes(sb, dir, inode); - return 0; + return NULL; out_free: fat_free_clusters(dir, cluster); out: mutex_unlock(&MSDOS_SB(sb)->s_lock); - return err; + return ERR_PTR(err); } /***** Unlink a file */ diff --git a/fs/fat/namei_vfat.c b/fs/fat/namei_vfat.c index 926c26e90ef8..dd910edd2404 100644 --- a/fs/fat/namei_vfat.c +++ b/fs/fat/namei_vfat.c @@ -841,8 +841,8 @@ static int vfat_unlink(struct inode *dir, struct dentry *dentry) return err; } -static int vfat_mkdir(struct mnt_idmap *idmap, struct inode *dir, - 
struct dentry *dentry, umode_t mode) +static struct dentry *vfat_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct super_block *sb = dir->i_sb; struct inode *inode; @@ -877,13 +877,13 @@ static int vfat_mkdir(struct mnt_idmap *idmap, struct inode *dir, d_instantiate(dentry, inode); mutex_unlock(&MSDOS_SB(sb)->s_lock); - return 0; + return NULL; out_free: fat_free_clusters(dir, cluster); out: mutex_unlock(&MSDOS_SB(sb)->s_lock); - return err; + return ERR_PTR(err); } static int vfat_get_dotdot_de(struct inode *inode, struct buffer_head **bh, diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c index 3805f9b06c9d..d0289ce068ba 100644 --- a/fs/fuse/dir.c +++ b/fs/fuse/dir.c @@ -898,8 +898,8 @@ static int fuse_tmpfile(struct mnt_idmap *idmap, struct inode *dir, return err; } -static int fuse_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *entry, umode_t mode) +static struct dentry *fuse_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *entry, umode_t mode) { struct fuse_mkdir_in inarg; struct fuse_mount *fm = get_fuse_mount(dir); @@ -917,7 +917,7 @@ static int fuse_mkdir(struct mnt_idmap *idmap, struct inode *dir, args.in_args[0].value = &inarg; args.in_args[1].size = entry->d_name.len + 1; args.in_args[1].value = entry->d_name.name; - return create_new_entry(idmap, fm, &args, dir, entry, S_IFDIR); + return ERR_PTR(create_new_entry(idmap, fm, &args, dir, entry, S_IFDIR)); } static int fuse_symlink(struct mnt_idmap *idmap, struct inode *dir, diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c index 6fbbaaad1cd0..198a8cbaf5e5 100644 --- a/fs/gfs2/inode.c +++ b/fs/gfs2/inode.c @@ -1248,14 +1248,15 @@ static int gfs2_symlink(struct mnt_idmap *idmap, struct inode *dir, * @dentry: The dentry of the new directory * @mode: The mode of the new directory * - * Returns: errno + * Returns: the dentry, or ERR_PTR(errno) */ -static int gfs2_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, 
umode_t mode) +static struct dentry *gfs2_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { unsigned dsize = gfs2_max_stuffed_size(GFS2_I(dir)); - return gfs2_create_inode(dir, dentry, NULL, S_IFDIR | mode, 0, NULL, dsize, 0); + + return ERR_PTR(gfs2_create_inode(dir, dentry, NULL, S_IFDIR | mode, 0, NULL, dsize, 0)); } /** diff --git a/fs/hfs/dir.c b/fs/hfs/dir.c index b75c26045df4..86a6b317b474 100644 --- a/fs/hfs/dir.c +++ b/fs/hfs/dir.c @@ -219,26 +219,26 @@ static int hfs_create(struct mnt_idmap *idmap, struct inode *dir, * in a directory, given the inode for the parent directory and the * name (and its length) of the new directory. */ -static int hfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *hfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct inode *inode; int res; inode = hfs_new_inode(dir, &dentry->d_name, S_IFDIR | mode); if (!inode) - return -ENOMEM; + return ERR_PTR(-ENOMEM); res = hfs_cat_create(inode->i_ino, dir, &dentry->d_name, inode); if (res) { clear_nlink(inode); hfs_delete_inode(inode); iput(inode); - return res; + return ERR_PTR(res); } d_instantiate(dentry, inode); mark_inode_dirty(inode); - return 0; + return NULL; } /* diff --git a/fs/hfsplus/dir.c b/fs/hfsplus/dir.c index f5c4b3e31a1c..876bbb80fb4d 100644 --- a/fs/hfsplus/dir.c +++ b/fs/hfsplus/dir.c @@ -523,10 +523,10 @@ static int hfsplus_create(struct mnt_idmap *idmap, struct inode *dir, return hfsplus_mknod(&nop_mnt_idmap, dir, dentry, mode, 0); } -static int hfsplus_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *hfsplus_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { - return hfsplus_mknod(&nop_mnt_idmap, dir, dentry, mode | S_IFDIR, 0); + return ERR_PTR(hfsplus_mknod(&nop_mnt_idmap, dir, dentry, mode | S_IFDIR, 0)); } 
static int hfsplus_rename(struct mnt_idmap *idmap, diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c index e0741e468956..ccbb48fe830d 100644 --- a/fs/hostfs/hostfs_kern.c +++ b/fs/hostfs/hostfs_kern.c @@ -679,17 +679,17 @@ static int hostfs_symlink(struct mnt_idmap *idmap, struct inode *ino, return err; } -static int hostfs_mkdir(struct mnt_idmap *idmap, struct inode *ino, - struct dentry *dentry, umode_t mode) +static struct dentry *hostfs_mkdir(struct mnt_idmap *idmap, struct inode *ino, + struct dentry *dentry, umode_t mode) { char *file; int err; if ((file = dentry_name(dentry)) == NULL) - return -ENOMEM; + return ERR_PTR(-ENOMEM); err = do_mkdir(file, mode); __putname(file); - return err; + return ERR_PTR(err); } static int hostfs_rmdir(struct inode *ino, struct dentry *dentry) diff --git a/fs/hpfs/namei.c b/fs/hpfs/namei.c index d0edf9ed33b6..e3cdc421dfba 100644 --- a/fs/hpfs/namei.c +++ b/fs/hpfs/namei.c @@ -19,8 +19,8 @@ static void hpfs_update_directory_times(struct inode *dir) hpfs_write_inode_nolock(dir); } -static int hpfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *hpfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { const unsigned char *name = dentry->d_name.name; unsigned len = dentry->d_name.len; @@ -35,7 +35,7 @@ static int hpfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, int r; struct hpfs_dirent dee; int err; - if ((err = hpfs_chk_name(name, &len))) return err==-ENOENT ? -EINVAL : err; + if ((err = hpfs_chk_name(name, &len))) return ERR_PTR(err==-ENOENT ? 
-EINVAL : err); hpfs_lock(dir->i_sb); err = -ENOSPC; fnode = hpfs_alloc_fnode(dir->i_sb, hpfs_i(dir)->i_dno, &fno, &bh); @@ -112,7 +112,7 @@ static int hpfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, hpfs_update_directory_times(dir); d_instantiate(dentry, result); hpfs_unlock(dir->i_sb); - return 0; + return NULL; bail3: iput(result); bail2: @@ -123,7 +123,7 @@ static int hpfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, hpfs_free_sectors(dir->i_sb, fno, 1); bail: hpfs_unlock(dir->i_sb); - return err; + return ERR_PTR(err); } static int hpfs_create(struct mnt_idmap *idmap, struct inode *dir, diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index 0fc179a59830..d98caedbb723 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -991,14 +991,14 @@ static int hugetlbfs_mknod(struct mnt_idmap *idmap, struct inode *dir, return 0; } -static int hugetlbfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *hugetlbfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { int retval = hugetlbfs_mknod(idmap, dir, dentry, mode | S_IFDIR, 0); if (!retval) inc_nlink(dir); - return retval; + return ERR_PTR(retval); } static int hugetlbfs_create(struct mnt_idmap *idmap, diff --git a/fs/jffs2/dir.c b/fs/jffs2/dir.c index 2b2938970da3..dd91f725ded6 100644 --- a/fs/jffs2/dir.c +++ b/fs/jffs2/dir.c @@ -32,8 +32,8 @@ static int jffs2_link (struct dentry *,struct inode *,struct dentry *); static int jffs2_unlink (struct inode *,struct dentry *); static int jffs2_symlink (struct mnt_idmap *, struct inode *, struct dentry *, const char *); -static int jffs2_mkdir (struct mnt_idmap *, struct inode *,struct dentry *, - umode_t); +static struct dentry *jffs2_mkdir (struct mnt_idmap *, struct inode *,struct dentry *, + umode_t); static int jffs2_rmdir (struct inode *,struct dentry *); static int jffs2_mknod (struct mnt_idmap *, struct inode *,struct dentry *, 
umode_t,dev_t); @@ -446,8 +446,8 @@ static int jffs2_symlink (struct mnt_idmap *idmap, struct inode *dir_i, } -static int jffs2_mkdir (struct mnt_idmap *idmap, struct inode *dir_i, - struct dentry *dentry, umode_t mode) +static struct dentry *jffs2_mkdir (struct mnt_idmap *idmap, struct inode *dir_i, + struct dentry *dentry, umode_t mode) { struct jffs2_inode_info *f, *dir_f; struct jffs2_sb_info *c; @@ -464,7 +464,7 @@ static int jffs2_mkdir (struct mnt_idmap *idmap, struct inode *dir_i, ri = jffs2_alloc_raw_inode(); if (!ri) - return -ENOMEM; + return ERR_PTR(-ENOMEM); c = JFFS2_SB_INFO(dir_i->i_sb); @@ -477,7 +477,7 @@ static int jffs2_mkdir (struct mnt_idmap *idmap, struct inode *dir_i, if (ret) { jffs2_free_raw_inode(ri); - return ret; + return ERR_PTR(ret); } inode = jffs2_new_inode(dir_i, mode, ri); @@ -485,7 +485,7 @@ static int jffs2_mkdir (struct mnt_idmap *idmap, struct inode *dir_i, if (IS_ERR(inode)) { jffs2_free_raw_inode(ri); jffs2_complete_reservation(c); - return PTR_ERR(inode); + return ERR_CAST(inode); } inode->i_op = &jffs2_dir_inode_operations; @@ -584,11 +584,11 @@ static int jffs2_mkdir (struct mnt_idmap *idmap, struct inode *dir_i, jffs2_complete_reservation(c); d_instantiate_new(dentry, inode); - return 0; + return NULL; fail: iget_failed(inode); - return ret; + return ERR_PTR(ret); } static int jffs2_rmdir (struct inode *dir_i, struct dentry *dentry) diff --git a/fs/jfs/namei.c b/fs/jfs/namei.c index fc8ede43afde..65a218eba8fa 100644 --- a/fs/jfs/namei.c +++ b/fs/jfs/namei.c @@ -187,13 +187,13 @@ static int jfs_create(struct mnt_idmap *idmap, struct inode *dip, * dentry - dentry of child directory * mode - create mode (rwxrwxrwx). * - * RETURN: Errors from subroutines + * RETURN: ERR_PTR() of errors from subroutines. 
* * note: * EACCES: user needs search+write permission on the parent directory */ -static int jfs_mkdir(struct mnt_idmap *idmap, struct inode *dip, - struct dentry *dentry, umode_t mode) +static struct dentry *jfs_mkdir(struct mnt_idmap *idmap, struct inode *dip, + struct dentry *dentry, umode_t mode) { int rc = 0; tid_t tid; /* transaction id */ @@ -308,7 +308,7 @@ static int jfs_mkdir(struct mnt_idmap *idmap, struct inode *dip, out1: jfs_info("jfs_mkdir: rc:%d", rc); - return rc; + return ERR_PTR(rc); } /* diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c index 5f0f8b95f44c..d296aad70800 100644 --- a/fs/kernfs/dir.c +++ b/fs/kernfs/dir.c @@ -1230,24 +1230,24 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir, return d_splice_alias(inode, dentry); } -static int kernfs_iop_mkdir(struct mnt_idmap *idmap, - struct inode *dir, struct dentry *dentry, - umode_t mode) +static struct dentry *kernfs_iop_mkdir(struct mnt_idmap *idmap, + struct inode *dir, struct dentry *dentry, + umode_t mode) { struct kernfs_node *parent = dir->i_private; struct kernfs_syscall_ops *scops = kernfs_root(parent)->syscall_ops; int ret; if (!scops || !scops->mkdir) - return -EPERM; + return ERR_PTR(-EPERM); if (!kernfs_get_active(parent)) - return -ENODEV; + return ERR_PTR(-ENODEV); ret = scops->mkdir(parent, dentry->d_name.name, mode); kernfs_put_active(parent); - return ret; + return ERR_PTR(ret); } static int kernfs_iop_rmdir(struct inode *dir, struct dentry *dentry) diff --git a/fs/minix/namei.c b/fs/minix/namei.c index 5d9c1406fe27..8938536d8d3c 100644 --- a/fs/minix/namei.c +++ b/fs/minix/namei.c @@ -104,15 +104,15 @@ static int minix_link(struct dentry * old_dentry, struct inode * dir, return add_nondir(dentry, inode); } -static int minix_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *minix_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct inode * inode; int err; inode 
= minix_new_inode(dir, S_IFDIR | mode); if (IS_ERR(inode)) - return PTR_ERR(inode); + return ERR_CAST(inode); inode_inc_link_count(dir); minix_set_inode(inode, 0); @@ -128,7 +128,7 @@ static int minix_mkdir(struct mnt_idmap *idmap, struct inode *dir, d_instantiate(dentry, inode); out: - return err; + return ERR_PTR(err); out_fail: inode_dec_link_count(inode); diff --git a/fs/namei.c b/fs/namei.c index 9243d0fb0370..e26574651a28 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -4290,6 +4290,7 @@ int vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, { int error; unsigned max_links = dir->i_sb->s_max_links; + struct dentry *de; error = may_create(idmap, dir, dentry); if (error) @@ -4306,10 +4307,18 @@ int vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, if (max_links && dir->i_nlink >= max_links) return -EMLINK; - error = dir->i_op->mkdir(idmap, dir, dentry, mode); - if (!error) + de = dir->i_op->mkdir(idmap, dir, dentry, mode); + if (IS_ERR(de)) + return PTR_ERR(de); + if (de) { + fsnotify_mkdir(dir, de); + /* Cannot return de yet */ + dput(de); + } else { fsnotify_mkdir(dir, dentry); - return error; + } + + return 0; } EXPORT_SYMBOL(vfs_mkdir); diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 56cf16a72334..101b1098e87b 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -2422,8 +2422,8 @@ EXPORT_SYMBOL_GPL(nfs_mknod); /* * See comments for nfs_proc_create regarding failed operations. 
*/ -int nfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +struct dentry *nfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct iattr attr; int error; @@ -2439,10 +2439,10 @@ int nfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, trace_nfs_mkdir_exit(dir, dentry, error); if (error != 0) goto out_err; - return 0; + return NULL; out_err: d_drop(dentry); - return error; + return ERR_PTR(error); } EXPORT_SYMBOL_GPL(nfs_mkdir); diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h index fae2c7ae4acc..1ac1d3eec517 100644 --- a/fs/nfs/internal.h +++ b/fs/nfs/internal.h @@ -400,8 +400,8 @@ struct dentry *nfs_lookup(struct inode *, struct dentry *, unsigned int); void nfs_d_prune_case_insensitive_aliases(struct inode *inode); int nfs_create(struct mnt_idmap *, struct inode *, struct dentry *, umode_t, bool); -int nfs_mkdir(struct mnt_idmap *, struct inode *, struct dentry *, - umode_t); +struct dentry *nfs_mkdir(struct mnt_idmap *, struct inode *, struct dentry *, + umode_t); int nfs_rmdir(struct inode *, struct dentry *); int nfs_unlink(struct inode *, struct dentry *); int nfs_symlink(struct mnt_idmap *, struct inode *, struct dentry *, diff --git a/fs/nilfs2/namei.c b/fs/nilfs2/namei.c index 953fbd5f0851..40f4b1a28705 100644 --- a/fs/nilfs2/namei.c +++ b/fs/nilfs2/namei.c @@ -218,8 +218,8 @@ static int nilfs_link(struct dentry *old_dentry, struct inode *dir, return err; } -static int nilfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *nilfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct inode *inode; struct nilfs_transaction_info ti; @@ -227,7 +227,7 @@ static int nilfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, err = nilfs_transaction_begin(dir->i_sb, &ti, 1); if (err) - return err; + return ERR_PTR(err); inc_nlink(dir); @@ -258,7 +258,7 @@ static int 
nilfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, else nilfs_transaction_abort(dir->i_sb); - return err; + return ERR_PTR(err); out_fail: drop_nlink(inode); diff --git a/fs/ntfs3/namei.c b/fs/ntfs3/namei.c index abf7e81584a9..652735a0b0c4 100644 --- a/fs/ntfs3/namei.c +++ b/fs/ntfs3/namei.c @@ -201,11 +201,11 @@ static int ntfs_symlink(struct mnt_idmap *idmap, struct inode *dir, /* * ntfs_mkdir- inode_operations::mkdir */ -static int ntfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *ntfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { - return ntfs_create_inode(idmap, dir, dentry, NULL, S_IFDIR | mode, 0, - NULL, 0, NULL); + return ERR_PTR(ntfs_create_inode(idmap, dir, dentry, NULL, S_IFDIR | mode, 0, + NULL, 0, NULL)); } /* diff --git a/fs/ocfs2/dlmfs/dlmfs.c b/fs/ocfs2/dlmfs/dlmfs.c index 2a7f36643895..5130ec44e5e1 100644 --- a/fs/ocfs2/dlmfs/dlmfs.c +++ b/fs/ocfs2/dlmfs/dlmfs.c @@ -402,10 +402,10 @@ static struct inode *dlmfs_get_inode(struct inode *parent, * File creation. Allocate an inode, and we're done.. 
*/ /* SMP-safe */ -static int dlmfs_mkdir(struct mnt_idmap * idmap, - struct inode * dir, - struct dentry * dentry, - umode_t mode) +static struct dentry *dlmfs_mkdir(struct mnt_idmap * idmap, + struct inode * dir, + struct dentry * dentry, + umode_t mode) { int status; struct inode *inode = NULL; @@ -448,7 +448,7 @@ static int dlmfs_mkdir(struct mnt_idmap * idmap, bail: if (status < 0) iput(inode); - return status; + return ERR_PTR(status); } static int dlmfs_create(struct mnt_idmap *idmap, diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c index 0ec63a1a94b8..99278c8f0e24 100644 --- a/fs/ocfs2/namei.c +++ b/fs/ocfs2/namei.c @@ -644,10 +644,10 @@ static int ocfs2_mknod_locked(struct ocfs2_super *osb, suballoc_loc, suballoc_bit); } -static int ocfs2_mkdir(struct mnt_idmap *idmap, - struct inode *dir, - struct dentry *dentry, - umode_t mode) +static struct dentry *ocfs2_mkdir(struct mnt_idmap *idmap, + struct inode *dir, + struct dentry *dentry, + umode_t mode) { int ret; @@ -657,7 +657,7 @@ static int ocfs2_mkdir(struct mnt_idmap *idmap, if (ret) mlog_errno(ret); - return ret; + return ERR_PTR(ret); } static int ocfs2_create(struct mnt_idmap *idmap, diff --git a/fs/omfs/dir.c b/fs/omfs/dir.c index 6bda275826d6..2ed541fccf33 100644 --- a/fs/omfs/dir.c +++ b/fs/omfs/dir.c @@ -279,10 +279,10 @@ static int omfs_add_node(struct inode *dir, struct dentry *dentry, umode_t mode) return err; } -static int omfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *omfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { - return omfs_add_node(dir, dentry, mode | S_IFDIR); + return ERR_PTR(omfs_add_node(dir, dentry, mode | S_IFDIR)); } static int omfs_create(struct mnt_idmap *idmap, struct inode *dir, diff --git a/fs/orangefs/namei.c b/fs/orangefs/namei.c index 200558ec72f0..82395fe2b956 100644 --- a/fs/orangefs/namei.c +++ b/fs/orangefs/namei.c @@ -300,8 +300,8 @@ static int 
orangefs_symlink(struct mnt_idmap *idmap, return ret; } -static int orangefs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *orangefs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct orangefs_inode_s *parent = ORANGEFS_I(dir); struct orangefs_kernel_op_s *new_op; @@ -312,7 +312,7 @@ static int orangefs_mkdir(struct mnt_idmap *idmap, struct inode *dir, new_op = op_alloc(ORANGEFS_VFS_OP_MKDIR); if (!new_op) - return -ENOMEM; + return ERR_PTR(-ENOMEM); new_op->upcall.req.mkdir.parent_refn = parent->refn; @@ -366,7 +366,7 @@ static int orangefs_mkdir(struct mnt_idmap *idmap, struct inode *dir, __orangefs_setattr(dir, &iattr); out: op_release(new_op); - return ret; + return ERR_PTR(ret); } static int orangefs_rename(struct mnt_idmap *idmap, diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c index c9993ff66fc2..21c3aaf7b274 100644 --- a/fs/overlayfs/dir.c +++ b/fs/overlayfs/dir.c @@ -282,7 +282,8 @@ static int ovl_instantiate(struct dentry *dentry, struct inode *inode, * XXX: if we ever use ovl_obtain_alias() to decode directory * file handles, need to use ovl_get_inode_locked() and * d_instantiate_new() here to prevent from creating two - * hashed directory inode aliases. + * hashed directory inode aliases. We then need to return + * the obtained alias to ovl_mkdir(). 
*/ inode = ovl_get_inode(dentry->d_sb, &oip); if (IS_ERR(inode)) @@ -687,10 +688,10 @@ static int ovl_create(struct mnt_idmap *idmap, struct inode *dir, return ovl_create_object(dentry, (mode & 07777) | S_IFREG, 0, NULL); } -static int ovl_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *ovl_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { - return ovl_create_object(dentry, (mode & 07777) | S_IFDIR, 0, NULL); + return ERR_PTR(ovl_create_object(dentry, (mode & 07777) | S_IFDIR, 0, NULL)); } static int ovl_mknod(struct mnt_idmap *idmap, struct inode *dir, diff --git a/fs/ramfs/inode.c b/fs/ramfs/inode.c index 8006faaaf0ec..775fa905fda0 100644 --- a/fs/ramfs/inode.c +++ b/fs/ramfs/inode.c @@ -119,13 +119,13 @@ ramfs_mknod(struct mnt_idmap *idmap, struct inode *dir, return error; } -static int ramfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *ramfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { int retval = ramfs_mknod(&nop_mnt_idmap, dir, dentry, mode | S_IFDIR, 0); if (!retval) inc_nlink(dir); - return retval; + return ERR_PTR(retval); } static int ramfs_create(struct mnt_idmap *idmap, struct inode *dir, diff --git a/fs/smb/client/cifsfs.h b/fs/smb/client/cifsfs.h index 831fee962c4d..8dea0cf3a8de 100644 --- a/fs/smb/client/cifsfs.h +++ b/fs/smb/client/cifsfs.h @@ -59,8 +59,8 @@ extern int cifs_unlink(struct inode *dir, struct dentry *dentry); extern int cifs_hardlink(struct dentry *, struct inode *, struct dentry *); extern int cifs_mknod(struct mnt_idmap *, struct inode *, struct dentry *, umode_t, dev_t); -extern int cifs_mkdir(struct mnt_idmap *, struct inode *, struct dentry *, - umode_t); +extern struct dentry *cifs_mkdir(struct mnt_idmap *, struct inode *, struct dentry *, + umode_t); extern int cifs_rmdir(struct inode *, struct dentry *); extern 
int cifs_rename2(struct mnt_idmap *, struct inode *, struct dentry *, struct inode *, struct dentry *, diff --git a/fs/smb/client/inode.c b/fs/smb/client/inode.c index 616149c7f0a5..3bb21aa58474 100644 --- a/fs/smb/client/inode.c +++ b/fs/smb/client/inode.c @@ -2207,8 +2207,8 @@ cifs_posix_mkdir(struct inode *inode, struct dentry *dentry, umode_t mode, } #endif /* CONFIG_CIFS_ALLOW_INSECURE_LEGACY */ -int cifs_mkdir(struct mnt_idmap *idmap, struct inode *inode, - struct dentry *direntry, umode_t mode) +struct dentry *cifs_mkdir(struct mnt_idmap *idmap, struct inode *inode, + struct dentry *direntry, umode_t mode) { int rc = 0; unsigned int xid; @@ -2224,10 +2224,10 @@ int cifs_mkdir(struct mnt_idmap *idmap, struct inode *inode, cifs_sb = CIFS_SB(inode->i_sb); if (unlikely(cifs_forced_shutdown(cifs_sb))) - return -EIO; + return ERR_PTR(-EIO); tlink = cifs_sb_tlink(cifs_sb); if (IS_ERR(tlink)) - return PTR_ERR(tlink); + return ERR_CAST(tlink); tcon = tlink_tcon(tlink); xid = get_xid(); @@ -2283,7 +2283,7 @@ int cifs_mkdir(struct mnt_idmap *idmap, struct inode *inode, free_dentry_path(page); free_xid(xid); cifs_put_tlink(tlink); - return rc; + return ERR_PTR(rc); } int cifs_rmdir(struct inode *inode, struct dentry *direntry) diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c index 53214499e384..cb1af30b49f5 100644 --- a/fs/tracefs/inode.c +++ b/fs/tracefs/inode.c @@ -109,9 +109,9 @@ static char *get_dname(struct dentry *dentry) return name; } -static int tracefs_syscall_mkdir(struct mnt_idmap *idmap, - struct inode *inode, struct dentry *dentry, - umode_t mode) +static struct dentry *tracefs_syscall_mkdir(struct mnt_idmap *idmap, + struct inode *inode, struct dentry *dentry, + umode_t mode) { struct tracefs_inode *ti; char *name; @@ -119,7 +119,7 @@ static int tracefs_syscall_mkdir(struct mnt_idmap *idmap, name = get_dname(dentry); if (!name) - return -ENOMEM; + return ERR_PTR(-ENOMEM); /* * This is a new directory that does not take the default of @@ -141,7 +141,7 
@@ static int tracefs_syscall_mkdir(struct mnt_idmap *idmap, kfree(name); - return ret; + return ERR_PTR(ret); } static int tracefs_syscall_rmdir(struct inode *inode, struct dentry *dentry) diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index fda82f3e16e8..3c3d3ad4fa6c 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -1002,8 +1002,8 @@ static int ubifs_rmdir(struct inode *dir, struct dentry *dentry) return err; } -static int ubifs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *ubifs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct inode *inode; struct ubifs_inode *dir_ui = ubifs_inode(dir); @@ -1023,7 +1023,7 @@ static int ubifs_mkdir(struct mnt_idmap *idmap, struct inode *dir, err = ubifs_budget_space(c, &req); if (err) - return err; + return ERR_PTR(err); err = ubifs_prepare_create(dir, dentry, &nm); if (err) @@ -1060,7 +1060,7 @@ static int ubifs_mkdir(struct mnt_idmap *idmap, struct inode *dir, ubifs_release_budget(c, &req); d_instantiate(dentry, inode); fscrypt_free_filename(&nm); - return 0; + return NULL; out_cancel: dir->i_size -= sz_change; @@ -1074,7 +1074,7 @@ static int ubifs_mkdir(struct mnt_idmap *idmap, struct inode *dir, fscrypt_free_filename(&nm); out_budg: ubifs_release_budget(c, &req); - return err; + return ERR_PTR(err); } static int ubifs_mknod(struct mnt_idmap *idmap, struct inode *dir, diff --git a/fs/udf/namei.c b/fs/udf/namei.c index 2cb49b6b0716..5f2e9a892bff 100644 --- a/fs/udf/namei.c +++ b/fs/udf/namei.c @@ -419,8 +419,8 @@ static int udf_mknod(struct mnt_idmap *idmap, struct inode *dir, return udf_add_nondir(dentry, inode); } -static int udf_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +static struct dentry *udf_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) { struct inode *inode; struct udf_fileident_iter iter; @@ -430,7 +430,7 @@ static 
int udf_mkdir(struct mnt_idmap *idmap, struct inode *dir, inode = udf_new_inode(dir, S_IFDIR | mode); if (IS_ERR(inode)) - return PTR_ERR(inode); + return ERR_CAST(inode); iinfo = UDF_I(inode); inode->i_op = &udf_dir_inode_operations; @@ -439,7 +439,7 @@ static int udf_mkdir(struct mnt_idmap *idmap, struct inode *dir, if (err) { clear_nlink(inode); discard_new_inode(inode); - return err; + return ERR_PTR(err); } set_nlink(inode, 2); iter.fi.icb.extLength = cpu_to_le32(inode->i_sb->s_blocksize); @@ -456,7 +456,7 @@ static int udf_mkdir(struct mnt_idmap *idmap, struct inode *dir, if (err) { clear_nlink(inode); discard_new_inode(inode); - return err; + return ERR_PTR(err); } iter.fi.icb.extLength = cpu_to_le32(inode->i_sb->s_blocksize); iter.fi.icb.extLocation = cpu_to_lelb(iinfo->i_location); @@ -471,7 +471,7 @@ static int udf_mkdir(struct mnt_idmap *idmap, struct inode *dir, mark_inode_dirty(dir); d_instantiate_new(dentry, inode); - return 0; + return NULL; } static int empty_dir(struct inode *dir) diff --git a/fs/ufs/namei.c b/fs/ufs/namei.c index 38a024c8cccd..5b3c85c93242 100644 --- a/fs/ufs/namei.c +++ b/fs/ufs/namei.c @@ -166,8 +166,8 @@ static int ufs_link (struct dentry * old_dentry, struct inode * dir, return error; } -static int ufs_mkdir(struct mnt_idmap * idmap, struct inode * dir, - struct dentry * dentry, umode_t mode) +static struct dentry *ufs_mkdir(struct mnt_idmap * idmap, struct inode * dir, + struct dentry * dentry, umode_t mode) { struct inode * inode; int err; @@ -194,7 +194,7 @@ static int ufs_mkdir(struct mnt_idmap * idmap, struct inode * dir, goto out_fail; d_instantiate_new(dentry, inode); - return 0; + return NULL; out_fail: inode_dec_link_count(inode); @@ -202,7 +202,7 @@ static int ufs_mkdir(struct mnt_idmap * idmap, struct inode * dir, discard_new_inode(inode); out_dir: inode_dec_link_count(dir); - return err; + return ERR_PTR(err); } static int ufs_unlink(struct inode *dir, struct dentry *dentry) diff --git a/fs/vboxsf/dir.c 
b/fs/vboxsf/dir.c
index a859ac9b74ba..770e29ec3557 100644
--- a/fs/vboxsf/dir.c
+++ b/fs/vboxsf/dir.c
@@ -303,11 +303,11 @@ static int vboxsf_dir_mkfile(struct mnt_idmap *idmap,
 	return vboxsf_dir_create(parent, dentry, mode, false, excl, NULL);
 }
 
-static int vboxsf_dir_mkdir(struct mnt_idmap *idmap,
-			    struct inode *parent, struct dentry *dentry,
-			    umode_t mode)
+static struct dentry *vboxsf_dir_mkdir(struct mnt_idmap *idmap,
+				       struct inode *parent, struct dentry *dentry,
+				       umode_t mode)
 {
-	return vboxsf_dir_create(parent, dentry, mode, true, true, NULL);
+	return ERR_PTR(vboxsf_dir_create(parent, dentry, mode, true, true, NULL));
 }
 
 static int vboxsf_dir_atomic_open(struct inode *parent, struct dentry *dentry,
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 40289fe6f5b2..a4480098d2bf 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -298,14 +298,14 @@ xfs_vn_create(
 	return xfs_generic_create(idmap, dir, dentry, mode, 0, NULL);
 }
 
-STATIC int
+STATIC struct dentry *
 xfs_vn_mkdir(
 	struct mnt_idmap	*idmap,
 	struct inode		*dir,
 	struct dentry		*dentry,
 	umode_t			mode)
 {
-	return xfs_generic_create(idmap, dir, dentry, mode | S_IFDIR, 0, NULL);
+	return ERR_PTR(xfs_generic_create(idmap, dir, dentry, mode | S_IFDIR, 0, NULL));
 }
 
 STATIC struct dentry *
diff --git a/include/linux/fs.h b/include/linux/fs.h
index ac7a694a681b..4962f4a4e603 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2201,8 +2201,8 @@ struct inode_operations {
 	int (*unlink) (struct inode *,struct dentry *);
 	int (*symlink) (struct mnt_idmap *, struct inode *,struct dentry *,
 			const char *);
-	int (*mkdir) (struct mnt_idmap *, struct inode *,struct dentry *,
-		      umode_t);
+	struct dentry *(*mkdir) (struct mnt_idmap *, struct inode *,
+				 struct dentry *, umode_t);
 	int (*rmdir) (struct inode *,struct dentry *);
 	int (*mknod) (struct mnt_idmap *, struct inode *,struct dentry *,
 		      umode_t,dev_t);
diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
index 9aaf5124648b..dc3aa91a6ba0 100644
--- a/kernel/bpf/inode.c
+++ b/kernel/bpf/inode.c
@@ -150,14 +150,14 @@ static void bpf_dentry_finalize(struct dentry *dentry, struct inode *inode,
 	inode_set_mtime_to_ts(dir, inode_set_ctime_current(dir));
 }
 
-static int bpf_mkdir(struct mnt_idmap *idmap, struct inode *dir,
-		     struct dentry *dentry, umode_t mode)
+static struct dentry *bpf_mkdir(struct mnt_idmap *idmap, struct inode *dir,
+				struct dentry *dentry, umode_t mode)
 {
 	struct inode *inode;
 
 	inode = bpf_get_inode(dir->i_sb, dir, mode | S_IFDIR);
 	if (IS_ERR(inode))
-		return PTR_ERR(inode);
+		return ERR_CAST(inode);
 
 	inode->i_op = &bpf_dir_iops;
 	inode->i_fop = &simple_dir_operations;
@@ -166,7 +166,7 @@ static int bpf_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 	inc_nlink(dir);
 
 	bpf_dentry_finalize(dentry, inode, dir);
-	return 0;
+	return NULL;
 }
 
 struct map_iter {
diff --git a/mm/shmem.c b/mm/shmem.c
index 4ea6109a8043..00ae0146e768 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3889,16 +3889,16 @@ shmem_tmpfile(struct mnt_idmap *idmap, struct inode *dir,
 	return error;
 }
 
-static int shmem_mkdir(struct mnt_idmap *idmap, struct inode *dir,
-		       struct dentry *dentry, umode_t mode)
+static struct dentry *shmem_mkdir(struct mnt_idmap *idmap, struct inode *dir,
+				  struct dentry *dentry, umode_t mode)
 {
 	int error;
 
 	error = shmem_mknod(idmap, dir, dentry, mode | S_IFDIR, 0);
 	if (error)
-		return error;
+		return ERR_PTR(error);
 	inc_nlink(dir);
-	return 0;
+	return NULL;
 }
 
 static int shmem_create(struct mnt_idmap *idmap, struct inode *dir,
diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c
index c07d150685d7..6039afae4bfc 100644
--- a/security/apparmor/apparmorfs.c
+++ b/security/apparmor/apparmorfs.c
@@ -1795,8 +1795,8 @@ int __aafs_profile_mkdir(struct aa_profile *profile, struct dentry *parent)
 	return error;
 }
 
-static int ns_mkdir_op(struct mnt_idmap *idmap, struct inode *dir,
-		       struct dentry *dentry, umode_t mode)
+static struct dentry *ns_mkdir_op(struct mnt_idmap *idmap, struct inode *dir,
+				  struct dentry *dentry, umode_t mode)
 {
 	struct aa_ns *ns, *parent;
 	/* TODO: improve permission check */
@@ -1808,7 +1808,7 @@ static int ns_mkdir_op(struct mnt_idmap *idmap, struct inode *dir,
 					    AA_MAY_LOAD_POLICY);
 	end_current_label_crit_section(label);
 	if (error)
-		return error;
+		return ERR_PTR(error);
 
 	parent = aa_get_ns(dir->i_private);
 	AA_BUG(d_inode(ns_subns_dir(parent)) != dir);
@@ -1843,7 +1843,7 @@ static int ns_mkdir_op(struct mnt_idmap *idmap, struct inode *dir,
 	mutex_unlock(&parent->lock);
 	aa_put_ns(parent);
 
-	return error;
+	return ERR_PTR(error);
 }
 
 static int ns_rmdir_op(struct inode *dir, struct dentry *dentry)
-- 
2.48.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread
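The mechanical conversions in the diff above come down to a few patterns. What follows is a rough userspace sketch, not kernel code. ERR_PTR(), PTR_ERR() and IS_ERR() are simplified stand-ins modelled on the kernel helpers of the same name, and struct dentry, legacy_create() and demo_mkdir() are hypothetical names used only for illustration. It is meant to show why wrapping an int-returning helper in ERR_PTR() appears to be sufficient: ERR_PTR(0) is NULL, and NULL is what the converted functions return on success.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's error-pointer helpers
 * (assumption: modelled on them, not copied from the kernel). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errnos. */
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical placeholder; the real struct dentry is far larger. */
struct dentry { const char *name; };

/* Hypothetical old-style helper: 0 on success, negative errno on failure. */
static int legacy_create(struct dentry *dentry, int fail_with)
{
	(void)dentry;
	return fail_with ? -fail_with : 0;
}

/*
 * New-style mkdir built on the old helper. A return of 0 becomes NULL
 * ("success, the dentry passed in was used"); a negative errno becomes
 * an error pointer.
 */
static struct dentry *demo_mkdir(struct dentry *dentry, int fail_with)
{
	return ERR_PTR(legacy_create(dentry, fail_with));
}
```

Calling demo_mkdir() with fail_with set to 0 yields NULL, and calling it with ENOSPC yields a pointer for which IS_ERR() is true and PTR_ERR() is -ENOSPC. The bpf_mkdir() hunk uses ERR_CAST() instead because the error there is already encoded in a pointer of another type.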
* Re: [PATCH 1/6] Change inode_operations.mkdir to return struct dentry *
  2025-02-27  1:32 ` [PATCH 1/6] Change inode_operations.mkdir to return struct dentry * NeilBrown
@ 2025-02-27 11:34   ` Christian Brauner
  0 siblings, 0 replies; 36+ messages in thread
From: Christian Brauner @ 2025-02-27 11:34 UTC (permalink / raw)
  To: NeilBrown
  Cc: Christian Brauner, Chuck Lever, Jeff Layton, Trond Myklebust,
	Anna Schumaker, linux-nfs, Ilya Dryomov, Xiubo Li, ceph-devel,
	Miklos Szeredi, linux-fsdevel, Richard Weinberger, Anton Ivanov,
	Johannes Berg, linux-um, linux-kernel, Alexander Viro, Jan Kara

On Thu, 27 Feb 2025 12:32:53 +1100, NeilBrown wrote:
> Some filesystems, such as NFS, cifs, ceph, and fuse, do not have
> complete control of sequencing on the actual filesystem (e.g. on a
> different server) and may find that the inode created for a mkdir
> request already exists in the icache and dcache by the time the mkdir
> request returns. For example, if the filesystem is mounted twice the
> directory could be visible on the other mount before it is on the
> original mount, and a pair of name_to_handle_at(), open_by_handle_at()
> calls could instantiate the directory inode with an IS_ROOT() dentry
> before the first mkdir returns.
>
> [...]

Applied to the vfs-6.15.async.dir branch of the vfs/vfs.git tree.
Patches in the vfs-6.15.async.dir branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs-6.15.async.dir

[1/6] Change inode_operations.mkdir to return struct dentry *
      https://git.kernel.org/vfs/vfs/c/10a5b48c3eeb
[2/6] hostfs: store inode in dentry after mkdir if possible.
      https://git.kernel.org/vfs/vfs/c/28d16ecaa2a8
[3/6] ceph: return the correct dentry on mkdir
      https://git.kernel.org/vfs/vfs/c/948ec6393e44
[4/6] fuse: return correct dentry for ->mkdir
      https://git.kernel.org/vfs/vfs/c/ef04f867aeb2
[5/6] nfs: change mkdir inode_operation to return alternate dentry if needed.
      https://git.kernel.org/vfs/vfs/c/5ca75f993a4a
[6/6] VFS: Change vfs_mkdir() to return the dentry.
      https://git.kernel.org/vfs/vfs/c/9cdf09f608d0

^ permalink raw reply	[flat|nested] 36+ messages in thread
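For readers skimming the applied series, the caller side of the new ->mkdir convention can be sketched roughly as below. This is a userspace model under stated assumptions, not code from the series. The three-way result (an error pointer, NULL when the dentry passed in was used, or a different dentry that already names the new directory) is my reading of the cover letter, the quoted description and the patch titles. The helper stand-ins, struct dentry, mkdir_fn, mkdir_and_pick() and the three fs_* functions are all hypothetical. Reference counting, which real kernel callers must handle, is deliberately left out.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical placeholder; only pointer identity matters in this sketch. */
struct dentry { const char *name; };

/* Hypothetical method type following the new return convention. */
typedef struct dentry *(*mkdir_fn)(struct dentry *dentry);

/*
 * Pick the dentry the caller should continue with:
 *   error pointer -> propagate the error unchanged
 *   NULL          -> the dentry that was passed in names the directory
 *   anything else -> the filesystem attached the inode to another dentry
 */
static struct dentry *mkdir_and_pick(mkdir_fn op, struct dentry *dentry)
{
	struct dentry *ret = op(dentry);

	if (IS_ERR(ret))
		return ret;
	return ret ? ret : dentry;
}

/* Three hypothetical filesystems, one per outcome. */
static struct dentry other = { "already-attached" };
static struct dentry *fs_uses_passed(struct dentry *d) { (void)d; return NULL; }
static struct dentry *fs_uses_other(struct dentry *d) { (void)d; return &other; }
static struct dentry *fs_fails(struct dentry *d) { (void)d; return ERR_PTR(-EEXIST); }
```

With this shape a caller no longer needs its own follow-up lookup to find the dentry that ended up attached to the directory; it simply continues with whatever mkdir_and_pick() hands back, after checking IS_ERR().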
end of thread, other threads:[~2025-02-27 12:09 UTC | newest]

Thread overview: 36+ messages
2025-02-20 23:36 [PATCH 0/6] Change ->mkdir() and vfs_mkdir() to return a dentry NeilBrown
2025-02-20 23:36 ` [PATCH 1/6] Change inode_operations.mkdir to return struct dentry * NeilBrown
2025-02-22  4:19 ` Al Viro
2025-02-24  1:34 ` NeilBrown
2025-02-24  2:09 ` Al Viro
2025-02-24  3:09 ` NeilBrown
2025-02-24 15:56 ` Trond Myklebust
2025-02-26  2:09 ` NeilBrown
2025-02-26  2:34 ` Trond Myklebust
2025-02-26  3:18 ` NeilBrown
2025-02-26  3:35 ` Al Viro
2025-02-22  4:56 ` Al Viro
2025-02-20 23:36 ` [PATCH 2/6] hostfs: store inode in dentry after mkdir if possible NeilBrown
2025-02-21 13:17 ` Jeff Layton
2025-02-20 23:36 ` [PATCH 3/6] ceph: return the correct dentry on mkdir NeilBrown
2025-02-21  1:48 ` Viacheslav Dubeyko
2025-02-24  2:15 ` NeilBrown
2025-02-24 22:09 ` Viacheslav Dubeyko
2025-02-24 22:53 ` Jeff Layton
2025-02-24 23:29 ` NeilBrown
2025-02-21 13:31 ` Jeff Layton
2025-02-20 23:36 ` [PATCH 4/6] fuse: return correct dentry for ->mkdir NeilBrown
2025-02-21 13:39 ` Jeff Layton
2025-02-22  4:24 ` Al Viro
2025-02-24  2:26 ` NeilBrown
2025-02-24  2:53 ` Al Viro
2025-02-20 23:36 ` [PATCH 5/6] nfs: change mkdir inode_operation to return alternate dentry if needed NeilBrown
2025-02-22  4:41 ` Al Viro
2025-02-24  2:41 ` NeilBrown
2025-02-20 23:36 ` [PATCH 6/6] VFS: Change vfs_mkdir() to return the dentry NeilBrown
2025-02-21 14:25 ` Jeff Layton
2025-02-22  0:32 ` Chuck Lever
2025-02-24  2:51 ` NeilBrown
2025-02-24 14:22 ` Chuck Lever
-- strict thread matches above, loose matches on Subject: below --
2025-02-27  1:32 [PATCH 0/6 v2] Change ->mkdir() and vfs_mkdir() to return a dentry NeilBrown
2025-02-27  1:32 ` [PATCH 1/6] Change inode_operations.mkdir to return struct dentry * NeilBrown
2025-02-27 11:34 ` Christian Brauner