* [PATCH v3 00/13] vfs: recall-only directory delegations for knfsd
From: Jeff Layton @ 2025-10-21 15:25 UTC
To: Miklos Szeredi, Alexander Viro, Christian Brauner, Jan Kara,
Chuck Lever, Alexander Aring, Trond Myklebust, Anna Schumaker,
Steve French, Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N,
Tom Talpey, Bharath SM, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, David Howells, Tyler Hicks, NeilBrown,
Olga Kornievskaia, Dai Ngo, Amir Goldstein, Namjae Jeon,
Steve French, Sergey Senozhatsky, Carlos Maiolino,
Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman
Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
samba-technical, netfs, ecryptfs, linux-unionfs, linux-xfs,
netdev, Jeff Layton
Behold, another version of directory delegations. This version contains
support for recall-only delegations. Support for CB_NOTIFY will be
forthcoming (once the client-side patches have caught up).
The main differences in this version are bugfixes, but the last patch
adds a more formal API for userland to request a delegation. That
support is optional. We can drop it and the rest of the series should be
fine.
My main interest in making delegations available to userland is to allow
testing this support without nfsd. I have an xfstest ready to submit for
this if that support looks acceptable. If it is, then I'll also plan to
submit an update for fcntl(2).
Christian, Chuck mentioned he was fine with you merging the nfsd bits
too, if you're willing to take the whole pile.
Thanks!
Jeff
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
Changes in v3:
- Fix potential nfsd_file refcount leaks on GET_DIR_DELEGATION error
- Add missing parent dir deleg break in vfs_symlink()
- Add F_SETDELEG/F_GETDELEG support to fcntl()
- Link to v2: https://lore.kernel.org/r/20251017-dir-deleg-ro-v2-0-8c8f6dd23c8b@kernel.org
Changes in v2:
- handle lease conflict resolution inside of nfsd
- drop the lm_may_setlease lock_manager operation
- just add extra argument to vfs_create() instead of creating wrapper
- don't allocate fsnotify_mark for open directories
- Link to v1: https://lore.kernel.org/r/20251013-dir-deleg-ro-v1-0-406780a70e5e@kernel.org
---
Jeff Layton (13):
filelock: push the S_ISREG check down to ->setlease handlers
vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink}
vfs: allow mkdir to wait for delegation break on parent
vfs: allow rmdir to wait for delegation break on parent
vfs: break parent dir delegations in open(..., O_CREAT) codepath
vfs: make vfs_create break delegations on parent directory
vfs: make vfs_mknod break delegations on parent directory
vfs: make vfs_symlink break delegations on parent dir
filelock: lift the ban on directory leases in generic_setlease
nfsd: allow filecache to hold S_IFDIR files
nfsd: allow DELEGRETURN on directories
nfsd: wire up GET_DIR_DELEGATION handling
vfs: expose delegation support to userland
drivers/base/devtmpfs.c | 6 +-
fs/cachefiles/namei.c | 2 +-
fs/ecryptfs/inode.c | 10 +--
fs/fcntl.c | 9 +++
fs/fuse/dir.c | 1 +
fs/init.c | 6 +-
fs/locks.c | 68 +++++++++++++++-----
fs/namei.c | 150 +++++++++++++++++++++++++++++++++++----------
fs/nfs/nfs4file.c | 2 +
fs/nfsd/filecache.c | 57 ++++++++++++-----
fs/nfsd/filecache.h | 2 +
fs/nfsd/nfs3proc.c | 2 +-
fs/nfsd/nfs4proc.c | 22 ++++++-
fs/nfsd/nfs4recover.c | 6 +-
fs/nfsd/nfs4state.c | 103 ++++++++++++++++++++++++++++++-
fs/nfsd/state.h | 5 ++
fs/nfsd/vfs.c | 16 ++---
fs/nfsd/vfs.h | 2 +-
fs/open.c | 2 +-
fs/overlayfs/overlayfs.h | 10 +--
fs/smb/client/cifsfs.c | 3 +
fs/smb/server/vfs.c | 8 +--
fs/xfs/scrub/orphanage.c | 2 +-
include/linux/filelock.h | 12 ++++
include/linux/fs.h | 13 ++--
include/uapi/linux/fcntl.h | 10 +++
net/unix/af_unix.c | 2 +-
27 files changed, 425 insertions(+), 106 deletions(-)
---
base-commit: d2ced3cadfab04c7e915adf0a73c53fcf1642719
change-id: 20251013-dir-deleg-ro-d0fe19823b21
Best regards,
--
Jeff Layton <jlayton@kernel.org>
* [PATCH v3 01/13] filelock: push the S_ISREG check down to ->setlease handlers
From: Jeff Layton @ 2025-10-21 15:25 UTC
When nfsd starts requesting directory delegations, setlease handlers may
see requests for leases on directories. Push the !S_ISREG check down
into the non-trivial setlease handlers, so that directory leases can be
enabled selectively where they're supported.
FUSE is special: It's the only filesystem that supports atomic_open and
allows kernel-internal leases. atomic_open is issued when the VFS
doesn't know the state of the dentry being opened. If the file doesn't
exist, it may be created, in which case the dir lease should be broken.
The existing kernel-internal lease implementation has no provision for
this. Ensure that we don't allow directory leases by default going
forward by explicitly disabling them there.
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
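A rough userspace illustration of the idea in the commit message above, not
code from this patch: once the file-type check lives in each handler instead
of the common entry point, an individual handler can later opt in to
directories without affecting the others. Every demo_* name is hypothetical;
only S_ISREG(), S_ISDIR() and EINVAL are real.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical handler type: takes the file mode, returns 0 or -errno. */
typedef int (*demo_setlease_fn)(mode_t mode);

/* A handler that has not been taught about directories keeps the old check. */
static int demo_regular_only_setlease(mode_t mode)
{
	if (!S_ISREG(mode))
		return -EINVAL;
	return 0;
}

/* A handler that opts in to directory leases accepts both file types. */
static int demo_dir_capable_setlease(mode_t mode)
{
	if (!S_ISREG(mode) && !S_ISDIR(mode))
		return -EINVAL;
	return 0;
}

/* The common entry point no longer looks at the file type itself. */
static int demo_vfs_setlease(demo_setlease_fn handler, mode_t mode)
{
	assert(handler != NULL);
	return handler(mode);
}
```

With these stand-ins, a directory mode passed through
demo_regular_only_setlease() is rejected with -EINVAL, while the same mode
passed through demo_dir_capable_setlease() is accepted.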
fs/fuse/dir.c | 1 +
fs/locks.c | 5 +++--
fs/nfs/nfs4file.c | 2 ++
fs/smb/client/cifsfs.c | 3 +++
4 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index ecaec0fea3a132e7cbb88121e7db7fb504d57d3c..667774cc72a1d49796f531fcb342d2e4878beb85 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -2230,6 +2230,7 @@ static const struct file_operations fuse_dir_operations = {
.fsync = fuse_dir_fsync,
.unlocked_ioctl = fuse_dir_ioctl,
.compat_ioctl = fuse_dir_compat_ioctl,
+ .setlease = simple_nosetlease,
};
static const struct inode_operations fuse_common_inode_operations = {
diff --git a/fs/locks.c b/fs/locks.c
index 04a3f0e2072461b6e2d3d1cd12f2b089d69a7db3..0b16921fb52e602ea2e0c3de39d9d772af98ba7d 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1929,6 +1929,9 @@ static int generic_delete_lease(struct file *filp, void *owner)
int generic_setlease(struct file *filp, int arg, struct file_lease **flp,
void **priv)
{
+ if (!S_ISREG(file_inode(filp)->i_mode))
+ return -EINVAL;
+
switch (arg) {
case F_UNLCK:
return generic_delete_lease(filp, *priv);
@@ -2018,8 +2021,6 @@ vfs_setlease(struct file *filp, int arg, struct file_lease **lease, void **priv)
if ((!vfsuid_eq_kuid(vfsuid, current_fsuid())) && !capable(CAP_LEASE))
return -EACCES;
- if (!S_ISREG(inode->i_mode))
- return -EINVAL;
error = security_file_lock(filp, arg);
if (error)
return error;
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index 7f43e890d3564a000dab9365048a3e17dc96395c..7317f26892c5782a39660cae87ec1afea24e36c0 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -431,6 +431,8 @@ void nfs42_ssc_unregister_ops(void)
static int nfs4_setlease(struct file *file, int arg, struct file_lease **lease,
void **priv)
{
+ if (!S_ISREG(file_inode(file)->i_mode))
+ return -EINVAL;
return nfs4_proc_setlease(file, arg, lease, priv);
}
diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c
index 4f959f1e08d235071a151c1438c753fcd05099e5..1522c6b61b48c05c93f2bedeab0d35b6d85378e2 100644
--- a/fs/smb/client/cifsfs.c
+++ b/fs/smb/client/cifsfs.c
@@ -1149,6 +1149,9 @@ cifs_setlease(struct file *file, int arg, struct file_lease **lease, void **priv
struct inode *inode = file_inode(file);
struct cifsFileInfo *cfile = file->private_data;
+ if (!S_ISREG(inode->i_mode))
+ return -EINVAL;
+
/* Check if file is oplocked if this is request for new lease */
if (arg == F_UNLCK ||
((arg == F_RDLCK) && CIFS_CACHE_READ(CIFS_I(inode))) ||
--
2.51.0
* [PATCH v3 02/13] vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink}
From: Jeff Layton @ 2025-10-21 15:25 UTC
In order to add directory delegation support, we need to break
delegations on the parent whenever there is going to be a change in the
directory.
vfs_link, vfs_unlink, and vfs_rename all have existing delegation break
handling for the children involved in the operation. Add the necessary
calls for breaking delegations in the parent(s) as well.
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
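The ordering that the vfs_rename() hunk below sets up (old parent, then new
parent if it differs, then the source) can be modelled in userspace roughly
as follows. This is only a sketch: struct demo_inode and the demo_* functions
are hypothetical, and the -EWOULDBLOCK return and the reporting through
*delegated_inode are assumptions about how try_break_deleg() behaves, since
its body is not part of this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode: only tracks an outstanding delegation. */
struct demo_inode {
	int delegated;
};

/*
 * Assumed behaviour: return 0 if nothing is outstanding, otherwise
 * report the inode through *delegated_inode (when the caller passed
 * one) and fail with -EWOULDBLOCK.
 */
static int demo_try_break_deleg(struct demo_inode *inode,
				struct demo_inode **delegated_inode)
{
	if (!inode->delegated)
		return 0;
	if (delegated_inode)
		*delegated_inode = inode;
	return -EWOULDBLOCK;
}

/* Rename: break both parents (only once if they are the same), then the source. */
static int demo_rename_breaks(struct demo_inode *old_dir,
			      struct demo_inode *new_dir,
			      struct demo_inode *source,
			      struct demo_inode **delegated_inode)
{
	int error;

	assert(old_dir && new_dir && source);

	error = demo_try_break_deleg(old_dir, delegated_inode);
	if (error)
		return error;
	if (new_dir != old_dir) {
		error = demo_try_break_deleg(new_dir, delegated_inode);
		if (error)
			return error;
	}
	return demo_try_break_deleg(source, delegated_inode);
}
```

In this model the first delegated inode encountered is the one reported back,
so a caller that waits and retries resolves the parents before it ever
touches the child.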
fs/namei.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/fs/namei.c b/fs/namei.c
index 7377020a2cba02501483020e0fc93c279fb38d3e..6e61e0215b34134b1690f864e2719e3f82cf71a8 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4667,6 +4667,9 @@ int vfs_unlink(struct mnt_idmap *idmap, struct inode *dir,
else {
error = security_inode_unlink(dir, dentry);
if (!error) {
+ error = try_break_deleg(dir, delegated_inode);
+ if (error)
+ goto out;
error = try_break_deleg(target, delegated_inode);
if (error)
goto out;
@@ -4936,7 +4939,9 @@ int vfs_link(struct dentry *old_dentry, struct mnt_idmap *idmap,
else if (max_links && inode->i_nlink >= max_links)
error = -EMLINK;
else {
- error = try_break_deleg(inode, delegated_inode);
+ error = try_break_deleg(dir, delegated_inode);
+ if (!error)
+ error = try_break_deleg(inode, delegated_inode);
if (!error)
error = dir->i_op->link(old_dentry, dir, new_dentry);
}
@@ -5203,6 +5208,14 @@ int vfs_rename(struct renamedata *rd)
old_dir->i_nlink >= max_links)
goto out;
}
+ error = try_break_deleg(old_dir, delegated_inode);
+ if (error)
+ goto out;
+ if (new_dir != old_dir) {
+ error = try_break_deleg(new_dir, delegated_inode);
+ if (error)
+ goto out;
+ }
if (!is_dir) {
error = try_break_deleg(source, delegated_inode);
if (error)
--
2.51.0
* [PATCH v3 03/13] vfs: allow mkdir to wait for delegation break on parent
From: Jeff Layton @ 2025-10-21 15:25 UTC
In order to add directory delegation support, we need to break
delegations on the parent whenever there is going to be a change in the
directory.
Add a new delegated_inode parameter to vfs_mkdir. All of the existing
callers set that to NULL for now, except for do_mkdirat which will
properly block until the lease is gone.
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
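For readers unfamiliar with the pattern, the difference between a caller that
passes NULL and one that passes &delegated_inode (as do_mkdirat() does in the
hunk below) can be sketched in userspace like this. All sketch_* names are
hypothetical, and the stand-in for break_deleg_wait() simply pretends the
holder returned the delegation immediately; the real function blocks and is
not shown in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sketch_inode {
	int delegated;		/* non-zero while a delegation is outstanding */
	int nr_children;	/* entries created so far */
};

/* Stand-in for vfs_mkdir(): refuses to modify a delegated parent. */
static int sketch_mkdir(struct sketch_inode *dir,
			struct sketch_inode **delegated_inode)
{
	assert(dir != NULL);
	if (dir->delegated) {
		if (delegated_inode)
			*delegated_inode = dir;
		return -EWOULDBLOCK;
	}
	dir->nr_children++;
	return 0;
}

/*
 * Stand-in for break_deleg_wait(): assume the recall completes,
 * then clear the caller's pointer so the next pass starts clean.
 */
static int sketch_break_deleg_wait(struct sketch_inode **delegated_inode)
{
	(*delegated_inode)->delegated = 0;
	*delegated_inode = NULL;
	return 0;
}

/* Caller shaped like do_mkdirat(): try, wait for the recall, retry. */
static int sketch_do_mkdir(struct sketch_inode *dir, int *attempts)
{
	struct sketch_inode *delegated_inode = NULL;
	int error;

retry:
	(*attempts)++;
	error = sketch_mkdir(dir, &delegated_inode);
	if (delegated_inode) {
		error = sketch_break_deleg_wait(&delegated_inode);
		if (!error)
			goto retry;
	}
	return error;
}
```

A caller that passes NULL just sees -EWOULDBLOCK and has to cope with it on
its own, whereas the retrying caller succeeds on its second attempt once the
simulated recall has finished.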
drivers/base/devtmpfs.c | 2 +-
fs/cachefiles/namei.c | 2 +-
fs/ecryptfs/inode.c | 2 +-
fs/init.c | 2 +-
fs/namei.c | 24 ++++++++++++++++++------
fs/nfsd/nfs4recover.c | 2 +-
fs/nfsd/vfs.c | 2 +-
fs/overlayfs/overlayfs.h | 2 +-
fs/smb/server/vfs.c | 2 +-
fs/xfs/scrub/orphanage.c | 2 +-
include/linux/fs.h | 2 +-
11 files changed, 28 insertions(+), 16 deletions(-)
diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
index 9d4e46ad8352257a6a65d85526ebdbf9bf2d4b19..0e79621cb0f79870003b867ca384199171ded4e0 100644
--- a/drivers/base/devtmpfs.c
+++ b/drivers/base/devtmpfs.c
@@ -180,7 +180,7 @@ static int dev_mkdir(const char *name, umode_t mode)
if (IS_ERR(dentry))
return PTR_ERR(dentry);
- dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(path.dentry), dentry, mode);
+ dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(path.dentry), dentry, mode, NULL);
if (!IS_ERR(dentry))
/* mark as kernel-created inode */
d_inode(dentry)->i_private = &thread;
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index d1edb2ac38376c4f9d2a18026450bb3c774f7824..50c0f9c76d1fd4c05db90d7d0d1bad574523ead0 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -130,7 +130,7 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
goto mkdir_error;
ret = cachefiles_inject_write_error();
if (ret == 0)
- subdir = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700);
+ subdir = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700, NULL);
else
subdir = ERR_PTR(ret);
if (IS_ERR(subdir)) {
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index ed1394da8d6bd7065f2a074378331f13fcda17f9..35830b3144f8f71374a78b3e7463b864f4fc216e 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -508,7 +508,7 @@ static struct dentry *ecryptfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
goto out;
lower_dentry = vfs_mkdir(&nop_mnt_idmap, lower_dir,
- lower_dentry, mode);
+ lower_dentry, mode, NULL);
rc = PTR_ERR(lower_dentry);
if (IS_ERR(lower_dentry))
goto out;
diff --git a/fs/init.c b/fs/init.c
index 07f592ccdba868509d0f3aaf9936d8d890fdbec5..895f8a09a71acfd03e11164e3b441a7d4e2de146 100644
--- a/fs/init.c
+++ b/fs/init.c
@@ -233,7 +233,7 @@ int __init init_mkdir(const char *pathname, umode_t mode)
error = security_path_mkdir(&path, dentry, mode);
if (!error) {
dentry = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
- dentry, mode);
+ dentry, mode, NULL);
if (IS_ERR(dentry))
error = PTR_ERR(dentry);
}
diff --git a/fs/namei.c b/fs/namei.c
index 6e61e0215b34134b1690f864e2719e3f82cf71a8..86cf6eca1f485361c6732974e4103cf5ea721539 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4407,10 +4407,11 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d
/**
* vfs_mkdir - create directory returning correct dentry if possible
- * @idmap: idmap of the mount the inode was found from
- * @dir: inode of the parent directory
- * @dentry: dentry of the child directory
- * @mode: mode of the child directory
+ * @idmap: idmap of the mount the inode was found from
+ * @dir: inode of the parent directory
+ * @dentry: dentry of the child directory
+ * @mode: mode of the child directory
+ * @delegated_inode: returns parent inode, if the inode is delegated.
*
* Create a directory.
*
@@ -4427,7 +4428,8 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d
* In case of an error the dentry is dput() and an ERR_PTR() is returned.
*/
struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
- struct dentry *dentry, umode_t mode)
+ struct dentry *dentry, umode_t mode,
+ struct inode **delegated_inode)
{
int error;
unsigned max_links = dir->i_sb->s_max_links;
@@ -4450,6 +4452,10 @@ struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
if (max_links && dir->i_nlink >= max_links)
goto err;
+ error = try_break_deleg(dir, delegated_inode);
+ if (error)
+ goto err;
+
de = dir->i_op->mkdir(idmap, dir, dentry, mode);
error = PTR_ERR(de);
if (IS_ERR(de))
@@ -4473,6 +4479,7 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode)
struct path path;
int error;
unsigned int lookup_flags = LOOKUP_DIRECTORY;
+ struct inode *delegated_inode = NULL;
retry:
dentry = filename_create(dfd, name, &path, lookup_flags);
@@ -4484,11 +4491,16 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode)
mode_strip_umask(path.dentry->d_inode, mode));
if (!error) {
dentry = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
- dentry, mode);
+ dentry, mode, &delegated_inode);
if (IS_ERR(dentry))
error = PTR_ERR(dentry);
}
end_creating_path(&path, dentry);
+ if (delegated_inode) {
+ error = break_deleg_wait(&delegated_inode);
+ if (!error)
+ goto retry;
+ }
if (retry_estale(error, lookup_flags)) {
lookup_flags |= LOOKUP_REVAL;
goto retry;
diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index b1005abcb9035b2cf743200808a251b00af7e3f4..423dd102b51198ea7c447be2b9a0a5020c950dba 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -202,7 +202,7 @@ nfsd4_create_clid_dir(struct nfs4_client *clp)
* as well be forgiving and just succeed silently.
*/
goto out_put;
- dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, S_IRWXU);
+ dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, 0700, NULL);
if (IS_ERR(dentry))
status = PTR_ERR(dentry);
out_put:
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 8b2dc7a88aab015d1e39da0dd4e6daf7e276aabe..5f24af289d509bea54a324b8851fa06de6050353 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1645,7 +1645,7 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
nfsd_check_ignore_resizing(iap);
break;
case S_IFDIR:
- dchild = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode);
+ dchild = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode, NULL);
if (IS_ERR(dchild)) {
host_err = PTR_ERR(dchild);
} else if (d_is_negative(dchild)) {
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index c8fd5951fc5ece1ae6b3e2a0801ca15f9faf7d72..0f65f9a5d54d4786b39e4f4f30f416d5b9016e70 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -248,7 +248,7 @@ static inline struct dentry *ovl_do_mkdir(struct ovl_fs *ofs,
{
struct dentry *ret;
- ret = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode);
+ ret = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode, NULL);
pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, PTR_ERR_OR_ZERO(ret));
return ret;
}
diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c
index 891ed2dc2b7351a5cb14a2241d71095ffdd03f08..3d2190f26623b23ea79c63410905a3c3ad684048 100644
--- a/fs/smb/server/vfs.c
+++ b/fs/smb/server/vfs.c
@@ -230,7 +230,7 @@ int ksmbd_vfs_mkdir(struct ksmbd_work *work, const char *name, umode_t mode)
idmap = mnt_idmap(path.mnt);
mode |= S_IFDIR;
d = dentry;
- dentry = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode);
+ dentry = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode, NULL);
if (IS_ERR(dentry))
err = PTR_ERR(dentry);
else if (d_is_negative(dentry))
diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c
index 9c12cb8442311ca26b169e4d1567939ae44a5be0..91c9d07b97f306f57aebb9b69ba564b0c2cb8c17 100644
--- a/fs/xfs/scrub/orphanage.c
+++ b/fs/xfs/scrub/orphanage.c
@@ -167,7 +167,7 @@ xrep_orphanage_create(
*/
if (d_really_is_negative(orphanage_dentry)) {
orphanage_dentry = vfs_mkdir(&nop_mnt_idmap, root_inode,
- orphanage_dentry, 0750);
+ orphanage_dentry, 0750, NULL);
error = PTR_ERR(orphanage_dentry);
if (IS_ERR(orphanage_dentry))
goto out_unlock_root;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c895146c1444be36e0a779df55622cc38c9419ff..1040df3792794cd353b86558b41618294e25b8a6 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2113,7 +2113,7 @@ bool inode_owner_or_capable(struct mnt_idmap *idmap,
int vfs_create(struct mnt_idmap *, struct inode *,
struct dentry *, umode_t, bool);
struct dentry *vfs_mkdir(struct mnt_idmap *, struct inode *,
- struct dentry *, umode_t);
+ struct dentry *, umode_t, struct inode **);
int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *,
umode_t, dev_t);
int vfs_symlink(struct mnt_idmap *, struct inode *,
--
2.51.0
* [PATCH v3 04/13] vfs: allow rmdir to wait for delegation break on parent
From: Jeff Layton @ 2025-10-21 15:25 UTC
In order to add directory delegation support, we need to break
delegations on the parent whenever there is going to be a change in the
directory.
Add a "delegated_inode" return pointer to vfs_rmdir(). If the pointer is
non-NULL and the parent is delegated, it is populated with the parent
inode. Most existing in-kernel callers pass in a NULL pointer.
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
drivers/base/devtmpfs.c | 2 +-
fs/ecryptfs/inode.c | 2 +-
fs/namei.c | 22 +++++++++++++++++-----
fs/nfsd/nfs4recover.c | 4 ++--
fs/nfsd/vfs.c | 2 +-
fs/overlayfs/overlayfs.h | 2 +-
fs/smb/server/vfs.c | 4 ++--
include/linux/fs.h | 3 ++-
8 files changed, 27 insertions(+), 14 deletions(-)
diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
index 0e79621cb0f79870003b867ca384199171ded4e0..104025104ef75381984fd94dfbd50feeaa8cdd22 100644
--- a/drivers/base/devtmpfs.c
+++ b/drivers/base/devtmpfs.c
@@ -261,7 +261,7 @@ static int dev_rmdir(const char *name)
return PTR_ERR(dentry);
if (d_inode(dentry)->i_private == &thread)
err = vfs_rmdir(&nop_mnt_idmap, d_inode(parent.dentry),
- dentry);
+ dentry, NULL);
else
err = -EPERM;
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 35830b3144f8f71374a78b3e7463b864f4fc216e..88631291b32535f623a3fbe4ea9b6ed48a306ca0 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -540,7 +540,7 @@ static int ecryptfs_rmdir(struct inode *dir, struct dentry *dentry)
if (d_unhashed(lower_dentry))
rc = -EINVAL;
else
- rc = vfs_rmdir(&nop_mnt_idmap, lower_dir, lower_dentry);
+ rc = vfs_rmdir(&nop_mnt_idmap, lower_dir, lower_dentry, NULL);
}
if (!rc) {
clear_nlink(d_inode(dentry));
diff --git a/fs/namei.c b/fs/namei.c
index 86cf6eca1f485361c6732974e4103cf5ea721539..4b5a99653c558397e592715d9d4663cd4a63ef86 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4522,9 +4522,10 @@ SYSCALL_DEFINE2(mkdir, const char __user *, pathname, umode_t, mode)
/**
* vfs_rmdir - remove directory
- * @idmap: idmap of the mount the inode was found from
- * @dir: inode of the parent directory
- * @dentry: dentry of the child directory
+ * @idmap: idmap of the mount the inode was found from
+ * @dir: inode of the parent directory
+ * @dentry: dentry of the child directory
+ * @delegated_inode: returns parent inode, if it's delegated.
*
* Remove a directory.
*
@@ -4535,7 +4536,7 @@ SYSCALL_DEFINE2(mkdir, const char __user *, pathname, umode_t, mode)
* raw inode simply pass @nop_mnt_idmap.
*/
int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir,
- struct dentry *dentry)
+ struct dentry *dentry, struct inode **delegated_inode)
{
int error = may_delete(idmap, dir, dentry, 1);
@@ -4557,6 +4558,10 @@ int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir,
if (error)
goto out;
+ error = try_break_deleg(dir, delegated_inode);
+ if (error)
+ goto out;
+
error = dir->i_op->rmdir(dir, dentry);
if (error)
goto out;
@@ -4583,6 +4588,7 @@ int do_rmdir(int dfd, struct filename *name)
struct qstr last;
int type;
unsigned int lookup_flags = 0;
+ struct inode *delegated_inode = NULL;
retry:
error = filename_parentat(dfd, name, lookup_flags, &path, &last, &type);
if (error)
@@ -4612,7 +4618,8 @@ int do_rmdir(int dfd, struct filename *name)
error = security_path_rmdir(&path, dentry);
if (error)
goto exit4;
- error = vfs_rmdir(mnt_idmap(path.mnt), path.dentry->d_inode, dentry);
+ error = vfs_rmdir(mnt_idmap(path.mnt), path.dentry->d_inode,
+ dentry, &delegated_inode);
exit4:
dput(dentry);
exit3:
@@ -4620,6 +4627,11 @@ int do_rmdir(int dfd, struct filename *name)
mnt_drop_write(path.mnt);
exit2:
path_put(&path);
+ if (delegated_inode) {
+ error = break_deleg_wait(&delegated_inode);
+ if (!error)
+ goto retry;
+ }
if (retry_estale(error, lookup_flags)) {
lookup_flags |= LOOKUP_REVAL;
goto retry;
diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 423dd102b51198ea7c447be2b9a0a5020c950dba..71f8bf25d209937e13c9ae563101b7d8bf55f4ce 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -315,7 +315,7 @@ nfsd4_unlink_clid_dir(char *name, struct nfsd_net *nn)
status = -ENOENT;
if (d_really_is_negative(dentry))
goto out;
- status = vfs_rmdir(&nop_mnt_idmap, d_inode(dir), dentry);
+ status = vfs_rmdir(&nop_mnt_idmap, d_inode(dir), dentry, NULL);
out:
dput(dentry);
out_unlock:
@@ -409,7 +409,7 @@ purge_old(struct dentry *parent, char *cname, struct nfsd_net *nn)
inode_lock_nested(d_inode(parent), I_MUTEX_PARENT);
child = lookup_one(&nop_mnt_idmap, &QSTR(cname), parent);
if (!IS_ERR(child)) {
- status = vfs_rmdir(&nop_mnt_idmap, d_inode(parent), child);
+ status = vfs_rmdir(&nop_mnt_idmap, d_inode(parent), child, NULL);
if (status)
printk("failed to remove client recovery directory %pd\n",
child);
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 5f24af289d509bea54a324b8851fa06de6050353..85afd2fad7e08b66b1a9ce372afdea1df52086be 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -2195,7 +2195,7 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
break;
}
} else {
- host_err = vfs_rmdir(&nop_mnt_idmap, dirp, rdentry);
+ host_err = vfs_rmdir(&nop_mnt_idmap, dirp, rdentry, NULL);
}
fh_fill_post_attrs(fhp);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 0f65f9a5d54d4786b39e4f4f30f416d5b9016e70..d215d7349489686b66bb66e939b27046f7d836f6 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -206,7 +206,7 @@ static inline int ovl_do_notify_change(struct ovl_fs *ofs,
static inline int ovl_do_rmdir(struct ovl_fs *ofs,
struct inode *dir, struct dentry *dentry)
{
- int err = vfs_rmdir(ovl_upper_mnt_idmap(ofs), dir, dentry);
+ int err = vfs_rmdir(ovl_upper_mnt_idmap(ofs), dir, dentry, NULL);
pr_debug("rmdir(%pd2) = %i\n", dentry, err);
return err;
diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c
index 3d2190f26623b23ea79c63410905a3c3ad684048..c5f0f3170d586cb2dc4d416b80948c642797fb82 100644
--- a/fs/smb/server/vfs.c
+++ b/fs/smb/server/vfs.c
@@ -609,7 +609,7 @@ int ksmbd_vfs_remove_file(struct ksmbd_work *work, const struct path *path)
idmap = mnt_idmap(path->mnt);
if (S_ISDIR(d_inode(path->dentry)->i_mode)) {
- err = vfs_rmdir(idmap, d_inode(parent), path->dentry);
+ err = vfs_rmdir(idmap, d_inode(parent), path->dentry, NULL);
if (err && err != -ENOTEMPTY)
ksmbd_debug(VFS, "rmdir failed, err %d\n", err);
} else {
@@ -1090,7 +1090,7 @@ int ksmbd_vfs_unlink(struct file *filp)
dget(dentry);
if (S_ISDIR(d_inode(dentry)->i_mode))
- err = vfs_rmdir(idmap, d_inode(dir), dentry);
+ err = vfs_rmdir(idmap, d_inode(dir), dentry, NULL);
else
err = vfs_unlink(idmap, d_inode(dir), dentry, NULL);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 1040df3792794cd353b86558b41618294e25b8a6..d8bdaf7c87502ff17775602f5391d375738b4ed8 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2120,7 +2120,8 @@ int vfs_symlink(struct mnt_idmap *, struct inode *,
struct dentry *, const char *);
int vfs_link(struct dentry *, struct mnt_idmap *, struct inode *,
struct dentry *, struct inode **);
-int vfs_rmdir(struct mnt_idmap *, struct inode *, struct dentry *);
+int vfs_rmdir(struct mnt_idmap *, struct inode *, struct dentry *,
+ struct inode **);
int vfs_unlink(struct mnt_idmap *, struct inode *, struct dentry *,
struct inode **);
--
2.51.0
* [PATCH v3 05/13] vfs: break parent dir delegations in open(..., O_CREAT) codepath
From: Jeff Layton @ 2025-10-21 15:25 UTC
In order to add directory delegation support, we need to break
delegations on the parent whenever there is going to be a change in the
directory.
Add a delegated_inode parameter to lookup_open and have it break the
delegation. Then, open_last_lookups can wait for the delegation break
and retry the call to lookup_open once it's done.
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
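As a small aid for reviewing the lookup_open() hunk below: the new
try_break_deleg() call sits inside the branch that handles a negative dentry
with O_CREAT set, so opening an existing file does not recall the parent's
delegation on this path. The helper below is hypothetical and only restates
that condition in userspace terms; it says nothing about the atomic_open
path.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/*
 * Hypothetical helper mirroring the condition that guards the new
 * break: the target does not exist yet and the caller asked for it
 * to be created.
 */
static bool demo_open_needs_parent_break(bool target_exists, int open_flag)
{
	return !target_exists && (open_flag & O_CREAT) != 0;
}

/* Small self-check showing the three interesting cases. */
static void demo_self_check(void)
{
	/* creating a new file changes the directory */
	assert(demo_open_needs_parent_break(false, O_CREAT | O_WRONLY));
	/* O_CREAT on an existing file creates nothing */
	assert(!demo_open_needs_parent_break(true, O_CREAT | O_WRONLY));
	/* a plain open never creates anything */
	assert(!demo_open_needs_parent_break(false, O_RDONLY));
}
```

Only the first case in demo_self_check() would lead to a recall of the parent
directory's delegation in this model.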
fs/namei.c | 22 ++++++++++++++++++----
1 file changed, 18 insertions(+), 4 deletions(-)
diff --git a/fs/namei.c b/fs/namei.c
index 4b5a99653c558397e592715d9d4663cd4a63ef86..786f42bd184b5dbf6d754fa1fb6c94c0f75429f2 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3697,7 +3697,7 @@ static struct dentry *atomic_open(struct nameidata *nd, struct dentry *dentry,
*/
static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
const struct open_flags *op,
- bool got_write)
+ bool got_write, struct inode **delegated_inode)
{
struct mnt_idmap *idmap;
struct dentry *dir = nd->path.dentry;
@@ -3786,6 +3786,11 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
/* Negative dentry, just create the file */
if (!dentry->d_inode && (open_flag & O_CREAT)) {
+ /* but break the directory lease first! */
+ error = try_break_deleg(dir_inode, delegated_inode);
+ if (error)
+ goto out_dput;
+
file->f_mode |= FMODE_CREATED;
audit_inode_child(dir_inode, dentry, AUDIT_TYPE_CHILD_CREATE);
if (!dir_inode->i_op->create) {
@@ -3849,6 +3854,7 @@ static const char *open_last_lookups(struct nameidata *nd,
struct file *file, const struct open_flags *op)
{
struct dentry *dir = nd->path.dentry;
+ struct inode *delegated_inode = NULL;
int open_flag = op->open_flag;
bool got_write = false;
struct dentry *dentry;
@@ -3879,7 +3885,7 @@ static const char *open_last_lookups(struct nameidata *nd,
return ERR_PTR(-ECHILD);
}
}
-
+retry:
if (open_flag & (O_CREAT | O_TRUNC | O_WRONLY | O_RDWR)) {
got_write = !mnt_want_write(nd->path.mnt);
/*
@@ -3892,7 +3898,7 @@ static const char *open_last_lookups(struct nameidata *nd,
inode_lock(dir->d_inode);
else
inode_lock_shared(dir->d_inode);
- dentry = lookup_open(nd, file, op, got_write);
+ dentry = lookup_open(nd, file, op, got_write, &delegated_inode);
if (!IS_ERR(dentry)) {
if (file->f_mode & FMODE_CREATED)
fsnotify_create(dir->d_inode, dentry);
@@ -3907,8 +3913,16 @@ static const char *open_last_lookups(struct nameidata *nd,
if (got_write)
mnt_drop_write(nd->path.mnt);
- if (IS_ERR(dentry))
+ if (IS_ERR(dentry)) {
+ if (delegated_inode) {
+ int error = break_deleg_wait(&delegated_inode);
+
+ if (!error)
+ goto retry;
+ return ERR_PTR(error);
+ }
return ERR_CAST(dentry);
+ }
if (file->f_mode & (FMODE_OPENED | FMODE_CREATED)) {
dput(nd->path.dentry);
--
2.51.0
* [PATCH v3 06/13] vfs: make vfs_create break delegations on parent directory
2025-10-21 15:25 [PATCH v3 00/13] vfs: recall-only directory delegations for knfsd Jeff Layton
` (4 preceding siblings ...)
2025-10-21 15:25 ` [PATCH v3 05/13] vfs: break parent dir delegations in open(..., O_CREAT) codepath Jeff Layton
@ 2025-10-21 15:25 ` Jeff Layton
2025-10-29 13:23 ` Christian Brauner
2025-10-21 15:25 ` [PATCH v3 07/13] vfs: make vfs_mknod " Jeff Layton
` (8 subsequent siblings)
14 siblings, 1 reply; 26+ messages in thread
From: Jeff Layton @ 2025-10-21 15:25 UTC (permalink / raw)
In order to add directory delegation support, we need to break
delegations on the parent whenever there is going to be a change in the
directory.
Add a delegated_inode parameter to vfs_create. Most callers are
converted to pass in NULL, but do_mknodat() is changed to wait for a
delegation break if there is one.
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
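Not part of the patch: a small userspace sketch of the caller contract
described above, namely what a NULL delegated_inode means compared with a
real pointer. It rests on my understanding that try_break_deleg() only
records the inode when the pointer is non-NULL, so NULL callers simply see
-EWOULDBLOCK. The model_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the parent directory inode. */
struct model_inode {
	int delegated;  /* nonzero while a delegation is outstanding */
};

/* Models try_break_deleg(): only report the inode if the caller asked. */
static int model_try_break_deleg(struct model_inode *dir,
				 struct model_inode **delegated_inode)
{
	assert(dir != NULL);
	if (!dir->delegated)
		return 0;
	if (delegated_inode)
		*delegated_inode = dir;
	return -EWOULDBLOCK;
}

/* Models vfs_create() with the new trailing argument. */
static int model_vfs_create(struct model_inode *dir, int *created,
			    struct model_inode **delegated_inode)
{
	int error = model_try_break_deleg(dir, delegated_inode);

	if (error)
		return error;
	*created = 1;  /* stands in for dir->i_op->create() */
	return 0;
}
```

A caller that passes NULL cannot wait for the break and has to surface
the error; a caller that passes a pointer learns which inode to wait on.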
fs/ecryptfs/inode.c | 2 +-
fs/namei.c | 26 +++++++++++++++++++-------
fs/nfsd/nfs3proc.c | 2 +-
fs/nfsd/vfs.c | 3 +--
fs/open.c | 2 +-
fs/overlayfs/overlayfs.h | 2 +-
fs/smb/server/vfs.c | 2 +-
include/linux/fs.h | 2 +-
8 files changed, 26 insertions(+), 15 deletions(-)
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 88631291b32535f623a3fbe4ea9b6ed48a306ca0..661709b157ce854c3bfdfdb13f7c10435fad9756 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -189,7 +189,7 @@ ecryptfs_do_create(struct inode *directory_inode,
rc = lock_parent(ecryptfs_dentry, &lower_dentry, &lower_dir);
if (!rc)
rc = vfs_create(&nop_mnt_idmap, lower_dir,
- lower_dentry, mode, true);
+ lower_dentry, mode, true, NULL);
if (rc) {
printk(KERN_ERR "%s: Failure to create dentry in lower fs; "
"rc = [%d]\n", __func__, rc);
diff --git a/fs/namei.c b/fs/namei.c
index 786f42bd184b5dbf6d754fa1fb6c94c0f75429f2..7510942e0249de19df4363b92f813b3acdfc2254 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3460,11 +3460,12 @@ static inline umode_t vfs_prepare_mode(struct mnt_idmap *idmap,
/**
* vfs_create - create new file
- * @idmap: idmap of the mount the inode was found from
- * @dir: inode of the parent directory
- * @dentry: dentry of the child file
- * @mode: mode of the child file
- * @want_excl: whether the file must not yet exist
+ * @idmap: idmap of the mount the inode was found from
+ * @dir: inode of the parent directory
+ * @dentry: dentry of the child file
+ * @mode: mode of the child file
+ * @want_excl: whether the file must not yet exist
+ * @delegated_inode: returns parent inode, if the inode is delegated.
*
* Create a new file.
*
@@ -3475,7 +3476,8 @@ static inline umode_t vfs_prepare_mode(struct mnt_idmap *idmap,
* raw inode simply pass @nop_mnt_idmap.
*/
int vfs_create(struct mnt_idmap *idmap, struct inode *dir,
- struct dentry *dentry, umode_t mode, bool want_excl)
+ struct dentry *dentry, umode_t mode, bool want_excl,
+ struct inode **delegated_inode)
{
int error;
@@ -3488,6 +3490,9 @@ int vfs_create(struct mnt_idmap *idmap, struct inode *dir,
mode = vfs_prepare_mode(idmap, dir, mode, S_IALLUGO, S_IFREG);
error = security_inode_create(dir, dentry, mode);
+ if (error)
+ return error;
+ error = try_break_deleg(dir, delegated_inode);
if (error)
return error;
error = dir->i_op->create(idmap, dir, dentry, mode, want_excl);
@@ -4365,6 +4370,7 @@ static int do_mknodat(int dfd, struct filename *name, umode_t mode,
struct path path;
int error;
unsigned int lookup_flags = 0;
+ struct inode *delegated_inode = NULL;
error = may_mknod(mode);
if (error)
@@ -4384,7 +4390,8 @@ static int do_mknodat(int dfd, struct filename *name, umode_t mode,
switch (mode & S_IFMT) {
case 0: case S_IFREG:
error = vfs_create(idmap, path.dentry->d_inode,
- dentry, mode, true);
+ dentry, mode, true,
+ &delegated_inode);
if (!error)
security_path_post_mknod(idmap, dentry);
break;
@@ -4399,6 +4406,11 @@ static int do_mknodat(int dfd, struct filename *name, umode_t mode,
}
out2:
end_creating_path(&path, dentry);
+ if (delegated_inode) {
+ error = break_deleg_wait(&delegated_inode);
+ if (!error)
+ goto retry;
+ }
if (retry_estale(error, lookup_flags)) {
lookup_flags |= LOOKUP_REVAL;
goto retry;
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index ad14b34583bb9fd7cd1e29f5f8676fa3442dd661..dbf70d184c0276d198599a22eb0953a2a1dde2c8 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -344,7 +344,7 @@ nfsd3_create_file(struct svc_rqst *rqstp, struct svc_fh *fhp,
status = fh_fill_pre_attrs(fhp);
if (status != nfs_ok)
goto out;
- host_err = vfs_create(&nop_mnt_idmap, inode, child, iap->ia_mode, true);
+ host_err = vfs_create(&nop_mnt_idmap, inode, child, iap->ia_mode, true, NULL);
if (host_err < 0) {
status = nfserrno(host_err);
goto out;
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 85afd2fad7e08b66b1a9ce372afdea1df52086be..7eaae44467188fab0909fabec986e103bcd52457 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1639,8 +1639,7 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
err = 0;
switch (type) {
case S_IFREG:
- host_err = vfs_create(&nop_mnt_idmap, dirp, dchild,
- iap->ia_mode, true);
+ host_err = vfs_create(&nop_mnt_idmap, dirp, dchild, iap->ia_mode, true, NULL);
if (!host_err)
nfsd_check_ignore_resizing(iap);
break;
diff --git a/fs/open.c b/fs/open.c
index 3d64372ecc675e4795eb0a0deda10f8f67b95640..4d98f8b52b98bc95e52cb247d14871ff6e4a1b5c 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1173,7 +1173,7 @@ struct file *dentry_create(const struct path *path, int flags, umode_t mode,
error = vfs_create(mnt_idmap(path->mnt),
d_inode(path->dentry->d_parent),
- path->dentry, mode, true);
+ path->dentry, mode, true, NULL);
if (!error)
error = vfs_open(path, f);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index d215d7349489686b66bb66e939b27046f7d836f6..d3123f5d97e86b58e4c9608cf6ef2abd1fcddbcd 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -235,7 +235,7 @@ static inline int ovl_do_create(struct ovl_fs *ofs,
struct inode *dir, struct dentry *dentry,
umode_t mode)
{
- int err = vfs_create(ovl_upper_mnt_idmap(ofs), dir, dentry, mode, true);
+ int err = vfs_create(ovl_upper_mnt_idmap(ofs), dir, dentry, mode, true, NULL);
pr_debug("create(%pd2, 0%o) = %i\n", dentry, mode, err);
return err;
diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c
index c5f0f3170d586cb2dc4d416b80948c642797fb82..be278bb6b71bab8aa41aed06a8806e7bc2de4cd3 100644
--- a/fs/smb/server/vfs.c
+++ b/fs/smb/server/vfs.c
@@ -189,7 +189,7 @@ int ksmbd_vfs_create(struct ksmbd_work *work, const char *name, umode_t mode)
mode |= S_IFREG;
err = vfs_create(mnt_idmap(path.mnt), d_inode(path.dentry),
- dentry, mode, true);
+ dentry, mode, true, NULL);
if (!err) {
ksmbd_vfs_inherit_owner(work, d_inode(path.dentry),
d_inode(dentry));
diff --git a/include/linux/fs.h b/include/linux/fs.h
index d8bdaf7c87502ff17775602f5391d375738b4ed8..5fcf64d9cf42ce135c0fbcbf6dfbf8816ae0bcb1 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2111,7 +2111,7 @@ bool inode_owner_or_capable(struct mnt_idmap *idmap,
* VFS helper functions..
*/
int vfs_create(struct mnt_idmap *, struct inode *,
- struct dentry *, umode_t, bool);
+ struct dentry *, umode_t, bool, struct inode **);
struct dentry *vfs_mkdir(struct mnt_idmap *, struct inode *,
struct dentry *, umode_t, struct inode **);
int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *,
--
2.51.0
* [PATCH v3 07/13] vfs: make vfs_mknod break delegations on parent directory
2025-10-21 15:25 [PATCH v3 00/13] vfs: recall-only directory delegations for knfsd Jeff Layton
` (5 preceding siblings ...)
2025-10-21 15:25 ` [PATCH v3 06/13] vfs: make vfs_create break delegations on parent directory Jeff Layton
@ 2025-10-21 15:25 ` Jeff Layton
2025-10-21 15:25 ` [PATCH v3 08/13] vfs: make vfs_symlink break delegations on parent dir Jeff Layton
` (7 subsequent siblings)
14 siblings, 0 replies; 26+ messages in thread
From: Jeff Layton @ 2025-10-21 15:25 UTC (permalink / raw)
In order to add directory delegation support, we need to break
delegations on the parent whenever there is going to be a change in the
directory.
Add a new delegated_inode return pointer to vfs_mknod() and have the
appropriate callers wait when there is an outstanding delegation. All
other callers just pass NULL for the pointer.
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
drivers/base/devtmpfs.c | 2 +-
fs/ecryptfs/inode.c | 2 +-
fs/init.c | 2 +-
fs/namei.c | 25 +++++++++++++++++--------
fs/nfsd/vfs.c | 2 +-
fs/overlayfs/overlayfs.h | 2 +-
include/linux/fs.h | 4 ++--
net/unix/af_unix.c | 2 +-
8 files changed, 25 insertions(+), 16 deletions(-)
diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
index 104025104ef75381984fd94dfbd50feeaa8cdd22..2f576ecf18324f767cd5ac6cbd28adbf9f46b958 100644
--- a/drivers/base/devtmpfs.c
+++ b/drivers/base/devtmpfs.c
@@ -231,7 +231,7 @@ static int handle_create(const char *nodename, umode_t mode, kuid_t uid,
return PTR_ERR(dentry);
err = vfs_mknod(&nop_mnt_idmap, d_inode(path.dentry), dentry, mode,
- dev->devt);
+ dev->devt, NULL);
if (!err) {
struct iattr newattrs;
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 661709b157ce854c3bfdfdb13f7c10435fad9756..639ae42bcd56890d04592f7269e4ffc099b44f09 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -565,7 +565,7 @@ ecryptfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
rc = lock_parent(dentry, &lower_dentry, &lower_dir);
if (!rc)
rc = vfs_mknod(&nop_mnt_idmap, lower_dir,
- lower_dentry, mode, dev);
+ lower_dentry, mode, dev, NULL);
if (rc || d_really_is_negative(lower_dentry))
goto out;
rc = ecryptfs_interpose(lower_dentry, dentry, dir->i_sb);
diff --git a/fs/init.c b/fs/init.c
index 895f8a09a71acfd03e11164e3b441a7d4e2de146..4f02260dd65b0dfcbfbf5812d2ec6a33444a3b1f 100644
--- a/fs/init.c
+++ b/fs/init.c
@@ -157,7 +157,7 @@ int __init init_mknod(const char *filename, umode_t mode, unsigned int dev)
error = security_path_mknod(&path, dentry, mode, dev);
if (!error)
error = vfs_mknod(mnt_idmap(path.mnt), path.dentry->d_inode,
- dentry, mode, new_decode_dev(dev));
+ dentry, mode, new_decode_dev(dev), NULL);
end_creating_path(&path, dentry);
return error;
}
diff --git a/fs/namei.c b/fs/namei.c
index 7510942e0249de19df4363b92f813b3acdfc2254..7e400cbdbc6af1c72eb684f051d0571e944a27d7 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4297,13 +4297,15 @@ inline struct dentry *start_creating_user_path(
}
EXPORT_SYMBOL(start_creating_user_path);
+
/**
* vfs_mknod - create device node or file
- * @idmap: idmap of the mount the inode was found from
- * @dir: inode of the parent directory
- * @dentry: dentry of the child device node
- * @mode: mode of the child device node
- * @dev: device number of device to create
+ * @idmap: idmap of the mount the inode was found from
+ * @dir: inode of the parent directory
+ * @dentry: dentry of the child device node
+ * @mode: mode of the child device node
+ * @dev: device number of device to create
+ * @delegated_inode: returns parent inode, if the inode is delegated.
*
* Create a device node or file.
*
@@ -4314,7 +4316,8 @@ EXPORT_SYMBOL(start_creating_user_path);
* raw inode simply pass @nop_mnt_idmap.
*/
int vfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
- struct dentry *dentry, umode_t mode, dev_t dev)
+ struct dentry *dentry, umode_t mode, dev_t dev,
+ struct inode **delegated_inode)
{
bool is_whiteout = S_ISCHR(mode) && dev == WHITEOUT_DEV;
int error = may_create(idmap, dir, dentry);
@@ -4338,6 +4341,10 @@ int vfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
if (error)
return error;
+ error = try_break_deleg(dir, delegated_inode);
+ if (error)
+ return error;
+
error = dir->i_op->mknod(idmap, dir, dentry, mode, dev);
if (!error)
fsnotify_create(dir, dentry);
@@ -4397,11 +4404,13 @@ static int do_mknodat(int dfd, struct filename *name, umode_t mode,
break;
case S_IFCHR: case S_IFBLK:
error = vfs_mknod(idmap, path.dentry->d_inode,
- dentry, mode, new_decode_dev(dev));
+ dentry, mode, new_decode_dev(dev),
+ &delegated_inode);
break;
case S_IFIFO: case S_IFSOCK:
error = vfs_mknod(idmap, path.dentry->d_inode,
- dentry, mode, 0);
+ dentry, mode, 0,
+ &delegated_inode);
break;
}
out2:
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 7eaae44467188fab0909fabec986e103bcd52457..44debf3d0be450ddc245e2fa4f57fe076e1454a2 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1660,7 +1660,7 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
case S_IFIFO:
case S_IFSOCK:
host_err = vfs_mknod(&nop_mnt_idmap, dirp, dchild,
- iap->ia_mode, rdev);
+ iap->ia_mode, rdev, NULL);
break;
default:
printk(KERN_WARNING "nfsd: bad file type %o in nfsd_create\n",
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index d3123f5d97e86b58e4c9608cf6ef2abd1fcddbcd..87b82dada7ec1b8429299c68078cda24176c5607 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -257,7 +257,7 @@ static inline int ovl_do_mknod(struct ovl_fs *ofs,
struct inode *dir, struct dentry *dentry,
umode_t mode, dev_t dev)
{
- int err = vfs_mknod(ovl_upper_mnt_idmap(ofs), dir, dentry, mode, dev);
+ int err = vfs_mknod(ovl_upper_mnt_idmap(ofs), dir, dentry, mode, dev, NULL);
pr_debug("mknod(%pd2, 0%o, 0%o) = %i\n", dentry, mode, dev, err);
return err;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5fcf64d9cf42ce135c0fbcbf6dfbf8816ae0bcb1..a1e1afe39e01a46bf0a81e241b92690947402851 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2115,7 +2115,7 @@ int vfs_create(struct mnt_idmap *, struct inode *,
struct dentry *vfs_mkdir(struct mnt_idmap *, struct inode *,
struct dentry *, umode_t, struct inode **);
int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *,
- umode_t, dev_t);
+ umode_t, dev_t, struct inode **);
int vfs_symlink(struct mnt_idmap *, struct inode *,
struct dentry *, const char *);
int vfs_link(struct dentry *, struct mnt_idmap *, struct inode *,
@@ -2151,7 +2151,7 @@ static inline int vfs_whiteout(struct mnt_idmap *idmap,
struct inode *dir, struct dentry *dentry)
{
return vfs_mknod(idmap, dir, dentry, S_IFCHR | WHITEOUT_MODE,
- WHITEOUT_DEV);
+ WHITEOUT_DEV, NULL);
}
struct file *kernel_tmpfile_open(struct mnt_idmap *idmap,
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 768098dec2310008632558ae928703b37c3cc8ef..db1fd8d6a84c2c7c0d45b43d9c5a936b3d491b7b 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1399,7 +1399,7 @@ static int unix_bind_bsd(struct sock *sk, struct sockaddr_un *sunaddr,
idmap = mnt_idmap(parent.mnt);
err = security_path_mknod(&parent, dentry, mode, 0);
if (!err)
- err = vfs_mknod(idmap, d_inode(parent.dentry), dentry, mode, 0);
+ err = vfs_mknod(idmap, d_inode(parent.dentry), dentry, mode, 0, NULL);
if (err)
goto out_path;
err = mutex_lock_interruptible(&u->bindlock);
--
2.51.0
* [PATCH v3 08/13] vfs: make vfs_symlink break delegations on parent dir
2025-10-21 15:25 [PATCH v3 00/13] vfs: recall-only directory delegations for knfsd Jeff Layton
` (6 preceding siblings ...)
2025-10-21 15:25 ` [PATCH v3 07/13] vfs: make vfs_mknod " Jeff Layton
@ 2025-10-21 15:25 ` Jeff Layton
2025-10-22 9:01 ` Jan Kara
2025-10-21 15:25 ` [PATCH v3 09/13] filelock: lift the ban on directory leases in generic_setlease Jeff Layton
` (6 subsequent siblings)
14 siblings, 1 reply; 26+ messages in thread
From: Jeff Layton @ 2025-10-21 15:25 UTC (permalink / raw)
In order to add directory delegation support, we must break delegations
on the parent on any change to the directory.
Add a delegated_inode parameter to vfs_symlink() and have it break the
delegation. do_symlinkat() can then wait on the delegation break before
proceeding.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
fs/ecryptfs/inode.c | 2 +-
fs/init.c | 2 +-
fs/namei.c | 16 ++++++++++++++--
fs/nfsd/vfs.c | 2 +-
fs/overlayfs/overlayfs.h | 2 +-
include/linux/fs.h | 2 +-
6 files changed, 19 insertions(+), 7 deletions(-)
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 639ae42bcd56890d04592f7269e4ffc099b44f09..d430ec5a63094ea4cd42828e7d44f0f8d918fcec 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -480,7 +480,7 @@ static int ecryptfs_symlink(struct mnt_idmap *idmap,
if (rc)
goto out_lock;
rc = vfs_symlink(&nop_mnt_idmap, lower_dir, lower_dentry,
- encoded_symname);
+ encoded_symname, NULL);
kfree(encoded_symname);
if (rc || d_really_is_negative(lower_dentry))
goto out_lock;
diff --git a/fs/init.c b/fs/init.c
index 4f02260dd65b0dfcbfbf5812d2ec6a33444a3b1f..e0f5429c0a49d046bd3f231a260954ed0f90ef44 100644
--- a/fs/init.c
+++ b/fs/init.c
@@ -209,7 +209,7 @@ int __init init_symlink(const char *oldname, const char *newname)
error = security_path_symlink(&path, dentry, oldname);
if (!error)
error = vfs_symlink(mnt_idmap(path.mnt), path.dentry->d_inode,
- dentry, oldname);
+ dentry, oldname, NULL);
end_creating_path(&path, dentry);
return error;
}
diff --git a/fs/namei.c b/fs/namei.c
index 7e400cbdbc6af1c72eb684f051d0571e944a27d7..71af256cdd941e200389570538f64a3f795e6c83 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4851,6 +4851,7 @@ SYSCALL_DEFINE1(unlink, const char __user *, pathname)
* @dir: inode of the parent directory
* @dentry: dentry of the child symlink file
* @oldname: name of the file to link to
+ * @delegated_inode: returns parent inode, if the inode is delegated.
*
* Create a symlink.
*
@@ -4861,7 +4862,8 @@ SYSCALL_DEFINE1(unlink, const char __user *, pathname)
* raw inode simply pass @nop_mnt_idmap.
*/
int vfs_symlink(struct mnt_idmap *idmap, struct inode *dir,
- struct dentry *dentry, const char *oldname)
+ struct dentry *dentry, const char *oldname,
+ struct inode **delegated_inode)
{
int error;
@@ -4876,6 +4878,10 @@ int vfs_symlink(struct mnt_idmap *idmap, struct inode *dir,
if (error)
return error;
+ error = try_break_deleg(dir, delegated_inode);
+ if (error)
+ return error;
+
error = dir->i_op->symlink(idmap, dir, dentry, oldname);
if (!error)
fsnotify_create(dir, dentry);
@@ -4889,6 +4895,7 @@ int do_symlinkat(struct filename *from, int newdfd, struct filename *to)
struct dentry *dentry;
struct path path;
unsigned int lookup_flags = 0;
+ struct inode *delegated_inode = NULL;
if (IS_ERR(from)) {
error = PTR_ERR(from);
@@ -4903,8 +4910,13 @@ int do_symlinkat(struct filename *from, int newdfd, struct filename *to)
error = security_path_symlink(&path, dentry, from->name);
if (!error)
error = vfs_symlink(mnt_idmap(path.mnt), path.dentry->d_inode,
- dentry, from->name);
+ dentry, from->name, &delegated_inode);
end_creating_path(&path, dentry);
+ if (delegated_inode) {
+ error = break_deleg_wait(&delegated_inode);
+ if (!error)
+ goto retry;
+ }
if (retry_estale(error, lookup_flags)) {
lookup_flags |= LOOKUP_REVAL;
goto retry;
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 44debf3d0be450ddc245e2fa4f57fe076e1454a2..386f454badce7ed448399ef93e9c8edafbcc4d79 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1829,7 +1829,7 @@ nfsd_symlink(struct svc_rqst *rqstp, struct svc_fh *fhp,
err = fh_fill_pre_attrs(fhp);
if (err != nfs_ok)
goto out_unlock;
- host_err = vfs_symlink(&nop_mnt_idmap, d_inode(dentry), dnew, path);
+ host_err = vfs_symlink(&nop_mnt_idmap, d_inode(dentry), dnew, path, NULL);
err = nfserrno(host_err);
cerr = fh_compose(resfhp, fhp->fh_export, dnew, fhp);
if (!err)
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 87b82dada7ec1b8429299c68078cda24176c5607..94bb4540f7ae2e0571b3b88393c180bd73c3c09c 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -267,7 +267,7 @@ static inline int ovl_do_symlink(struct ovl_fs *ofs,
struct inode *dir, struct dentry *dentry,
const char *oldname)
{
- int err = vfs_symlink(ovl_upper_mnt_idmap(ofs), dir, dentry, oldname);
+ int err = vfs_symlink(ovl_upper_mnt_idmap(ofs), dir, dentry, oldname, NULL);
pr_debug("symlink(\"%s\", %pd2) = %i\n", oldname, dentry, err);
return err;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index a1e1afe39e01a46bf0a81e241b92690947402851..d8c7245da3bf3200b435c7ea6cafcf7903ebf293 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2117,7 +2117,7 @@ struct dentry *vfs_mkdir(struct mnt_idmap *, struct inode *,
int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *,
umode_t, dev_t, struct inode **);
int vfs_symlink(struct mnt_idmap *, struct inode *,
- struct dentry *, const char *);
+ struct dentry *, const char *, struct inode **);
int vfs_link(struct dentry *, struct mnt_idmap *, struct inode *,
struct dentry *, struct inode **);
int vfs_rmdir(struct mnt_idmap *, struct inode *, struct dentry *,
--
2.51.0
* [PATCH v3 09/13] filelock: lift the ban on directory leases in generic_setlease
2025-10-21 15:25 [PATCH v3 00/13] vfs: recall-only directory delegations for knfsd Jeff Layton
` (7 preceding siblings ...)
2025-10-21 15:25 ` [PATCH v3 08/13] vfs: make vfs_symlink break delegations on parent dir Jeff Layton
@ 2025-10-21 15:25 ` Jeff Layton
2025-10-22 9:03 ` Jan Kara
2025-10-21 15:25 ` [PATCH v3 10/13] nfsd: allow filecache to hold S_IFDIR files Jeff Layton
` (5 subsequent siblings)
14 siblings, 1 reply; 26+ messages in thread
From: Jeff Layton @ 2025-10-21 15:25 UTC (permalink / raw)
With the addition of the try_break_deleg calls in directory changing
operations, allow generic_setlease to hand out leases on directories.
Write leases on directories are never allowed, however, so continue to
reject them.
For now, there is no API for requesting delegations from userland, so
ensure that userland is prevented from acquiring a lease on a directory.
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
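Not part of the patch: a userspace sketch of the acceptance rules
described above, covering only the type and argument checks. The enums
and model_* functions are hypothetical; the default case returning
-EINVAL is an assumption about code outside the hunk, and the lm_break
sanity check is left out.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical file types and lease arguments for the model. */
enum model_type { MODEL_REG, MODEL_DIR, MODEL_FIFO };
enum model_arg { MODEL_RDLCK, MODEL_WRLCK, MODEL_UNLCK };

/* Models the checks in generic_setlease() after this patch. */
static int model_generic_setlease(enum model_type type, int arg)
{
	if (type != MODEL_REG && type != MODEL_DIR)
		return -EINVAL;

	switch (arg) {
	case MODEL_UNLCK:
		return 0;
	case MODEL_WRLCK:
		/* write leases on directories are never allowed */
		if (type == MODEL_DIR)
			return -EINVAL;
		/* fall through */
	case MODEL_RDLCK:
		return 0;
	default:
		return -EINVAL;  /* assumed; not visible in the hunk */
	}
}

/* Models fcntl_setlease(): userland may not lease a directory at all. */
static int model_fcntl_setlease(enum model_type type, int arg)
{
	assert(arg >= 0);
	if (type == MODEL_DIR)
		return -EINVAL;
	return model_generic_setlease(type, arg);
}
```

In-kernel callers such as nfsd reach the first function and can take a
read lease on a directory; the fcntl() path is cut off one level higher.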
fs/locks.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/fs/locks.c b/fs/locks.c
index 0b16921fb52e602ea2e0c3de39d9d772af98ba7d..b47552106769ec5a189babfe12518e34aa59c759 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1929,14 +1929,19 @@ static int generic_delete_lease(struct file *filp, void *owner)
int generic_setlease(struct file *filp, int arg, struct file_lease **flp,
void **priv)
{
- if (!S_ISREG(file_inode(filp)->i_mode))
+ struct inode *inode = file_inode(filp);
+
+ if (!S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode))
return -EINVAL;
switch (arg) {
case F_UNLCK:
return generic_delete_lease(filp, *priv);
- case F_RDLCK:
case F_WRLCK:
+ if (S_ISDIR(inode->i_mode))
+ return -EINVAL;
+ fallthrough;
+ case F_RDLCK:
if (!(*flp)->fl_lmops->lm_break) {
WARN_ON_ONCE(1);
return -ENOLCK;
@@ -2065,6 +2070,9 @@ static int do_fcntl_add_lease(unsigned int fd, struct file *filp, int arg)
*/
int fcntl_setlease(unsigned int fd, struct file *filp, int arg)
{
+ if (S_ISDIR(file_inode(filp)->i_mode))
+ return -EINVAL;
+
if (arg == F_UNLCK)
return vfs_setlease(filp, F_UNLCK, NULL, (void **)&filp);
return do_fcntl_add_lease(fd, filp, arg);
--
2.51.0
* [PATCH v3 10/13] nfsd: allow filecache to hold S_IFDIR files
2025-10-21 15:25 [PATCH v3 00/13] vfs: recall-only directory delegations for knfsd Jeff Layton
` (8 preceding siblings ...)
2025-10-21 15:25 ` [PATCH v3 09/13] filelock: lift the ban on directory leases in generic_setlease Jeff Layton
@ 2025-10-21 15:25 ` Jeff Layton
2025-10-21 15:25 ` [PATCH v3 11/13] nfsd: allow DELEGRETURN on directories Jeff Layton
` (4 subsequent siblings)
14 siblings, 0 replies; 26+ messages in thread
From: Jeff Layton @ 2025-10-21 15:25 UTC (permalink / raw)
The filecache infrastructure currently handles only S_IFREG files.
Directory delegations will require adding support for opening
S_IFDIR inodes.
Plumb a "type" argument into nfsd_file_do_acquire() and have all of the
existing callers set it to S_IFREG. Add a new nfsd_file_acquire_dir()
wrapper that nfsd can call to request an nfsd_file that holds a directory
open.
For now, there is no need for an fsnotify_mark for directories, as
CB_NOTIFY is not yet supported. Change nfsd_file_do_acquire() to avoid
allocating one for non-S_IFREG inodes.
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
fs/nfsd/filecache.c | 57 ++++++++++++++++++++++++++++++++++++++++-------------
fs/nfsd/filecache.h | 2 ++
fs/nfsd/vfs.c | 5 +++--
fs/nfsd/vfs.h | 2 +-
4 files changed, 49 insertions(+), 17 deletions(-)
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index a238b6725008a5c2988bd3da874d1f34ee778437..93798575b8075c63f95cd415b6d24df706ada0f6 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -1086,7 +1086,7 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct net *net,
struct auth_domain *client,
struct svc_fh *fhp,
unsigned int may_flags, struct file *file,
- struct nfsd_file **pnf, bool want_gc)
+ umode_t type, bool want_gc, struct nfsd_file **pnf)
{
unsigned char need = may_flags & NFSD_FILE_MAY_MASK;
struct nfsd_file *new, *nf;
@@ -1097,13 +1097,13 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct net *net,
int ret;
retry:
- if (rqstp) {
- status = fh_verify(rqstp, fhp, S_IFREG,
+ if (rqstp)
+ status = fh_verify(rqstp, fhp, type,
may_flags|NFSD_MAY_OWNER_OVERRIDE);
- } else {
- status = fh_verify_local(net, cred, client, fhp, S_IFREG,
+ else
+ status = fh_verify_local(net, cred, client, fhp, type,
may_flags|NFSD_MAY_OWNER_OVERRIDE);
- }
+
if (status != nfs_ok)
return status;
inode = d_inode(fhp->fh_dentry);
@@ -1176,15 +1176,18 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct net *net,
open_file:
trace_nfsd_file_alloc(nf);
- nf->nf_mark = nfsd_file_mark_find_or_create(inode);
- if (nf->nf_mark) {
+
+ if (type == S_IFREG)
+ nf->nf_mark = nfsd_file_mark_find_or_create(inode);
+
+ if (type != S_IFREG || nf->nf_mark) {
if (file) {
get_file(file);
nf->nf_file = file;
status = nfs_ok;
trace_nfsd_file_opened(nf, status);
} else {
- ret = nfsd_open_verified(fhp, may_flags, &nf->nf_file);
+ ret = nfsd_open_verified(fhp, type, may_flags, &nf->nf_file);
if (ret == -EOPENSTALE && stale_retry) {
stale_retry = false;
nfsd_file_unhash(nf);
@@ -1246,7 +1249,7 @@ nfsd_file_acquire_gc(struct svc_rqst *rqstp, struct svc_fh *fhp,
unsigned int may_flags, struct nfsd_file **pnf)
{
return nfsd_file_do_acquire(rqstp, SVC_NET(rqstp), NULL, NULL,
- fhp, may_flags, NULL, pnf, true);
+ fhp, may_flags, NULL, S_IFREG, true, pnf);
}
/**
@@ -1271,7 +1274,7 @@ nfsd_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
unsigned int may_flags, struct nfsd_file **pnf)
{
return nfsd_file_do_acquire(rqstp, SVC_NET(rqstp), NULL, NULL,
- fhp, may_flags, NULL, pnf, false);
+ fhp, may_flags, NULL, S_IFREG, false, pnf);
}
/**
@@ -1314,8 +1317,8 @@ nfsd_file_acquire_local(struct net *net, struct svc_cred *cred,
const struct cred *save_cred = get_current_cred();
__be32 beres;
- beres = nfsd_file_do_acquire(NULL, net, cred, client,
- fhp, may_flags, NULL, pnf, false);
+ beres = nfsd_file_do_acquire(NULL, net, cred, client, fhp, may_flags,
+ NULL, S_IFREG, false, pnf);
put_cred(revert_creds(save_cred));
return beres;
}
@@ -1344,7 +1347,33 @@ nfsd_file_acquire_opened(struct svc_rqst *rqstp, struct svc_fh *fhp,
struct nfsd_file **pnf)
{
return nfsd_file_do_acquire(rqstp, SVC_NET(rqstp), NULL, NULL,
- fhp, may_flags, file, pnf, false);
+ fhp, may_flags, file, S_IFREG, false, pnf);
+}
+
+/**
+ * nfsd_file_acquire_dir - Get a struct nfsd_file with an open directory
+ * @rqstp: the RPC transaction being executed
+ * @fhp: the NFS filehandle of the file to be opened
+ * @pnf: OUT: new or found "struct nfsd_file" object
+ *
+ * The nfsd_file_object returned by this API is reference-counted
+ * but not garbage-collected. The object is unhashed after the
+ * final nfsd_file_put(). This opens directories only, and only
+ * in O_RDONLY mode.
+ *
+ * Return values:
+ * %nfs_ok - @pnf points to an nfsd_file with its reference
+ * count boosted.
+ *
+ * On error, an nfsstat value in network byte order is returned.
+ */
+__be32
+nfsd_file_acquire_dir(struct svc_rqst *rqstp, struct svc_fh *fhp,
+ struct nfsd_file **pnf)
+{
+ return nfsd_file_do_acquire(rqstp, SVC_NET(rqstp), NULL, NULL, fhp,
+ NFSD_MAY_READ|NFSD_MAY_64BIT_COOKIE,
+ NULL, S_IFDIR, false, pnf);
}
/*
diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
index e3d6ca2b60308e5e91ba4bb32d935f54527d8bda..b383dbc5b9218d21a29b852572f80fab08de9fa9 100644
--- a/fs/nfsd/filecache.h
+++ b/fs/nfsd/filecache.h
@@ -82,5 +82,7 @@ __be32 nfsd_file_acquire_opened(struct svc_rqst *rqstp, struct svc_fh *fhp,
__be32 nfsd_file_acquire_local(struct net *net, struct svc_cred *cred,
struct auth_domain *client, struct svc_fh *fhp,
unsigned int may_flags, struct nfsd_file **pnf);
+__be32 nfsd_file_acquire_dir(struct svc_rqst *rqstp, struct svc_fh *fhp,
+ struct nfsd_file **pnf);
int nfsd_file_cache_stats_show(struct seq_file *m, void *v);
#endif /* _FS_NFSD_FILECACHE_H */
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 386f454badce7ed448399ef93e9c8edafbcc4d79..f1c6b6e87d84fa6e1923b44a89baf5183ede54b8 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -959,15 +959,16 @@ nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
/**
* nfsd_open_verified - Open a regular file for the filecache
* @fhp: NFS filehandle of the file to open
+ * @type: S_IFMT inode type allowed (0 means any type is allowed)
* @may_flags: internal permission flags
* @filp: OUT: open "struct file *"
*
* Returns zero on success, or a negative errno value.
*/
int
-nfsd_open_verified(struct svc_fh *fhp, int may_flags, struct file **filp)
+nfsd_open_verified(struct svc_fh *fhp, umode_t type, int may_flags, struct file **filp)
{
- return __nfsd_open(fhp, S_IFREG, may_flags, filp);
+ return __nfsd_open(fhp, type, may_flags, filp);
}
/*
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index c713ed0b04e0311ab606c5c456c8ce92dd506cce..12309426410f923492e73f6867cd6597a9c0a097 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -114,7 +114,7 @@ __be32 nfsd_setxattr(struct svc_rqst *rqstp, struct svc_fh *fhp,
int nfsd_open_break_lease(struct inode *, int);
__be32 nfsd_open(struct svc_rqst *, struct svc_fh *, umode_t,
int, struct file **);
-int nfsd_open_verified(struct svc_fh *fhp, int may_flags,
+int nfsd_open_verified(struct svc_fh *fhp, umode_t type, int may_flags,
struct file **filp);
__be32 nfsd_splice_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
struct file *file, loff_t offset,
--
2.51.0
* [PATCH v3 11/13] nfsd: allow DELEGRETURN on directories
2025-10-21 15:25 [PATCH v3 00/13] vfs: recall-only directory delegations for knfsd Jeff Layton
` (9 preceding siblings ...)
2025-10-21 15:25 ` [PATCH v3 10/13] nfsd: allow filecache to hold S_IFDIR files Jeff Layton
@ 2025-10-21 15:25 ` Jeff Layton
2025-10-21 15:25 ` [PATCH v3 12/13] nfsd: wire up GET_DIR_DELEGATION handling Jeff Layton
` (3 subsequent siblings)
14 siblings, 0 replies; 26+ messages in thread
From: Jeff Layton @ 2025-10-21 15:25 UTC (permalink / raw)
To: Miklos Szeredi, Alexander Viro, Christian Brauner, Jan Kara,
Chuck Lever, Alexander Aring, Trond Myklebust, Anna Schumaker,
Steve French, Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N,
Tom Talpey, Bharath SM, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, David Howells, Tyler Hicks, NeilBrown,
Olga Kornievskaia, Dai Ngo, Amir Goldstein, Namjae Jeon,
Steve French, Sergey Senozhatsky, Carlos Maiolino,
Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman
Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
samba-technical, netfs, ecryptfs, linux-unionfs, linux-xfs,
netdev, Jeff Layton
As Trond pointed out: "...provided that the presented stateid is
actually valid, it is also sufficient to uniquely identify the file to
which it is associated (see RFC8881 Section 8.2.4), so the filehandle
should be considered mostly irrelevant for operations like DELEGRETURN."
Don't ask fh_verify to filter on file type.
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
fs/nfsd/nfs4state.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 35004568d43eb27254802f6f5784a3c04c20fe08..8efa37055b21ca2202488e90377d5162613b9343 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -7832,7 +7832,8 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
__be32 status;
struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
- if ((status = fh_verify(rqstp, &cstate->current_fh, S_IFREG, 0)))
+ status = fh_verify(rqstp, &cstate->current_fh, 0, 0);
+ if (status)
return status;
status = nfsd4_lookup_stateid(cstate, stateid, SC_TYPE_DELEG, SC_STATUS_REVOKED, &s, nn);
--
2.51.0
* [PATCH v3 12/13] nfsd: wire up GET_DIR_DELEGATION handling
2025-10-21 15:25 [PATCH v3 00/13] vfs: recall-only directory delegations for knfsd Jeff Layton
` (10 preceding siblings ...)
2025-10-21 15:25 ` [PATCH v3 11/13] nfsd: allow DELEGRETURN on directories Jeff Layton
@ 2025-10-21 15:25 ` Jeff Layton
2025-10-21 16:16 ` Chuck Lever
2025-10-21 15:25 ` [PATCH v3 13/13] vfs: expose delegation support to userland Jeff Layton
` (2 subsequent siblings)
14 siblings, 1 reply; 26+ messages in thread
From: Jeff Layton @ 2025-10-21 15:25 UTC (permalink / raw)
To: Miklos Szeredi, Alexander Viro, Christian Brauner, Jan Kara,
Chuck Lever, Alexander Aring, Trond Myklebust, Anna Schumaker,
Steve French, Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N,
Tom Talpey, Bharath SM, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, David Howells, Tyler Hicks, NeilBrown,
Olga Kornievskaia, Dai Ngo, Amir Goldstein, Namjae Jeon,
Steve French, Sergey Senozhatsky, Carlos Maiolino,
Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman
Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
samba-technical, netfs, ecryptfs, linux-unionfs, linux-xfs,
netdev, Jeff Layton
Add a new routine for acquiring a read delegation on a directory. These
are recallable-only delegations with no support for CB_NOTIFY. That will
be added in a later phase.
Since the same CB_RECALL/DELEGRETURN infrastructure is used for regular
and directory delegations, a normal nfs4_delegation is used to represent
a directory delegation.
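The handler in this patch distinguishes a non-fatal "unavailable" result
from a real failure: -EAGAIN becomes a successful operation carrying
GDD4_UNAVAIL, and any other errno fails the operation. A minimal
user-space sketch of that mapping follows. All names prefixed with
model_ or MODEL_ are hypothetical and exist only for illustration; they
are not kernel or protocol definitions.

```c
#include <errno.h>

/* Hypothetical stand-ins for the non-fatal GET_DIR_DELEGATION statuses. */
enum model_gdd_status { MODEL_GDD_OK, MODEL_GDD_UNAVAIL };

struct model_gdd_result {
	int fatal_errno;              /* 0 when the operation itself succeeds */
	enum model_gdd_status status; /* meaningful only when fatal_errno == 0 */
};

/*
 * Map the outcome of a delegation attempt (0 or a negative errno):
 *   0        -> success, delegation granted
 *   -EAGAIN  -> success, but no delegation is available
 *   other    -> the operation fails with that errno
 */
static struct model_gdd_result model_map_deleg_result(int err)
{
	struct model_gdd_result res = { 0, MODEL_GDD_OK };

	if (err == 0)
		return res;

	res.status = MODEL_GDD_UNAVAIL;
	if (err != -EAGAIN)
		res.fatal_errno = err;
	return res;
}
```

Keeping the "already delegated" and "lost a race" cases on the -EAGAIN
path means the client sees NFS4_OK with an unavailable status rather
than an error for the whole operation.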
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
fs/nfsd/nfs4proc.c | 22 +++++++++++-
fs/nfsd/nfs4state.c | 100 ++++++++++++++++++++++++++++++++++++++++++++++++++++
fs/nfsd/state.h | 5 +++
3 files changed, 126 insertions(+), 1 deletion(-)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 2222bb283baff35703b4035fa0fc593b54d8b937..4f0b1210702ecf4eaa20c74e548aabbee33b7fd1 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -2342,6 +2342,13 @@ nfsd4_get_dir_delegation(struct svc_rqst *rqstp,
union nfsd4_op_u *u)
{
struct nfsd4_get_dir_delegation *gdd = &u->get_dir_delegation;
+ struct nfs4_delegation *dd;
+ struct nfsd_file *nf;
+ __be32 status;
+
+ status = nfsd_file_acquire_dir(rqstp, &cstate->current_fh, &nf);
+ if (status != nfs_ok)
+ return status;
/*
* RFC 8881, section 18.39.3 says:
@@ -2355,7 +2362,20 @@ nfsd4_get_dir_delegation(struct svc_rqst *rqstp,
* return NFS4_OK with a non-fatal status of GDD4_UNAVAIL in this
* situation.
*/
- gdd->gddrnf_status = GDD4_UNAVAIL;
+ dd = nfsd_get_dir_deleg(cstate, gdd, nf);
+ nfsd_file_put(nf);
+ if (IS_ERR(dd)) {
+ int err = PTR_ERR(dd);
+
+ if (err != -EAGAIN)
+ return nfserrno(err);
+ gdd->gddrnf_status = GDD4_UNAVAIL;
+ return nfs_ok;
+ }
+
+ gdd->gddrnf_status = GDD4_OK;
+ memcpy(&gdd->gddr_stateid, &dd->dl_stid.sc_stateid, sizeof(gdd->gddr_stateid));
+ nfs4_put_stid(&dd->dl_stid);
return nfs_ok;
}
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 8efa37055b21ca2202488e90377d5162613b9343..808c24fb5c9a0b432d3271c051b409fcb75970cd 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -9367,3 +9367,103 @@ nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct dentry *dentry,
nfs4_put_stid(&dp->dl_stid);
return status;
}
+
+/**
+ * nfsd_get_dir_deleg - attempt to get a directory delegation
+ * @cstate: compound state
+ * @gdd: GET_DIR_DELEGATION arg/resp structure
+ * @nf: nfsd_file opened on the directory
+ *
+ * Given a GET_DIR_DELEGATION request @gdd, attempt to acquire a delegation
+ * on the directory to which @nf refers. Note that this does not set up any
+ * sort of async notifications for the delegation.
+ */
+struct nfs4_delegation *
+nfsd_get_dir_deleg(struct nfsd4_compound_state *cstate,
+ struct nfsd4_get_dir_delegation *gdd,
+ struct nfsd_file *nf)
+{
+ struct nfs4_client *clp = cstate->clp;
+ struct nfs4_delegation *dp;
+ struct file_lease *fl;
+ struct nfs4_file *fp, *rfp;
+ int status = 0;
+
+ fp = nfsd4_alloc_file();
+ if (!fp)
+ return ERR_PTR(-ENOMEM);
+
+ nfsd4_file_init(&cstate->current_fh, fp);
+
+ rfp = nfsd4_file_hash_insert(fp, &cstate->current_fh);
+ if (unlikely(!rfp)) {
+ put_nfs4_file(fp);
+ return ERR_PTR(-ENOMEM);
+ }
+
+ if (rfp != fp) {
+ put_nfs4_file(fp);
+ fp = rfp;
+ }
+
+ /* if this client already has one, return that it's unavailable */
+ spin_lock(&state_lock);
+ spin_lock(&fp->fi_lock);
+ /* existing delegation? */
+ if (nfs4_delegation_exists(clp, fp)) {
+ status = -EAGAIN;
+ } else if (!fp->fi_deleg_file) {
+ fp->fi_deleg_file = nfsd_file_get(nf);
+ fp->fi_delegees = 1;
+ } else {
+ ++fp->fi_delegees;
+ }
+ spin_unlock(&fp->fi_lock);
+ spin_unlock(&state_lock);
+
+ if (status) {
+ put_nfs4_file(fp);
+ return ERR_PTR(status);
+ }
+
+ /* Try to set up the lease */
+ status = -ENOMEM;
+ dp = alloc_init_deleg(clp, fp, NULL, NFS4_OPEN_DELEGATE_READ);
+ if (!dp)
+ goto out_delegees;
+
+ fl = nfs4_alloc_init_lease(dp);
+ if (!fl)
+ goto out_put_stid;
+
+ status = kernel_setlease(nf->nf_file,
+ fl->c.flc_type, &fl, NULL);
+ if (fl)
+ locks_free_lease(fl);
+ if (status)
+ goto out_put_stid;
+
+ /*
+ * Now, try to hash it. This can fail if we race another nfsd task
+ * trying to set a delegation on the same file. If that happens,
+ * then just say UNAVAIL.
+ */
+ spin_lock(&state_lock);
+ spin_lock(&clp->cl_lock);
+ spin_lock(&fp->fi_lock);
+ status = hash_delegation_locked(dp, fp);
+ spin_unlock(&fp->fi_lock);
+ spin_unlock(&clp->cl_lock);
+ spin_unlock(&state_lock);
+
+ if (!status)
+ return dp;
+
+ /* Something failed. Drop the lease and clean up the stid */
+ kernel_setlease(fp->fi_deleg_file->nf_file, F_UNLCK, NULL, (void **)&dp);
+out_put_stid:
+ nfs4_put_stid(&dp->dl_stid);
+out_delegees:
+ put_deleg_file(fp);
+ return ERR_PTR(status);
+}
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 1e736f4024263ffa9c93bcc9ec48f44566a8cc77..b052c1effdc5356487c610db9728df8ecfe851d4 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -867,4 +867,9 @@ static inline bool try_to_expire_client(struct nfs4_client *clp)
extern __be32 nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp,
struct dentry *dentry, struct nfs4_delegation **pdp);
+
+struct nfsd4_get_dir_delegation;
+struct nfs4_delegation *nfsd_get_dir_deleg(struct nfsd4_compound_state *cstate,
+ struct nfsd4_get_dir_delegation *gdd,
+ struct nfsd_file *nf);
#endif /* NFSD4_STATE_H */
--
2.51.0
* [PATCH v3 13/13] vfs: expose delegation support to userland
2025-10-21 15:25 [PATCH v3 00/13] vfs: recall-only directory delegations for knfsd Jeff Layton
` (11 preceding siblings ...)
2025-10-21 15:25 ` [PATCH v3 12/13] nfsd: wire up GET_DIR_DELEGATION handling Jeff Layton
@ 2025-10-21 15:25 ` Jeff Layton
2025-10-29 12:55 ` [PATCH v3 00/13] vfs: recall-only directory delegations for knfsd Christian Brauner
2025-10-29 13:38 ` Christian Brauner
14 siblings, 0 replies; 26+ messages in thread
From: Jeff Layton @ 2025-10-21 15:25 UTC (permalink / raw)
To: Miklos Szeredi, Alexander Viro, Christian Brauner, Jan Kara,
Chuck Lever, Alexander Aring, Trond Myklebust, Anna Schumaker,
Steve French, Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N,
Tom Talpey, Bharath SM, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, David Howells, Tyler Hicks, NeilBrown,
Olga Kornievskaia, Dai Ngo, Amir Goldstein, Namjae Jeon,
Steve French, Sergey Senozhatsky, Carlos Maiolino,
Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman
Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
samba-technical, netfs, ecryptfs, linux-unionfs, linux-xfs,
netdev, Jeff Layton
Now that support for recallable directory delegations is complete,
expose this functionality to userland with new F_SETDELEG and F_GETDELEG
commands for fcntl(). This also allows userland to request an FL_DELEG
type lease on files. Userland applications that do so will be signalled
on metadata changes as well as data changes (FL_LEASE leases are limited
to signalling on data changes).
These commands accept a new "struct delegation" argument that contains
a flags field for future expansion.
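As a rough illustration of how userland might drive these commands, here
is a minimal sketch based only on the definitions in this version of the
patch. The command values and the structure layout are copied from the
diff below and may change before merging; the SKETCH_ macros, struct
deleg_request, and both helper functions are hypothetical names chosen
so they cannot clash with any system header. On a kernel without this
patch the calls are likely to fail with EINVAL.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>

/* Command numbers as proposed in this patch (assumption: not yet in libc). */
#define SKETCH_F_LINUX_SPECIFIC_BASE 1024
#define SKETCH_F_GETDELEG (SKETCH_F_LINUX_SPECIFIC_BASE + 15)
#define SKETCH_F_SETDELEG (SKETCH_F_LINUX_SPECIFIC_BASE + 16)

/* Local mirror of the "struct delegation" layout proposed in this patch. */
struct deleg_request {
	short d_type;         /* F_RDLCK, F_WRLCK or F_UNLCK */
	unsigned int d_flags; /* must be 0 for now */
};

/* Request a read delegation on fd. Returns 0 or a negative errno. */
static int request_read_deleg(int fd)
{
	struct deleg_request req;

	/* Zero the whole struct, padding included, before it is copied in. */
	memset(&req, 0, sizeof(req));
	req.d_type = F_RDLCK;

	if (fcntl(fd, SKETCH_F_SETDELEG, &req) == -1)
		return -errno;
	return 0;
}

/* Query the delegation type held via fd, or return a negative errno. */
static int query_deleg(int fd)
{
	int type = fcntl(fd, SKETCH_F_GETDELEG);

	return type == -1 ? -errno : type;
}
```

A caller would open a file or directory, call request_read_deleg() on
the descriptor, and, as the kerneldoc for fcntl_setdeleg() in this patch
notes, also use F_SETSIG if it wants a specific signal when the
delegation is broken.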
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
fs/fcntl.c | 9 ++++++++
fs/locks.c | 53 ++++++++++++++++++++++++++++++++++++----------
include/linux/filelock.h | 12 +++++++++++
include/uapi/linux/fcntl.h | 10 +++++++++
4 files changed, 73 insertions(+), 11 deletions(-)
diff --git a/fs/fcntl.c b/fs/fcntl.c
index 72f8433d9109889eecef56b32d20a85b4e12ea44..f34f0a07f993f9f95a60f2954bb4304d3c218498 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -445,6 +445,7 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
struct file *filp)
{
void __user *argp = (void __user *)arg;
+ struct delegation deleg;
int argi = (int)arg;
struct flock flock;
long err = -EINVAL;
@@ -550,6 +551,14 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
case F_SET_RW_HINT:
err = fcntl_set_rw_hint(filp, arg);
break;
+ case F_GETDELEG:
+ err = fcntl_getdeleg(filp);
+ break;
+ case F_SETDELEG:
+ if (copy_from_user(&deleg, argp, sizeof(deleg)))
+ return -EFAULT;
+ err = fcntl_setdeleg(fd, filp, &deleg);
+ break;
default:
break;
}
diff --git a/fs/locks.c b/fs/locks.c
index b47552106769ec5a189babfe12518e34aa59c759..fda62897e371fbc8d04b8073df3d2267d2c7c430 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -585,7 +585,7 @@ static const struct lease_manager_operations lease_manager_ops = {
/*
* Initialize a lease, use the default lock manager operations
*/
-static int lease_init(struct file *filp, int type, struct file_lease *fl)
+static int lease_init(struct file *filp, unsigned int flags, int type, struct file_lease *fl)
{
if (assign_type(&fl->c, type) != 0)
return -EINVAL;
@@ -594,13 +594,13 @@ static int lease_init(struct file *filp, int type, struct file_lease *fl)
fl->c.flc_pid = current->tgid;
fl->c.flc_file = filp;
- fl->c.flc_flags = FL_LEASE;
+ fl->c.flc_flags = flags;
fl->fl_lmops = &lease_manager_ops;
return 0;
}
/* Allocate a file_lock initialised to this type of lease */
-static struct file_lease *lease_alloc(struct file *filp, int type)
+static struct file_lease *lease_alloc(struct file *filp, unsigned int flags, int type)
{
struct file_lease *fl = locks_alloc_lease();
int error = -ENOMEM;
@@ -608,7 +608,7 @@ static struct file_lease *lease_alloc(struct file *filp, int type)
if (fl == NULL)
return ERR_PTR(error);
- error = lease_init(filp, type, fl);
+ error = lease_init(filp, flags, type, fl);
if (error) {
locks_free_lease(fl);
return ERR_PTR(error);
@@ -1548,10 +1548,9 @@ int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
int want_write = (mode & O_ACCMODE) != O_RDONLY;
LIST_HEAD(dispose);
- new_fl = lease_alloc(NULL, want_write ? F_WRLCK : F_RDLCK);
+ new_fl = lease_alloc(NULL, type, want_write ? F_WRLCK : F_RDLCK);
if (IS_ERR(new_fl))
return PTR_ERR(new_fl);
- new_fl->c.flc_flags = type;
/* typically we will check that ctx is non-NULL before calling */
ctx = locks_inode_context(inode);
@@ -1697,7 +1696,7 @@ EXPORT_SYMBOL(lease_get_mtime);
* XXX: sfr & willy disagree over whether F_INPROGRESS
* should be returned to userspace.
*/
-int fcntl_getlease(struct file *filp)
+static int __fcntl_getlease(struct file *filp, unsigned int flavor)
{
struct file_lease *fl;
struct inode *inode = file_inode(filp);
@@ -1713,7 +1712,8 @@ int fcntl_getlease(struct file *filp)
list_for_each_entry(fl, &ctx->flc_lease, c.flc_list) {
if (fl->c.flc_file != filp)
continue;
- type = target_leasetype(fl);
+ if (fl->c.flc_flags & flavor)
+ type = target_leasetype(fl);
break;
}
spin_unlock(&ctx->flc_lock);
@@ -1724,6 +1724,16 @@ int fcntl_getlease(struct file *filp)
return type;
}
+int fcntl_getlease(struct file *filp)
+{
+ return __fcntl_getlease(filp, FL_LEASE);
+}
+
+int fcntl_getdeleg(struct file *filp)
+{
+ return __fcntl_getlease(filp, FL_DELEG);
+}
+
/**
* check_conflicting_open - see if the given file points to an inode that has
* an existing open that would conflict with the
@@ -2033,13 +2043,13 @@ vfs_setlease(struct file *filp, int arg, struct file_lease **lease, void **priv)
}
EXPORT_SYMBOL_GPL(vfs_setlease);
-static int do_fcntl_add_lease(unsigned int fd, struct file *filp, int arg)
+static int do_fcntl_add_lease(unsigned int fd, struct file *filp, unsigned int flavor, int arg)
{
struct file_lease *fl;
struct fasync_struct *new;
int error;
- fl = lease_alloc(filp, arg);
+ fl = lease_alloc(filp, flavor, arg);
if (IS_ERR(fl))
return PTR_ERR(fl);
@@ -2075,7 +2085,28 @@ int fcntl_setlease(unsigned int fd, struct file *filp, int arg)
if (arg == F_UNLCK)
return vfs_setlease(filp, F_UNLCK, NULL, (void **)&filp);
- return do_fcntl_add_lease(fd, filp, arg);
+ return do_fcntl_add_lease(fd, filp, FL_LEASE, arg);
+}
+
+/**
+ * fcntl_setdeleg - sets a delegation on an open file
+ * @fd: open file descriptor
+ * @filp: file pointer
+ * @deleg: delegation request from userland
+ *
+ * Call this fcntl to establish a delegation on the file.
+ * Note that you also need to call %F_SETSIG to
+ * receive a signal when the lease is broken.
+ */
+int fcntl_setdeleg(unsigned int fd, struct file *filp, struct delegation *deleg)
+{
+ /* For now, no flags are supported */
+ if (deleg->d_flags != 0)
+ return -EINVAL;
+
+ if (deleg->d_type == F_UNLCK)
+ return vfs_setlease(filp, F_UNLCK, NULL, (void **)&filp);
+ return do_fcntl_add_lease(fd, filp, FL_DELEG, deleg->d_type);
}
/**
diff --git a/include/linux/filelock.h b/include/linux/filelock.h
index c2ce8ba05d068b451ecf8f513b7e532819a29944..69b8fa8dce35dab670e6c7b288e13dc4caed1bc0 100644
--- a/include/linux/filelock.h
+++ b/include/linux/filelock.h
@@ -159,6 +159,8 @@ int fcntl_setlk64(unsigned int, struct file *, unsigned int,
int fcntl_setlease(unsigned int fd, struct file *filp, int arg);
int fcntl_getlease(struct file *filp);
+int fcntl_setdeleg(unsigned int fd, struct file *filp, struct delegation *deleg);
+int fcntl_getdeleg(struct file *filp);
static inline bool lock_is_unlock(struct file_lock *fl)
{
@@ -271,6 +273,16 @@ static inline int fcntl_getlease(struct file *filp)
return F_UNLCK;
}
+static inline int fcntl_setdeleg(unsigned int fd, struct file *filp, struct delegation *deleg)
+{
+ return -EINVAL;
+}
+
+static inline int fcntl_getdeleg(struct file *filp)
+{
+ return F_UNLCK;
+}
+
static inline bool lock_is_unlock(struct file_lock *fl)
{
return false;
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index 3741ea1b73d8500061567b6590ccf5fb4c6770f0..aae88f4b5c05205b2b28ae46b21bca9817197e04 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -79,6 +79,16 @@
*/
#define RWF_WRITE_LIFE_NOT_SET RWH_WRITE_LIFE_NOT_SET
+/* Set/Get delegations */
+#define F_GETDELEG (F_LINUX_SPECIFIC_BASE + 15)
+#define F_SETDELEG (F_LINUX_SPECIFIC_BASE + 16)
+
+/* Argument structure for F_GETDELEG and F_SETDELEG */
+struct delegation {
+ short d_type; /* F_RDLCK, F_WRLCK, F_UNLCK */
+ unsigned int d_flags;
+};
+
/*
* Types of directory notifications that may be requested.
*/
--
2.51.0
* Re: [PATCH v3 12/13] nfsd: wire up GET_DIR_DELEGATION handling
2025-10-21 15:25 ` [PATCH v3 12/13] nfsd: wire up GET_DIR_DELEGATION handling Jeff Layton
@ 2025-10-21 16:16 ` Chuck Lever
0 siblings, 0 replies; 26+ messages in thread
From: Chuck Lever @ 2025-10-21 16:16 UTC (permalink / raw)
To: Jeff Layton, Miklos Szeredi, Alexander Viro, Christian Brauner,
Jan Kara, Alexander Aring, Trond Myklebust, Anna Schumaker,
Steve French, Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N,
Tom Talpey, Bharath SM, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, David Howells, Tyler Hicks, NeilBrown,
Olga Kornievskaia, Dai Ngo, Amir Goldstein, Namjae Jeon,
Steve French, Sergey Senozhatsky, Carlos Maiolino,
Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman
Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
samba-technical, netfs, ecryptfs, linux-unionfs, linux-xfs,
netdev
On 10/21/25 11:25 AM, Jeff Layton wrote:
> Add a new routine for acquiring a read delegation on a directory. These
> are recallable-only delegations with no support for CB_NOTIFY. That will
> be added in a later phase.
>
> Since the same CB_RECALL/DELEGRETURN infrastructure is used for regular
> and directory delegations, a normal nfs4_delegation is used to represent
> a directory delegation.
>
> Reviewed-by: NeilBrown <neil@brown.name>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
> fs/nfsd/nfs4proc.c | 22 +++++++++++-
> fs/nfsd/nfs4state.c | 100 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> fs/nfsd/state.h | 5 +++
> 3 files changed, 126 insertions(+), 1 deletion(-)
>
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index 2222bb283baff35703b4035fa0fc593b54d8b937..4f0b1210702ecf4eaa20c74e548aabbee33b7fd1 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -2342,6 +2342,13 @@ nfsd4_get_dir_delegation(struct svc_rqst *rqstp,
> union nfsd4_op_u *u)
> {
> struct nfsd4_get_dir_delegation *gdd = &u->get_dir_delegation;
> + struct nfs4_delegation *dd;
> + struct nfsd_file *nf;
> + __be32 status;
> +
> + status = nfsd_file_acquire_dir(rqstp, &cstate->current_fh, &nf);
> + if (status != nfs_ok)
> + return status;
>
> /*
> * RFC 8881, section 18.39.3 says:
> @@ -2355,7 +2362,20 @@ nfsd4_get_dir_delegation(struct svc_rqst *rqstp,
> * return NFS4_OK with a non-fatal status of GDD4_UNAVAIL in this
> * situation.
> */
> - gdd->gddrnf_status = GDD4_UNAVAIL;
> + dd = nfsd_get_dir_deleg(cstate, gdd, nf);
> + nfsd_file_put(nf);
> + if (IS_ERR(dd)) {
> + int err = PTR_ERR(dd);
> +
> + if (err != -EAGAIN)
> + return nfserrno(err);
> + gdd->gddrnf_status = GDD4_UNAVAIL;
> + return nfs_ok;
> + }
> +
> + gdd->gddrnf_status = GDD4_OK;
> + memcpy(&gdd->gddr_stateid, &dd->dl_stid.sc_stateid, sizeof(gdd->gddr_stateid));
> + nfs4_put_stid(&dd->dl_stid);
> return nfs_ok;
> }
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 8efa37055b21ca2202488e90377d5162613b9343..808c24fb5c9a0b432d3271c051b409fcb75970cd 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -9367,3 +9367,103 @@ nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct dentry *dentry,
> nfs4_put_stid(&dp->dl_stid);
> return status;
> }
> +
> +/**
> + * nfsd_get_dir_deleg - attempt to get a directory delegation
> + * @cstate: compound state
> + * @gdd: GET_DIR_DELEGATION arg/resp structure
> + * @nf: nfsd_file opened on the directory
> + *
> + * Given a GET_DIR_DELEGATION request @gdd, attempt to acquire a delegation
> + * on the directory to which @nf refers. Note that this does not set up any
> + * sort of async notifications for the delegation.
> + */
> +struct nfs4_delegation *
> +nfsd_get_dir_deleg(struct nfsd4_compound_state *cstate,
> + struct nfsd4_get_dir_delegation *gdd,
> + struct nfsd_file *nf)
> +{
> + struct nfs4_client *clp = cstate->clp;
> + struct nfs4_delegation *dp;
> + struct file_lease *fl;
> + struct nfs4_file *fp, *rfp;
> + int status = 0;
> +
> + fp = nfsd4_alloc_file();
> + if (!fp)
> + return ERR_PTR(-ENOMEM);
> +
> + nfsd4_file_init(&cstate->current_fh, fp);
> +
> + rfp = nfsd4_file_hash_insert(fp, &cstate->current_fh);
> + if (unlikely(!rfp)) {
> + put_nfs4_file(fp);
> + return ERR_PTR(-ENOMEM);
> + }
> +
> + if (rfp != fp) {
> + put_nfs4_file(fp);
> + fp = rfp;
> + }
> +
> + /* if this client already has one, return that it's unavailable */
> + spin_lock(&state_lock);
> + spin_lock(&fp->fi_lock);
> + /* existing delegation? */
> + if (nfs4_delegation_exists(clp, fp)) {
> + status = -EAGAIN;
> + } else if (!fp->fi_deleg_file) {
> + fp->fi_deleg_file = nfsd_file_get(nf);
> + fp->fi_delegees = 1;
> + } else {
> + ++fp->fi_delegees;
> + }
> + spin_unlock(&fp->fi_lock);
> + spin_unlock(&state_lock);
> +
> + if (status) {
> + put_nfs4_file(fp);
> + return ERR_PTR(status);
> + }
> +
> + /* Try to set up the lease */
> + status = -ENOMEM;
> + dp = alloc_init_deleg(clp, fp, NULL, NFS4_OPEN_DELEGATE_READ);
> + if (!dp)
> + goto out_delegees;
> +
> + fl = nfs4_alloc_init_lease(dp);
> + if (!fl)
> + goto out_put_stid;
> +
> + status = kernel_setlease(nf->nf_file,
> + fl->c.flc_type, &fl, NULL);
> + if (fl)
> + locks_free_lease(fl);
> + if (status)
> + goto out_put_stid;
> +
> + /*
> + * Now, try to hash it. This can fail if we race another nfsd task
> + * trying to set a delegation on the same file. If that happens,
> + * then just say UNAVAIL.
> + */
> + spin_lock(&state_lock);
> + spin_lock(&clp->cl_lock);
> + spin_lock(&fp->fi_lock);
> + status = hash_delegation_locked(dp, fp);
> + spin_unlock(&fp->fi_lock);
> + spin_unlock(&clp->cl_lock);
> + spin_unlock(&state_lock);
> +
> + if (!status)
> + return dp;
> +
> + /* Something failed. Drop the lease and clean up the stid */
> + kernel_setlease(fp->fi_deleg_file->nf_file, F_UNLCK, NULL, (void **)&dp);
> +out_put_stid:
> + nfs4_put_stid(&dp->dl_stid);
> +out_delegees:
> + put_deleg_file(fp);
> + return ERR_PTR(status);
> +}
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index 1e736f4024263ffa9c93bcc9ec48f44566a8cc77..b052c1effdc5356487c610db9728df8ecfe851d4 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -867,4 +867,9 @@ static inline bool try_to_expire_client(struct nfs4_client *clp)
>
> extern __be32 nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp,
> struct dentry *dentry, struct nfs4_delegation **pdp);
> +
> +struct nfsd4_get_dir_delegation;
> +struct nfs4_delegation *nfsd_get_dir_deleg(struct nfsd4_compound_state *cstate,
> + struct nfsd4_get_dir_delegation *gdd,
> + struct nfsd_file *nf);
> #endif /* NFSD4_STATE_H */
>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
--
Chuck Lever
* Re: [PATCH v3 01/13] filelock: push the S_ISREG check down to ->setlease handlers
2025-10-21 15:25 ` [PATCH v3 01/13] filelock: push the S_ISREG check down to ->setlease handlers Jeff Layton
@ 2025-10-22 8:58 ` Jan Kara
0 siblings, 0 replies; 26+ messages in thread
From: Jan Kara @ 2025-10-22 8:58 UTC (permalink / raw)
To: Jeff Layton
Cc: Miklos Szeredi, Alexander Viro, Christian Brauner, Jan Kara,
Chuck Lever, Alexander Aring, Trond Myklebust, Anna Schumaker,
Steve French, Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N,
Tom Talpey, Bharath SM, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, David Howells, Tyler Hicks, NeilBrown,
Olga Kornievskaia, Dai Ngo, Amir Goldstein, Namjae Jeon,
Steve French, Sergey Senozhatsky, Carlos Maiolino,
Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, linux-fsdevel, linux-kernel, linux-nfs,
linux-cifs, samba-technical, netfs, ecryptfs, linux-unionfs,
linux-xfs, netdev
On Tue 21-10-25 11:25:36, Jeff Layton wrote:
> When nfsd starts requesting directory delegations, setlease handlers may
> see requests for leases on directories. Push the !S_ISREG check down
> into the non-trivial setlease handlers, so we can selectively enable
> them where they're supported.
>
> FUSE is special: It's the only filesystem that supports atomic_open and
> allows kernel-internal leases. atomic_open is issued when the VFS
> doesn't know the state of the dentry being opened. If the file doesn't
> exist, it may be created, in which case the dir lease should be broken.
>
> The existing kernel-internal lease implementation has no provision for
> this. Ensure that we don't allow directory leases by default going
> forward by explicitly disabling them there.
>
> Reviewed-by: NeilBrown <neil@brown.name>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/fuse/dir.c | 1 +
> fs/locks.c | 5 +++--
> fs/nfs/nfs4file.c | 2 ++
> fs/smb/client/cifsfs.c | 3 +++
> 4 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> index ecaec0fea3a132e7cbb88121e7db7fb504d57d3c..667774cc72a1d49796f531fcb342d2e4878beb85 100644
> --- a/fs/fuse/dir.c
> +++ b/fs/fuse/dir.c
> @@ -2230,6 +2230,7 @@ static const struct file_operations fuse_dir_operations = {
> .fsync = fuse_dir_fsync,
> .unlocked_ioctl = fuse_dir_ioctl,
> .compat_ioctl = fuse_dir_compat_ioctl,
> + .setlease = simple_nosetlease,
> };
>
> static const struct inode_operations fuse_common_inode_operations = {
> diff --git a/fs/locks.c b/fs/locks.c
> index 04a3f0e2072461b6e2d3d1cd12f2b089d69a7db3..0b16921fb52e602ea2e0c3de39d9d772af98ba7d 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -1929,6 +1929,9 @@ static int generic_delete_lease(struct file *filp, void *owner)
> int generic_setlease(struct file *filp, int arg, struct file_lease **flp,
> void **priv)
> {
> + if (!S_ISREG(file_inode(filp)->i_mode))
> + return -EINVAL;
> +
> switch (arg) {
> case F_UNLCK:
> return generic_delete_lease(filp, *priv);
> @@ -2018,8 +2021,6 @@ vfs_setlease(struct file *filp, int arg, struct file_lease **lease, void **priv)
>
> if ((!vfsuid_eq_kuid(vfsuid, current_fsuid())) && !capable(CAP_LEASE))
> return -EACCES;
> - if (!S_ISREG(inode->i_mode))
> - return -EINVAL;
> error = security_file_lock(filp, arg);
> if (error)
> return error;
> diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
> index 7f43e890d3564a000dab9365048a3e17dc96395c..7317f26892c5782a39660cae87ec1afea24e36c0 100644
> --- a/fs/nfs/nfs4file.c
> +++ b/fs/nfs/nfs4file.c
> @@ -431,6 +431,8 @@ void nfs42_ssc_unregister_ops(void)
> static int nfs4_setlease(struct file *file, int arg, struct file_lease **lease,
> void **priv)
> {
> + if (!S_ISREG(file_inode(file)->i_mode))
> + return -EINVAL;
> return nfs4_proc_setlease(file, arg, lease, priv);
> }
>
> diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c
> index 4f959f1e08d235071a151c1438c753fcd05099e5..1522c6b61b48c05c93f2bedeab0d35b6d85378e2 100644
> --- a/fs/smb/client/cifsfs.c
> +++ b/fs/smb/client/cifsfs.c
> @@ -1149,6 +1149,9 @@ cifs_setlease(struct file *file, int arg, struct file_lease **lease, void **priv
> struct inode *inode = file_inode(file);
> struct cifsFileInfo *cfile = file->private_data;
>
> + if (!S_ISREG(inode->i_mode))
> + return -EINVAL;
> +
> /* Check if file is oplocked if this is request for new lease */
> if (arg == F_UNLCK ||
> ((arg == F_RDLCK) && CIFS_CACHE_READ(CIFS_I(inode))) ||
>
> --
> 2.51.0
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 08/13] vfs: make vfs_symlink break delegations on parent dir
2025-10-21 15:25 ` [PATCH v3 08/13] vfs: make vfs_symlink break delegations on parent dir Jeff Layton
@ 2025-10-22 9:01 ` Jan Kara
0 siblings, 0 replies; 26+ messages in thread
From: Jan Kara @ 2025-10-22 9:01 UTC (permalink / raw)
To: Jeff Layton
Cc: Miklos Szeredi, Alexander Viro, Christian Brauner, Jan Kara,
Chuck Lever, Alexander Aring, Trond Myklebust, Anna Schumaker,
Steve French, Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N,
Tom Talpey, Bharath SM, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, David Howells, Tyler Hicks, NeilBrown,
Olga Kornievskaia, Dai Ngo, Amir Goldstein, Namjae Jeon,
Steve French, Sergey Senozhatsky, Carlos Maiolino,
Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, linux-fsdevel, linux-kernel, linux-nfs,
linux-cifs, samba-technical, netfs, ecryptfs, linux-unionfs,
linux-xfs, netdev
On Tue 21-10-25 11:25:43, Jeff Layton wrote:
> In order to add directory delegation support, we must break delegations
> on the parent on any change to the directory.
>
> Add a delegated_inode parameter to vfs_symlink() and have it break the
> delegation. do_symlinkat() can then wait on the delegation break before
> proceeding.
>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
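
For readers following along, the break-then-wait-then-retry flow described in
the commit message can be modelled in plain userspace C. This is only a rough
sketch: the sketch_* functions and the two-field struct inode are stand-ins
invented for illustration, and the real try_break_deleg() and
break_deleg_wait() also manage inode references and actually block.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for the kernel inode; both fields are invented for this sketch. */
struct inode {
	int delegated;	/* nonzero while a delegation is outstanding */
	int nr_entries;	/* counts entries "created" in this directory */
};

/* Non-blocking attempt: on conflict, tell the caller which inode to wait on. */
static int sketch_try_break_deleg(struct inode *dir,
				  struct inode **delegated_inode)
{
	if (!dir->delegated)
		return 0;
	if (delegated_inode)
		*delegated_inode = dir;
	return -EWOULDBLOCK;
}

/* Pretend the delegation holder returned it, then clear the caller's slot. */
static int sketch_break_deleg_wait(struct inode **delegated_inode)
{
	(*delegated_inode)->delegated = 0;
	*delegated_inode = NULL;
	return 0;
}

/* Models vfs_symlink(): break the parent's delegation before changing it. */
static int sketch_vfs_symlink(struct inode *dir,
			      struct inode **delegated_inode)
{
	int error = sketch_try_break_deleg(dir, delegated_inode);

	if (error)
		return error;
	dir->nr_entries++;
	return 0;
}

/* Models do_symlinkat(): wait for the break to finish, then retry. */
static int sketch_do_symlinkat(struct inode *dir)
{
	struct inode *delegated_inode = NULL;
	int error;

retry:
	error = sketch_vfs_symlink(dir, &delegated_inode);
	if (delegated_inode) {
		error = sketch_break_deleg_wait(&delegated_inode);
		if (!error)
			goto retry;
	}
	return error;
}
```

Callers that pass NULL, as the in-kernel users in this patch do, simply get
-EWOULDBLOCK back in this model and never wait.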
> ---
> fs/ecryptfs/inode.c | 2 +-
> fs/init.c | 2 +-
> fs/namei.c | 16 ++++++++++++++--
> fs/nfsd/vfs.c | 2 +-
> fs/overlayfs/overlayfs.h | 2 +-
> include/linux/fs.h | 2 +-
> 6 files changed, 19 insertions(+), 7 deletions(-)
>
> diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
> index 639ae42bcd56890d04592f7269e4ffc099b44f09..d430ec5a63094ea4cd42828e7d44f0f8d918fcec 100644
> --- a/fs/ecryptfs/inode.c
> +++ b/fs/ecryptfs/inode.c
> @@ -480,7 +480,7 @@ static int ecryptfs_symlink(struct mnt_idmap *idmap,
> if (rc)
> goto out_lock;
> rc = vfs_symlink(&nop_mnt_idmap, lower_dir, lower_dentry,
> - encoded_symname);
> + encoded_symname, NULL);
> kfree(encoded_symname);
> if (rc || d_really_is_negative(lower_dentry))
> goto out_lock;
> diff --git a/fs/init.c b/fs/init.c
> index 4f02260dd65b0dfcbfbf5812d2ec6a33444a3b1f..e0f5429c0a49d046bd3f231a260954ed0f90ef44 100644
> --- a/fs/init.c
> +++ b/fs/init.c
> @@ -209,7 +209,7 @@ int __init init_symlink(const char *oldname, const char *newname)
> error = security_path_symlink(&path, dentry, oldname);
> if (!error)
> error = vfs_symlink(mnt_idmap(path.mnt), path.dentry->d_inode,
> - dentry, oldname);
> + dentry, oldname, NULL);
> end_creating_path(&path, dentry);
> return error;
> }
> diff --git a/fs/namei.c b/fs/namei.c
> index 7e400cbdbc6af1c72eb684f051d0571e944a27d7..71af256cdd941e200389570538f64a3f795e6c83 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -4851,6 +4851,7 @@ SYSCALL_DEFINE1(unlink, const char __user *, pathname)
> * @dir: inode of the parent directory
> * @dentry: dentry of the child symlink file
> * @oldname: name of the file to link to
> + * @delegated_inode: returns victim inode, if the inode is delegated.
> *
> * Create a symlink.
> *
> @@ -4861,7 +4862,8 @@ SYSCALL_DEFINE1(unlink, const char __user *, pathname)
> * raw inode simply pass @nop_mnt_idmap.
> */
> int vfs_symlink(struct mnt_idmap *idmap, struct inode *dir,
> - struct dentry *dentry, const char *oldname)
> + struct dentry *dentry, const char *oldname,
> + struct inode **delegated_inode)
> {
> int error;
>
> @@ -4876,6 +4878,10 @@ int vfs_symlink(struct mnt_idmap *idmap, struct inode *dir,
> if (error)
> return error;
>
> + error = try_break_deleg(dir, delegated_inode);
> + if (error)
> + return error;
> +
> error = dir->i_op->symlink(idmap, dir, dentry, oldname);
> if (!error)
> fsnotify_create(dir, dentry);
> @@ -4889,6 +4895,7 @@ int do_symlinkat(struct filename *from, int newdfd, struct filename *to)
> struct dentry *dentry;
> struct path path;
> unsigned int lookup_flags = 0;
> + struct inode *delegated_inode = NULL;
>
> if (IS_ERR(from)) {
> error = PTR_ERR(from);
> @@ -4903,8 +4910,13 @@ int do_symlinkat(struct filename *from, int newdfd, struct filename *to)
> error = security_path_symlink(&path, dentry, from->name);
> if (!error)
> error = vfs_symlink(mnt_idmap(path.mnt), path.dentry->d_inode,
> - dentry, from->name);
> + dentry, from->name, &delegated_inode);
> end_creating_path(&path, dentry);
> + if (delegated_inode) {
> + error = break_deleg_wait(&delegated_inode);
> + if (!error)
> + goto retry;
> + }
> if (retry_estale(error, lookup_flags)) {
> lookup_flags |= LOOKUP_REVAL;
> goto retry;
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 44debf3d0be450ddc245e2fa4f57fe076e1454a2..386f454badce7ed448399ef93e9c8edafbcc4d79 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -1829,7 +1829,7 @@ nfsd_symlink(struct svc_rqst *rqstp, struct svc_fh *fhp,
> err = fh_fill_pre_attrs(fhp);
> if (err != nfs_ok)
> goto out_unlock;
> - host_err = vfs_symlink(&nop_mnt_idmap, d_inode(dentry), dnew, path);
> + host_err = vfs_symlink(&nop_mnt_idmap, d_inode(dentry), dnew, path, NULL);
> err = nfserrno(host_err);
> cerr = fh_compose(resfhp, fhp->fh_export, dnew, fhp);
> if (!err)
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 87b82dada7ec1b8429299c68078cda24176c5607..94bb4540f7ae2e0571b3b88393c180bd73c3c09c 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -267,7 +267,7 @@ static inline int ovl_do_symlink(struct ovl_fs *ofs,
> struct inode *dir, struct dentry *dentry,
> const char *oldname)
> {
> - int err = vfs_symlink(ovl_upper_mnt_idmap(ofs), dir, dentry, oldname);
> + int err = vfs_symlink(ovl_upper_mnt_idmap(ofs), dir, dentry, oldname, NULL);
>
> pr_debug("symlink(\"%s\", %pd2) = %i\n", oldname, dentry, err);
> return err;
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index a1e1afe39e01a46bf0a81e241b92690947402851..d8c7245da3bf3200b435c7ea6cafcf7903ebf293 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2117,7 +2117,7 @@ struct dentry *vfs_mkdir(struct mnt_idmap *, struct inode *,
> int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *,
> umode_t, dev_t, struct inode **);
> int vfs_symlink(struct mnt_idmap *, struct inode *,
> - struct dentry *, const char *);
> + struct dentry *, const char *, struct inode **);
> int vfs_link(struct dentry *, struct mnt_idmap *, struct inode *,
> struct dentry *, struct inode **);
> int vfs_rmdir(struct mnt_idmap *, struct inode *, struct dentry *,
>
> --
> 2.51.0
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 09/13] filelock: lift the ban on directory leases in generic_setlease
2025-10-21 15:25 ` [PATCH v3 09/13] filelock: lift the ban on directory leases in generic_setlease Jeff Layton
@ 2025-10-22 9:03 ` Jan Kara
0 siblings, 0 replies; 26+ messages in thread
From: Jan Kara @ 2025-10-22 9:03 UTC (permalink / raw)
To: Jeff Layton
Cc: Miklos Szeredi, Alexander Viro, Christian Brauner, Jan Kara,
Chuck Lever, Alexander Aring, Trond Myklebust, Anna Schumaker,
Steve French, Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N,
Tom Talpey, Bharath SM, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, David Howells, Tyler Hicks, NeilBrown,
Olga Kornievskaia, Dai Ngo, Amir Goldstein, Namjae Jeon,
Steve French, Sergey Senozhatsky, Carlos Maiolino,
Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, linux-fsdevel, linux-kernel, linux-nfs,
linux-cifs, samba-technical, netfs, ecryptfs, linux-unionfs,
linux-xfs, netdev
On Tue 21-10-25 11:25:44, Jeff Layton wrote:
> With the addition of the try_break_deleg calls in directory changing
> operations, allow generic_setlease to hand them out. Write leases on
> directories are never allowed however, so continue to reject them.
>
> For now, there is no API for requesting delegations from userland, so
> ensure that userland is prevented from acquiring a lease on a directory.
>
> Reviewed-by: NeilBrown <neil@brown.name>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/locks.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/fs/locks.c b/fs/locks.c
> index 0b16921fb52e602ea2e0c3de39d9d772af98ba7d..b47552106769ec5a189babfe12518e34aa59c759 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -1929,14 +1929,19 @@ static int generic_delete_lease(struct file *filp, void *owner)
> int generic_setlease(struct file *filp, int arg, struct file_lease **flp,
> void **priv)
> {
> - if (!S_ISREG(file_inode(filp)->i_mode))
> + struct inode *inode = file_inode(filp);
> +
> + if (!S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode))
> return -EINVAL;
>
> switch (arg) {
> case F_UNLCK:
> return generic_delete_lease(filp, *priv);
> - case F_RDLCK:
> case F_WRLCK:
> + if (S_ISDIR(inode->i_mode))
> + return -EINVAL;
> + fallthrough;
> + case F_RDLCK:
> if (!(*flp)->fl_lmops->lm_break) {
> WARN_ON_ONCE(1);
> return -ENOLCK;
> @@ -2065,6 +2070,9 @@ static int do_fcntl_add_lease(unsigned int fd, struct file *filp, int arg)
> */
> int fcntl_setlease(unsigned int fd, struct file *filp, int arg)
> {
> + if (S_ISDIR(file_inode(filp)->i_mode))
> + return -EINVAL;
> +
> if (arg == F_UNLCK)
> return vfs_setlease(filp, F_UNLCK, NULL, (void **)&filp);
> return do_fcntl_add_lease(fd, filp, arg);
>
> --
> 2.51.0
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 00/13] vfs: recall-only directory delegations for knfsd
2025-10-21 15:25 [PATCH v3 00/13] vfs: recall-only directory delegations for knfsd Jeff Layton
` (12 preceding siblings ...)
2025-10-21 15:25 ` [PATCH v3 13/13] vfs: expose delegation support to userland Jeff Layton
@ 2025-10-29 12:55 ` Christian Brauner
2025-10-29 13:38 ` Christian Brauner
14 siblings, 0 replies; 26+ messages in thread
From: Christian Brauner @ 2025-10-29 12:55 UTC (permalink / raw)
To: Jeff Layton
Cc: Miklos Szeredi, Alexander Viro, Jan Kara, Chuck Lever,
Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
Bharath SM, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, David Howells, Tyler Hicks, NeilBrown,
Olga Kornievskaia, Dai Ngo, Amir Goldstein, Namjae Jeon,
Steve French, Sergey Senozhatsky, Carlos Maiolino,
Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, linux-fsdevel, linux-kernel, linux-nfs,
linux-cifs, samba-technical, netfs, ecryptfs, linux-unionfs,
linux-xfs, netdev
On Tue, Oct 21, 2025 at 11:25:35AM -0400, Jeff Layton wrote:
> Behold, another version of directory delegations. This version contains
> support for recall-only delegations. Support for CB_NOTIFY will be
> forthcoming (once the client-side patches have caught up).
>
> The main differences in this version are bugfixes, but the last patch
> adds a more formal API for userland to request a delegation. That
> support is optional. We can drop it and the rest of the series should be
> fine.
>
> My main interest in making delegations available to userland is to allow
> testing this support without nfsd. I have an xfstest ready to submit for
> this if that support looks acceptable. If it is, then I'll also plan to
> submit an update for fcntl(2).
>
> Christian, Chuck mentioned he was fine with you merging the nfsd bits
> too, if you're willing to take the whole pile.
Absolutely!
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 03/13] vfs: allow mkdir to wait for delegation break on parent
2025-10-21 15:25 ` [PATCH v3 03/13] vfs: allow mkdir to wait for delegation break on parent Jeff Layton
@ 2025-10-29 13:04 ` Christian Brauner
2025-10-29 13:37 ` Jeff Layton
0 siblings, 1 reply; 26+ messages in thread
From: Christian Brauner @ 2025-10-29 13:04 UTC (permalink / raw)
To: Jeff Layton
Cc: Miklos Szeredi, Alexander Viro, Jan Kara, Chuck Lever,
Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
Bharath SM, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, David Howells, Tyler Hicks, NeilBrown,
Olga Kornievskaia, Dai Ngo, Amir Goldstein, Namjae Jeon,
Steve French, Sergey Senozhatsky, Carlos Maiolino,
Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, linux-fsdevel, linux-kernel, linux-nfs,
linux-cifs, samba-technical, netfs, ecryptfs, linux-unionfs,
linux-xfs, netdev
On Tue, Oct 21, 2025 at 11:25:38AM -0400, Jeff Layton wrote:
> In order to add directory delegation support, we need to break
> delegations on the parent whenever there is going to be a change in the
> directory.
>
> Add a new delegated_inode parameter to vfs_mkdir. All of the existing
> callers set that to NULL for now, except for do_mkdirat which will
> properly block until the lease is gone.
>
> Reviewed-by: Jan Kara <jack@suse.cz>
> Reviewed-by: NeilBrown <neil@brown.name>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
> drivers/base/devtmpfs.c | 2 +-
> fs/cachefiles/namei.c | 2 +-
> fs/ecryptfs/inode.c | 2 +-
> fs/init.c | 2 +-
> fs/namei.c | 24 ++++++++++++++++++------
> fs/nfsd/nfs4recover.c | 2 +-
> fs/nfsd/vfs.c | 2 +-
> fs/overlayfs/overlayfs.h | 2 +-
> fs/smb/server/vfs.c | 2 +-
> fs/xfs/scrub/orphanage.c | 2 +-
> include/linux/fs.h | 2 +-
> 11 files changed, 28 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
> index 9d4e46ad8352257a6a65d85526ebdbf9bf2d4b19..0e79621cb0f79870003b867ca384199171ded4e0 100644
> --- a/drivers/base/devtmpfs.c
> +++ b/drivers/base/devtmpfs.c
> @@ -180,7 +180,7 @@ static int dev_mkdir(const char *name, umode_t mode)
> if (IS_ERR(dentry))
> return PTR_ERR(dentry);
>
> - dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(path.dentry), dentry, mode);
> + dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(path.dentry), dentry, mode, NULL);
> if (!IS_ERR(dentry))
> /* mark as kernel-created inode */
> d_inode(dentry)->i_private = &thread;
> diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
> index d1edb2ac38376c4f9d2a18026450bb3c774f7824..50c0f9c76d1fd4c05db90d7d0d1bad574523ead0 100644
> --- a/fs/cachefiles/namei.c
> +++ b/fs/cachefiles/namei.c
> @@ -130,7 +130,7 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
> goto mkdir_error;
> ret = cachefiles_inject_write_error();
> if (ret == 0)
> - subdir = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700);
> + subdir = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700, NULL);
> else
> subdir = ERR_PTR(ret);
> if (IS_ERR(subdir)) {
> diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
> index ed1394da8d6bd7065f2a074378331f13fcda17f9..35830b3144f8f71374a78b3e7463b864f4fc216e 100644
> --- a/fs/ecryptfs/inode.c
> +++ b/fs/ecryptfs/inode.c
> @@ -508,7 +508,7 @@ static struct dentry *ecryptfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> goto out;
>
> lower_dentry = vfs_mkdir(&nop_mnt_idmap, lower_dir,
> - lower_dentry, mode);
> + lower_dentry, mode, NULL);
> rc = PTR_ERR(lower_dentry);
> if (IS_ERR(lower_dentry))
> goto out;
> diff --git a/fs/init.c b/fs/init.c
> index 07f592ccdba868509d0f3aaf9936d8d890fdbec5..895f8a09a71acfd03e11164e3b441a7d4e2de146 100644
> --- a/fs/init.c
> +++ b/fs/init.c
> @@ -233,7 +233,7 @@ int __init init_mkdir(const char *pathname, umode_t mode)
> error = security_path_mkdir(&path, dentry, mode);
> if (!error) {
> dentry = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
> - dentry, mode);
> + dentry, mode, NULL);
> if (IS_ERR(dentry))
> error = PTR_ERR(dentry);
> }
> diff --git a/fs/namei.c b/fs/namei.c
> index 6e61e0215b34134b1690f864e2719e3f82cf71a8..86cf6eca1f485361c6732974e4103cf5ea721539 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -4407,10 +4407,11 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d
>
> /**
> * vfs_mkdir - create directory returning correct dentry if possible
> - * @idmap: idmap of the mount the inode was found from
> - * @dir: inode of the parent directory
> - * @dentry: dentry of the child directory
> - * @mode: mode of the child directory
> + * @idmap: idmap of the mount the inode was found from
> + * @dir: inode of the parent directory
> + * @dentry: dentry of the child directory
> + * @mode: mode of the child directory
> + * @delegated_inode: returns parent inode, if the inode is delegated.
I wonder if it would be feasible, and potentially more elegant, to return
delegated inodes as a separate type, e.g. struct delegated_inode: just a
struct wrapper around the inode itself, similar to vfsuid_t. The advantage
is that it's not possible to accidentally misuse this thing as we pass it
around to try_break_deleg() and so on.
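
To make the idea concrete, here is a rough userspace sketch of such a wrapper.
Everything in it is an assumption for illustration: the di_inode field name,
the three helpers and the one-field struct inode are invented here and are not
taken from any posted patch.

```c
#include <stddef.h>

/* Stand-in for the kernel's struct inode, only so the sketch compiles. */
struct inode {
	unsigned long i_ino;
};

/*
 * Hypothetical wrapper in the spirit of vfsuid_t: a distinct type, so a
 * "delegated inode" slot cannot be confused with an ordinary inode pointer.
 */
struct delegated_inode {
	struct inode *di_inode;
};

/* Nonzero if a delegation break is pending and the caller has to wait. */
static inline int is_delegated(const struct delegated_inode *di)
{
	return di->di_inode != NULL;
}

/* Record the inode whose delegation is being broken. */
static inline void delegated_inode_set(struct delegated_inode *di,
				       struct inode *inode)
{
	di->di_inode = inode;
}

/* Forget the recorded inode once the wait has completed. */
static inline void delegated_inode_clear(struct delegated_inode *di)
{
	di->di_inode = NULL;
}
```

With a type like this, passing a plain struct inode ** where a
struct delegated_inode * is expected draws a compiler diagnostic instead of
being accepted silently.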
> *
> * Create a directory.
> *
> @@ -4427,7 +4428,8 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d
> * In case of an error the dentry is dput() and an ERR_PTR() is returned.
> */
> struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> - struct dentry *dentry, umode_t mode)
> + struct dentry *dentry, umode_t mode,
> + struct inode **delegated_inode)
> {
> int error;
> unsigned max_links = dir->i_sb->s_max_links;
> @@ -4450,6 +4452,10 @@ struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> if (max_links && dir->i_nlink >= max_links)
> goto err;
>
> + error = try_break_deleg(dir, delegated_inode);
> + if (error)
> + goto err;
> +
> de = dir->i_op->mkdir(idmap, dir, dentry, mode);
> error = PTR_ERR(de);
> if (IS_ERR(de))
> @@ -4473,6 +4479,7 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode)
> struct path path;
> int error;
> unsigned int lookup_flags = LOOKUP_DIRECTORY;
> + struct inode *delegated_inode = NULL;
>
> retry:
> dentry = filename_create(dfd, name, &path, lookup_flags);
> @@ -4484,11 +4491,16 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode)
> mode_strip_umask(path.dentry->d_inode, mode));
> if (!error) {
> dentry = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
> - dentry, mode);
> + dentry, mode, &delegated_inode);
> if (IS_ERR(dentry))
> error = PTR_ERR(dentry);
> }
> end_creating_path(&path, dentry);
> + if (delegated_inode) {
> + error = break_deleg_wait(&delegated_inode);
> + if (!error)
> + goto retry;
> + }
> if (retry_estale(error, lookup_flags)) {
> lookup_flags |= LOOKUP_REVAL;
> goto retry;
> diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> index b1005abcb9035b2cf743200808a251b00af7e3f4..423dd102b51198ea7c447be2b9a0a5020c950dba 100644
> --- a/fs/nfsd/nfs4recover.c
> +++ b/fs/nfsd/nfs4recover.c
> @@ -202,7 +202,7 @@ nfsd4_create_clid_dir(struct nfs4_client *clp)
> * as well be forgiving and just succeed silently.
> */
> goto out_put;
> - dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, S_IRWXU);
> + dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, 0700, NULL);
> if (IS_ERR(dentry))
> status = PTR_ERR(dentry);
> out_put:
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 8b2dc7a88aab015d1e39da0dd4e6daf7e276aabe..5f24af289d509bea54a324b8851fa06de6050353 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -1645,7 +1645,7 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
> nfsd_check_ignore_resizing(iap);
> break;
> case S_IFDIR:
> - dchild = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode);
> + dchild = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode, NULL);
> if (IS_ERR(dchild)) {
> host_err = PTR_ERR(dchild);
> } else if (d_is_negative(dchild)) {
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index c8fd5951fc5ece1ae6b3e2a0801ca15f9faf7d72..0f65f9a5d54d4786b39e4f4f30f416d5b9016e70 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -248,7 +248,7 @@ static inline struct dentry *ovl_do_mkdir(struct ovl_fs *ofs,
> {
> struct dentry *ret;
>
> - ret = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode);
> + ret = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode, NULL);
> pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, PTR_ERR_OR_ZERO(ret));
> return ret;
> }
> diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c
> index 891ed2dc2b7351a5cb14a2241d71095ffdd03f08..3d2190f26623b23ea79c63410905a3c3ad684048 100644
> --- a/fs/smb/server/vfs.c
> +++ b/fs/smb/server/vfs.c
> @@ -230,7 +230,7 @@ int ksmbd_vfs_mkdir(struct ksmbd_work *work, const char *name, umode_t mode)
> idmap = mnt_idmap(path.mnt);
> mode |= S_IFDIR;
> d = dentry;
> - dentry = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode);
> + dentry = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode, NULL);
> if (IS_ERR(dentry))
> err = PTR_ERR(dentry);
> else if (d_is_negative(dentry))
> diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c
> index 9c12cb8442311ca26b169e4d1567939ae44a5be0..91c9d07b97f306f57aebb9b69ba564b0c2cb8c17 100644
> --- a/fs/xfs/scrub/orphanage.c
> +++ b/fs/xfs/scrub/orphanage.c
> @@ -167,7 +167,7 @@ xrep_orphanage_create(
> */
> if (d_really_is_negative(orphanage_dentry)) {
> orphanage_dentry = vfs_mkdir(&nop_mnt_idmap, root_inode,
> - orphanage_dentry, 0750);
> + orphanage_dentry, 0750, NULL);
> error = PTR_ERR(orphanage_dentry);
> if (IS_ERR(orphanage_dentry))
> goto out_unlock_root;
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index c895146c1444be36e0a779df55622cc38c9419ff..1040df3792794cd353b86558b41618294e25b8a6 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2113,7 +2113,7 @@ bool inode_owner_or_capable(struct mnt_idmap *idmap,
> int vfs_create(struct mnt_idmap *, struct inode *,
> struct dentry *, umode_t, bool);
> struct dentry *vfs_mkdir(struct mnt_idmap *, struct inode *,
> - struct dentry *, umode_t);
> + struct dentry *, umode_t, struct inode **);
> int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *,
> umode_t, dev_t);
> int vfs_symlink(struct mnt_idmap *, struct inode *,
>
> --
> 2.51.0
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 06/13] vfs: make vfs_create break delegations on parent directory
2025-10-21 15:25 ` [PATCH v3 06/13] vfs: make vfs_create break delegations on parent directory Jeff Layton
@ 2025-10-29 13:23 ` Christian Brauner
2025-10-29 13:38 ` Jeff Layton
0 siblings, 1 reply; 26+ messages in thread
From: Christian Brauner @ 2025-10-29 13:23 UTC (permalink / raw)
To: Jeff Layton
Cc: Miklos Szeredi, Alexander Viro, Jan Kara, Chuck Lever,
Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
Bharath SM, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, David Howells, Tyler Hicks, NeilBrown,
Olga Kornievskaia, Dai Ngo, Amir Goldstein, Namjae Jeon,
Steve French, Sergey Senozhatsky, Carlos Maiolino,
Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, linux-fsdevel, linux-kernel, linux-nfs,
linux-cifs, samba-technical, netfs, ecryptfs, linux-unionfs,
linux-xfs, netdev
On Tue, Oct 21, 2025 at 11:25:41AM -0400, Jeff Layton wrote:
> In order to add directory delegation support, we need to break
> delegations on the parent whenever there is going to be a change in the
> directory.
>
> Add a delegated_inode parameter to vfs_create. Most callers are
> converted to pass in NULL, but do_mknodat() is changed to wait for a
> delegation break if there is one.
>
> Reviewed-by: Jan Kara <jack@suse.cz>
> Reviewed-by: NeilBrown <neil@brown.name>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
> fs/ecryptfs/inode.c | 2 +-
> fs/namei.c | 26 +++++++++++++++++++-------
> fs/nfsd/nfs3proc.c | 2 +-
> fs/nfsd/vfs.c | 3 +--
> fs/open.c | 2 +-
> fs/overlayfs/overlayfs.h | 2 +-
> fs/smb/server/vfs.c | 2 +-
> include/linux/fs.h | 2 +-
> 8 files changed, 26 insertions(+), 15 deletions(-)
>
> diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
> index 88631291b32535f623a3fbe4ea9b6ed48a306ca0..661709b157ce854c3bfdfdb13f7c10435fad9756 100644
> --- a/fs/ecryptfs/inode.c
> +++ b/fs/ecryptfs/inode.c
> @@ -189,7 +189,7 @@ ecryptfs_do_create(struct inode *directory_inode,
> rc = lock_parent(ecryptfs_dentry, &lower_dentry, &lower_dir);
> if (!rc)
> rc = vfs_create(&nop_mnt_idmap, lower_dir,
> - lower_dentry, mode, true);
> + lower_dentry, mode, true, NULL);
Starts to look like we should explore whether a struct create_args (or
some other name), similar to the struct renamedata I did some years ago,
would help make the code a bit more legible in the future.
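
A rough userspace sketch of what such an argument struct might look like
follows. The struct layout, the field names and sketch_vfs_create() are all
hypothetical; the stub types exist only so the example compiles, and the
function body does nothing beyond showing how the arguments would be consumed.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stub types standing in for the kernel's; invented for this sketch. */
struct mnt_idmap { int unused; };
struct inode { unsigned long i_ino; };
struct dentry { const char *d_name; };

/* Hypothetical argument bundle, by analogy with struct renamedata. */
struct create_args {
	struct mnt_idmap *idmap;	/* idmap of the mount */
	struct inode *dir;		/* parent directory */
	struct dentry *dentry;		/* child to create */
	unsigned short mode;		/* mode of the new file */
	bool want_excl;			/* exclusive create requested */
	struct inode **delegated_inode;	/* may be NULL */
};

/* Stand-in for vfs_create() taking the bundle instead of six arguments. */
static int sketch_vfs_create(const struct create_args *args)
{
	if (!args->dir || !args->dentry)
		return -EINVAL;
	/* No delegation is modelled here, so report "nothing to wait on". */
	if (args->delegated_inode)
		*args->delegated_inode = NULL;
	return 0;
}
```

Callers that don't care about delegations would leave .delegated_inode unset
in a designated initializer rather than appending another NULL to the call.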
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 03/13] vfs: allow mkdir to wait for delegation break on parent
2025-10-29 13:04 ` Christian Brauner
@ 2025-10-29 13:37 ` Jeff Layton
2025-10-31 12:23 ` Christian Brauner
0 siblings, 1 reply; 26+ messages in thread
From: Jeff Layton @ 2025-10-29 13:37 UTC (permalink / raw)
To: Christian Brauner
Cc: Miklos Szeredi, Alexander Viro, Jan Kara, Chuck Lever,
Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
Bharath SM, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, David Howells, Tyler Hicks, NeilBrown,
Olga Kornievskaia, Dai Ngo, Amir Goldstein, Namjae Jeon,
Steve French, Sergey Senozhatsky, Carlos Maiolino,
Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, linux-fsdevel, linux-kernel, linux-nfs,
linux-cifs, samba-technical, netfs, ecryptfs, linux-unionfs,
linux-xfs, netdev
On Wed, 2025-10-29 at 14:04 +0100, Christian Brauner wrote:
> On Tue, Oct 21, 2025 at 11:25:38AM -0400, Jeff Layton wrote:
> > In order to add directory delegation support, we need to break
> > delegations on the parent whenever there is going to be a change in the
> > directory.
> >
> > Add a new delegated_inode parameter to vfs_mkdir. All of the existing
> > callers set that to NULL for now, except for do_mkdirat which will
> > properly block until the lease is gone.
> >
> > Reviewed-by: Jan Kara <jack@suse.cz>
> > Reviewed-by: NeilBrown <neil@brown.name>
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> > drivers/base/devtmpfs.c | 2 +-
> > fs/cachefiles/namei.c | 2 +-
> > fs/ecryptfs/inode.c | 2 +-
> > fs/init.c | 2 +-
> > fs/namei.c | 24 ++++++++++++++++++------
> > fs/nfsd/nfs4recover.c | 2 +-
> > fs/nfsd/vfs.c | 2 +-
> > fs/overlayfs/overlayfs.h | 2 +-
> > fs/smb/server/vfs.c | 2 +-
> > fs/xfs/scrub/orphanage.c | 2 +-
> > include/linux/fs.h | 2 +-
> > 11 files changed, 28 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
> > index 9d4e46ad8352257a6a65d85526ebdbf9bf2d4b19..0e79621cb0f79870003b867ca384199171ded4e0 100644
> > --- a/drivers/base/devtmpfs.c
> > +++ b/drivers/base/devtmpfs.c
> > @@ -180,7 +180,7 @@ static int dev_mkdir(const char *name, umode_t mode)
> > if (IS_ERR(dentry))
> > return PTR_ERR(dentry);
> >
> > - dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(path.dentry), dentry, mode);
> > + dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(path.dentry), dentry, mode, NULL);
> > if (!IS_ERR(dentry))
> > /* mark as kernel-created inode */
> > d_inode(dentry)->i_private = &thread;
> > diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
> > index d1edb2ac38376c4f9d2a18026450bb3c774f7824..50c0f9c76d1fd4c05db90d7d0d1bad574523ead0 100644
> > --- a/fs/cachefiles/namei.c
> > +++ b/fs/cachefiles/namei.c
> > @@ -130,7 +130,7 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
> > goto mkdir_error;
> > ret = cachefiles_inject_write_error();
> > if (ret == 0)
> > - subdir = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700);
> > + subdir = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700, NULL);
> > else
> > subdir = ERR_PTR(ret);
> > if (IS_ERR(subdir)) {
> > diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
> > index ed1394da8d6bd7065f2a074378331f13fcda17f9..35830b3144f8f71374a78b3e7463b864f4fc216e 100644
> > --- a/fs/ecryptfs/inode.c
> > +++ b/fs/ecryptfs/inode.c
> > @@ -508,7 +508,7 @@ static struct dentry *ecryptfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> > goto out;
> >
> > lower_dentry = vfs_mkdir(&nop_mnt_idmap, lower_dir,
> > - lower_dentry, mode);
> > + lower_dentry, mode, NULL);
> > rc = PTR_ERR(lower_dentry);
> > if (IS_ERR(lower_dentry))
> > goto out;
> > diff --git a/fs/init.c b/fs/init.c
> > index 07f592ccdba868509d0f3aaf9936d8d890fdbec5..895f8a09a71acfd03e11164e3b441a7d4e2de146 100644
> > --- a/fs/init.c
> > +++ b/fs/init.c
> > @@ -233,7 +233,7 @@ int __init init_mkdir(const char *pathname, umode_t mode)
> > error = security_path_mkdir(&path, dentry, mode);
> > if (!error) {
> > dentry = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
> > - dentry, mode);
> > + dentry, mode, NULL);
> > if (IS_ERR(dentry))
> > error = PTR_ERR(dentry);
> > }
> > diff --git a/fs/namei.c b/fs/namei.c
> > index 6e61e0215b34134b1690f864e2719e3f82cf71a8..86cf6eca1f485361c6732974e4103cf5ea721539 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -4407,10 +4407,11 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d
> >
> > /**
> > * vfs_mkdir - create directory returning correct dentry if possible
> > - * @idmap: idmap of the mount the inode was found from
> > - * @dir: inode of the parent directory
> > - * @dentry: dentry of the child directory
> > - * @mode: mode of the child directory
> > + * @idmap: idmap of the mount the inode was found from
> > + * @dir: inode of the parent directory
> > + * @dentry: dentry of the child directory
> > + * @mode: mode of the child directory
> > + * @delegated_inode: returns parent inode, if the inode is delegated.
>
> I wonder if it would be feasible, and potentially more elegant, to return
> delegated inodes as a separate type, e.g. struct delegated_inode: just a
> struct wrapper around the inode itself, similar to vfsuid_t. The advantage
> is that it's not possible to accidentally misuse this thing as we pass it
> around to try_break_deleg() and so on.
>
I have a patch that does exactly that:
https://lore.kernel.org/linux-nfs/20250924-dir-deleg-v3-15-9f3af8bc5c40@kernel.org/
I didn't submit it here since it wasn't strictly required for this
patchset. Once we get around to implementing CB_NOTIFY support, however,
it will be required, since we'll need to pass back more information than
just the inode.
I could move that into this series if you prefer. If we do that though,
then it might also be cleaner to take the previous patch in the series
that cleans up __break_lease() arguments:
https://lore.kernel.org/linux-nfs/20250924-dir-deleg-v3-14-9f3af8bc5c40@kernel.org/
Let me know what you'd prefer.
> > *
> > * Create a directory.
> > *
> > @@ -4427,7 +4428,8 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d
> > * In case of an error the dentry is dput() and an ERR_PTR() is returned.
> > */
> > struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> > - struct dentry *dentry, umode_t mode)
> > + struct dentry *dentry, umode_t mode,
> > + struct inode **delegated_inode)
> > {
> > int error;
> > unsigned max_links = dir->i_sb->s_max_links;
> > @@ -4450,6 +4452,10 @@ struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> > if (max_links && dir->i_nlink >= max_links)
> > goto err;
> >
> > + error = try_break_deleg(dir, delegated_inode);
> > + if (error)
> > + goto err;
> > +
> > de = dir->i_op->mkdir(idmap, dir, dentry, mode);
> > error = PTR_ERR(de);
> > if (IS_ERR(de))
> > @@ -4473,6 +4479,7 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode)
> > struct path path;
> > int error;
> > unsigned int lookup_flags = LOOKUP_DIRECTORY;
> > + struct inode *delegated_inode = NULL;
> >
> > retry:
> > dentry = filename_create(dfd, name, &path, lookup_flags);
> > @@ -4484,11 +4491,16 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode)
> > mode_strip_umask(path.dentry->d_inode, mode));
> > if (!error) {
> > dentry = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
> > - dentry, mode);
> > + dentry, mode, &delegated_inode);
> > if (IS_ERR(dentry))
> > error = PTR_ERR(dentry);
> > }
> > end_creating_path(&path, dentry);
> > + if (delegated_inode) {
> > + error = break_deleg_wait(&delegated_inode);
> > + if (!error)
> > + goto retry;
> > + }
> > if (retry_estale(error, lookup_flags)) {
> > lookup_flags |= LOOKUP_REVAL;
> > goto retry;
> > diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> > index b1005abcb9035b2cf743200808a251b00af7e3f4..423dd102b51198ea7c447be2b9a0a5020c950dba 100644
> > --- a/fs/nfsd/nfs4recover.c
> > +++ b/fs/nfsd/nfs4recover.c
> > @@ -202,7 +202,7 @@ nfsd4_create_clid_dir(struct nfs4_client *clp)
> > * as well be forgiving and just succeed silently.
> > */
> > goto out_put;
> > - dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, S_IRWXU);
> > + dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, 0700, NULL);
> > if (IS_ERR(dentry))
> > status = PTR_ERR(dentry);
> > out_put:
> > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> > index 8b2dc7a88aab015d1e39da0dd4e6daf7e276aabe..5f24af289d509bea54a324b8851fa06de6050353 100644
> > --- a/fs/nfsd/vfs.c
> > +++ b/fs/nfsd/vfs.c
> > @@ -1645,7 +1645,7 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > nfsd_check_ignore_resizing(iap);
> > break;
> > case S_IFDIR:
> > - dchild = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode);
> > + dchild = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode, NULL);
> > if (IS_ERR(dchild)) {
> > host_err = PTR_ERR(dchild);
> > } else if (d_is_negative(dchild)) {
> > diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> > index c8fd5951fc5ece1ae6b3e2a0801ca15f9faf7d72..0f65f9a5d54d4786b39e4f4f30f416d5b9016e70 100644
> > --- a/fs/overlayfs/overlayfs.h
> > +++ b/fs/overlayfs/overlayfs.h
> > @@ -248,7 +248,7 @@ static inline struct dentry *ovl_do_mkdir(struct ovl_fs *ofs,
> > {
> > struct dentry *ret;
> >
> > - ret = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode);
> > + ret = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode, NULL);
> > pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, PTR_ERR_OR_ZERO(ret));
> > return ret;
> > }
> > diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c
> > index 891ed2dc2b7351a5cb14a2241d71095ffdd03f08..3d2190f26623b23ea79c63410905a3c3ad684048 100644
> > --- a/fs/smb/server/vfs.c
> > +++ b/fs/smb/server/vfs.c
> > @@ -230,7 +230,7 @@ int ksmbd_vfs_mkdir(struct ksmbd_work *work, const char *name, umode_t mode)
> > idmap = mnt_idmap(path.mnt);
> > mode |= S_IFDIR;
> > d = dentry;
> > - dentry = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode);
> > + dentry = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode, NULL);
> > if (IS_ERR(dentry))
> > err = PTR_ERR(dentry);
> > else if (d_is_negative(dentry))
> > diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c
> > index 9c12cb8442311ca26b169e4d1567939ae44a5be0..91c9d07b97f306f57aebb9b69ba564b0c2cb8c17 100644
> > --- a/fs/xfs/scrub/orphanage.c
> > +++ b/fs/xfs/scrub/orphanage.c
> > @@ -167,7 +167,7 @@ xrep_orphanage_create(
> > */
> > if (d_really_is_negative(orphanage_dentry)) {
> > orphanage_dentry = vfs_mkdir(&nop_mnt_idmap, root_inode,
> > - orphanage_dentry, 0750);
> > + orphanage_dentry, 0750, NULL);
> > error = PTR_ERR(orphanage_dentry);
> > if (IS_ERR(orphanage_dentry))
> > goto out_unlock_root;
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index c895146c1444be36e0a779df55622cc38c9419ff..1040df3792794cd353b86558b41618294e25b8a6 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -2113,7 +2113,7 @@ bool inode_owner_or_capable(struct mnt_idmap *idmap,
> > int vfs_create(struct mnt_idmap *, struct inode *,
> > struct dentry *, umode_t, bool);
> > struct dentry *vfs_mkdir(struct mnt_idmap *, struct inode *,
> > - struct dentry *, umode_t);
> > + struct dentry *, umode_t, struct inode **);
> > int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *,
> > umode_t, dev_t);
> > int vfs_symlink(struct mnt_idmap *, struct inode *,
> >
> > --
> > 2.51.0
> >
--
Jeff Layton <jlayton@kernel.org>
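
For readers following the do_mkdirat() hunk quoted above, the
break-then-retry shape can be sketched in userspace roughly as follows.
Everything here is a mock: the pending_breaks counter and the sketch_*
functions are hypothetical stand-ins, and the "wait" simply decrements a
counter instead of sleeping until the delegation holder responds.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in inode; pending_breaks is a hypothetical counter of
 * delegations that still have to be recalled. */
struct inode {
	int pending_breaks;
};

/* Non-blocking attempt: report the delegated inode instead of sleeping. */
static int sketch_try_break_deleg(struct inode *inode,
				  struct inode **delegated)
{
	if (inode->pending_breaks == 0)
		return 0;
	if (delegated)
		*delegated = inode;
	return -EWOULDBLOCK;
}

/* Blocking wait, mocked as "the holder returns one delegation". */
static int sketch_break_deleg_wait(struct inode **delegated)
{
	(*delegated)->pending_breaks--;
	*delegated = NULL;
	return 0;
}

/* Mock of the vfs_mkdir() step: break the parent's delegation first. */
static int sketch_mkdir(struct inode *dir, struct inode **delegated)
{
	int error = sketch_try_break_deleg(dir, delegated);

	if (error)
		return error;
	return 0;	/* the real directory creation would happen here */
}

/* Caller-side loop mirroring the shape of the quoted hunk. */
static int sketch_do_mkdir(struct inode *dir, int *attempts)
{
	struct inode *delegated = NULL;
	int error;

retry:
	(*attempts)++;
	error = sketch_mkdir(dir, &delegated);
	if (delegated) {
		/* Drop locks (not modelled), wait, then start over. */
		error = sketch_break_deleg_wait(&delegated);
		if (!error)
			goto retry;
	}
	return error;
}
```

Callers that pass NULL for the out-parameter, as most converted callers do
in this patch, get the error back and do not wait.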
* Re: [PATCH v3 00/13] vfs: recall-only directory delegations for knfsd
2025-10-21 15:25 [PATCH v3 00/13] vfs: recall-only directory delegations for knfsd Jeff Layton
` (13 preceding siblings ...)
2025-10-29 12:55 ` [PATCH v3 00/13] vfs: recall-only directory delegations for knfsd Christian Brauner
@ 2025-10-29 13:38 ` Christian Brauner
2025-10-29 13:39 ` Jeff Layton
14 siblings, 1 reply; 26+ messages in thread
From: Christian Brauner @ 2025-10-29 13:38 UTC (permalink / raw)
To: Jeff Layton
Cc: Miklos Szeredi, Alexander Viro, Jan Kara, Chuck Lever,
Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
Bharath SM, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, David Howells, Tyler Hicks, NeilBrown,
Olga Kornievskaia, Dai Ngo, Amir Goldstein, Namjae Jeon,
Steve French, Sergey Senozhatsky, Carlos Maiolino,
Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, linux-fsdevel, linux-kernel, linux-nfs,
linux-cifs, samba-technical, netfs, ecryptfs, linux-unionfs,
linux-xfs, netdev
On Tue, Oct 21, 2025 at 11:25:35AM -0400, Jeff Layton wrote:
> Behold, another version of directory delegations. This version contains
> support for recall-only delegations. Support for CB_NOTIFY will be
> forthcoming (once the client-side patches have caught up).
>
> The main differences in this version are bugfixes, but the last patch
> adds a more formal API for userland to request a delegation. That
> support is optional. We can drop it and the rest of the series should be
> fine.
>
> My main interest in making delegations available to userland is to allow
> testing this support without nfsd. I have an xfstest ready to submit for
> this if that support looks acceptable. If it is, then I'll also plan to
> submit an update for fcntl(2).
>
> Christian, Chuck mentioned he was fine with you merging the nfsd bits
> too, if you're willing to take the whole pile.
This all looks good to me btw. The only thing I'm having issues with is:
Base: base-commit d2ced3cadfab04c7e915adf0a73c53fcf1642719 not known, ignoring
Base: attempting to guess base-commit...
Base: tags/v6.18-rc1-23-g2c09630d09c6 (best guess, 21/27 blobs matched)
Base: v6.18-rc1
Magic: Preparing a sparse worktree
Unable to cleanly apply series, see failure log below
---
Applying: filelock: push the S_ISREG check down to ->setlease handlers
Applying: vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink}
Applying: vfs: allow mkdir to wait for delegation break on parent
Applying: vfs: allow rmdir to wait for delegation break on parent
Patch failed at 0004 vfs: allow rmdir to wait for delegation break on parent
error: invalid object 100644 423dd102b51198ea7c447be2b9a0a5020c950dba for 'fs/nfsd/nfs4recover.c'
error: Repository lacks necessary blobs to fall back on 3-way merge.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
That commit isn't in -next nor in any of my branches?
Can you resend on top of: vfs-6.19.directory.delegations please?
>
> Thanks!
> Jeff
>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
> Changes in v3:
> - Fix potential nfsd_file refcount leaks on GET_DIR_DELEGATION error
> - Add missing parent dir deleg break in vfs_symlink()
> - Add F_SETDELEG/F_GETDELEG support to fcntl()
> - Link to v2: https://lore.kernel.org/r/20251017-dir-deleg-ro-v2-0-8c8f6dd23c8b@kernel.org
>
> Changes in v2:
> - handle lease conflict resolution inside of nfsd
> - drop the lm_may_setlease lock_manager operation
> - just add extra argument to vfs_create() instead of creating wrapper
> - don't allocate fsnotify_mark for open directories
> - Link to v1: https://lore.kernel.org/r/20251013-dir-deleg-ro-v1-0-406780a70e5e@kernel.org
>
> ---
> Jeff Layton (13):
> filelock: push the S_ISREG check down to ->setlease handlers
> vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink}
> vfs: allow mkdir to wait for delegation break on parent
> vfs: allow rmdir to wait for delegation break on parent
> vfs: break parent dir delegations in open(..., O_CREAT) codepath
> vfs: make vfs_create break delegations on parent directory
> vfs: make vfs_mknod break delegations on parent directory
> vfs: make vfs_symlink break delegations on parent dir
> filelock: lift the ban on directory leases in generic_setlease
> nfsd: allow filecache to hold S_IFDIR files
> nfsd: allow DELEGRETURN on directories
> nfsd: wire up GET_DIR_DELEGATION handling
> vfs: expose delegation support to userland
>
> drivers/base/devtmpfs.c | 6 +-
> fs/cachefiles/namei.c | 2 +-
> fs/ecryptfs/inode.c | 10 +--
> fs/fcntl.c | 9 +++
> fs/fuse/dir.c | 1 +
> fs/init.c | 6 +-
> fs/locks.c | 68 +++++++++++++++-----
> fs/namei.c | 150 +++++++++++++++++++++++++++++++++++----------
> fs/nfs/nfs4file.c | 2 +
> fs/nfsd/filecache.c | 57 ++++++++++++-----
> fs/nfsd/filecache.h | 2 +
> fs/nfsd/nfs3proc.c | 2 +-
> fs/nfsd/nfs4proc.c | 22 ++++++-
> fs/nfsd/nfs4recover.c | 6 +-
> fs/nfsd/nfs4state.c | 103 ++++++++++++++++++++++++++++++-
> fs/nfsd/state.h | 5 ++
> fs/nfsd/vfs.c | 16 ++---
> fs/nfsd/vfs.h | 2 +-
> fs/open.c | 2 +-
> fs/overlayfs/overlayfs.h | 10 +--
> fs/smb/client/cifsfs.c | 3 +
> fs/smb/server/vfs.c | 8 +--
> fs/xfs/scrub/orphanage.c | 2 +-
> include/linux/filelock.h | 12 ++++
> include/linux/fs.h | 13 ++--
> include/uapi/linux/fcntl.h | 10 +++
> net/unix/af_unix.c | 2 +-
> 27 files changed, 425 insertions(+), 106 deletions(-)
> ---
> base-commit: d2ced3cadfab04c7e915adf0a73c53fcf1642719
> change-id: 20251013-dir-deleg-ro-d0fe19823b21
>
> Best regards,
> --
> Jeff Layton <jlayton@kernel.org>
>
* Re: [PATCH v3 06/13] vfs: make vfs_create break delegations on parent directory
2025-10-29 13:23 ` Christian Brauner
@ 2025-10-29 13:38 ` Jeff Layton
0 siblings, 0 replies; 26+ messages in thread
From: Jeff Layton @ 2025-10-29 13:38 UTC (permalink / raw)
To: Christian Brauner
Cc: Miklos Szeredi, Alexander Viro, Jan Kara, Chuck Lever,
Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
Bharath SM, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, David Howells, Tyler Hicks, NeilBrown,
Olga Kornievskaia, Dai Ngo, Amir Goldstein, Namjae Jeon,
Steve French, Sergey Senozhatsky, Carlos Maiolino,
Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, linux-fsdevel, linux-kernel, linux-nfs,
linux-cifs, samba-technical, netfs, ecryptfs, linux-unionfs,
linux-xfs, netdev
On Wed, 2025-10-29 at 14:23 +0100, Christian Brauner wrote:
> On Tue, Oct 21, 2025 at 11:25:41AM -0400, Jeff Layton wrote:
> > In order to add directory delegation support, we need to break
> > delegations on the parent whenever there is going to be a change in the
> > directory.
> >
> > Add a delegated_inode parameter to vfs_create. Most callers are
> > converted to pass in NULL, but do_mknodat() is changed to wait for a
> > delegation break if there is one.
> >
> > Reviewed-by: Jan Kara <jack@suse.cz>
> > Reviewed-by: NeilBrown <neil@brown.name>
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> > fs/ecryptfs/inode.c | 2 +-
> > fs/namei.c | 26 +++++++++++++++++++-------
> > fs/nfsd/nfs3proc.c | 2 +-
> > fs/nfsd/vfs.c | 3 +--
> > fs/open.c | 2 +-
> > fs/overlayfs/overlayfs.h | 2 +-
> > fs/smb/server/vfs.c | 2 +-
> > include/linux/fs.h | 2 +-
> > 8 files changed, 26 insertions(+), 15 deletions(-)
> >
> > diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
> > index 88631291b32535f623a3fbe4ea9b6ed48a306ca0..661709b157ce854c3bfdfdb13f7c10435fad9756 100644
> > --- a/fs/ecryptfs/inode.c
> > +++ b/fs/ecryptfs/inode.c
> > @@ -189,7 +189,7 @@ ecryptfs_do_create(struct inode *directory_inode,
> > rc = lock_parent(ecryptfs_dentry, &lower_dentry, &lower_dir);
> > if (!rc)
> > rc = vfs_create(&nop_mnt_idmap, lower_dir,
> > - lower_dentry, mode, true);
> > + lower_dentry, mode, true, NULL);
>
> > Starts to look like we should explore whether a struct create_args (or
> some other name) similar to struct renamedata I did some years ago would
> help make the code a bit more legible in the future.
I like that idea. Let me see if I can rework this patch along those
lines. I'll plan to send an updated set.
Thanks,
--
Jeff Layton <jlayton@kernel.org>
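
To make the create_args suggestion concrete, here is a rough userspace
sketch of what an argument bundle could look like. The struct layout, the
field names, and the sketch_* functions are assumptions for illustration
and are not taken from any posted patch. The stand-in types exist only so
the example compiles outside the kernel.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct mnt_idmap;		/* opaque stand-ins for kernel types */
struct dentry;
struct inode {
	int has_delegation;	/* hypothetical flag, for illustration only */
};

typedef unsigned short umode_t;

/* Hypothetical argument bundle; field names are illustrative only. */
struct create_args {
	struct mnt_idmap *idmap;
	struct inode *dir;
	struct dentry *dentry;
	umode_t mode;
	bool want_excl;
	struct inode **delegated_inode;	/* NULL: caller will not wait */
};

/* Mock create step that takes one pointer instead of six arguments. */
static int sketch_vfs_create(const struct create_args *args)
{
	if (!args->dir)
		return -EINVAL;
	if (args->dir->has_delegation) {
		if (args->delegated_inode)
			*args->delegated_inode = args->dir;
		return -EWOULDBLOCK;
	}
	return 0;
}

/*
 * A caller that does not care about delegations. Fields left out of the
 * designated initializer are zero/NULL, so no trailing NULL is needed.
 */
static int sketch_create_simple(struct inode *dir, umode_t mode)
{
	struct create_args args = {
		.dir = dir,
		.mode = mode,
		.want_excl = true,
	};

	return sketch_vfs_create(&args);
}
```

With a bundle like this, adding another optional argument later would not
require touching every call site, since omitted fields default to zero.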
* Re: [PATCH v3 00/13] vfs: recall-only directory delegations for knfsd
2025-10-29 13:38 ` Christian Brauner
@ 2025-10-29 13:39 ` Jeff Layton
0 siblings, 0 replies; 26+ messages in thread
From: Jeff Layton @ 2025-10-29 13:39 UTC (permalink / raw)
To: Christian Brauner
Cc: Miklos Szeredi, Alexander Viro, Jan Kara, Chuck Lever,
Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
Bharath SM, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, David Howells, Tyler Hicks, NeilBrown,
Olga Kornievskaia, Dai Ngo, Amir Goldstein, Namjae Jeon,
Steve French, Sergey Senozhatsky, Carlos Maiolino,
Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, linux-fsdevel, linux-kernel, linux-nfs,
linux-cifs, samba-technical, netfs, ecryptfs, linux-unionfs,
linux-xfs, netdev
On Wed, 2025-10-29 at 14:38 +0100, Christian Brauner wrote:
> On Tue, Oct 21, 2025 at 11:25:35AM -0400, Jeff Layton wrote:
> > Behold, another version of directory delegations. This version contains
> > support for recall-only delegations. Support for CB_NOTIFY will be
> > forthcoming (once the client-side patches have caught up).
> >
> > The main differences in this version are bugfixes, but the last patch
> > adds a more formal API for userland to request a delegation. That
> > support is optional. We can drop it and the rest of the series should be
> > fine.
> >
> > My main interest in making delegations available to userland is to allow
> > testing this support without nfsd. I have an xfstest ready to submit for
> > this if that support looks acceptable. If it is, then I'll also plan to
> > submit an update for fcntl(2).
> >
> > Christian, Chuck mentioned he was fine with you merging the nfsd bits
> > too, if you're willing to take the whole pile.
>
> This all looks good to me btw. The only thing I'm having issues with is:
>
> Base: base-commit d2ced3cadfab04c7e915adf0a73c53fcf1642719 not known, ignoring
> Base: attempting to guess base-commit...
> Base: tags/v6.18-rc1-23-g2c09630d09c6 (best guess, 21/27 blobs matched)
> Base: v6.18-rc1
> Magic: Preparing a sparse worktree
> Unable to cleanly apply series, see failure log below
> ---
> Applying: filelock: push the S_ISREG check down to ->setlease handlers
> Applying: vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink}
> Applying: vfs: allow mkdir to wait for delegation break on parent
> Applying: vfs: allow rmdir to wait for delegation break on parent
> Patch failed at 0004 vfs: allow rmdir to wait for delegation break on parent
> error: invalid object 100644 423dd102b51198ea7c447be2b9a0a5020c950dba for 'fs/nfsd/nfs4recover.c'
> error: Repository lacks necessary blobs to fall back on 3-way merge.
> hint: Use 'git am --show-current-patch=diff' to see the failed patch
> hint: When you have resolved this problem, run "git am --continue".
> hint: If you prefer to skip this patch, run "git am --skip" instead.
> hint: To restore the original branch and stop patching, run "git am --abort".
> hint: Disable this message with "git config advice.mergeConflict false"
>
> That commit isn't in -next nor in any of my branches?
> Can you resend on top of: vfs-6.19.directory.delegations please?
>
Will do. It's a simple fix. I had based this on top of fs-next, which
has Chuck's tree in it too.
> >
> > Thanks!
> > Jeff
> >
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> > Changes in v3:
> > - Fix potential nfsd_file refcount leaks on GET_DIR_DELEGATION error
> > - Add missing parent dir deleg break in vfs_symlink()
> > - Add F_SETDELEG/F_GETDELEG support to fcntl()
> > - Link to v2: https://lore.kernel.org/r/20251017-dir-deleg-ro-v2-0-8c8f6dd23c8b@kernel.org
> >
> > Changes in v2:
> > - handle lease conflict resolution inside of nfsd
> > - drop the lm_may_setlease lock_manager operation
> > - just add extra argument to vfs_create() instead of creating wrapper
> > - don't allocate fsnotify_mark for open directories
> > - Link to v1: https://lore.kernel.org/r/20251013-dir-deleg-ro-v1-0-406780a70e5e@kernel.org
> >
> > ---
> > Jeff Layton (13):
> > filelock: push the S_ISREG check down to ->setlease handlers
> > vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink}
> > vfs: allow mkdir to wait for delegation break on parent
> > vfs: allow rmdir to wait for delegation break on parent
> > vfs: break parent dir delegations in open(..., O_CREAT) codepath
> > vfs: make vfs_create break delegations on parent directory
> > vfs: make vfs_mknod break delegations on parent directory
> > vfs: make vfs_symlink break delegations on parent dir
> > filelock: lift the ban on directory leases in generic_setlease
> > nfsd: allow filecache to hold S_IFDIR files
> > nfsd: allow DELEGRETURN on directories
> > nfsd: wire up GET_DIR_DELEGATION handling
> > vfs: expose delegation support to userland
> >
> > drivers/base/devtmpfs.c | 6 +-
> > fs/cachefiles/namei.c | 2 +-
> > fs/ecryptfs/inode.c | 10 +--
> > fs/fcntl.c | 9 +++
> > fs/fuse/dir.c | 1 +
> > fs/init.c | 6 +-
> > fs/locks.c | 68 +++++++++++++++-----
> > fs/namei.c | 150 +++++++++++++++++++++++++++++++++++----------
> > fs/nfs/nfs4file.c | 2 +
> > fs/nfsd/filecache.c | 57 ++++++++++++-----
> > fs/nfsd/filecache.h | 2 +
> > fs/nfsd/nfs3proc.c | 2 +-
> > fs/nfsd/nfs4proc.c | 22 ++++++-
> > fs/nfsd/nfs4recover.c | 6 +-
> > fs/nfsd/nfs4state.c | 103 ++++++++++++++++++++++++++++++-
> > fs/nfsd/state.h | 5 ++
> > fs/nfsd/vfs.c | 16 ++---
> > fs/nfsd/vfs.h | 2 +-
> > fs/open.c | 2 +-
> > fs/overlayfs/overlayfs.h | 10 +--
> > fs/smb/client/cifsfs.c | 3 +
> > fs/smb/server/vfs.c | 8 +--
> > fs/xfs/scrub/orphanage.c | 2 +-
> > include/linux/filelock.h | 12 ++++
> > include/linux/fs.h | 13 ++--
> > include/uapi/linux/fcntl.h | 10 +++
> > net/unix/af_unix.c | 2 +-
> > 27 files changed, 425 insertions(+), 106 deletions(-)
> > ---
> > base-commit: d2ced3cadfab04c7e915adf0a73c53fcf1642719
> > change-id: 20251013-dir-deleg-ro-d0fe19823b21
> >
> > Best regards,
> > --
> > Jeff Layton <jlayton@kernel.org>
> >
--
Jeff Layton <jlayton@kernel.org>
* Re: [PATCH v3 03/13] vfs: allow mkdir to wait for delegation break on parent
2025-10-29 13:37 ` Jeff Layton
@ 2025-10-31 12:23 ` Christian Brauner
0 siblings, 0 replies; 26+ messages in thread
From: Christian Brauner @ 2025-10-31 12:23 UTC (permalink / raw)
To: Jeff Layton
Cc: Miklos Szeredi, Alexander Viro, Jan Kara, Chuck Lever,
Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
Bharath SM, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, David Howells, Tyler Hicks, NeilBrown,
Olga Kornievskaia, Dai Ngo, Amir Goldstein, Namjae Jeon,
Steve French, Sergey Senozhatsky, Carlos Maiolino,
Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, linux-fsdevel, linux-kernel, linux-nfs,
linux-cifs, samba-technical, netfs, ecryptfs, linux-unionfs,
linux-xfs, netdev
On Wed, Oct 29, 2025 at 09:37:22AM -0400, Jeff Layton wrote:
> On Wed, 2025-10-29 at 14:04 +0100, Christian Brauner wrote:
> > On Tue, Oct 21, 2025 at 11:25:38AM -0400, Jeff Layton wrote:
> > > In order to add directory delegation support, we need to break
> > > delegations on the parent whenever there is going to be a change in the
> > > directory.
> > >
> > > Add a new delegated_inode parameter to vfs_mkdir. All of the existing
> > > callers set that to NULL for now, except for do_mkdirat which will
> > > properly block until the lease is gone.
> > >
> > > Reviewed-by: Jan Kara <jack@suse.cz>
> > > Reviewed-by: NeilBrown <neil@brown.name>
> > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > ---
> > > drivers/base/devtmpfs.c | 2 +-
> > > fs/cachefiles/namei.c | 2 +-
> > > fs/ecryptfs/inode.c | 2 +-
> > > fs/init.c | 2 +-
> > > fs/namei.c | 24 ++++++++++++++++++------
> > > fs/nfsd/nfs4recover.c | 2 +-
> > > fs/nfsd/vfs.c | 2 +-
> > > fs/overlayfs/overlayfs.h | 2 +-
> > > fs/smb/server/vfs.c | 2 +-
> > > fs/xfs/scrub/orphanage.c | 2 +-
> > > include/linux/fs.h | 2 +-
> > > 11 files changed, 28 insertions(+), 16 deletions(-)
> > >
> > > diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
> > > index 9d4e46ad8352257a6a65d85526ebdbf9bf2d4b19..0e79621cb0f79870003b867ca384199171ded4e0 100644
> > > --- a/drivers/base/devtmpfs.c
> > > +++ b/drivers/base/devtmpfs.c
> > > @@ -180,7 +180,7 @@ static int dev_mkdir(const char *name, umode_t mode)
> > > if (IS_ERR(dentry))
> > > return PTR_ERR(dentry);
> > >
> > > - dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(path.dentry), dentry, mode);
> > > + dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(path.dentry), dentry, mode, NULL);
> > > if (!IS_ERR(dentry))
> > > /* mark as kernel-created inode */
> > > d_inode(dentry)->i_private = &thread;
> > > diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
> > > index d1edb2ac38376c4f9d2a18026450bb3c774f7824..50c0f9c76d1fd4c05db90d7d0d1bad574523ead0 100644
> > > --- a/fs/cachefiles/namei.c
> > > +++ b/fs/cachefiles/namei.c
> > > @@ -130,7 +130,7 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
> > > goto mkdir_error;
> > > ret = cachefiles_inject_write_error();
> > > if (ret == 0)
> > > - subdir = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700);
> > > + subdir = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700, NULL);
> > > else
> > > subdir = ERR_PTR(ret);
> > > if (IS_ERR(subdir)) {
> > > diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
> > > index ed1394da8d6bd7065f2a074378331f13fcda17f9..35830b3144f8f71374a78b3e7463b864f4fc216e 100644
> > > --- a/fs/ecryptfs/inode.c
> > > +++ b/fs/ecryptfs/inode.c
> > > @@ -508,7 +508,7 @@ static struct dentry *ecryptfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> > > goto out;
> > >
> > > lower_dentry = vfs_mkdir(&nop_mnt_idmap, lower_dir,
> > > - lower_dentry, mode);
> > > + lower_dentry, mode, NULL);
> > > rc = PTR_ERR(lower_dentry);
> > > if (IS_ERR(lower_dentry))
> > > goto out;
> > > diff --git a/fs/init.c b/fs/init.c
> > > index 07f592ccdba868509d0f3aaf9936d8d890fdbec5..895f8a09a71acfd03e11164e3b441a7d4e2de146 100644
> > > --- a/fs/init.c
> > > +++ b/fs/init.c
> > > @@ -233,7 +233,7 @@ int __init init_mkdir(const char *pathname, umode_t mode)
> > > error = security_path_mkdir(&path, dentry, mode);
> > > if (!error) {
> > > dentry = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
> > > - dentry, mode);
> > > + dentry, mode, NULL);
> > > if (IS_ERR(dentry))
> > > error = PTR_ERR(dentry);
> > > }
> > > diff --git a/fs/namei.c b/fs/namei.c
> > > index 6e61e0215b34134b1690f864e2719e3f82cf71a8..86cf6eca1f485361c6732974e4103cf5ea721539 100644
> > > --- a/fs/namei.c
> > > +++ b/fs/namei.c
> > > @@ -4407,10 +4407,11 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d
> > >
> > > /**
> > > * vfs_mkdir - create directory returning correct dentry if possible
> > > - * @idmap: idmap of the mount the inode was found from
> > > - * @dir: inode of the parent directory
> > > - * @dentry: dentry of the child directory
> > > - * @mode: mode of the child directory
> > > + * @idmap: idmap of the mount the inode was found from
> > > + * @dir: inode of the parent directory
> > > + * @dentry: dentry of the child directory
> > > + * @mode: mode of the child directory
> > > + * @delegated_inode: returns parent inode, if the inode is delegated.
> >
> > I wonder if it would be feasible and potentially elegant if delegated
> > inodes were returned as separate type like struct delegated_inode
> > similar to the vfsuid_t just a struct wrapper around the inode itself.
> > > The advantage is that it's not possible to accidentally abuse this thing
> > as we're passing that stuff around to try_break_deleg() and so on.
> >
>
> I have a patch that does exactly that:
>
> https://lore.kernel.org/linux-nfs/20250924-dir-deleg-v3-15-9f3af8bc5c40@kernel.org/
I love it!
>
> > I didn't submit it here since it wasn't strictly required for this
> > patchset. If we get around to implementing CB_NOTIFY support, however,
> > it will be required, since we'll need to pass back more information
> > than just the inode.
>
> I could move that into this series if you prefer. If we do that though,
> then it might also be cleaner to take the previous patch in the series
> that cleans up __break_lease() arguments:
>
> https://lore.kernel.org/linux-nfs/20250924-dir-deleg-v3-14-9f3af8bc5c40@kernel.org/
If you have a wholesome story to tell then by all means tell it all.
Don't GRRM us with half a tale. IOW, it's fine to have a large patch
series. If it's well split-up then it's great. Al or I apparently can't
get a series out the door that's under 20 patches so you get the same
leeway. :)
>
> Let me know what you'd prefer.
>
> > > *
> > > * Create a directory.
> > > *
> > > @@ -4427,7 +4428,8 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d
> > > * In case of an error the dentry is dput() and an ERR_PTR() is returned.
> > > */
> > > struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> > > - struct dentry *dentry, umode_t mode)
> > > + struct dentry *dentry, umode_t mode,
> > > + struct inode **delegated_inode)
> > > {
> > > int error;
> > > unsigned max_links = dir->i_sb->s_max_links;
> > > @@ -4450,6 +4452,10 @@ struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> > > if (max_links && dir->i_nlink >= max_links)
> > > goto err;
> > >
> > > + error = try_break_deleg(dir, delegated_inode);
> > > + if (error)
> > > + goto err;
> > > +
> > > de = dir->i_op->mkdir(idmap, dir, dentry, mode);
> > > error = PTR_ERR(de);
> > > if (IS_ERR(de))
> > > @@ -4473,6 +4479,7 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode)
> > > struct path path;
> > > int error;
> > > unsigned int lookup_flags = LOOKUP_DIRECTORY;
> > > + struct inode *delegated_inode = NULL;
> > >
> > > retry:
> > > dentry = filename_create(dfd, name, &path, lookup_flags);
> > > @@ -4484,11 +4491,16 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode)
> > > mode_strip_umask(path.dentry->d_inode, mode));
> > > if (!error) {
> > > dentry = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
> > > - dentry, mode);
> > > + dentry, mode, &delegated_inode);
> > > if (IS_ERR(dentry))
> > > error = PTR_ERR(dentry);
> > > }
> > > end_creating_path(&path, dentry);
> > > + if (delegated_inode) {
> > > + error = break_deleg_wait(&delegated_inode);
> > > + if (!error)
> > > + goto retry;
> > > + }
> > > if (retry_estale(error, lookup_flags)) {
> > > lookup_flags |= LOOKUP_REVAL;
> > > goto retry;
> > > diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> > > index b1005abcb9035b2cf743200808a251b00af7e3f4..423dd102b51198ea7c447be2b9a0a5020c950dba 100644
> > > --- a/fs/nfsd/nfs4recover.c
> > > +++ b/fs/nfsd/nfs4recover.c
> > > @@ -202,7 +202,7 @@ nfsd4_create_clid_dir(struct nfs4_client *clp)
> > > * as well be forgiving and just succeed silently.
> > > */
> > > goto out_put;
> > > - dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, S_IRWXU);
> > > + dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, 0700, NULL);
> > > if (IS_ERR(dentry))
> > > status = PTR_ERR(dentry);
> > > out_put:
> > > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> > > index 8b2dc7a88aab015d1e39da0dd4e6daf7e276aabe..5f24af289d509bea54a324b8851fa06de6050353 100644
> > > --- a/fs/nfsd/vfs.c
> > > +++ b/fs/nfsd/vfs.c
> > > @@ -1645,7 +1645,7 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > > nfsd_check_ignore_resizing(iap);
> > > break;
> > > case S_IFDIR:
> > > - dchild = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode);
> > > + dchild = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode, NULL);
> > > if (IS_ERR(dchild)) {
> > > host_err = PTR_ERR(dchild);
> > > } else if (d_is_negative(dchild)) {
> > > diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> > > index c8fd5951fc5ece1ae6b3e2a0801ca15f9faf7d72..0f65f9a5d54d4786b39e4f4f30f416d5b9016e70 100644
> > > --- a/fs/overlayfs/overlayfs.h
> > > +++ b/fs/overlayfs/overlayfs.h
> > > @@ -248,7 +248,7 @@ static inline struct dentry *ovl_do_mkdir(struct ovl_fs *ofs,
> > > {
> > > struct dentry *ret;
> > >
> > > - ret = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode);
> > > + ret = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode, NULL);
> > > pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, PTR_ERR_OR_ZERO(ret));
> > > return ret;
> > > }
> > > diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c
> > > index 891ed2dc2b7351a5cb14a2241d71095ffdd03f08..3d2190f26623b23ea79c63410905a3c3ad684048 100644
> > > --- a/fs/smb/server/vfs.c
> > > +++ b/fs/smb/server/vfs.c
> > > @@ -230,7 +230,7 @@ int ksmbd_vfs_mkdir(struct ksmbd_work *work, const char *name, umode_t mode)
> > > idmap = mnt_idmap(path.mnt);
> > > mode |= S_IFDIR;
> > > d = dentry;
> > > - dentry = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode);
> > > + dentry = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode, NULL);
> > > if (IS_ERR(dentry))
> > > err = PTR_ERR(dentry);
> > > else if (d_is_negative(dentry))
> > > diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c
> > > index 9c12cb8442311ca26b169e4d1567939ae44a5be0..91c9d07b97f306f57aebb9b69ba564b0c2cb8c17 100644
> > > --- a/fs/xfs/scrub/orphanage.c
> > > +++ b/fs/xfs/scrub/orphanage.c
> > > @@ -167,7 +167,7 @@ xrep_orphanage_create(
> > > */
> > > if (d_really_is_negative(orphanage_dentry)) {
> > > orphanage_dentry = vfs_mkdir(&nop_mnt_idmap, root_inode,
> > > - orphanage_dentry, 0750);
> > > + orphanage_dentry, 0750, NULL);
> > > error = PTR_ERR(orphanage_dentry);
> > > if (IS_ERR(orphanage_dentry))
> > > goto out_unlock_root;
> > > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > > index c895146c1444be36e0a779df55622cc38c9419ff..1040df3792794cd353b86558b41618294e25b8a6 100644
> > > --- a/include/linux/fs.h
> > > +++ b/include/linux/fs.h
> > > @@ -2113,7 +2113,7 @@ bool inode_owner_or_capable(struct mnt_idmap *idmap,
> > > int vfs_create(struct mnt_idmap *, struct inode *,
> > > struct dentry *, umode_t, bool);
> > > struct dentry *vfs_mkdir(struct mnt_idmap *, struct inode *,
> > > - struct dentry *, umode_t);
> > > + struct dentry *, umode_t, struct inode **);
> > > int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *,
> > > umode_t, dev_t);
> > > int vfs_symlink(struct mnt_idmap *, struct inode *,
> > >
> > > --
> > > 2.51.0
> > >
>
> --
> Jeff Layton <jlayton@kernel.org>
Thread overview: 26+ messages
2025-10-21 15:25 [PATCH v3 00/13] vfs: recall-only directory delegations for knfsd Jeff Layton
2025-10-21 15:25 ` [PATCH v3 01/13] filelock: push the S_ISREG check down to ->setlease handlers Jeff Layton
2025-10-22 8:58 ` Jan Kara
2025-10-21 15:25 ` [PATCH v3 02/13] vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink} Jeff Layton
2025-10-21 15:25 ` [PATCH v3 03/13] vfs: allow mkdir to wait for delegation break on parent Jeff Layton
2025-10-29 13:04 ` Christian Brauner
2025-10-29 13:37 ` Jeff Layton
2025-10-31 12:23 ` Christian Brauner
2025-10-21 15:25 ` [PATCH v3 04/13] vfs: allow rmdir " Jeff Layton
2025-10-21 15:25 ` [PATCH v3 05/13] vfs: break parent dir delegations in open(..., O_CREAT) codepath Jeff Layton
2025-10-21 15:25 ` [PATCH v3 06/13] vfs: make vfs_create break delegations on parent directory Jeff Layton
2025-10-29 13:23 ` Christian Brauner
2025-10-29 13:38 ` Jeff Layton
2025-10-21 15:25 ` [PATCH v3 07/13] vfs: make vfs_mknod " Jeff Layton
2025-10-21 15:25 ` [PATCH v3 08/13] vfs: make vfs_symlink break delegations on parent dir Jeff Layton
2025-10-22 9:01 ` Jan Kara
2025-10-21 15:25 ` [PATCH v3 09/13] filelock: lift the ban on directory leases in generic_setlease Jeff Layton
2025-10-22 9:03 ` Jan Kara
2025-10-21 15:25 ` [PATCH v3 10/13] nfsd: allow filecache to hold S_IFDIR files Jeff Layton
2025-10-21 15:25 ` [PATCH v3 11/13] nfsd: allow DELEGRETURN on directories Jeff Layton
2025-10-21 15:25 ` [PATCH v3 12/13] nfsd: wire up GET_DIR_DELEGATION handling Jeff Layton
2025-10-21 16:16 ` Chuck Lever
2025-10-21 15:25 ` [PATCH v3 13/13] vfs: expose delegation support to userland Jeff Layton
2025-10-29 12:55 ` [PATCH v3 00/13] vfs: recall-only directory delegations for knfsd Christian Brauner
2025-10-29 13:38 ` Christian Brauner
2025-10-29 13:39 ` Jeff Layton