[PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations

* [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations
@ 2025-06-02 14:01 Jeff Layton
  2025-06-02 14:01 ` [PATCH RFC v2 01/28] filelock: push the S_ISREG check down to ->setlease handlers Jeff Layton
                   ` (27 more replies)
  0 siblings, 28 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:01 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

This patchset is an update to a patchset that I posted just over a year
ago [1]. That version had client and server patches. This one is just
the server-side patches.

NFSv4.1 adds a GET_DIR_DELEGATION operation, to allow clients
to request a delegation on a directory. If the client holds a directory
delegation, then it knows that nothing will change the dentries in it
until it has been recalled.

In 2023, Rick Macklem gave a talk at the NFS Bakeathon on his
implementation of directory delegations for FreeBSD [2], and showed that
it can greatly improve LOOKUP-heavy workloads. There is also some
earlier work by CITI [3] that showed similar results. The SMB protocol
also has a similar sort of construct, and they have also seen large
performance improvements on certain workloads.

This version also starts with support for trivial directory delegations.
From there it adds VFS support for ignoring certain break_lease() events
on directories. The server can then request leases that ignore
certain events (like a create or delete) and set its fsnotify mask to
receive a callback after that event occurs. That allows it to avoid
breaking the lease.
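
As a rough userspace illustration of the ignore-mask idea (this is not
code from the series; the toy_ names and the event bits are made up for
the example), a lease can carry a mask of directory events it tolerates,
and a break is only needed when an event falls outside that mask:

```c
#include <stdbool.h>

/* Hypothetical event bits; the real flags in the series may differ. */
enum toy_dir_event {
	TOY_EVENT_CREATE = 1u << 0,
	TOY_EVENT_DELETE = 1u << 1,
	TOY_EVENT_RENAME = 1u << 2,
};

struct toy_lease {
	unsigned int ignore_mask;	/* events that must not break the lease */
};

/* Return true if @event has to break @lease, false if it can be ignored. */
static bool toy_event_breaks_lease(const struct toy_lease *lease,
				   unsigned int event)
{
	return (event & ~lease->ignore_mask) != 0;
}
```

In this model, a lease that ignores create and delete would still be
broken by a rename.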

When a fsnotify callback comes in, the server will encode the
information directly as XDR in a buffer attached to the delegation. The
CB_NOTIFY callback is then queued, which will scoop up that buffer and
allocate another to start gathering more events.  If it runs out of
space to spool events, it will give up and trigger a recall of the
delegation.
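
The spooling behaviour described above can be sketched in userspace
roughly as follows. This is only a model under stated assumptions: the
toy_ names, the fixed 64-byte spool and the copy-and-reset in place of
allocating a fresh buffer are all invented for illustration and are not
how nfsd implements it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_SPOOL_SIZE 64	/* arbitrary size for the illustration */

struct toy_deleg {
	unsigned char spool[TOY_SPOOL_SIZE];	/* pre-encoded event data */
	size_t used;				/* bytes currently spooled */
	bool recalled;				/* set once we give up */
};

/*
 * Append one already-encoded event. Returns true if it was spooled,
 * false if there was no room and the delegation was marked for recall.
 */
static bool toy_spool_event(struct toy_deleg *d, const void *ev, size_t len)
{
	if (d->recalled)
		return false;
	if (len > TOY_SPOOL_SIZE - d->used) {
		d->recalled = true;	/* out of space: fall back to a recall */
		return false;
	}
	memcpy(d->spool + d->used, ev, len);
	d->used += len;
	return true;
}

/*
 * Model of the callback scooping up the buffer: copy out what has been
 * gathered and reset the spool so that new events can accumulate.
 */
static size_t toy_take_spool(struct toy_deleg *d, unsigned char *out)
{
	size_t n = d->used;

	memcpy(out, d->spool, n);
	d->used = 0;
	return n;
}
```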

This is still a work in progress, however:

The main thing missing at this point is support for sending attributes
in the CB_NOTIFY, particularly on ADD events. The right set of fattrs
would allow the client to instantiate a dentry and inode without having
to contact the server.

Still, it's getting close to the point where the server side is somewhat
functional so it's a good time to post what I have so far.

Anna has graciously agreed to work on the client-side pieces. I do have
some patches, but that piece is still pretty rough:

    https://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux.git/log/?h=dir-deleg-clnt

In a nutshell, the client-side GDD4 support is still simplistic, and
there is no support for CB_NOTIFY yet.

I also have an MR up for Wireshark [4], and I have patches for some basic
pynfs tests that I've been using to drive the server (to be posted
soon).

At this point I'm mainly interested in feedback on the VFS bits,
particularly the delegated_inode changes. Also, I should make special
mention of atomic_open since Al pointed it out in the last set:

I think we can't reasonably support dir delegations on filesystems that
support atomic_open. When we do a create on those filesystems, we don't
know whether the file exists or not, so we can't know whether we need to
break a dir delegation.

It would be nice to have a compile-time check for that, but I'm not sure
how we could reasonably do it. For now, I've settled for disabling
directory leases in FUSE, NFS and CIFS, which should work around the
potential problem.

[1]: https://lore.kernel.org/linux-nfs/20240315-dir-deleg-v1-0-a1d6209a3654@kernel.org/
[2]: https://www.youtube.com/watch?v=DdFyH3BN5pI
[3]: https://linux-nfs.org/wiki/index.php/CITI_Experience_with_Directory_Delegations
[4]: https://gitlab.com/wireshark/wireshark/-/merge_requests/20048

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
Changes in v2:
- add support for ignoring certain break_lease() events
- basic support for CB_NOTIFY
- Link to v1: https://lore.kernel.org/r/20240315-dir-deleg-v1-0-a1d6209a3654@kernel.org

---
Jeff Layton (28):
      filelock: push the S_ISREG check down to ->setlease handlers
      filelock: add a lm_may_setlease lease_manager callback
      vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink}
      vfs: allow mkdir to wait for delegation break on parent
      vfs: allow rmdir to wait for delegation break on parent
      vfs: break parent dir delegations in open(..., O_CREAT) codepath
      vfs: make vfs_create break delegations on parent directory
      vfs: make vfs_mknod break delegations on parent directory
      filelock: lift the ban on directory leases in generic_setlease
      nfsd: allow filecache to hold S_IFDIR files
      nfsd: allow DELEGRETURN on directories
      nfsd: check for delegation conflicts vs. the same client
      nfsd: wire up GET_DIR_DELEGATION handling
      filelock: rework the __break_lease API to use flags
      filelock: add struct delegated_inode
      filelock: add support for ignoring deleg breaks for dir change events
      filelock: add an inode_lease_ignore_mask helper
      nfsd: add protocol support for CB_NOTIFY
      nfsd: add callback encoding and decoding linkages for CB_NOTIFY
      nfsd: add data structures for handling CB_NOTIFY to directory delegation
      fsnotify: export fsnotify_recalc_mask()
      nfsd: update the fsnotify mark when setting or removing a dir delegation
      nfsd: make nfsd4_callback_ops->prepare operation bool return
      nfsd: add notification handlers for dir events
      nfsd: allow nfsd to get a dir lease with an ignore mask
      nfsd: add a tracepoint for nfsd_file_fsnotify_handle_dir_event()
      nfsd: add support for NOTIFY4_ADD_ENTRY events
      nfsd: add support for NOTIFY4_RENAME_ENTRY events

 Documentation/sunrpc/xdr/nfs4_1.x    | 252 ++++++++++++++++-
 fs/attr.c                            |   4 +-
 fs/fuse/dir.c                        |   1 +
 fs/locks.c                           | 120 ++++++--
 fs/namei.c                           | 296 ++++++++++++-------
 fs/nfs/nfs4file.c                    |   2 +
 fs/nfsd/filecache.c                  | 103 +++++--
 fs/nfsd/filecache.h                  |   2 +
 fs/nfsd/nfs4callback.c               |  60 +++-
 fs/nfsd/nfs4layouts.c                |   3 +-
 fs/nfsd/nfs4proc.c                   |  24 +-
 fs/nfsd/nfs4state.c                  | 535 +++++++++++++++++++++++++++++++++--
 fs/nfsd/nfs4xdr_gen.c                | 506 ++++++++++++++++++++++++++++++++-
 fs/nfsd/nfs4xdr_gen.h                |  17 +-
 fs/nfsd/state.h                      |  47 ++-
 fs/nfsd/trace.h                      |  26 +-
 fs/nfsd/vfs.c                        |   5 +-
 fs/nfsd/vfs.h                        |   2 +-
 fs/nfsd/xdr4cb.h                     |  11 +
 fs/notify/mark.c                     |   1 +
 fs/open.c                            |   8 +-
 fs/posix_acl.c                       |  12 +-
 fs/smb/client/cifsfs.c               |   3 +
 fs/utimes.c                          |   4 +-
 fs/xattr.c                           |  16 +-
 include/linux/filelock.h             | 143 +++++++---
 include/linux/fs.h                   |   9 +-
 include/linux/nfs4.h                 | 127 ---------
 include/linux/sunrpc/xdrgen/nfs4_1.h | 293 ++++++++++++++++++-
 include/linux/xattr.h                |   4 +-
 include/uapi/linux/nfs4.h            |   2 -
 31 files changed, 2249 insertions(+), 389 deletions(-)
---
base-commit: 22b71eb34051a70c39c86997657de92722ec1838
change-id: 20240215-dir-deleg-e212210ba9d4

Best regards,
-- 
Jeff Layton <jlayton@kernel.org>


* [PATCH RFC v2 01/28] filelock: push the S_ISREG check down to ->setlease handlers
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
@ 2025-06-02 14:01 ` Jeff Layton
  2025-06-02 14:01 ` [PATCH RFC v2 02/28] filelock: add a lm_may_setlease lease_manager callback Jeff Layton
                   ` (26 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:01 UTC (permalink / raw)

When nfsd starts requesting directory delegations, setlease handlers may
see requests for leases on directories. Push the !S_ISREG check down
into the non-trivial setlease handlers, so we can selectively enable
them where they're supported.

FUSE is special: it's the only filesystem that supports atomic_open and
allows kernel-internal leases. Ensure that we don't allow directory
leases by default going forward by explicitly disabling them there.
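
A simplified userspace model of this restructuring, assuming made-up
toy_ names and a two-value file type rather than real inode modes, might
look like the following. The common entry point stops checking the file
type, and each handler decides for itself what it accepts:

```c
#include <errno.h>

/* Hypothetical file types for the illustration. */
enum toy_ftype { TOY_REG, TOY_DIR };

/* Hypothetical handler type: returns 0 or a negative errno. */
typedef int (*toy_setlease_fn)(enum toy_ftype type);

/* A handler that only supports regular files rejects everything else. */
static int toy_regular_only_setlease(enum toy_ftype type)
{
	if (type != TOY_REG)
		return -EINVAL;
	return 0;
}

/* A handler that has opted in to directory leases. */
static int toy_dir_capable_setlease(enum toy_ftype type)
{
	if (type != TOY_REG && type != TOY_DIR)
		return -EINVAL;
	return 0;
}

/* The common entry point no longer checks the file type itself. */
static int toy_vfs_setlease(toy_setlease_fn handler, enum toy_ftype type)
{
	return handler(type);
}
```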

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/fuse/dir.c          | 1 +
 fs/locks.c             | 5 +++--
 fs/nfs/nfs4file.c      | 2 ++
 fs/smb/client/cifsfs.c | 3 +++
 4 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 33b82529cb6e4bffa607e1b20bd09ac489b0667f..c83e61b52e0fff106ef2a3d62efcc3949ccf39e7 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -2218,6 +2218,7 @@ static const struct file_operations fuse_dir_operations = {
 	.fsync		= fuse_dir_fsync,
 	.unlocked_ioctl	= fuse_dir_ioctl,
 	.compat_ioctl	= fuse_dir_compat_ioctl,
+	.setlease	= simple_nosetlease,
 };
 
 static const struct inode_operations fuse_common_inode_operations = {
diff --git a/fs/locks.c b/fs/locks.c
index 1619cddfa7a4d799f0f84f0bc8f28458d8d280db..a35d033dcaf0b604b73395260562af08f7711c12 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1929,6 +1929,9 @@ static int generic_delete_lease(struct file *filp, void *owner)
 int generic_setlease(struct file *filp, int arg, struct file_lease **flp,
 			void **priv)
 {
+	if (!S_ISREG(file_inode(filp)->i_mode))
+		return -EINVAL;
+
 	switch (arg) {
 	case F_UNLCK:
 		return generic_delete_lease(filp, *priv);
@@ -2018,8 +2021,6 @@ vfs_setlease(struct file *filp, int arg, struct file_lease **lease, void **priv)
 
 	if ((!vfsuid_eq_kuid(vfsuid, current_fsuid())) && !capable(CAP_LEASE))
 		return -EACCES;
-	if (!S_ISREG(inode->i_mode))
-		return -EINVAL;
 	error = security_file_lock(filp, arg);
 	if (error)
 		return error;
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index 1cd9652f3c280358209f22503ea573a906a6194e..b7630a437ad22fbfb658086953b25b9ae6b4f057 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -442,6 +442,8 @@ void nfs42_ssc_unregister_ops(void)
 static int nfs4_setlease(struct file *file, int arg, struct file_lease **lease,
 			 void **priv)
 {
+	if (!S_ISREG(file_inode(file)->i_mode))
+		return -EINVAL;
 	return nfs4_proc_setlease(file, arg, lease, priv);
 }
 
diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c
index fb04e263611cadaea210f5c1d90c05bea37fb496..991583b0b77e480c002974b410f06853740e4b1b 100644
--- a/fs/smb/client/cifsfs.c
+++ b/fs/smb/client/cifsfs.c
@@ -1094,6 +1094,9 @@ cifs_setlease(struct file *file, int arg, struct file_lease **lease, void **priv
 	struct inode *inode = file_inode(file);
 	struct cifsFileInfo *cfile = file->private_data;
 
+	if (!S_ISREG(inode->i_mode))
+		return -EINVAL;
+
 	/* Check if file is oplocked if this is request for new lease */
 	if (arg == F_UNLCK ||
 	    ((arg == F_RDLCK) && CIFS_CACHE_READ(CIFS_I(inode))) ||

-- 
2.49.0



* [PATCH RFC v2 02/28] filelock: add a lm_may_setlease lease_manager callback
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
  2025-06-02 14:01 ` [PATCH RFC v2 01/28] filelock: push the S_ISREG check down to ->setlease handlers Jeff Layton
@ 2025-06-02 14:01 ` Jeff Layton
  2025-06-02 14:01 ` [PATCH RFC v2 03/28] vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink} Jeff Layton
                   ` (25 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:01 UTC (permalink / raw)

The NFSv4.1 protocol adds support for directory delegations, but it
specifies that if you already have a delegation and try to request a new
one on the same filehandle, the server must reply that the delegation is
unavailable.

Add a new lease manager callback to allow the lease manager (nfsd in
this case) to impose this extra check when performing a setlease.
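
The veto pattern can be modelled in userspace along these lines. This is
a sketch only: the toy_ types, the integer owner field and the "same
owner" policy are assumptions chosen for the example, not the nfsd
implementation.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_lease;

struct toy_lease_ops {
	/* Optional: return false to veto @candidate because of @extant. */
	bool (*may_setlease)(const struct toy_lease *candidate,
			     const struct toy_lease *extant);
};

struct toy_lease {
	const struct toy_lease_ops *ops;
	int owner;			/* stand-in for the client identity */
};

/* Example policy: refuse a second lease for the same owner. */
static bool toy_no_duplicate_owner(const struct toy_lease *candidate,
				   const struct toy_lease *extant)
{
	return candidate->owner != extant->owner;
}

/* Walk the existing leases and let the manager veto the new one. */
static bool toy_can_add_lease(const struct toy_lease *candidate,
			      const struct toy_lease *existing, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (candidate->ops->may_setlease &&
		    !candidate->ops->may_setlease(candidate, &existing[i]))
			return false;
	}
	return true;
}
```

A manager that leaves the callback unset is never consulted, so existing
lease managers keep their current behaviour in this model.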

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/locks.c               |  5 +++++
 include/linux/filelock.h | 14 ++++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/fs/locks.c b/fs/locks.c
index a35d033dcaf0b604b73395260562af08f7711c12..1985f38d326d938f58009e0880b45e588af6a422 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1826,6 +1826,11 @@ generic_add_lease(struct file *filp, int arg, struct file_lease **flp, void **pr
 			continue;
 		}
 
+		/* Allow the lease manager to veto the setlease */
+		if (lease->fl_lmops->lm_may_setlease &&
+		    !lease->fl_lmops->lm_may_setlease(lease, fl))
+			goto out;
+
 		/*
 		 * No exclusive leases if someone else has a lease on
 		 * this file:
diff --git a/include/linux/filelock.h b/include/linux/filelock.h
index c412ded9171ed781ebe9e8d2e0426dcd10793292..60c76c8fb4dfdcaaa2cfa3f41f0f26ffcb3db29f 100644
--- a/include/linux/filelock.h
+++ b/include/linux/filelock.h
@@ -49,6 +49,20 @@ struct lease_manager_operations {
 	int (*lm_change)(struct file_lease *, int, struct list_head *);
 	void (*lm_setup)(struct file_lease *, void **);
 	bool (*lm_breaker_owns_lease)(struct file_lease *);
+
+	/**
+	 * lm_may_setlease - extra conditions for setlease
+	 * @new: new file_lease being set
+	 * @old: old (extant) file_lease
+	 *
+	 * This allows the lease manager to add extra conditions when
+	 * setting a lease, based on the presence of an existing lease.
+	 *
+	 * Return values:
+	 *   %false: @new and @old conflict
+	 *   %true: No conflict detected
+	 */
+	bool (*lm_may_setlease)(struct file_lease *new, struct file_lease *old);
 };
 
 struct lock_manager {

-- 
2.49.0



* [PATCH RFC v2 03/28] vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink}
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
  2025-06-02 14:01 ` [PATCH RFC v2 01/28] filelock: push the S_ISREG check down to ->setlease handlers Jeff Layton
  2025-06-02 14:01 ` [PATCH RFC v2 02/28] filelock: add a lm_may_setlease lease_manager callback Jeff Layton
@ 2025-06-02 14:01 ` Jeff Layton
  2025-06-02 14:01 ` [PATCH RFC v2 04/28] vfs: allow mkdir to wait for delegation break on parent Jeff Layton
                   ` (24 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:01 UTC (permalink / raw)

In order to add directory delegation support, we need to break
delegations on the parent whenever there is going to be a change in the
directory.

vfs_link, vfs_unlink, and vfs_rename all have existing delegation break
handling for the children involved in the operation. Add the necessary
calls for breaking delegations on the parent(s) as well.
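
The ordering (parent first, then child) can be illustrated with a small
userspace model. The toy_ names below are hypothetical and the helper
only mimics the general shape of a non-blocking break attempt; it is not
the kernel's try_break_deleg():

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_inode {
	bool delegated;		/* true while a delegation is outstanding */
};

/*
 * Non-blocking break attempt. If a delegation is outstanding, report
 * -EWOULDBLOCK and, when the caller asked for it, record which inode
 * has to be waited on.
 */
static int toy_try_break_deleg(struct toy_inode *inode,
			       struct toy_inode **delegated_inode)
{
	if (!inode->delegated)
		return 0;
	if (delegated_inode)
		*delegated_inode = inode;
	return -EWOULDBLOCK;
}

/* Unlink-like operation: check the parent first, then the target. */
static int toy_unlink(struct toy_inode *dir, struct toy_inode *target,
		      struct toy_inode **delegated_inode)
{
	int error;

	error = toy_try_break_deleg(dir, delegated_inode);
	if (error)
		return error;
	error = toy_try_break_deleg(target, delegated_inode);
	if (error)
		return error;
	return 0;	/* the filesystem's unlink method would run here */
}
```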

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/namei.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/namei.c b/fs/namei.c
index 4bb889fc980b7d44914e11ec38ae3e8fdfbafadd..0fea12860036162c01a291558e068fde9c986142 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4580,6 +4580,9 @@ int vfs_unlink(struct mnt_idmap *idmap, struct inode *dir,
 	else {
 		error = security_inode_unlink(dir, dentry);
 		if (!error) {
+			error = try_break_deleg(dir, delegated_inode);
+			if (error)
+				goto out;
 			error = try_break_deleg(target, delegated_inode);
 			if (error)
 				goto out;
@@ -4849,7 +4852,9 @@ int vfs_link(struct dentry *old_dentry, struct mnt_idmap *idmap,
 	else if (max_links && inode->i_nlink >= max_links)
 		error = -EMLINK;
 	else {
-		error = try_break_deleg(inode, delegated_inode);
+		error = try_break_deleg(dir, delegated_inode);
+		if (!error)
+			error = try_break_deleg(inode, delegated_inode);
 		if (!error)
 			error = dir->i_op->link(old_dentry, dir, new_dentry);
 	}
@@ -5115,6 +5120,14 @@ int vfs_rename(struct renamedata *rd)
 		    old_dir->i_nlink >= max_links)
 			goto out;
 	}
+	error = try_break_deleg(old_dir, delegated_inode);
+	if (error)
+		goto out;
+	if (new_dir != old_dir) {
+		error = try_break_deleg(new_dir, delegated_inode);
+		if (error)
+			goto out;
+	}
 	if (!is_dir) {
 		error = try_break_deleg(source, delegated_inode);
 		if (error)

-- 
2.49.0



* [PATCH RFC v2 04/28] vfs: allow mkdir to wait for delegation break on parent
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (2 preceding siblings ...)
  2025-06-02 14:01 ` [PATCH RFC v2 03/28] vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink} Jeff Layton
@ 2025-06-02 14:01 ` Jeff Layton
  2025-06-05 11:19   ` Jan Kara
  2025-06-02 14:01 ` [PATCH RFC v2 05/28] vfs: allow rmdir " Jeff Layton
                   ` (23 subsequent siblings)
  27 siblings, 1 reply; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:01 UTC (permalink / raw)

In order to add directory delegation support, we need to break
delegations on the parent whenever there is going to be a change in the
directory.

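Rename the existing vfs_mkdir to __vfs_mkdir, make it static and add a
new delegated_inode parameter. Add a new exported vfs_mkdir wrapper
around it that passes a NULL pointer for delegated_inode. Convert
do_mkdirat to use __vfs_mkdir and wait for the delegation break to
complete before retrying.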
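
The wrapper pattern can be sketched in userspace as below. The toy_
names and the nr_children counter are invented for the example. The
point is only that the public function keeps its old signature while the
internal variant can report which inode the caller should wait on:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_inode {
	bool delegated;		/* true while a delegation is outstanding */
	int nr_children;	/* stand-in for the directory contents */
};

/* Internal variant: reports the inode to wait on via @delegated_inode. */
static int toy_mkdir_internal(struct toy_inode *dir,
			      struct toy_inode **delegated_inode)
{
	if (dir->delegated) {
		if (delegated_inode)
			*delegated_inode = dir;
		return -EWOULDBLOCK;
	}
	dir->nr_children++;
	return 0;
}

/* Public wrapper: keeps the old signature and passes NULL. */
static int toy_mkdir(struct toy_inode *dir)
{
	return toy_mkdir_internal(dir, NULL);
}
```

In this model, a caller that goes through the wrapper just sees the
error, while a caller of the internal variant learns which inode to wait
on before retrying.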

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/namei.c | 67 +++++++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 42 insertions(+), 25 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 0fea12860036162c01a291558e068fde9c986142..7c9e237ed1b1a535934ffe5e523424bb035e7ae0 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4318,29 +4318,9 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d
 	return do_mknodat(AT_FDCWD, getname(filename), mode, dev);
 }
 
-/**
- * vfs_mkdir - create directory returning correct dentry if possible
- * @idmap:	idmap of the mount the inode was found from
- * @dir:	inode of the parent directory
- * @dentry:	dentry of the child directory
- * @mode:	mode of the child directory
- *
- * Create a directory.
- *
- * If the inode has been found through an idmapped mount the idmap of
- * the vfsmount must be passed through @idmap. This function will then take
- * care to map the inode according to @idmap before checking permissions.
- * On non-idmapped mounts or if permission checking is to be performed on the
- * raw inode simply pass @nop_mnt_idmap.
- *
- * In the event that the filesystem does not use the *@dentry but leaves it
- * negative or unhashes it and possibly splices a different one returning it,
- * the original dentry is dput() and the alternate is returned.
- *
- * In case of an error the dentry is dput() and an ERR_PTR() is returned.
- */
-struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
-			 struct dentry *dentry, umode_t mode)
+static struct dentry *__vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
+				  struct dentry *dentry, umode_t mode,
+				  struct inode **delegated_inode)
 {
 	int error;
 	unsigned max_links = dir->i_sb->s_max_links;
@@ -4363,6 +4343,10 @@ struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 	if (max_links && dir->i_nlink >= max_links)
 		goto err;
 
+	error = try_break_deleg(dir, delegated_inode);
+	if (error)
+		goto err;
+
 	de = dir->i_op->mkdir(idmap, dir, dentry, mode);
 	error = PTR_ERR(de);
 	if (IS_ERR(de))
@@ -4378,6 +4362,33 @@ struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 	dput(dentry);
 	return ERR_PTR(error);
 }
+
+/**
+ * vfs_mkdir - create directory returning correct dentry if possible
+ * @idmap:	idmap of the mount the inode was found from
+ * @dir:	inode of the parent directory
+ * @dentry:	dentry of the child directory
+ * @mode:	mode of the child directory
+ *
+ * Create a directory.
+ *
+ * If the inode has been found through an idmapped mount the idmap of
+ * the vfsmount must be passed through @idmap. This function will then take
+ * care to map the inode according to @idmap before checking permissions.
+ * On non-idmapped mounts or if permission checking is to be performed on the
+ * raw inode simply pass @nop_mnt_idmap.
+ *
+ * In the event that the filesystem does not use the *@dentry but leaves it
+ * negative or unhashes it and possibly splices a different one returning it,
+ * the original dentry is dput() and the alternate is returned.
+ *
+ * In case of an error the dentry is dput() and an ERR_PTR() is returned.
+ */
+struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
+			 struct dentry *dentry, umode_t mode)
+{
+	return __vfs_mkdir(idmap, dir, dentry, mode, NULL);
+}
 EXPORT_SYMBOL(vfs_mkdir);
 
 int do_mkdirat(int dfd, struct filename *name, umode_t mode)
@@ -4386,6 +4397,7 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode)
 	struct path path;
 	int error;
 	unsigned int lookup_flags = LOOKUP_DIRECTORY;
+	struct inode *delegated_inode = NULL;
 
 retry:
 	dentry = filename_create(dfd, name, &path, lookup_flags);
@@ -4396,12 +4408,17 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode)
 	error = security_path_mkdir(&path, dentry,
 			mode_strip_umask(path.dentry->d_inode, mode));
 	if (!error) {
-		dentry = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
-				  dentry, mode);
+		dentry = __vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
+				     dentry, mode, &delegated_inode);
 		if (IS_ERR(dentry))
 			error = PTR_ERR(dentry);
 	}
 	done_path_create(&path, dentry);
+	if (delegated_inode) {
+		error = break_deleg_wait(&delegated_inode);
+		if (!error)
+			goto retry;
+	}
 	if (retry_estale(error, lookup_flags)) {
 		lookup_flags |= LOOKUP_REVAL;
 		goto retry;

-- 
2.49.0



* [PATCH RFC v2 05/28] vfs: allow rmdir to wait for delegation break on parent
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (3 preceding siblings ...)
  2025-06-02 14:01 ` [PATCH RFC v2 04/28] vfs: allow mkdir to wait for delegation break on parent Jeff Layton
@ 2025-06-02 14:01 ` Jeff Layton
  2025-06-02 14:01 ` [PATCH RFC v2 06/28] vfs: break parent dir delegations in open(..., O_CREAT) codepath Jeff Layton
                   ` (22 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:01 UTC (permalink / raw)

In order to add directory delegation support, we need to break
delegations on the parent whenever there is going to be a change in the
directory.

Rename vfs_rmdir as __vfs_rmdir, make it static and add a new
delegated_inode parameter. Add a vfs_rmdir wrapper that passes in a NULL
pointer for it. Add the necessary try_break_deleg calls to
__vfs_rmdir(). Convert do_rmdir to use __vfs_rmdir and wait for the
delegation break to complete before proceeding.
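
The wait-and-retry flow can be modelled in userspace roughly like this.
It is a sketch with hypothetical toy_ names. In particular, the "wait"
here completes immediately by clearing a flag, whereas the real helper
blocks until the delegation has been returned:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_inode {
	bool delegated;		/* true while a delegation is outstanding */
	int nr_children;	/* stand-in for the directory contents */
};

/* Internal rmdir: fails without blocking if the parent is delegated. */
static int toy_rmdir_internal(struct toy_inode *dir,
			      struct toy_inode **delegated_inode)
{
	if (dir->delegated) {
		if (delegated_inode)
			*delegated_inode = dir;
		return -EWOULDBLOCK;
	}
	dir->nr_children--;
	return 0;
}

/* Model of the blocking wait: the recall completes immediately here. */
static int toy_break_deleg_wait(struct toy_inode **delegated_inode)
{
	(*delegated_inode)->delegated = false;
	*delegated_inode = NULL;
	return 0;
}

/* Syscall-level caller: attempt, wait for the break if needed, retry. */
static int toy_do_rmdir(struct toy_inode *dir)
{
	struct toy_inode *delegated_inode = NULL;
	int error;

retry:
	error = toy_rmdir_internal(dir, &delegated_inode);
	if (delegated_inode) {
		error = toy_break_deleg_wait(&delegated_inode);
		if (!error)
			goto retry;
	}
	return error;
}
```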

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/namei.c | 51 ++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 34 insertions(+), 17 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 7c9e237ed1b1a535934ffe5e523424bb035e7ae0..2211ed9f427cc97391d068b1a33ce388266a3e02 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4438,22 +4438,8 @@ SYSCALL_DEFINE2(mkdir, const char __user *, pathname, umode_t, mode)
 	return do_mkdirat(AT_FDCWD, getname(pathname), mode);
 }
 
-/**
- * vfs_rmdir - remove directory
- * @idmap:	idmap of the mount the inode was found from
- * @dir:	inode of the parent directory
- * @dentry:	dentry of the child directory
- *
- * Remove a directory.
- *
- * If the inode has been found through an idmapped mount the idmap of
- * the vfsmount must be passed through @idmap. This function will then take
- * care to map the inode according to @idmap before checking permissions.
- * On non-idmapped mounts or if permission checking is to be performed on the
- * raw inode simply pass @nop_mnt_idmap.
- */
-int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir,
-		     struct dentry *dentry)
+static int __vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir,
+		       struct dentry *dentry, struct inode **delegated_inode)
 {
 	int error = may_delete(idmap, dir, dentry, 1);
 
@@ -4475,6 +4461,10 @@ int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir,
 	if (error)
 		goto out;
 
+	error = try_break_deleg(dir, delegated_inode);
+	if (error)
+		goto out;
+
 	error = dir->i_op->rmdir(dir, dentry);
 	if (error)
 		goto out;
@@ -4491,6 +4481,26 @@ int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir,
 		d_delete_notify(dir, dentry);
 	return error;
 }
+
+/**
+ * vfs_rmdir - remove directory
+ * @idmap:	idmap of the mount the inode was found from
+ * @dir:	inode of the parent directory
+ * @dentry:	dentry of the child directory
+ *
+ * Remove a directory.
+ *
+ * If the inode has been found through an idmapped mount the idmap of
+ * the vfsmount must be passed through @idmap. This function will then take
+ * care to map the inode according to @idmap before checking permissions.
+ * On non-idmapped mounts or if permission checking is to be performed on the
+ * raw inode simply pass @nop_mnt_idmap.
+ */
+int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir,
+		     struct dentry *dentry)
+{
+	return __vfs_rmdir(idmap, dir, dentry, NULL);
+}
 EXPORT_SYMBOL(vfs_rmdir);
 
 int do_rmdir(int dfd, struct filename *name)
@@ -4501,6 +4511,7 @@ int do_rmdir(int dfd, struct filename *name)
 	struct qstr last;
 	int type;
 	unsigned int lookup_flags = 0;
+	struct inode *delegated_inode = NULL;
 retry:
 	error = filename_parentat(dfd, name, lookup_flags, &path, &last, &type);
 	if (error)
@@ -4530,7 +4541,8 @@ int do_rmdir(int dfd, struct filename *name)
 	error = security_path_rmdir(&path, dentry);
 	if (error)
 		goto exit4;
-	error = vfs_rmdir(mnt_idmap(path.mnt), path.dentry->d_inode, dentry);
+	error = __vfs_rmdir(mnt_idmap(path.mnt), path.dentry->d_inode,
+			    dentry, &delegated_inode);
 exit4:
 	dput(dentry);
 exit3:
@@ -4538,6 +4550,11 @@ int do_rmdir(int dfd, struct filename *name)
 	mnt_drop_write(path.mnt);
 exit2:
 	path_put(&path);
+	if (delegated_inode) {
+		error = break_deleg_wait(&delegated_inode);
+		if (!error)
+			goto retry;
+	}
 	if (retry_estale(error, lookup_flags)) {
 		lookup_flags |= LOOKUP_REVAL;
 		goto retry;

-- 
2.49.0



* [PATCH RFC v2 06/28] vfs: break parent dir delegations in open(..., O_CREAT) codepath
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (4 preceding siblings ...)
  2025-06-02 14:01 ` [PATCH RFC v2 05/28] vfs: allow rmdir " Jeff Layton
@ 2025-06-02 14:01 ` Jeff Layton
  2025-06-02 14:01 ` [PATCH RFC v2 07/28] vfs: make vfs_create break delegations on parent directory Jeff Layton
                   ` (21 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:01 UTC (permalink / raw)

In order to add directory delegation support, we need to break
delegations on the parent whenever there is going to be a change in the
directory.

Add a delegated_inode parameter to lookup_open and have it break the
delegation. Then, open_last_lookups can wait for the delegation break
and retry the call to lookup_open once it's done.
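
A minimal userspace model of the create-only break, using made-up toy_
names and a single-entry "directory", could look like this. Opening an
existing entry leaves the directory unchanged and needs no break. Only
the create path has to deal with the parent's delegation:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_O_CREAT 0x1	/* hypothetical flag value for the illustration */

struct toy_dir {
	bool delegated;		/* true while a delegation is outstanding */
	bool has_entry;		/* whether the looked-up name exists */
};

static int toy_lookup_open(struct toy_dir *dir, int flags,
			   struct toy_dir **delegated_dir)
{
	if (dir->has_entry)
		return 0;		/* plain open: directory is unchanged */
	if (!(flags & TOY_O_CREAT))
		return -ENOENT;
	if (dir->delegated) {		/* creating: break the lease first */
		if (delegated_dir)
			*delegated_dir = dir;
		return -EWOULDBLOCK;
	}
	dir->has_entry = true;		/* the create itself */
	return 0;
}
```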

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/namei.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 2211ed9f427cc97391d068b1a33ce388266a3e02..c8fe924cbb7dcefac9a4930df9f8303d9a478508 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3609,7 +3609,7 @@ static struct dentry *atomic_open(struct nameidata *nd, struct dentry *dentry,
  */
 static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
 				  const struct open_flags *op,
-				  bool got_write)
+				  bool got_write, struct inode **delegated_inode)
 {
 	struct mnt_idmap *idmap;
 	struct dentry *dir = nd->path.dentry;
@@ -3698,6 +3698,11 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
 
 	/* Negative dentry, just create the file */
 	if (!dentry->d_inode && (open_flag & O_CREAT)) {
+		/* but break the directory lease first! */
+		error = try_break_deleg(dir_inode, delegated_inode);
+		if (error)
+			goto out_dput;
+
 		file->f_mode |= FMODE_CREATED;
 		audit_inode_child(dir_inode, dentry, AUDIT_TYPE_CHILD_CREATE);
 		if (!dir_inode->i_op->create) {
@@ -3761,6 +3766,7 @@ static const char *open_last_lookups(struct nameidata *nd,
 		   struct file *file, const struct open_flags *op)
 {
 	struct dentry *dir = nd->path.dentry;
+	struct inode *delegated_inode = NULL;
 	int open_flag = op->open_flag;
 	bool got_write = false;
 	struct dentry *dentry;
@@ -3791,7 +3797,7 @@ static const char *open_last_lookups(struct nameidata *nd,
 				return ERR_PTR(-ECHILD);
 		}
 	}
-
+retry:
 	if (open_flag & (O_CREAT | O_TRUNC | O_WRONLY | O_RDWR)) {
 		got_write = !mnt_want_write(nd->path.mnt);
 		/*
@@ -3804,7 +3810,7 @@ static const char *open_last_lookups(struct nameidata *nd,
 		inode_lock(dir->d_inode);
 	else
 		inode_lock_shared(dir->d_inode);
-	dentry = lookup_open(nd, file, op, got_write);
+	dentry = lookup_open(nd, file, op, got_write, &delegated_inode);
 	if (!IS_ERR(dentry)) {
 		if (file->f_mode & FMODE_CREATED)
 			fsnotify_create(dir->d_inode, dentry);
@@ -3819,8 +3825,16 @@ static const char *open_last_lookups(struct nameidata *nd,
 	if (got_write)
 		mnt_drop_write(nd->path.mnt);
 
-	if (IS_ERR(dentry))
+	if (IS_ERR(dentry)) {
+		if (delegated_inode) {
+			int error = break_deleg_wait(&delegated_inode);
+
+			if (!error)
+				goto retry;
+			return ERR_PTR(error);
+		}
 		return ERR_CAST(dentry);
+	}
 
 	if (file->f_mode & (FMODE_OPENED | FMODE_CREATED)) {
 		dput(nd->path.dentry);

-- 
2.49.0



* [PATCH RFC v2 07/28] vfs: make vfs_create break delegations on parent directory
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (5 preceding siblings ...)
  2025-06-02 14:01 ` [PATCH RFC v2 06/28] vfs: break parent dir delegations in open(..., O_CREAT) codepath Jeff Layton
@ 2025-06-02 14:01 ` Jeff Layton
  2025-06-02 14:01 ` [PATCH RFC v2 08/28] vfs: make vfs_mknod " Jeff Layton
                   ` (20 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:01 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

In order to add directory delegation support, we need to break
delegations on the parent whenever there is going to be a change in the
directory.

Rename vfs_create as __vfs_create, make it static, and add a new
delegated_inode parameter. Fix do_mknodat to call __vfs_create and wait
for a delegation break if there is one. Add a new exported vfs_create
wrapper that passes in NULL for delegated_inode.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
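Reviewer note (below the cut line): a user-space sketch of what the
NULL-passing wrapper means for callers. It assumes try_break_deleg()
only records the inode when it is given a non-NULL delegated_inode
pointer, so callers of the exported wrapper see -EWOULDBLOCK and cannot
wait. All model_* names are hypothetical; model_vfs_create_deleg stands
in for __vfs_create.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a parent directory inode. */
struct model_inode {
	bool delegated;
};

/* Model of try_break_deleg(): record the inode only if the caller asked. */
static int model_try_break_deleg(struct model_inode *dir,
				 struct model_inode **delegated_inode)
{
	if (!dir->delegated)
		return 0;
	if (delegated_inode)
		*delegated_inode = dir;
	return -EWOULDBLOCK;
}

/* Model of __vfs_create(): break the parent's delegation, then create. */
static int model_vfs_create_deleg(struct model_inode *dir, int *created,
				  struct model_inode **delegated_inode)
{
	int error;

	assert(created != NULL);
	error = model_try_break_deleg(dir, delegated_inode);
	if (error)
		return error;
	(*created)++;	/* stands in for dir->i_op->create() */
	return 0;
}

/* Model of the exported vfs_create(): no retry support, so pass NULL. */
static int model_vfs_create(struct model_inode *dir, int *created)
{
	return model_vfs_create_deleg(dir, created, NULL);
}
```

Under these assumptions only the internal variant gives the caller an
inode to wait on; the wrapper simply reports the failure.
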
 fs/namei.c | 55 ++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 36 insertions(+), 19 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index c8fe924cbb7dcefac9a4930df9f8303d9a478508..7b27a9bc4616d3880d6365f1e37f13f7f45bc2c9 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3370,6 +3370,32 @@ static inline umode_t vfs_prepare_mode(struct mnt_idmap *idmap,
 	return mode;
 }
 
+static int __vfs_create(struct mnt_idmap *idmap, struct inode *dir,
+			struct dentry *dentry, umode_t mode, bool want_excl,
+			struct inode **delegated_inode)
+{
+	int error;
+
+	error = may_create(idmap, dir, dentry);
+	if (error)
+		return error;
+
+	if (!dir->i_op->create)
+		return -EACCES;	/* shouldn't it be ENOSYS? */
+
+	mode = vfs_prepare_mode(idmap, dir, mode, S_IALLUGO, S_IFREG);
+	error = security_inode_create(dir, dentry, mode);
+	if (error)
+		return error;
+	error = try_break_deleg(dir, delegated_inode);
+	if (error)
+		return error;
+	error = dir->i_op->create(idmap, dir, dentry, mode, want_excl);
+	if (!error)
+		fsnotify_create(dir, dentry);
+	return error;
+}
+
 /**
  * vfs_create - create new file
  * @idmap:	idmap of the mount the inode was found from
@@ -3389,23 +3415,7 @@ static inline umode_t vfs_prepare_mode(struct mnt_idmap *idmap,
 int vfs_create(struct mnt_idmap *idmap, struct inode *dir,
 	       struct dentry *dentry, umode_t mode, bool want_excl)
 {
-	int error;
-
-	error = may_create(idmap, dir, dentry);
-	if (error)
-		return error;
-
-	if (!dir->i_op->create)
-		return -EACCES;	/* shouldn't it be ENOSYS? */
-
-	mode = vfs_prepare_mode(idmap, dir, mode, S_IALLUGO, S_IFREG);
-	error = security_inode_create(dir, dentry, mode);
-	if (error)
-		return error;
-	error = dir->i_op->create(idmap, dir, dentry, mode, want_excl);
-	if (!error)
-		fsnotify_create(dir, dentry);
-	return error;
+	return __vfs_create(idmap, dir, dentry, mode, want_excl, NULL);
 }
 EXPORT_SYMBOL(vfs_create);
 
@@ -4278,6 +4288,7 @@ static int do_mknodat(int dfd, struct filename *name, umode_t mode,
 	struct path path;
 	int error;
 	unsigned int lookup_flags = 0;
+	struct inode *delegated_inode = NULL;
 
 	error = may_mknod(mode);
 	if (error)
@@ -4296,8 +4307,9 @@ static int do_mknodat(int dfd, struct filename *name, umode_t mode,
 	idmap = mnt_idmap(path.mnt);
 	switch (mode & S_IFMT) {
 		case 0: case S_IFREG:
-			error = vfs_create(idmap, path.dentry->d_inode,
-					   dentry, mode, true);
+			error = __vfs_create(idmap, path.dentry->d_inode,
+					     dentry, mode, true,
+					     &delegated_inode);
 			if (!error)
 				security_path_post_mknod(idmap, dentry);
 			break;
@@ -4312,6 +4324,11 @@ static int do_mknodat(int dfd, struct filename *name, umode_t mode,
 	}
 out2:
 	done_path_create(&path, dentry);
+	if (delegated_inode) {
+		error = break_deleg_wait(&delegated_inode);
+		if (!error)
+			goto retry;
+	}
 	if (retry_estale(error, lookup_flags)) {
 		lookup_flags |= LOOKUP_REVAL;
 		goto retry;

-- 
2.49.0



* [PATCH RFC v2 08/28] vfs: make vfs_mknod break delegations on parent directory
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (6 preceding siblings ...)
  2025-06-02 14:01 ` [PATCH RFC v2 07/28] vfs: make vfs_create break delegations on parent directory Jeff Layton
@ 2025-06-02 14:01 ` Jeff Layton
  2025-06-02 14:01 ` [PATCH RFC v2 09/28] filelock: lift the ban on directory leases in generic_setlease Jeff Layton
                   ` (19 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:01 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

In order to add directory delegation support, we need to break
delegations on the parent whenever there is going to be a change in the
directory.

Rename vfs_mknod as __vfs_mknod, make it static, and add a new
delegated_inode parameter.  Make do_mknodat call __vfs_mknod and wait
synchronously for delegation breaks to complete. Add a new exported
vfs_mknod wrapper that calls __vfs_mknod with a NULL delegated_inode
pointer.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/namei.c | 57 +++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 35 insertions(+), 22 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 7b27a9bc4616d3880d6365f1e37f13f7f45bc2c9..8f0517ade308134ed6566566d9b575c4e9fb0d4e 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4215,24 +4215,9 @@ inline struct dentry *user_path_create(int dfd, const char __user *pathname,
 }
 EXPORT_SYMBOL(user_path_create);
 
-/**
- * vfs_mknod - create device node or file
- * @idmap:	idmap of the mount the inode was found from
- * @dir:	inode of the parent directory
- * @dentry:	dentry of the child device node
- * @mode:	mode of the child device node
- * @dev:	device number of device to create
- *
- * Create a device node or file.
- *
- * If the inode has been found through an idmapped mount the idmap of
- * the vfsmount must be passed through @idmap. This function will then take
- * care to map the inode according to @idmap before checking permissions.
- * On non-idmapped mounts or if permission checking is to be performed on the
- * raw inode simply pass @nop_mnt_idmap.
- */
-int vfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
-	      struct dentry *dentry, umode_t mode, dev_t dev)
+static int __vfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
+		       struct dentry *dentry, umode_t mode, dev_t dev,
+		       struct inode **delegated_inode)
 {
 	bool is_whiteout = S_ISCHR(mode) && dev == WHITEOUT_DEV;
 	int error = may_create(idmap, dir, dentry);
@@ -4256,11 +4241,37 @@ int vfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
 	if (error)
 		return error;
 
+	error = try_break_deleg(dir, delegated_inode);
+	if (error)
+		return error;
+
 	error = dir->i_op->mknod(idmap, dir, dentry, mode, dev);
 	if (!error)
 		fsnotify_create(dir, dentry);
 	return error;
 }
+
+/**
+ * vfs_mknod - create device node or file
+ * @idmap:	idmap of the mount the inode was found from
+ * @dir:	inode of the parent directory
+ * @dentry:	dentry of the child device node
+ * @mode:	mode of the child device node
+ * @dev:	device number of device to create
+ *
+ * Create a device node or file.
+ *
+ * If the inode has been found through an idmapped mount the idmap of
+ * the vfsmount must be passed through @idmap. This function will then take
+ * care to map the inode according to @idmap before checking permissions.
+ * On non-idmapped mounts or if permission checking is to be performed on the
+ * raw inode simply pass @nop_mnt_idmap.
+ */
+int vfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
+	      struct dentry *dentry, umode_t mode, dev_t dev)
+{
+	return __vfs_mknod(idmap, dir, dentry, mode, dev, NULL);
+}
 EXPORT_SYMBOL(vfs_mknod);
 
 static int may_mknod(umode_t mode)
@@ -4314,12 +4325,14 @@ static int do_mknodat(int dfd, struct filename *name, umode_t mode,
 				security_path_post_mknod(idmap, dentry);
 			break;
 		case S_IFCHR: case S_IFBLK:
-			error = vfs_mknod(idmap, path.dentry->d_inode,
-					  dentry, mode, new_decode_dev(dev));
+			error = __vfs_mknod(idmap, path.dentry->d_inode,
+					    dentry, mode, new_decode_dev(dev),
+					    &delegated_inode);
 			break;
 		case S_IFIFO: case S_IFSOCK:
-			error = vfs_mknod(idmap, path.dentry->d_inode,
-					  dentry, mode, 0);
+			error = __vfs_mknod(idmap, path.dentry->d_inode,
+					    dentry, mode, 0,
+					    &delegated_inode);
 			break;
 	}
 out2:

-- 
2.49.0



* [PATCH RFC v2 09/28] filelock: lift the ban on directory leases in generic_setlease
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (7 preceding siblings ...)
  2025-06-02 14:01 ` [PATCH RFC v2 08/28] vfs: make vfs_mknod " Jeff Layton
@ 2025-06-02 14:01 ` Jeff Layton
  2025-06-02 14:01 ` [PATCH RFC v2 10/28] nfsd: allow filecache to hold S_IFDIR files Jeff Layton
                   ` (18 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:01 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

With the addition of the try_break_deleg calls in directory-changing
operations, allow generic_setlease to hand out leases on directories.

Note that this also makes directory leases available to userland via
fcntl(). I don't see a real reason to prevent userland from acquiring
one, but we could reinstate the prohibition if that's preferable.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
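Reviewer note (below the cut line): a minimal userland probe for the
fcntl() behaviour mentioned above. try_dir_read_lease() is a
hypothetical helper name; F_SETLEASE, F_RDLCK and F_UNLCK are the
existing fcntl(2) lease interface. On a kernel without this patch I
would expect -EINVAL for a directory. With it, success presumably also
depends on the filesystem's ->setlease handler and on the usual
ownership/CAP_LEASE rules.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#ifndef F_SETLEASE
#define F_SETLEASE 1024	/* Linux UAPI value, if _GNU_SOURCE came too late */
#endif

/*
 * Try to take (and immediately drop) a read lease on a directory.
 * Returns 0 on success or a negative errno.
 */
static int try_dir_read_lease(const char *path)
{
	int fd, ret = 0;

	assert(path != NULL);

	fd = open(path, O_RDONLY | O_DIRECTORY);
	if (fd < 0)
		return -errno;

	if (fcntl(fd, F_SETLEASE, F_RDLCK) < 0)
		ret = -errno;
	else
		fcntl(fd, F_SETLEASE, F_UNLCK);	/* release the lease again */

	close(fd);
	return ret;
}
```

Calling try_dir_read_lease(".") before and after applying the patch is
one quick way to see whether the prohibition is still in effect.
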
 fs/locks.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/locks.c b/fs/locks.c
index 1985f38d326d938f58009e0880b45e588af6a422..82a1b528dc9dae8c1f3a81084072e649d481e8f1 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1934,7 +1934,9 @@ static int generic_delete_lease(struct file *filp, void *owner)
 int generic_setlease(struct file *filp, int arg, struct file_lease **flp,
 			void **priv)
 {
-	if (!S_ISREG(file_inode(filp)->i_mode))
+	struct inode *inode = file_inode(filp);
+
+	if (!S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode))
 		return -EINVAL;
 
 	switch (arg) {

-- 
2.49.0



* [PATCH RFC v2 10/28] nfsd: allow filecache to hold S_IFDIR files
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (8 preceding siblings ...)
  2025-06-02 14:01 ` [PATCH RFC v2 09/28] filelock: lift the ban on directory leases in generic_setlease Jeff Layton
@ 2025-06-02 14:01 ` Jeff Layton
  2025-06-02 14:01 ` [PATCH RFC v2 11/28] nfsd: allow DELEGRETURN on directories Jeff Layton
                   ` (17 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:01 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

The filecache infrastructure currently handles only regular (S_IFREG)
files. Plumb a "type" variable into nfsd_file_do_acquire and have all
of the existing callers set it to S_IFREG. Add a new
nfsd_file_acquire_dir() wrapper that we can then call to request an
nfsd_file that holds a directory open.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/filecache.c | 50 ++++++++++++++++++++++++++++++++++++++------------
 fs/nfsd/filecache.h |  2 ++
 fs/nfsd/vfs.c       |  5 +++--
 fs/nfsd/vfs.h       |  2 +-
 4 files changed, 44 insertions(+), 15 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index ab85e6a2454f4c783fcb1175ebbeb63a31519c18..3468883146afc080d2b4862e6002b2c6ff7315b9 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -1049,7 +1049,7 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct net *net,
 		     struct auth_domain *client,
 		     struct svc_fh *fhp,
 		     unsigned int may_flags, struct file *file,
-		     struct nfsd_file **pnf, bool want_gc)
+		     umode_t type, bool want_gc, struct nfsd_file **pnf)
 {
 	unsigned char need = may_flags & NFSD_FILE_MAY_MASK;
 	struct nfsd_file *new, *nf;
@@ -1060,13 +1060,13 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct net *net,
 	int ret;
 
 retry:
-	if (rqstp) {
-		status = fh_verify(rqstp, fhp, S_IFREG,
+	if (rqstp)
+		status = fh_verify(rqstp, fhp, type,
 				   may_flags|NFSD_MAY_OWNER_OVERRIDE);
-	} else {
-		status = fh_verify_local(net, cred, client, fhp, S_IFREG,
+	else
+		status = fh_verify_local(net, cred, client, fhp, type,
 					 may_flags|NFSD_MAY_OWNER_OVERRIDE);
-	}
+
 	if (status != nfs_ok)
 		return status;
 	inode = d_inode(fhp->fh_dentry);
@@ -1147,7 +1147,7 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct net *net,
 			status = nfs_ok;
 			trace_nfsd_file_opened(nf, status);
 		} else {
-			ret = nfsd_open_verified(fhp, may_flags, &nf->nf_file);
+			ret = nfsd_open_verified(fhp, type, may_flags, &nf->nf_file);
 			if (ret == -EOPENSTALE && stale_retry) {
 				stale_retry = false;
 				nfsd_file_unhash(nf);
@@ -1207,7 +1207,7 @@ nfsd_file_acquire_gc(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		     unsigned int may_flags, struct nfsd_file **pnf)
 {
 	return nfsd_file_do_acquire(rqstp, SVC_NET(rqstp), NULL, NULL,
-				    fhp, may_flags, NULL, pnf, true);
+				    fhp, may_flags, NULL, S_IFREG, true, pnf);
 }
 
 /**
@@ -1232,7 +1232,7 @@ nfsd_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		  unsigned int may_flags, struct nfsd_file **pnf)
 {
 	return nfsd_file_do_acquire(rqstp, SVC_NET(rqstp), NULL, NULL,
-				    fhp, may_flags, NULL, pnf, false);
+				    fhp, may_flags, NULL, S_IFREG, false, pnf);
 }
 
 /**
@@ -1275,8 +1275,8 @@ nfsd_file_acquire_local(struct net *net, struct svc_cred *cred,
 	const struct cred *save_cred = get_current_cred();
 	__be32 beres;
 
-	beres = nfsd_file_do_acquire(NULL, net, cred, client,
-				     fhp, may_flags, NULL, pnf, false);
+	beres = nfsd_file_do_acquire(NULL, net, cred, client, fhp, may_flags,
+				     NULL, S_IFREG, false, pnf);
 	put_cred(revert_creds(save_cred));
 	return beres;
 }
@@ -1305,7 +1305,33 @@ nfsd_file_acquire_opened(struct svc_rqst *rqstp, struct svc_fh *fhp,
 			 struct nfsd_file **pnf)
 {
 	return nfsd_file_do_acquire(rqstp, SVC_NET(rqstp), NULL, NULL,
-				    fhp, may_flags, file, pnf, false);
+				    fhp, may_flags, file, S_IFREG, false, pnf);
+}
+
+/**
+ * nfsd_file_acquire_dir - Get a struct nfsd_file with an open directory
+ * @rqstp: the RPC transaction being executed
+ * @fhp: the NFS filehandle of the file to be opened
+ * @pnf: OUT: new or found "struct nfsd_file" object
+ *
+ * The nfsd_file_object returned by this API is reference-counted
+ * but not garbage-collected. The object is unhashed after the
+ * final nfsd_file_put(). This opens directories only, and only
+ * in O_RDONLY mode.
+ *
+ * Return values:
+ *   %nfs_ok - @pnf points to an nfsd_file with its reference
+ *   count boosted.
+ *
+ * On error, an nfsstat value in network byte order is returned.
+ */
+__be32
+nfsd_file_acquire_dir(struct svc_rqst *rqstp, struct svc_fh *fhp,
+		      struct nfsd_file **pnf)
+{
+	return nfsd_file_do_acquire(rqstp, SVC_NET(rqstp), NULL, NULL, fhp,
+				    NFSD_MAY_READ|NFSD_MAY_64BIT_COOKIE,
+				    NULL, S_IFDIR, false, pnf);
 }
 
 /*
diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
index 5865f9c7271214de7269ab33480689cd61c1c552..3717503f6b75462c7afca6216d9c915e055b117d 100644
--- a/fs/nfsd/filecache.h
+++ b/fs/nfsd/filecache.h
@@ -78,5 +78,7 @@ __be32 nfsd_file_acquire_opened(struct svc_rqst *rqstp, struct svc_fh *fhp,
 __be32 nfsd_file_acquire_local(struct net *net, struct svc_cred *cred,
 			       struct auth_domain *client, struct svc_fh *fhp,
 			       unsigned int may_flags, struct nfsd_file **pnf);
+__be32 nfsd_file_acquire_dir(struct svc_rqst *rqstp, struct svc_fh *fhp,
+		  struct nfsd_file **pnf);
 int nfsd_file_cache_stats_show(struct seq_file *m, void *v);
 #endif /* _FS_NFSD_FILECACHE_H */
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index cd689df2ca5d7396cffb5ed9dc14f774a8f3881c..86873662925e2f50823d4ce371f1018b749b4c6d 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -949,15 +949,16 @@ nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
 /**
  * nfsd_open_verified - Open a regular file for the filecache
  * @fhp: NFS filehandle of the file to open
+ * @type: S_IFMT inode type allowed (0 means any type is allowed)
  * @may_flags: internal permission flags
  * @filp: OUT: open "struct file *"
  *
  * Returns zero on success, or a negative errno value.
  */
 int
-nfsd_open_verified(struct svc_fh *fhp, int may_flags, struct file **filp)
+nfsd_open_verified(struct svc_fh *fhp, umode_t type, int may_flags, struct file **filp)
 {
-	return __nfsd_open(fhp, S_IFREG, may_flags, filp);
+	return __nfsd_open(fhp, type, may_flags, filp);
 }
 
 /*
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index eff04959606fe55c141ab4a2eed97c7e0716a5f5..9c309cdee6c491d445d56cdc6133f37248ab4b46 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -114,7 +114,7 @@ __be32		nfsd_setxattr(struct svc_rqst *rqstp, struct svc_fh *fhp,
 int 		nfsd_open_break_lease(struct inode *, int);
 __be32		nfsd_open(struct svc_rqst *, struct svc_fh *, umode_t,
 				int, struct file **);
-int		nfsd_open_verified(struct svc_fh *fhp, int may_flags,
+int		nfsd_open_verified(struct svc_fh *fhp, umode_t type, int may_flags,
 				struct file **filp);
 __be32		nfsd_splice_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
 				struct file *file, loff_t offset,

-- 
2.49.0



* [PATCH RFC v2 11/28] nfsd: allow DELEGRETURN on directories
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (9 preceding siblings ...)
  2025-06-02 14:01 ` [PATCH RFC v2 10/28] nfsd: allow filecache to hold S_IFDIR files Jeff Layton
@ 2025-06-02 14:01 ` Jeff Layton
  2025-06-02 14:01 ` [PATCH RFC v2 12/28] nfsd: check for delegation conflicts vs. the same client Jeff Layton
                   ` (16 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:01 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

As Trond pointed out: "...provided that the presented stateid is
actually valid, it is also sufficient to uniquely identify the file to
which it is associated (see RFC8881 Section 8.2.4), so the filehandle
should be considered mostly irrelevant for operations like DELEGRETURN."

Don't ask fh_verify to filter on file type.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4state.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index a0e3fa2718c7ef331925e9ba8f2a66f331c76db5..5bf12abe4778ca0a16cd68965062da25470c8a93 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -7758,7 +7758,8 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	__be32 status;
 	struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
 
-	if ((status = fh_verify(rqstp, &cstate->current_fh, S_IFREG, 0)))
+	status = fh_verify(rqstp, &cstate->current_fh, 0, 0);
+	if (status)
 		return status;
 
 	status = nfsd4_lookup_stateid(cstate, stateid, SC_TYPE_DELEG, SC_STATUS_REVOKED, &s, nn);

-- 
2.49.0



* [PATCH RFC v2 12/28] nfsd: check for delegation conflicts vs. the same client
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (10 preceding siblings ...)
  2025-06-02 14:01 ` [PATCH RFC v2 11/28] nfsd: allow DELEGRETURN on directories Jeff Layton
@ 2025-06-02 14:01 ` Jeff Layton
  2025-06-02 14:01 ` [PATCH RFC v2 13/28] nfsd: wire up GET_DIR_DELEGATION handling Jeff Layton
                   ` (15 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:01 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

RFC 8881 requires that the server reply with GDD4_UNAVAIL when the client
requests a directory delegation that it already holds.

When setting a directory delegation, check whether the client that owns
the new delegation also owns an existing nfsd directory delegation on
the same inode. If it does, reject the setlease attempt.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
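Reviewer note (below the cut line): the conflict rule described above
can be sketched in user space as follows. The model_* types are
hypothetical stand-ins for struct file_lease, its fl_lmops pointer and
the owning nfs4_delegation; only the decision logic is meant to mirror
nfsd_dir_may_setlease().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the lease manager ops and delegation. */
struct model_lmops { const char *name; };
struct model_deleg { int client_id; };

struct model_lease {
	const struct model_lmops *lmops;	/* who manages this lease */
	void *owner;				/* a model_deleg for nfsd dir leases */
};

static const struct model_lmops model_dir_lmops = { "nfsd-dir" };

/* Returns true if @new may coexist with the already-set lease @old. */
static bool model_dir_may_setlease(const struct model_lease *new,
				   const struct model_lease *old)
{
	const struct model_deleg *od, *nd;

	/* This callback is only installed on nfsd directory leases. */
	assert(new->lmops == &model_dir_lmops);

	/* Only other nfsd directory delegations can conflict. */
	if (old->lmops != &model_dir_lmops)
		return true;

	od = old->owner;
	nd = new->owner;

	/* A second delegation for the same client is refused. */
	return od->client_id != nd->client_id;
}
```

Comparing the lmops pointer first matters in the model for the same
reason it would in the kernel: the owner field can only be interpreted
as a delegation once the lease is known to be an nfsd directory lease.
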
 fs/nfsd/nfs4state.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 5bf12abe4778ca0a16cd68965062da25470c8a93..12f20e3c9c54b68cdd4c62aa2904c22c9ccfae0a 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -88,6 +88,7 @@ void nfsd4_end_grace(struct nfsd_net *nn);
 static void _free_cpntf_state_locked(struct nfsd_net *nn, struct nfs4_cpntf_state *cps);
 static void nfsd4_file_hash_remove(struct nfs4_file *fi);
 static void deleg_reaper(struct nfsd_net *nn);
+static bool nfsd_dir_may_setlease(struct file_lease *new, struct file_lease *old);
 
 /* Locking: */
 
@@ -5503,6 +5504,31 @@ static const struct lease_manager_operations nfsd_lease_mng_ops = {
 	.lm_change = nfsd_change_deleg_cb,
 };
 
+static const struct lease_manager_operations nfsd_dir_lease_mng_ops = {
+	.lm_breaker_owns_lease = nfsd_breaker_owns_lease,
+	.lm_break = nfsd_break_deleg_cb,
+	.lm_change = nfsd_change_deleg_cb,
+	.lm_may_setlease = nfsd_dir_may_setlease,
+};
+
+static bool
+nfsd_dir_may_setlease(struct file_lease *new, struct file_lease *old)
+{
+	struct nfs4_delegation *od, *nd;
+
+	/* Only conflicts with other nfsd dir delegs */
+	if (old->fl_lmops != &nfsd_dir_lease_mng_ops)
+		return true;
+
+	od = old->c.flc_owner;
+	nd = new->c.flc_owner;
+
+	/* Are these for the same client? No bueno if so */
+	if (od->dl_stid.sc_client == nd->dl_stid.sc_client)
+		return false;
+	return true;
+}
+
 static __be32 nfsd4_check_seqid(struct nfsd4_compound_state *cstate, struct nfs4_stateowner *so, u32 seqid)
 {
 	if (nfsd4_has_session(cstate))
@@ -5841,12 +5867,13 @@ static struct file_lease *nfs4_alloc_init_lease(struct nfs4_delegation *dp)
 	fl = locks_alloc_lease();
 	if (!fl)
 		return NULL;
-	fl->fl_lmops = &nfsd_lease_mng_ops;
 	fl->c.flc_flags = FL_DELEG;
 	fl->c.flc_type = deleg_is_read(dp->dl_type) ? F_RDLCK : F_WRLCK;
 	fl->c.flc_owner = (fl_owner_t)dp;
 	fl->c.flc_pid = current->tgid;
 	fl->c.flc_file = dp->dl_stid.sc_file->fi_deleg_file->nf_file;
+	fl->fl_lmops = S_ISDIR(file_inode(fl->c.flc_file)->i_mode) ?
+				&nfsd_dir_lease_mng_ops : &nfsd_lease_mng_ops;
 	return fl;
 }
 

-- 
2.49.0



* [PATCH RFC v2 13/28] nfsd: wire up GET_DIR_DELEGATION handling
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (11 preceding siblings ...)
  2025-06-02 14:01 ` [PATCH RFC v2 12/28] nfsd: check for delegation conflicts vs. the same client Jeff Layton
@ 2025-06-02 14:01 ` Jeff Layton
  2025-06-02 14:01 ` [PATCH RFC v2 14/28] filelock: rework the __break_lease API to use flags Jeff Layton
                   ` (14 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:01 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

Add a new routine for acquiring a read delegation on a directory. Since
the same CB_RECALL/DELEGRETURN infrastructure is used for regular and
directory delegations, we can just use a normal nfs4_delegation to
represent it.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
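Reviewer note (below the cut line): a user-space sketch of how the
handler in this patch maps the result of nfsd_get_dir_deleg() onto the
reply. -EAGAIN becomes a successful operation carrying the non-fatal
GDD4_UNAVAIL status, and any other error fails the operation. The
model_* names are hypothetical; the 0/1 enum values follow my reading
of the gddrnf4_status definition in RFC 8881.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical mirror of the non-fatal GET_DIR_DELEGATION status. */
enum model_gdd_status {
	MODEL_GDD4_OK = 0,
	MODEL_GDD4_UNAVAIL = 1,
};

struct model_gdd_reply {
	int op_error;			/* 0 mirrors nfs_ok, else a negative errno */
	enum model_gdd_status status;	/* only meaningful when op_error == 0 */
};

/* Map the errno-style result of the delegation attempt onto a reply. */
static struct model_gdd_reply model_gdd_result(int err)
{
	struct model_gdd_reply r = { 0, MODEL_GDD4_OK };

	assert(err <= 0);
	if (err == -EAGAIN)
		r.status = MODEL_GDD4_UNAVAIL;	/* op succeeds, no delegation */
	else if (err)
		r.op_error = err;		/* op itself fails */
	return r;
}
```

Keeping "unavailable" separate from hard failures lets a client treat
the former as a hint to carry on without a delegation.
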
 fs/nfsd/nfs4proc.c  | 21 +++++++++++++-
 fs/nfsd/nfs4state.c | 82 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/nfsd/state.h     |  5 ++++
 3 files changed, 107 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index f13abbb13b388d223165b1168dc2c07eafb259cb..fa6f2980bcacd798c41387c71d55a59fdbc8043c 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -2298,6 +2298,13 @@ nfsd4_get_dir_delegation(struct svc_rqst *rqstp,
 			 union nfsd4_op_u *u)
 {
 	struct nfsd4_get_dir_delegation *gdd = &u->get_dir_delegation;
+	struct nfs4_delegation *dd;
+	struct nfsd_file *nf;
+	__be32 status;
+
+	status = nfsd_file_acquire_dir(rqstp, &cstate->current_fh, &nf);
+	if (status != nfs_ok)
+		return status;
 
 	/*
 	 * RFC 8881, section 18.39.3 says:
@@ -2311,7 +2318,19 @@ nfsd4_get_dir_delegation(struct svc_rqst *rqstp,
 	 * return NFS4_OK with a non-fatal status of GDD4_UNAVAIL in this
 	 * situation.
 	 */
-	gdd->gddrnf_status = GDD4_UNAVAIL;
+	dd = nfsd_get_dir_deleg(cstate, gdd, nf);
+	if (IS_ERR(dd)) {
+		int err = PTR_ERR(dd);
+
+		if (err != -EAGAIN)
+			return nfserrno(err);
+		gdd->gddrnf_status = GDD4_UNAVAIL;
+		return nfs_ok;
+	}
+
+	gdd->gddrnf_status = GDD4_OK;
+	memcpy(&gdd->gddr_stateid, &dd->dl_stid.sc_stateid, sizeof(gdd->gddr_stateid));
+	nfs4_put_stid(&dd->dl_stid);
 	return nfs_ok;
 }
 
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 12f20e3c9c54b68cdd4c62aa2904c22c9ccfae0a..ed5d6486d171ea0c886bd1f1ea1129bf4ccf429c 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -9307,3 +9307,85 @@ nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct dentry *dentry,
 	nfs4_put_stid(&dp->dl_stid);
 	return status;
 }
+
+/**
+ * nfsd_get_dir_deleg - attempt to get a directory delegation
+ * @cstate: compound state
+ * @gdd: GET_DIR_DELEGATION arg/resp structure
+ * @nf: nfsd_file opened on the directory
+ *
+ * Given a GET_DIR_DELEGATION request @gdd, attempt to acquire a delegation
+ * on the directory to which @nf refers. Note that this does not set up any
+ * sort of async notifications for the delegation.
+ */
+struct nfs4_delegation *
+nfsd_get_dir_deleg(struct nfsd4_compound_state *cstate,
+		   struct nfsd4_get_dir_delegation *gdd,
+		   struct nfsd_file *nf)
+{
+	struct nfs4_client *clp = cstate->clp;
+	struct nfs4_delegation *dp;
+	struct file_lease *fl;
+	struct nfs4_file *fp;
+	int status = 0;
+
+	fp = nfsd4_alloc_file();
+	if (!fp)
+		return ERR_PTR(-ENOMEM);
+
+	nfsd4_file_init(&cstate->current_fh, fp);
+	fp->fi_deleg_file = nf;
+	fp->fi_delegees = 1;
+
+	/* if this client already has one, return that it's unavailable */
+	spin_lock(&state_lock);
+	spin_lock(&fp->fi_lock);
+	if (nfs4_delegation_exists(clp, fp))
+		status = -EAGAIN;
+	spin_unlock(&fp->fi_lock);
+	spin_unlock(&state_lock);
+
+	if (status)
+		goto out_delegees;
+
+	/* Try to set up the lease */
+	status = -ENOMEM;
+	dp = alloc_init_deleg(clp, fp, NULL, NFS4_OPEN_DELEGATE_READ);
+	if (!dp)
+		goto out_delegees;
+
+	fl = nfs4_alloc_init_lease(dp);
+	if (!fl)
+		goto out_put_stid;
+
+	status = kernel_setlease(nf->nf_file,
+				 fl->c.flc_type, &fl, NULL);
+	if (fl)
+		locks_free_lease(fl);
+	if (status)
+		goto out_put_stid;
+
+	/*
+	 * Now, try to hash it. This can fail if we race another nfsd task
+	 * trying to set a delegation on the same file. If that happens,
+	 * then just say UNAVAIL.
+	 */
+	spin_lock(&state_lock);
+	spin_lock(&clp->cl_lock);
+	spin_lock(&fp->fi_lock);
+	status = hash_delegation_locked(dp, fp);
+	spin_unlock(&fp->fi_lock);
+	spin_unlock(&clp->cl_lock);
+	spin_unlock(&state_lock);
+
+	if (!status)
+		return dp;
+
+	/* Something failed. Drop the lease and clean up the stid */
+	kernel_setlease(fp->fi_deleg_file->nf_file, F_UNLCK, NULL, (void **)&dp);
+out_put_stid:
+	nfs4_put_stid(&dp->dl_stid);
+out_delegees:
+	put_deleg_file(fp);
+	return ERR_PTR(status);
+}
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 8adc2550129e67a4e6646395fa2811e1c2acb98e..0eeecd824770c4df8e1cc29fc738e568d91d5e5f 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -855,4 +855,9 @@ static inline bool try_to_expire_client(struct nfs4_client *clp)
 
 extern __be32 nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp,
 		struct dentry *dentry, struct nfs4_delegation **pdp);
+
+struct nfsd4_get_dir_delegation;
+struct nfs4_delegation *nfsd_get_dir_deleg(struct nfsd4_compound_state *cstate,
+						struct nfsd4_get_dir_delegation *gdd,
+						struct nfsd_file *nf);
 #endif   /* NFSD4_STATE_H */

-- 
2.49.0



* [PATCH RFC v2 14/28] filelock: rework the __break_lease API to use flags
From: Jeff Layton @ 2025-06-02 14:01 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

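Declare a set of LEASE_BREAK_* flags that control how lease breaks work,
and have __break_lease() take a single flags argument instead of a type
and an openmode. Add an openmode_to_lease_flags() helper so that
break_lease() can keep taking an openmode from its callers.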

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
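A minimal userspace sketch of the flag mapping described above, not
kernel code. The flag values and openmode_to_lease_flags() are copied
from this patch. BIT() is redefined locally, and break_lease_flags() and
wants_write_break() are hypothetical helpers that only illustrate what
break_lease() passes down and how __break_lease() derives want_write.

```c
#include <assert.h>
#include <fcntl.h>	/* O_RDONLY, O_WRONLY, O_NONBLOCK, O_ACCMODE */

/* Userspace stand-in for the kernel's BIT() macro. */
#define BIT(n)	(1U << (n))

/* Flag values as declared in include/linux/filelock.h by this patch. */
#define LEASE_BREAK_LEASE		BIT(0)	/* break leases and delegations */
#define LEASE_BREAK_DELEG		BIT(1)	/* break delegations only */
#define LEASE_BREAK_LAYOUT		BIT(2)	/* break layouts only */
#define LEASE_BREAK_NONBLOCK		BIT(3)	/* non-blocking break */
#define LEASE_BREAK_OPEN_RDONLY		BIT(4)	/* readonly open event */

/* Same logic as the helper added by this patch. */
static inline unsigned int openmode_to_lease_flags(unsigned int mode)
{
	unsigned int flags = 0;

	if ((mode & O_ACCMODE) == O_RDONLY)
		flags |= LEASE_BREAK_OPEN_RDONLY;
	if (mode & O_NONBLOCK)
		flags |= LEASE_BREAK_NONBLOCK;
	return flags;
}

/* Hypothetical: the flags break_lease() would hand to __break_lease(). */
static inline unsigned int break_lease_flags(unsigned int mode)
{
	return LEASE_BREAK_LEASE | openmode_to_lease_flags(mode);
}

/* Hypothetical: mirrors how __break_lease() computes want_write. */
static inline int wants_write_break(unsigned int flags)
{
	/* __break_lease() returns -EINVAL when no type bit is set. */
	assert(flags & (LEASE_BREAK_LEASE | LEASE_BREAK_DELEG |
			LEASE_BREAK_LAYOUT));
	return !(flags & LEASE_BREAK_OPEN_RDONLY);
}
```

A write-mode open sets no extra flag: a write break is the default and
only LEASE_BREAK_OPEN_RDONLY narrows it. That is why try_break_deleg()
goes from O_WRONLY|O_NONBLOCK to plain LEASE_BREAK_NONBLOCK.
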
 fs/locks.c               | 30 +++++++++++++++++-----------
 include/linux/filelock.h | 52 +++++++++++++++++++++++++++++++++++-------------
 2 files changed, 56 insertions(+), 26 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 82a1b528dc9dae8c1f3a81084072e649d481e8f1..6e46176d1e00962904f03c151500e593f410e4c6 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1529,29 +1529,35 @@ any_leases_conflict(struct inode *inode, struct file_lease *breaker)
 /**
  *	__break_lease	-	revoke all outstanding leases on file
  *	@inode: the inode of the file to return
- *	@mode: O_RDONLY: break only write leases; O_WRONLY or O_RDWR:
- *	    break all leases
- *	@type: FL_LEASE: break leases and delegations; FL_DELEG: break
- *	    only delegations
+ *	@flags: LEASE_BREAK_* flags
  *
  *	break_lease (inlined for speed) has checked there already is at least
  *	some kind of lock (maybe a lease) on this file.  Leases are broken on
- *	a call to open() or truncate().  This function can sleep unless you
- *	specified %O_NONBLOCK to your open().
+ *	a call to open() or truncate().  This function can block waiting for the
+ *	lease break unless you specify LEASE_BREAK_NONBLOCK.
  */
-int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
+int __break_lease(struct inode *inode, unsigned int flags)
 {
-	int error = 0;
-	struct file_lock_context *ctx;
 	struct file_lease *new_fl, *fl, *tmp;
+	struct file_lock_context *ctx;
 	unsigned long break_time;
-	int want_write = (mode & O_ACCMODE) != O_RDONLY;
 	LIST_HEAD(dispose);
+	bool want_write = !(flags & LEASE_BREAK_OPEN_RDONLY);
+	int error = 0;
+
 
 	new_fl = lease_alloc(NULL, want_write ? F_WRLCK : F_RDLCK);
 	if (IS_ERR(new_fl))
 		return PTR_ERR(new_fl);
-	new_fl->c.flc_flags = type;
+
+	if (flags & LEASE_BREAK_LEASE)
+		new_fl->c.flc_flags = FL_LEASE;
+	else if (flags & LEASE_BREAK_DELEG)
+		new_fl->c.flc_flags = FL_DELEG;
+	else if (flags & LEASE_BREAK_LAYOUT)
+		new_fl->c.flc_flags = FL_LAYOUT;
+	else
+		return -EINVAL;
 
 	/* typically we will check that ctx is non-NULL before calling */
 	ctx = locks_inode_context(inode);
@@ -1596,7 +1602,7 @@ int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
 	if (list_empty(&ctx->flc_lease))
 		goto out;
 
-	if (mode & O_NONBLOCK) {
+	if (flags & LEASE_BREAK_NONBLOCK) {
 		trace_break_lease_noblock(inode, new_fl);
 		error = -EWOULDBLOCK;
 		goto out;
diff --git a/include/linux/filelock.h b/include/linux/filelock.h
index 60c76c8fb4dfdcaaa2cfa3f41f0f26ffcb3db29f..0fe368060781d0b22f735c2cfb8d8c1a6a238290 100644
--- a/include/linux/filelock.h
+++ b/include/linux/filelock.h
@@ -221,7 +221,14 @@ int locks_lock_inode_wait(struct inode *inode, struct file_lock *fl);
 void locks_init_lease(struct file_lease *);
 void locks_free_lease(struct file_lease *fl);
 struct file_lease *locks_alloc_lease(void);
-int __break_lease(struct inode *inode, unsigned int flags, unsigned int type);
+
+#define LEASE_BREAK_LEASE		BIT(0)	// break leases and delegations
+#define LEASE_BREAK_DELEG		BIT(1)	// break delegations only
+#define LEASE_BREAK_LAYOUT		BIT(2)	// break layouts only
+#define LEASE_BREAK_NONBLOCK		BIT(3)	// non-blocking break
+#define LEASE_BREAK_OPEN_RDONLY		BIT(4)	// readonly open event
+
+int __break_lease(struct inode *inode, unsigned int flags);
 void lease_get_mtime(struct inode *, struct timespec64 *time);
 int generic_setlease(struct file *, int, struct file_lease **, void **priv);
 int kernel_setlease(struct file *, int, struct file_lease **, void **);
@@ -376,7 +383,7 @@ static inline int locks_lock_inode_wait(struct inode *inode, struct file_lock *f
 	return -ENOLCK;
 }
 
-static inline int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
+static inline int __break_lease(struct inode *inode, unsigned int flags)
 {
 	return 0;
 }
@@ -437,6 +444,17 @@ static inline int locks_lock_file_wait(struct file *filp, struct file_lock *fl)
 }
 
 #ifdef CONFIG_FILE_LOCKING
+static inline unsigned int openmode_to_lease_flags(unsigned int mode)
+{
+	unsigned int flags = 0;
+
+	if ((mode & O_ACCMODE) == O_RDONLY)
+		flags |= LEASE_BREAK_OPEN_RDONLY;
+	if (mode & O_NONBLOCK)
+		flags |= LEASE_BREAK_NONBLOCK;
+	return flags;
+}
+
 static inline int break_lease(struct inode *inode, unsigned int mode)
 {
 	struct file_lock_context *flctx;
@@ -452,11 +470,11 @@ static inline int break_lease(struct inode *inode, unsigned int mode)
 		return 0;
 	smp_mb();
 	if (!list_empty_careful(&flctx->flc_lease))
-		return __break_lease(inode, mode, FL_LEASE);
+		return __break_lease(inode, LEASE_BREAK_LEASE | openmode_to_lease_flags(mode));
 	return 0;
 }
 
-static inline int break_deleg(struct inode *inode, unsigned int mode)
+static inline int break_deleg(struct inode *inode, unsigned int flags)
 {
 	struct file_lock_context *flctx;
 
@@ -470,8 +488,10 @@ static inline int break_deleg(struct inode *inode, unsigned int mode)
 	if (!flctx)
 		return 0;
 	smp_mb();
-	if (!list_empty_careful(&flctx->flc_lease))
-		return __break_lease(inode, mode, FL_DELEG);
+	if (!list_empty_careful(&flctx->flc_lease)) {
+		flags |= LEASE_BREAK_DELEG;
+		return __break_lease(inode, flags);
+	}
 	return 0;
 }
 
@@ -479,7 +499,7 @@ static inline int try_break_deleg(struct inode *inode, struct inode **delegated_
 {
 	int ret;
 
-	ret = break_deleg(inode, O_WRONLY|O_NONBLOCK);
+	ret = break_deleg(inode, LEASE_BREAK_NONBLOCK);
 	if (ret == -EWOULDBLOCK && delegated_inode) {
 		*delegated_inode = inode;
 		ihold(inode);
@@ -491,7 +511,7 @@ static inline int break_deleg_wait(struct inode **delegated_inode)
 {
 	int ret;
 
-	ret = break_deleg(*delegated_inode, O_WRONLY);
+	ret = break_deleg(*delegated_inode, 0);
 	iput(*delegated_inode);
 	*delegated_inode = NULL;
 	return ret;
@@ -500,20 +520,24 @@ static inline int break_deleg_wait(struct inode **delegated_inode)
 static inline int break_layout(struct inode *inode, bool wait)
 {
 	smp_mb();
-	if (inode->i_flctx && !list_empty_careful(&inode->i_flctx->flc_lease))
-		return __break_lease(inode,
-				wait ? O_WRONLY : O_WRONLY | O_NONBLOCK,
-				FL_LAYOUT);
+	if (inode->i_flctx && !list_empty_careful(&inode->i_flctx->flc_lease)) {
+		unsigned int flags = LEASE_BREAK_LAYOUT;
+
+		if (!wait)
+			flags |= LEASE_BREAK_NONBLOCK;
+
+		return __break_lease(inode, flags);
+	}
 	return 0;
 }
 
 #else /* !CONFIG_FILE_LOCKING */
-static inline int break_lease(struct inode *inode, unsigned int mode)
+static inline int break_lease(struct inode *inode, bool wait)
 {
 	return 0;
 }
 
-static inline int break_deleg(struct inode *inode, unsigned int mode)
+static inline int break_deleg(struct inode *inode, unsigned int flags)
 {
 	return 0;
 }

-- 
2.49.0



* [PATCH RFC v2 15/28] filelock: add struct delegated_inode
From: Jeff Layton @ 2025-06-02 14:01 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

Later patches will add support for ignoring certain events rather than
breaking the delegation. To do this, the VFS must inform __break_lease()
about why the delegation is being broken. Convert the delegated_inode
double pointer in various VFS functions into a struct that holds an inode
and a reason for the delegation break. Also add a reason argument to
try_break_deleg() and set it in the appropriate places.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
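A minimal userspace sketch of the try/wait/retry pattern that callers
follow with struct delegated_inode, not kernel code. The struct layout,
flag values and helper logic follow this patch. The toy struct inode,
the simplified break_deleg(), toy_mkdir() and rename_old_dir_reason()
are hypothetical stand-ins, and the refcount field only models
ihold()/iput().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define BIT(n)	(1U << (n))

#define LEASE_BREAK_DELEG		BIT(1)
#define LEASE_BREAK_NONBLOCK		BIT(3)
#define LEASE_BREAK_DIR_CREATE		BIT(6)
#define LEASE_BREAK_DIR_DELETE		BIT(7)
#define LEASE_BREAK_DIR_RENAME		BIT(8)
#define LEASE_BREAK_DIR_REASON_MASK	(LEASE_BREAK_DIR_CREATE | \
					 LEASE_BREAK_DIR_DELETE | \
					 LEASE_BREAK_DIR_RENAME)

/* Toy inode (hypothetical): "delegated" models an outstanding delegation. */
struct inode {
	int delegated;
	int refcount;
	unsigned int last_break_flags;
};

struct delegated_inode {
	struct inode *di_inode;
	unsigned int di_reason;	/* LEASE_BREAK_* flags */
};

static inline struct inode *deleg_inode(struct delegated_inode *di)
{
	return di->di_inode;
}

/* Toy break: fails with -EWOULDBLOCK if non-blocking, else "waits". */
static inline int break_deleg(struct inode *inode, unsigned int flags)
{
	if (!inode->delegated)
		return 0;
	flags |= LEASE_BREAK_DELEG;
	inode->last_break_flags = flags;
	if (flags & LEASE_BREAK_NONBLOCK)
		return -EWOULDBLOCK;
	inode->delegated = 0;	/* pretend the holder returned it */
	return 0;
}

static inline int try_break_deleg(struct inode *inode, unsigned int reason,
				  struct delegated_inode *di)
{
	int ret;

	/* The kernel version warns once before masking stray bits. */
	reason &= LEASE_BREAK_DIR_REASON_MASK;

	ret = break_deleg(inode, reason | LEASE_BREAK_NONBLOCK);
	if (ret == -EWOULDBLOCK && di) {
		di->di_inode = inode;
		di->di_reason = reason;
		inode->refcount++;	/* models ihold() */
	}
	return ret;
}

static inline int break_deleg_wait(struct delegated_inode *di)
{
	int ret;

	assert(di->di_inode != NULL);
	ret = break_deleg(di->di_inode, di->di_reason);
	di->di_inode->refcount--;	/* models iput() */
	di->di_inode = NULL;
	return ret;
}

/* Hypothetical: reason used for old_dir in vfs_rename(). */
static inline unsigned int rename_old_dir_reason(const struct inode *old_dir,
						 const struct inode *new_dir)
{
	return old_dir == new_dir ? LEASE_BREAK_DIR_RENAME :
				    LEASE_BREAK_DIR_DELETE;
}

/* Hypothetical caller shaped like do_mkdirat(). */
static inline int toy_mkdir(struct inode *dir)
{
	struct delegated_inode di = { 0 };
	int error;

retry:
	error = try_break_deleg(dir, LEASE_BREAK_DIR_CREATE, &di);
	/* Real code would create the entry here when error == 0. */
	if (deleg_inode(&di)) {
		error = break_deleg_wait(&di);
		if (!error)
			goto retry;
	}
	return error;
}
```

The point of carrying di_reason is that the blocking retry in
break_deleg_wait() reports the same event as the non-blocking attempt,
so a later patch can decide per event whether a break is needed at all.
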
 fs/attr.c                |  4 +--
 fs/namei.c               | 75 +++++++++++++++++++++++++-----------------------
 fs/open.c                |  8 +++---
 fs/posix_acl.c           | 12 ++++----
 fs/utimes.c              |  4 +--
 fs/xattr.c               | 16 +++++------
 include/linux/filelock.h | 63 +++++++++++++++++++++++++++++-----------
 include/linux/fs.h       |  9 +++---
 include/linux/xattr.h    |  4 +--
 9 files changed, 115 insertions(+), 80 deletions(-)

diff --git a/fs/attr.c b/fs/attr.c
index 9caf63d20d03e86c535e9c8c91d49c2a34d34b7a..02f685a56729c2f8b3f6b6d636a9297a1e52062a 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -424,7 +424,7 @@ EXPORT_SYMBOL(may_setattr);
  * performed on the raw inode simply pass @nop_mnt_idmap.
  */
 int notify_change(struct mnt_idmap *idmap, struct dentry *dentry,
-		  struct iattr *attr, struct inode **delegated_inode)
+		  struct iattr *attr, struct delegated_inode *delegated_inode)
 {
 	struct inode *inode = dentry->d_inode;
 	umode_t mode = inode->i_mode;
@@ -543,7 +543,7 @@ int notify_change(struct mnt_idmap *idmap, struct dentry *dentry,
 	 * breaking the delegation in this case.
 	 */
 	if (!(ia_valid & ATTR_DELEG)) {
-		error = try_break_deleg(inode, delegated_inode);
+		error = try_break_deleg(inode, 0, delegated_inode);
 		if (error)
 			return error;
 	}
diff --git a/fs/namei.c b/fs/namei.c
index 8f0517ade308134ed6566566d9b575c4e9fb0d4e..ba9cbdfb591d54cfe3315d8821ce276a6f12700f 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3372,7 +3372,7 @@ static inline umode_t vfs_prepare_mode(struct mnt_idmap *idmap,
 
 static int __vfs_create(struct mnt_idmap *idmap, struct inode *dir,
 			struct dentry *dentry, umode_t mode, bool want_excl,
-			struct inode **delegated_inode)
+			struct delegated_inode *delegated_inode)
 {
 	int error;
 
@@ -3387,7 +3387,7 @@ static int __vfs_create(struct mnt_idmap *idmap, struct inode *dir,
 	error = security_inode_create(dir, dentry, mode);
 	if (error)
 		return error;
-	error = try_break_deleg(dir, delegated_inode);
+	error = try_break_deleg(dir, LEASE_BREAK_DIR_CREATE, delegated_inode);
 	if (error)
 		return error;
 	error = dir->i_op->create(idmap, dir, dentry, mode, want_excl);
@@ -3618,8 +3618,8 @@ static struct dentry *atomic_open(struct nameidata *nd, struct dentry *dentry,
  * An error code is returned on failure.
  */
 static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
-				  const struct open_flags *op,
-				  bool got_write, struct inode **delegated_inode)
+				  const struct open_flags *op, bool got_write,
+				  struct delegated_inode *delegated_inode)
 {
 	struct mnt_idmap *idmap;
 	struct dentry *dir = nd->path.dentry;
@@ -3709,7 +3709,7 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
 	/* Negative dentry, just create the file */
 	if (!dentry->d_inode && (open_flag & O_CREAT)) {
 		/* but break the directory lease first! */
-		error = try_break_deleg(dir_inode, delegated_inode);
+		error = try_break_deleg(dir_inode, LEASE_BREAK_DIR_CREATE, delegated_inode);
 		if (error)
 			goto out_dput;
 
@@ -3776,7 +3776,7 @@ static const char *open_last_lookups(struct nameidata *nd,
 		   struct file *file, const struct open_flags *op)
 {
 	struct dentry *dir = nd->path.dentry;
-	struct inode *delegated_inode = NULL;
+	struct delegated_inode delegated_inode = { };
 	int open_flag = op->open_flag;
 	bool got_write = false;
 	struct dentry *dentry;
@@ -3836,7 +3836,7 @@ static const char *open_last_lookups(struct nameidata *nd,
 		mnt_drop_write(nd->path.mnt);
 
 	if (IS_ERR(dentry)) {
-		if (delegated_inode) {
+		if (deleg_inode(&delegated_inode)) {
 			int error = break_deleg_wait(&delegated_inode);
 
 			if (!error)
@@ -4217,7 +4217,7 @@ EXPORT_SYMBOL(user_path_create);
 
 static int __vfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
 		       struct dentry *dentry, umode_t mode, dev_t dev,
-		       struct inode **delegated_inode)
+		       struct delegated_inode *delegated_inode)
 {
 	bool is_whiteout = S_ISCHR(mode) && dev == WHITEOUT_DEV;
 	int error = may_create(idmap, dir, dentry);
@@ -4241,7 +4241,7 @@ static int __vfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
 	if (error)
 		return error;
 
-	error = try_break_deleg(dir, delegated_inode);
+	error = try_break_deleg(dir, LEASE_BREAK_DIR_CREATE, delegated_inode);
 	if (error)
 		return error;
 
@@ -4299,7 +4299,7 @@ static int do_mknodat(int dfd, struct filename *name, umode_t mode,
 	struct path path;
 	int error;
 	unsigned int lookup_flags = 0;
-	struct inode *delegated_inode = NULL;
+	struct delegated_inode delegated_inode = { };
 
 	error = may_mknod(mode);
 	if (error)
@@ -4337,7 +4337,7 @@ static int do_mknodat(int dfd, struct filename *name, umode_t mode,
 	}
 out2:
 	done_path_create(&path, dentry);
-	if (delegated_inode) {
+	if (deleg_inode(&delegated_inode)) {
 		error = break_deleg_wait(&delegated_inode);
 		if (!error)
 			goto retry;
@@ -4364,7 +4364,7 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d
 
 static struct dentry *__vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 				  struct dentry *dentry, umode_t mode,
-				  struct inode **delegated_inode)
+				  struct delegated_inode *delegated_inode)
 {
 	int error;
 	unsigned max_links = dir->i_sb->s_max_links;
@@ -4387,7 +4387,7 @@ static struct dentry *__vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 	if (max_links && dir->i_nlink >= max_links)
 		goto err;
 
-	error = try_break_deleg(dir, delegated_inode);
+	error = try_break_deleg(dir, LEASE_BREAK_DIR_CREATE, delegated_inode);
 	if (error)
 		goto err;
 
@@ -4441,7 +4441,7 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode)
 	struct path path;
 	int error;
 	unsigned int lookup_flags = LOOKUP_DIRECTORY;
-	struct inode *delegated_inode = NULL;
+	struct delegated_inode delegated_inode = { };
 
 retry:
 	dentry = filename_create(dfd, name, &path, lookup_flags);
@@ -4458,7 +4458,7 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode)
 			error = PTR_ERR(dentry);
 	}
 	done_path_create(&path, dentry);
-	if (delegated_inode) {
+	if (deleg_inode(&delegated_inode)) {
 		error = break_deleg_wait(&delegated_inode);
 		if (!error)
 			goto retry;
@@ -4483,7 +4483,7 @@ SYSCALL_DEFINE2(mkdir, const char __user *, pathname, umode_t, mode)
 }
 
 static int __vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir,
-		       struct dentry *dentry, struct inode **delegated_inode)
+		       struct dentry *dentry, struct delegated_inode *delegated_inode)
 {
 	int error = may_delete(idmap, dir, dentry, 1);
 
@@ -4505,7 +4505,7 @@ static int __vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir,
 	if (error)
 		goto out;
 
-	error = try_break_deleg(dir, delegated_inode);
+	error = try_break_deleg(dir, LEASE_BREAK_DIR_DELETE, delegated_inode);
 	if (error)
 		goto out;
 
@@ -4555,7 +4555,7 @@ int do_rmdir(int dfd, struct filename *name)
 	struct qstr last;
 	int type;
 	unsigned int lookup_flags = 0;
-	struct inode *delegated_inode = NULL;
+	struct delegated_inode delegated_inode = { };
 retry:
 	error = filename_parentat(dfd, name, lookup_flags, &path, &last, &type);
 	if (error)
@@ -4594,7 +4594,7 @@ int do_rmdir(int dfd, struct filename *name)
 	mnt_drop_write(path.mnt);
 exit2:
 	path_put(&path);
-	if (delegated_inode) {
+	if (deleg_inode(&delegated_inode)) {
 		error = break_deleg_wait(&delegated_inode);
 		if (!error)
 			goto retry;
@@ -4639,7 +4639,7 @@ SYSCALL_DEFINE1(rmdir, const char __user *, pathname)
  * raw inode simply pass @nop_mnt_idmap.
  */
 int vfs_unlink(struct mnt_idmap *idmap, struct inode *dir,
-	       struct dentry *dentry, struct inode **delegated_inode)
+	       struct dentry *dentry, struct delegated_inode *delegated_inode)
 {
 	struct inode *target = dentry->d_inode;
 	int error = may_delete(idmap, dir, dentry, 0);
@@ -4658,10 +4658,10 @@ int vfs_unlink(struct mnt_idmap *idmap, struct inode *dir,
 	else {
 		error = security_inode_unlink(dir, dentry);
 		if (!error) {
-			error = try_break_deleg(dir, delegated_inode);
+			error = try_break_deleg(dir, LEASE_BREAK_DIR_DELETE, delegated_inode);
 			if (error)
 				goto out;
-			error = try_break_deleg(target, delegated_inode);
+			error = try_break_deleg(target, 0, delegated_inode);
 			if (error)
 				goto out;
 			error = dir->i_op->unlink(dir, dentry);
@@ -4700,7 +4700,7 @@ int do_unlinkat(int dfd, struct filename *name)
 	struct qstr last;
 	int type;
 	struct inode *inode = NULL;
-	struct inode *delegated_inode = NULL;
+	struct delegated_inode delegated_inode = { };
 	unsigned int lookup_flags = 0;
 retry:
 	error = filename_parentat(dfd, name, lookup_flags, &path, &last, &type);
@@ -4737,7 +4737,7 @@ int do_unlinkat(int dfd, struct filename *name)
 	if (inode)
 		iput(inode);	/* truncate the inode here */
 	inode = NULL;
-	if (delegated_inode) {
+	if (deleg_inode(&delegated_inode)) {
 		error = break_deleg_wait(&delegated_inode);
 		if (!error)
 			goto retry_deleg;
@@ -4886,7 +4886,7 @@ SYSCALL_DEFINE2(symlink, const char __user *, oldname, const char __user *, newn
  */
 int vfs_link(struct dentry *old_dentry, struct mnt_idmap *idmap,
 	     struct inode *dir, struct dentry *new_dentry,
-	     struct inode **delegated_inode)
+	     struct delegated_inode *delegated_inode)
 {
 	struct inode *inode = old_dentry->d_inode;
 	unsigned max_links = dir->i_sb->s_max_links;
@@ -4930,9 +4930,9 @@ int vfs_link(struct dentry *old_dentry, struct mnt_idmap *idmap,
 	else if (max_links && inode->i_nlink >= max_links)
 		error = -EMLINK;
 	else {
-		error = try_break_deleg(dir, delegated_inode);
+		error = try_break_deleg(dir, LEASE_BREAK_DIR_CREATE, delegated_inode);
 		if (!error)
-			error = try_break_deleg(inode, delegated_inode);
+			error = try_break_deleg(inode, 0, delegated_inode);
 		if (!error)
 			error = dir->i_op->link(old_dentry, dir, new_dentry);
 	}
@@ -4964,7 +4964,7 @@ int do_linkat(int olddfd, struct filename *old, int newdfd,
 	struct mnt_idmap *idmap;
 	struct dentry *new_dentry;
 	struct path old_path, new_path;
-	struct inode *delegated_inode = NULL;
+	struct delegated_inode delegated_inode = { };
 	int how = 0;
 	int error;
 
@@ -5008,7 +5008,7 @@ int do_linkat(int olddfd, struct filename *old, int newdfd,
 			 new_dentry, &delegated_inode);
 out_dput:
 	done_path_create(&new_path, new_dentry);
-	if (delegated_inode) {
+	if (deleg_inode(&delegated_inode)) {
 		error = break_deleg_wait(&delegated_inode);
 		if (!error) {
 			path_put(&old_path);
@@ -5093,7 +5093,7 @@ int vfs_rename(struct renamedata *rd)
 	struct inode *old_dir = rd->old_dir, *new_dir = rd->new_dir;
 	struct dentry *old_dentry = rd->old_dentry;
 	struct dentry *new_dentry = rd->new_dentry;
-	struct inode **delegated_inode = rd->delegated_inode;
+	struct delegated_inode *delegated_inode = rd->delegated_inode;
 	unsigned int flags = rd->flags;
 	bool is_dir = d_is_dir(old_dentry);
 	struct inode *source = old_dentry->d_inode;
@@ -5198,21 +5198,24 @@ int vfs_rename(struct renamedata *rd)
 		    old_dir->i_nlink >= max_links)
 			goto out;
 	}
-	error = try_break_deleg(old_dir, delegated_inode);
+	error = try_break_deleg(old_dir,
+				old_dir == new_dir ? LEASE_BREAK_DIR_RENAME :
+						     LEASE_BREAK_DIR_DELETE,
+				delegated_inode);
 	if (error)
 		goto out;
 	if (new_dir != old_dir) {
-		error = try_break_deleg(new_dir, delegated_inode);
+		error = try_break_deleg(new_dir, LEASE_BREAK_DIR_CREATE, delegated_inode);
 		if (error)
 			goto out;
 	}
 	if (!is_dir) {
-		error = try_break_deleg(source, delegated_inode);
+		error = try_break_deleg(source, 0, delegated_inode);
 		if (error)
 			goto out;
 	}
 	if (target && !new_is_dir) {
-		error = try_break_deleg(target, delegated_inode);
+		error = try_break_deleg(target, 0, delegated_inode);
 		if (error)
 			goto out;
 	}
@@ -5264,7 +5267,7 @@ int do_renameat2(int olddfd, struct filename *from, int newdfd,
 	struct path old_path, new_path;
 	struct qstr old_last, new_last;
 	int old_type, new_type;
-	struct inode *delegated_inode = NULL;
+	struct delegated_inode delegated_inode = { };
 	unsigned int lookup_flags = 0, target_flags =
 		LOOKUP_RENAME_TARGET | LOOKUP_CREATE;
 	bool should_retry = false;
@@ -5373,7 +5376,7 @@ int do_renameat2(int olddfd, struct filename *from, int newdfd,
 exit3:
 	unlock_rename(new_path.dentry, old_path.dentry);
 exit_lock_rename:
-	if (delegated_inode) {
+	if (deleg_inode(&delegated_inode)) {
 		error = break_deleg_wait(&delegated_inode);
 		if (!error)
 			goto retry_deleg;
diff --git a/fs/open.c b/fs/open.c
index 7828234a7caa40c83e69683bd1ecfe69a90e2b49..529f9d4ee73453a9e3da818ebd4ba0eb17245521 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -630,7 +630,7 @@ SYSCALL_DEFINE1(chroot, const char __user *, filename)
 int chmod_common(const struct path *path, umode_t mode)
 {
 	struct inode *inode = path->dentry->d_inode;
-	struct inode *delegated_inode = NULL;
+	struct delegated_inode delegated_inode = { };
 	struct iattr newattrs;
 	int error;
 
@@ -650,7 +650,7 @@ int chmod_common(const struct path *path, umode_t mode)
 			      &newattrs, &delegated_inode);
 out_unlock:
 	inode_unlock(inode);
-	if (delegated_inode) {
+	if (deleg_inode(&delegated_inode)) {
 		error = break_deleg_wait(&delegated_inode);
 		if (!error)
 			goto retry_deleg;
@@ -755,7 +755,7 @@ int chown_common(const struct path *path, uid_t user, gid_t group)
 	struct mnt_idmap *idmap;
 	struct user_namespace *fs_userns;
 	struct inode *inode = path->dentry->d_inode;
-	struct inode *delegated_inode = NULL;
+	struct delegated_inode delegated_inode = { };
 	int error;
 	struct iattr newattrs;
 	kuid_t uid;
@@ -790,7 +790,7 @@ int chown_common(const struct path *path, uid_t user, gid_t group)
 		error = notify_change(idmap, path->dentry, &newattrs,
 				      &delegated_inode);
 	inode_unlock(inode);
-	if (delegated_inode) {
+	if (deleg_inode(&delegated_inode)) {
 		error = break_deleg_wait(&delegated_inode);
 		if (!error)
 			goto retry_deleg;
diff --git a/fs/posix_acl.c b/fs/posix_acl.c
index 4050942ab52f95741da2df13d191ade5c5ca12a2..19a45fc8e413d0fb2e2d906488c3ce648bb318a4 100644
--- a/fs/posix_acl.c
+++ b/fs/posix_acl.c
@@ -1091,7 +1091,7 @@ int vfs_set_acl(struct mnt_idmap *idmap, struct dentry *dentry,
 	int acl_type;
 	int error;
 	struct inode *inode = d_inode(dentry);
-	struct inode *delegated_inode = NULL;
+	struct delegated_inode delegated_inode = { };
 
 	acl_type = posix_acl_type(acl_name);
 	if (acl_type < 0)
@@ -1125,7 +1125,7 @@ int vfs_set_acl(struct mnt_idmap *idmap, struct dentry *dentry,
 	if (error)
 		goto out_inode_unlock;
 
-	error = try_break_deleg(inode, &delegated_inode);
+	error = try_break_deleg(inode, 0, &delegated_inode);
 	if (error)
 		goto out_inode_unlock;
 
@@ -1141,7 +1141,7 @@ int vfs_set_acl(struct mnt_idmap *idmap, struct dentry *dentry,
 out_inode_unlock:
 	inode_unlock(inode);
 
-	if (delegated_inode) {
+	if (deleg_inode(&delegated_inode)) {
 		error = break_deleg_wait(&delegated_inode);
 		if (!error)
 			goto retry_deleg;
@@ -1212,7 +1212,7 @@ int vfs_remove_acl(struct mnt_idmap *idmap, struct dentry *dentry,
 	int acl_type;
 	int error;
 	struct inode *inode = d_inode(dentry);
-	struct inode *delegated_inode = NULL;
+	struct delegated_inode delegated_inode = { };
 
 	acl_type = posix_acl_type(acl_name);
 	if (acl_type < 0)
@@ -1233,7 +1233,7 @@ int vfs_remove_acl(struct mnt_idmap *idmap, struct dentry *dentry,
 	if (error)
 		goto out_inode_unlock;
 
-	error = try_break_deleg(inode, &delegated_inode);
+	error = try_break_deleg(inode, 0, &delegated_inode);
 	if (error)
 		goto out_inode_unlock;
 
@@ -1249,7 +1249,7 @@ int vfs_remove_acl(struct mnt_idmap *idmap, struct dentry *dentry,
 out_inode_unlock:
 	inode_unlock(inode);
 
-	if (delegated_inode) {
+	if (deleg_inode(&delegated_inode)) {
 		error = break_deleg_wait(&delegated_inode);
 		if (!error)
 			goto retry_deleg;
diff --git a/fs/utimes.c b/fs/utimes.c
index c7c7958e57b22f91646ca9f76d18781b64d371a3..4145cbbc190ffb5990fef248300c853ec32d643f 100644
--- a/fs/utimes.c
+++ b/fs/utimes.c
@@ -22,7 +22,7 @@ int vfs_utimes(const struct path *path, struct timespec64 *times)
 	int error;
 	struct iattr newattrs;
 	struct inode *inode = path->dentry->d_inode;
-	struct inode *delegated_inode = NULL;
+	struct delegated_inode delegated_inode = { };
 
 	if (times) {
 		if (!nsec_valid(times[0].tv_nsec) ||
@@ -66,7 +66,7 @@ int vfs_utimes(const struct path *path, struct timespec64 *times)
 	error = notify_change(mnt_idmap(path->mnt), path->dentry, &newattrs,
 			      &delegated_inode);
 	inode_unlock(inode);
-	if (delegated_inode) {
+	if (deleg_inode(&delegated_inode)) {
 		error = break_deleg_wait(&delegated_inode);
 		if (!error)
 			goto retry_deleg;
diff --git a/fs/xattr.c b/fs/xattr.c
index 8ec5b0204bfdc587e7875893e3b1a1e1479d7d1b..1de49ba91b8e6ff9c45c461fe9963f587420cc5f 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -274,7 +274,7 @@ int __vfs_setxattr_noperm(struct mnt_idmap *idmap,
 int
 __vfs_setxattr_locked(struct mnt_idmap *idmap, struct dentry *dentry,
 		      const char *name, const void *value, size_t size,
-		      int flags, struct inode **delegated_inode)
+		      int flags, struct delegated_inode *delegated_inode)
 {
 	struct inode *inode = dentry->d_inode;
 	int error;
@@ -288,7 +288,7 @@ __vfs_setxattr_locked(struct mnt_idmap *idmap, struct dentry *dentry,
 	if (error)
 		goto out;
 
-	error = try_break_deleg(inode, delegated_inode);
+	error = try_break_deleg(inode, 0, delegated_inode);
 	if (error)
 		goto out;
 
@@ -305,7 +305,7 @@ vfs_setxattr(struct mnt_idmap *idmap, struct dentry *dentry,
 	     const char *name, const void *value, size_t size, int flags)
 {
 	struct inode *inode = dentry->d_inode;
-	struct inode *delegated_inode = NULL;
+	struct delegated_inode delegated_inode = { };
 	const void  *orig_value = value;
 	int error;
 
@@ -322,7 +322,7 @@ vfs_setxattr(struct mnt_idmap *idmap, struct dentry *dentry,
 				      flags, &delegated_inode);
 	inode_unlock(inode);
 
-	if (delegated_inode) {
+	if (deleg_inode(&delegated_inode)) {
 		error = break_deleg_wait(&delegated_inode);
 		if (!error)
 			goto retry_deleg;
@@ -533,7 +533,7 @@ EXPORT_SYMBOL(__vfs_removexattr);
 int
 __vfs_removexattr_locked(struct mnt_idmap *idmap,
 			 struct dentry *dentry, const char *name,
-			 struct inode **delegated_inode)
+			 struct delegated_inode *delegated_inode)
 {
 	struct inode *inode = dentry->d_inode;
 	int error;
@@ -546,7 +546,7 @@ __vfs_removexattr_locked(struct mnt_idmap *idmap,
 	if (error)
 		goto out;
 
-	error = try_break_deleg(inode, delegated_inode);
+	error = try_break_deleg(inode, 0, delegated_inode);
 	if (error)
 		goto out;
 
@@ -567,7 +567,7 @@ vfs_removexattr(struct mnt_idmap *idmap, struct dentry *dentry,
 		const char *name)
 {
 	struct inode *inode = dentry->d_inode;
-	struct inode *delegated_inode = NULL;
+	struct delegated_inode delegated_inode = { };
 	int error;
 
 retry_deleg:
@@ -576,7 +576,7 @@ vfs_removexattr(struct mnt_idmap *idmap, struct dentry *dentry,
 					 name, &delegated_inode);
 	inode_unlock(inode);
 
-	if (delegated_inode) {
+	if (deleg_inode(&delegated_inode)) {
 		error = break_deleg_wait(&delegated_inode);
 		if (!error)
 			goto retry_deleg;
diff --git a/include/linux/filelock.h b/include/linux/filelock.h
index 0fe368060781d0b22f735c2cfb8d8c1a6a238290..f2b2d1e1d1ab08671895c3bfe398e5bba02353d8 100644
--- a/include/linux/filelock.h
+++ b/include/linux/filelock.h
@@ -160,6 +160,19 @@ struct file_lock_context {
 	struct list_head	flc_lease;
 };
 
+#define LEASE_BREAK_LEASE		BIT(0)	// break leases and delegations
+#define LEASE_BREAK_DELEG		BIT(1)	// break delegations only
+#define LEASE_BREAK_LAYOUT		BIT(2)	// break layouts only
+#define LEASE_BREAK_NONBLOCK		BIT(3)	// non-blocking break
+#define LEASE_BREAK_OPEN_RDONLY		BIT(4)	// readonly open event
+#define LEASE_BREAK_DIR_CREATE		BIT(6)	// dir deleg create event
+#define LEASE_BREAK_DIR_DELETE		BIT(7)	// dir deleg delete event
+#define LEASE_BREAK_DIR_RENAME		BIT(8)	// dir deleg rename event
+
+#define LEASE_BREAK_DIR_REASON_MASK	(LEASE_BREAK_DIR_CREATE | \
+					 LEASE_BREAK_DIR_DELETE | \
+					 LEASE_BREAK_DIR_RENAME)
+
 #ifdef CONFIG_FILE_LOCKING
 int fcntl_getlk(struct file *, unsigned int, struct flock *);
 int fcntl_setlk(unsigned int, struct file *, unsigned int,
@@ -222,12 +235,6 @@ void locks_init_lease(struct file_lease *);
 void locks_free_lease(struct file_lease *fl);
 struct file_lease *locks_alloc_lease(void);
 
-#define LEASE_BREAK_LEASE		BIT(0)	// break leases and delegations
-#define LEASE_BREAK_DELEG		BIT(1)	// break delegations only
-#define LEASE_BREAK_LAYOUT		BIT(2)	// break layouts only
-#define LEASE_BREAK_NONBLOCK		BIT(3)	// non-blocking break
-#define LEASE_BREAK_OPEN_RDONLY		BIT(4)	// readonly open event
-
 int __break_lease(struct inode *inode, unsigned int flags);
 void lease_get_mtime(struct inode *, struct timespec64 *time);
 int generic_setlease(struct file *, int, struct file_lease **, void **priv);
@@ -495,25 +502,41 @@ static inline int break_deleg(struct inode *inode, unsigned int flags)
 	return 0;
 }
 
-static inline int try_break_deleg(struct inode *inode, struct inode **delegated_inode)
+struct delegated_inode {
+	struct inode *di_inode;
+	unsigned int di_reason; // LEASE_BREAK_* flags
+};
+
+static inline struct inode *deleg_inode(struct delegated_inode *di)
+{
+	return di->di_inode;
+}
+
+static inline int try_break_deleg(struct inode *inode, unsigned int reason,
+				  struct delegated_inode *di)
 {
 	int ret;
 
-	ret = break_deleg(inode, LEASE_BREAK_NONBLOCK);
-	if (ret == -EWOULDBLOCK && delegated_inode) {
-		*delegated_inode = inode;
+	/* Clear any extraneous reason bits, after warning if any are set */
+	WARN_ON_ONCE(reason & ~LEASE_BREAK_DIR_REASON_MASK);
+	reason &= LEASE_BREAK_DIR_REASON_MASK;
+
+	ret = break_deleg(inode, reason | LEASE_BREAK_NONBLOCK);
+	if (ret == -EWOULDBLOCK && di) {
+		di->di_inode = inode;
+		di->di_reason = reason;
 		ihold(inode);
 	}
 	return ret;
 }
 
-static inline int break_deleg_wait(struct inode **delegated_inode)
+static inline int break_deleg_wait(struct delegated_inode *di)
 {
 	int ret;
 
-	ret = break_deleg(*delegated_inode, 0);
-	iput(*delegated_inode);
-	*delegated_inode = NULL;
+	ret = break_deleg(di->di_inode, di->di_reason);
+	iput(di->di_inode);
+	di->di_inode = NULL;
 	return ret;
 }
 
@@ -532,6 +555,13 @@ static inline int break_layout(struct inode *inode, bool wait)
 }
 
 #else /* !CONFIG_FILE_LOCKING */
+struct delegated_inode { };
+
+static inline struct inode *deleg_inode(struct delegated_inode *di)
+{
+	return NULL;
+}
+
 static inline int break_lease(struct inode *inode, bool wait)
 {
 	return 0;
@@ -542,12 +572,13 @@ static inline int break_deleg(struct inode *inode, unsigned int flags)
 	return 0;
 }
 
-static inline int try_break_deleg(struct inode *inode, struct inode **delegated_inode)
+static inline int try_break_deleg(struct inode *inode, unsigned int reason,
+				  struct delegated_inode *delegated_inode)
 {
 	return 0;
 }
 
-static inline int break_deleg_wait(struct inode **delegated_inode)
+static inline int break_deleg_wait(struct delegated_inode *delegated_inode)
 {
 	BUG();
 	return 0;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 0db87f8e676cc8d022b28042bf6fd1af5f8928e3..172094c88165f909ee3cd53c8b02ff6d69f04e5a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -82,6 +82,7 @@ struct fs_context;
 struct fs_parameter_spec;
 struct fileattr;
 struct iomap_ops;
+struct delegated_inode;
 
 extern void __init inode_init(void);
 extern void __init inode_init_early(void);
@@ -1997,10 +1998,10 @@ int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *,
 int vfs_symlink(struct mnt_idmap *, struct inode *,
 		struct dentry *, const char *);
 int vfs_link(struct dentry *, struct mnt_idmap *, struct inode *,
-	     struct dentry *, struct inode **);
+	     struct dentry *, struct delegated_inode *);
 int vfs_rmdir(struct mnt_idmap *, struct inode *, struct dentry *);
 int vfs_unlink(struct mnt_idmap *, struct inode *, struct dentry *,
-	       struct inode **);
+	       struct delegated_inode *);
 
 /**
  * struct renamedata - contains all information required for renaming
@@ -2020,7 +2021,7 @@ struct renamedata {
 	struct mnt_idmap *new_mnt_idmap;
 	struct inode *new_dir;
 	struct dentry *new_dentry;
-	struct inode **delegated_inode;
+	struct delegated_inode *delegated_inode;
 	unsigned int flags;
 } __randomize_layout;
 
@@ -3028,7 +3029,7 @@ static inline int bmap(struct inode *inode,  sector_t *block)
 #endif
 
 int notify_change(struct mnt_idmap *, struct dentry *,
-		  struct iattr *, struct inode **);
+		  struct iattr *, struct delegated_inode *);
 int inode_permission(struct mnt_idmap *, struct inode *, int);
 int generic_permission(struct mnt_idmap *, struct inode *, int);
 static inline int file_permission(struct file *file, int mask)
diff --git a/include/linux/xattr.h b/include/linux/xattr.h
index 86b0d47984a16d935dd1c45ca80a3b8bb5b7295b..64e9afe7d647dc38f686a4b5c6f765e061cde54c 100644
--- a/include/linux/xattr.h
+++ b/include/linux/xattr.h
@@ -85,12 +85,12 @@ int __vfs_setxattr_noperm(struct mnt_idmap *, struct dentry *,
 			  const char *, const void *, size_t, int);
 int __vfs_setxattr_locked(struct mnt_idmap *, struct dentry *,
 			  const char *, const void *, size_t, int,
-			  struct inode **);
+			  struct delegated_inode *);
 int vfs_setxattr(struct mnt_idmap *, struct dentry *, const char *,
 		 const void *, size_t, int);
 int __vfs_removexattr(struct mnt_idmap *, struct dentry *, const char *);
 int __vfs_removexattr_locked(struct mnt_idmap *, struct dentry *,
-			     const char *, struct inode **);
+			     const char *, struct delegated_inode *);
 int vfs_removexattr(struct mnt_idmap *, struct dentry *, const char *);
 
 ssize_t generic_listxattr(struct dentry *dentry, char *buffer, size_t buffer_size);

-- 
2.49.0



* [PATCH RFC v2 16/28] filelock: add support for ignoring deleg breaks for dir change events
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (14 preceding siblings ...)
  2025-06-02 14:01 ` [PATCH RFC v2 15/28] filelock: add struct delegated_inode Jeff Layton
@ 2025-06-02 14:01 ` Jeff Layton
  2025-06-02 14:02 ` [PATCH RFC v2 17/28] filelock: add an inode_lease_ignore_mask helper Jeff Layton
                   ` (11 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:01 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

If an NFS client requests a directory delegation with a notification
bitmask covering directory change events, the server shouldn't recall
the delegation. Instead the client will be notified of the change after
the fact.

Add support for ignoring lease breaks on directory changes.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/locks.c               | 56 ++++++++++++++++++++++++++++++++++++++++--------
 include/linux/filelock.h | 29 ++++++++++++++-----------
 2 files changed, 63 insertions(+), 22 deletions(-)
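
For illustration only, the recall decision described in the commit message
can be sketched in userspace C. This is not the kernel code: the sk_ names
and the SK_BREAK_* bit values are hypothetical (the LEASE_BREAK_DIR_*
values are not shown in this patch); only the FL_IGN_DIR_* bit positions
mirror the filelock.h hunk below. A lease is skipped when it has opted out
of the event being reported, and a recall is needed only if some lease has
not opted out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event bits, chosen only for this sketch. */
#define SK_BREAK_DIR_CREATE	(1u << 0)
#define SK_BREAK_DIR_DELETE	(1u << 1)
#define SK_BREAK_DIR_RENAME	(1u << 2)

/* Ignore bits, using the same positions as FL_IGN_DIR_* in the patch. */
#define SK_IGN_DIR_CREATE	(1u << 13)
#define SK_IGN_DIR_DELETE	(1u << 14)
#define SK_IGN_DIR_RENAME	(1u << 15)

/* Stand-in for a lease; only the flag word matters here. */
struct sk_lease {
	unsigned int flags;
};

/* True if the lease opted out of a break for one of the given events. */
static bool sk_ignore_break(const struct sk_lease *l, unsigned int events)
{
	if ((events & SK_BREAK_DIR_CREATE) && (l->flags & SK_IGN_DIR_CREATE))
		return true;
	if ((events & SK_BREAK_DIR_DELETE) && (l->flags & SK_IGN_DIR_DELETE))
		return true;
	if ((events & SK_BREAK_DIR_RENAME) && (l->flags & SK_IGN_DIR_RENAME))
		return true;
	return false;
}

/* True if at least one lease still has to be recalled for these events. */
static bool sk_must_recall(const struct sk_lease *leases, size_t n,
			   unsigned int events)
{
	for (size_t i = 0; i < n; i++)
		if (!sk_ignore_break(&leases[i], events))
			return true;
	return false;
}
```

With two leases that both ignore creates, sk_must_recall() returns false
for a create event; if only one of them also ignores deletes, a delete
event still requires a recall of the other.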

diff --git a/fs/locks.c b/fs/locks.c
index 6e46176d1e00962904f03c151500e593f410e4c6..95270a1fab4a1792a6fcad738cc9d937d99ad2af 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1526,15 +1526,52 @@ any_leases_conflict(struct inode *inode, struct file_lease *breaker)
 	return false;
 }
 
+static bool
+ignore_dir_deleg_break(struct file_lease *fl, unsigned int flags)
+{
+	if ((flags & LEASE_BREAK_DIR_CREATE) && (fl->c.flc_flags & FL_IGN_DIR_CREATE))
+		return true;
+	if ((flags & LEASE_BREAK_DIR_DELETE) && (fl->c.flc_flags & FL_IGN_DIR_DELETE))
+		return true;
+	if ((flags & LEASE_BREAK_DIR_RENAME) && (fl->c.flc_flags & FL_IGN_DIR_RENAME))
+		return true;
+
+	return false;
+}
+
+static bool
+visible_leases_remaining(struct inode *inode, unsigned int flags)
+{
+	struct file_lock_context *ctx = locks_inode_context(inode);
+	struct file_lease *fl;
+
+	lockdep_assert_held(&ctx->flc_lock);
+
+	if (list_empty(&ctx->flc_lease))
+		return false;
+
+	if (!S_ISDIR(inode->i_mode))
+		return true;
+
+	list_for_each_entry(fl, &ctx->flc_lease, c.flc_list) {
+		if (!ignore_dir_deleg_break(fl, flags))
+			return true;
+	}
+	return false;
+}
+
 /**
- *	__break_lease	-	revoke all outstanding leases on file
- *	@inode: the inode of the file to return
- *	@flags: LEASE_BREAK_* flags
+ * __break_lease	-	revoke all outstanding leases on file
+ * @inode: the inode of the file to return
+ * @flags: LEASE_BREAK_* flags
  *
- *	break_lease (inlined for speed) has checked there already is at least
- *	some kind of lock (maybe a lease) on this file.  Leases are broken on
- *	a call to open() or truncate().  This function can block waiting for the
- *	lease break unless you specify LEASE_BREAK_NONBLOCK.
+ * break_lease (inlined for speed) has checked there already is at least
+ * some kind of lock (maybe a lease) on this file. Leases and Delegations
+ * are broken on a call to open() or truncate(). Delegations are also
+ * broken on any event that would change the ctime. Directory delegations
+ * are broken whenever the directory changes (unless the delegation is set
+ * up to ignore the event). This function can block waiting for the lease
+ * break unless you specify LEASE_BREAK_NONBLOCK.
  */
 int __break_lease(struct inode *inode, unsigned int flags)
 {
@@ -1545,7 +1582,6 @@ int __break_lease(struct inode *inode, unsigned int flags)
 	bool want_write = !(flags & LEASE_BREAK_OPEN_RDONLY);
 	int error = 0;
 
-
 	new_fl = lease_alloc(NULL, want_write ? F_WRLCK : F_RDLCK);
 	if (IS_ERR(new_fl))
 		return PTR_ERR(new_fl);
@@ -1584,6 +1620,8 @@ int __break_lease(struct inode *inode, unsigned int flags)
 	list_for_each_entry_safe(fl, tmp, &ctx->flc_lease, c.flc_list) {
 		if (!leases_conflict(&fl->c, &new_fl->c))
 			continue;
+		if (S_ISDIR(inode->i_mode) && ignore_dir_deleg_break(fl, flags))
+			continue;
 		if (want_write) {
 			if (fl->c.flc_flags & FL_UNLOCK_PENDING)
 				continue;
@@ -1599,7 +1637,7 @@ int __break_lease(struct inode *inode, unsigned int flags)
 			locks_delete_lock_ctx(&fl->c, &dispose);
 	}
 
-	if (list_empty(&ctx->flc_lease))
+	if (!visible_leases_remaining(inode, flags))
 		goto out;
 
 	if (flags & LEASE_BREAK_NONBLOCK) {
diff --git a/include/linux/filelock.h b/include/linux/filelock.h
index f2b2d1e1d1ab08671895c3bfe398e5bba02353d8..32b30c14f5fd52727b1a18957e9dbc930c922941 100644
--- a/include/linux/filelock.h
+++ b/include/linux/filelock.h
@@ -4,19 +4,22 @@
 
 #include <linux/fs.h>
 
-#define FL_POSIX	1
-#define FL_FLOCK	2
-#define FL_DELEG	4	/* NFSv4 delegation */
-#define FL_ACCESS	8	/* not trying to lock, just looking */
-#define FL_EXISTS	16	/* when unlocking, test for existence */
-#define FL_LEASE	32	/* lease held on this file */
-#define FL_CLOSE	64	/* unlock on close */
-#define FL_SLEEP	128	/* A blocking lock */
-#define FL_DOWNGRADE_PENDING	256 /* Lease is being downgraded */
-#define FL_UNLOCK_PENDING	512 /* Lease is being broken */
-#define FL_OFDLCK	1024	/* lock is "owned" by struct file */
-#define FL_LAYOUT	2048	/* outstanding pNFS layout */
-#define FL_RECLAIM	4096	/* reclaiming from a reboot server */
+#define FL_POSIX		BIT(0)	/* POSIX lock */
+#define FL_FLOCK		BIT(1)	/* BSD lock */
+#define FL_LEASE		BIT(2)	/* file lease */
+#define FL_DELEG		BIT(3)	/* NFSv4 delegation */
+#define FL_LAYOUT		BIT(4)	/* outstanding pNFS layout */
+#define FL_ACCESS		BIT(5)	/* not trying to lock, just looking */
+#define FL_EXISTS		BIT(6)	/* when unlocking, test for existence */
+#define FL_CLOSE		BIT(7)	/* unlock on close */
+#define FL_SLEEP		BIT(8)	/* A blocking lock */
+#define FL_DOWNGRADE_PENDING	BIT(9)	/* Lease is being downgraded */
+#define FL_UNLOCK_PENDING	BIT(10) /* Lease is being broken */
+#define FL_OFDLCK		BIT(11) /* POSIX lock "owned" by struct file */
+#define FL_RECLAIM		BIT(12) /* reclaiming from a reboot server */
+#define FL_IGN_DIR_CREATE	BIT(13) /* ignore DIR_CREATE events */
+#define FL_IGN_DIR_DELETE	BIT(14) /* ignore DIR_DELETE events */
+#define FL_IGN_DIR_RENAME	BIT(15) /* ignore DIR_RENAME events */
 
 #define FL_CLOSE_POSIX (FL_POSIX | FL_CLOSE)
 

-- 
2.49.0



* [PATCH RFC v2 17/28] filelock: add an inode_lease_ignore_mask helper
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (15 preceding siblings ...)
  2025-06-02 14:01 ` [PATCH RFC v2 16/28] filelock: add support for ignoring deleg breaks for dir change events Jeff Layton
@ 2025-06-02 14:02 ` Jeff Layton
  2025-06-02 14:02 ` [PATCH RFC v2 18/28] nfsd: add protocol support for CB_NOTIFY Jeff Layton
                   ` (10 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:02 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

Add a new routine that returns a mask of all dir change events that are
currently ignored by any leases. nfsd will use this to determine how to
configure the fsnotify_mark mask.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/locks.c               | 32 ++++++++++++++++++++++++++++++++
 include/linux/filelock.h |  1 +
 2 files changed, 33 insertions(+)
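
As a rough userspace sketch of what the helper computes (locking and the
lease list are left out, and the ex_ names are hypothetical), the mask is
the OR of every lease's ignore bits, with an early exit once all three
bits are present. The bit positions follow the FL_IGN_DIR_* definitions
from the previous patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Same bit positions as FL_IGN_DIR_* in the previous patch. */
#define EX_IGN_DIR_CREATE	(UINT32_C(1) << 13)
#define EX_IGN_DIR_DELETE	(UINT32_C(1) << 14)
#define EX_IGN_DIR_RENAME	(UINT32_C(1) << 15)
#define EX_IGNORE_MASK \
	(EX_IGN_DIR_CREATE | EX_IGN_DIR_DELETE | EX_IGN_DIR_RENAME)

/*
 * Return the union of the ignore bits found in an array of per-lease
 * flag words. Bits outside EX_IGNORE_MASK are discarded.
 */
static uint32_t ex_ignore_mask(const uint32_t *lease_flags, size_t n)
{
	uint32_t mask = 0;

	for (size_t i = 0; i < n; i++) {
		mask |= lease_flags[i] & EX_IGNORE_MASK;
		/* Every bit is already set, so later leases cannot add any. */
		if (mask == EX_IGNORE_MASK)
			break;
	}
	return mask;
}
```

Because the result is a union, a bit in the returned mask only says that
at least one lease ignores that event, not that all of them do.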

diff --git a/fs/locks.c b/fs/locks.c
index 95270a1fab4a1792a6fcad738cc9d937d99ad2af..522455196353f64d3150c45c9d1cd260751bd7b9 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1526,6 +1526,38 @@ any_leases_conflict(struct inode *inode, struct file_lease *breaker)
 	return false;
 }
 
+#define IGNORE_MASK	(FL_IGN_DIR_CREATE | FL_IGN_DIR_DELETE | FL_IGN_DIR_RENAME)
+
+/**
+ * inode_lease_ignore_mask - return union of all ignored inode events for this inode
+ * @inode: inode of which to get ignore mask
+ *
+ * Walk the list of leases, and return the result of all of
+ * their FL_IGN_DIR_* bits or'ed together.
+ */
+u32
+inode_lease_ignore_mask(struct inode *inode)
+{
+	struct file_lock_context *ctx;
+	struct file_lock_core *flc;
+	u32 mask = 0;
+
+	ctx = locks_inode_context(inode);
+	if (!ctx)
+		return 0;
+
+	spin_lock(&ctx->flc_lock);
+	list_for_each_entry(flc, &ctx->flc_lease, flc_list) {
+		mask |= flc->flc_flags & IGNORE_MASK;
+		/* If we already have everything, we can stop */
+		if (mask == IGNORE_MASK)
+			break;
+	}
+	spin_unlock(&ctx->flc_lock);
+	return mask;
+}
+EXPORT_SYMBOL_GPL(inode_lease_ignore_mask);
+
 static bool
 ignore_dir_deleg_break(struct file_lease *fl, unsigned int flags)
 {
diff --git a/include/linux/filelock.h b/include/linux/filelock.h
index 32b30c14f5fd52727b1a18957e9dbc930c922941..4513a8dad3974bf5fb08e0df4f085d71155e04f5 100644
--- a/include/linux/filelock.h
+++ b/include/linux/filelock.h
@@ -244,6 +244,7 @@ int generic_setlease(struct file *, int, struct file_lease **, void **priv);
 int kernel_setlease(struct file *, int, struct file_lease **, void **);
 int vfs_setlease(struct file *, int, struct file_lease **, void **);
 int lease_modify(struct file_lease *, int, struct list_head *);
+u32 inode_lease_ignore_mask(struct inode *inode);
 
 struct notifier_block;
 int lease_register_notifier(struct notifier_block *);

-- 
2.49.0



* [PATCH RFC v2 18/28] nfsd: add protocol support for CB_NOTIFY
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (16 preceding siblings ...)
  2025-06-02 14:02 ` [PATCH RFC v2 17/28] filelock: add an inode_lease_ignore_mask helper Jeff Layton
@ 2025-06-02 14:02 ` Jeff Layton
  2025-06-02 14:02 ` [PATCH RFC v2 19/28] nfsd: add callback encoding and decoding linkages " Jeff Layton
                   ` (9 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:02 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

Add the necessary bits to nfs4_1.x and remove the duplicate definitions
from nfs4.h and the uapi nfs4 header. Regenerate the xdr files.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 Documentation/sunrpc/xdr/nfs4_1.x    | 252 ++++++++++++++++-
 fs/nfsd/nfs4xdr_gen.c                | 506 ++++++++++++++++++++++++++++++++++-
 fs/nfsd/nfs4xdr_gen.h                |  17 +-
 fs/nfsd/trace.h                      |   1 +
 include/linux/nfs4.h                 | 127 ---------
 include/linux/sunrpc/xdrgen/nfs4_1.h | 293 +++++++++++++++++++-
 include/uapi/linux/nfs4.h            |   2 -
 7 files changed, 1057 insertions(+), 141 deletions(-)
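
The notify4 structure added below carries a bitmap4 that is "composed
from notify_type4". As a hedged illustration, and assuming the usual
NFSv4 bitmap4 convention (bit n lives in 32-bit word n / 32, at position
n % 32, with the bit number equal to the notify_type4 value), setting and
testing notification types could look like this. The demo_ names are
hypothetical and are not produced by xdrgen.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the notify_type4 enum in the .x file. */
enum demo_notify_type4 {
	DEMO_NOTIFY4_CHANGE_CHILD_ATTRS = 0,
	DEMO_NOTIFY4_CHANGE_DIR_ATTRS = 1,
	DEMO_NOTIFY4_REMOVE_ENTRY = 2,
	DEMO_NOTIFY4_ADD_ENTRY = 3,
	DEMO_NOTIFY4_RENAME_ENTRY = 4,
	DEMO_NOTIFY4_CHANGE_COOKIE_VERIFIER = 5
};

/* Set one bit in a bitmap4-style word array; out-of-range bits are dropped. */
static void demo_bitmap4_set(uint32_t *words, size_t nwords, unsigned int bit)
{
	size_t idx = bit / 32;

	if (idx < nwords)
		words[idx] |= UINT32_C(1) << (bit % 32);
}

/* Test one bit; bits beyond the array are reported as clear. */
static bool demo_bitmap4_test(const uint32_t *words, size_t nwords,
			      unsigned int bit)
{
	size_t idx = bit / 32;

	return idx < nwords && (words[idx] & (UINT32_C(1) << (bit % 32))) != 0;
}
```

Under that assumption, a mask covering only entry removal and addition
is a single word with bits 2 and 3 set.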

diff --git a/Documentation/sunrpc/xdr/nfs4_1.x b/Documentation/sunrpc/xdr/nfs4_1.x
index ca95150a3a29fc5418991bf2395326bd73645ea8..2de9ba6426edd053d4c8274e079f3570244af8d4 100644
--- a/Documentation/sunrpc/xdr/nfs4_1.x
+++ b/Documentation/sunrpc/xdr/nfs4_1.x
@@ -45,13 +45,162 @@ pragma header nfs4;
 /*
  * Basic typedefs for RFC 1832 data type definitions
  */
-typedef hyper		int64_t;
-typedef unsigned int	uint32_t;
+typedef int                  int32_t;
+typedef unsigned int         uint32_t;
+typedef hyper                int64_t;
+typedef unsigned hyper       uint64_t;
+
+const NFS4_VERIFIER_SIZE        = 8;
+const NFS4_FHSIZE               = 128;
+
+enum nfsstat4 {
+ NFS4_OK                = 0,    /* everything is okay      */
+ NFS4ERR_PERM           = 1,    /* caller not privileged   */
+ NFS4ERR_NOENT          = 2,    /* no such file/directory  */
+ NFS4ERR_IO             = 5,    /* hard I/O error          */
+ NFS4ERR_NXIO           = 6,    /* no such device          */
+ NFS4ERR_ACCESS         = 13,   /* access denied           */
+ NFS4ERR_EXIST          = 17,   /* file already exists     */
+ NFS4ERR_XDEV           = 18,   /* different filesystems   */
+
+ /*
+  * Please do not allocate value 19; it was used in NFSv3
+  * and we do not want a value in NFSv3 to have a different
+  * meaning in NFSv4.x.
+  */
+
+ NFS4ERR_NOTDIR         = 20,   /* should be a directory   */
+ NFS4ERR_ISDIR          = 21,   /* should not be directory */
+ NFS4ERR_INVAL          = 22,   /* invalid argument        */
+ NFS4ERR_FBIG           = 27,   /* file exceeds server max */
+ NFS4ERR_NOSPC          = 28,   /* no space on filesystem  */
+ NFS4ERR_ROFS           = 30,   /* read-only filesystem    */
+ NFS4ERR_MLINK          = 31,   /* too many hard links     */
+ NFS4ERR_NAMETOOLONG    = 63,   /* name exceeds server max */
+ NFS4ERR_NOTEMPTY       = 66,   /* directory not empty     */
+ NFS4ERR_DQUOT          = 69,   /* hard quota limit reached*/
+ NFS4ERR_STALE          = 70,   /* file no longer exists   */
+ NFS4ERR_BADHANDLE      = 10001,/* Illegal filehandle      */
+ NFS4ERR_BAD_COOKIE     = 10003,/* READDIR cookie is stale */
+ NFS4ERR_NOTSUPP        = 10004,/* operation not supported */
+ NFS4ERR_TOOSMALL       = 10005,/* response limit exceeded */
+ NFS4ERR_SERVERFAULT    = 10006,/* undefined server error  */
+ NFS4ERR_BADTYPE        = 10007,/* type invalid for CREATE */
+ NFS4ERR_DELAY          = 10008,/* file "busy" - retry     */
+ NFS4ERR_SAME           = 10009,/* nverify says attrs same */
+ NFS4ERR_DENIED         = 10010,/* lock unavailable        */
+ NFS4ERR_EXPIRED        = 10011,/* lock lease expired      */
+ NFS4ERR_LOCKED         = 10012,/* I/O failed due to lock  */
+ NFS4ERR_GRACE          = 10013,/* in grace period         */
+ NFS4ERR_FHEXPIRED      = 10014,/* filehandle expired      */
+ NFS4ERR_SHARE_DENIED   = 10015,/* share reserve denied    */
+ NFS4ERR_WRONGSEC       = 10016,/* wrong security flavor   */
+ NFS4ERR_CLID_INUSE     = 10017,/* clientid in use         */
+
+ /* NFS4ERR_RESOURCE is not a valid error in NFSv4.1 */
+ NFS4ERR_RESOURCE       = 10018,/* resource exhaustion     */
+
+ NFS4ERR_MOVED          = 10019,/* filesystem relocated    */
+ NFS4ERR_NOFILEHANDLE   = 10020,/* current FH is not set   */
+ NFS4ERR_MINOR_VERS_MISMATCH= 10021,/* minor vers not supp */
+ NFS4ERR_STALE_CLIENTID = 10022,/* server has rebooted     */
+ NFS4ERR_STALE_STATEID  = 10023,/* server has rebooted     */
+ NFS4ERR_OLD_STATEID    = 10024,/* state is out of sync    */
+ NFS4ERR_BAD_STATEID    = 10025,/* incorrect stateid       */
+ NFS4ERR_BAD_SEQID      = 10026,/* request is out of seq.  */
+ NFS4ERR_NOT_SAME       = 10027,/* verify - attrs not same */
+ NFS4ERR_LOCK_RANGE     = 10028,/* overlapping lock range  */
+ NFS4ERR_SYMLINK        = 10029,/* should be file/directory*/
+ NFS4ERR_RESTOREFH      = 10030,/* no saved filehandle     */
+ NFS4ERR_LEASE_MOVED    = 10031,/* some filesystem moved   */
+ NFS4ERR_ATTRNOTSUPP    = 10032,/* recommended attr not sup*/
+ NFS4ERR_NO_GRACE       = 10033,/* reclaim outside of grace*/
+ NFS4ERR_RECLAIM_BAD    = 10034,/* reclaim error at server */
+ NFS4ERR_RECLAIM_CONFLICT= 10035,/* conflict on reclaim    */
+ NFS4ERR_BADXDR         = 10036,/* XDR decode failed       */
+ NFS4ERR_LOCKS_HELD     = 10037,/* file locks held at CLOSE*/
+ NFS4ERR_OPENMODE       = 10038,/* conflict in OPEN and I/O*/
+ NFS4ERR_BADOWNER       = 10039,/* owner translation bad   */
+ NFS4ERR_BADCHAR        = 10040,/* utf-8 char not supported*/
+ NFS4ERR_BADNAME        = 10041,/* name not supported      */
+ NFS4ERR_BAD_RANGE      = 10042,/* lock range not supported*/
+ NFS4ERR_LOCK_NOTSUPP   = 10043,/* no atomic up/downgrade  */
+ NFS4ERR_OP_ILLEGAL     = 10044,/* undefined operation     */
+ NFS4ERR_DEADLOCK       = 10045,/* file locking deadlock   */
+ NFS4ERR_FILE_OPEN      = 10046,/* open file blocks op.    */
+ NFS4ERR_ADMIN_REVOKED  = 10047,/* lockowner state revoked */
+ NFS4ERR_CB_PATH_DOWN   = 10048,/* callback path down      */
+
+ /* NFSv4.1 errors start here. */
+
+ NFS4ERR_BADIOMODE      = 10049,
+ NFS4ERR_BADLAYOUT      = 10050,
+ NFS4ERR_BAD_SESSION_DIGEST = 10051,
+ NFS4ERR_BADSESSION     = 10052,
+ NFS4ERR_BADSLOT        = 10053,
+ NFS4ERR_COMPLETE_ALREADY = 10054,
+ NFS4ERR_CONN_NOT_BOUND_TO_SESSION = 10055,
+ NFS4ERR_DELEG_ALREADY_WANTED = 10056,
+ NFS4ERR_BACK_CHAN_BUSY = 10057,/*backchan reqs outstanding*/
+ NFS4ERR_LAYOUTTRYLATER = 10058,
+ NFS4ERR_LAYOUTUNAVAILABLE = 10059,
+ NFS4ERR_NOMATCHING_LAYOUT = 10060,
+ NFS4ERR_RECALLCONFLICT = 10061,
+ NFS4ERR_UNKNOWN_LAYOUTTYPE = 10062,
+ NFS4ERR_SEQ_MISORDERED = 10063,/* unexpected seq.ID in req*/
+ NFS4ERR_SEQUENCE_POS   = 10064,/* [CB_]SEQ. op not 1st op */
+ NFS4ERR_REQ_TOO_BIG    = 10065,/* request too big         */
+ NFS4ERR_REP_TOO_BIG    = 10066,/* reply too big           */
+ NFS4ERR_REP_TOO_BIG_TO_CACHE =10067,/* rep. not all cached*/
+ NFS4ERR_RETRY_UNCACHED_REP =10068,/* retry & rep. uncached*/
+ NFS4ERR_UNSAFE_COMPOUND =10069,/* retry/recovery too hard */
+ NFS4ERR_TOO_MANY_OPS   = 10070,/*too many ops in [CB_]COMP*/
+ NFS4ERR_OP_NOT_IN_SESSION =10071,/* op needs [CB_]SEQ. op */
+ NFS4ERR_HASH_ALG_UNSUPP = 10072, /* hash alg. not supp.   */
+                                /* Error 10073 is unused.  */
+ NFS4ERR_CLIENTID_BUSY  = 10074,/* clientid has state      */
+ NFS4ERR_PNFS_IO_HOLE   = 10075,/* IO to _SPARSE file hole */
+ NFS4ERR_SEQ_FALSE_RETRY= 10076,/* Retry != original req.  */
+ NFS4ERR_BAD_HIGH_SLOT  = 10077,/* req has bad highest_slot*/
+ NFS4ERR_DEADSESSION    = 10078,/*new req sent to dead sess*/
+ NFS4ERR_ENCR_ALG_UNSUPP= 10079,/* encr alg. not supp.     */
+ NFS4ERR_PNFS_NO_LAYOUT = 10080,/* I/O without a layout    */
+ NFS4ERR_NOT_ONLY_OP    = 10081,/* addl ops not allowed    */
+ NFS4ERR_WRONG_CRED     = 10082,/* op done by wrong cred   */
+ NFS4ERR_WRONG_TYPE     = 10083,/* op on wrong type object */
+ NFS4ERR_DIRDELEG_UNAVAIL=10084,/* delegation not avail.   */
+ NFS4ERR_REJECT_DELEG   = 10085,/* cb rejected delegation  */
+ NFS4ERR_RETURNCONFLICT = 10086,/* layout get before return*/
+ NFS4ERR_DELEG_REVOKED  = 10087, /* deleg./layout revoked   */
+ NFS4ERR_PARTNER_NOTSUPP = 10088,
+ NFS4ERR_PARTNER_NO_AUTH = 10089,
+ NFS4ERR_UNION_NOTSUPP = 10090,
+ NFS4ERR_OFFLOAD_DENIED = 10091,
+ NFS4ERR_WRONG_LFS = 10092,
+ NFS4ERR_BADLABEL = 10093,
+ NFS4ERR_OFFLOAD_NO_REQS = 10094,
+ NFS4ERR_NOXATTR = 10095,
+ NFS4ERR_XATTR2BIG = 10096,
+
+ /* always set this to one more than the last one in the enum */
+ NFS4ERR_FIRST_FREE = 10097
+};
 
 /*
  * Basic data types
  */
+typedef opaque		attrlist4<>;
 typedef uint32_t	bitmap4<>;
+typedef uint64_t        nfs_cookie4;
+typedef opaque		nfs_fh4<NFS4_FHSIZE>;
+typedef opaque		utf8string<>;
+typedef utf8string      utf8str_cis;
+typedef utf8string      utf8str_cs;
+typedef utf8string      utf8str_mixed;
+typedef utf8str_cs      component4;
+typedef utf8str_cs      linktext4;
+typedef component4      pathname4<>;
+typedef opaque		verifier4[NFS4_VERIFIER_SIZE];
 
 /*
  * Timeval
@@ -61,6 +210,21 @@ struct nfstime4 {
 	uint32_t	nseconds;
 };
 
+/*
+ * File attribute container
+ */
+struct fattr4 {
+        bitmap4         attrmask;
+        attrlist4       attr_vals;
+};
+
+/*
+ * Stateid
+ */
+struct stateid4 {
+        uint32_t        seqid;
+        opaque          other[12];
+};
 
 /*
  * The following content was extracted from draft-ietf-nfsv4-delstid
@@ -184,3 +348,87 @@ enum open_delegation_type4 {
        OPEN_DELEGATE_READ_ATTRS_DELEG      = 4,
        OPEN_DELEGATE_WRITE_ATTRS_DELEG     = 5
 };
+
+/*
+ * Directory notification types.
+ */
+enum notify_type4 {
+        NOTIFY4_CHANGE_CHILD_ATTRS = 0,
+        NOTIFY4_CHANGE_DIR_ATTRS = 1,
+        NOTIFY4_REMOVE_ENTRY = 2,
+        NOTIFY4_ADD_ENTRY = 3,
+        NOTIFY4_RENAME_ENTRY = 4,
+        NOTIFY4_CHANGE_COOKIE_VERIFIER = 5
+};
+
+/* Changed entry information.  */
+struct notify_entry4 {
+        component4      ne_file;
+        fattr4          ne_attrs;
+};
+
+/* Previous entry information */
+struct prev_entry4 {
+        notify_entry4   pe_prev_entry;
+        /* what READDIR returned for this entry */
+        nfs_cookie4     pe_prev_entry_cookie;
+};
+
+struct notify_remove4 {
+        notify_entry4   nrm_old_entry;
+        nfs_cookie4     nrm_old_entry_cookie;
+};
+pragma public notify_remove4;
+
+struct notify_add4 {
+        /*
+         * Information on object
+         * possibly renamed over.
+         */
+        notify_remove4      nad_old_entry<1>;
+        notify_entry4       nad_new_entry;
+        /* what READDIR would have returned for this entry */
+        nfs_cookie4         nad_new_entry_cookie<1>;
+        prev_entry4         nad_prev_entry<1>;
+        bool                nad_last_entry;
+};
+pragma public notify_add4;
+
+struct notify_attr4 {
+        notify_entry4   na_changed_entry;
+};
+
+struct notify_rename4 {
+        notify_remove4  nrn_old_entry;
+        notify_add4     nrn_new_entry;
+};
+pragma public notify_rename4;
+
+struct notify_verifier4 {
+        verifier4       nv_old_cookieverf;
+        verifier4       nv_new_cookieverf;
+};
+
+/*
+ * Objects of type notify_<>4 and
+ * notify_device_<>4 are encoded in this.
+ */
+typedef opaque notifylist4<>;
+
+struct notify4 {
+        /* composed from notify_type4 or notify_deviceid_type4 */
+        bitmap4         notify_mask;
+        notifylist4     notify_vals;
+};
+
+struct CB_NOTIFY4args {
+        stateid4    cna_stateid;
+        nfs_fh4     cna_fh;
+        notify4     cna_changes<>;
+};
+pragma public CB_NOTIFY4args;
+
+struct CB_NOTIFY4res {
+        nfsstat4    cnr_status;
+};
+pragma public CB_NOTIFY4res;
diff --git a/fs/nfsd/nfs4xdr_gen.c b/fs/nfsd/nfs4xdr_gen.c
index a17b5d8e60b3579caa2e2a8b40ed757070e1a622..14e7c4aa9b07168d98fb48a54e6952bfb71d29a7 100644
--- a/fs/nfsd/nfs4xdr_gen.c
+++ b/fs/nfsd/nfs4xdr_gen.c
@@ -1,16 +1,16 @@
 // SPDX-License-Identifier: GPL-2.0
 // Generated by xdrgen. Manual edits will be lost.
 // XDR specification file: ../../Documentation/sunrpc/xdr/nfs4_1.x
-// XDR specification modification time: Mon Oct 14 09:10:13 2024
+// XDR specification modification time: Fri May 16 10:52:35 2025
 
 #include <linux/sunrpc/svc.h>
 
 #include "nfs4xdr_gen.h"
 
 static bool __maybe_unused
-xdrgen_decode_int64_t(struct xdr_stream *xdr, int64_t *ptr)
+xdrgen_decode_int32_t(struct xdr_stream *xdr, int32_t *ptr)
 {
-	return xdrgen_decode_hyper(xdr, ptr);
+	return xdrgen_decode_int(xdr, ptr);
 };
 
 static bool __maybe_unused
@@ -19,6 +19,35 @@ xdrgen_decode_uint32_t(struct xdr_stream *xdr, uint32_t *ptr)
 	return xdrgen_decode_unsigned_int(xdr, ptr);
 };
 
+static bool __maybe_unused
+xdrgen_decode_int64_t(struct xdr_stream *xdr, int64_t *ptr)
+{
+	return xdrgen_decode_hyper(xdr, ptr);
+};
+
+static bool __maybe_unused
+xdrgen_decode_uint64_t(struct xdr_stream *xdr, uint64_t *ptr)
+{
+	return xdrgen_decode_unsigned_hyper(xdr, ptr);
+};
+
+static bool __maybe_unused
+xdrgen_decode_nfsstat4(struct xdr_stream *xdr, nfsstat4 *ptr)
+{
+	u32 val;
+
+	if (xdr_stream_decode_u32(xdr, &val) < 0)
+		return false;
+	*ptr = val;
+	return true;
+}
+
+static bool __maybe_unused
+xdrgen_decode_attrlist4(struct xdr_stream *xdr, attrlist4 *ptr)
+{
+	return xdrgen_decode_opaque(xdr, ptr, 0);
+};
+
 static bool __maybe_unused
 xdrgen_decode_bitmap4(struct xdr_stream *xdr, bitmap4 *ptr)
 {
@@ -30,6 +59,71 @@ xdrgen_decode_bitmap4(struct xdr_stream *xdr, bitmap4 *ptr)
 	return true;
 };
 
+static bool __maybe_unused
+xdrgen_decode_nfs_cookie4(struct xdr_stream *xdr, nfs_cookie4 *ptr)
+{
+	return xdrgen_decode_uint64_t(xdr, ptr);
+};
+
+static bool __maybe_unused
+xdrgen_decode_nfs_fh4(struct xdr_stream *xdr, nfs_fh4 *ptr)
+{
+	return xdrgen_decode_opaque(xdr, ptr, NFS4_FHSIZE);
+};
+
+static bool __maybe_unused
+xdrgen_decode_utf8string(struct xdr_stream *xdr, utf8string *ptr)
+{
+	return xdrgen_decode_opaque(xdr, ptr, 0);
+};
+
+static bool __maybe_unused
+xdrgen_decode_utf8str_cis(struct xdr_stream *xdr, utf8str_cis *ptr)
+{
+	return xdrgen_decode_utf8string(xdr, ptr);
+};
+
+static bool __maybe_unused
+xdrgen_decode_utf8str_cs(struct xdr_stream *xdr, utf8str_cs *ptr)
+{
+	return xdrgen_decode_utf8string(xdr, ptr);
+};
+
+static bool __maybe_unused
+xdrgen_decode_utf8str_mixed(struct xdr_stream *xdr, utf8str_mixed *ptr)
+{
+	return xdrgen_decode_utf8string(xdr, ptr);
+};
+
+static bool __maybe_unused
+xdrgen_decode_component4(struct xdr_stream *xdr, component4 *ptr)
+{
+	return xdrgen_decode_utf8str_cs(xdr, ptr);
+};
+
+static bool __maybe_unused
+xdrgen_decode_linktext4(struct xdr_stream *xdr, linktext4 *ptr)
+{
+	return xdrgen_decode_utf8str_cs(xdr, ptr);
+};
+
+static bool __maybe_unused
+xdrgen_decode_pathname4(struct xdr_stream *xdr, pathname4 *ptr)
+{
+	if (xdr_stream_decode_u32(xdr, &ptr->count) < 0)
+		return false;
+	for (u32 i = 0; i < ptr->count; i++)
+		if (!xdrgen_decode_component4(xdr, &ptr->element[i]))
+			return false;
+	return true;
+};
+
+static bool __maybe_unused
+xdrgen_decode_verifier4(struct xdr_stream *xdr, verifier4 *ptr)
+{
+	return xdr_stream_decode_opaque_fixed(xdr, ptr, NFS4_VERIFIER_SIZE) >= 0;
+};
+
 static bool __maybe_unused
 xdrgen_decode_nfstime4(struct xdr_stream *xdr, struct nfstime4 *ptr)
 {
@@ -40,6 +134,26 @@ xdrgen_decode_nfstime4(struct xdr_stream *xdr, struct nfstime4 *ptr)
 	return true;
 };
 
+static bool __maybe_unused
+xdrgen_decode_fattr4(struct xdr_stream *xdr, struct fattr4 *ptr)
+{
+	if (!xdrgen_decode_bitmap4(xdr, &ptr->attrmask))
+		return false;
+	if (!xdrgen_decode_attrlist4(xdr, &ptr->attr_vals))
+		return false;
+	return true;
+};
+
+static bool __maybe_unused
+xdrgen_decode_stateid4(struct xdr_stream *xdr, struct stateid4 *ptr)
+{
+	if (!xdrgen_decode_uint32_t(xdr, &ptr->seqid))
+		return false;
+	if (xdr_stream_decode_opaque_fixed(xdr, ptr->other, 12) < 0)
+		return false;
+	return true;
+};
+
 static bool __maybe_unused
 xdrgen_decode_fattr4_offline(struct xdr_stream *xdr, fattr4_offline *ptr)
 {
@@ -147,9 +261,148 @@ xdrgen_decode_open_delegation_type4(struct xdr_stream *xdr, open_delegation_type
 }
 
 static bool __maybe_unused
-xdrgen_encode_int64_t(struct xdr_stream *xdr, const int64_t value)
+xdrgen_decode_notify_type4(struct xdr_stream *xdr, notify_type4 *ptr)
 {
-	return xdrgen_encode_hyper(xdr, value);
+	u32 val;
+
+	if (xdr_stream_decode_u32(xdr, &val) < 0)
+		return false;
+	*ptr = val;
+	return true;
+}
+
+static bool __maybe_unused
+xdrgen_decode_notify_entry4(struct xdr_stream *xdr, struct notify_entry4 *ptr)
+{
+	if (!xdrgen_decode_component4(xdr, &ptr->ne_file))
+		return false;
+	if (!xdrgen_decode_fattr4(xdr, &ptr->ne_attrs))
+		return false;
+	return true;
+};
+
+static bool __maybe_unused
+xdrgen_decode_prev_entry4(struct xdr_stream *xdr, struct prev_entry4 *ptr)
+{
+	if (!xdrgen_decode_notify_entry4(xdr, &ptr->pe_prev_entry))
+		return false;
+	if (!xdrgen_decode_nfs_cookie4(xdr, &ptr->pe_prev_entry_cookie))
+		return false;
+	return true;
+};
+
+bool
+xdrgen_decode_notify_remove4(struct xdr_stream *xdr, struct notify_remove4 *ptr)
+{
+	if (!xdrgen_decode_notify_entry4(xdr, &ptr->nrm_old_entry))
+		return false;
+	if (!xdrgen_decode_nfs_cookie4(xdr, &ptr->nrm_old_entry_cookie))
+		return false;
+	return true;
+};
+
+bool
+xdrgen_decode_notify_add4(struct xdr_stream *xdr, struct notify_add4 *ptr)
+{
+	if (xdr_stream_decode_u32(xdr, &ptr->nad_old_entry.count) < 0)
+		return false;
+	if (ptr->nad_old_entry.count > 1)
+		return false;
+	for (u32 i = 0; i < ptr->nad_old_entry.count; i++)
+		if (!xdrgen_decode_notify_remove4(xdr, &ptr->nad_old_entry.element[i]))
+			return false;
+	if (!xdrgen_decode_notify_entry4(xdr, &ptr->nad_new_entry))
+		return false;
+	if (xdr_stream_decode_u32(xdr, &ptr->nad_new_entry_cookie.count) < 0)
+		return false;
+	if (ptr->nad_new_entry_cookie.count > 1)
+		return false;
+	for (u32 i = 0; i < ptr->nad_new_entry_cookie.count; i++)
+		if (!xdrgen_decode_nfs_cookie4(xdr, &ptr->nad_new_entry_cookie.element[i]))
+			return false;
+	if (xdr_stream_decode_u32(xdr, &ptr->nad_prev_entry.count) < 0)
+		return false;
+	if (ptr->nad_prev_entry.count > 1)
+		return false;
+	for (u32 i = 0; i < ptr->nad_prev_entry.count; i++)
+		if (!xdrgen_decode_prev_entry4(xdr, &ptr->nad_prev_entry.element[i]))
+			return false;
+	if (!xdrgen_decode_bool(xdr, &ptr->nad_last_entry))
+		return false;
+	return true;
+};
+
+static bool __maybe_unused
+xdrgen_decode_notify_attr4(struct xdr_stream *xdr, struct notify_attr4 *ptr)
+{
+	if (!xdrgen_decode_notify_entry4(xdr, &ptr->na_changed_entry))
+		return false;
+	return true;
+};
+
+bool
+xdrgen_decode_notify_rename4(struct xdr_stream *xdr, struct notify_rename4 *ptr)
+{
+	if (!xdrgen_decode_notify_remove4(xdr, &ptr->nrn_old_entry))
+		return false;
+	if (!xdrgen_decode_notify_add4(xdr, &ptr->nrn_new_entry))
+		return false;
+	return true;
+};
+
+static bool __maybe_unused
+xdrgen_decode_notify_verifier4(struct xdr_stream *xdr, struct notify_verifier4 *ptr)
+{
+	if (!xdrgen_decode_verifier4(xdr, &ptr->nv_old_cookieverf))
+		return false;
+	if (!xdrgen_decode_verifier4(xdr, &ptr->nv_new_cookieverf))
+		return false;
+	return true;
+};
+
+static bool __maybe_unused
+xdrgen_decode_notifylist4(struct xdr_stream *xdr, notifylist4 *ptr)
+{
+	return xdrgen_decode_opaque(xdr, ptr, 0);
+};
+
+static bool __maybe_unused
+xdrgen_decode_notify4(struct xdr_stream *xdr, struct notify4 *ptr)
+{
+	if (!xdrgen_decode_bitmap4(xdr, &ptr->notify_mask))
+		return false;
+	if (!xdrgen_decode_notifylist4(xdr, &ptr->notify_vals))
+		return false;
+	return true;
+};
+
+bool
+xdrgen_decode_CB_NOTIFY4args(struct xdr_stream *xdr, struct CB_NOTIFY4args *ptr)
+{
+	if (!xdrgen_decode_stateid4(xdr, &ptr->cna_stateid))
+		return false;
+	if (!xdrgen_decode_nfs_fh4(xdr, &ptr->cna_fh))
+		return false;
+	if (xdr_stream_decode_u32(xdr, &ptr->cna_changes.count) < 0)
+		return false;
+	for (u32 i = 0; i < ptr->cna_changes.count; i++)
+		if (!xdrgen_decode_notify4(xdr, &ptr->cna_changes.element[i]))
+			return false;
+	return true;
+};
+
+bool
+xdrgen_decode_CB_NOTIFY4res(struct xdr_stream *xdr, struct CB_NOTIFY4res *ptr)
+{
+	if (!xdrgen_decode_nfsstat4(xdr, &ptr->cnr_status))
+		return false;
+	return true;
+};
+
+static bool __maybe_unused
+xdrgen_encode_int32_t(struct xdr_stream *xdr, const int32_t value)
+{
+	return xdrgen_encode_int(xdr, value);
 };
 
 static bool __maybe_unused
@@ -158,6 +411,30 @@ xdrgen_encode_uint32_t(struct xdr_stream *xdr, const uint32_t value)
 	return xdrgen_encode_unsigned_int(xdr, value);
 };
 
+static bool __maybe_unused
+xdrgen_encode_int64_t(struct xdr_stream *xdr, const int64_t value)
+{
+	return xdrgen_encode_hyper(xdr, value);
+};
+
+static bool __maybe_unused
+xdrgen_encode_uint64_t(struct xdr_stream *xdr, const uint64_t value)
+{
+	return xdrgen_encode_unsigned_hyper(xdr, value);
+};
+
+static bool __maybe_unused
+xdrgen_encode_nfsstat4(struct xdr_stream *xdr, nfsstat4 value)
+{
+	return xdr_stream_encode_u32(xdr, value) == XDR_UNIT;
+}
+
+static bool __maybe_unused
+xdrgen_encode_attrlist4(struct xdr_stream *xdr, const attrlist4 value)
+{
+	return xdr_stream_encode_opaque(xdr, value.data, value.len) >= 0;
+};
+
 static bool __maybe_unused
 xdrgen_encode_bitmap4(struct xdr_stream *xdr, const bitmap4 value)
 {
@@ -169,6 +446,71 @@ xdrgen_encode_bitmap4(struct xdr_stream *xdr, const bitmap4 value)
 	return true;
 };
 
+static bool __maybe_unused
+xdrgen_encode_nfs_cookie4(struct xdr_stream *xdr, const nfs_cookie4 value)
+{
+	return xdrgen_encode_uint64_t(xdr, value);
+};
+
+static bool __maybe_unused
+xdrgen_encode_nfs_fh4(struct xdr_stream *xdr, const nfs_fh4 value)
+{
+	return xdr_stream_encode_opaque(xdr, value.data, value.len) >= 0;
+};
+
+static bool __maybe_unused
+xdrgen_encode_utf8string(struct xdr_stream *xdr, const utf8string value)
+{
+	return xdr_stream_encode_opaque(xdr, value.data, value.len) >= 0;
+};
+
+static bool __maybe_unused
+xdrgen_encode_utf8str_cis(struct xdr_stream *xdr, const utf8str_cis value)
+{
+	return xdrgen_encode_utf8string(xdr, value);
+};
+
+static bool __maybe_unused
+xdrgen_encode_utf8str_cs(struct xdr_stream *xdr, const utf8str_cs value)
+{
+	return xdrgen_encode_utf8string(xdr, value);
+};
+
+static bool __maybe_unused
+xdrgen_encode_utf8str_mixed(struct xdr_stream *xdr, const utf8str_mixed value)
+{
+	return xdrgen_encode_utf8string(xdr, value);
+};
+
+static bool __maybe_unused
+xdrgen_encode_component4(struct xdr_stream *xdr, const component4 value)
+{
+	return xdrgen_encode_utf8str_cs(xdr, value);
+};
+
+static bool __maybe_unused
+xdrgen_encode_linktext4(struct xdr_stream *xdr, const linktext4 value)
+{
+	return xdrgen_encode_utf8str_cs(xdr, value);
+};
+
+static bool __maybe_unused
+xdrgen_encode_pathname4(struct xdr_stream *xdr, const pathname4 value)
+{
+	if (xdr_stream_encode_u32(xdr, value.count) != XDR_UNIT)
+		return false;
+	for (u32 i = 0; i < value.count; i++)
+		if (!xdrgen_encode_component4(xdr, value.element[i]))
+			return false;
+	return true;
+};
+
+static bool __maybe_unused
+xdrgen_encode_verifier4(struct xdr_stream *xdr, const verifier4 value)
+{
+	return xdr_stream_encode_opaque_fixed(xdr, value, NFS4_VERIFIER_SIZE) >= 0;
+};
+
 static bool __maybe_unused
 xdrgen_encode_nfstime4(struct xdr_stream *xdr, const struct nfstime4 *value)
 {
@@ -179,6 +521,26 @@ xdrgen_encode_nfstime4(struct xdr_stream *xdr, const struct nfstime4 *value)
 	return true;
 };
 
+static bool __maybe_unused
+xdrgen_encode_fattr4(struct xdr_stream *xdr, const struct fattr4 *value)
+{
+	if (!xdrgen_encode_bitmap4(xdr, value->attrmask))
+		return false;
+	if (!xdrgen_encode_attrlist4(xdr, value->attr_vals))
+		return false;
+	return true;
+};
+
+static bool __maybe_unused
+xdrgen_encode_stateid4(struct xdr_stream *xdr, const struct stateid4 *value)
+{
+	if (!xdrgen_encode_uint32_t(xdr, value->seqid))
+		return false;
+	if (xdr_stream_encode_opaque_fixed(xdr, value->other, 12) < 0)
+		return false;
+	return true;
+};
+
 static bool __maybe_unused
 xdrgen_encode_fattr4_offline(struct xdr_stream *xdr, const fattr4_offline value)
 {
@@ -254,3 +616,137 @@ xdrgen_encode_open_delegation_type4(struct xdr_stream *xdr, open_delegation_type
 {
 	return xdr_stream_encode_u32(xdr, value) == XDR_UNIT;
 }
+
+static bool __maybe_unused
+xdrgen_encode_notify_type4(struct xdr_stream *xdr, notify_type4 value)
+{
+	return xdr_stream_encode_u32(xdr, value) == XDR_UNIT;
+}
+
+static bool __maybe_unused
+xdrgen_encode_notify_entry4(struct xdr_stream *xdr, const struct notify_entry4 *value)
+{
+	if (!xdrgen_encode_component4(xdr, value->ne_file))
+		return false;
+	if (!xdrgen_encode_fattr4(xdr, &value->ne_attrs))
+		return false;
+	return true;
+};
+
+static bool __maybe_unused
+xdrgen_encode_prev_entry4(struct xdr_stream *xdr, const struct prev_entry4 *value)
+{
+	if (!xdrgen_encode_notify_entry4(xdr, &value->pe_prev_entry))
+		return false;
+	if (!xdrgen_encode_nfs_cookie4(xdr, value->pe_prev_entry_cookie))
+		return false;
+	return true;
+};
+
+bool
+xdrgen_encode_notify_remove4(struct xdr_stream *xdr, const struct notify_remove4 *value)
+{
+	if (!xdrgen_encode_notify_entry4(xdr, &value->nrm_old_entry))
+		return false;
+	if (!xdrgen_encode_nfs_cookie4(xdr, value->nrm_old_entry_cookie))
+		return false;
+	return true;
+};
+
+bool
+xdrgen_encode_notify_add4(struct xdr_stream *xdr, const struct notify_add4 *value)
+{
+	if (value->nad_old_entry.count > 1)
+		return false;
+	if (xdr_stream_encode_u32(xdr, value->nad_old_entry.count) != XDR_UNIT)
+		return false;
+	for (u32 i = 0; i < value->nad_old_entry.count; i++)
+		if (!xdrgen_encode_notify_remove4(xdr, &value->nad_old_entry.element[i]))
+			return false;
+	if (!xdrgen_encode_notify_entry4(xdr, &value->nad_new_entry))
+		return false;
+	if (value->nad_new_entry_cookie.count > 1)
+		return false;
+	if (xdr_stream_encode_u32(xdr, value->nad_new_entry_cookie.count) != XDR_UNIT)
+		return false;
+	for (u32 i = 0; i < value->nad_new_entry_cookie.count; i++)
+		if (!xdrgen_encode_nfs_cookie4(xdr, value->nad_new_entry_cookie.element[i]))
+			return false;
+	if (value->nad_prev_entry.count > 1)
+		return false;
+	if (xdr_stream_encode_u32(xdr, value->nad_prev_entry.count) != XDR_UNIT)
+		return false;
+	for (u32 i = 0; i < value->nad_prev_entry.count; i++)
+		if (!xdrgen_encode_prev_entry4(xdr, &value->nad_prev_entry.element[i]))
+			return false;
+	if (!xdrgen_encode_bool(xdr, value->nad_last_entry))
+		return false;
+	return true;
+};
+
+static bool __maybe_unused
+xdrgen_encode_notify_attr4(struct xdr_stream *xdr, const struct notify_attr4 *value)
+{
+	if (!xdrgen_encode_notify_entry4(xdr, &value->na_changed_entry))
+		return false;
+	return true;
+};
+
+bool
+xdrgen_encode_notify_rename4(struct xdr_stream *xdr, const struct notify_rename4 *value)
+{
+	if (!xdrgen_encode_notify_remove4(xdr, &value->nrn_old_entry))
+		return false;
+	if (!xdrgen_encode_notify_add4(xdr, &value->nrn_new_entry))
+		return false;
+	return true;
+};
+
+static bool __maybe_unused
+xdrgen_encode_notify_verifier4(struct xdr_stream *xdr, const struct notify_verifier4 *value)
+{
+	if (!xdrgen_encode_verifier4(xdr, value->nv_old_cookieverf))
+		return false;
+	if (!xdrgen_encode_verifier4(xdr, value->nv_new_cookieverf))
+		return false;
+	return true;
+};
+
+static bool __maybe_unused
+xdrgen_encode_notifylist4(struct xdr_stream *xdr, const notifylist4 value)
+{
+	return xdr_stream_encode_opaque(xdr, value.data, value.len) >= 0;
+};
+
+static bool __maybe_unused
+xdrgen_encode_notify4(struct xdr_stream *xdr, const struct notify4 *value)
+{
+	if (!xdrgen_encode_bitmap4(xdr, value->notify_mask))
+		return false;
+	if (!xdrgen_encode_notifylist4(xdr, value->notify_vals))
+		return false;
+	return true;
+};
+
+bool
+xdrgen_encode_CB_NOTIFY4args(struct xdr_stream *xdr, const struct CB_NOTIFY4args *value)
+{
+	if (!xdrgen_encode_stateid4(xdr, &value->cna_stateid))
+		return false;
+	if (!xdrgen_encode_nfs_fh4(xdr, value->cna_fh))
+		return false;
+	if (xdr_stream_encode_u32(xdr, value->cna_changes.count) != XDR_UNIT)
+		return false;
+	for (u32 i = 0; i < value->cna_changes.count; i++)
+		if (!xdrgen_encode_notify4(xdr, &value->cna_changes.element[i]))
+			return false;
+	return true;
+};
+
+bool
+xdrgen_encode_CB_NOTIFY4res(struct xdr_stream *xdr, const struct CB_NOTIFY4res *value)
+{
+	if (!xdrgen_encode_nfsstat4(xdr, value->cnr_status))
+		return false;
+	return true;
+};
diff --git a/fs/nfsd/nfs4xdr_gen.h b/fs/nfsd/nfs4xdr_gen.h
index 41a0033b72562ee3c1fcdcd4a887ce635385b22b..c2936e1188007a5c6a6a4f3f373a69728bf7459c 100644
--- a/fs/nfsd/nfs4xdr_gen.h
+++ b/fs/nfsd/nfs4xdr_gen.h
@@ -1,7 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 /* Generated by xdrgen. Manual edits will be lost. */
 /* XDR specification file: ../../Documentation/sunrpc/xdr/nfs4_1.x */
-/* XDR specification modification time: Mon Oct 14 09:10:13 2024 */
+/* XDR specification modification time: Fri May 16 10:52:35 2025 */
 
 #ifndef _LINUX_XDRGEN_NFS4_1_DECL_H
 #define _LINUX_XDRGEN_NFS4_1_DECL_H
@@ -22,4 +22,19 @@ bool xdrgen_encode_fattr4_time_deleg_access(struct xdr_stream *xdr, const fattr4
 bool xdrgen_decode_fattr4_time_deleg_modify(struct xdr_stream *xdr, fattr4_time_deleg_modify *ptr);
 bool xdrgen_encode_fattr4_time_deleg_modify(struct xdr_stream *xdr, const fattr4_time_deleg_modify *value);
 
+bool xdrgen_decode_notify_remove4(struct xdr_stream *xdr, struct notify_remove4 *ptr);
+bool xdrgen_encode_notify_remove4(struct xdr_stream *xdr, const struct notify_remove4 *value);
+
+bool xdrgen_decode_notify_add4(struct xdr_stream *xdr, struct notify_add4 *ptr);
+bool xdrgen_encode_notify_add4(struct xdr_stream *xdr, const struct notify_add4 *value);
+
+bool xdrgen_decode_notify_rename4(struct xdr_stream *xdr, struct notify_rename4 *ptr);
+bool xdrgen_encode_notify_rename4(struct xdr_stream *xdr, const struct notify_rename4 *value);
+
+bool xdrgen_decode_CB_NOTIFY4args(struct xdr_stream *xdr, struct CB_NOTIFY4args *ptr);
+bool xdrgen_encode_CB_NOTIFY4args(struct xdr_stream *xdr, const struct CB_NOTIFY4args *value);
+
+bool xdrgen_decode_CB_NOTIFY4res(struct xdr_stream *xdr, struct CB_NOTIFY4res *ptr);
+bool xdrgen_encode_CB_NOTIFY4res(struct xdr_stream *xdr, const struct CB_NOTIFY4res *value);
+
 #endif /* _LINUX_XDRGEN_NFS4_1_DECL_H */
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 3c5505ef5e3a38d805a48ea4e190063b5341684d..0c68df50eae248c7c9afe0437dfcf29837e09275 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -1615,6 +1615,7 @@ TRACE_EVENT(nfsd_cb_setup_err,
 		{ OP_CB_RECALL,			"CB_RECALL" },		\
 		{ OP_CB_LAYOUTRECALL,		"CB_LAYOUTRECALL" },	\
 		{ OP_CB_RECALL_ANY,		"CB_RECALL_ANY" },	\
+		{ OP_CB_NOTIFY,			"CB_NOTIFY" },		\
 		{ OP_CB_NOTIFY_LOCK,		"CB_NOTIFY_LOCK" },	\
 		{ OP_CB_OFFLOAD,		"CB_OFFLOAD" })
 
diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index d8cad844870aa74ce1e0cc78c499fb001d898c93..5e86622259ae75dc199cd54bcddc669d3cda1a99 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -170,133 +170,6 @@ Needs to be updated if more operations are defined in future.*/
 #define LAST_NFS42_OP	OP_REMOVEXATTR
 #define LAST_NFS4_OP	LAST_NFS42_OP
 
-enum nfsstat4 {
-	NFS4_OK = 0,
-	NFS4ERR_PERM = 1,
-	NFS4ERR_NOENT = 2,
-	NFS4ERR_IO = 5,
-	NFS4ERR_NXIO = 6,
-	NFS4ERR_ACCESS = 13,
-	NFS4ERR_EXIST = 17,
-	NFS4ERR_XDEV = 18,
-	/* Unused/reserved 19 */
-	NFS4ERR_NOTDIR = 20,
-	NFS4ERR_ISDIR = 21,
-	NFS4ERR_INVAL = 22,
-	NFS4ERR_FBIG = 27,
-	NFS4ERR_NOSPC = 28,
-	NFS4ERR_ROFS = 30,
-	NFS4ERR_MLINK = 31,
-	NFS4ERR_NAMETOOLONG = 63,
-	NFS4ERR_NOTEMPTY = 66,
-	NFS4ERR_DQUOT = 69,
-	NFS4ERR_STALE = 70,
-	NFS4ERR_BADHANDLE = 10001,
-	NFS4ERR_BAD_COOKIE = 10003,
-	NFS4ERR_NOTSUPP = 10004,
-	NFS4ERR_TOOSMALL = 10005,
-	NFS4ERR_SERVERFAULT = 10006,
-	NFS4ERR_BADTYPE = 10007,
-	NFS4ERR_DELAY = 10008,
-	NFS4ERR_SAME = 10009,
-	NFS4ERR_DENIED = 10010,
-	NFS4ERR_EXPIRED = 10011,
-	NFS4ERR_LOCKED = 10012,
-	NFS4ERR_GRACE = 10013,
-	NFS4ERR_FHEXPIRED = 10014,
-	NFS4ERR_SHARE_DENIED = 10015,
-	NFS4ERR_WRONGSEC = 10016,
-	NFS4ERR_CLID_INUSE = 10017,
-	NFS4ERR_RESOURCE = 10018,
-	NFS4ERR_MOVED = 10019,
-	NFS4ERR_NOFILEHANDLE = 10020,
-	NFS4ERR_MINOR_VERS_MISMATCH = 10021,
-	NFS4ERR_STALE_CLIENTID = 10022,
-	NFS4ERR_STALE_STATEID = 10023,
-	NFS4ERR_OLD_STATEID = 10024,
-	NFS4ERR_BAD_STATEID = 10025,
-	NFS4ERR_BAD_SEQID = 10026,
-	NFS4ERR_NOT_SAME = 10027,
-	NFS4ERR_LOCK_RANGE = 10028,
-	NFS4ERR_SYMLINK = 10029,
-	NFS4ERR_RESTOREFH = 10030,
-	NFS4ERR_LEASE_MOVED = 10031,
-	NFS4ERR_ATTRNOTSUPP = 10032,
-	NFS4ERR_NO_GRACE = 10033,
-	NFS4ERR_RECLAIM_BAD = 10034,
-	NFS4ERR_RECLAIM_CONFLICT = 10035,
-	NFS4ERR_BADXDR = 10036,
-	NFS4ERR_LOCKS_HELD = 10037,
-	NFS4ERR_OPENMODE = 10038,
-	NFS4ERR_BADOWNER = 10039,
-	NFS4ERR_BADCHAR = 10040,
-	NFS4ERR_BADNAME = 10041,
-	NFS4ERR_BAD_RANGE = 10042,
-	NFS4ERR_LOCK_NOTSUPP = 10043,
-	NFS4ERR_OP_ILLEGAL = 10044,
-	NFS4ERR_DEADLOCK = 10045,
-	NFS4ERR_FILE_OPEN = 10046,
-	NFS4ERR_ADMIN_REVOKED = 10047,
-	NFS4ERR_CB_PATH_DOWN = 10048,
-
-	/* nfs41 */
-	NFS4ERR_BADIOMODE	= 10049,
-	NFS4ERR_BADLAYOUT	= 10050,
-	NFS4ERR_BAD_SESSION_DIGEST = 10051,
-	NFS4ERR_BADSESSION	= 10052,
-	NFS4ERR_BADSLOT		= 10053,
-	NFS4ERR_COMPLETE_ALREADY = 10054,
-	NFS4ERR_CONN_NOT_BOUND_TO_SESSION = 10055,
-	NFS4ERR_DELEG_ALREADY_WANTED = 10056,
-	NFS4ERR_BACK_CHAN_BUSY	= 10057,	/* backchan reqs outstanding */
-	NFS4ERR_LAYOUTTRYLATER	= 10058,
-	NFS4ERR_LAYOUTUNAVAILABLE = 10059,
-	NFS4ERR_NOMATCHING_LAYOUT = 10060,
-	NFS4ERR_RECALLCONFLICT	= 10061,
-	NFS4ERR_UNKNOWN_LAYOUTTYPE = 10062,
-	NFS4ERR_SEQ_MISORDERED = 10063, 	/* unexpected seq.id in req */
-	NFS4ERR_SEQUENCE_POS	= 10064,	/* [CB_]SEQ. op not 1st op */
-	NFS4ERR_REQ_TOO_BIG	= 10065,	/* request too big */
-	NFS4ERR_REP_TOO_BIG	= 10066,	/* reply too big */
-	NFS4ERR_REP_TOO_BIG_TO_CACHE = 10067,	/* rep. not all cached */
-	NFS4ERR_RETRY_UNCACHED_REP = 10068,	/* retry & rep. uncached */
-	NFS4ERR_UNSAFE_COMPOUND = 10069,	/* retry/recovery too hard */
-	NFS4ERR_TOO_MANY_OPS	= 10070,	/* too many ops in [CB_]COMP */
-	NFS4ERR_OP_NOT_IN_SESSION = 10071,	/* op needs [CB_]SEQ. op */
-	NFS4ERR_HASH_ALG_UNSUPP = 10072,	/* hash alg. not supp. */
-						/* Error 10073 is unused. */
-	NFS4ERR_CLIENTID_BUSY	= 10074,	/* clientid has state */
-	NFS4ERR_PNFS_IO_HOLE	= 10075,	/* IO to _SPARSE file hole */
-	NFS4ERR_SEQ_FALSE_RETRY	= 10076,	/* retry not original */
-	NFS4ERR_BAD_HIGH_SLOT	= 10077,	/* sequence arg bad */
-	NFS4ERR_DEADSESSION	= 10078,	/* persistent session dead */
-	NFS4ERR_ENCR_ALG_UNSUPP = 10079,	/* SSV alg mismatch */
-	NFS4ERR_PNFS_NO_LAYOUT	= 10080,	/* direct I/O with no layout */
-	NFS4ERR_NOT_ONLY_OP	= 10081,	/* bad compound */
-	NFS4ERR_WRONG_CRED	= 10082,	/* permissions:state change */
-	NFS4ERR_WRONG_TYPE	= 10083,	/* current operation mismatch */
-	NFS4ERR_DIRDELEG_UNAVAIL = 10084,	/* no directory delegation */
-	NFS4ERR_REJECT_DELEG	= 10085,	/* on callback */
-	NFS4ERR_RETURNCONFLICT	= 10086,	/* outstanding layoutreturn */
-	NFS4ERR_DELEG_REVOKED	= 10087,	/* deleg./layout revoked */
-
-	/* nfs42 */
-	NFS4ERR_PARTNER_NOTSUPP	= 10088,
-	NFS4ERR_PARTNER_NO_AUTH	= 10089,
-	NFS4ERR_UNION_NOTSUPP	= 10090,
-	NFS4ERR_OFFLOAD_DENIED	= 10091,
-	NFS4ERR_WRONG_LFS	= 10092,
-	NFS4ERR_BADLABEL	= 10093,
-	NFS4ERR_OFFLOAD_NO_REQS	= 10094,
-
-	/* xattr (RFC8276) */
-	NFS4ERR_NOXATTR		= 10095,
-	NFS4ERR_XATTR2BIG	= 10096,
-
-	/* can be used for internal errors */
-	NFS4ERR_FIRST_FREE
-};
-
 /* error codes for internal client use */
 #define NFS4ERR_RESET_TO_MDS   12001
 #define NFS4ERR_RESET_TO_PNFS  12002
diff --git a/include/linux/sunrpc/xdrgen/nfs4_1.h b/include/linux/sunrpc/xdrgen/nfs4_1.h
index cf21a14aa8850f4b21cd365cb7bc22a02c6097ce..e7bd95e3e19c8b4b8c69119457eac9abc486b0bd 100644
--- a/include/linux/sunrpc/xdrgen/nfs4_1.h
+++ b/include/linux/sunrpc/xdrgen/nfs4_1.h
@@ -1,7 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 /* Generated by xdrgen. Manual edits will be lost. */
 /* XDR specification file: ../../Documentation/sunrpc/xdr/nfs4_1.x */
-/* XDR specification modification time: Mon Oct 14 09:10:13 2024 */
+/* XDR specification modification time: Fri Nov  1 12:17:17 2024 */
 
 #ifndef _LINUX_XDRGEN_NFS4_1_DEF_H
 #define _LINUX_XDRGEN_NFS4_1_DEF_H
@@ -9,20 +9,181 @@
 #include <linux/types.h>
 #include <linux/sunrpc/xdrgen/_defs.h>
 
-typedef s64 int64_t;
+typedef s32 int32_t;
 
 typedef u32 uint32_t;
 
+typedef s64 int64_t;
+
+typedef u64 uint64_t;
+
+enum { NFS4_VERIFIER_SIZE = 8 };
+
+enum { NFS4_FHSIZE = 128 };
+
+enum nfsstat4 {
+	NFS4_OK = 0,
+	NFS4ERR_PERM = 1,
+	NFS4ERR_NOENT = 2,
+	NFS4ERR_IO = 5,
+	NFS4ERR_NXIO = 6,
+	NFS4ERR_ACCESS = 13,
+	NFS4ERR_EXIST = 17,
+	NFS4ERR_XDEV = 18,
+	NFS4ERR_NOTDIR = 20,
+	NFS4ERR_ISDIR = 21,
+	NFS4ERR_INVAL = 22,
+	NFS4ERR_FBIG = 27,
+	NFS4ERR_NOSPC = 28,
+	NFS4ERR_ROFS = 30,
+	NFS4ERR_MLINK = 31,
+	NFS4ERR_NAMETOOLONG = 63,
+	NFS4ERR_NOTEMPTY = 66,
+	NFS4ERR_DQUOT = 69,
+	NFS4ERR_STALE = 70,
+	NFS4ERR_BADHANDLE = 10001,
+	NFS4ERR_BAD_COOKIE = 10003,
+	NFS4ERR_NOTSUPP = 10004,
+	NFS4ERR_TOOSMALL = 10005,
+	NFS4ERR_SERVERFAULT = 10006,
+	NFS4ERR_BADTYPE = 10007,
+	NFS4ERR_DELAY = 10008,
+	NFS4ERR_SAME = 10009,
+	NFS4ERR_DENIED = 10010,
+	NFS4ERR_EXPIRED = 10011,
+	NFS4ERR_LOCKED = 10012,
+	NFS4ERR_GRACE = 10013,
+	NFS4ERR_FHEXPIRED = 10014,
+	NFS4ERR_SHARE_DENIED = 10015,
+	NFS4ERR_WRONGSEC = 10016,
+	NFS4ERR_CLID_INUSE = 10017,
+	NFS4ERR_RESOURCE = 10018,
+	NFS4ERR_MOVED = 10019,
+	NFS4ERR_NOFILEHANDLE = 10020,
+	NFS4ERR_MINOR_VERS_MISMATCH = 10021,
+	NFS4ERR_STALE_CLIENTID = 10022,
+	NFS4ERR_STALE_STATEID = 10023,
+	NFS4ERR_OLD_STATEID = 10024,
+	NFS4ERR_BAD_STATEID = 10025,
+	NFS4ERR_BAD_SEQID = 10026,
+	NFS4ERR_NOT_SAME = 10027,
+	NFS4ERR_LOCK_RANGE = 10028,
+	NFS4ERR_SYMLINK = 10029,
+	NFS4ERR_RESTOREFH = 10030,
+	NFS4ERR_LEASE_MOVED = 10031,
+	NFS4ERR_ATTRNOTSUPP = 10032,
+	NFS4ERR_NO_GRACE = 10033,
+	NFS4ERR_RECLAIM_BAD = 10034,
+	NFS4ERR_RECLAIM_CONFLICT = 10035,
+	NFS4ERR_BADXDR = 10036,
+	NFS4ERR_LOCKS_HELD = 10037,
+	NFS4ERR_OPENMODE = 10038,
+	NFS4ERR_BADOWNER = 10039,
+	NFS4ERR_BADCHAR = 10040,
+	NFS4ERR_BADNAME = 10041,
+	NFS4ERR_BAD_RANGE = 10042,
+	NFS4ERR_LOCK_NOTSUPP = 10043,
+	NFS4ERR_OP_ILLEGAL = 10044,
+	NFS4ERR_DEADLOCK = 10045,
+	NFS4ERR_FILE_OPEN = 10046,
+	NFS4ERR_ADMIN_REVOKED = 10047,
+	NFS4ERR_CB_PATH_DOWN = 10048,
+	NFS4ERR_BADIOMODE = 10049,
+	NFS4ERR_BADLAYOUT = 10050,
+	NFS4ERR_BAD_SESSION_DIGEST = 10051,
+	NFS4ERR_BADSESSION = 10052,
+	NFS4ERR_BADSLOT = 10053,
+	NFS4ERR_COMPLETE_ALREADY = 10054,
+	NFS4ERR_CONN_NOT_BOUND_TO_SESSION = 10055,
+	NFS4ERR_DELEG_ALREADY_WANTED = 10056,
+	NFS4ERR_BACK_CHAN_BUSY = 10057,
+	NFS4ERR_LAYOUTTRYLATER = 10058,
+	NFS4ERR_LAYOUTUNAVAILABLE = 10059,
+	NFS4ERR_NOMATCHING_LAYOUT = 10060,
+	NFS4ERR_RECALLCONFLICT = 10061,
+	NFS4ERR_UNKNOWN_LAYOUTTYPE = 10062,
+	NFS4ERR_SEQ_MISORDERED = 10063,
+	NFS4ERR_SEQUENCE_POS = 10064,
+	NFS4ERR_REQ_TOO_BIG = 10065,
+	NFS4ERR_REP_TOO_BIG = 10066,
+	NFS4ERR_REP_TOO_BIG_TO_CACHE = 10067,
+	NFS4ERR_RETRY_UNCACHED_REP = 10068,
+	NFS4ERR_UNSAFE_COMPOUND = 10069,
+	NFS4ERR_TOO_MANY_OPS = 10070,
+	NFS4ERR_OP_NOT_IN_SESSION = 10071,
+	NFS4ERR_HASH_ALG_UNSUPP = 10072,
+	NFS4ERR_CLIENTID_BUSY = 10074,
+	NFS4ERR_PNFS_IO_HOLE = 10075,
+	NFS4ERR_SEQ_FALSE_RETRY = 10076,
+	NFS4ERR_BAD_HIGH_SLOT = 10077,
+	NFS4ERR_DEADSESSION = 10078,
+	NFS4ERR_ENCR_ALG_UNSUPP = 10079,
+	NFS4ERR_PNFS_NO_LAYOUT = 10080,
+	NFS4ERR_NOT_ONLY_OP = 10081,
+	NFS4ERR_WRONG_CRED = 10082,
+	NFS4ERR_WRONG_TYPE = 10083,
+	NFS4ERR_DIRDELEG_UNAVAIL = 10084,
+	NFS4ERR_REJECT_DELEG = 10085,
+	NFS4ERR_RETURNCONFLICT = 10086,
+	NFS4ERR_DELEG_REVOKED = 10087,
+	NFS4ERR_PARTNER_NOTSUPP = 10088,
+	NFS4ERR_PARTNER_NO_AUTH = 10089,
+	NFS4ERR_UNION_NOTSUPP = 10090,
+	NFS4ERR_OFFLOAD_DENIED = 10091,
+	NFS4ERR_WRONG_LFS = 10092,
+	NFS4ERR_BADLABEL = 10093,
+	NFS4ERR_OFFLOAD_NO_REQS = 10094,
+	NFS4ERR_NOXATTR = 10095,
+	NFS4ERR_XATTR2BIG = 10096,
+	NFS4ERR_FIRST_FREE = 10097,
+};
+typedef enum nfsstat4 nfsstat4;
+
+typedef opaque attrlist4;
+
 typedef struct {
 	u32 count;
 	uint32_t *element;
 } bitmap4;
 
+typedef uint64_t nfs_cookie4;
+
+typedef opaque nfs_fh4;
+
+typedef opaque utf8string;
+
+typedef utf8string utf8str_cis;
+
+typedef utf8string utf8str_cs;
+
+typedef utf8string utf8str_mixed;
+
+typedef utf8str_cs component4;
+
+typedef utf8str_cs linktext4;
+
+typedef struct {
+	u32 count;
+	component4 *element;
+} pathname4;
+
+typedef u8 verifier4[NFS4_VERIFIER_SIZE];
+
 struct nfstime4 {
 	int64_t seconds;
 	uint32_t nseconds;
 };
 
+struct fattr4 {
+	bitmap4 attrmask;
+	attrlist4 attr_vals;
+};
+
+struct stateid4 {
+	uint32_t seqid;
+	u8 other[12];
+};
+
 typedef bool fattr4_offline;
 
 enum { FATTR4_OFFLINE = 83 };
@@ -126,13 +287,115 @@ enum open_delegation_type4 {
 };
 typedef enum open_delegation_type4 open_delegation_type4;
 
-#define NFS4_int64_t_sz                 \
-	(XDR_hyper)
+enum notify_type4 {
+	NOTIFY4_CHANGE_CHILD_ATTRS = 0,
+	NOTIFY4_CHANGE_DIR_ATTRS = 1,
+	NOTIFY4_REMOVE_ENTRY = 2,
+	NOTIFY4_ADD_ENTRY = 3,
+	NOTIFY4_RENAME_ENTRY = 4,
+	NOTIFY4_CHANGE_COOKIE_VERIFIER = 5,
+};
+typedef enum notify_type4 notify_type4;
+
+struct notify_entry4 {
+	component4 ne_file;
+	struct fattr4 ne_attrs;
+};
+
+struct prev_entry4 {
+	struct notify_entry4 pe_prev_entry;
+	nfs_cookie4 pe_prev_entry_cookie;
+};
+
+struct notify_remove4 {
+	struct notify_entry4 nrm_old_entry;
+	nfs_cookie4 nrm_old_entry_cookie;
+};
+
+struct notify_add4 {
+	struct {
+		u32 count;
+		struct notify_remove4 *element;
+	} nad_old_entry;
+	struct notify_entry4 nad_new_entry;
+	struct {
+		u32 count;
+		nfs_cookie4 *element;
+	} nad_new_entry_cookie;
+	struct {
+		u32 count;
+		struct prev_entry4 *element;
+	} nad_prev_entry;
+	bool nad_last_entry;
+};
+
+struct notify_attr4 {
+	struct notify_entry4 na_changed_entry;
+};
+
+struct notify_rename4 {
+	struct notify_remove4 nrn_old_entry;
+	struct notify_add4 nrn_new_entry;
+};
+
+struct notify_verifier4 {
+	verifier4 nv_old_cookieverf;
+	verifier4 nv_new_cookieverf;
+};
+
+typedef opaque notifylist4;
+
+struct notify4 {
+	bitmap4 notify_mask;
+	notifylist4 notify_vals;
+};
+
+struct CB_NOTIFY4args {
+	struct stateid4 cna_stateid;
+	nfs_fh4 cna_fh;
+	struct {
+		u32 count;
+		struct notify4 *element;
+	} cna_changes;
+};
+
+struct CB_NOTIFY4res {
+	nfsstat4 cnr_status;
+};
+
+#define NFS4_int32_t_sz                 \
+	(XDR_int)
 #define NFS4_uint32_t_sz                \
 	(XDR_unsigned_int)
+#define NFS4_int64_t_sz                 \
+	(XDR_hyper)
+#define NFS4_uint64_t_sz                \
+	(XDR_unsigned_hyper)
+#define NFS4_nfsstat4_sz                (XDR_int)
+#define NFS4_attrlist4_sz               (XDR_unsigned_int)
 #define NFS4_bitmap4_sz                 (XDR_unsigned_int)
+#define NFS4_nfs_cookie4_sz             \
+	(NFS4_uint64_t_sz)
+#define NFS4_nfs_fh4_sz                 (XDR_unsigned_int + XDR_QUADLEN(NFS4_FHSIZE))
+#define NFS4_utf8string_sz              (XDR_unsigned_int)
+#define NFS4_utf8str_cis_sz             \
+	(NFS4_utf8string_sz)
+#define NFS4_utf8str_cs_sz              \
+	(NFS4_utf8string_sz)
+#define NFS4_utf8str_mixed_sz           \
+	(NFS4_utf8string_sz)
+#define NFS4_component4_sz              \
+	(NFS4_utf8str_cs_sz)
+#define NFS4_linktext4_sz               \
+	(NFS4_utf8str_cs_sz)
+#define NFS4_pathname4_sz               (XDR_unsigned_int)
+#define NFS4_verifier4_sz               (XDR_QUADLEN(NFS4_VERIFIER_SIZE))
 #define NFS4_nfstime4_sz                \
 	(NFS4_int64_t_sz + NFS4_uint32_t_sz)
+#define NFS4_fattr4_sz                  \
+	(NFS4_bitmap4_sz + NFS4_attrlist4_sz)
+#define NFS4_stateid4_sz                \
+	(NFS4_uint32_t_sz + XDR_QUADLEN(12))
 #define NFS4_fattr4_offline_sz          \
 	(XDR_bool)
 #define NFS4_open_arguments4_sz         \
@@ -149,5 +412,27 @@ typedef enum open_delegation_type4 open_delegation_type4;
 #define NFS4_fattr4_time_deleg_modify_sz \
 	(NFS4_nfstime4_sz)
 #define NFS4_open_delegation_type4_sz   (XDR_int)
+#define NFS4_notify_type4_sz            (XDR_int)
+#define NFS4_notify_entry4_sz           \
+	(NFS4_component4_sz + NFS4_fattr4_sz)
+#define NFS4_prev_entry4_sz             \
+	(NFS4_notify_entry4_sz + NFS4_nfs_cookie4_sz)
+#define NFS4_notify_remove4_sz          \
+	(NFS4_notify_entry4_sz + NFS4_nfs_cookie4_sz)
+#define NFS4_notify_add4_sz             \
+	(XDR_unsigned_int + (1 * (NFS4_notify_remove4_sz)) + NFS4_notify_entry4_sz + XDR_unsigned_int + (1 * (NFS4_nfs_cookie4_sz)) + XDR_unsigned_int + (1 * (NFS4_prev_entry4_sz)) + XDR_bool)
+#define NFS4_notify_attr4_sz            \
+	(NFS4_notify_entry4_sz)
+#define NFS4_notify_rename4_sz          \
+	(NFS4_notify_remove4_sz + NFS4_notify_add4_sz)
+#define NFS4_notify_verifier4_sz        \
+	(NFS4_verifier4_sz + NFS4_verifier4_sz)
+#define NFS4_notifylist4_sz             (XDR_unsigned_int)
+#define NFS4_notify4_sz                 \
+	(NFS4_bitmap4_sz + NFS4_notifylist4_sz)
+#define NFS4_CB_NOTIFY4args_sz          \
+	(NFS4_stateid4_sz + NFS4_nfs_fh4_sz + XDR_unsigned_int)
+#define NFS4_CB_NOTIFY4res_sz           \
+	(NFS4_nfsstat4_sz)
 
 #endif /* _LINUX_XDRGEN_NFS4_1_DEF_H */
diff --git a/include/uapi/linux/nfs4.h b/include/uapi/linux/nfs4.h
index 4273e0249fcbb54996f5642f9920826b9d68b7b9..289205b53a0858e589380c69ad1ba0cfd5f825fd 100644
--- a/include/uapi/linux/nfs4.h
+++ b/include/uapi/linux/nfs4.h
@@ -17,11 +17,9 @@
 #include <linux/types.h>
 
 #define NFS4_BITMAP_SIZE	3
-#define NFS4_VERIFIER_SIZE	8
 #define NFS4_STATEID_SEQID_SIZE 4
 #define NFS4_STATEID_OTHER_SIZE 12
 #define NFS4_STATEID_SIZE	(NFS4_STATEID_SEQID_SIZE + NFS4_STATEID_OTHER_SIZE)
-#define NFS4_FHSIZE		128
 #define NFS4_MAXPATHLEN		PATH_MAX
 #define NFS4_MAXNAMLEN		NAME_MAX
 #define NFS4_OPAQUE_LIMIT	1024

-- 
2.49.0



* [PATCH RFC v2 19/28] nfsd: add callback encoding and decoding linkages for CB_NOTIFY
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (17 preceding siblings ...)
  2025-06-02 14:02 ` [PATCH RFC v2 18/28] nfsd: add protocol support for CB_NOTIFY Jeff Layton
@ 2025-06-02 14:02 ` Jeff Layton
  2025-06-02 14:02 ` [PATCH RFC v2 20/28] nfsd: add data structures for handling CB_NOTIFY to directory delegation Jeff Layton
                   ` (8 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:02 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

Add routines for encoding and decoding CB_NOTIFY messages. These call
into the code generated by xdrgen to do the actual encoding and
decoding.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
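Note (not part of the patch): a minimal userspace sketch of the wire
layout that xdrgen_encode_CB_NOTIFY4args() is expected to produce when
there are no notify4 elements, as in the zeroed "args" above. It assumes
standard XDR rules (big-endian 4-byte units, opaques padded to a 4-byte
boundary). Every sketch_* name is hypothetical, and the buffer type is a
stand-in for struct xdr_stream rather than the kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's struct xdr_stream. */
struct sketch_buf {
	uint8_t data[256];
	size_t len;
};

/* Append one big-endian 32-bit XDR unit. */
static bool sketch_put_u32(struct sketch_buf *b, uint32_t v)
{
	if (b->len + 4 > sizeof(b->data))
		return false;
	b->data[b->len++] = (uint8_t)(v >> 24);
	b->data[b->len++] = (uint8_t)(v >> 16);
	b->data[b->len++] = (uint8_t)(v >> 8);
	b->data[b->len++] = (uint8_t)v;
	return true;
}

/* Append fixed-length opaque data, zero-padded to a 4-byte boundary. */
static bool sketch_put_fixed(struct sketch_buf *b, const void *p, size_t n)
{
	size_t padded = (n + 3) & ~(size_t)3;

	if (b->len + padded > sizeof(b->data))
		return false;
	memset(b->data + b->len, 0, padded);
	if (n)
		memcpy(b->data + b->len, p, n);
	b->len += padded;
	return true;
}

/* Variable-length opaque: length word first, then the padded bytes. */
static bool sketch_put_opaque(struct sketch_buf *b, const void *p, uint32_t n)
{
	return sketch_put_u32(b, n) && sketch_put_fixed(b, p, n);
}

/* Simplified mirror of CB_NOTIFY4args, without any notify4 elements. */
struct sketch_cb_notify_args {
	uint32_t seqid;		/* stateid4.seqid */
	uint8_t other[12];	/* stateid4.other */
	uint8_t fh[128];	/* nfs_fh4 bytes, at most NFS4_FHSIZE */
	uint32_t fh_len;
	uint32_t nchanges;	/* only 0 is handled in this sketch */
};

static bool sketch_encode_cb_notify_args(struct sketch_buf *b,
					 const struct sketch_cb_notify_args *a)
{
	if (a->fh_len > sizeof(a->fh) || a->nchanges != 0)
		return false;
	return sketch_put_u32(b, a->seqid) &&
	       sketch_put_fixed(b, a->other, sizeof(a->other)) &&
	       sketch_put_opaque(b, a->fh, a->fh_len) &&
	       sketch_put_u32(b, a->nchanges);
}

/* Zeroed args: 4 (seqid) + 12 (other) + 4 (fh length) + 4 (count). */
void sketch_selfcheck(void)
{
	struct sketch_buf b = { .len = 0 };
	struct sketch_cb_notify_args args = { 0 };

	assert(sketch_encode_cb_notify_args(&b, &args));
	assert(b.len == 24);
}
```

Calling sketch_selfcheck() from a small test program checks the
24-byte result for the all-zero case. A 5-byte filehandle would add 8
bytes, since the opaque is padded up to the next XDR unit.
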
 fs/nfsd/nfs4callback.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 fs/nfsd/state.h        |  1 +
 fs/nfsd/xdr4cb.h       | 11 +++++++++++
 3 files changed, 58 insertions(+)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index e00b2aea8da2b93f366d88888f404734953f1942..2dca686d67fc0f0fcf7997a252b4f5988b9de6c7 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -865,6 +865,51 @@ static void encode_stateowner(struct xdr_stream *xdr, struct nfs4_stateowner *so
 	xdr_encode_opaque(p, so->so_owner.data, so->so_owner.len);
 }
 
+static void nfs4_xdr_enc_cb_notify(struct rpc_rqst *req,
+				   struct xdr_stream *xdr,
+				   const void *data)
+{
+	const struct nfsd4_callback *cb = data;
+	struct nfs4_cb_compound_hdr hdr = {
+		.ident = 0,
+		.minorversion = cb->cb_clp->cl_minorversion,
+	};
+	struct CB_NOTIFY4args args = { };
+
+	WARN_ON_ONCE(hdr.minorversion == 0);
+
+	encode_cb_compound4args(xdr, &hdr);
+	encode_cb_sequence4args(xdr, cb, &hdr);
+
+	/*
+	 * FIXME: get stateid and fh from delegation. Inline the cna_changes
+	 * buffer, and zero it.
+	 */
+	WARN_ON_ONCE(!xdrgen_encode_CB_NOTIFY4args(xdr, &args));
+
+	hdr.nops++;
+	encode_cb_nops(&hdr);
+}
+
+static int nfs4_xdr_dec_cb_notify(struct rpc_rqst *rqstp,
+				  struct xdr_stream *xdr,
+				  void *data)
+{
+	struct nfsd4_callback *cb = data;
+	struct nfs4_cb_compound_hdr hdr;
+	int status;
+
+	status = decode_cb_compound4res(xdr, &hdr);
+	if (unlikely(status))
+		return status;
+
+	status = decode_cb_sequence4res(xdr, cb);
+	if (unlikely(status || cb->cb_seq_status))
+		return status;
+
+	return decode_cb_op_status(xdr, OP_CB_NOTIFY, &cb->cb_status);
+}
+
 static void nfs4_xdr_enc_cb_notify_lock(struct rpc_rqst *req,
 					struct xdr_stream *xdr,
 					const void *data)
@@ -1026,6 +1071,7 @@ static const struct rpc_procinfo nfs4_cb_procedures[] = {
 #ifdef CONFIG_NFSD_PNFS
 	PROC(CB_LAYOUT,	COMPOUND,	cb_layout,	cb_layout),
 #endif
+	PROC(CB_NOTIFY,		COMPOUND,	cb_notify,	cb_notify),
 	PROC(CB_NOTIFY_LOCK,	COMPOUND,	cb_notify_lock,	cb_notify_lock),
 	PROC(CB_OFFLOAD,	COMPOUND,	cb_offload,	cb_offload),
 	PROC(CB_RECALL_ANY,	COMPOUND,	cb_recall_any,	cb_recall_any),
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 0eeecd824770c4df8e1cc29fc738e568d91d5e5f..5f21c79be032cc1334a301aad73e6bbcc8da5eb0 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -743,6 +743,7 @@ enum nfsd4_cb_op {
 	NFSPROC4_CLNT_CB_NOTIFY_LOCK,
 	NFSPROC4_CLNT_CB_RECALL_ANY,
 	NFSPROC4_CLNT_CB_GETATTR,
+	NFSPROC4_CLNT_CB_NOTIFY,
 };
 
 /* Returns true iff a is later than b: */
diff --git a/fs/nfsd/xdr4cb.h b/fs/nfsd/xdr4cb.h
index f4e29c0c701c9b04c44dadc752e847dc4da163d6..100f726ed92730ba953ae217b47be0bd7aefd4e5 100644
--- a/fs/nfsd/xdr4cb.h
+++ b/fs/nfsd/xdr4cb.h
@@ -33,6 +33,17 @@
 					cb_sequence_dec_sz +            \
 					op_dec_sz)
 
+#define NFS4_enc_cb_notify_sz		(cb_compound_enc_hdr_sz +       \
+					cb_sequence_enc_sz +            \
+					1 + enc_stateid_sz +            \
+					enc_nfs4_fh_sz +		\
+					1)
+					/* followed by an array of notify4's in pages */
+
+#define NFS4_dec_cb_notify_sz		(cb_compound_dec_hdr_sz  +      \
+					cb_sequence_dec_sz +            \
+					op_dec_sz)
+
 #define NFS4_enc_cb_notify_lock_sz	(cb_compound_enc_hdr_sz +        \
 					cb_sequence_enc_sz +             \
 					2 + 1 +				 \

-- 
2.49.0



* [PATCH RFC v2 20/28] nfsd: add data structures for handling CB_NOTIFY to directory delegation
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (18 preceding siblings ...)
  2025-06-02 14:02 ` [PATCH RFC v2 19/28] nfsd: add callback encoding and decoding linkages " Jeff Layton
@ 2025-06-02 14:02 ` Jeff Layton
  2025-06-02 14:02 ` [PATCH RFC v2 21/28] fsnotify: export fsnotify_recalc_mask() Jeff Layton
                   ` (7 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:02 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

When a directory delegation is created, have it allocate the necessary
data structures to collect events and run a CB_NOTIFY callback.
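
As a rough illustration of the two-spool arrangement, here is a minimal
userspace sketch. It is not the nfsd code: struct spool, struct notifier and
the notifier_*() helpers are hypothetical stand-ins for nfsd4_notify_spool,
nfsd4_cb_notify and their helpers, plain ints stand in for encoded notify4
entries, and there is no locking. The swap in notifier_prepare() models how
later patches in the series hand a filled gather spool over for sending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SPOOL_SZ 16	/* mirrors NFSD4_NOTIFY_SPOOL_SZ */

/* Hypothetical stand-in for struct nfsd4_notify_spool */
struct spool {
	int	ent[SPOOL_SZ];
	size_t	idx;
};

/* Hypothetical stand-in for struct nfsd4_cb_notify */
struct notifier {
	struct spool	*gather;	/* events are collected here */
	struct spool	*send;		/* events in flight to the client */
};

static struct spool *spool_alloc(void)
{
	return calloc(1, sizeof(struct spool));
}

/* Queue an event; returns false when the gather spool is missing or full. */
static bool notifier_add(struct notifier *n, int ev)
{
	if (!n->gather || n->gather->idx >= SPOOL_SZ)
		return false;
	n->gather->ent[n->gather->idx++] = ev;
	return true;
}

/* Swap in a fresh gather spool; the old one becomes the send spool. */
static bool notifier_prepare(struct notifier *n)
{
	struct spool *fresh;

	assert(!n->send);	/* only one send may be in flight */
	fresh = spool_alloc();
	if (!fresh)
		return false;
	n->send = n->gather;
	n->gather = fresh;
	return true;
}

/* Drop the send spool; returns true if more events are waiting. */
static bool notifier_release(struct notifier *n)
{
	free(n->send);
	n->send = NULL;
	return n->gather && n->gather->idx;
}

/* Free both spools, as the stid free routine does for a dir delegation. */
static void notifier_destroy(struct notifier *n)
{
	free(n->gather);
	free(n->send);
	n->gather = n->send = NULL;
}
```

Allocating the gather spool up front means the event path never has to
allocate; a full or missing spool is reported to the caller, which in the
series falls back to breaking the delegation.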

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4state.c | 143 +++++++++++++++++++++++++++++++++++++++++++++-------
 fs/nfsd/state.h     |  33 +++++++++++-
 2 files changed, 157 insertions(+), 19 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index ed5d6486d171ea0c886bd1f1ea1129bf4ccf429c..ebebfd6d304627d6c82bae5b84ea6c599d9e9474 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -130,6 +130,7 @@ static void free_session(struct nfsd4_session *);
 static const struct nfsd4_callback_ops nfsd4_cb_recall_ops;
 static const struct nfsd4_callback_ops nfsd4_cb_notify_lock_ops;
 static const struct nfsd4_callback_ops nfsd4_cb_getattr_ops;
+static const struct nfsd4_callback_ops nfsd4_cb_notify_ops;
 
 static struct workqueue_struct *laundry_wq;
 
@@ -1048,6 +1049,45 @@ static void nfs4_free_deleg(struct nfs4_stid *stid)
 	atomic_long_dec(&num_delegations);
 }
 
+static struct nfsd4_notify_spool *alloc_notify_spool(void)
+{
+	struct nfsd4_notify_spool *spool;
+
+	spool = kmalloc(sizeof(*spool), GFP_KERNEL);
+	if (!spool)
+		return NULL;
+
+	spool->nns_page = alloc_page(GFP_KERNEL);
+	if (!spool->nns_page) {
+		kfree(spool);
+		return NULL;
+	}
+
+	spool->nns_idx = 0;
+	spool->nns_xdr.buflen = PAGE_SIZE;
+	spool->nns_xdr.pages = &spool->nns_page;
+
+	xdr_init_encode_pages(&spool->nns_stream, &spool->nns_xdr);
+	return spool;
+}
+
+static void free_notify_spool(struct nfsd4_notify_spool *spool)
+{
+	if (spool) {
+		put_page(spool->nns_page);
+		kfree(spool);
+	}
+}
+
+static void nfs4_free_dir_deleg(struct nfs4_stid *stid)
+{
+	struct nfs4_delegation *dp = delegstateid(stid);
+
+	free_notify_spool(dp->dl_cb_notify.ncn_gather);
+	free_notify_spool(dp->dl_cb_notify.ncn_send);
+	nfs4_free_deleg(stid);
+}
+
 /*
  * When we recall a delegation, we should be careful not to hand it
  * out again straight away.
@@ -1126,29 +1166,22 @@ static void block_delegations(struct knfsd_fh *fh)
 }
 
 static struct nfs4_delegation *
-alloc_init_deleg(struct nfs4_client *clp, struct nfs4_file *fp,
-		 struct nfs4_clnt_odstate *odstate, u32 dl_type)
+__alloc_init_deleg(struct nfs4_client *clp, struct nfs4_file *fp,
+		   struct nfs4_clnt_odstate *odstate, u32 dl_type,
+		   void (*sc_free)(struct nfs4_stid *))
 {
+	struct nfs4_stid *stid = nfs4_alloc_stid(clp, deleg_slab, sc_free);
 	struct nfs4_delegation *dp;
-	struct nfs4_stid *stid;
-	long n;
 
-	dprintk("NFSD alloc_init_deleg\n");
-	n = atomic_long_inc_return(&num_delegations);
-	if (n < 0 || n > max_delegations)
-		goto out_dec;
-	if (delegation_blocked(&fp->fi_fhandle))
-		goto out_dec;
-	stid = nfs4_alloc_stid(clp, deleg_slab, nfs4_free_deleg);
 	if (stid == NULL)
-		goto out_dec;
-	dp = delegstateid(stid);
+		return NULL;
 
 	/*
 	 * delegation seqid's are never incremented.  The 4.1 special
 	 * meaning of seqid 0 isn't meaningful, really, but let's avoid
-	 * 0 anyway just for consistency and use 1:
+	 * 0 anyway just for consistency and use 1.
 	 */
+	dp = delegstateid(stid);
 	dp->dl_stid.sc_stateid.si_generation = 1;
 	INIT_LIST_HEAD(&dp->dl_perfile);
 	INIT_LIST_HEAD(&dp->dl_perclnt);
@@ -1158,19 +1191,65 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_file *fp,
 	dp->dl_type = dl_type;
 	dp->dl_retries = 1;
 	dp->dl_recalled = false;
+	get_nfs4_file(fp);
+	dp->dl_stid.sc_file = fp;
 	nfsd4_init_cb(&dp->dl_recall, dp->dl_stid.sc_client,
 		      &nfsd4_cb_recall_ops, NFSPROC4_CLNT_CB_RECALL);
+	return dp;
+}
+
+static struct nfs4_delegation *
+alloc_init_deleg(struct nfs4_client *clp, struct nfs4_file *fp,
+		 struct nfs4_clnt_odstate *odstate, u32 dl_type)
+{
+	struct nfs4_delegation *dp;
+	long n;
+
+	dprintk("NFSD alloc_init_deleg\n");
+	n = atomic_long_inc_return(&num_delegations);
+	if (n < 0 || n > max_delegations)
+		goto out_dec;
+	if (delegation_blocked(&fp->fi_fhandle))
+		goto out_dec;
+
+	dp = __alloc_init_deleg(clp, fp, odstate, dl_type, nfs4_free_deleg);
+	if (!dp)
+		goto out_dec;
+
 	nfsd4_init_cb(&dp->dl_cb_fattr.ncf_getattr, dp->dl_stid.sc_client,
 			&nfsd4_cb_getattr_ops, NFSPROC4_CLNT_CB_GETATTR);
 	dp->dl_cb_fattr.ncf_file_modified = false;
-	get_nfs4_file(fp);
-	dp->dl_stid.sc_file = fp;
 	return dp;
 out_dec:
 	atomic_long_dec(&num_delegations);
 	return NULL;
 }
 
+static struct nfs4_delegation *
+alloc_init_dir_deleg(struct nfs4_client *clp, struct nfs4_file *fp)
+{
+	struct nfs4_delegation *dp;
+	struct nfsd4_cb_notify *cbn;
+	struct nfsd4_notify_spool *ncn;
+
+	ncn = alloc_notify_spool();
+	if (!ncn)
+		return NULL;
+
+	dp = __alloc_init_deleg(clp, fp, NULL, NFS4_OPEN_DELEGATE_READ,
+				nfs4_free_dir_deleg);
+	if (!dp) {
+		free_notify_spool(ncn);
+		return NULL;
+	}
+
+	cbn = &dp->dl_cb_notify;
+	cbn->ncn_gather = ncn;
+	nfsd4_init_cb(&cbn->ncn_cb, dp->dl_stid.sc_client,
+			&nfsd4_cb_notify_ops, NFSPROC4_CLNT_CB_NOTIFY);
+	return dp;
+}
+
 void
 nfs4_put_stid(struct nfs4_stid *s)
 {
@@ -3197,6 +3276,30 @@ nfsd4_cb_getattr_release(struct nfsd4_callback *cb)
 	nfs4_put_stid(&dp->dl_stid);
 }
 
+static int
+nfsd4_cb_notify_done(struct nfsd4_callback *cb,
+				struct rpc_task *task)
+{
+	switch (task->tk_status) {
+	case -NFS4ERR_DELAY:
+		rpc_delay(task, 2 * HZ);
+		return 0;
+	default:
+		return 1;
+	}
+}
+
+static void
+nfsd4_cb_notify_release(struct nfsd4_callback *cb)
+{
+	struct nfsd4_cb_notify *ncn =
+			container_of(cb, struct nfsd4_cb_notify, ncn_cb);
+	struct nfs4_delegation *dp =
+			container_of(ncn, struct nfs4_delegation, dl_cb_notify);
+
+	nfs4_put_stid(&dp->dl_stid);
+}
+
 static const struct nfsd4_callback_ops nfsd4_cb_recall_any_ops = {
 	.done		= nfsd4_cb_recall_any_done,
 	.release	= nfsd4_cb_recall_any_release,
@@ -3209,6 +3312,12 @@ static const struct nfsd4_callback_ops nfsd4_cb_getattr_ops = {
 	.opcode		= OP_CB_GETATTR,
 };
 
+static const struct nfsd4_callback_ops nfsd4_cb_notify_ops = {
+	.done		= nfsd4_cb_notify_done,
+	.release	= nfsd4_cb_notify_release,
+	.opcode		= OP_CB_NOTIFY,
+};
+
 static void nfs4_cb_getattr(struct nfs4_cb_fattr *ncf)
 {
 	struct nfs4_delegation *dp =
@@ -9350,7 +9459,7 @@ nfsd_get_dir_deleg(struct nfsd4_compound_state *cstate,
 
 	/* Try to set up the lease */
 	status = -ENOMEM;
-	dp = alloc_init_deleg(clp, fp, NULL, NFS4_OPEN_DELEGATE_READ);
+	dp = alloc_init_dir_deleg(clp, fp);
 	if (!dp)
 		goto out_delegees;
 
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 5f21c79be032cc1334a301aad73e6bbcc8da5eb0..706bbc7076a4f1d0be3ea7067d193683821d74eb 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -188,6 +188,31 @@ struct nfs4_cb_fattr {
 	u64 ncf_cur_fsize;
 };
 
+#define NFSD4_NOTIFY_SPOOL_SZ	16
+
+/* A place to collect notifications */
+struct nfsd4_notify_spool {
+	struct xdr_stream	nns_stream;
+	struct xdr_buf		nns_xdr;
+	struct page		*nns_page;
+	struct notify4		nns_ent[NFSD4_NOTIFY_SPOOL_SZ];
+	u32			nns_idx;
+};
+
+/*
+ * Represents a directory delegation. The callback is for handling CB_NOTIFYs.
+ * As notifications from fsnotify come in, encode the relevant notify_*4 in the
+ * gather spool's nns_stream, and fill in the next nns_ent slot.
+ *
+ * Periodically, fire off a CB_NOTIFY request to the client. Move the gather
+ * spool to ncn_send, replace it with a new one, and send the request.
+ */
+struct nfsd4_cb_notify {
+	struct nfsd4_callback		ncn_cb;
+	struct nfsd4_notify_spool	*ncn_gather;
+	struct nfsd4_notify_spool	*ncn_send;
+};
+
 /*
  * Represents a delegation stateid. The nfs4_client holds references to these
  * and they are put when it is being destroyed or when the delegation is
@@ -222,8 +247,12 @@ struct nfs4_delegation {
 	struct nfsd4_callback	dl_recall;
 	bool			dl_recalled;
 
-	/* for CB_GETATTR */
-	struct nfs4_cb_fattr    dl_cb_fattr;
+	union {
+		/* for CB_GETATTR */
+		struct nfs4_cb_fattr    dl_cb_fattr;
+		/* for CB_NOTIFY */
+		struct nfsd4_cb_notify	dl_cb_notify;
+	};
 };
 
 static inline bool deleg_is_read(u32 dl_type)

-- 
2.49.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH RFC v2 21/28] fsnotify: export fsnotify_recalc_mask()
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (19 preceding siblings ...)
  2025-06-02 14:02 ` [PATCH RFC v2 20/28] nfsd: add data structures for handling CB_NOTIFY to directory delegation Jeff Layton
@ 2025-06-02 14:02 ` Jeff Layton
  2025-06-03 20:13   ` Jan Kara
  2025-06-02 14:02 ` [PATCH RFC v2 22/28] nfsd: update the fsnotify mark when setting or removing a dir delegation Jeff Layton
                   ` (6 subsequent siblings)
  27 siblings, 1 reply; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:02 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

nfsd needs to call this when new directory delegations are set or unset.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/notify/mark.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index 798340db69d761dd05c1b361c251818dee89b9cf..ff21409c3ca3ad948557225afc586da3728f7cbe 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -308,6 +308,7 @@ void fsnotify_recalc_mask(struct fsnotify_mark_connector *conn)
 	if (update_children)
 		fsnotify_conn_set_children_dentry_flags(conn);
 }
+EXPORT_SYMBOL_GPL(fsnotify_recalc_mask);
 
 /* Free all connectors queued for freeing once SRCU period ends */
 static void fsnotify_connector_destroy_workfn(struct work_struct *work)

-- 
2.49.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH RFC v2 22/28] nfsd: update the fsnotify mark when setting or removing a dir delegation
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (20 preceding siblings ...)
  2025-06-02 14:02 ` [PATCH RFC v2 21/28] fsnotify: export fsnotify_recalc_mask() Jeff Layton
@ 2025-06-02 14:02 ` Jeff Layton
  2025-06-02 14:02 ` [PATCH RFC v2 23/28] nfsd: make nfsd4_callback_ops->prepare operation bool return Jeff Layton
                   ` (5 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:02 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

Add a new helper function that will update the mask on the nfsd_file's
fsnotify_mark to be a union of all current directory delegations on an
inode. Call that when directory delegations are added or removed.
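
The mask computation can be sketched in isolation as below. This is a
userspace illustration only: the IGN_DIR_* and EV_* values are made up (the
real FL_IGN_DIR_* and FS_* constants differ), and lease_ignore_union() is an
assumption about what inode_lease_ignore_mask() returns, based on the
description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values; the real FL_IGN_DIR_* and FS_* constants differ. */
#define IGN_DIR_CREATE	(1u << 0)
#define IGN_DIR_DELETE	(1u << 1)
#define IGN_DIR_RENAME	(1u << 2)

#define EV_CREATE	(1u << 8)
#define EV_DELETE	(1u << 9)
#define EV_RENAME	(1u << 10)

/* OR together the ignore masks of every lease on the directory. */
static uint32_t lease_ignore_union(const uint32_t *lease_masks, unsigned int n)
{
	uint32_t all = 0;

	for (unsigned int i = 0; i < n; i++)
		all |= lease_masks[i];
	return all;
}

/* Translate ignored lease-break events into notification events. */
static uint32_t notify_mask_from_ignore(uint32_t ignore)
{
	uint32_t mask = 0;

	if (ignore & IGN_DIR_CREATE)
		mask |= EV_CREATE;
	if (ignore & IGN_DIR_DELETE)
		mask |= EV_DELETE;
	if (ignore & IGN_DIR_RENAME)
		mask |= EV_RENAME;
	return mask;
}

/* Store the new mask; returns nonzero when a recalculation is needed. */
static int update_mark_mask(uint32_t *mark_mask, uint32_t new_mask)
{
	assert(mark_mask);
	if (*mark_mask == new_mask)
		return 0;
	*mark_mask = new_mask;
	return 1;
}
```

Reporting whether the mask changed lets the caller skip the comparatively
expensive recalculation when a delegation is added or removed without
altering the set of events of interest.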

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4state.c | 36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index ebebfd6d304627d6c82bae5b84ea6c599d9e9474..164020a01b737f76d2780b30274e75dcc3def819 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1302,6 +1302,37 @@ static void put_deleg_file(struct nfs4_file *fp)
 		nfs4_file_put_access(fp, NFS4_SHARE_ACCESS_READ);
 }
 
+static void nfsd_fsnotify_recalc_mask(struct nfsd_file *nf)
+{
+	struct fsnotify_mark *fsn_mark = &nf->nf_mark->nfm_mark;
+	struct inode *inode = file_inode(nf->nf_file);
+	u32 lease_mask, mask = 0;
+	bool recalc = false;
+
+	/* This is only needed when adding or removing dir delegs */
+	if (!S_ISDIR(inode->i_mode))
+		return;
+
+	/* Set up notifications for any ignored delegation events */
+	lease_mask = inode_lease_ignore_mask(inode);
+	if (lease_mask & FL_IGN_DIR_CREATE)
+		mask |= FS_CREATE;
+	if (lease_mask & FL_IGN_DIR_DELETE)
+		mask |= FS_DELETE;
+	if (lease_mask & FL_IGN_DIR_RENAME)
+		mask |= FS_RENAME;
+
+	spin_lock(&fsn_mark->lock);
+	if (fsn_mark->mask != mask) {
+		fsn_mark->mask = mask;
+		recalc = true;
+	}
+	spin_unlock(&fsn_mark->lock);
+
+	if (recalc)
+		fsnotify_recalc_mask(fsn_mark->connector);
+}
+
 static void nfs4_unlock_deleg_lease(struct nfs4_delegation *dp)
 {
 	struct nfs4_file *fp = dp->dl_stid.sc_file;
@@ -1309,6 +1340,7 @@ static void nfs4_unlock_deleg_lease(struct nfs4_delegation *dp)
 
 	WARN_ON_ONCE(!fp->fi_delegees);
 
+	nfsd_fsnotify_recalc_mask(nf);
 	kernel_setlease(nf->nf_file, F_UNLCK, NULL, (void **)&dp);
 	put_deleg_file(fp);
 }
@@ -9487,8 +9519,10 @@ nfsd_get_dir_deleg(struct nfsd4_compound_state *cstate,
 	spin_unlock(&clp->cl_lock);
 	spin_unlock(&state_lock);
 
-	if (!status)
+	if (!status) {
+		nfsd_fsnotify_recalc_mask(nf);
 		return dp;
+	}
 
 	/* Something failed. Drop the lease and clean up the stid */
 	kernel_setlease(fp->fi_deleg_file->nf_file, F_UNLCK, NULL, (void **)&dp);

-- 
2.49.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH RFC v2 23/28] nfsd: make nfsd4_callback_ops->prepare operation bool return
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (21 preceding siblings ...)
  2025-06-02 14:02 ` [PATCH RFC v2 22/28] nfsd: update the fsnotify mark when setting or removing a dir delegation Jeff Layton
@ 2025-06-02 14:02 ` Jeff Layton
  2025-06-02 14:02 ` [PATCH RFC v2 24/28] nfsd: add notification handlers for dir events Jeff Layton
                   ` (4 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:02 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

For a CB_NOTIFY operation, we need to stop processing the callback
if an allocation fails. Change the ->prepare callback operation to
return true if processing should continue, and false otherwise.
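
A minimal userspace sketch of the new contract, assuming a trimmed-down ops
table. struct callback, struct callback_ops, demo_prepare() and
run_callback() are hypothetical names for illustration; only the shape of
the check follows nfsd4_run_cb_work().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct callback;

/* Hypothetical, trimmed-down analogue of struct nfsd4_callback_ops */
struct callback_ops {
	bool (*prepare)(struct callback *cb);	/* false: stop processing */
};

struct callback {
	const struct callback_ops *ops;
	bool fail_prepare;	/* knob: simulate an allocation failure */
	bool sent;
	bool destroyed;
};

/* A prepare hook that fails on demand */
static bool demo_prepare(struct callback *cb)
{
	return !cb->fail_prepare;
}

/* Run one callback; a false return from ->prepare tears it down unsent. */
static void run_callback(struct callback *cb)
{
	assert(cb);
	if (cb->ops && cb->ops->prepare && !cb->ops->prepare(cb)) {
		cb->destroyed = true;	/* stands in for nfsd41_destroy_cb() */
		return;
	}
	cb->sent = true;		/* stands in for issuing the RPC */
}
```

Existing ->prepare implementations that cannot fail simply return true, so
their behaviour is unchanged.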

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4callback.c | 5 ++++-
 fs/nfsd/nfs4layouts.c  | 3 ++-
 fs/nfsd/nfs4state.c    | 6 ++++--
 fs/nfsd/state.h        | 6 +++---
 4 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 2dca686d67fc0f0fcf7997a252b4f5988b9de6c7..fe7b20b94d76efd309e27c1a3ef359e7101dac80 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1785,7 +1785,10 @@ nfsd4_run_cb_work(struct work_struct *work)
 
 	if (!test_and_clear_bit(NFSD4_CALLBACK_REQUEUE, &cb->cb_flags)) {
 		if (cb->cb_ops && cb->cb_ops->prepare)
-			cb->cb_ops->prepare(cb);
+			if (!cb->cb_ops->prepare(cb)) {
+				nfsd41_destroy_cb(cb);
+				return;
+			}
 	}
 
 	cb->cb_msg.rpc_cred = clp->cl_cb_cred;
diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
index 290271ac424540e4405a5fd0eacc8db9f47603cd..f23699998e4c978b4af0c87cc0a959851ef5ac4b 100644
--- a/fs/nfsd/nfs4layouts.c
+++ b/fs/nfsd/nfs4layouts.c
@@ -653,7 +653,7 @@ nfsd4_cb_layout_fail(struct nfs4_layout_stateid *ls, struct nfsd_file *file)
 	}
 }
 
-static void
+static bool
 nfsd4_cb_layout_prepare(struct nfsd4_callback *cb)
 {
 	struct nfs4_layout_stateid *ls =
@@ -662,6 +662,7 @@ nfsd4_cb_layout_prepare(struct nfsd4_callback *cb)
 	mutex_lock(&ls->ls_mutex);
 	nfs4_inc_and_copy_stateid(&ls->ls_recall_sid, &ls->ls_stid);
 	mutex_unlock(&ls->ls_mutex);
+	return true;
 }
 
 static int
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 164020a01b737f76d2780b30274e75dcc3def819..5860d44fea0a4f854d65c87bcacb8eea19ce82e4 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -362,12 +362,13 @@ remove_blocked_locks(struct nfs4_lockowner *lo)
 	}
 }
 
-static void
+static bool
 nfsd4_cb_notify_lock_prepare(struct nfsd4_callback *cb)
 {
 	struct nfsd4_blocked_lock	*nbl = container_of(cb,
 						struct nfsd4_blocked_lock, nbl_cb);
 	locks_delete_block(&nbl->nbl_lock);
+	return true;
 }
 
 static int
@@ -5482,7 +5483,7 @@ bool nfsd_wait_for_delegreturn(struct svc_rqst *rqstp, struct inode *inode)
 	return timeo > 0;
 }
 
-static void nfsd4_cb_recall_prepare(struct nfsd4_callback *cb)
+static bool nfsd4_cb_recall_prepare(struct nfsd4_callback *cb)
 {
 	struct nfs4_delegation *dp = cb_to_delegation(cb);
 	struct nfsd_net *nn = net_generic(dp->dl_stid.sc_client->net,
@@ -5503,6 +5504,7 @@ static void nfsd4_cb_recall_prepare(struct nfsd4_callback *cb)
 		list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
 	}
 	spin_unlock(&state_lock);
+	return true;
 }
 
 static int nfsd4_cb_recall_done(struct nfsd4_callback *cb,
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 706bbc7076a4f1d0be3ea7067d193683821d74eb..98f87fa724ee242f3a855faa205223b0e09a16ed 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -97,9 +97,9 @@ struct nfsd4_callback {
 };
 
 struct nfsd4_callback_ops {
-	void (*prepare)(struct nfsd4_callback *);
-	int (*done)(struct nfsd4_callback *, struct rpc_task *);
-	void (*release)(struct nfsd4_callback *);
+	bool (*prepare)(struct nfsd4_callback *cb);
+	int (*done)(struct nfsd4_callback *cb, struct rpc_task *task);
+	void (*release)(struct nfsd4_callback *cb);
 	uint32_t opcode;
 };
 

-- 
2.49.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH RFC v2 24/28] nfsd: add notification handlers for dir events
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (22 preceding siblings ...)
  2025-06-02 14:02 ` [PATCH RFC v2 23/28] nfsd: make nfsd4_callback_ops->prepare operation bool return Jeff Layton
@ 2025-06-02 14:02 ` Jeff Layton
  2025-06-02 14:02 ` [PATCH RFC v2 25/28] nfsd: allow nfsd to get a dir lease with an ignore mask Jeff Layton
                   ` (3 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:02 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

Add the necessary parts to accept an fsnotify callback for a directory
change event and create a CB_NOTIFY request for it. When a dir nfsd_file
is created, set a handle_event callback to handle the notification. Use
that to marshal the event into the notifylist4 buffer, and kick off the
callback workqueue job to handle the send.
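
The "kick off" step has to avoid queueing the same callback twice while one
is already pending. A minimal userspace sketch of that guard follows,
assuming C11 atomics as stand-ins for the kernel's test_and_set_bit() and
refcount_inc_not_zero(). struct notify_cb, ref_get_not_zero() and
kick_notify() are hypothetical names, and a counter stands in for
nfsd4_run_cb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the delegation plus its CB_NOTIFY callback */
struct notify_cb {
	atomic_flag	running;	/* models NFSD4_CALLBACK_RUNNING */
	atomic_int	refcount;	/* models dl_stid.sc_count */
	int		queued;		/* how many times the job was queued */
};

/* Take a reference unless the count has already dropped to zero. */
static bool ref_get_not_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* on failure, old is reloaded with the current value */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Queue the callback at most once; returns true if this call queued it. */
static bool kick_notify(struct notify_cb *cb)
{
	assert(cb != NULL);

	if (atomic_flag_test_and_set(&cb->running))
		return false;		/* already queued or running */

	if (!ref_get_not_zero(&cb->refcount)) {
		/* object is going away: undo the claim and give up */
		atomic_flag_clear(&cb->running);
		return false;
	}
	cb->queued++;			/* stands in for nfsd4_run_cb() */
	return true;
}
```

Claiming the flag before taking the reference keeps the common case (a
callback already pending) down to a single atomic operation per event.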

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/filecache.c    |  51 +++++++++++++----
 fs/nfsd/nfs4callback.c |  19 +++++--
 fs/nfsd/nfs4state.c    | 152 +++++++++++++++++++++++++++++++++++++++++++++++++
 fs/nfsd/state.h        |   2 +
 4 files changed, 207 insertions(+), 17 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 3468883146afc080d2b4862e6002b2c6ff7315b9..6cd4cfa0b46bf33c4134987a12e42c8455fc4879 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -72,6 +72,7 @@ static struct kmem_cache		*nfsd_file_mark_slab;
 static struct list_lru			nfsd_file_lru;
 static unsigned long			nfsd_file_flags;
 static struct fsnotify_group		*nfsd_file_fsnotify_group;
+static struct fsnotify_group		*nfsd_dir_fsnotify_group;
 static struct delayed_work		nfsd_filecache_laundrette;
 static struct rhltable			nfsd_file_rhltable
 						____cacheline_aligned_in_smp;
@@ -147,7 +148,7 @@ static void
 nfsd_file_mark_put(struct nfsd_file_mark *nfm)
 {
 	if (refcount_dec_and_test(&nfm->nfm_ref)) {
-		fsnotify_destroy_mark(&nfm->nfm_mark, nfsd_file_fsnotify_group);
+		fsnotify_destroy_mark(&nfm->nfm_mark, nfm->nfm_mark.group);
 		fsnotify_put_mark(&nfm->nfm_mark);
 	}
 }
@@ -155,35 +156,37 @@ nfsd_file_mark_put(struct nfsd_file_mark *nfm)
 static struct nfsd_file_mark *
 nfsd_file_mark_find_or_create(struct inode *inode)
 {
-	int			err;
-	struct fsnotify_mark	*mark;
 	struct nfsd_file_mark	*nfm = NULL, *new;
+	struct fsnotify_group	*group;
+	struct fsnotify_mark	*mark;
+	int			err;
+
+	group = S_ISDIR(inode->i_mode) ? nfsd_dir_fsnotify_group : nfsd_file_fsnotify_group;
 
 	do {
-		fsnotify_group_lock(nfsd_file_fsnotify_group);
-		mark = fsnotify_find_inode_mark(inode,
-						nfsd_file_fsnotify_group);
+		fsnotify_group_lock(group);
+		mark = fsnotify_find_inode_mark(inode, group);
 		if (mark) {
 			nfm = nfsd_file_mark_get(container_of(mark,
 						 struct nfsd_file_mark,
 						 nfm_mark));
-			fsnotify_group_unlock(nfsd_file_fsnotify_group);
+			fsnotify_group_unlock(group);
 			if (nfm) {
 				fsnotify_put_mark(mark);
 				break;
 			}
 			/* Avoid soft lockup race with nfsd_file_mark_put() */
-			fsnotify_destroy_mark(mark, nfsd_file_fsnotify_group);
+			fsnotify_destroy_mark(mark, group);
 			fsnotify_put_mark(mark);
 		} else {
-			fsnotify_group_unlock(nfsd_file_fsnotify_group);
+			fsnotify_group_unlock(group);
 		}
 
 		/* allocate a new nfm */
 		new = kmem_cache_alloc(nfsd_file_mark_slab, GFP_KERNEL);
 		if (!new)
 			return NULL;
-		fsnotify_init_mark(&new->nfm_mark, nfsd_file_fsnotify_group);
+		fsnotify_init_mark(&new->nfm_mark, group);
 		new->nfm_mark.mask = FS_ATTRIB|FS_DELETE_SELF;
 		refcount_set(&new->nfm_ref, 1);
 
@@ -758,12 +761,25 @@ nfsd_file_fsnotify_handle_event(struct fsnotify_mark *mark, u32 mask,
 	return 0;
 }
 
+static int
+nfsd_dir_fsnotify_handle_event(struct fsnotify_group *group, u32 mask,
+			       const void *data, int data_type, struct inode *dir,
+			       const struct qstr *name, u32 cookie,
+			       struct fsnotify_iter_info *iter_info)
+{
+	return nfsd_handle_dir_event(mask, dir, data, data_type, name);
+}
 
 static const struct fsnotify_ops nfsd_file_fsnotify_ops = {
 	.handle_inode_event = nfsd_file_fsnotify_handle_event,
 	.free_mark = nfsd_file_mark_free,
 };
 
+static const struct fsnotify_ops nfsd_dir_fsnotify_ops = {
+	.handle_event = nfsd_dir_fsnotify_handle_event,
+	.free_mark = nfsd_file_mark_free,
+};
+
 int
 nfsd_file_cache_init(void)
 {
@@ -815,8 +831,7 @@ nfsd_file_cache_init(void)
 		goto out_shrinker;
 	}
 
-	nfsd_file_fsnotify_group = fsnotify_alloc_group(&nfsd_file_fsnotify_ops,
-							0);
+	nfsd_file_fsnotify_group = fsnotify_alloc_group(&nfsd_file_fsnotify_ops, 0);
 	if (IS_ERR(nfsd_file_fsnotify_group)) {
 		pr_err("nfsd: unable to create fsnotify group: %ld\n",
 			PTR_ERR(nfsd_file_fsnotify_group));
@@ -825,11 +840,23 @@ nfsd_file_cache_init(void)
 		goto out_notifier;
 	}
 
+	nfsd_dir_fsnotify_group = fsnotify_alloc_group(&nfsd_dir_fsnotify_ops, 0);
+	if (IS_ERR(nfsd_dir_fsnotify_group)) {
+		pr_err("nfsd: unable to create fsnotify group: %ld\n",
+			PTR_ERR(nfsd_dir_fsnotify_group));
+		ret = PTR_ERR(nfsd_dir_fsnotify_group);
+		nfsd_dir_fsnotify_group = NULL;
+		goto out_notify_group;
+	}
+
 	INIT_DELAYED_WORK(&nfsd_filecache_laundrette, nfsd_file_gc_worker);
 out:
 	if (ret)
 		clear_bit(NFSD_FILE_CACHE_UP, &nfsd_file_flags);
 	return ret;
+out_notify_group:
+	fsnotify_put_group(nfsd_file_fsnotify_group);
+	nfsd_file_fsnotify_group = NULL;
 out_notifier:
 	lease_unregister_notifier(&nfsd_file_lease_notifier);
 out_shrinker:
diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index fe7b20b94d76efd309e27c1a3ef359e7101dac80..69cea84eceabe15b4e1e1aa31db601ad763b00ac 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -870,21 +870,30 @@ static void nfs4_xdr_enc_cb_notify(struct rpc_rqst *req,
 				   const void *data)
 {
 	const struct nfsd4_callback *cb = data;
+	struct nfsd4_cb_notify *ncn = container_of(cb, struct nfsd4_cb_notify, ncn_cb);
+	struct nfs4_delegation *dp = container_of(ncn, struct nfs4_delegation, dl_cb_notify);
 	struct nfs4_cb_compound_hdr hdr = {
 		.ident = 0,
 		.minorversion = cb->cb_clp->cl_minorversion,
 	};
-	struct CB_NOTIFY4args args = { };
+	struct CB_NOTIFY4args args;
+	__be32 *p;
 
 	WARN_ON_ONCE(hdr.minorversion == 0);
 
 	encode_cb_compound4args(xdr, &hdr);
 	encode_cb_sequence4args(xdr, cb, &hdr);
 
-	/*
-	 * FIXME: get stateid and fh from delegation. Inline the cna_changes
-	 * buffer, and zero it.
-	 */
+	p = xdr_reserve_space(xdr, 4);
+	*p = cpu_to_be32(OP_CB_NOTIFY);
+
+	args.cna_stateid.seqid = dp->dl_stid.sc_stateid.si_generation;
+	memcpy(&args.cna_stateid.other, &dp->dl_stid.sc_stateid.si_opaque,
+	       ARRAY_SIZE(args.cna_stateid.other));
+	args.cna_fh.len = dp->dl_stid.sc_file->fi_fhandle.fh_size;
+	args.cna_fh.data = dp->dl_stid.sc_file->fi_fhandle.fh_raw;
+	args.cna_changes.count = ncn->ncn_send->nns_idx;
+	args.cna_changes.element = ncn->ncn_send->nns_ent;
 	WARN_ON_ONCE(!xdrgen_encode_CB_NOTIFY4args(xdr, &args));
 
 	hdr.nops++;
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 5860d44fea0a4f854d65c87bcacb8eea19ce82e4..35b9e35f8b507cc9b3924fead3037433cd8f9371 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -55,6 +55,7 @@
 #include "netns.h"
 #include "pnfs.h"
 #include "filecache.h"
+#include "nfs4xdr_gen.h"
 #include "trace.h"
 
 #define NFSDDBG_FACILITY                NFSDDBG_PROC
@@ -3309,15 +3310,83 @@ nfsd4_cb_getattr_release(struct nfsd4_callback *cb)
 	nfs4_put_stid(&dp->dl_stid);
 }
 
+static bool
+nfsd4_cb_notify_prepare(struct nfsd4_callback *cb)
+{
+	struct nfsd4_cb_notify *ncn =
+			container_of(cb, struct nfsd4_cb_notify, ncn_cb);
+	struct nfs4_delegation *dp =
+			container_of(ncn, struct nfs4_delegation, dl_cb_notify);
+	struct nfs4_file *fp = dp->dl_stid.sc_file;
+	struct nfsd_file *nf = fp->fi_deleg_file;
+	struct inode *inode = file_inode(nf->nf_file);
+	struct file_lock_context *flc = locks_inode_context(inode);
+	struct nfsd4_notify_spool *spool;
+
+	if (WARN_ON_ONCE(!flc))
+		return false;
+
+	if (WARN_ON_ONCE(ncn->ncn_send))
+		return false;
+
+	spool = alloc_notify_spool();
+	if (!spool) {
+		nfsd4_run_cb(&dp->dl_recall);
+		return false;
+	}
+
+	spin_lock(&flc->flc_lock);
+	ncn->ncn_send = ncn->ncn_gather;
+	ncn->ncn_gather = spool;
+	spin_unlock(&flc->flc_lock);
+	return true;
+}
+
+/* Returns true if more notifications are waiting to be sent */
+static bool
+nfsd4_cb_notify_release_send_spool(struct nfsd4_callback *cb)
+{
+	struct nfsd4_cb_notify *ncn = container_of(cb, struct nfsd4_cb_notify, ncn_cb);
+	struct nfs4_delegation *dp = container_of(ncn, struct nfs4_delegation, dl_cb_notify);
+	struct nfs4_file *fp = dp->dl_stid.sc_file;
+	struct nfsd_file *nf = fp->fi_deleg_file;
+	struct inode *inode = file_inode(nf->nf_file);
+	struct file_lock_context *flc = locks_inode_context(inode);
+	struct nfsd4_notify_spool *spool;
+	bool more;
+
+	spin_lock(&flc->flc_lock);
+	spool = ncn->ncn_send;
+	ncn->ncn_send = NULL;
+	more = ncn->ncn_gather && ncn->ncn_gather->nns_idx;
+	spin_unlock(&flc->flc_lock);
+
+	free_notify_spool(spool);
+	return more;
+}
+
 static int
 nfsd4_cb_notify_done(struct nfsd4_callback *cb,
 				struct rpc_task *task)
 {
+	struct nfsd4_cb_notify *ncn = container_of(cb, struct nfsd4_cb_notify, ncn_cb);
+	struct nfs4_delegation *dp = container_of(ncn, struct nfs4_delegation, dl_cb_notify);
+
 	switch (task->tk_status) {
 	case -NFS4ERR_DELAY:
 		rpc_delay(task, 2 * HZ);
 		return 0;
+	case 0:
+		/* If successful, release the send spool and maybe requeue the cb */
+		if (nfsd4_cb_notify_release_send_spool(cb)) {
+			refcount_inc(&dp->dl_stid.sc_count);
+			nfsd4_run_cb(cb);
+		}
+		return 1;
 	default:
+		/* For any other hard error, recall the deleg */
+		nfsd4_run_cb(&dp->dl_recall);
+		nfsd4_cb_notify_release_send_spool(cb);
 		return 1;
 	}
 }
@@ -3331,6 +3400,8 @@ nfsd4_cb_notify_release(struct nfsd4_callback *cb)
 			container_of(ncn, struct nfs4_delegation, dl_cb_notify);
 
 	nfs4_put_stid(&dp->dl_stid);
+	if (nfsd4_cb_notify_release_send_spool(cb))
+		nfsd4_run_cb(cb);
 }
 
 static const struct nfsd4_callback_ops nfsd4_cb_recall_any_ops = {
@@ -3346,6 +3417,7 @@ static const struct nfsd4_callback_ops nfsd4_cb_getattr_ops = {
 };
 
 static const struct nfsd4_callback_ops nfsd4_cb_notify_ops = {
+	.prepare	= nfsd4_cb_notify_prepare,
 	.done		= nfsd4_cb_notify_done,
 	.release	= nfsd4_cb_notify_release,
 	.opcode		= OP_CB_NOTIFY,
@@ -9534,3 +9606,83 @@ nfsd_get_dir_deleg(struct nfsd4_compound_state *cstate,
 	put_deleg_file(fp);
 	return ERR_PTR(status);
 }
+
+static void
+nfsd4_run_cb_notify(struct nfsd4_cb_notify *ncn)
+{
+	struct nfs4_delegation *dp = container_of(ncn, struct nfs4_delegation, dl_cb_notify);
+
+	if (test_and_set_bit(NFSD4_CALLBACK_RUNNING, &ncn->ncn_cb.cb_flags))
+		return;
+
+	if (!refcount_inc_not_zero(&dp->dl_stid.sc_count))
+		clear_bit(NFSD4_CALLBACK_RUNNING, &ncn->ncn_cb.cb_flags);
+	else
+		nfsd4_run_cb(&ncn->ncn_cb);
+}
+
+int
+nfsd_handle_dir_event(u32 mask, const struct inode *dir, const void *data,
+		      int data_type, const struct qstr *name)
+{
+	struct file_lock_context *ctx;
+	struct file_lock_core *flc;
+
+	ctx = locks_inode_context(dir);
+	if (!ctx || list_empty(&ctx->flc_lease))
+		return 0;
+
+	/*
+	 * FIXME: Do getattr against @inode, and then generate an fattr4. Use that as the
+	 * ne_attrs in the notify_entry4's.
+	 */
+	spin_lock(&ctx->flc_lock);
+	list_for_each_entry(flc, &ctx->flc_lease, flc_list) {
+		struct file_lease *fl = container_of(flc, struct file_lease, c);
+		struct nfs4_delegation *dp = flc->flc_owner;
+		struct nfsd4_cb_notify *ncn = &dp->dl_cb_notify;
+		struct nfsd4_notify_spool *nns = ncn->ncn_gather;
+		struct xdr_stream *stream = &nns->nns_stream;
+		static uint32_t zerobm;
+
+		if (fl->fl_lmops != &nfsd_dir_lease_mng_ops)
+			continue;
+
+		/* If no buffer or slots are available, give up and break the deleg */
+		if (!nns || nns->nns_idx >= NFSD4_NOTIFY_SPOOL_SZ) {
+			nfsd_break_deleg_cb(fl);
+			continue;
+		}
+
+		if (mask & FS_DELETE) {
+			static uint32_t notify_remove_bitmap = BIT(NOTIFY4_REMOVE_ENTRY);
+			struct notify4 *ent = &nns->nns_ent[nns->nns_idx];
+			struct notify_remove4 nr = { };
+			u8 *p = (u8 *)(stream->p);
+
+			if (!(flc->flc_flags & FL_IGN_DIR_DELETE))
+				continue;
+
+			nr.nrm_old_entry.ne_file.len = name->len;
+			nr.nrm_old_entry.ne_file.data = (char *)name->name;
+			nr.nrm_old_entry.ne_attrs.attrmask.count = 1;
+			nr.nrm_old_entry.ne_attrs.attrmask.element = &zerobm;
+			if (!xdrgen_encode_notify_remove4(stream, &nr)) {
+				pr_warn("nfsd: unable to marshal notify_remove4 to xdr stream\n");
+				continue;
+			}
+
+			/* grab a notify4 in the buffer and set it up */
+			ent->notify_mask.count = 1;
+			ent->notify_mask.element = &notify_remove_bitmap;
+			ent->notify_vals.len = (u8 *)stream->p - p;
+			ent->notify_vals.data = p;
+			++nns->nns_idx;
+		}
+
+		if (nns->nns_idx)
+			nfsd4_run_cb_notify(ncn);
+	}
+	spin_unlock(&ctx->flc_lock);
+	return 0;
+}
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 98f87fa724ee242f3a855faa205223b0e09a16ed..345fa6325fde0435f811050625457a0d3cc29f3b 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -845,6 +845,8 @@ bool nfsd4_has_active_async_copies(struct nfs4_client *clp);
 extern struct nfs4_client_reclaim *nfs4_client_to_reclaim(struct xdr_netobj name,
 				struct xdr_netobj princhash, struct nfsd_net *nn);
 extern bool nfs4_has_reclaimed_state(struct xdr_netobj name, struct nfsd_net *nn);
+int nfsd_handle_dir_event(u32 mask, const struct inode *dir, const void *data,
+			  int data_type, const struct qstr *name);
 
 void put_nfs4_file(struct nfs4_file *fi);
 extern void nfs4_put_cpntf_state(struct nfsd_net *nn,

-- 
2.49.0
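
A note for readers following the spool handling in nfsd_handle_dir_event():
the handler queues each event into a fixed number of slots and falls back to
recalling the delegation once no slot is free. Below is a minimal user-space
sketch of that policy only. SPOOL_SZ, struct spool and spool_queue() are
made-up names for illustration and are not part of the patch, which uses
NFSD4_NOTIFY_SPOOL_SZ and nfsd_break_deleg_cb(); the "broken" flag simply
stands in for the recall.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capacity; the patch uses NFSD4_NOTIFY_SPOOL_SZ. */
#define SPOOL_SZ 4

struct spool {
	size_t idx;                  /* next free slot */
	const char *names[SPOOL_SZ]; /* queued entry names */
	bool broken;                 /* stands in for "delegation recalled" */
};

/*
 * Queue one event. Returns true if it was queued, false if no slot was
 * free and the (modelled) delegation had to be recalled instead.
 */
bool spool_queue(struct spool *s, const char *name)
{
	if (s->idx >= SPOOL_SZ) {
		s->broken = true;
		return false;
	}
	s->names[s->idx++] = name;
	return true;
}
```

The point of the fallback is that a client holding the delegation must never
miss a change: if the server cannot describe an event, recalling the
delegation is the safe answer.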



* [PATCH RFC v2 25/28] nfsd: allow nfsd to get a dir lease with an ignore mask
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (23 preceding siblings ...)
  2025-06-02 14:02 ` [PATCH RFC v2 24/28] nfsd: add notification handlers for dir events Jeff Layton
@ 2025-06-02 14:02 ` Jeff Layton
  2025-06-02 14:02 ` [PATCH RFC v2 26/28] nfsd: add a tracepoint for nfsd_file_fsnotify_handle_dir_event() Jeff Layton
                   ` (2 subsequent siblings)
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:02 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

When requesting a directory lease, enable the FL_IGN_DIR_* bits that
correspond to the requested notification types.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4proc.c  |  3 +++
 fs/nfsd/nfs4state.c | 27 +++++++++++++++++++++------
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index fa6f2980bcacd798c41387c71d55a59fdbc8043c..77b6d0363b9f4cfea96f3f1abd3e462fd2a77754 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -2292,6 +2292,8 @@ nfsd4_verify(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	return status == nfserr_same ? nfs_ok : status;
 }
 
+#define SUPPORTED_NOTIFY_MASK BIT(NOTIFY4_REMOVE_ENTRY)
+
 static __be32
 nfsd4_get_dir_delegation(struct svc_rqst *rqstp,
 			 struct nfsd4_compound_state *cstate,
@@ -2330,6 +2332,7 @@ nfsd4_get_dir_delegation(struct svc_rqst *rqstp,
 
 	gdd->gddrnf_status = GDD4_OK;
 	memcpy(&gdd->gddr_stateid, &dd->dl_stid.sc_stateid, sizeof(gdd->gddr_stateid));
+	gdd->gddr_notification[0] = gdd->gdda_notification_types[0] & SUPPORTED_NOTIFY_MASK;
 	nfs4_put_stid(&dd->dl_stid);
 	return nfs_ok;
 }
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 35b9e35f8b507cc9b3924fead3037433cd8f9371..a75179ffa6006868bae3931263830d7b7e1a8882 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -6075,14 +6075,14 @@ static bool nfsd4_cb_channel_good(struct nfs4_client *clp)
 	return clp->cl_minorversion && clp->cl_cb_state == NFSD4_CB_UNKNOWN;
 }
 
-static struct file_lease *nfs4_alloc_init_lease(struct nfs4_delegation *dp)
+static struct file_lease *nfs4_alloc_init_lease(struct nfs4_delegation *dp, unsigned int ignore)
 {
 	struct file_lease *fl;
 
 	fl = locks_alloc_lease();
 	if (!fl)
 		return NULL;
-	fl->c.flc_flags = FL_DELEG;
+	fl->c.flc_flags = FL_DELEG | ignore;
 	fl->c.flc_type = deleg_is_read(dp->dl_type) ? F_RDLCK : F_WRLCK;
 	fl->c.flc_owner = (fl_owner_t)dp;
 	fl->c.flc_pid = current->tgid;
@@ -6299,7 +6299,7 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
 	if (!dp)
 		goto out_delegees;
 
-	fl = nfs4_alloc_init_lease(dp);
+	fl = nfs4_alloc_init_lease(dp, 0);
 	if (!fl)
 		goto out_clnt_odstate;
 
@@ -9523,6 +9523,21 @@ nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct dentry *dentry,
 	return status;
 }
 
+static unsigned int
+nfsd_notify_to_ignore_mask(u32 notify)
+{
+	unsigned int mask = 0;
+
+	if (notify & BIT(NOTIFY4_REMOVE_ENTRY))
+		mask |= FL_IGN_DIR_DELETE;
+	if (notify & BIT(NOTIFY4_ADD_ENTRY))
+		mask |= FL_IGN_DIR_CREATE;
+	if (notify & BIT(NOTIFY4_RENAME_ENTRY))
+		mask |= FL_IGN_DIR_RENAME;
+
+	return mask;
+}
+
 /**
  * nfsd_get_dir_deleg - attempt to get a directory delegation
  * @cstate: compound state
@@ -9569,12 +9584,12 @@ nfsd_get_dir_deleg(struct nfsd4_compound_state *cstate,
 	if (!dp)
 		goto out_delegees;
 
-	fl = nfs4_alloc_init_lease(dp);
+	fl = nfs4_alloc_init_lease(dp,
+			nfsd_notify_to_ignore_mask(gdd->gdda_notification_types[0]));
 	if (!fl)
 		goto out_put_stid;
 
-	status = kernel_setlease(nf->nf_file,
-				 fl->c.flc_type, &fl, NULL);
+	status = kernel_setlease(nf->nf_file, fl->c.flc_type, &fl, NULL);
 	if (fl)
 		locks_free_lease(fl);
 	if (status)

-- 
2.49.0
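
A rough user-space sketch of the mapping this patch adds, for illustration
only. The notify_type4 numbers are the ones listed in RFC 8881; the IGN_*
values are placeholders (the real FL_IGN_DIR_* bits are defined earlier in
the series), and granted_types() and notify_to_ignore() are hypothetical
names. The sketch also assumes the requested types are masked with the
supported set before being turned into ignore flags, which is one possible
ordering and not necessarily what the patch does.

```c
#include <stdint.h>

/* User-space stand-in for the kernel's BIT() macro. */
#define BIT(n) (1U << (n))

/* notify_type4 values as listed in RFC 8881. */
enum {
	NOTIFY4_REMOVE_ENTRY = 2,
	NOTIFY4_ADD_ENTRY = 3,
	NOTIFY4_RENAME_ENTRY = 4,
};

/* Placeholder values; not the kernel's FL_IGN_DIR_* definitions. */
#define IGN_DELETE 0x1u
#define IGN_CREATE 0x2u
#define IGN_RENAME 0x4u

/* Notification types the server will grant: requested AND supported. */
uint32_t granted_types(uint32_t requested, uint32_t supported)
{
	return requested & supported;
}

/* Translate a notification bitmap into lease "ignore" flags. */
unsigned int notify_to_ignore(uint32_t notify)
{
	unsigned int mask = 0;

	if (notify & BIT(NOTIFY4_REMOVE_ENTRY))
		mask |= IGN_DELETE;
	if (notify & BIT(NOTIFY4_ADD_ENTRY))
		mask |= IGN_CREATE;
	if (notify & BIT(NOTIFY4_RENAME_ENTRY))
		mask |= IGN_RENAME;
	return mask;
}
```

Masking first keeps the lease from ignoring an event type that the server
has no way to report to the client.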



* [PATCH RFC v2 26/28] nfsd: add a tracepoint for nfsd_file_fsnotify_handle_dir_event()
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (24 preceding siblings ...)
  2025-06-02 14:02 ` [PATCH RFC v2 25/28] nfsd: allow nfsd to get a dir lease with an ignore mask Jeff Layton
@ 2025-06-02 14:02 ` Jeff Layton
  2025-06-02 14:02 ` [PATCH RFC v2 27/28] nfsd: add support for NOTIFY4_ADD_ENTRY events Jeff Layton
  2025-06-02 14:02 ` [PATCH RFC v2 28/28] nfsd: add support for NOTIFY4_RENAME_ENTRY events Jeff Layton
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:02 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

Repurpose the existing nfsd_file_fsnotify_handle_event() tracepoint as a
class and call it from the dir notification codepath. Add info about the
dir to it.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/filecache.c |  2 +-
 fs/nfsd/nfs4state.c |  3 +++
 fs/nfsd/trace.h     | 25 ++++++++++++++++++-------
 3 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 6cd4cfa0b46bf33c4134987a12e42c8455fc4879..ba72470b870cd0e266ba7fac8174a1a249a840e8 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -743,7 +743,7 @@ nfsd_file_fsnotify_handle_event(struct fsnotify_mark *mark, u32 mask,
 	if (WARN_ON_ONCE(!inode))
 		return 0;
 
-	trace_nfsd_file_fsnotify_handle_event(inode, mask);
+	trace_nfsd_file_fsnotify_handle_event(inode, dir, mask);
 
 	/* Should be no marks on non-regular files */
 	if (!S_ISREG(inode->i_mode)) {
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index a75179ffa6006868bae3931263830d7b7e1a8882..a610a90d119a771771cdb60ce3ee4ab3604cb8a3 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -9640,9 +9640,12 @@ int
 nfsd_handle_dir_event(u32 mask, const struct inode *dir, const void *data,
 		      int data_type, const struct qstr *name)
 {
+	struct inode *inode = fsnotify_data_inode(data, data_type);
 	struct file_lock_context *ctx;
 	struct file_lock_core *flc;
 
+	trace_nfsd_file_fsnotify_handle_dir_event(inode, dir, mask);
+
 	ctx = locks_inode_context(dir);
 	if (!ctx || list_empty(&ctx->flc_lease))
 		return 0;
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 0c68df50eae248c7c9afe0437dfcf29837e09275..968e13a721942c051448f21af2f13849511b7c6a 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -1293,25 +1293,36 @@ TRACE_EVENT(nfsd_file_is_cached,
 	)
 );
 
-TRACE_EVENT(nfsd_file_fsnotify_handle_event,
-	TP_PROTO(struct inode *inode, u32 mask),
-	TP_ARGS(inode, mask),
+DECLARE_EVENT_CLASS(nfsd_file_fsnotify_handle_event_class,
+	TP_PROTO(const struct inode *inode, const struct inode *dir, u32 mask),
+	TP_ARGS(inode, dir, mask),
 	TP_STRUCT__entry(
-		__field(struct inode *, inode)
+		__field(ino_t, ino)
+		__field(ino_t, dir)
 		__field(unsigned int, nlink)
 		__field(umode_t, mode)
 		__field(u32, mask)
 	),
 	TP_fast_assign(
-		__entry->inode = inode;
+		__entry->ino = inode->i_ino;
+		__entry->dir = dir ? dir->i_ino : 0;
 		__entry->nlink = inode->i_nlink;
 		__entry->mode = inode->i_mode;
 		__entry->mask = mask;
 	),
-	TP_printk("inode=%p nlink=%u mode=0%ho mask=0x%x", __entry->inode,
-			__entry->nlink, __entry->mode, __entry->mask)
+	TP_printk("dir=%lu inode=%lu nlink=%u mode=0%ho mask=0x%x",
+		  __entry->dir, __entry->ino, __entry->nlink,
+		  __entry->mode, __entry->mask)
 );
 
+#define DEFINE_NFSD_FSNOTIFY_HANDLE_EVENT(name)					\
+DEFINE_EVENT(nfsd_file_fsnotify_handle_event_class, name,			\
+	TP_PROTO(const struct inode *inode, const struct inode *dir, u32 mask),	\
+	TP_ARGS(inode, dir, mask))
+
+DEFINE_NFSD_FSNOTIFY_HANDLE_EVENT(nfsd_file_fsnotify_handle_event);
+DEFINE_NFSD_FSNOTIFY_HANDLE_EVENT(nfsd_file_fsnotify_handle_dir_event);
+
 DECLARE_EVENT_CLASS(nfsd_file_gc_class,
 	TP_PROTO(
 		const struct nfsd_file *nf

-- 
2.49.0



* [PATCH RFC v2 27/28] nfsd: add support for NOTIFY4_ADD_ENTRY events
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (25 preceding siblings ...)
  2025-06-02 14:02 ` [PATCH RFC v2 26/28] nfsd: add a tracepoint for nfsd_file_fsnotify_handle_dir_event() Jeff Layton
@ 2025-06-02 14:02 ` Jeff Layton
  2025-06-02 14:02 ` [PATCH RFC v2 28/28] nfsd: add support for NOTIFY4_RENAME_ENTRY events Jeff Layton
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:02 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

Add support for handling NOTIFY4_ADD_ENTRY events. When a notification
comes in, marshal the event to the notify_spool and kick off the
callback.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4proc.c  |  2 +-
 fs/nfsd/nfs4state.c | 25 +++++++++++++++++++++++++
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 77b6d0363b9f4cfea96f3f1abd3e462fd2a77754..a2996343fa0db33e014731f62aaa4e7c72506a76 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -2292,7 +2292,7 @@ nfsd4_verify(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	return status == nfserr_same ? nfs_ok : status;
 }
 
-#define SUPPORTED_NOTIFY_MASK BIT(NOTIFY4_REMOVE_ENTRY)
+#define SUPPORTED_NOTIFY_MASK (BIT(NOTIFY4_REMOVE_ENTRY) | BIT(NOTIFY4_ADD_ENTRY))
 
 static __be32
 nfsd4_get_dir_delegation(struct svc_rqst *rqstp,
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index a610a90d119a771771cdb60ce3ee4ab3604cb8a3..9dc607e355d5839d80946d4983205c15ece6a71e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -9697,6 +9697,31 @@ nfsd_handle_dir_event(u32 mask, const struct inode *dir, const void *data,
 			ent->notify_vals.data = p;
 			++nns->nns_idx;
 		}
+		if (mask & FS_CREATE) {
+			static uint32_t notify_add_bitmap = BIT(NOTIFY4_ADD_ENTRY);
+			struct notify4 *ent = &nns->nns_ent[nns->nns_idx];
+			struct notify_add4 na = { };
+			u8 *p = (u8 *)(stream->p);
+
+			if (!(flc->flc_flags & FL_IGN_DIR_CREATE))
+				continue;
+
+			na.nad_new_entry.ne_file.len = name->len;
+			na.nad_new_entry.ne_file.data = (char *)name->name;
+			na.nad_new_entry.ne_attrs.attrmask.count = 1;
+			na.nad_new_entry.ne_attrs.attrmask.element = &zerobm;
+			if (!xdrgen_encode_notify_add4(stream, &na)) {
+				pr_warn("nfsd: unable to marshal notify_add4 to xdr stream\n");
+				continue;
+			}
+
+			/* grab a notify4 in the buffer and set it up */
+			ent->notify_mask.count = 1;
+			ent->notify_mask.element = &notify_add_bitmap;
+			ent->notify_vals.len = (u8 *)stream->p - p;
+			ent->notify_vals.data = p;
+			++nns->nns_idx;
+		}
 
 		if (nns->nns_idx)
 			nfsd4_run_cb_notify(ncn);

-- 
2.49.0
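
A side note on building a bitmap from several notify_type4 values: each
value has to go through BIT() on its own. OR-ing the enum values first and
then applying BIT() yields a single, unrelated bit. The user-space sketch
below shows the difference; BIT() here is a stand-in for the kernel macro,
the enum numbers are the ones listed in RFC 8881, and the two function names
are made up for the example.

```c
#include <stdint.h>

/* User-space stand-in for the kernel's BIT() macro. */
#define BIT(n) (1U << (n))

/* notify_type4 values as listed in RFC 8881. */
enum notify_type4 {
	NOTIFY4_REMOVE_ENTRY = 2,
	NOTIFY4_ADD_ENTRY = 3,
	NOTIFY4_RENAME_ENTRY = 4,
};

/* OR the enum values first: 2 | 3 | 4 == 7, so only bit 7 is set. */
uint32_t mask_from_ored_values(void)
{
	return BIT(NOTIFY4_REMOVE_ENTRY | NOTIFY4_ADD_ENTRY | NOTIFY4_RENAME_ENTRY);
}

/* OR the individual bits: bits 2, 3 and 4 are all set. */
uint32_t mask_from_ored_bits(void)
{
	return BIT(NOTIFY4_REMOVE_ENTRY) | BIT(NOTIFY4_ADD_ENTRY) |
	       BIT(NOTIFY4_RENAME_ENTRY);
}
```

With the first form, a client asking for remove notifications would get
none granted, since bit 2 is not in the resulting mask.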



* [PATCH RFC v2 28/28] nfsd: add support for NOTIFY4_RENAME_ENTRY events
  2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
                   ` (26 preceding siblings ...)
  2025-06-02 14:02 ` [PATCH RFC v2 27/28] nfsd: add support for NOTIFY4_ADD_ENTRY events Jeff Layton
@ 2025-06-02 14:02 ` Jeff Layton
  27 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-02 14:02 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, linux-nfs, linux-cifs,
	samba-technical, linux-doc, Jeff Layton

Add support for RENAME events. Marshal the event into the notifylist4
buffer and kick the callback handler.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4proc.c  |  2 +-
 fs/nfsd/nfs4state.c | 39 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index a2996343fa0db33e014731f62aaa4e7c72506a76..4573c0651aa49df6089bcc4e5d40f45d46b1c499 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -2292,7 +2292,7 @@ nfsd4_verify(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	return status == nfserr_same ? nfs_ok : status;
 }
 
-#define SUPPORTED_NOTIFY_MASK (BIT(NOTIFY4_REMOVE_ENTRY) | BIT(NOTIFY4_ADD_ENTRY))
+#define SUPPORTED_NOTIFY_MASK (BIT(NOTIFY4_REMOVE_ENTRY) | BIT(NOTIFY4_ADD_ENTRY) | BIT(NOTIFY4_RENAME_ENTRY))
 
 static __be32
 nfsd4_get_dir_delegation(struct svc_rqst *rqstp,
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 9dc607e355d5839d80946d4983205c15ece6a71e..6333e95c075259af0c160eb130149c776e55f5a8 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -9722,6 +9722,45 @@ nfsd_handle_dir_event(u32 mask, const struct inode *dir, const void *data,
 			ent->notify_vals.data = p;
 			++nns->nns_idx;
 		}
+		if (mask & FS_RENAME) {
+			struct dentry *new_dentry = fsnotify_data_dentry(data, data_type);
+			static uint32_t notify_rename_bitmap = BIT(NOTIFY4_RENAME_ENTRY);
+			struct notify4 *ent = &nns->nns_ent[nns->nns_idx];
+			struct notify_rename4 nr = { };
+			u8 *p = (u8 *)(stream->p);
+			struct name_snapshot n;
+			bool ret;
+
+			if (!(flc->flc_flags & FL_IGN_DIR_RENAME))
+				continue;
+
+			/* FIXME: warn? */
+			if (!new_dentry)
+				continue;
+
+			nr.nrn_old_entry.nrm_old_entry.ne_file.len = name->len;
+			nr.nrn_old_entry.nrm_old_entry.ne_file.data = (char *)name->name;
+			nr.nrn_old_entry.nrm_old_entry.ne_attrs.attrmask.count = 1;
+			nr.nrn_old_entry.nrm_old_entry.ne_attrs.attrmask.element = &zerobm;
+			take_dentry_name_snapshot(&n, new_dentry);
+			nr.nrn_new_entry.nad_new_entry.ne_file.len = n.name.len;
+			nr.nrn_new_entry.nad_new_entry.ne_file.data = (char *)n.name.name;
+			nr.nrn_new_entry.nad_new_entry.ne_attrs.attrmask.count = 1;
+			nr.nrn_new_entry.nad_new_entry.ne_attrs.attrmask.element = &zerobm;
+			ret = xdrgen_encode_notify_rename4(stream, &nr);
+			release_dentry_name_snapshot(&n);
+			if (!ret) {
+				pr_warn("nfsd: unable to marshal notify_rename4 to xdr stream\n");
+				continue;
+			}
+
+			/* grab a notify4 in the buffer and set it up */
+			ent->notify_mask.count = 1;
+			ent->notify_mask.element = &notify_rename_bitmap;
+			ent->notify_vals.len = (u8 *)stream->p - p;
+			ent->notify_vals.data = p;
+			++nns->nns_idx;
+		}
 
 		if (nns->nns_idx)
 			nfsd4_run_cb_notify(ncn);

-- 
2.49.0



* Re: [PATCH RFC v2 21/28] fsnotify: export fsnotify_recalc_mask()
  2025-06-02 14:02 ` [PATCH RFC v2 21/28] fsnotify: export fsnotify_recalc_mask() Jeff Layton
@ 2025-06-03 20:13   ` Jan Kara
  2025-06-03 20:17     ` Jeff Layton
  0 siblings, 1 reply; 34+ messages in thread
From: Jan Kara @ 2025-06-03 20:13 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi, linux-fsdevel,
	linux-kernel, linux-nfs, linux-cifs, samba-technical, linux-doc

On Mon 02-06-25 10:02:04, Jeff Layton wrote:
> nfsd needs to call this when new directory delegations are set or unset.
> 
> Signed-off-by: Jeff Layton <jlayton@kernel.org>

So fsnotify_recalc_mask() is not a great API to export because it depends
on the lifetime rules of the mark connector - in particular the caller has
to make sure the connector stays alive while fsnotify_recalc_mask() is
running. So far that knowledge was internal to the fsnotify subsystem, but
now NFSD would need to know it as well.

Generally you need to recalculate the mask when you modify events you
listen to in a mark. So perhaps we should provide an API like:

int fsnotify_modify_mark_mask(struct fsnotify_mark *mark, __u32 mask_clear,
			      __u32 mask_set);

which could be used to modify mark mask without having to care about
details like cached masks and connector locking rules?

								Honza

> ---
>  fs/notify/mark.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/notify/mark.c b/fs/notify/mark.c
> index 798340db69d761dd05c1b361c251818dee89b9cf..ff21409c3ca3ad948557225afc586da3728f7cbe 100644
> --- a/fs/notify/mark.c
> +++ b/fs/notify/mark.c
> @@ -308,6 +308,7 @@ void fsnotify_recalc_mask(struct fsnotify_mark_connector *conn)
>  	if (update_children)
>  		fsnotify_conn_set_children_dentry_flags(conn);
>  }
> +EXPORT_SYMBOL_GPL(fsnotify_recalc_mask);
>  
>  /* Free all connectors queued for freeing once SRCU period ends */
>  static void fsnotify_connector_destroy_workfn(struct work_struct *work)
> 
> -- 
> 2.49.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH RFC v2 21/28] fsnotify: export fsnotify_recalc_mask()
  2025-06-03 20:13   ` Jan Kara
@ 2025-06-03 20:17     ` Jeff Layton
  0 siblings, 0 replies; 34+ messages in thread
From: Jeff Layton @ 2025-06-03 20:17 UTC (permalink / raw)
  To: Jan Kara
  Cc: Alexander Viro, Christian Brauner, Chuck Lever, Alexander Aring,
	Trond Myklebust, Anna Schumaker, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Bharath SM,
	NeilBrown, Olga Kornievskaia, Dai Ngo, Jonathan Corbet,
	Amir Goldstein, Miklos Szeredi, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, linux-doc

On Tue, 2025-06-03 at 22:13 +0200, Jan Kara wrote:
> On Mon 02-06-25 10:02:04, Jeff Layton wrote:
> > nfsd needs to call this when new directory delegations are set or unset.
> > 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> 
> So fsnotify_recalc_mask() is not a great API to export because it depends
> on lifetime rules of mark connector - in particular the caller has to make
> sure the connector stays alive while fsnotify_recalc_mask() is running. So
> far the knowledge was internal in fsnotify subsystem but now NFSD needs to
> know as well.
> 
> Generally you need to recalculate the mask when you modify events you
> listen to in a mark. So perhaps we should provide an API like:
> 
> int fsnotify_modify_mark_mask(struct fsnotify_mark *mark, __u32 mask_clear,
> 			      __u32 mask_set);
> 
> which could be used to modify mark mask without having to care about
> details like cached masks and connector locking rules?
> 

That sounds like a reasonable thing to do. I'll plan to do something
along those lines. Thanks for the review!
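
To make the suggestion a bit more concrete, here is a rough user-space model
of what such a helper might do: update one mark's mask, then recompute the
cached union kept by the connector, so the caller never touches the
connector directly. Only the fsnotify_modify_mark_mask() prototype comes
from this thread; every model_* name below is hypothetical, and the sketch
leaves out the locking and lifetime handling that are the actual hard part.

```c
#include <stddef.h>
#include <stdint.h>

struct model_conn;

/* One listener's event mask, with a back-pointer to its connector. */
struct model_mark {
	uint32_t mask;
	struct model_conn *conn;
};

/* Per-object connector caching the union of all attached mark masks. */
struct model_conn {
	struct model_mark *marks;
	size_t nmarks;
	uint32_t cached_mask;
};

/* Recompute the cached union; the caller of the helper never sees this. */
void model_recalc(struct model_conn *conn)
{
	uint32_t m = 0;

	for (size_t i = 0; i < conn->nmarks; i++)
		m |= conn->marks[i].mask;
	conn->cached_mask = m;
}

/* Clear and set bits on one mark, then refresh the connector's cache. */
int model_modify_mark_mask(struct model_mark *mark, uint32_t mask_clear,
			   uint32_t mask_set)
{
	if (!mark || !mark->conn)
		return -1;
	mark->mask = (mark->mask & ~mask_clear) | mask_set;
	model_recalc(mark->conn);
	return 0;
}
```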

> 
> > ---
> >  fs/notify/mark.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/fs/notify/mark.c b/fs/notify/mark.c
> > index 798340db69d761dd05c1b361c251818dee89b9cf..ff21409c3ca3ad948557225afc586da3728f7cbe 100644
> > --- a/fs/notify/mark.c
> > +++ b/fs/notify/mark.c
> > @@ -308,6 +308,7 @@ void fsnotify_recalc_mask(struct fsnotify_mark_connector *conn)
> >  	if (update_children)
> >  		fsnotify_conn_set_children_dentry_flags(conn);
> >  }
> > +EXPORT_SYMBOL_GPL(fsnotify_recalc_mask);
> >  
> >  /* Free all connectors queued for freeing once SRCU period ends */
> >  static void fsnotify_connector_destroy_workfn(struct work_struct *work)
> > 
> > -- 
> > 2.49.0
> > 

-- 
Jeff Layton <jlayton@kernel.org>


* Re: [PATCH RFC v2 04/28] vfs: allow mkdir to wait for delegation break on parent
  2025-06-02 14:01 ` [PATCH RFC v2 04/28] vfs: allow mkdir to wait for delegation break on parent Jeff Layton
@ 2025-06-05 11:19   ` Jan Kara
  2025-06-05 11:25     ` Jeff Layton
  0 siblings, 1 reply; 34+ messages in thread
From: Jan Kara @ 2025-06-05 11:19 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Alexander Aring, Trond Myklebust, Anna Schumaker, Steve French,
	Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
	Bharath SM, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Jonathan Corbet, Amir Goldstein, Miklos Szeredi, linux-fsdevel,
	linux-kernel, linux-nfs, linux-cifs, samba-technical, linux-doc

On Mon 02-06-25 10:01:47, Jeff Layton wrote:
> In order to add directory delegation support, we need to break
> delegations on the parent whenever there is going to be a change in the
> directory.
> 
> Rename the existing vfs_mkdir to __vfs_mkdir, make it static and add a
> new delegated_inode parameter. Add a new exported vfs_mkdir wrapper
> around it that passes a NULL pointer for delegated_inode.
> 
> Signed-off-by: Jeff Layton <jlayton@kernel.org>

FWIW I went through the changes adding breaking of delegations to VFS
directory functions and they look ok to me. I just dislike the addition of
__vfs_mkdir() (and similar) helpers because over the longer term such
helpers tend to pile up and the maze of functions (already hard to follow
in VFS) gets unwieldy. Either I'd try to give it a proper name or (if
exposing the functionality to the external world is fine - which it seems
to be) you could just add the argument to vfs_mkdir() and change all the
callers. I've checked and for each of the modified functions there are
fewer than 10 callers, so the churn shouldn't be that big. What do others
think?

								Honza

> ---
>  fs/namei.c | 67 +++++++++++++++++++++++++++++++++++++++-----------------------
>  1 file changed, 42 insertions(+), 25 deletions(-)
> 
> diff --git a/fs/namei.c b/fs/namei.c
> index 0fea12860036162c01a291558e068fde9c986142..7c9e237ed1b1a535934ffe5e523424bb035e7ae0 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -4318,29 +4318,9 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d
>  	return do_mknodat(AT_FDCWD, getname(filename), mode, dev);
>  }
>  
> -/**
> - * vfs_mkdir - create directory returning correct dentry if possible
> - * @idmap:	idmap of the mount the inode was found from
> - * @dir:	inode of the parent directory
> - * @dentry:	dentry of the child directory
> - * @mode:	mode of the child directory
> - *
> - * Create a directory.
> - *
> - * If the inode has been found through an idmapped mount the idmap of
> - * the vfsmount must be passed through @idmap. This function will then take
> - * care to map the inode according to @idmap before checking permissions.
> - * On non-idmapped mounts or if permission checking is to be performed on the
> - * raw inode simply pass @nop_mnt_idmap.
> - *
> - * In the event that the filesystem does not use the *@dentry but leaves it
> - * negative or unhashes it and possibly splices a different one returning it,
> - * the original dentry is dput() and the alternate is returned.
> - *
> - * In case of an error the dentry is dput() and an ERR_PTR() is returned.
> - */
> -struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> -			 struct dentry *dentry, umode_t mode)
> +static struct dentry *__vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> +				  struct dentry *dentry, umode_t mode,
> +				  struct inode **delegated_inode)
>  {
>  	int error;
>  	unsigned max_links = dir->i_sb->s_max_links;
> @@ -4363,6 +4343,10 @@ struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
>  	if (max_links && dir->i_nlink >= max_links)
>  		goto err;
>  
> +	error = try_break_deleg(dir, delegated_inode);
> +	if (error)
> +		goto err;
> +
>  	de = dir->i_op->mkdir(idmap, dir, dentry, mode);
>  	error = PTR_ERR(de);
>  	if (IS_ERR(de))
> @@ -4378,6 +4362,33 @@ struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
>  	dput(dentry);
>  	return ERR_PTR(error);
>  }
> +
> +/**
> + * vfs_mkdir - create directory returning correct dentry if possible
> + * @idmap:	idmap of the mount the inode was found from
> + * @dir:	inode of the parent directory
> + * @dentry:	dentry of the child directory
> + * @mode:	mode of the child directory
> + *
> + * Create a directory.
> + *
> + * If the inode has been found through an idmapped mount the idmap of
> + * the vfsmount must be passed through @idmap. This function will then take
> + * care to map the inode according to @idmap before checking permissions.
> + * On non-idmapped mounts or if permission checking is to be performed on the
> + * raw inode simply pass @nop_mnt_idmap.
> + *
> + * In the event that the filesystem does not use the *@dentry but leaves it
> + * negative or unhashes it and possibly splices a different one returning it,
> + * the original dentry is dput() and the alternate is returned.
> + *
> + * In case of an error the dentry is dput() and an ERR_PTR() is returned.
> + */
> +struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> +			 struct dentry *dentry, umode_t mode)
> +{
> +	return __vfs_mkdir(idmap, dir, dentry, mode, NULL);
> +}
>  EXPORT_SYMBOL(vfs_mkdir);
>  
>  int do_mkdirat(int dfd, struct filename *name, umode_t mode)
> @@ -4386,6 +4397,7 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode)
>  	struct path path;
>  	int error;
>  	unsigned int lookup_flags = LOOKUP_DIRECTORY;
> +	struct inode *delegated_inode = NULL;
>  
>  retry:
>  	dentry = filename_create(dfd, name, &path, lookup_flags);
> @@ -4396,12 +4408,17 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode)
>  	error = security_path_mkdir(&path, dentry,
>  			mode_strip_umask(path.dentry->d_inode, mode));
>  	if (!error) {
> -		dentry = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
> -				  dentry, mode);
> +		dentry = __vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
> +				     dentry, mode, &delegated_inode);
>  		if (IS_ERR(dentry))
>  			error = PTR_ERR(dentry);
>  	}
>  	done_path_create(&path, dentry);
> +	if (delegated_inode) {
> +		error = break_deleg_wait(&delegated_inode);
> +		if (!error)
> +			goto retry;
> +	}
>  	if (retry_estale(error, lookup_flags)) {
>  		lookup_flags |= LOOKUP_REVAL;
>  		goto retry;
> 
> -- 
> 2.49.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH RFC v2 04/28] vfs: allow mkdir to wait for delegation break on parent
  2025-06-05 11:19   ` Jan Kara
@ 2025-06-05 11:25     ` Jeff Layton
  2025-06-06 10:10       ` Christian Brauner
  0 siblings, 1 reply; 34+ messages in thread
From: Jeff Layton @ 2025-06-05 11:25 UTC (permalink / raw)
  To: Jan Kara
  Cc: Alexander Viro, Christian Brauner, Chuck Lever, Alexander Aring,
	Trond Myklebust, Anna Schumaker, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Bharath SM,
	NeilBrown, Olga Kornievskaia, Dai Ngo, Jonathan Corbet,
	Amir Goldstein, Miklos Szeredi, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, linux-doc

On Thu, 2025-06-05 at 13:19 +0200, Jan Kara wrote:
> On Mon 02-06-25 10:01:47, Jeff Layton wrote:
> > In order to add directory delegation support, we need to break
> > delegations on the parent whenever there is going to be a change in the
> > directory.
> > 
> > Rename the existing vfs_mkdir to __vfs_mkdir, make it static and add a
> > new delegated_inode parameter. Add a new exported vfs_mkdir wrapper
> > around it that passes a NULL pointer for delegated_inode.
> > 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> 
> FWIW I went through the changes adding breaking of delegations to VFS
> directory functions and they look ok to me. Just I dislike the addition of
> __vfs_mkdir() (and similar) helpers because over longer term the helpers
> tend to pile up and the maze of functions (already hard to follow in VFS)
> gets unwieldy. Either I'd try to give it a proper name or (if exposing the
> functionality to the external world is fine - which seems it is) you could
> just add the argument to vfs_mkdir() and change all the callers? I've
> checked and for each of the modified functions there's less than 10 callers
> so the churn shouldn't be that big. What do others think?
> 

Good point -- I'm always terrible with naming functions. I'm fine with
either approach, but just adding the argument does sound simple enough.
I'll plan to do that unless anyone objects.

Thanks for taking a look!
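
For reference, the break-and-retry pattern that do_mkdirat() gains in the
quoted diff can be modelled in user space roughly as below. This is a
simplified sketch, not kernel code: struct model_inode and the model_*
functions are hypothetical stand-ins for the inode, try_break_deleg() and
break_deleg_wait(), and the "wait" here just clears a flag instead of
sleeping until the delegation is returned.

```c
#include <errno.h>
#include <stddef.h>

struct model_inode {
	int delegated; /* nonzero while a delegation is outstanding */
	int nentries;  /* number of entries created in this directory */
};

/* Non-blocking break: report the inode to wait on and bail out. */
int model_try_break(struct model_inode *dir,
		    struct model_inode **delegated_inode)
{
	if (!dir->delegated)
		return 0;
	if (delegated_inode)
		*delegated_inode = dir;
	return -EWOULDBLOCK;
}

/* Stand-in for waiting until the delegation has been returned. */
int model_break_wait(struct model_inode **delegated_inode)
{
	(*delegated_inode)->delegated = 0;
	*delegated_inode = NULL;
	return 0;
}

/* One attempt at the operation, as in the mkdir helper with the new arg. */
int model_mkdir_once(struct model_inode *dir,
		     struct model_inode **delegated_inode)
{
	int error = model_try_break(dir, delegated_inode);

	if (error)
		return error;
	dir->nentries++;
	return 0;
}

/* Caller: attempt, wait for the break outside the attempt, then retry. */
int model_do_mkdir(struct model_inode *dir)
{
	struct model_inode *delegated_inode = NULL;
	int error;

retry:
	error = model_mkdir_once(dir, &delegated_inode);
	if (delegated_inode) {
		error = model_break_wait(&delegated_inode);
		if (!error)
			goto retry;
	}
	return error;
}
```

A caller that passes NULL for delegated_inode, as the vfs_mkdir() wrapper
does, simply gets the error back and has to cope with it itself.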

> 								Honza
> 
> > ---
> >  fs/namei.c | 67 +++++++++++++++++++++++++++++++++++++++-----------------------
> >  1 file changed, 42 insertions(+), 25 deletions(-)
> > 
> > diff --git a/fs/namei.c b/fs/namei.c
> > index 0fea12860036162c01a291558e068fde9c986142..7c9e237ed1b1a535934ffe5e523424bb035e7ae0 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -4318,29 +4318,9 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d
> >  	return do_mknodat(AT_FDCWD, getname(filename), mode, dev);
> >  }
> >  
> > -/**
> > - * vfs_mkdir - create directory returning correct dentry if possible
> > - * @idmap:	idmap of the mount the inode was found from
> > - * @dir:	inode of the parent directory
> > - * @dentry:	dentry of the child directory
> > - * @mode:	mode of the child directory
> > - *
> > - * Create a directory.
> > - *
> > - * If the inode has been found through an idmapped mount the idmap of
> > - * the vfsmount must be passed through @idmap. This function will then take
> > - * care to map the inode according to @idmap before checking permissions.
> > - * On non-idmapped mounts or if permission checking is to be performed on the
> > - * raw inode simply pass @nop_mnt_idmap.
> > - *
> > - * In the event that the filesystem does not use the *@dentry but leaves it
> > - * negative or unhashes it and possibly splices a different one returning it,
> > - * the original dentry is dput() and the alternate is returned.
> > - *
> > - * In case of an error the dentry is dput() and an ERR_PTR() is returned.
> > - */
> > -struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> > -			 struct dentry *dentry, umode_t mode)
> > +static struct dentry *__vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> > +				  struct dentry *dentry, umode_t mode,
> > +				  struct inode **delegated_inode)
> >  {
> >  	int error;
> >  	unsigned max_links = dir->i_sb->s_max_links;
> > @@ -4363,6 +4343,10 @@ struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> >  	if (max_links && dir->i_nlink >= max_links)
> >  		goto err;
> >  
> > +	error = try_break_deleg(dir, delegated_inode);
> > +	if (error)
> > +		goto err;
> > +
> >  	de = dir->i_op->mkdir(idmap, dir, dentry, mode);
> >  	error = PTR_ERR(de);
> >  	if (IS_ERR(de))
> > @@ -4378,6 +4362,33 @@ struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> >  	dput(dentry);
> >  	return ERR_PTR(error);
> >  }
> > +
> > +/**
> > + * vfs_mkdir - create directory returning correct dentry if possible
> > + * @idmap:	idmap of the mount the inode was found from
> > + * @dir:	inode of the parent directory
> > + * @dentry:	dentry of the child directory
> > + * @mode:	mode of the child directory
> > + *
> > + * Create a directory.
> > + *
> > + * If the inode has been found through an idmapped mount the idmap of
> > + * the vfsmount must be passed through @idmap. This function will then take
> > + * care to map the inode according to @idmap before checking permissions.
> > + * On non-idmapped mounts or if permission checking is to be performed on the
> > + * raw inode simply pass @nop_mnt_idmap.
> > + *
> > + * In the event that the filesystem does not use the *@dentry but leaves it
> > + * negative or unhashes it and possibly splices a different one returning it,
> > + * the original dentry is dput() and the alternate is returned.
> > + *
> > + * In case of an error the dentry is dput() and an ERR_PTR() is returned.
> > + */
> > +struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
> > +			 struct dentry *dentry, umode_t mode)
> > +{
> > +	return __vfs_mkdir(idmap, dir, dentry, mode, NULL);
> > +}
> >  EXPORT_SYMBOL(vfs_mkdir);
> >  
> >  int do_mkdirat(int dfd, struct filename *name, umode_t mode)
> > @@ -4386,6 +4397,7 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode)
> >  	struct path path;
> >  	int error;
> >  	unsigned int lookup_flags = LOOKUP_DIRECTORY;
> > +	struct inode *delegated_inode = NULL;
> >  
> >  retry:
> >  	dentry = filename_create(dfd, name, &path, lookup_flags);
> > @@ -4396,12 +4408,17 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode)
> >  	error = security_path_mkdir(&path, dentry,
> >  			mode_strip_umask(path.dentry->d_inode, mode));
> >  	if (!error) {
> > -		dentry = vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
> > -				  dentry, mode);
> > +		dentry = __vfs_mkdir(mnt_idmap(path.mnt), path.dentry->d_inode,
> > +				     dentry, mode, &delegated_inode);
> >  		if (IS_ERR(dentry))
> >  			error = PTR_ERR(dentry);
> >  	}
> >  	done_path_create(&path, dentry);
> > +	if (delegated_inode) {
> > +		error = break_deleg_wait(&delegated_inode);
> > +		if (!error)
> > +			goto retry;
> > +	}
> >  	if (retry_estale(error, lookup_flags)) {
> >  		lookup_flags |= LOOKUP_REVAL;
> >  		goto retry;
> > 
> > -- 
> > 2.49.0
> > 

-- 
Jeff Layton <jlayton@kernel.org>


* Re: [PATCH RFC v2 04/28] vfs: allow mkdir to wait for delegation break on parent
  2025-06-05 11:25     ` Jeff Layton
@ 2025-06-06 10:10       ` Christian Brauner
  0 siblings, 0 replies; 34+ messages in thread
From: Christian Brauner @ 2025-06-06 10:10 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Jan Kara, Alexander Viro, Chuck Lever, Alexander Aring,
	Trond Myklebust, Anna Schumaker, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Bharath SM,
	NeilBrown, Olga Kornievskaia, Dai Ngo, Jonathan Corbet,
	Amir Goldstein, Miklos Szeredi, linux-fsdevel, linux-kernel,
	linux-nfs, linux-cifs, samba-technical, linux-doc

On Thu, Jun 05, 2025 at 07:25:38AM -0400, Jeff Layton wrote:
> On Thu, 2025-06-05 at 13:19 +0200, Jan Kara wrote:
> > On Mon 02-06-25 10:01:47, Jeff Layton wrote:
> > > In order to add directory delegation support, we need to break
> > > delegations on the parent whenever there is going to be a change in the
> > > directory.
> > > 
> > > Rename the existing vfs_mkdir to __vfs_mkdir, make it static and add a
> > > new delegated_inode parameter. Add a new exported vfs_mkdir wrapper
> > > around it that passes a NULL pointer for delegated_inode.
> > > 
> > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > 
> > FWIW I went through the changes adding breaking of delegations to VFS
> > directory functions and they look ok to me. Just I dislike the addition of
> > __vfs_mkdir() (and similar) helpers because over longer term the helpers
> > tend to pile up and the maze of functions (already hard to follow in VFS)
> > gets unwieldy. Either I'd try to give it a proper name or (if exposing the
> > functionality to the external world is fine - which seems it is) you could
> > just add the argument to vfs_mkdir() and change all the callers? I've
> > > checked and for each of the modified functions there are fewer than 10 callers
> > so the churn shouldn't be that big. What do others think?

If it's just a few callers we should just add the argument.
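
For readers following along, here is a minimal userspace sketch of what
"just add the argument" could look like, together with the wait-and-retry
loop from do_mkdirat() in the quoted patch. Everything prefixed with toy_
is a hypothetical stand-in invented for illustration, not kernel API, and
the real helpers also take and drop an inode reference, which is omitted
here. It only models the control flow: a caller passing NULL gets
-EWOULDBLOCK immediately, while a caller passing a pointer can wait for
the delegation break and retry.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for struct inode (assumption: only the fields needed here). */
struct toy_inode {
	int has_deleg;   /* non-zero while a delegation is outstanding */
	int nr_children; /* entries created in this directory so far */
};

/*
 * Models try_break_deleg(): succeed if no delegation is outstanding.
 * Otherwise return -EWOULDBLOCK and, if the caller supplied a pointer,
 * record which inode it has to wait on.
 */
int toy_try_break_deleg(struct toy_inode *dir,
			struct toy_inode **delegated_inode)
{
	if (!dir->has_deleg)
		return 0;
	if (delegated_inode)
		*delegated_inode = dir;
	return -EWOULDBLOCK;
}

/* Models break_deleg_wait(): pretend the recall completed, then reset. */
int toy_break_deleg_wait(struct toy_inode **delegated_inode)
{
	(*delegated_inode)->has_deleg = 0;
	*delegated_inode = NULL;
	return 0;
}

/*
 * Single mkdir entry point with the extra argument, instead of a
 * __vfs_mkdir() helper plus a wrapper. Callers that cannot wait pass NULL.
 */
int toy_vfs_mkdir(struct toy_inode *dir, struct toy_inode **delegated_inode)
{
	int error = toy_try_break_deleg(dir, delegated_inode);

	if (error)
		return error;
	dir->nr_children++; /* stands in for dir->i_op->mkdir() */
	return 0;
}

/* Mirrors the retry loop that the patch adds to do_mkdirat(). */
int toy_do_mkdirat(struct toy_inode *dir)
{
	struct toy_inode *delegated_inode = NULL;
	int error;

retry:
	error = toy_vfs_mkdir(dir, &delegated_inode);
	if (delegated_inode) {
		error = toy_break_deleg_wait(&delegated_inode);
		if (!error)
			goto retry;
	}
	return error;
}
```

With this shape, existing callers would only need to append a NULL
argument to keep their current behaviour, and only the syscall path needs
the retry loop.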


end of thread, other threads:[~2025-06-06 10:11 UTC | newest]

Thread overview: 34+ messages
2025-06-02 14:01 [PATCH RFC v2 00/28] vfs, nfsd, nfs: implement directory delegations Jeff Layton
2025-06-02 14:01 ` [PATCH RFC v2 01/28] filelock: push the S_ISREG check down to ->setlease handlers Jeff Layton
2025-06-02 14:01 ` [PATCH RFC v2 02/28] filelock: add a lm_may_setlease lease_manager callback Jeff Layton
2025-06-02 14:01 ` [PATCH RFC v2 03/28] vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink} Jeff Layton
2025-06-02 14:01 ` [PATCH RFC v2 04/28] vfs: allow mkdir to wait for delegation break on parent Jeff Layton
2025-06-05 11:19   ` Jan Kara
2025-06-05 11:25     ` Jeff Layton
2025-06-06 10:10       ` Christian Brauner
2025-06-02 14:01 ` [PATCH RFC v2 05/28] vfs: allow rmdir " Jeff Layton
2025-06-02 14:01 ` [PATCH RFC v2 06/28] vfs: break parent dir delegations in open(..., O_CREAT) codepath Jeff Layton
2025-06-02 14:01 ` [PATCH RFC v2 07/28] vfs: make vfs_create break delegations on parent directory Jeff Layton
2025-06-02 14:01 ` [PATCH RFC v2 08/28] vfs: make vfs_mknod " Jeff Layton
2025-06-02 14:01 ` [PATCH RFC v2 09/28] filelock: lift the ban on directory leases in generic_setlease Jeff Layton
2025-06-02 14:01 ` [PATCH RFC v2 10/28] nfsd: allow filecache to hold S_IFDIR files Jeff Layton
2025-06-02 14:01 ` [PATCH RFC v2 11/28] nfsd: allow DELEGRETURN on directories Jeff Layton
2025-06-02 14:01 ` [PATCH RFC v2 12/28] nfsd: check for delegation conflicts vs. the same client Jeff Layton
2025-06-02 14:01 ` [PATCH RFC v2 13/28] nfsd: wire up GET_DIR_DELEGATION handling Jeff Layton
2025-06-02 14:01 ` [PATCH RFC v2 14/28] filelock: rework the __break_lease API to use flags Jeff Layton
2025-06-02 14:01 ` [PATCH RFC v2 15/28] filelock: add struct delegated_inode Jeff Layton
2025-06-02 14:01 ` [PATCH RFC v2 16/28] filelock: add support for ignoring deleg breaks for dir change events Jeff Layton
2025-06-02 14:02 ` [PATCH RFC v2 17/28] filelock: add an inode_lease_ignore_mask helper Jeff Layton
2025-06-02 14:02 ` [PATCH RFC v2 18/28] nfsd: add protocol support for CB_NOTIFY Jeff Layton
2025-06-02 14:02 ` [PATCH RFC v2 19/28] nfsd: add callback encoding and decoding linkages " Jeff Layton
2025-06-02 14:02 ` [PATCH RFC v2 20/28] nfsd: add data structures for handling CB_NOTIFY to directory delegation Jeff Layton
2025-06-02 14:02 ` [PATCH RFC v2 21/28] fsnotify: export fsnotify_recalc_mask() Jeff Layton
2025-06-03 20:13   ` Jan Kara
2025-06-03 20:17     ` Jeff Layton
2025-06-02 14:02 ` [PATCH RFC v2 22/28] nfsd: update the fsnotify mark when setting or removing a dir delegation Jeff Layton
2025-06-02 14:02 ` [PATCH RFC v2 23/28] nfsd: make nfsd4_callback_ops->prepare operation bool return Jeff Layton
2025-06-02 14:02 ` [PATCH RFC v2 24/28] nfsd: add notification handlers for dir events Jeff Layton
2025-06-02 14:02 ` [PATCH RFC v2 25/28] nfsd: allow nfsd to get a dir lease with an ignore mask Jeff Layton
2025-06-02 14:02 ` [PATCH RFC v2 26/28] nfsd: add a tracepoint for nfsd_file_fsnotify_handle_dir_event() Jeff Layton
2025-06-02 14:02 ` [PATCH RFC v2 27/28] nfsd: add support for NOTIFY4_ADD_ENTRY events Jeff Layton
2025-06-02 14:02 ` [PATCH RFC v2 28/28] nfsd: add support for NOTIFY4_RENAME_ENTRY events Jeff Layton
