From: Jeff Layton <jlayton@kernel.org>
To: Chuck Lever <cel@kernel.org>, NeilBrown <neil@brown.name>,
Olga Kornievskaia <okorniev@redhat.com>,
Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>
Cc: linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Chuck Lever <chuck.lever@oracle.com>
Subject: Re: [PATCH v4 5/6] NFSD: Add export-scoped state revocation
Date: Wed, 18 Mar 2026 10:47:48 -0400 [thread overview]
Message-ID: <f582806fcf3c6d6f7416c24c55462fc502cb0ab0.camel@kernel.org> (raw)
In-Reply-To: <20260318-umount-kills-nfsv4-state-v4-5-56aad44ab982@oracle.com>
On Wed, 2026-03-18 at 10:15 -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> nfsd4_revoke_states() revokes all NFSv4 state on an entire
> superblock, which is too coarse when multiple exports share a
> filesystem. Add nfsd4_revoke_export_states() to revoke only
> state associated with files under a specific export root, then
> convert nfsd4_revoke_states() to a thin wrapper that passes
> sb->s_root.
>
> nfsd4_revoke_export_states() uses find_next_sb_stid() to locate
> candidate stids, then verifies each against the export root via
> nfsd_file_inode_is_in_subtree(). That helper is placed in the
> file cache layer (filecache.c) because it operates on VFS types
> with no NFSv4 state dependency. It walks all of an inode's
> dentry aliases rather than calling d_find_any_alias(), because
> for hard-linked files an arbitrary alias may fall outside the
> export subtree even when another alias is inside it.
>
> When the export root is the filesystem root, the subtree check
> is elided and every stid matching the superblock is revoked
> directly.
>
> The NFSD_UNLOCK_TYPE_FILESYSTEM handler now calls
> nfsd4_revoke_export_states() with the resolved path dentry,
> enabling subtree-scoped revocation through the netlink
> interface.
>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> fs/nfsd/filecache.c | 32 +++++++++++++++++++
> fs/nfsd/filecache.h | 1 +
> fs/nfsd/nfs4state.c | 92 +++++++++++++++++++++++++++++++++++++----------------
> fs/nfsd/nfsctl.c | 3 +-
> fs/nfsd/state.h | 7 ++++
> 5 files changed, 107 insertions(+), 28 deletions(-)
>
> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> index 1e2b38ed1d35..cd09be0c5465 100644
> --- a/fs/nfsd/filecache.c
> +++ b/fs/nfsd/filecache.c
> @@ -894,6 +894,38 @@ __nfsd_file_cache_purge(struct net *net)
> nfsd_file_dispose_list(&dispose);
> }
>
> +/**
> + * nfsd_file_inode_is_in_subtree - check whether an inode is under a subtree
> + * @inode: inode to test
> + * @root_dentry: dentry of the subtree root
> + *
> + * Check whether @inode has any dentry alias that falls within the
> + * subtree rooted at @root_dentry. Hard-linked files can have aliases
> + * in multiple directories, so all aliases must be tested.
> + *
> + * Return: %true if any dentry alias of @inode is at or below
> + * @root_dentry, %false otherwise.
> + */
> +bool nfsd_file_inode_is_in_subtree(struct inode *inode,
> + struct dentry *root_dentry)
> +{
> + struct dentry *alias;
> + bool found = false;
> +
> + /* i_lock stabilizes the alias list; is_subdir() nests
> + * rename_lock (a seqlock) beneath it but does not sleep.
> + */
> + spin_lock(&inode->i_lock);
> + hlist_for_each_entry(alias, &inode->i_dentry, d_u.d_alias) {
> + if (is_subdir(alias, root_dentry)) {
> + found = true;
> + break;
> + }
> + }
> + spin_unlock(&inode->i_lock);
> + return found;
> +}
> +
> static struct nfsd_fcache_disposal *
> nfsd_alloc_fcache_disposal(void)
> {
> diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
> index b383dbc5b921..36c9a8e388d2 100644
> --- a/fs/nfsd/filecache.h
> +++ b/fs/nfsd/filecache.h
> @@ -70,6 +70,7 @@ struct net *nfsd_file_put_local(struct nfsd_file __rcu **nf);
> struct nfsd_file *nfsd_file_get(struct nfsd_file *nf);
> struct file *nfsd_file_file(struct nfsd_file *nf);
> void nfsd_file_close_inode_sync(struct inode *inode);
> +bool nfsd_file_inode_is_in_subtree(struct inode *inode, struct dentry *root_dentry);
> void nfsd_file_net_dispose(struct nfsd_net *nn);
> bool nfsd_file_is_cached(struct inode *inode);
> __be32 nfsd_file_acquire_gc(struct svc_rqst *rqstp, struct svc_fh *fhp,
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 891669b32804..581f38395c42 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1763,15 +1763,6 @@ static struct nfs4_stid *find_next_sb_stid(struct nfs4_client *clp,
> return stid;
> }
>
> -static struct nfs4_stid *find_one_sb_stid(struct nfs4_client *clp,
> - struct super_block *sb,
> - unsigned int sc_types)
> -{
> - unsigned long id = 0;
> -
> - return find_next_sb_stid(clp, sb, sc_types, &id);
> -}
> -
> static void revoke_ol_stid(struct nfs4_client *clp,
> struct nfs4_ol_stateid *stp)
> {
> @@ -1835,20 +1826,19 @@ static void revoke_one_stid(struct nfs4_client *clp, struct nfs4_stid *stid)
> }
>
> /**
> - * nfsd4_revoke_states - revoke all nfsv4 states associated with given filesystem
> - * @nn: used to identify instance of nfsd (there is one per net namespace)
> - * @sb: super_block used to identify target filesystem
> + * nfsd4_revoke_export_states - revoke states associated with a given export
> + * @nn: nfsd_net identifying the nfsd instance (one per net namespace)
> + * @sb: super_block of the export's filesystem
> + * @root_dentry: dentry of the export root directory
> *
> * All nfs4 states (open, lock, delegation, layout) held by the server instance
> - * and associated with a file on the given filesystem will be revoked resulting
> - * in any files being closed and so all references from nfsd to the filesystem
> - * being released. Thus nfsd will no longer prevent the filesystem from being
> - * unmounted.
> - *
> - * The clients which own the states will subsequently being notified that the
> - * states have been "admin-revoked".
> + * and associated with files under the given export will be revoked. When
> + * @root_dentry is the filesystem root, all state on @sb is revoked (equivalent
> + * to nfsd4_revoke_states). When @root_dentry is a subdirectory, only state on
> + * files within that subtree is revoked.
> */
> -void nfsd4_revoke_states(struct nfsd_net *nn, struct super_block *sb)
> +void nfsd4_revoke_export_states(struct nfsd_net *nn, struct super_block *sb,
> + struct dentry *root_dentry)
> {
> unsigned int idhashval;
> unsigned int sc_types;
> @@ -1861,18 +1851,53 @@ void nfsd4_revoke_states(struct nfsd_net *nn, struct super_block *sb)
> struct nfs4_client *clp;
> retry:
> list_for_each_entry(clp, head, cl_idhash) {
> - struct nfs4_stid *stid = find_one_sb_stid(clp, sb,
> - sc_types);
> - if (stid) {
> - spin_unlock(&nn->client_lock);
> + struct nfs4_stid *stid;
> + /* Resets to zero on each retry; revocation may
> + * alter the IDR, so a stale cursor is unsafe.
> + */
> + unsigned long id = 0;
> +
> + while ((stid = find_next_sb_stid(clp, sb,
> + sc_types, &id)) != NULL) {
> + if (root_dentry != sb->s_root) {
> + bool match;
> +
> + /* Bare inc to pin clp; get_client_locked() is
> + * not used because its courtesy-to-active
> + * transition is unwanted during revocation.
> + */
> + atomic_inc(&clp->cl_rpc_users);
> + spin_unlock(&nn->client_lock);
> + match = nfsd_file_inode_is_in_subtree(
> + stid->sc_file->fi_inode,
> + root_dentry);
Ouch the hardlinked thing makes this hard to reason about.
Ok, so suppose we have two exports on the same superblock.
/export/foo
/export/bar
One is exported to one client foo and one to another to client bar.
There is a file hardlinked across those directories:
$ touch /export/foo/baz
$ ln /export/bar/baz /export/foo/baz
Now, client foo opens /export/foo/baz, and client bar opens
/export/bar/baz.
/export/bar is unexported and state under it revoked. Won't client
foo's state end up being revoked too in that case?
Note that the different hardlinks should end up with different
filehandles since they are exposed to the clients via different
exports.
I wonder... do we need keep track of the export under which a stateid
was acquired so we can properly revoke the right ones in this
situation?
> + if (!match) {
> + nfs4_put_stid(stid);
> + spin_lock(&nn->client_lock);
> + put_client_renew_locked(clp);
> + id++;
> + continue;
> + }
> + } else {
> + /* Whole-sb: goto retry restarts the
> + * client list immediately after
> + * revocation, so clp needs no pin.
> + */
> + spin_unlock(&nn->client_lock);
> + }
> +
> revoke_one_stid(clp, stid);
> nfs4_put_stid(stid);
> spin_lock(&nn->client_lock);
> + if (root_dentry != sb->s_root)
> + put_client_renew_locked(clp);
> if (clp->cl_minorversion == 0)
> /* Allow cleanup after a lease period.
> - * store_release ensures cleanup will
> - * see any newly revoked states if it
> - * sees the time updated.
> + * The lock/unlock pair orders this
> + * store after revoke_one_stid(), so
> + * nfs40_clean_admin_revoked() sees
> + * newly revoked states if it sees
> + * the updated time.
> */
> nn->nfs40_last_revoke =
> ktime_get_boottime_seconds();
> @@ -1883,6 +1908,19 @@ void nfsd4_revoke_states(struct nfsd_net *nn, struct super_block *sb)
> spin_unlock(&nn->client_lock);
> }
>
> +/**
> + * nfsd4_revoke_states - revoke all nfsv4 states associated with given filesystem
> + * @nn: nfsd_net identifying the nfsd instance
> + * @sb: super_block used to identify target filesystem
> + *
> + * Convenience wrapper around nfsd4_revoke_export_states() that revokes
> + * all state on @sb by passing sb->s_root as the export root.
> + */
> +void nfsd4_revoke_states(struct nfsd_net *nn, struct super_block *sb)
> +{
> + nfsd4_revoke_export_states(nn, sb, sb->s_root);
> +}
> +
> static inline int
> hash_sessionid(struct nfs4_sessionid *sessionid)
> {
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index d3ed343699bd..d9c61c939059 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -2207,7 +2207,8 @@ static int nfsd_nl_unlock_by_filesystem(struct genl_info *info)
>
> mutex_lock(&nfsd_mutex);
> if (nn->nfsd_serv)
> - nfsd4_revoke_states(nn, path.dentry->d_sb);
> + nfsd4_revoke_export_states(nn, path.dentry->d_sb,
> + path.dentry);
> else
> error = -EINVAL;
> mutex_unlock(&nfsd_mutex);
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index 6fcbf1e427d4..9e7c7884831c 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -843,11 +843,18 @@ struct nfsd_file *find_any_file(struct nfs4_file *f);
>
> #ifdef CONFIG_NFSD_V4
> void nfsd4_revoke_states(struct nfsd_net *nn, struct super_block *sb);
> +void nfsd4_revoke_export_states(struct nfsd_net *nn, struct super_block *sb,
> + struct dentry *root_dentry);
> void nfsd4_cancel_copy_by_sb(struct net *net, struct super_block *sb);
> #else
> static inline void nfsd4_revoke_states(struct nfsd_net *nn, struct super_block *sb)
> {
> }
> +static inline void nfsd4_revoke_export_states(struct nfsd_net *nn,
> + struct super_block *sb,
> + struct dentry *root_dentry)
> +{
> +}
> static inline void nfsd4_cancel_copy_by_sb(struct net *net, struct super_block *sb)
> {
> }
--
Jeff Layton <jlayton@kernel.org>
next prev parent reply other threads:[~2026-03-18 14:47 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-18 14:15 [PATCH v4 0/6] Automatic NFSv4 state revocation on filesystem unmount Chuck Lever
2026-03-18 14:15 ` [PATCH v4 1/6] NFSD: Extract revoke_one_stid() utility function Chuck Lever
2026-03-18 14:21 ` Jeff Layton
2026-03-18 14:15 ` [PATCH v4 2/6] NFSD: Add NFSD_CMD_UNLOCK netlink command with ip scope Chuck Lever
2026-03-18 14:28 ` Jeff Layton
2026-03-18 14:32 ` Chuck Lever
2026-03-18 14:15 ` [PATCH v4 3/6] NFSD: Add filesystem scope to NFSD_CMD_UNLOCK Chuck Lever
2026-03-18 14:29 ` Jeff Layton
2026-03-18 14:15 ` [PATCH v4 4/6] NFSD: Refactor find_one_sb_stid() into find_next_sb_stid() Chuck Lever
2026-03-18 14:30 ` Jeff Layton
2026-03-18 14:15 ` [PATCH v4 5/6] NFSD: Add export-scoped state revocation Chuck Lever
2026-03-18 14:47 ` Jeff Layton [this message]
2026-03-18 14:51 ` Chuck Lever
2026-03-18 14:58 ` Jeff Layton
2026-03-18 17:57 ` Chuck Lever
2026-03-18 14:15 ` [PATCH v4 6/6] NFSD: Add nfsd_file_close_export() for file cache cleanup Chuck Lever
2026-03-18 14:24 ` [PATCH v4 0/6] Automatic NFSv4 state revocation on filesystem unmount Jeff Layton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f582806fcf3c6d6f7416c24c55462fc502cb0ab0.camel@kernel.org \
--to=jlayton@kernel.org \
--cc=Dai.Ngo@oracle.com \
--cc=cel@kernel.org \
--cc=chuck.lever@oracle.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=neil@brown.name \
--cc=okorniev@redhat.com \
--cc=tom@talpey.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox