From: Chuck Lever <cel@kernel.org>
To: Jeff Layton <jlayton@kernel.org>, NeilBrown <neil@brown.name>,
Olga Kornievskaia <okorniev@redhat.com>,
Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>
Cc: linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Chuck Lever <chuck.lever@oracle.com>
Subject: Re: [PATCH v4 5/6] NFSD: Add export-scoped state revocation
Date: Wed, 18 Mar 2026 13:57:06 -0400 [thread overview]
Message-ID: <2ef59bf1-c4f7-4818-9f47-16d056edcb8b@kernel.org> (raw)
In-Reply-To: <c42d3896eb47aa3bb766f29776fd0aac2a1eb07d.camel@kernel.org>
On 3/18/26 10:58 AM, Jeff Layton wrote:
> On Wed, 2026-03-18 at 10:51 -0400, Chuck Lever wrote:
>> On 3/18/26 10:47 AM, Jeff Layton wrote:
>>> On Wed, 2026-03-18 at 10:15 -0400, Chuck Lever wrote:
>>>> From: Chuck Lever <chuck.lever@oracle.com>
>>>>
>>>> nfsd4_revoke_states() revokes all NFSv4 state on an entire
>>>> superblock, which is too coarse when multiple exports share a
>>>> filesystem. Add nfsd4_revoke_export_states() to revoke only
>>>> state associated with files under a specific export root, then
>>>> convert nfsd4_revoke_states() to a thin wrapper that passes
>>>> sb->s_root.
>>>>
>>>> nfsd4_revoke_export_states() uses find_next_sb_stid() to locate
>>>> candidate stids, then verifies each against the export root via
>>>> nfsd_file_inode_is_in_subtree(). That helper is placed in the
>>>> file cache layer (filecache.c) because it operates on VFS types
>>>> with no NFSv4 state dependency. It walks all of an inode's
>>>> dentry aliases rather than calling d_find_any_alias(), because
>>>> for hard-linked files an arbitrary alias may fall outside the
>>>> export subtree even when another alias is inside it.
>>>>
>>>> When the export root is the filesystem root, the subtree check
>>>> is elided and every stid matching the superblock is revoked
>>>> directly.
>>>>
>>>> The NFSD_UNLOCK_TYPE_FILESYSTEM handler now calls
>>>> nfsd4_revoke_export_states() with the resolved path dentry,
>>>> enabling subtree-scoped revocation through the netlink
>>>> interface.
>>>>
>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>>> ---
>>>> fs/nfsd/filecache.c | 32 +++++++++++++++++++
>>>> fs/nfsd/filecache.h | 1 +
>>>> fs/nfsd/nfs4state.c | 92 +++++++++++++++++++++++++++++++++++++----------------
>>>> fs/nfsd/nfsctl.c | 3 +-
>>>> fs/nfsd/state.h | 7 ++++
>>>> 5 files changed, 107 insertions(+), 28 deletions(-)
>>>>
>>>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
>>>> index 1e2b38ed1d35..cd09be0c5465 100644
>>>> --- a/fs/nfsd/filecache.c
>>>> +++ b/fs/nfsd/filecache.c
>>>> @@ -894,6 +894,38 @@ __nfsd_file_cache_purge(struct net *net)
>>>> nfsd_file_dispose_list(&dispose);
>>>> }
>>>>
>>>> +/**
>>>> + * nfsd_file_inode_is_in_subtree - check whether an inode is under a subtree
>>>> + * @inode: inode to test
>>>> + * @root_dentry: dentry of the subtree root
>>>> + *
>>>> + * Check whether @inode has any dentry alias that falls within the
>>>> + * subtree rooted at @root_dentry. Hard-linked files can have aliases
>>>> + * in multiple directories, so all aliases must be tested.
>>>> + *
>>>> + * Return: %true if any dentry alias of @inode is at or below
>>>> + * @root_dentry, %false otherwise.
>>>> + */
>>>> +bool nfsd_file_inode_is_in_subtree(struct inode *inode,
>>>> + struct dentry *root_dentry)
>>>> +{
>>>> + struct dentry *alias;
>>>> + bool found = false;
>>>> +
>>>> + /* i_lock stabilizes the alias list; is_subdir() nests
>>>> + * rename_lock (a seqlock) beneath it but does not sleep.
>>>> + */
>>>> + spin_lock(&inode->i_lock);
>>>> + hlist_for_each_entry(alias, &inode->i_dentry, d_u.d_alias) {
>>>> + if (is_subdir(alias, root_dentry)) {
>>>> + found = true;
>>>> + break;
>>>> + }
>>>> + }
>>>> + spin_unlock(&inode->i_lock);
>>>> + return found;
>>>> +}
>>>> +
>>>> static struct nfsd_fcache_disposal *
>>>> nfsd_alloc_fcache_disposal(void)
>>>> {
>>>> diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
>>>> index b383dbc5b921..36c9a8e388d2 100644
>>>> --- a/fs/nfsd/filecache.h
>>>> +++ b/fs/nfsd/filecache.h
>>>> @@ -70,6 +70,7 @@ struct net *nfsd_file_put_local(struct nfsd_file __rcu **nf);
>>>> struct nfsd_file *nfsd_file_get(struct nfsd_file *nf);
>>>> struct file *nfsd_file_file(struct nfsd_file *nf);
>>>> void nfsd_file_close_inode_sync(struct inode *inode);
>>>> +bool nfsd_file_inode_is_in_subtree(struct inode *inode, struct dentry *root_dentry);
>>>> void nfsd_file_net_dispose(struct nfsd_net *nn);
>>>> bool nfsd_file_is_cached(struct inode *inode);
>>>> __be32 nfsd_file_acquire_gc(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>>> index 891669b32804..581f38395c42 100644
>>>> --- a/fs/nfsd/nfs4state.c
>>>> +++ b/fs/nfsd/nfs4state.c
>>>> @@ -1763,15 +1763,6 @@ static struct nfs4_stid *find_next_sb_stid(struct nfs4_client *clp,
>>>> return stid;
>>>> }
>>>>
>>>> -static struct nfs4_stid *find_one_sb_stid(struct nfs4_client *clp,
>>>> - struct super_block *sb,
>>>> - unsigned int sc_types)
>>>> -{
>>>> - unsigned long id = 0;
>>>> -
>>>> - return find_next_sb_stid(clp, sb, sc_types, &id);
>>>> -}
>>>> -
>>>> static void revoke_ol_stid(struct nfs4_client *clp,
>>>> struct nfs4_ol_stateid *stp)
>>>> {
>>>> @@ -1835,20 +1826,19 @@ static void revoke_one_stid(struct nfs4_client *clp, struct nfs4_stid *stid)
>>>> }
>>>>
>>>> /**
>>>> - * nfsd4_revoke_states - revoke all nfsv4 states associated with given filesystem
>>>> - * @nn: used to identify instance of nfsd (there is one per net namespace)
>>>> - * @sb: super_block used to identify target filesystem
>>>> + * nfsd4_revoke_export_states - revoke states associated with a given export
>>>> + * @nn: nfsd_net identifying the nfsd instance (one per net namespace)
>>>> + * @sb: super_block of the export's filesystem
>>>> + * @root_dentry: dentry of the export root directory
>>>> *
>>>> * All nfs4 states (open, lock, delegation, layout) held by the server instance
>>>> - * and associated with a file on the given filesystem will be revoked resulting
>>>> - * in any files being closed and so all references from nfsd to the filesystem
>>>> - * being released. Thus nfsd will no longer prevent the filesystem from being
>>>> - * unmounted.
>>>> - *
>>>> - * The clients which own the states will subsequently being notified that the
>>>> - * states have been "admin-revoked".
>>>> + * and associated with files under the given export will be revoked. When
>>>> + * @root_dentry is the filesystem root, all state on @sb is revoked (equivalent
>>>> + * to nfsd4_revoke_states). When @root_dentry is a subdirectory, only state on
>>>> + * files within that subtree is revoked.
>>>> */
>>>> -void nfsd4_revoke_states(struct nfsd_net *nn, struct super_block *sb)
>>>> +void nfsd4_revoke_export_states(struct nfsd_net *nn, struct super_block *sb,
>>>> + struct dentry *root_dentry)
>>>> {
>>>> unsigned int idhashval;
>>>> unsigned int sc_types;
>>>> @@ -1861,18 +1851,53 @@ void nfsd4_revoke_states(struct nfsd_net *nn, struct super_block *sb)
>>>> struct nfs4_client *clp;
>>>> retry:
>>>> list_for_each_entry(clp, head, cl_idhash) {
>>>> - struct nfs4_stid *stid = find_one_sb_stid(clp, sb,
>>>> - sc_types);
>>>> - if (stid) {
>>>> - spin_unlock(&nn->client_lock);
>>>> + struct nfs4_stid *stid;
>>>> + /* Resets to zero on each retry; revocation may
>>>> + * alter the IDR, so a stale cursor is unsafe.
>>>> + */
>>>> + unsigned long id = 0;
>>>> +
>>>> + while ((stid = find_next_sb_stid(clp, sb,
>>>> + sc_types, &id)) != NULL) {
>>>> + if (root_dentry != sb->s_root) {
>>>> + bool match;
>>>> +
>>>> + /* Bare inc to pin clp; get_client_locked() is
>>>> + * not used because its courtesy-to-active
>>>> + * transition is unwanted during revocation.
>>>> + */
>>>> + atomic_inc(&clp->cl_rpc_users);
>>>> + spin_unlock(&nn->client_lock);
>>>> + match = nfsd_file_inode_is_in_subtree(
>>>> + stid->sc_file->fi_inode,
>>>> + root_dentry);
>>>
>>> Ouch the hardlinked thing makes this hard to reason about.
>>>
>>> Ok, so suppose we have two exports on the same superblock.
>>>
>>> /export/foo
>>> /export/bar
>>>
>>> One is exported to one client foo and one to another to client bar.
>>> There is a file hardlinked across those directories:
>>>
>>> $ touch /export/foo/baz
>>> $ ln /export/bar/baz /export/foo/baz
>>>
>>> Now, client foo opens /export/foo/baz, and client bar opens
>>> /export/bar/baz.
>>>
>>> /export/bar is unexported and state under it revoked. Won't client
>>> foo's state end up being revoked too in that case?
>>>
>>> Note that the different hardlinks should end up with different
>>> filehandles since they are exposed to the clients via different
>>> exports.
>>>
>>> I wonder... do we need keep track of the export under which a stateid
>>> was acquired so we can properly revoke the right ones in this
>>> situation?
>>
>> If I understand your comment correctly, I believe that is what the
>> earlier patches in this series do -- tie the state IDs to a particular
>> export.
>>
>>
>
> I don't think so, not AFAICT.
>
> + * Check whether @inode has any dentry alias that falls within the
> + * subtree rooted at @root_dentry. Hard-linked files can have aliases
> + * in multiple directories, so all aliases must be tested.
>
> In this case, you're calling nfsd_file_inode_is_in_subtree(). In the
> above example, that's going to return true for foo's state, even though
> bar's export is the one being revoked.
>
> I think to do this properly, you'd have to track an st_export field in
> struct nfs4_ol_stateid (or maybe in struct nfs4_stid).
>
> OTOH, hardlinks are rare so maybe this isn't worth fretting over.
Probably rare, but the point is the current approach is difficult to
reason about, thus will be hard to maintain. For v5, I'll try linking
NFSv4 state to exports.
--
Chuck Lever
next prev parent reply other threads:[~2026-03-18 17:57 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-18 14:15 [PATCH v4 0/6] Automatic NFSv4 state revocation on filesystem unmount Chuck Lever
2026-03-18 14:15 ` [PATCH v4 1/6] NFSD: Extract revoke_one_stid() utility function Chuck Lever
2026-03-18 14:21 ` Jeff Layton
2026-03-18 14:15 ` [PATCH v4 2/6] NFSD: Add NFSD_CMD_UNLOCK netlink command with ip scope Chuck Lever
2026-03-18 14:28 ` Jeff Layton
2026-03-18 14:32 ` Chuck Lever
2026-03-18 14:15 ` [PATCH v4 3/6] NFSD: Add filesystem scope to NFSD_CMD_UNLOCK Chuck Lever
2026-03-18 14:29 ` Jeff Layton
2026-03-18 14:15 ` [PATCH v4 4/6] NFSD: Refactor find_one_sb_stid() into find_next_sb_stid() Chuck Lever
2026-03-18 14:30 ` Jeff Layton
2026-03-18 14:15 ` [PATCH v4 5/6] NFSD: Add export-scoped state revocation Chuck Lever
2026-03-18 14:47 ` Jeff Layton
2026-03-18 14:51 ` Chuck Lever
2026-03-18 14:58 ` Jeff Layton
2026-03-18 17:57 ` Chuck Lever [this message]
2026-03-18 14:15 ` [PATCH v4 6/6] NFSD: Add nfsd_file_close_export() for file cache cleanup Chuck Lever
2026-03-18 14:24 ` [PATCH v4 0/6] Automatic NFSv4 state revocation on filesystem unmount Jeff Layton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2ef59bf1-c4f7-4818-9f47-16d056edcb8b@kernel.org \
--to=cel@kernel.org \
--cc=Dai.Ngo@oracle.com \
--cc=chuck.lever@oracle.com \
--cc=jlayton@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=neil@brown.name \
--cc=okorniev@redhat.com \
--cc=tom@talpey.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox