public inbox for linux-nfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Chuck Lever <cel@kernel.org>
To: Jeff Layton <jlayton@kernel.org>, NeilBrown <neil@brown.name>,
	Olga Kornievskaia <okorniev@redhat.com>,
	Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>
Cc: linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Chuck Lever <chuck.lever@oracle.com>
Subject: Re: [PATCH v4 5/6] NFSD: Add export-scoped state revocation
Date: Wed, 18 Mar 2026 10:51:25 -0400	[thread overview]
Message-ID: <202c9697-82cd-409a-8c5c-ea56a3830a8a@kernel.org> (raw)
In-Reply-To: <f582806fcf3c6d6f7416c24c55462fc502cb0ab0.camel@kernel.org>

On 3/18/26 10:47 AM, Jeff Layton wrote:
> On Wed, 2026-03-18 at 10:15 -0400, Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>>
>> nfsd4_revoke_states() revokes all NFSv4 state on an entire
>> superblock, which is too coarse when multiple exports share a
>> filesystem. Add nfsd4_revoke_export_states() to revoke only
>> state associated with files under a specific export root, then
>> convert nfsd4_revoke_states() to a thin wrapper that passes
>> sb->s_root.
>>
>> nfsd4_revoke_export_states() uses find_next_sb_stid() to locate
>> candidate stids, then verifies each against the export root via
>> nfsd_file_inode_is_in_subtree(). That helper is placed in the
>> file cache layer (filecache.c) because it operates on VFS types
>> with no NFSv4 state dependency. It walks all of an inode's
>> dentry aliases rather than calling d_find_any_alias(), because
>> for hard-linked files an arbitrary alias may fall outside the
>> export subtree even when another alias is inside it.
>>
>> When the export root is the filesystem root, the subtree check
>> is elided and every stid matching the superblock is revoked
>> directly.
>>
>> The NFSD_UNLOCK_TYPE_FILESYSTEM handler now calls
>> nfsd4_revoke_export_states() with the resolved path dentry,
>> enabling subtree-scoped revocation through the netlink
>> interface.
>>
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>>  fs/nfsd/filecache.c | 32 +++++++++++++++++++
>>  fs/nfsd/filecache.h |  1 +
>>  fs/nfsd/nfs4state.c | 92 +++++++++++++++++++++++++++++++++++++----------------
>>  fs/nfsd/nfsctl.c    |  3 +-
>>  fs/nfsd/state.h     |  7 ++++
>>  5 files changed, 107 insertions(+), 28 deletions(-)
>>
>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
>> index 1e2b38ed1d35..cd09be0c5465 100644
>> --- a/fs/nfsd/filecache.c
>> +++ b/fs/nfsd/filecache.c
>> @@ -894,6 +894,38 @@ __nfsd_file_cache_purge(struct net *net)
>>  	nfsd_file_dispose_list(&dispose);
>>  }
>>  
>> +/**
>> + * nfsd_file_inode_is_in_subtree - check whether an inode is under a subtree
>> + * @inode:        inode to test
>> + * @root_dentry:  dentry of the subtree root
>> + *
>> + * Check whether @inode has any dentry alias that falls within the
>> + * subtree rooted at @root_dentry.  Hard-linked files can have aliases
>> + * in multiple directories, so all aliases must be tested.
>> + *
>> + * Return: %true if any dentry alias of @inode is at or below
>> + * @root_dentry, %false otherwise.
>> + */
>> +bool nfsd_file_inode_is_in_subtree(struct inode *inode,
>> +				   struct dentry *root_dentry)
>> +{
>> +	struct dentry *alias;
>> +	bool found = false;
>> +
>> +	/* i_lock stabilizes the alias list; is_subdir() nests
>> +	 * rename_lock (a seqlock) beneath it but does not sleep.
>> +	 */
>> +	spin_lock(&inode->i_lock);
>> +	hlist_for_each_entry(alias, &inode->i_dentry, d_u.d_alias) {
>> +		if (is_subdir(alias, root_dentry)) {
>> +			found = true;
>> +			break;
>> +		}
>> +	}
>> +	spin_unlock(&inode->i_lock);
>> +	return found;
>> +}
>> +
>>  static struct nfsd_fcache_disposal *
>>  nfsd_alloc_fcache_disposal(void)
>>  {
>> diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
>> index b383dbc5b921..36c9a8e388d2 100644
>> --- a/fs/nfsd/filecache.h
>> +++ b/fs/nfsd/filecache.h
>> @@ -70,6 +70,7 @@ struct net *nfsd_file_put_local(struct nfsd_file __rcu **nf);
>>  struct nfsd_file *nfsd_file_get(struct nfsd_file *nf);
>>  struct file *nfsd_file_file(struct nfsd_file *nf);
>>  void nfsd_file_close_inode_sync(struct inode *inode);
>> +bool nfsd_file_inode_is_in_subtree(struct inode *inode, struct dentry *root_dentry);
>>  void nfsd_file_net_dispose(struct nfsd_net *nn);
>>  bool nfsd_file_is_cached(struct inode *inode);
>>  __be32 nfsd_file_acquire_gc(struct svc_rqst *rqstp, struct svc_fh *fhp,
>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>> index 891669b32804..581f38395c42 100644
>> --- a/fs/nfsd/nfs4state.c
>> +++ b/fs/nfsd/nfs4state.c
>> @@ -1763,15 +1763,6 @@ static struct nfs4_stid *find_next_sb_stid(struct nfs4_client *clp,
>>  	return stid;
>>  }
>>  
>> -static struct nfs4_stid *find_one_sb_stid(struct nfs4_client *clp,
>> -					  struct super_block *sb,
>> -					  unsigned int sc_types)
>> -{
>> -	unsigned long id = 0;
>> -
>> -	return find_next_sb_stid(clp, sb, sc_types, &id);
>> -}
>> -
>>  static void revoke_ol_stid(struct nfs4_client *clp,
>>  			   struct nfs4_ol_stateid *stp)
>>  {
>> @@ -1835,20 +1826,19 @@ static void revoke_one_stid(struct nfs4_client *clp, struct nfs4_stid *stid)
>>  }
>>  
>>  /**
>> - * nfsd4_revoke_states - revoke all nfsv4 states associated with given filesystem
>> - * @nn:   used to identify instance of nfsd (there is one per net namespace)
>> - * @sb:   super_block used to identify target filesystem
>> + * nfsd4_revoke_export_states - revoke states associated with a given export
>> + * @nn:           nfsd_net identifying the nfsd instance (one per net namespace)
>> + * @sb:           super_block of the export's filesystem
>> + * @root_dentry:  dentry of the export root directory
>>   *
>>   * All nfs4 states (open, lock, delegation, layout) held by the server instance
>> - * and associated with a file on the given filesystem will be revoked resulting
>> - * in any files being closed and so all references from nfsd to the filesystem
>> - * being released.  Thus nfsd will no longer prevent the filesystem from being
>> - * unmounted.
>> - *
>> - * The clients which own the states will subsequently being notified that the
>> - * states have been "admin-revoked".
>> + * and associated with files under the given export will be revoked.  When
>> + * @root_dentry is the filesystem root, all state on @sb is revoked (equivalent
>> + * to nfsd4_revoke_states).  When @root_dentry is a subdirectory, only state on
>> + * files within that subtree is revoked.
>>   */
>> -void nfsd4_revoke_states(struct nfsd_net *nn, struct super_block *sb)
>> +void nfsd4_revoke_export_states(struct nfsd_net *nn, struct super_block *sb,
>> +				struct dentry *root_dentry)
>>  {
>>  	unsigned int idhashval;
>>  	unsigned int sc_types;
>> @@ -1861,18 +1851,53 @@ void nfsd4_revoke_states(struct nfsd_net *nn, struct super_block *sb)
>>  		struct nfs4_client *clp;
>>  	retry:
>>  		list_for_each_entry(clp, head, cl_idhash) {
>> -			struct nfs4_stid *stid = find_one_sb_stid(clp, sb,
>> -								  sc_types);
>> -			if (stid) {
>> -				spin_unlock(&nn->client_lock);
>> +			struct nfs4_stid *stid;
>> +			/* Resets to zero on each retry; revocation may
>> +			 * alter the IDR, so a stale cursor is unsafe.
>> +			 */
>> +			unsigned long id = 0;
>> +
>> +			while ((stid = find_next_sb_stid(clp, sb,
>> +						sc_types, &id)) != NULL) {
>> +				if (root_dentry != sb->s_root) {
>> +					bool match;
>> +
>> +					/* Bare inc to pin clp; get_client_locked() is
>> +					 * not used because its courtesy-to-active
>> +					 * transition is unwanted during revocation.
>> +					 */
>> +					atomic_inc(&clp->cl_rpc_users);
>> +					spin_unlock(&nn->client_lock);
>> +					match = nfsd_file_inode_is_in_subtree(
>> +							stid->sc_file->fi_inode,
>> +							root_dentry);
> 
> Ouch the hardlinked thing makes this hard to reason about.
> 
> Ok, so suppose we have two exports on the same superblock.
> 
> /export/foo
> /export/bar
> 
> One is exported to one client foo and one to another to client bar.
> There is a file hardlinked across those directories:
> 
> $ touch /export/foo/baz
> $ ln /export/bar/baz /export/foo/baz
> 
> Now, client foo opens /export/foo/baz, and client bar opens
> /export/bar/baz.
> 
> /export/bar is unexported and state under it revoked. Won't client
> foo's state end up being revoked too in that case?
> 
> Note that the different hardlinks should end up with different
> filehandles since they are exposed to the clients via different
> exports.
> 
> I wonder... do we need keep track of the export under which a stateid
> was acquired so we can properly revoke the right ones in this
> situation?

If I understand your comment correctly, I believe that is what the
earlier patches in this series do -- tie the state IDs to a particular
export.



-- 
Chuck Lever

  reply	other threads:[~2026-03-18 14:51 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-18 14:15 [PATCH v4 0/6] Automatic NFSv4 state revocation on filesystem unmount Chuck Lever
2026-03-18 14:15 ` [PATCH v4 1/6] NFSD: Extract revoke_one_stid() utility function Chuck Lever
2026-03-18 14:21   ` Jeff Layton
2026-03-18 14:15 ` [PATCH v4 2/6] NFSD: Add NFSD_CMD_UNLOCK netlink command with ip scope Chuck Lever
2026-03-18 14:28   ` Jeff Layton
2026-03-18 14:32     ` Chuck Lever
2026-03-18 14:15 ` [PATCH v4 3/6] NFSD: Add filesystem scope to NFSD_CMD_UNLOCK Chuck Lever
2026-03-18 14:29   ` Jeff Layton
2026-03-18 14:15 ` [PATCH v4 4/6] NFSD: Refactor find_one_sb_stid() into find_next_sb_stid() Chuck Lever
2026-03-18 14:30   ` Jeff Layton
2026-03-18 14:15 ` [PATCH v4 5/6] NFSD: Add export-scoped state revocation Chuck Lever
2026-03-18 14:47   ` Jeff Layton
2026-03-18 14:51     ` Chuck Lever [this message]
2026-03-18 14:58       ` Jeff Layton
2026-03-18 17:57         ` Chuck Lever
2026-03-18 14:15 ` [PATCH v4 6/6] NFSD: Add nfsd_file_close_export() for file cache cleanup Chuck Lever
2026-03-18 14:24 ` [PATCH v4 0/6] Automatic NFSv4 state revocation on filesystem unmount Jeff Layton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=202c9697-82cd-409a-8c5c-ea56a3830a8a@kernel.org \
    --to=cel@kernel.org \
    --cc=Dai.Ngo@oracle.com \
    --cc=chuck.lever@oracle.com \
    --cc=jlayton@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=neil@brown.name \
    --cc=okorniev@redhat.com \
    --cc=tom@talpey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox