public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@kernel.org>
To: Chuck Lever <cel@kernel.org>, NeilBrown <neil@brown.name>,
	Olga Kornievskaia <okorniev@redhat.com>,
	Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>
Cc: linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Chuck Lever <chuck.lever@oracle.com>
Subject: Re: [PATCH v4 5/6] NFSD: Add export-scoped state revocation
Date: Wed, 18 Mar 2026 10:58:53 -0400	[thread overview]
Message-ID: <c42d3896eb47aa3bb766f29776fd0aac2a1eb07d.camel@kernel.org> (raw)
In-Reply-To: <202c9697-82cd-409a-8c5c-ea56a3830a8a@kernel.org>

On Wed, 2026-03-18 at 10:51 -0400, Chuck Lever wrote:
> On 3/18/26 10:47 AM, Jeff Layton wrote:
> > On Wed, 2026-03-18 at 10:15 -0400, Chuck Lever wrote:
> > > From: Chuck Lever <chuck.lever@oracle.com>
> > > 
> > > nfsd4_revoke_states() revokes all NFSv4 state on an entire
> > > superblock, which is too coarse when multiple exports share a
> > > filesystem. Add nfsd4_revoke_export_states() to revoke only
> > > state associated with files under a specific export root, then
> > > convert nfsd4_revoke_states() to a thin wrapper that passes
> > > sb->s_root.
> > > 
> > > nfsd4_revoke_export_states() uses find_next_sb_stid() to locate
> > > candidate stids, then verifies each against the export root via
> > > nfsd_file_inode_is_in_subtree(). That helper is placed in the
> > > file cache layer (filecache.c) because it operates on VFS types
> > > with no NFSv4 state dependency. It walks all of an inode's
> > > dentry aliases rather than calling d_find_any_alias(), because
> > > for hard-linked files an arbitrary alias may fall outside the
> > > export subtree even when another alias is inside it.
> > > 
> > > When the export root is the filesystem root, the subtree check
> > > is elided and every stid matching the superblock is revoked
> > > directly.
> > > 
> > > The NFSD_UNLOCK_TYPE_FILESYSTEM handler now calls
> > > nfsd4_revoke_export_states() with the resolved path dentry,
> > > enabling subtree-scoped revocation through the netlink
> > > interface.
> > > 
> > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > > ---
> > >  fs/nfsd/filecache.c | 32 +++++++++++++++++++
> > >  fs/nfsd/filecache.h |  1 +
> > >  fs/nfsd/nfs4state.c | 92 +++++++++++++++++++++++++++++++++++++----------------
> > >  fs/nfsd/nfsctl.c    |  3 +-
> > >  fs/nfsd/state.h     |  7 ++++
> > >  5 files changed, 107 insertions(+), 28 deletions(-)
> > > 
> > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> > > index 1e2b38ed1d35..cd09be0c5465 100644
> > > --- a/fs/nfsd/filecache.c
> > > +++ b/fs/nfsd/filecache.c
> > > @@ -894,6 +894,38 @@ __nfsd_file_cache_purge(struct net *net)
> > >  	nfsd_file_dispose_list(&dispose);
> > >  }
> > >  
> > > +/**
> > > + * nfsd_file_inode_is_in_subtree - check whether an inode is under a subtree
> > > + * @inode:        inode to test
> > > + * @root_dentry:  dentry of the subtree root
> > > + *
> > > + * Check whether @inode has any dentry alias that falls within the
> > > + * subtree rooted at @root_dentry.  Hard-linked files can have aliases
> > > + * in multiple directories, so all aliases must be tested.
> > > + *
> > > + * Return: %true if any dentry alias of @inode is at or below
> > > + * @root_dentry, %false otherwise.
> > > + */
> > > +bool nfsd_file_inode_is_in_subtree(struct inode *inode,
> > > +				   struct dentry *root_dentry)
> > > +{
> > > +	struct dentry *alias;
> > > +	bool found = false;
> > > +
> > > +	/* i_lock stabilizes the alias list; is_subdir() nests
> > > +	 * rename_lock (a seqlock) beneath it but does not sleep.
> > > +	 */
> > > +	spin_lock(&inode->i_lock);
> > > +	hlist_for_each_entry(alias, &inode->i_dentry, d_u.d_alias) {
> > > +		if (is_subdir(alias, root_dentry)) {
> > > +			found = true;
> > > +			break;
> > > +		}
> > > +	}
> > > +	spin_unlock(&inode->i_lock);
> > > +	return found;
> > > +}
> > > +
> > >  static struct nfsd_fcache_disposal *
> > >  nfsd_alloc_fcache_disposal(void)
> > >  {
> > > diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
> > > index b383dbc5b921..36c9a8e388d2 100644
> > > --- a/fs/nfsd/filecache.h
> > > +++ b/fs/nfsd/filecache.h
> > > @@ -70,6 +70,7 @@ struct net *nfsd_file_put_local(struct nfsd_file __rcu **nf);
> > >  struct nfsd_file *nfsd_file_get(struct nfsd_file *nf);
> > >  struct file *nfsd_file_file(struct nfsd_file *nf);
> > >  void nfsd_file_close_inode_sync(struct inode *inode);
> > > +bool nfsd_file_inode_is_in_subtree(struct inode *inode, struct dentry *root_dentry);
> > >  void nfsd_file_net_dispose(struct nfsd_net *nn);
> > >  bool nfsd_file_is_cached(struct inode *inode);
> > >  __be32 nfsd_file_acquire_gc(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > index 891669b32804..581f38395c42 100644
> > > --- a/fs/nfsd/nfs4state.c
> > > +++ b/fs/nfsd/nfs4state.c
> > > @@ -1763,15 +1763,6 @@ static struct nfs4_stid *find_next_sb_stid(struct nfs4_client *clp,
> > >  	return stid;
> > >  }
> > >  
> > > -static struct nfs4_stid *find_one_sb_stid(struct nfs4_client *clp,
> > > -					  struct super_block *sb,
> > > -					  unsigned int sc_types)
> > > -{
> > > -	unsigned long id = 0;
> > > -
> > > -	return find_next_sb_stid(clp, sb, sc_types, &id);
> > > -}
> > > -
> > >  static void revoke_ol_stid(struct nfs4_client *clp,
> > >  			   struct nfs4_ol_stateid *stp)
> > >  {
> > > @@ -1835,20 +1826,19 @@ static void revoke_one_stid(struct nfs4_client *clp, struct nfs4_stid *stid)
> > >  }
> > >  
> > >  /**
> > > - * nfsd4_revoke_states - revoke all nfsv4 states associated with given filesystem
> > > - * @nn:   used to identify instance of nfsd (there is one per net namespace)
> > > - * @sb:   super_block used to identify target filesystem
> > > + * nfsd4_revoke_export_states - revoke states associated with a given export
> > > + * @nn:           nfsd_net identifying the nfsd instance (one per net namespace)
> > > + * @sb:           super_block of the export's filesystem
> > > + * @root_dentry:  dentry of the export root directory
> > >   *
> > >   * All nfs4 states (open, lock, delegation, layout) held by the server instance
> > > - * and associated with a file on the given filesystem will be revoked resulting
> > > - * in any files being closed and so all references from nfsd to the filesystem
> > > - * being released.  Thus nfsd will no longer prevent the filesystem from being
> > > - * unmounted.
> > > - *
> > > - * The clients which own the states will subsequently being notified that the
> > > - * states have been "admin-revoked".
> > > + * and associated with files under the given export will be revoked.  When
> > > + * @root_dentry is the filesystem root, all state on @sb is revoked (equivalent
> > > + * to nfsd4_revoke_states).  When @root_dentry is a subdirectory, only state on
> > > + * files within that subtree is revoked.
> > >   */
> > > -void nfsd4_revoke_states(struct nfsd_net *nn, struct super_block *sb)
> > > +void nfsd4_revoke_export_states(struct nfsd_net *nn, struct super_block *sb,
> > > +				struct dentry *root_dentry)
> > >  {
> > >  	unsigned int idhashval;
> > >  	unsigned int sc_types;
> > > @@ -1861,18 +1851,53 @@ void nfsd4_revoke_states(struct nfsd_net *nn, struct super_block *sb)
> > >  		struct nfs4_client *clp;
> > >  	retry:
> > >  		list_for_each_entry(clp, head, cl_idhash) {
> > > -			struct nfs4_stid *stid = find_one_sb_stid(clp, sb,
> > > -								  sc_types);
> > > -			if (stid) {
> > > -				spin_unlock(&nn->client_lock);
> > > +			struct nfs4_stid *stid;
> > > +			/* Resets to zero on each retry; revocation may
> > > +			 * alter the IDR, so a stale cursor is unsafe.
> > > +			 */
> > > +			unsigned long id = 0;
> > > +
> > > +			while ((stid = find_next_sb_stid(clp, sb,
> > > +						sc_types, &id)) != NULL) {
> > > +				if (root_dentry != sb->s_root) {
> > > +					bool match;
> > > +
> > > +					/* Bare inc to pin clp; get_client_locked() is
> > > +					 * not used because its courtesy-to-active
> > > +					 * transition is unwanted during revocation.
> > > +					 */
> > > +					atomic_inc(&clp->cl_rpc_users);
> > > +					spin_unlock(&nn->client_lock);
> > > +					match = nfsd_file_inode_is_in_subtree(
> > > +							stid->sc_file->fi_inode,
> > > +							root_dentry);
> > 
> > Ouch the hardlinked thing makes this hard to reason about.
> > 
> > Ok, so suppose we have two exports on the same superblock.
> > 
> > /export/foo
> > /export/bar
> > 
> > One is exported to one client foo and one to another to client bar.
> > There is a file hardlinked across those directories:
> > 
> > $ touch /export/foo/baz
> > $ ln /export/bar/baz /export/foo/baz
> > 
> > Now, client foo opens /export/foo/baz, and client bar opens
> > /export/bar/baz.
> > 
> > /export/bar is unexported and state under it revoked. Won't client
> > foo's state end up being revoked too in that case?
> > 
> > Note that the different hardlinks should end up with different
> > filehandles since they are exposed to the clients via different
> > exports.
> > 
> > I wonder... do we need keep track of the export under which a stateid
> > was acquired so we can properly revoke the right ones in this
> > situation?
> 
> If I understand your comment correctly, I believe that is what the
> earlier patches in this series do -- tie the state IDs to a particular
> export.
> 
> 

I don't think so, not AFAICT. 

+ * Check whether @inode has any dentry alias that falls within the
+ * subtree rooted at @root_dentry.  Hard-linked files can have aliases
+ * in multiple directories, so all aliases must be tested.

In this case, you're calling nfsd_file_inode_is_in_subtree(). In the
above example, that's going to return true for foo's state, even though
bar's export is the one being revoked.

I think to do this properly, you'd have to track an st_export field in
struct nfs4_ol_stateid (or maybe in struct nfs4_stid).

OTOH, hardlinks are rare so maybe this isn't worth fretting over.
-- 
Jeff Layton <jlayton@kernel.org>

  reply	other threads:[~2026-03-18 14:58 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-18 14:15 [PATCH v4 0/6] Automatic NFSv4 state revocation on filesystem unmount Chuck Lever
2026-03-18 14:15 ` [PATCH v4 1/6] NFSD: Extract revoke_one_stid() utility function Chuck Lever
2026-03-18 14:21   ` Jeff Layton
2026-03-18 14:15 ` [PATCH v4 2/6] NFSD: Add NFSD_CMD_UNLOCK netlink command with ip scope Chuck Lever
2026-03-18 14:28   ` Jeff Layton
2026-03-18 14:32     ` Chuck Lever
2026-03-18 14:15 ` [PATCH v4 3/6] NFSD: Add filesystem scope to NFSD_CMD_UNLOCK Chuck Lever
2026-03-18 14:29   ` Jeff Layton
2026-03-18 14:15 ` [PATCH v4 4/6] NFSD: Refactor find_one_sb_stid() into find_next_sb_stid() Chuck Lever
2026-03-18 14:30   ` Jeff Layton
2026-03-18 14:15 ` [PATCH v4 5/6] NFSD: Add export-scoped state revocation Chuck Lever
2026-03-18 14:47   ` Jeff Layton
2026-03-18 14:51     ` Chuck Lever
2026-03-18 14:58       ` Jeff Layton [this message]
2026-03-18 17:57         ` Chuck Lever
2026-03-18 14:15 ` [PATCH v4 6/6] NFSD: Add nfsd_file_close_export() for file cache cleanup Chuck Lever
2026-03-18 14:24 ` [PATCH v4 0/6] Automatic NFSv4 state revocation on filesystem unmount Jeff Layton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c42d3896eb47aa3bb766f29776fd0aac2a1eb07d.camel@kernel.org \
    --to=jlayton@kernel.org \
    --cc=Dai.Ngo@oracle.com \
    --cc=cel@kernel.org \
    --cc=chuck.lever@oracle.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=neil@brown.name \
    --cc=okorniev@redhat.com \
    --cc=tom@talpey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox