linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@kernel.org>
To: NeilBrown <neilb@suse.de>
Cc: Chuck Lever <chuck.lever@oracle.com>,
	Jeff Layton <jlayton@kernel.org>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	Anna Schumaker <anna@kernel.org>,
	Trond Myklebust <trondmy@hammerspace.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH v15 16/26] nfsd: add LOCALIO support
Date: Fri, 6 Sep 2024 11:04:16 -0400	[thread overview]
Message-ID: <ZtsZ8IEoV-DyqAzj@kernel.org> (raw)
In-Reply-To: <172557924809.4433.12586767127138915683@noble.neil.brown.name>

On Fri, Sep 06, 2024 at 09:34:08AM +1000, NeilBrown wrote:
> On Fri, 06 Sep 2024, Mike Snitzer wrote:
> > On Wed, Sep 04, 2024 at 09:47:07AM -0400, Chuck Lever wrote:
> > > On Wed, Sep 04, 2024 at 03:01:46PM +1000, NeilBrown wrote:
> > > > On Wed, 04 Sep 2024, NeilBrown wrote:
> > > > > 
> > > > > I agree that dropping and reclaiming a lock is an anti-pattern and in
> > > > > best avoided in general.  I cannot see a better alternative in this
> > > > > case.
> > > > 
> > > > It occurred to me what I should spell out the alternate that I DO see so
> > > > you have the option of disagreeing with my assessment that it isn't
> > > > "better".
> > > > 
> > > > We need RCU to call into nfsd, we need a per-cpu ref on the net (which
> > > > we can only get inside nfsd) and NOT RCU to call
> > > > nfsd_file_acquire_local().
> > > > 
> > > > The current code combines these (because they are only used together)
> > > > and so the need to drop rcu. 
> > > > 
> > > > I thought briefly that it could simply drop rcu and leave it dropped
> > > > (__releases(rcu)) but not only do I generally like that LESS than
> > > > dropping and reclaiming, I think it would be buggy.  While in the nfsd
> > > > module code we need to be holding either rcu or a ref on the server else
> > > > the code could disappear out from under the CPU.  So if we exit without
> > > > a ref on the server - which we do if nfsd_file_acquire_local() fails -
> > > > then we need to reclaim RCU *before* dropping the ref.  So the current
> > > > code is slightly buggy.
> > > > 
> > > > We could instead split the combined call into multiple nfs_to
> > > > interfaces.
> > > > 
> > > > So nfs_open_local_fh() in nfs_common/nfslocalio.c would be something
> > > > like:
> > > > 
> > > >  rcu_read_lock();
> > > >  net = READ_ONCE(uuid->net);
> > > >  if (!net || !nfs_to.get_net(net)) {
> > > >        rcu_read_unlock();
> > > >        return ERR_PTR(-ENXIO);
> > > >  }
> > > >  rcu_read_unlock();
> > > >  localio = nfs_to.nfsd_open_local_fh(....);
> > > >  if (IS_ERR(localio))
> > > >        nfs_to.put_net(net);
> > > >  return localio;
> > > > 
> > > > So we have 3 interfaces instead of 1, but no hidden unlock/lock.
> > > 
> > > Splitting up the function call occurred to me as well, but I didn't
> > > come up with a specific bit of surgery. Thanks for the suggestion.
> > > 
> > > At this point, my concern is that we will lose your cogent
> > > explanation of why the release/lock is done. Having it in email is
> > > great, but email is more ephemeral than actually putting it in the
> > > code.
> > > 
> > > 
> > > > As I said, I don't think this is a net win, but reasonable people might
> > > > disagree with me.
> > > 
> > > The "win" here is that it makes this code self-documenting and
> > > somewhat less likely to be broken down the road by changes in and
> > > around this area. Since I'm more forgetful these days I lean towards
> > > the more obvious kinds of coding solutions. ;-)
> > > 
> > > Mike, how do you feel about the 3-interface suggestion?
> > 
> > I dislike expanding from 1 indirect function call to 2 in rapid
> > succession (3 for the error path, not a problem, just being precise.
> > But I otherwise like it.. maybe.. heh.
> > 
> > FYI, I did run with the suggestion to make nfs_to a pointer that just
> > needs a simple assignment rather than memcpy to initialize.  So Neil's
> > above code becames:
> > 
> >         rcu_read_lock();
> >         net = rcu_dereference(uuid->net);
> >         if (!net || !nfs_to->nfsd_serv_try_get(net)) {
> >                 rcu_read_unlock();
> >                 return ERR_PTR(-ENXIO);
> >         }
> >         rcu_read_unlock();
> >         /* We have an implied reference to net thanks to nfsd_serv_try_get */
> >         localio = nfs_to->nfsd_open_local_fh(net, uuid->dom, rpc_clnt,
> >                                              cred, nfs_fh, fmode);
> >         if (IS_ERR(localio))
> >                 nfs_to->nfsd_serv_put(net);
> >         return localio;
> > 
> > I do think it cleans the code up... full patch is here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=nfs-localio-for-next.v15-with-fixups&id=e85306941878a87070176702de687f2779436061
> > 
> > But I'm still on the fence.. someone help push me over!
> 
> I think the new code is unquestionable clearer, and not taking this
> approach would be a micro-optimisation which would need to be
> numerically justified.  So I'm pushing for the three-interface version
> (despite what I said before).
> 
> Unfortunately the new code is not bug-free - not quite.
> As soon as nfs_to->nfsd_serv_put() calls percpu_ref_put() the nfsd
> module can be unloaded, and the "return" instruction might not be
> present.  For this to go wrong would require a lot of bad luck, but if
> the CPU took an interrupt at the wrong time were would be room.
> 
> [Ever since module_put_and_exit() was added (now ..and_kthread_exit)
>  I've been sensitive to dropping the ref to a module in code running in
>  the module]
> 
> So I think nfsd_serv_put (and nfsd_serv_try_get() __must_hold(RCU) and
> nfs_open_local_fh() needs rcu_read_lock() before calling
> nfs_to->nfsd_serv_put(net).

OK, yes I can see that, I implemented what you suggested at the end of
your reply (see inline patch below)...

But I'd just like to point out that something like the below patch
wouldn't be needed if we kept my "heavy" approach (nfs reference on
nfsd modules via nfs_common using request_symbol):
https://marc.info/?l=linux-nfs&m=172499445027800&w=2
(that patch has stuff I since cleaned up, e.g. removed typedefs and
EXPORT_SYMBOL_GPLs..)

I knew we were going to pay for being too cute with how nfs took its
reference on nfsd.

So here we are, needing fiddly incremental fixes like this to close a
really-small-yet-will-be-deadly race:

diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c
index c29cdf51c458..d124c265b8fd 100644
--- a/fs/nfs/localio.c
+++ b/fs/nfs/localio.c
@@ -341,7 +341,7 @@ nfs_local_pgio_release(struct nfs_local_kiocb *iocb)
 {
 	struct nfs_pgio_header *hdr = iocb->hdr;
 
-	nfs_to->nfsd_file_put_local(iocb->localio);
+	nfs_to_nfsd_file_put_local(iocb->localio);
 	nfs_local_iocb_free(iocb);
 	nfs_local_hdr_release(hdr, hdr->task.tk_ops);
 }
@@ -622,7 +622,7 @@ int nfs_local_doio(struct nfs_client *clp, struct nfsd_file *localio,
 	}
 out:
 	if (status != 0) {
-		nfs_to->nfsd_file_put_local(localio);
+		nfs_to_nfsd_file_put_local(localio);
 		hdr->task.tk_status = status;
 		nfs_local_hdr_release(hdr, call_ops);
 	}
@@ -673,7 +673,7 @@ nfs_local_release_commit_data(struct nfsd_file *localio,
 		struct nfs_commit_data *data,
 		const struct rpc_call_ops *call_ops)
 {
-	nfs_to->nfsd_file_put_local(localio);
+	nfs_to_nfsd_file_put_local(localio);
 	call_ops->rpc_call_done(&data->task, data);
 	call_ops->rpc_release(data);
 }
diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c
index 42b479b9191f..5c8ce5066c16 100644
--- a/fs/nfs_common/nfslocalio.c
+++ b/fs/nfs_common/nfslocalio.c
@@ -142,8 +142,11 @@ struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *uuid,
 	/* We have an implied reference to net thanks to nfsd_serv_try_get */
 	localio = nfs_to->nfsd_open_local_fh(net, uuid->dom, rpc_clnt,
 					     cred, nfs_fh, fmode);
-	if (IS_ERR(localio))
+	if (IS_ERR(localio)) {
+		rcu_read_lock();
 		nfs_to->nfsd_serv_put(net);
+		rcu_read_unlock();
+	}
 	return localio;
 }
 EXPORT_SYMBOL_GPL(nfs_open_local_fh);
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 7ff477b40bcd..0d389051d08d 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -398,7 +398,7 @@ nfsd_file_put(struct nfsd_file *nf)
  * reference to the associated nn->nfsd_serv.
  */
 void
-nfsd_file_put_local(struct nfsd_file *nf)
+nfsd_file_put_local(struct nfsd_file *nf) __must_hold(rcu)
 {
 	struct net *net = nf->nf_net;
 
diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c
index 291e9c69cae4..f441cb9f74d5 100644
--- a/fs/nfsd/localio.c
+++ b/fs/nfsd/localio.c
@@ -53,7 +53,7 @@ void nfsd_localio_ops_init(void)
  *
  * On successful return, returned nfsd_file will have its nf_net member
  * set. Caller (NFS client) is responsible for calling nfsd_serv_put and
- * nfsd_file_put (via nfs_to->nfsd_file_put_local).
+ * nfsd_file_put (via nfs_to_nfsd_file_put_local).
  */
 struct nfsd_file *
 nfsd_open_local_fh(struct net *net, struct auth_domain *dom,
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index e236135ddc63..47172b407be8 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -214,14 +214,14 @@ int nfsd_minorversion(struct nfsd_net *nn, u32 minorversion, enum vers_op change
 	return 0;
 }
 
-bool nfsd_serv_try_get(struct net *net)
+bool nfsd_serv_try_get(struct net *net) __must_hold(rcu)
 {
 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
 
 	return (nn && percpu_ref_tryget_live(&nn->nfsd_serv_ref));
 }
 
-void nfsd_serv_put(struct net *net)
+void nfsd_serv_put(struct net *net) __must_hold(rcu)
 {
 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
 
diff --git a/include/linux/nfslocalio.h b/include/linux/nfslocalio.h
index b353abe00357..b0dd9b1eef4f 100644
--- a/include/linux/nfslocalio.h
+++ b/include/linux/nfslocalio.h
@@ -65,10 +65,25 @@ struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *,
 		   struct rpc_clnt *, const struct cred *,
 		   const struct nfs_fh *, const fmode_t);
 
+static inline void nfs_to_nfsd_file_put_local(struct nfsd_file *localio)
+{
+	/*
+	 * Once reference to nfsd_serv is dropped, NFSD could be
+	 * unloaded, so ensure safe return from nfsd_file_put_local()
+	 * by always taking RCU.
+	 */
+	rcu_read_lock();
+	nfs_to->nfsd_file_put_local(localio);
+	rcu_read_unlock();
+}
+
 #else   /* CONFIG_NFS_LOCALIO */
 static inline void nfsd_localio_ops_init(void)
 {
 }
+static inline void nfs_to_nfsd_file_put_local(struct nfsd_file *localio)
+{
+}
 #endif  /* CONFIG_NFS_LOCALIO */
 
 #endif  /* __LINUX_NFSLOCALIO_H */

> > 
> > Tangent, but in the related business of "what are next steps?":
> > 
> > I updated headers with various provided Reviewed-by:s and Acked-by:s,
> > fixed at least 1 commit header, fixed some sparse issues, various
> > fixes to nfs_to patch (removed EXPORT_SYMBOL_GPL, switched to using
> > pointer, updated nfs_to callers). Etc...
> > 
> > But if I fold those changes in I compromise the provided Reviewed-by
> > and Acked-by.. so I'm leaning toward posting a v16 that has
> > these incremental fixes/improvements, see the 3 topmost commits here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/log/?h=nfs-localio-for-next.v15-with-fixups
> > 
> > Or if you can review the incremental patches I can fold them in and
> > preserve the various Reviewed-by and Acked-by...
> 
> I have reviewed the incremental patches and I'm happy for all my tags to
> apply to the new versions of the patches.

Thanks!
Mike

  reply	other threads:[~2024-09-06 15:04 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-31 22:37 [PATCH v15 00/26] nfs/nfsd: add support for LOCALIO Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 01/26] nfs_common: factor out nfs_errtbl and nfs_stat_to_errno Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 02/26] nfs_common: factor out nfs4_errtbl and nfs4_stat_to_errno Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 03/26] nfs: factor out {encode,decode}_opaque_fixed to nfs_xdr.h Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 04/26] NFSD: Handle @rqstp == NULL in check_nfsd_access() Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 05/26] NFSD: Refactor nfsd_setuser_and_check_port() Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 06/26] NFSD: Avoid using rqstp->rq_vers in nfsd_set_fh_dentry() Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 07/26] NFSD: Short-circuit fh_verify tracepoints for LOCALIO Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 08/26] nfsd: factor out __fh_verify to allow NULL rqstp to be passed Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 09/26] nfsd: add nfsd_file_acquire_local() Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 10/26] nfsd: add nfsd_serv_try_get and nfsd_serv_put Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 11/26] SUNRPC: remove call_allocate() BUG_ONs Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 12/26] SUNRPC: add svcauth_map_clnt_to_svc_cred_local Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 13/26] SUNRPC: replace program list with program array Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 14/26] nfs_common: add NFS LOCALIO auxiliary protocol enablement Mike Snitzer
2024-09-01 23:25   ` NeilBrown
2024-09-03 16:33     ` Mike Snitzer
2024-09-05 19:24   ` Anna Schumaker
2024-09-05 19:38     ` Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 15/26] nfs_common: prepare for the NFS client to use nfsd_file for LOCALIO Mike Snitzer
2024-09-01 23:37   ` NeilBrown
2024-08-31 22:37 ` [PATCH v15 16/26] nfsd: add LOCALIO support Mike Snitzer
2024-09-01 23:46   ` NeilBrown
2024-09-03 14:34   ` Chuck Lever
2024-09-03 14:40     ` Jeff Layton
2024-09-03 15:00       ` Mike Snitzer
2024-09-03 15:19         ` Jeff Layton
2024-09-03 15:29           ` Mike Snitzer
2024-09-03 15:59             ` Chuck Lever III
2024-09-03 16:09               ` Mike Snitzer
2024-09-03 17:07                 ` Chuck Lever III
2024-09-03 22:31               ` NeilBrown
2024-09-04  5:01                 ` NeilBrown
2024-09-04 13:47                   ` Chuck Lever
2024-09-05 14:21                     ` Mike Snitzer
2024-09-05 15:41                       ` Chuck Lever III
2024-09-05 23:34                       ` NeilBrown
2024-09-06 15:04                         ` Mike Snitzer [this message]
2024-09-06 18:07                           ` Mike Snitzer
2024-09-06 21:56                             ` NeilBrown
2024-09-06 22:33                               ` Chuck Lever III
2024-09-06 23:14                                 ` NeilBrown
2024-09-07 15:17                                   ` Mike Snitzer
2024-09-07 16:09                                     ` Chuck Lever III
2024-09-07 19:08                                       ` Mike Snitzer
2024-09-07 21:12                                         ` Chuck Lever III
2024-09-08 15:05                                           ` Chuck Lever III
2024-09-07 15:52                               ` Mike Snitzer
2024-09-04 13:54                   ` Jeff Layton
2024-09-04 13:56                     ` Chuck Lever III
2024-08-31 22:37 ` [PATCH v15 17/26] nfsd: implement server support for NFS_LOCALIO_PROGRAM Mike Snitzer
2024-09-03 14:11   ` Chuck Lever
2024-08-31 22:37 ` [PATCH v15 18/26] nfs: pass struct nfsd_file to nfs_init_pgio and nfs_init_commit Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 19/26] nfs: add LOCALIO support Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 20/26] nfs: enable localio for non-pNFS IO Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 21/26] pnfs/flexfiles: enable localio support Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 22/26] nfs/localio: use dedicated workqueues for filesystem read and write Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 23/26] nfs: implement client support for NFS_LOCALIO_PROGRAM Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 24/26] nfs: add Documentation/filesystems/nfs/localio.rst Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 25/26] nfs: add FAQ section to Documentation/filesystems/nfs/localio.rst Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 26/26] nfs: add "NFS Client and Server Interlock" section to localio.rst Mike Snitzer
2024-09-01 23:52 ` [PATCH v15 00/26] nfs/nfsd: add support for LOCALIO NeilBrown
2024-09-03 14:49 ` Jeff Layton
2024-09-06 19:31 ` Anna Schumaker
2024-09-06 20:34   ` Mike Snitzer
2024-09-06 21:09     ` Chuck Lever III
2024-09-10 16:45     ` Mike Snitzer
2024-09-10 19:14       ` Mike Snitzer
2024-09-10 19:24         ` Anna Schumaker
2024-09-10 20:31         ` Anna Schumaker
2024-09-10 22:11           ` Mike Snitzer
2024-09-11 17:51             ` Mike Snitzer
2024-09-11 18:48               ` Mike Snitzer
2024-09-13 18:12                 ` Mike Snitzer
2024-09-11  0:43   ` NeilBrown
2024-09-11 16:03     ` Chuck Lever III
2024-09-12 23:31       ` NeilBrown
2024-09-12 23:42         ` Chuck Lever III
2024-09-13 12:27           ` Mike Snitzer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZtsZ8IEoV-DyqAzj@kernel.org \
    --to=snitzer@kernel.org \
    --cc=anna@kernel.org \
    --cc=chuck.lever@oracle.com \
    --cc=jlayton@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=trondmy@hammerspace.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).