All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@kernel.org>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: NeilBrown <neil@brown.name>,
	"Trond Myklebust" <trondmy@kernel.org>,
	"Anna Schumaker" <anna@kernel.org>,
	"Pali Rohár" <pali@kernel.org>,
	"Vincent Mailhol" <mailhol.vincent@wanadoo.fr>,
	"Jeff Layton" <jlayton@kernel.org>,
	linux-nfs@vger.kernel.org, paulmck@kernel.org
Subject: Re: [PATCH 0/6 v2] nfs_localio: fixes for races and errors from older compilers
Date: Sat, 10 May 2025 15:57:58 -0400	[thread overview]
Message-ID: <aB-vxtKNKxpPnkz2@kernel.org> (raw)
In-Reply-To: <8c8ded94-ea5e-4b12-af2b-72004a988ad5@oracle.com>

On Sat, May 10, 2025 at 12:02:27PM -0400, Chuck Lever wrote:
> On 5/9/25 11:01 PM, NeilBrown wrote:
> > On Sat, 10 May 2025, Chuck Lever wrote:
> >> [ adding Paul McK ]
> >>
> >> On 5/8/25 8:46 PM, NeilBrown wrote:
> >>> This is a revised version a the earlier series.  I've actually tested
> >>> this time and fixed a few issues including the one that Mike found.
> >>
> >> As Mike mentioned in a previous thread, at this point, any fix for this
> >> issue will need to be applied to recent stable kernels as well. This
> >> series looks a bit too complicated for that.
> >>
> >> I expect that other subsystems will encounter this issue eventually,
> >> so it would be beneficial to address the root cause. For that purpose, I
> >> think I like Vincent's proposal the best:
> >>
> >> https://lore.kernel.org/linux-nfs/8c67a295-8caa-4e53-a764-f691657bbe62@wanadoo.fr/raw
> >>
> >> None of this is to say that Neil's patches shouldn't be applied. But
> >> perhaps these are not a broad solution to the RCU compilation issue.
> > 
> > Do we need a "broad solution to the RCU compilation issue"?
> 
> Fair question. If the current localio code is simply incorrect as it
> stands, then I suppose the answer is no. Because gcc is happy to compile
> it in most cases, I thought the problem was with older versions of gcc,
> not with localio (even though, I agree, the use of an incomplete
> structure definition is somewhat brittle when used with RCU).
> 
> 
> > Does it ever make sense to "dereference" a pointer to a structure that is
> > not fully specified?  What does that even mean?
> > 
> > I find it harder to argue against use of rcu_access_pointer() in that
> > context, at least for test-against-NULL, but sparse doesn't complain
> > about a bare test of an __rcu pointer against NULL, so maybe there is no
> > need for rcu_access_pointer() for simple tests - in which case the
> > documentation should be updated.
> 
> For backporting purposes, inventing our own local RCU helper to handle
> the situation might be best. Then going forward, apply your patches to
> rectify the use of the incomplete structure definition, and the local
> helper can eventually be removed.
> 
> My interest is getting to a narrow set of changes that can be applied
> now and backported as needed. The broader clean-ups can then be applied
> to future kernels (or as subsequent patches in the same merge window).
> 
> My 2 cents, worth every penny.

I really would prefer we just use this patch as the stop-gap for 6.14
and 6.15 (which I have been carrying for nearly a year now because I
need to support an EL8 platform that uses gcc 8.5):

https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=kernel-6.12.24/nfs-testing&id=f9add5e4c9b4629b102876dce9484e1da3e35d1f

Then we work through getting Neil's patchset to land for 6.16 and
revert the stop-gap (dummy nfsd_file) patch.

> > (of course rcu_dereference() doesn't actually dereference the pointer,
> >  despite its name.  It just declared that there is an imminent intention
> >  to dereference the pointer.....)
> > 
> > NeilBrown

Rather than do a way more crazy stop-gap like this (which actually works):

 fs/nfs/localio.c           |  6 +++---
 fs/nfs_common/nfslocalio.c |  8 +++----
 include/linux/nfslocalio.h | 52 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 59 insertions(+), 7 deletions(-)

diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c
index 73dd07495440..fedc07254c00 100644
--- a/fs/nfs/localio.c
+++ b/fs/nfs/localio.c
@@ -272,7 +272,7 @@ nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred,
 
 	new = NULL;
 	rcu_read_lock();
-	nf = rcu_dereference(*pnf);
+	nf = rcu_dereference_opaque(*pnf);
 	if (!nf) {
 		rcu_read_unlock();
 		new = __nfs_local_open_fh(clp, cred, fh, nfl, mode);
@@ -281,11 +281,11 @@ nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred,
 		rcu_read_lock();
 		/* try to swap in the pointer */
 		spin_lock(&clp->cl_uuid.lock);
-		nf = rcu_dereference_protected(*pnf, 1);
+		nf = rcu_dereference_opaque_protected(*pnf, 1);
 		if (!nf) {
 			nf = new;
 			new = NULL;
-			rcu_assign_pointer(*pnf, nf);
+			rcu_assign_opaque_pointer(*pnf, nf);
 		}
 		spin_unlock(&clp->cl_uuid.lock);
 	}
diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c
index 6a0bdea6d644..213862ceb8bb 100644
--- a/fs/nfs_common/nfslocalio.c
+++ b/fs/nfs_common/nfslocalio.c
@@ -285,14 +285,14 @@ void nfs_close_local_fh(struct nfs_file_localio *nfl)
 		return;
 	}
 
-	ro_nf = rcu_access_pointer(nfl->ro_file);
-	rw_nf = rcu_access_pointer(nfl->rw_file);
+	ro_nf = rcu_access_opaque(nfl->ro_file);
+	rw_nf = rcu_access_opaque(nfl->rw_file);
 	if (ro_nf || rw_nf) {
 		spin_lock(&nfs_uuid->lock);
 		if (ro_nf)
-			ro_nf = rcu_dereference_protected(xchg(&nfl->ro_file, NULL), 1);
+			ro_nf = rcu_dereference_opaque_protected(xchg(&nfl->ro_file, NULL), 1);
 		if (rw_nf)
-			rw_nf = rcu_dereference_protected(xchg(&nfl->rw_file, NULL), 1);
+			rw_nf = rcu_dereference_opaque_protected(xchg(&nfl->rw_file, NULL), 1);
 
 		/* Remove nfl from nfs_uuid->files list */
 		RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
diff --git a/include/linux/nfslocalio.h b/include/linux/nfslocalio.h
index 9aa8a43843d7..c6e86891d4b5 100644
--- a/include/linux/nfslocalio.h
+++ b/include/linux/nfslocalio.h
@@ -15,6 +15,58 @@
 #include <linux/sunrpc/svcauth.h>
 #include <linux/nfs.h>
 #include <net/net_namespace.h>
+#include <linux/rcupdate.h>
+
+/*
+ * RCU methods to allow fs/nfs_common and fs/nfs LOCALIO code to avoid
+ * dereferencing pointer to 'struct nfs_file' which is opaque outside fs/nfsd
+*/
+#define __rcu_access_opaque_pointer(p, local, space) \
+({ \
+	typeof(p) local = (__force typeof(p))READ_ONCE(p); \
+	rcu_check_sparse(p, space); \
+	(__force __kernel typeof(p))(local); \
+})
+
+#define rcu_access_opaque(p) __rcu_access_opaque_pointer((p), __UNIQUE_ID(rcu), __rcu)
+
+#define __rcu_dereference_opaque_protected(p, local, c, space) \
+({ \
+	RCU_LOCKDEP_WARN(!(c), "suspicious rcu_dereference_opaque_protected() usage"); \
+	rcu_check_sparse(p, space); \
+	(__force __kernel typeof(p))(p); \
+})
+
+#define rcu_dereference_opaque_protected(p, c) \
+	__rcu_dereference_opaque_protected((p), __UNIQUE_ID(rcu), (c), __rcu)
+
+#define __rcu_dereference_opaque_check(p, local, c, space) \
+({ \
+	/* Dependency order vs. p above. */ \
+	typeof(p) local = (__force typeof(p))READ_ONCE(p); \
+	RCU_LOCKDEP_WARN(!(c), "suspicious rcu_dereference_opaque_check() usage"); \
+	rcu_check_sparse(p, space); \
+	(__force __kernel typeof(p))(local); \
+})
+
+#define rcu_dereference_opaque_check(p, c) \
+	__rcu_dereference_opaque_check((p), __UNIQUE_ID(rcu), \
+				       (c) || rcu_read_lock_held(), __rcu)
+
+#define rcu_dereference_opaque(p) rcu_dereference_opaque_check(p, 0)
+
+#define RCU_INITIALIZER_OPAQUE(v) (typeof((v)) __force __rcu)(v)
+
+#define rcu_assign_opaque_pointer(p, v)					      \
+do {									      \
+	uintptr_t _r_a_p__v = (uintptr_t)(v);				      \
+	rcu_check_sparse(p, __rcu);					      \
+									      \
+	if (__builtin_constant_p(v) && (_r_a_p__v) == (uintptr_t)NULL)	      \
+		WRITE_ONCE((p), (typeof(p))(_r_a_p__v));		      \
+	else								      \
+		smp_store_release(&p, RCU_INITIALIZER_OPAQUE((typeof(p))_r_a_p__v)); \
+} while (0)
 
 struct nfs_client;
 struct nfs_file_localio;

  reply	other threads:[~2025-05-10 19:57 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-09  0:46 [PATCH 0/6 v2] nfs_localio: fixes for races and errors from older compilers NeilBrown
2025-05-09  0:46 ` [PATCH 1/6] nfs_localio: use cmpxchg() to install new nfs_file_localio NeilBrown
2025-05-09  0:46 ` [PATCH 2/6] nfs_localio: always hold nfsd net ref with nfsd_file ref NeilBrown
2025-05-09  0:46 ` [PATCH 3/6] nfs_localio: simplify interface to nfsd for getting nfsd_file NeilBrown
2025-05-09  0:46 ` [PATCH 4/6] nfs_localio: duplicate nfs_close_local_fh() NeilBrown
2025-05-09  0:46 ` [PATCH 5/6] nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh() NeilBrown
2025-05-09  0:46 ` [PATCH 6/6] nfs_localio: change nfsd_file_put_local() to take a pointer to __rcu pointer NeilBrown
2025-05-09 11:03   ` kernel test robot
2025-07-08 14:20   ` [RFC PATCH for 6.16-rcX] Revert "nfs_localio: change nfsd_file_put_local() to take a pointer to __rcu pointer" Mike Snitzer
2025-07-14  3:13     ` [for-6.16-final PATCH 0/9] NFSD/NFS/LOCALIO: stable fixes and revert 6.16 LOCALIO changes Mike Snitzer
2025-07-14  3:13       ` [for-6.16-final PATCH 1/9] Revert "NFSD: Clean up kdoc for nfsd_open_local_fh()" Mike Snitzer
2025-07-14  3:13       ` [for-6.16-final PATCH 2/9] Revert "nfs_localio: change nfsd_file_put_local() to take a pointer to __rcu pointer" Mike Snitzer
2025-07-14  3:13       ` [for-6.16-final PATCH 3/9] Revert "nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()" Mike Snitzer
2025-07-14  3:13       ` [for-6.16-final PATCH 4/9] Revert "nfs_localio: duplicate nfs_close_local_fh()" Mike Snitzer
2025-07-14  3:13       ` [for-6.16-final PATCH 5/9] Revert "nfs_localio: simplify interface to nfsd for getting nfsd_file" Mike Snitzer
2025-07-14  3:13       ` [for-6.16-final PATCH 6/9] Revert "nfs_localio: always hold nfsd net ref with nfsd_file ref" Mike Snitzer
2025-07-14  3:13       ` [for-6.16-final PATCH 7/9] Revert "nfs_localio: use cmpxchg() to install new nfs_file_localio" Mike Snitzer
2025-07-14  3:13       ` [for-6.16-final PATCH 8/9] nfs/localio: avoid bouncing LOCALIO if nfs_client_is_local() Mike Snitzer
2025-07-14  4:19         ` NeilBrown
2025-07-14 14:37           ` Mike Snitzer
2025-07-14 12:23         ` Jeff Layton
2025-07-14  3:13       ` [for-6.16-final PATCH 9/9] nfs/localio: add localio_async_probe modparm Mike Snitzer
2025-07-14  4:23         ` NeilBrown
2025-07-14 12:28           ` Jeff Layton
2025-07-14 14:08             ` Mike Snitzer
2025-07-14  3:50     ` [RFC PATCH for 6.16-rcX] Revert "nfs_localio: change nfsd_file_put_local() to take a pointer to __rcu pointer" NeilBrown
2025-07-14 14:45       ` Mike Snitzer
2025-07-15 22:52     ` [PATCH 0/3] Fix localio hangs Trond Myklebust
2025-07-15 22:52       ` [PATCH 1/3] NFS/localio: nfs_close_local_fh() fix check for file closed Trond Myklebust
2025-07-15 22:52       ` [PATCH 2/3] NFS/localio: nfs_uuid_put() fix the wait for file unlink events Trond Myklebust
2025-07-15 22:52       ` [PATCH 3/3] NFS/localio: nfs_uuid_put() fix the wake up after unlinking the file Trond Myklebust
2025-07-16  1:09       ` [PATCH 1/3] NFS/localio: nfs_close_local_fh() fix check for file closed NeilBrown
2025-07-16  1:22       ` [PATCH 2/3] NFS/localio: nfs_uuid_put() fix the wait for file unlink events NeilBrown
2025-07-16  2:29         ` Trond Myklebust
2025-07-16  3:51           ` NeilBrown
2025-07-16  1:31       ` [PATCH 3/3] NFS/localio: nfs_uuid_put() fix the wake up after unlinking the file NeilBrown
2025-07-16  4:17         ` Trond Myklebust
2025-07-16  5:07           ` NeilBrown
2025-07-16 15:19             ` Trond Myklebust
2025-07-16 15:59       ` [PATCH v2 0/3] Fix localio hangs Trond Myklebust
2025-07-16 15:59         ` [PATCH v2 1/3] NFS/localio: nfs_close_local_fh() fix check for file closed Trond Myklebust
2025-07-16 15:59         ` [PATCH v2 2/3] NFS/localio: nfs_uuid_put() fix races with nfs_open/close_local_fh() Trond Myklebust
2025-07-16 15:59         ` [PATCH v2 3/3] NFS/localio: nfs_uuid_put() fix the wake up after unlinking the file Trond Myklebust
2025-07-16 22:09         ` [PATCH v2 0/3] Fix localio hangs NeilBrown
2025-07-16 23:27           ` Mike Snitzer
2025-07-18  0:18             ` NeilBrown
2025-05-09 16:01 ` [PATCH 0/6 v2] nfs_localio: fixes for races and errors from older compilers Chuck Lever
2025-05-09 21:02   ` Mike Snitzer
2025-05-10  0:16     ` Paul E. McKenney
2025-05-10  2:44       ` NeilBrown
2025-05-10  3:01   ` NeilBrown
2025-05-10 16:02     ` Chuck Lever
2025-05-10 19:57       ` Mike Snitzer [this message]
2025-05-16 15:33         ` Chuck Lever
2025-05-18 10:46           ` Pali Rohár
2025-05-19  3:49         ` NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aB-vxtKNKxpPnkz2@kernel.org \
    --to=snitzer@kernel.org \
    --cc=anna@kernel.org \
    --cc=chuck.lever@oracle.com \
    --cc=jlayton@kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=mailhol.vincent@wanadoo.fr \
    --cc=neil@brown.name \
    --cc=pali@kernel.org \
    --cc=paulmck@kernel.org \
    --cc=trondmy@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.