From: Chuck Lever <chuck.lever@oracle.com>
To: Mike Snitzer <snitzer@kernel.org>
Cc: NeilBrown <neil@brown.name>,
"Trond Myklebust" <trondmy@kernel.org>,
"Anna Schumaker" <anna@kernel.org>,
"Pali Rohár" <pali@kernel.org>,
"Vincent Mailhol" <mailhol.vincent@wanadoo.fr>,
"Jeff Layton" <jlayton@kernel.org>,
linux-nfs@vger.kernel.org, paulmck@kernel.org
Subject: Re: [PATCH 0/6 v2] nfs_localio: fixes for races and errors from older compilers
Date: Fri, 16 May 2025 11:33:20 -0400 [thread overview]
Message-ID: <d5d6927d-aaa2-4160-989f-53e8b84bdac8@oracle.com> (raw)
In-Reply-To: <aB-vxtKNKxpPnkz2@kernel.org>
On 5/10/25 3:57 PM, Mike Snitzer wrote:
> On Sat, May 10, 2025 at 12:02:27PM -0400, Chuck Lever wrote:
>> On 5/9/25 11:01 PM, NeilBrown wrote:
>>> On Sat, 10 May 2025, Chuck Lever wrote:
>>>> [ adding Paul McK ]
>>>>
>>>> On 5/8/25 8:46 PM, NeilBrown wrote:
>>>>> This is a revised version a the earlier series. I've actually tested
>>>>> this time and fixed a few issues including the one that Mike found.
>>>>
>>>> As Mike mentioned in a previous thread, at this point, any fix for this
>>>> issue will need to be applied to recent stable kernels as well. This
>>>> series looks a bit too complicated for that.
>>>>
>>>> I expect that other subsystems will encounter this issue eventually,
>>>> so it would be beneficial to address the root cause. For that purpose, I
>>>> think I like Vincent's proposal the best:
>>>>
>>>> https://lore.kernel.org/linux-nfs/8c67a295-8caa-4e53-a764-f691657bbe62@wanadoo.fr/raw
>>>>
>>>> None of this is to say that Neil's patches shouldn't be applied. But
>>>> perhaps these are not a broad solution to the RCU compilation issue.
>>>
>>> Do we need a "broad solution to the RCU compilation issue"?
>>
>> Fair question. If the current localio code is simply incorrect as it
>> stands, then I suppose the answer is no. Because gcc is happy to compile
>> it in most cases, I thought the problem was with older versions of gcc,
>> not with localio (even though, I agree, the use of an incomplete
>> structure definition is somewhat brittle when used with RCU).
>>
>>
>>> Does it ever make sense to "dereference" a pointer to a structure that is
>>> not fully specified? What does that even mean?
>>>
>>> I find it harder to argue against use of rcu_access_pointer() in that
>>> context, at least for test-against-NULL, but sparse doesn't complain
>>> about a bare test of an __rcu pointer against NULL, so maybe there is no
>>> need for rcu_access_pointer() for simple tests - in which case the
>>> documentation should be updated.
>>
>> For backporting purposes, inventing our own local RCU helper to handle
>> the situation might be best. Then going forward, apply your patches to
>> rectify the use of the incomplete structure definition, and the local
>> helper can eventually be removed.
>>
>> My interest is getting to a narrow set of changes that can be applied
>> now and backported as needed. The broader clean-ups can then be applied
>> to future kernels (or as subsequent patches in the same merge window).
>>
>> My 2 cents, worth every penny.
>
> I really would prefer we just use this patch as the stop-gap for 6.14
> and 6.15 (which I have been carrying for nearly a year now because I
> need to support an EL8 platform that uses gcc 8.5):
>
> https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=kernel-6.12.24/nfs-testing&id=f9add5e4c9b4629b102876dce9484e1da3e35d1f
>
> Then we work through getting Neil's patchset to land for 6.16 and
> revert the stop-gap (dummy nfsd_file) patch.
>
>>> (of course rcu_dereference() doesn't actually dereference the pointer,
>>> despite its name. It just declared that there is an imminent intention
>>> to dereference the pointer.....)
>>>
>>> NeilBrown
>
> Rather than do a way more crazy stop-gap like this (which actually works):
The below looks about like I expected it to. Thanks for coding it up and
trying it!
Agreed that it is more verbose than the original patch, but this seems
more self-documenting to me and looks narrow enough to backport to LTS
kernels or even move these helpers to common headers if they are found
to be more broadly useful. I have only a mild preference for this
solution over the initial one.
One more comment below.
> fs/nfs/localio.c | 6 +++---
> fs/nfs_common/nfslocalio.c | 8 +++----
> include/linux/nfslocalio.h | 52 ++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 59 insertions(+), 7 deletions(-)
>
> diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c
> index 73dd07495440..fedc07254c00 100644
> --- a/fs/nfs/localio.c
> +++ b/fs/nfs/localio.c
> @@ -272,7 +272,7 @@ nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred,
>
> new = NULL;
> rcu_read_lock();
> - nf = rcu_dereference(*pnf);
> + nf = rcu_dereference_opaque(*pnf);
> if (!nf) {
> rcu_read_unlock();
> new = __nfs_local_open_fh(clp, cred, fh, nfl, mode);
> @@ -281,11 +281,11 @@ nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred,
> rcu_read_lock();
> /* try to swap in the pointer */
> spin_lock(&clp->cl_uuid.lock);
> - nf = rcu_dereference_protected(*pnf, 1);
> + nf = rcu_dereference_opaque_protected(*pnf, 1);
> if (!nf) {
> nf = new;
> new = NULL;
> - rcu_assign_pointer(*pnf, nf);
> + rcu_assign_opaque_pointer(*pnf, nf);
> }
> spin_unlock(&clp->cl_uuid.lock);
> }
> diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c
> index 6a0bdea6d644..213862ceb8bb 100644
> --- a/fs/nfs_common/nfslocalio.c
> +++ b/fs/nfs_common/nfslocalio.c
> @@ -285,14 +285,14 @@ void nfs_close_local_fh(struct nfs_file_localio *nfl)
> return;
> }
>
> - ro_nf = rcu_access_pointer(nfl->ro_file);
> - rw_nf = rcu_access_pointer(nfl->rw_file);
> + ro_nf = rcu_access_opaque(nfl->ro_file);
> + rw_nf = rcu_access_opaque(nfl->rw_file);
> if (ro_nf || rw_nf) {
> spin_lock(&nfs_uuid->lock);
> if (ro_nf)
> - ro_nf = rcu_dereference_protected(xchg(&nfl->ro_file, NULL), 1);
> + ro_nf = rcu_dereference_opaque_protected(xchg(&nfl->ro_file, NULL), 1);
> if (rw_nf)
> - rw_nf = rcu_dereference_protected(xchg(&nfl->rw_file, NULL), 1);
> + rw_nf = rcu_dereference_opaque_protected(xchg(&nfl->rw_file, NULL), 1);
I might combine rcu_dereference_opaque_protected and xchg into another
helper, since this rather verbose idiom appears twice.
> /* Remove nfl from nfs_uuid->files list */
> RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
> diff --git a/include/linux/nfslocalio.h b/include/linux/nfslocalio.h
> index 9aa8a43843d7..c6e86891d4b5 100644
> --- a/include/linux/nfslocalio.h
> +++ b/include/linux/nfslocalio.h
> @@ -15,6 +15,58 @@
> #include <linux/sunrpc/svcauth.h>
> #include <linux/nfs.h>
> #include <net/net_namespace.h>
> +#include <linux/rcupdate.h>
> +
> +/*
> + * RCU methods to allow fs/nfs_common and fs/nfs LOCALIO code to avoid
> + * dereferencing pointer to 'struct nfs_file' which is opaque outside fs/nfsd
> +*/
> +#define __rcu_access_opaque_pointer(p, local, space) \
> +({ \
> + typeof(p) local = (__force typeof(p))READ_ONCE(p); \
> + rcu_check_sparse(p, space); \
> + (__force __kernel typeof(p))(local); \
> +})
> +
> +#define rcu_access_opaque(p) __rcu_access_opaque_pointer((p), __UNIQUE_ID(rcu), __rcu)
> +
> +#define __rcu_dereference_opaque_protected(p, local, c, space) \
> +({ \
> + RCU_LOCKDEP_WARN(!(c), "suspicious rcu_dereference_opaque_protected() usage"); \
> + rcu_check_sparse(p, space); \
> + (__force __kernel typeof(p))(p); \
> +})
> +
> +#define rcu_dereference_opaque_protected(p, c) \
> + __rcu_dereference_opaque_protected((p), __UNIQUE_ID(rcu), (c), __rcu)
> +
> +#define __rcu_dereference_opaque_check(p, local, c, space) \
> +({ \
> + /* Dependency order vs. p above. */ \
> + typeof(p) local = (__force typeof(p))READ_ONCE(p); \
> + RCU_LOCKDEP_WARN(!(c), "suspicious rcu_dereference_opaque_check() usage"); \
> + rcu_check_sparse(p, space); \
> + (__force __kernel typeof(p))(local); \
> +})
> +
> +#define rcu_dereference_opaque_check(p, c) \
> + __rcu_dereference_opaque_check((p), __UNIQUE_ID(rcu), \
> + (c) || rcu_read_lock_held(), __rcu)
> +
> +#define rcu_dereference_opaque(p) rcu_dereference_opaque_check(p, 0)
> +
> +#define RCU_INITIALIZER_OPAQUE(v) (typeof((v)) __force __rcu)(v)
> +
> +#define rcu_assign_opaque_pointer(p, v) \
> +do { \
> + uintptr_t _r_a_p__v = (uintptr_t)(v); \
> + rcu_check_sparse(p, __rcu); \
> + \
> + if (__builtin_constant_p(v) && (_r_a_p__v) == (uintptr_t)NULL) \
> + WRITE_ONCE((p), (typeof(p))(_r_a_p__v)); \
> + else \
> + smp_store_release(&p, RCU_INITIALIZER_OPAQUE((typeof(p))_r_a_p__v)); \
> +} while (0)
>
> struct nfs_client;
> struct nfs_file_localio;
--
Chuck Lever
next prev parent reply other threads:[~2025-05-16 15:33 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-09 0:46 [PATCH 0/6 v2] nfs_localio: fixes for races and errors from older compilers NeilBrown
2025-05-09 0:46 ` [PATCH 1/6] nfs_localio: use cmpxchg() to install new nfs_file_localio NeilBrown
2025-05-09 0:46 ` [PATCH 2/6] nfs_localio: always hold nfsd net ref with nfsd_file ref NeilBrown
2025-05-09 0:46 ` [PATCH 3/6] nfs_localio: simplify interface to nfsd for getting nfsd_file NeilBrown
2025-05-09 0:46 ` [PATCH 4/6] nfs_localio: duplicate nfs_close_local_fh() NeilBrown
2025-05-09 0:46 ` [PATCH 5/6] nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh() NeilBrown
2025-05-09 0:46 ` [PATCH 6/6] nfs_localio: change nfsd_file_put_local() to take a pointer to __rcu pointer NeilBrown
2025-05-09 11:03 ` kernel test robot
2025-07-08 14:20 ` [RFC PATCH for 6.16-rcX] Revert "nfs_localio: change nfsd_file_put_local() to take a pointer to __rcu pointer" Mike Snitzer
2025-07-14 3:13 ` [for-6.16-final PATCH 0/9] NFSD/NFS/LOCALIO: stable fixes and revert 6.16 LOCALIO changes Mike Snitzer
2025-07-14 3:13 ` [for-6.16-final PATCH 1/9] Revert "NFSD: Clean up kdoc for nfsd_open_local_fh()" Mike Snitzer
2025-07-14 3:13 ` [for-6.16-final PATCH 2/9] Revert "nfs_localio: change nfsd_file_put_local() to take a pointer to __rcu pointer" Mike Snitzer
2025-07-14 3:13 ` [for-6.16-final PATCH 3/9] Revert "nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()" Mike Snitzer
2025-07-14 3:13 ` [for-6.16-final PATCH 4/9] Revert "nfs_localio: duplicate nfs_close_local_fh()" Mike Snitzer
2025-07-14 3:13 ` [for-6.16-final PATCH 5/9] Revert "nfs_localio: simplify interface to nfsd for getting nfsd_file" Mike Snitzer
2025-07-14 3:13 ` [for-6.16-final PATCH 6/9] Revert "nfs_localio: always hold nfsd net ref with nfsd_file ref" Mike Snitzer
2025-07-14 3:13 ` [for-6.16-final PATCH 7/9] Revert "nfs_localio: use cmpxchg() to install new nfs_file_localio" Mike Snitzer
2025-07-14 3:13 ` [for-6.16-final PATCH 8/9] nfs/localio: avoid bouncing LOCALIO if nfs_client_is_local() Mike Snitzer
2025-07-14 4:19 ` NeilBrown
2025-07-14 14:37 ` Mike Snitzer
2025-07-14 12:23 ` Jeff Layton
2025-07-14 3:13 ` [for-6.16-final PATCH 9/9] nfs/localio: add localio_async_probe modparm Mike Snitzer
2025-07-14 4:23 ` NeilBrown
2025-07-14 12:28 ` Jeff Layton
2025-07-14 14:08 ` Mike Snitzer
2025-07-14 3:50 ` [RFC PATCH for 6.16-rcX] Revert "nfs_localio: change nfsd_file_put_local() to take a pointer to __rcu pointer" NeilBrown
2025-07-14 14:45 ` Mike Snitzer
2025-07-15 22:52 ` [PATCH 0/3] Fix localio hangs Trond Myklebust
2025-07-15 22:52 ` [PATCH 1/3] NFS/localio: nfs_close_local_fh() fix check for file closed Trond Myklebust
2025-07-15 22:52 ` [PATCH 2/3] NFS/localio: nfs_uuid_put() fix the wait for file unlink events Trond Myklebust
2025-07-15 22:52 ` [PATCH 3/3] NFS/localio: nfs_uuid_put() fix the wake up after unlinking the file Trond Myklebust
2025-07-16 1:09 ` [PATCH 1/3] NFS/localio: nfs_close_local_fh() fix check for file closed NeilBrown
2025-07-16 1:22 ` [PATCH 2/3] NFS/localio: nfs_uuid_put() fix the wait for file unlink events NeilBrown
2025-07-16 2:29 ` Trond Myklebust
2025-07-16 3:51 ` NeilBrown
2025-07-16 1:31 ` [PATCH 3/3] NFS/localio: nfs_uuid_put() fix the wake up after unlinking the file NeilBrown
2025-07-16 4:17 ` Trond Myklebust
2025-07-16 5:07 ` NeilBrown
2025-07-16 15:19 ` Trond Myklebust
2025-07-16 15:59 ` [PATCH v2 0/3] Fix localio hangs Trond Myklebust
2025-07-16 15:59 ` [PATCH v2 1/3] NFS/localio: nfs_close_local_fh() fix check for file closed Trond Myklebust
2025-07-16 15:59 ` [PATCH v2 2/3] NFS/localio: nfs_uuid_put() fix races with nfs_open/close_local_fh() Trond Myklebust
2025-07-16 15:59 ` [PATCH v2 3/3] NFS/localio: nfs_uuid_put() fix the wake up after unlinking the file Trond Myklebust
2025-07-16 22:09 ` [PATCH v2 0/3] Fix localio hangs NeilBrown
2025-07-16 23:27 ` Mike Snitzer
2025-07-18 0:18 ` NeilBrown
2025-05-09 16:01 ` [PATCH 0/6 v2] nfs_localio: fixes for races and errors from older compilers Chuck Lever
2025-05-09 21:02 ` Mike Snitzer
2025-05-10 0:16 ` Paul E. McKenney
2025-05-10 2:44 ` NeilBrown
2025-05-10 3:01 ` NeilBrown
2025-05-10 16:02 ` Chuck Lever
2025-05-10 19:57 ` Mike Snitzer
2025-05-16 15:33 ` Chuck Lever [this message]
2025-05-18 10:46 ` Pali Rohár
2025-05-19 3:49 ` NeilBrown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d5d6927d-aaa2-4160-989f-53e8b84bdac8@oracle.com \
--to=chuck.lever@oracle.com \
--cc=anna@kernel.org \
--cc=jlayton@kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=mailhol.vincent@wanadoo.fr \
--cc=neil@brown.name \
--cc=pali@kernel.org \
--cc=paulmck@kernel.org \
--cc=snitzer@kernel.org \
--cc=trondmy@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox