* [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly
@ 2024-09-15 23:45 NeilBrown
2024-09-16 0:24 ` Jeff Layton
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: NeilBrown @ 2024-09-15 23:45 UTC (permalink / raw)
To: Chuck Lever III, Jeff Layton; +Cc: Linux NFS Mailing List, Olga Kornievskaia
The memory barriers in this patch were all wrong.
smp_store_release / smp_load_acquire ensures that all changes written
before the store_release are equally visible after the acquire.
These are not needed here as the *only* value that
svc_thread_init_status() or its caller sets that is of any interest to
svc_start_kthread() is ->rq_err. The barrier wold effect writes before
->rq_err is written and reads after ->rq_err is read.
However we DO need a full memory barrier between setting ->rq_err and
before the the waitqueue_active() read in wake_up_var(). This is a
barrier between a write and a read, hence a full barrier: smb_mb().
This addresses a race if wait_var_event() adds itself to the wait queue
and tests ->rq_err such that the read in waitqueue_active() happens
earlier and doesn't see that the task has been added, and the ->rq_err
write is delayed so that the waiting task doesn't see it. In this case
the wake_up never happens.
Signed-off-by: NeilBrown <neilb@suse.de>
---
include/linux/sunrpc/svc.h | 7 +++++--
net/sunrpc/svc.c | 3 +--
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 437672bcaa22..558e5ae03103 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -326,8 +326,11 @@ static inline bool svc_thread_should_stop(struct svc_rqst *rqstp)
*/
static inline void svc_thread_init_status(struct svc_rqst *rqstp, int err)
{
- /* store_release ensures svc_start_kthreads() sees the error */
- smp_store_release(&rqstp->rq_err, err);
+ rqstp->rq_err = err;
+ /* memory barrier ensures assignment to error above is visible before
+ * waitqueue_active() test below completes.
+ */
+ smb_mb();
wake_up_var(&rqstp->rq_err);
if (err)
kthread_exit(1);
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index ff6f3e35b36d..9aff845196ce 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -818,8 +818,7 @@ svc_start_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
svc_sock_update_bufs(serv);
wake_up_process(task);
- /* load_acquire ensures we get value stored in svc_thread_init_status() */
- wait_var_event(&rqstp->rq_err, smp_load_acquire(&rqstp->rq_err) != -EAGAIN);
+ wait_var_event(&rqstp->rq_err, rqstp->rq_err != -EAGAIN);
err = rqstp->rq_err;
if (err) {
svc_exit_thread(rqstp);
--
2.46.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly
2024-09-15 23:45 [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly NeilBrown
@ 2024-09-16 0:24 ` Jeff Layton
2024-09-16 2:27 ` kernel test robot
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Jeff Layton @ 2024-09-16 0:24 UTC (permalink / raw)
To: NeilBrown, Chuck Lever III
Cc: Linux NFS Mailing List,
Olga Kornievskaia <okorniev@redhat.com> Olga Kornievskaia,
Dai Ngo, Tom Talpey
On Mon, 2024-09-16 at 09:45 +1000, NeilBrown wrote:
> The memory barriers in this patch were all wrong.
> smp_store_release / smp_load_acquire ensures that all changes written
> before the store_release are equally visible after the acquire.
> These are not needed here as the *only* value that
> svc_thread_init_status() or its caller sets that is of any interest to
> svc_start_kthread() is ->rq_err. The barrier wold effect writes before
> ->rq_err is written and reads after ->rq_err is read.
>
> However we DO need a full memory barrier between setting ->rq_err and
> before the the waitqueue_active() read in wake_up_var(). This is a
> barrier between a write and a read, hence a full barrier: smb_mb().
>
> This addresses a race if wait_var_event() adds itself to the wait queue
> and tests ->rq_err such that the read in waitqueue_active() happens
> earlier and doesn't see that the task has been added, and the ->rq_err
> write is delayed so that the waiting task doesn't see it. In this case
> the wake_up never happens.
>
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
> include/linux/sunrpc/svc.h | 7 +++++--
> net/sunrpc/svc.c | 3 +--
> 2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> index 437672bcaa22..558e5ae03103 100644
> --- a/include/linux/sunrpc/svc.h
> +++ b/include/linux/sunrpc/svc.h
> @@ -326,8 +326,11 @@ static inline bool svc_thread_should_stop(struct svc_rqst *rqstp)
> */
> static inline void svc_thread_init_status(struct svc_rqst *rqstp, int err)
> {
> - /* store_release ensures svc_start_kthreads() sees the error */
> - smp_store_release(&rqstp->rq_err, err);
> + rqstp->rq_err = err;
> + /* memory barrier ensures assignment to error above is visible before
> + * waitqueue_active() test below completes.
> + */
> + smb_mb();
> wake_up_var(&rqstp->rq_err);
> if (err)
> kthread_exit(1);
> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
> index ff6f3e35b36d..9aff845196ce 100644
> --- a/net/sunrpc/svc.c
> +++ b/net/sunrpc/svc.c
> @@ -818,8 +818,7 @@ svc_start_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
> svc_sock_update_bufs(serv);
> wake_up_process(task);
>
> - /* load_acquire ensures we get value stored in svc_thread_init_status() */
> - wait_var_event(&rqstp->rq_err, smp_load_acquire(&rqstp->rq_err) != -EAGAIN);
> + wait_var_event(&rqstp->rq_err, rqstp->rq_err != -EAGAIN);
> err = rqstp->rq_err;
> if (err) {
> svc_exit_thread(rqstp);
Makes sense.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly
2024-09-15 23:45 [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly NeilBrown
2024-09-16 0:24 ` Jeff Layton
@ 2024-09-16 2:27 ` kernel test robot
2024-09-16 2:27 ` kernel test robot
2024-09-16 3:50 ` Chuck Lever
3 siblings, 0 replies; 5+ messages in thread
From: kernel test robot @ 2024-09-16 2:27 UTC (permalink / raw)
To: NeilBrown, Chuck Lever III, Jeff Layton; +Cc: oe-kbuild-all
Hi NeilBrown,
kernel test robot noticed the following build errors:
[auto build test ERROR on next-20240913]
[cannot apply to trondmy-nfs/linux-next v6.11 v6.11-rc7 v6.11-rc6 linus/master v6.11]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/NeilBrown/SQUASH-sunrpc-allow-svc-threads-to-fail-initialisation-cleanly/20240916-074739
base: next-20240913
patch link: https://lore.kernel.org/r/172644394073.17050.16376953609629336068%40noble.neil.brown.name
patch subject: [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly
config: x86_64-rhel-8.3-bpf (https://download.01.org/0day-ci/archive/20240916/202409161033.eurajfZc-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240916/202409161033.eurajfZc-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409161033.eurajfZc-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from include/linux/sunrpc/svcsock.h:13,
from include/linux/sunrpc/bc_xprt.h:16,
from net/sunrpc/clnt.c:40:
include/linux/sunrpc/svc.h: In function 'svc_thread_init_status':
>> include/linux/sunrpc/svc.h:333:9: error: implicit declaration of function 'smb_mb'; did you mean 'smp_mb'? [-Werror=implicit-function-declaration]
333 | smb_mb();
| ^~~~~~
| smp_mb
cc1: some warnings being treated as errors
vim +333 include/linux/sunrpc/svc.h
313
314 /**
315 * svc_thread_init_status - report whether thread has initialised successfully
316 * @rqstp: the thread in question
317 * @err: errno code
318 *
319 * After performing any initialisation that could fail, and before starting
320 * normal work, each sunrpc svc_thread must call svc_thread_init_status()
321 * with an appropriate error, or zero.
322 *
323 * If zero is passed, the thread is ready and must continue until
324 * svc_thread_should_stop() returns true. If a non-zero error is passed
325 * the call will not return - the thread will exit.
326 */
327 static inline void svc_thread_init_status(struct svc_rqst *rqstp, int err)
328 {
329 rqstp->rq_err = err;
330 /* memory barrier ensures assignment to error above is visible before
331 * waitqueue_active() test below completes.
332 */
> 333 smb_mb();
334 wake_up_var(&rqstp->rq_err);
335 if (err)
336 kthread_exit(1);
337 }
338
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly
2024-09-15 23:45 [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly NeilBrown
2024-09-16 0:24 ` Jeff Layton
2024-09-16 2:27 ` kernel test robot
@ 2024-09-16 2:27 ` kernel test robot
2024-09-16 3:50 ` Chuck Lever
3 siblings, 0 replies; 5+ messages in thread
From: kernel test robot @ 2024-09-16 2:27 UTC (permalink / raw)
To: NeilBrown, Chuck Lever III, Jeff Layton; +Cc: llvm, oe-kbuild-all
Hi NeilBrown,
kernel test robot noticed the following build errors:
[auto build test ERROR on next-20240913]
[cannot apply to trondmy-nfs/linux-next v6.11 v6.11-rc7 v6.11-rc6 linus/master v6.11]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/NeilBrown/SQUASH-sunrpc-allow-svc-threads-to-fail-initialisation-cleanly/20240916-074739
base: next-20240913
patch link: https://lore.kernel.org/r/172644394073.17050.16376953609629336068%40noble.neil.brown.name
patch subject: [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly
config: x86_64-rhel-8.3-rust (https://download.01.org/0day-ci/archive/20240916/202409161022.FCpNdJMX-lkp@intel.com/config)
compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240916/202409161022.FCpNdJMX-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409161022.FCpNdJMX-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from fs/nfsd/trace.c:4:
In file included from fs/nfsd/trace.h:18:
In file included from fs/nfsd/nfsfh.h:12:
>> include/linux/sunrpc/svc.h:333:2: error: call to undeclared function 'smb_mb'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
333 | smb_mb();
| ^
1 error generated.
--
In file included from fs/nfsd/export.c:19:
In file included from include/linux/sunrpc/svc_xprt.h:11:
>> include/linux/sunrpc/svc.h:333:2: error: call to undeclared function 'smb_mb'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
333 | smb_mb();
| ^
fs/nfsd/export.c:1020:17: warning: variable 'inode' set but not used [-Wunused-but-set-variable]
1020 | struct inode *inode;
| ^
1 warning and 1 error generated.
vim +/smb_mb +333 include/linux/sunrpc/svc.h
313
314 /**
315 * svc_thread_init_status - report whether thread has initialised successfully
316 * @rqstp: the thread in question
317 * @err: errno code
318 *
319 * After performing any initialisation that could fail, and before starting
320 * normal work, each sunrpc svc_thread must call svc_thread_init_status()
321 * with an appropriate error, or zero.
322 *
323 * If zero is passed, the thread is ready and must continue until
324 * svc_thread_should_stop() returns true. If a non-zero error is passed
325 * the call will not return - the thread will exit.
326 */
327 static inline void svc_thread_init_status(struct svc_rqst *rqstp, int err)
328 {
329 rqstp->rq_err = err;
330 /* memory barrier ensures assignment to error above is visible before
331 * waitqueue_active() test below completes.
332 */
> 333 smb_mb();
334 wake_up_var(&rqstp->rq_err);
335 if (err)
336 kthread_exit(1);
337 }
338
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly
2024-09-15 23:45 [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly NeilBrown
` (2 preceding siblings ...)
2024-09-16 2:27 ` kernel test robot
@ 2024-09-16 3:50 ` Chuck Lever
3 siblings, 0 replies; 5+ messages in thread
From: Chuck Lever @ 2024-09-16 3:50 UTC (permalink / raw)
To: NeilBrown
Cc: Jeff Layton, Linux NFS Mailing List,
Olga Kornievskaia <okorniev@redhat.com>Olga Kornievskaia,
Dai Ngo, Tom Talpey
On Mon, Sep 16, 2024 at 09:45:40AM +1000, NeilBrown wrote:
>
> The memory barriers in this patch were all wrong.
> smp_store_release / smp_load_acquire ensures that all changes written
> before the store_release are equally visible after the acquire.
> These are not needed here as the *only* value that
> svc_thread_init_status() or its caller sets that is of any interest to
> svc_start_kthread() is ->rq_err. The barrier wold effect writes before
> ->rq_err is written and reads after ->rq_err is read.
>
> However we DO need a full memory barrier between setting ->rq_err and
> before the the waitqueue_active() read in wake_up_var(). This is a
> barrier between a write and a read, hence a full barrier: smb_mb().
>
> This addresses a race if wait_var_event() adds itself to the wait queue
> and tests ->rq_err such that the read in waitqueue_active() happens
> earlier and doesn't see that the task has been added, and the ->rq_err
> write is delayed so that the waiting task doesn't see it. In this case
> the wake_up never happens.
>
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
> include/linux/sunrpc/svc.h | 7 +++++--
> net/sunrpc/svc.c | 3 +--
> 2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> index 437672bcaa22..558e5ae03103 100644
> --- a/include/linux/sunrpc/svc.h
> +++ b/include/linux/sunrpc/svc.h
> @@ -326,8 +326,11 @@ static inline bool svc_thread_should_stop(struct svc_rqst *rqstp)
> */
> static inline void svc_thread_init_status(struct svc_rqst *rqstp, int err)
> {
> - /* store_release ensures svc_start_kthreads() sees the error */
> - smp_store_release(&rqstp->rq_err, err);
> + rqstp->rq_err = err;
> + /* memory barrier ensures assignment to error above is visible before
> + * waitqueue_active() test below completes.
> + */
> + smb_mb();
This should have been "smp_mb();". I fixed it up before applying.
> wake_up_var(&rqstp->rq_err);
> if (err)
> kthread_exit(1);
> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
> index ff6f3e35b36d..9aff845196ce 100644
> --- a/net/sunrpc/svc.c
> +++ b/net/sunrpc/svc.c
> @@ -818,8 +818,7 @@ svc_start_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
> svc_sock_update_bufs(serv);
> wake_up_process(task);
>
> - /* load_acquire ensures we get value stored in svc_thread_init_status() */
> - wait_var_event(&rqstp->rq_err, smp_load_acquire(&rqstp->rq_err) != -EAGAIN);
> + wait_var_event(&rqstp->rq_err, rqstp->rq_err != -EAGAIN);
> err = rqstp->rq_err;
> if (err) {
> svc_exit_thread(rqstp);
> --
> 2.46.0
>
I've squashed this into the already-applied patch in nfsd-next.
Thanks!
--
Chuck Lever
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2024-09-16 3:51 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-15 23:45 [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly NeilBrown
2024-09-16 0:24 ` Jeff Layton
2024-09-16 2:27 ` kernel test robot
2024-09-16 2:27 ` kernel test robot
2024-09-16 3:50 ` Chuck Lever
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.