All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly
@ 2024-09-15 23:45 NeilBrown
  2024-09-16  0:24 ` Jeff Layton
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: NeilBrown @ 2024-09-15 23:45 UTC (permalink / raw)
  To: Chuck Lever III, Jeff Layton; +Cc: Linux NFS Mailing List, Olga Kornievskaia


The memory barriers in this patch were all wrong.
smp_store_release / smp_load_acquire ensures that all changes written
before the store_release are equally visible after the acquire.
These are not needed here as the *only* value that
svc_thread_init_status() or its caller sets that is of any interest to
svc_start_kthread() is ->rq_err.  The barrier wold effect writes before
->rq_err is written and reads after ->rq_err is read.

However we DO need a full memory barrier between setting ->rq_err and
before the the waitqueue_active() read in wake_up_var().  This is a
barrier between a write and a read, hence a full barrier: smb_mb().

This addresses a race if wait_var_event() adds itself to the wait queue
and tests ->rq_err such that the read in waitqueue_active() happens
earlier and doesn't see that the task has been added, and the ->rq_err
write is delayed so that the waiting task doesn't see it.  In this case
the wake_up never happens.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 include/linux/sunrpc/svc.h | 7 +++++--
 net/sunrpc/svc.c           | 3 +--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 437672bcaa22..558e5ae03103 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -326,8 +326,11 @@ static inline bool svc_thread_should_stop(struct svc_rqst *rqstp)
  */
 static inline void svc_thread_init_status(struct svc_rqst *rqstp, int err)
 {
-	/* store_release ensures svc_start_kthreads() sees the error */
-	smp_store_release(&rqstp->rq_err, err);
+	rqstp->rq_err = err;
+	/* memory barrier ensures assignment to error above is visible before
+	 * waitqueue_active() test below completes.
+	 */
+	smb_mb();
 	wake_up_var(&rqstp->rq_err);
 	if (err)
 		kthread_exit(1);
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index ff6f3e35b36d..9aff845196ce 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -818,8 +818,7 @@ svc_start_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
 		svc_sock_update_bufs(serv);
 		wake_up_process(task);
 
-		/* load_acquire ensures we get value stored in svc_thread_init_status() */
-		wait_var_event(&rqstp->rq_err, smp_load_acquire(&rqstp->rq_err) != -EAGAIN);
+		wait_var_event(&rqstp->rq_err, rqstp->rq_err != -EAGAIN);
 		err = rqstp->rq_err;
 		if (err) {
 			svc_exit_thread(rqstp);
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly
  2024-09-15 23:45 [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly NeilBrown
@ 2024-09-16  0:24 ` Jeff Layton
  2024-09-16  2:27 ` kernel test robot
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Jeff Layton @ 2024-09-16  0:24 UTC (permalink / raw)
  To: NeilBrown, Chuck Lever III
  Cc: Linux NFS Mailing List,
	Olga Kornievskaia <okorniev@redhat.com> Olga Kornievskaia,
	Dai Ngo, Tom Talpey

On Mon, 2024-09-16 at 09:45 +1000, NeilBrown wrote:
> The memory barriers in this patch were all wrong.
> smp_store_release / smp_load_acquire ensures that all changes written
> before the store_release are equally visible after the acquire.
> These are not needed here as the *only* value that
> svc_thread_init_status() or its caller sets that is of any interest to
> svc_start_kthread() is ->rq_err.  The barrier wold effect writes before
> ->rq_err is written and reads after ->rq_err is read.
> 
> However we DO need a full memory barrier between setting ->rq_err and
> before the the waitqueue_active() read in wake_up_var().  This is a
> barrier between a write and a read, hence a full barrier: smb_mb().
> 
> This addresses a race if wait_var_event() adds itself to the wait queue
> and tests ->rq_err such that the read in waitqueue_active() happens
> earlier and doesn't see that the task has been added, and the ->rq_err
> write is delayed so that the waiting task doesn't see it.  In this case
> the wake_up never happens.
> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  include/linux/sunrpc/svc.h | 7 +++++--
>  net/sunrpc/svc.c           | 3 +--
>  2 files changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> index 437672bcaa22..558e5ae03103 100644
> --- a/include/linux/sunrpc/svc.h
> +++ b/include/linux/sunrpc/svc.h
> @@ -326,8 +326,11 @@ static inline bool svc_thread_should_stop(struct svc_rqst *rqstp)
>   */
>  static inline void svc_thread_init_status(struct svc_rqst *rqstp, int err)
>  {
> -	/* store_release ensures svc_start_kthreads() sees the error */
> -	smp_store_release(&rqstp->rq_err, err);
> +	rqstp->rq_err = err;
> +	/* memory barrier ensures assignment to error above is visible before
> +	 * waitqueue_active() test below completes.
> +	 */
> +	smb_mb();
>  	wake_up_var(&rqstp->rq_err);
>  	if (err)
>  		kthread_exit(1);
> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
> index ff6f3e35b36d..9aff845196ce 100644
> --- a/net/sunrpc/svc.c
> +++ b/net/sunrpc/svc.c
> @@ -818,8 +818,7 @@ svc_start_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
>  		svc_sock_update_bufs(serv);
>  		wake_up_process(task);
>  
> -		/* load_acquire ensures we get value stored in svc_thread_init_status() */
> -		wait_var_event(&rqstp->rq_err, smp_load_acquire(&rqstp->rq_err) != -EAGAIN);
> +		wait_var_event(&rqstp->rq_err, rqstp->rq_err != -EAGAIN);
>  		err = rqstp->rq_err;
>  		if (err) {
>  			svc_exit_thread(rqstp);

Makes sense.

Reviewed-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly
  2024-09-15 23:45 [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly NeilBrown
  2024-09-16  0:24 ` Jeff Layton
@ 2024-09-16  2:27 ` kernel test robot
  2024-09-16  2:27 ` kernel test robot
  2024-09-16  3:50 ` Chuck Lever
  3 siblings, 0 replies; 5+ messages in thread
From: kernel test robot @ 2024-09-16  2:27 UTC (permalink / raw)
  To: NeilBrown, Chuck Lever III, Jeff Layton; +Cc: oe-kbuild-all

Hi NeilBrown,

kernel test robot noticed the following build errors:

[auto build test ERROR on next-20240913]
[cannot apply to trondmy-nfs/linux-next v6.11 v6.11-rc7 v6.11-rc6 linus/master v6.11]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/NeilBrown/SQUASH-sunrpc-allow-svc-threads-to-fail-initialisation-cleanly/20240916-074739
base:   next-20240913
patch link:    https://lore.kernel.org/r/172644394073.17050.16376953609629336068%40noble.neil.brown.name
patch subject: [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly
config: x86_64-rhel-8.3-bpf (https://download.01.org/0day-ci/archive/20240916/202409161033.eurajfZc-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240916/202409161033.eurajfZc-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409161033.eurajfZc-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from include/linux/sunrpc/svcsock.h:13,
                    from include/linux/sunrpc/bc_xprt.h:16,
                    from net/sunrpc/clnt.c:40:
   include/linux/sunrpc/svc.h: In function 'svc_thread_init_status':
>> include/linux/sunrpc/svc.h:333:9: error: implicit declaration of function 'smb_mb'; did you mean 'smp_mb'? [-Werror=implicit-function-declaration]
     333 |         smb_mb();
         |         ^~~~~~
         |         smp_mb
   cc1: some warnings being treated as errors


vim +333 include/linux/sunrpc/svc.h

   313	
   314	/**
   315	 * svc_thread_init_status - report whether thread has initialised successfully
   316	 * @rqstp: the thread in question
   317	 * @err: errno code
   318	 *
   319	 * After performing any initialisation that could fail, and before starting
   320	 * normal work, each sunrpc svc_thread must call svc_thread_init_status()
   321	 * with an appropriate error, or zero.
   322	 *
   323	 * If zero is passed, the thread is ready and must continue until
   324	 * svc_thread_should_stop() returns true.  If a non-zero error is passed
   325	 * the call will not return - the thread will exit.
   326	 */
   327	static inline void svc_thread_init_status(struct svc_rqst *rqstp, int err)
   328	{
   329		rqstp->rq_err = err;
   330		/* memory barrier ensures assignment to error above is visible before
   331		 * waitqueue_active() test below completes.
   332		 */
 > 333		smb_mb();
   334		wake_up_var(&rqstp->rq_err);
   335		if (err)
   336			kthread_exit(1);
   337	}
   338	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly
  2024-09-15 23:45 [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly NeilBrown
  2024-09-16  0:24 ` Jeff Layton
  2024-09-16  2:27 ` kernel test robot
@ 2024-09-16  2:27 ` kernel test robot
  2024-09-16  3:50 ` Chuck Lever
  3 siblings, 0 replies; 5+ messages in thread
From: kernel test robot @ 2024-09-16  2:27 UTC (permalink / raw)
  To: NeilBrown, Chuck Lever III, Jeff Layton; +Cc: llvm, oe-kbuild-all

Hi NeilBrown,

kernel test robot noticed the following build errors:

[auto build test ERROR on next-20240913]
[cannot apply to trondmy-nfs/linux-next v6.11 v6.11-rc7 v6.11-rc6 linus/master v6.11]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/NeilBrown/SQUASH-sunrpc-allow-svc-threads-to-fail-initialisation-cleanly/20240916-074739
base:   next-20240913
patch link:    https://lore.kernel.org/r/172644394073.17050.16376953609629336068%40noble.neil.brown.name
patch subject: [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly
config: x86_64-rhel-8.3-rust (https://download.01.org/0day-ci/archive/20240916/202409161022.FCpNdJMX-lkp@intel.com/config)
compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240916/202409161022.FCpNdJMX-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409161022.FCpNdJMX-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from fs/nfsd/trace.c:4:
   In file included from fs/nfsd/trace.h:18:
   In file included from fs/nfsd/nfsfh.h:12:
>> include/linux/sunrpc/svc.h:333:2: error: call to undeclared function 'smb_mb'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     333 |         smb_mb();
         |         ^
   1 error generated.
--
   In file included from fs/nfsd/export.c:19:
   In file included from include/linux/sunrpc/svc_xprt.h:11:
>> include/linux/sunrpc/svc.h:333:2: error: call to undeclared function 'smb_mb'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     333 |         smb_mb();
         |         ^
   fs/nfsd/export.c:1020:17: warning: variable 'inode' set but not used [-Wunused-but-set-variable]
    1020 |         struct inode            *inode;
         |                                  ^
   1 warning and 1 error generated.


vim +/smb_mb +333 include/linux/sunrpc/svc.h

   313	
   314	/**
   315	 * svc_thread_init_status - report whether thread has initialised successfully
   316	 * @rqstp: the thread in question
   317	 * @err: errno code
   318	 *
   319	 * After performing any initialisation that could fail, and before starting
   320	 * normal work, each sunrpc svc_thread must call svc_thread_init_status()
   321	 * with an appropriate error, or zero.
   322	 *
   323	 * If zero is passed, the thread is ready and must continue until
   324	 * svc_thread_should_stop() returns true.  If a non-zero error is passed
   325	 * the call will not return - the thread will exit.
   326	 */
   327	static inline void svc_thread_init_status(struct svc_rqst *rqstp, int err)
   328	{
   329		rqstp->rq_err = err;
   330		/* memory barrier ensures assignment to error above is visible before
   331		 * waitqueue_active() test below completes.
   332		 */
 > 333		smb_mb();
   334		wake_up_var(&rqstp->rq_err);
   335		if (err)
   336			kthread_exit(1);
   337	}
   338	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly
  2024-09-15 23:45 [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly NeilBrown
                   ` (2 preceding siblings ...)
  2024-09-16  2:27 ` kernel test robot
@ 2024-09-16  3:50 ` Chuck Lever
  3 siblings, 0 replies; 5+ messages in thread
From: Chuck Lever @ 2024-09-16  3:50 UTC (permalink / raw)
  To: NeilBrown
  Cc: Jeff Layton, Linux NFS Mailing List,
	Olga Kornievskaia <okorniev@redhat.com>Olga Kornievskaia,
	Dai Ngo, Tom Talpey

On Mon, Sep 16, 2024 at 09:45:40AM +1000, NeilBrown wrote:
> 
> The memory barriers in this patch were all wrong.
> smp_store_release / smp_load_acquire ensures that all changes written
> before the store_release are equally visible after the acquire.
> These are not needed here as the *only* value that
> svc_thread_init_status() or its caller sets that is of any interest to
> svc_start_kthread() is ->rq_err.  The barrier wold effect writes before
> ->rq_err is written and reads after ->rq_err is read.
> 
> However we DO need a full memory barrier between setting ->rq_err and
> before the the waitqueue_active() read in wake_up_var().  This is a
> barrier between a write and a read, hence a full barrier: smb_mb().
> 
> This addresses a race if wait_var_event() adds itself to the wait queue
> and tests ->rq_err such that the read in waitqueue_active() happens
> earlier and doesn't see that the task has been added, and the ->rq_err
> write is delayed so that the waiting task doesn't see it.  In this case
> the wake_up never happens.
> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  include/linux/sunrpc/svc.h | 7 +++++--
>  net/sunrpc/svc.c           | 3 +--
>  2 files changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> index 437672bcaa22..558e5ae03103 100644
> --- a/include/linux/sunrpc/svc.h
> +++ b/include/linux/sunrpc/svc.h
> @@ -326,8 +326,11 @@ static inline bool svc_thread_should_stop(struct svc_rqst *rqstp)
>   */
>  static inline void svc_thread_init_status(struct svc_rqst *rqstp, int err)
>  {
> -	/* store_release ensures svc_start_kthreads() sees the error */
> -	smp_store_release(&rqstp->rq_err, err);
> +	rqstp->rq_err = err;
> +	/* memory barrier ensures assignment to error above is visible before
> +	 * waitqueue_active() test below completes.
> +	 */
> +	smb_mb();

This should have been "smp_mb();". I fixed it up before applying.


>  	wake_up_var(&rqstp->rq_err);
>  	if (err)
>  		kthread_exit(1);
> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
> index ff6f3e35b36d..9aff845196ce 100644
> --- a/net/sunrpc/svc.c
> +++ b/net/sunrpc/svc.c
> @@ -818,8 +818,7 @@ svc_start_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
>  		svc_sock_update_bufs(serv);
>  		wake_up_process(task);
>  
> -		/* load_acquire ensures we get value stored in svc_thread_init_status() */
> -		wait_var_event(&rqstp->rq_err, smp_load_acquire(&rqstp->rq_err) != -EAGAIN);
> +		wait_var_event(&rqstp->rq_err, rqstp->rq_err != -EAGAIN);
>  		err = rqstp->rq_err;
>  		if (err) {
>  			svc_exit_thread(rqstp);
> -- 
> 2.46.0
> 

I've squashed this into the already-applied patch in nfsd-next.
Thanks!

-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-09-16  3:51 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-15 23:45 [PATCH nfsd-next] SQUASH sunrpc: allow svc threads to fail initialisation cleanly NeilBrown
2024-09-16  0:24 ` Jeff Layton
2024-09-16  2:27 ` kernel test robot
2024-09-16  2:27 ` kernel test robot
2024-09-16  3:50 ` Chuck Lever

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.