* [PATCH v3 0/6] Ensure that ENETUNREACH terminates state recovery
@ 2025-03-25 22:35 trondmy
2025-03-25 22:35 ` [PATCH v3 1/6] SUNRPC: rpcbind should never reset the port to the value '0' trondmy
` (5 more replies)
0 siblings, 6 replies; 24+ messages in thread
From: trondmy @ 2025-03-25 22:35 UTC (permalink / raw)
To: linux-nfs; +Cc: Jeff Layton, Josef Bacik, Benjamin Coddington
From: Trond Myklebust <trond.myklebust@hammerspace.com>
With the recent patch series that made containerised mounts treat
ENETUNREACH and ENETDOWN errors as fatal, we also want to ensure that
the state manager thread triggers fatal errors in the processes and
threads that are waiting for recovery to complete.
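As an illustration, here is a minimal sketch (simplified, not the
exact upstream code) of how a waiter observes this: once the state
manager has recorded a negative cl_cons_state, the wait for recovery
returns a fatal error instead of blocking forever:

/*
 * Minimal sketch, assuming the nfs_client fields used in this series:
 * wait for the state manager to finish, then treat a negative
 * cl_cons_state as "client is dead" and return -EIO (see the v2 note
 * below about returning EIO rather than ENETUNREACH).
 */
static int wait_clnt_recover_sketch(struct nfs_client *clp)
{
	int res;

	res = wait_on_bit_action(&clp->cl_state, NFS4CLNT_MANAGER_RUNNING,
				 nfs_wait_bit_killable,
				 TASK_KILLABLE | TASK_FREEZABLE_UNSAFE);
	if (res)
		return res;		/* killed while waiting */
	return clp->cl_cons_state < 0 ? -EIO : 0;
}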
---
v2:
- Return EIO instead of ENETUNREACH in nfs4_wait_clnt_recover()
v3:
- Fix the sysfs shutdown of the nfs_client
- Replace tests of cl_shutdown in NFS code
Trond Myklebust (6):
SUNRPC: rpcbind should never reset the port to the value '0'
SUNRPC: rpc_clnt_set_transport() must not change the autobind setting
NFS: Shut down the nfs_client only after all the superblocks
NFSv4: Further cleanups to shutdown loops
NFSv4: clp->cl_cons_state < 0 signifies an invalid nfs_client
NFSv4: Treat ENETUNREACH errors as fatal for state recovery
fs/nfs/nfs4proc.c | 2 +-
fs/nfs/nfs4state.c | 14 +++++++++++---
fs/nfs/sysfs.c | 22 +++++++++++++++++++++-
net/sunrpc/clnt.c | 3 ---
net/sunrpc/rpcb_clnt.c | 5 +++--
5 files changed, 36 insertions(+), 10 deletions(-)
--
2.49.0
* [PATCH v3 1/6] SUNRPC: rpcbind should never reset the port to the value '0'
2025-03-25 22:35 [PATCH v3 0/6] Ensure that ENETUNREACH terminates state recovery trondmy
@ 2025-03-25 22:35 ` trondmy
2025-03-26 10:43 ` Benjamin Coddington
2025-03-25 22:35 ` [PATCH v3 2/6] SUNRPC: rpc_clnt_set_transport() must not change the autobind setting trondmy
` (4 subsequent siblings)
5 siblings, 1 reply; 24+ messages in thread
From: trondmy @ 2025-03-25 22:35 UTC (permalink / raw)
To: linux-nfs; +Cc: Jeff Layton, Josef Bacik, Benjamin Coddington
From: Trond Myklebust <trond.myklebust@hammerspace.com>
If we already had a valid port number for the RPC service, then we
should not allow the rpcbind client to set it to the invalid value '0'.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
net/sunrpc/rpcb_clnt.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
index 102c3818bc54..53bcca365fb1 100644
--- a/net/sunrpc/rpcb_clnt.c
+++ b/net/sunrpc/rpcb_clnt.c
@@ -820,9 +820,10 @@ static void rpcb_getport_done(struct rpc_task *child, void *data)
}
trace_rpcb_setport(child, map->r_status, map->r_port);
- xprt->ops->set_port(xprt, map->r_port);
- if (map->r_port)
+ if (map->r_port) {
+ xprt->ops->set_port(xprt, map->r_port);
xprt_set_bound(xprt);
+ }
}
/*
--
2.49.0
* [PATCH v3 2/6] SUNRPC: rpc_clnt_set_transport() must not change the autobind setting
2025-03-25 22:35 [PATCH v3 0/6] Ensure that ENETUNREACH terminates state recovery trondmy
2025-03-25 22:35 ` [PATCH v3 1/6] SUNRPC: rpcbind should never reset the port to the value '0' trondmy
@ 2025-03-25 22:35 ` trondmy
2025-03-26 10:43 ` Benjamin Coddington
2025-03-25 22:35 ` [PATCH v3 3/6] NFS: Shut down the nfs_client only after all the superblocks trondmy
` (3 subsequent siblings)
5 siblings, 1 reply; 24+ messages in thread
From: trondmy @ 2025-03-25 22:35 UTC (permalink / raw)
To: linux-nfs; +Cc: Jeff Layton, Josef Bacik, Benjamin Coddington
From: Trond Myklebust <trond.myklebust@hammerspace.com>
The autobind setting was supposed to be determined in rpc_create(),
since commit c2866763b402 ("SUNRPC: use sockaddr + size when creating
remote transport endpoints").
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
net/sunrpc/clnt.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 1cfaf93ceec1..6f75862d9782 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -270,9 +270,6 @@ static struct rpc_xprt *rpc_clnt_set_transport(struct rpc_clnt *clnt,
old = rcu_dereference_protected(clnt->cl_xprt,
lockdep_is_held(&clnt->cl_lock));
- if (!xprt_bound(xprt))
- clnt->cl_autobind = 1;
-
clnt->cl_timeout = timeout;
rcu_assign_pointer(clnt->cl_xprt, xprt);
spin_unlock(&clnt->cl_lock);
--
2.49.0
* [PATCH v3 3/6] NFS: Shut down the nfs_client only after all the superblocks
2025-03-25 22:35 [PATCH v3 0/6] Ensure that ENETUNREACH terminates state recovery trondmy
2025-03-25 22:35 ` [PATCH v3 1/6] SUNRPC: rpcbind should never reset the port to the value '0' trondmy
2025-03-25 22:35 ` [PATCH v3 2/6] SUNRPC: rpc_clnt_set_transport() must not change the autobind setting trondmy
@ 2025-03-25 22:35 ` trondmy
2025-03-26 0:15 ` Jeff Layton
2025-03-25 22:35 ` [PATCH v3 4/6] NFSv4: Further cleanups to shutdown loops trondmy
` (2 subsequent siblings)
5 siblings, 1 reply; 24+ messages in thread
From: trondmy @ 2025-03-25 22:35 UTC (permalink / raw)
To: linux-nfs; +Cc: Jeff Layton, Josef Bacik, Benjamin Coddington
From: Trond Myklebust <trond.myklebust@hammerspace.com>
The nfs_client manages state for all the superblocks in the
"cl_superblocks" list, so it must not be shut down until all of them are
gone.
Fixes: 7d3e26a054c8 ("NFS: Cancel all existing RPC tasks when shutdown")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
fs/nfs/sysfs.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/sysfs.c b/fs/nfs/sysfs.c
index b30401b2c939..37cb2b776435 100644
--- a/fs/nfs/sysfs.c
+++ b/fs/nfs/sysfs.c
@@ -14,6 +14,7 @@
#include <linux/rcupdate.h>
#include <linux/lockd/lockd.h>
+#include "internal.h"
#include "nfs4_fs.h"
#include "netns.h"
#include "sysfs.h"
@@ -228,6 +229,25 @@ static void shutdown_client(struct rpc_clnt *clnt)
rpc_cancel_tasks(clnt, -EIO, shutdown_match_client, NULL);
}
+/*
+ * Shut down the nfs_client only once all the superblocks
+ * have been shut down.
+ */
+static void shutdown_nfs_client(struct nfs_client *clp)
+{
+ struct nfs_server *server;
+ rcu_read_lock();
+ list_for_each_entry_rcu(server, &clp->cl_superblocks, client_link) {
+ if (!(server->flags & NFS_MOUNT_SHUTDOWN)) {
+ rcu_read_unlock();
+ return;
+ }
+ }
+ rcu_read_unlock();
+ nfs_mark_client_ready(clp, -EIO);
+ shutdown_client(clp->cl_rpcclient);
+}
+
static ssize_t
shutdown_show(struct kobject *kobj, struct kobj_attribute *attr,
char *buf)
@@ -259,7 +279,6 @@ shutdown_store(struct kobject *kobj, struct kobj_attribute *attr,
server->flags |= NFS_MOUNT_SHUTDOWN;
shutdown_client(server->client);
- shutdown_client(server->nfs_client->cl_rpcclient);
if (!IS_ERR(server->client_acl))
shutdown_client(server->client_acl);
@@ -267,6 +286,7 @@ shutdown_store(struct kobject *kobj, struct kobj_attribute *attr,
if (server->nlm_host)
shutdown_client(server->nlm_host->h_rpcclnt);
out:
+ shutdown_nfs_client(server->nfs_client);
return count;
}
--
2.49.0
* [PATCH v3 4/6] NFSv4: Further cleanups to shutdown loops
2025-03-25 22:35 [PATCH v3 0/6] Ensure that ENETUNREACH terminates state recovery trondmy
` (2 preceding siblings ...)
2025-03-25 22:35 ` [PATCH v3 3/6] NFS: Shut down the nfs_client only after all the superblocks trondmy
@ 2025-03-25 22:35 ` trondmy
2025-03-26 0:17 ` Jeff Layton
2025-03-26 10:13 ` Jeff Layton
2025-03-25 22:35 ` [PATCH v3 5/6] NFSv4: clp->cl_cons_state < 0 signifies an invalid nfs_client trondmy
2025-03-25 22:35 ` [PATCH v3 6/6] NFSv4: Treat ENETUNREACH errors as fatal for state recovery trondmy
5 siblings, 2 replies; 24+ messages in thread
From: trondmy @ 2025-03-25 22:35 UTC (permalink / raw)
To: linux-nfs; +Cc: Jeff Layton, Josef Bacik, Benjamin Coddington
From: Trond Myklebust <trond.myklebust@hammerspace.com>
Replace the tests for the RPC client being shut down with tests for
whether the nfs_client is in an error state.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
fs/nfs/nfs4proc.c | 2 +-
fs/nfs/nfs4state.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 889511650ceb..50be54e0f578 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -9580,7 +9580,7 @@ static void nfs41_sequence_call_done(struct rpc_task *task, void *data)
return;
trace_nfs4_sequence(clp, task->tk_status);
- if (task->tk_status < 0 && !task->tk_client->cl_shutdown) {
+ if (task->tk_status < 0 && clp->cl_cons_state >= 0) {
dprintk("%s ERROR %d\n", __func__, task->tk_status);
if (refcount_read(&clp->cl_count) == 1)
return;
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 542cdf71229f..f1f7eaa97973 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1198,7 +1198,7 @@ void nfs4_schedule_state_manager(struct nfs_client *clp)
struct rpc_clnt *clnt = clp->cl_rpcclient;
bool swapon = false;
- if (clnt->cl_shutdown)
+ if (clp->cl_cons_state < 0)
return;
set_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state);
--
2.49.0
* [PATCH v3 5/6] NFSv4: clp->cl_cons_state < 0 signifies an invalid nfs_client
2025-03-25 22:35 [PATCH v3 0/6] Ensure that ENETUNREACH terminates state recovery trondmy
` (3 preceding siblings ...)
2025-03-25 22:35 ` [PATCH v3 4/6] NFSv4: Further cleanups to shutdown loops trondmy
@ 2025-03-25 22:35 ` trondmy
2025-03-26 0:17 ` Jeff Layton
2025-03-26 10:46 ` Benjamin Coddington
2025-03-25 22:35 ` [PATCH v3 6/6] NFSv4: Treat ENETUNREACH errors as fatal for state recovery trondmy
5 siblings, 2 replies; 24+ messages in thread
From: trondmy @ 2025-03-25 22:35 UTC (permalink / raw)
To: linux-nfs; +Cc: Jeff Layton, Josef Bacik, Benjamin Coddington
From: Trond Myklebust <trond.myklebust@hammerspace.com>
If someone calls nfs_mark_client_ready(clp, status) with a negative
value for status, then that should signal that the nfs_client is no
longer valid.
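For reference, a rough sketch of what nfs_mark_client_ready() does
(simplified; the real helper lives in fs/nfs/client.c):

void nfs_mark_client_ready(struct nfs_client *clp, int state)
{
	smp_wmb();		/* publish state before waking waiters */
	clp->cl_cons_state = state;
	wake_up_all(&nfs_client_active_wq);
}

A negative value stored in cl_cons_state is therefore visible to every
waiter, which is what makes it a natural "this nfs_client is invalid"
marker.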
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
fs/nfs/nfs4state.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index f1f7eaa97973..272d2ebdae0f 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1403,7 +1403,7 @@ int nfs4_schedule_stateid_recovery(const struct nfs_server *server, struct nfs4_
dprintk("%s: scheduling stateid recovery for server %s\n", __func__,
clp->cl_hostname);
nfs4_schedule_state_manager(clp);
- return 0;
+ return clp->cl_cons_state < 0 ? clp->cl_cons_state : 0;
}
EXPORT_SYMBOL_GPL(nfs4_schedule_stateid_recovery);
--
2.49.0
* [PATCH v3 6/6] NFSv4: Treat ENETUNREACH errors as fatal for state recovery
2025-03-25 22:35 [PATCH v3 0/6] Ensure that ENETUNREACH terminates state recovery trondmy
` (4 preceding siblings ...)
2025-03-25 22:35 ` [PATCH v3 5/6] NFSv4: clp->cl_cons_state < 0 signifies an invalid nfs_client trondmy
@ 2025-03-25 22:35 ` trondmy
2025-03-26 10:39 ` Benjamin Coddington
5 siblings, 1 reply; 24+ messages in thread
From: trondmy @ 2025-03-25 22:35 UTC (permalink / raw)
To: linux-nfs; +Cc: Jeff Layton, Josef Bacik, Benjamin Coddington
From: Trond Myklebust <trond.myklebust@hammerspace.com>
If a containerised process is killed and causes an ENETUNREACH or
ENETDOWN error to be propagated to the state manager, then mark the
nfs_client as being dead so that we don't loop in functions that are
expecting recovery to succeed.
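Together with the previous patch, this lets callers see the failure;
an illustrative caller-side pattern (for illustration only, not a
verbatim excerpt):

	err = nfs4_schedule_stateid_recovery(server, state);
	if (err < 0)		/* e.g. -EIO once the client is dead */
		return err;	/* stop retrying instead of looping */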
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
fs/nfs/nfs4state.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 272d2ebdae0f..7612e977e80b 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -2739,7 +2739,15 @@ static void nfs4_state_manager(struct nfs_client *clp)
pr_warn_ratelimited("NFS: state manager%s%s failed on NFSv4 server %s"
" with error %d\n", section_sep, section,
clp->cl_hostname, -status);
- ssleep(1);
+ switch (status) {
+ case -ENETDOWN:
+ case -ENETUNREACH:
+ nfs_mark_client_ready(clp, -EIO);
+ break;
+ default:
+ ssleep(1);
+ break;
+ }
out_drain:
memalloc_nofs_restore(memflags);
nfs4_end_drain_session(clp);
--
2.49.0
* Re: [PATCH v3 3/6] NFS: Shut down the nfs_client only after all the superblocks
2025-03-25 22:35 ` [PATCH v3 3/6] NFS: Shut down the nfs_client only after all the superblocks trondmy
@ 2025-03-26 0:15 ` Jeff Layton
2025-03-26 0:37 ` Trond Myklebust
0 siblings, 1 reply; 24+ messages in thread
From: Jeff Layton @ 2025-03-26 0:15 UTC (permalink / raw)
To: trondmy, linux-nfs; +Cc: Josef Bacik, Benjamin Coddington
On Tue, 2025-03-25 at 18:35 -0400, trondmy@kernel.org wrote:
> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>
> The nfs_client manages state for all the superblocks in the
> "cl_superblocks" list, so it must not be shut down until all of them are
> gone.
>
> Fixes: 7d3e26a054c8 ("NFS: Cancel all existing RPC tasks when shutdown")
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> ---
> fs/nfs/sysfs.c | 22 +++++++++++++++++++++-
> 1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/fs/nfs/sysfs.c b/fs/nfs/sysfs.c
> index b30401b2c939..37cb2b776435 100644
> --- a/fs/nfs/sysfs.c
> +++ b/fs/nfs/sysfs.c
> @@ -14,6 +14,7 @@
> #include <linux/rcupdate.h>
> #include <linux/lockd/lockd.h>
>
> +#include "internal.h"
> #include "nfs4_fs.h"
> #include "netns.h"
> #include "sysfs.h"
> @@ -228,6 +229,25 @@ static void shutdown_client(struct rpc_clnt *clnt)
> rpc_cancel_tasks(clnt, -EIO, shutdown_match_client, NULL);
> }
>
> +/*
> + * Shut down the nfs_client only once all the superblocks
> + * have been shut down.
> + */
> +static void shutdown_nfs_client(struct nfs_client *clp)
> +{
> + struct nfs_server *server;
> + rcu_read_lock();
> + list_for_each_entry_rcu(server, &clp->cl_superblocks, client_link) {
> + if (!(server->flags & NFS_MOUNT_SHUTDOWN)) {
> + rcu_read_unlock();
> + return;
> + }
> + }
> + rcu_read_unlock();
> + nfs_mark_client_ready(clp, -EIO);
> + shutdown_client(clp->cl_rpcclient);
> +}
> +
Isn't the upshot of this that a mount won't actually get shut down
until you shut down all of the mounts to the same server? Personally,
I find that acceptable, but we should note that it is a change in
behavior.
> static ssize_t
> shutdown_show(struct kobject *kobj, struct kobj_attribute *attr,
> char *buf)
> @@ -259,7 +279,6 @@ shutdown_store(struct kobject *kobj, struct kobj_attribute *attr,
>
> server->flags |= NFS_MOUNT_SHUTDOWN;
> shutdown_client(server->client);
> - shutdown_client(server->nfs_client->cl_rpcclient);
>
> if (!IS_ERR(server->client_acl))
> shutdown_client(server->client_acl);
> @@ -267,6 +286,7 @@ shutdown_store(struct kobject *kobj, struct kobj_attribute *attr,
> if (server->nlm_host)
> shutdown_client(server->nlm_host->h_rpcclnt);
> out:
> + shutdown_nfs_client(server->nfs_client);
> return count;
> }
>
--
Jeff Layton <jlayton@kernel.org>
* Re: [PATCH v3 4/6] NFSv4: Further cleanups to shutdown loops
2025-03-25 22:35 ` [PATCH v3 4/6] NFSv4: Further cleanups to shutdown loops trondmy
@ 2025-03-26 0:17 ` Jeff Layton
2025-03-26 10:13 ` Jeff Layton
1 sibling, 0 replies; 24+ messages in thread
From: Jeff Layton @ 2025-03-26 0:17 UTC (permalink / raw)
To: trondmy, linux-nfs; +Cc: Josef Bacik, Benjamin Coddington
On Tue, 2025-03-25 at 18:35 -0400, trondmy@kernel.org wrote:
> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>
> Replace the tests for the RPC client being shut down with tests for
> whether the nfs_client is in an error state.
>
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> ---
> fs/nfs/nfs4proc.c | 2 +-
> fs/nfs/nfs4state.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 889511650ceb..50be54e0f578 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -9580,7 +9580,7 @@ static void nfs41_sequence_call_done(struct rpc_task *task, void *data)
> return;
>
> trace_nfs4_sequence(clp, task->tk_status);
> - if (task->tk_status < 0 && !task->tk_client->cl_shutdown) {
> + if (task->tk_status < 0 && clp->cl_cons_state >= 0) {
> dprintk("%s ERROR %d\n", __func__, task->tk_status);
> if (refcount_read(&clp->cl_count) == 1)
> return;
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index 542cdf71229f..f1f7eaa97973 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -1198,7 +1198,7 @@ void nfs4_schedule_state_manager(struct nfs_client *clp)
> struct rpc_clnt *clnt = clp->cl_rpcclient;
> bool swapon = false;
>
> - if (clnt->cl_shutdown)
> + if (clp->cl_cons_state < 0)
> return;
>
> set_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state);
This should probably also have:
Fixes: 6ad477a69ad8 ("NFSv4: Clean up some shutdown loops")
...but otherwise this looks good to me.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
* Re: [PATCH v3 5/6] NFSv4: clp->cl_cons_state < 0 signifies an invalid nfs_client
2025-03-25 22:35 ` [PATCH v3 5/6] NFSv4: clp->cl_cons_state < 0 signifies an invalid nfs_client trondmy
@ 2025-03-26 0:17 ` Jeff Layton
2025-03-26 10:46 ` Benjamin Coddington
1 sibling, 0 replies; 24+ messages in thread
From: Jeff Layton @ 2025-03-26 0:17 UTC (permalink / raw)
To: trondmy, linux-nfs; +Cc: Josef Bacik, Benjamin Coddington
On Tue, 2025-03-25 at 18:35 -0400, trondmy@kernel.org wrote:
> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>
> If someone calls nfs_mark_client_ready(clp, status) with a negative
> value for status, then that should signal that the nfs_client is no
> longer valid.
>
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> ---
> fs/nfs/nfs4state.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index f1f7eaa97973..272d2ebdae0f 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -1403,7 +1403,7 @@ int nfs4_schedule_stateid_recovery(const struct nfs_server *server, struct nfs4_
> dprintk("%s: scheduling stateid recovery for server %s\n", __func__,
> clp->cl_hostname);
> nfs4_schedule_state_manager(clp);
> - return 0;
> + return clp->cl_cons_state < 0 ? clp->cl_cons_state : 0;
> }
> EXPORT_SYMBOL_GPL(nfs4_schedule_stateid_recovery);
>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
* Re: [PATCH v3 3/6] NFS: Shut down the nfs_client only after all the superblocks
2025-03-26 0:15 ` Jeff Layton
@ 2025-03-26 0:37 ` Trond Myklebust
2025-03-26 10:07 ` Benjamin Coddington
0 siblings, 1 reply; 24+ messages in thread
From: Trond Myklebust @ 2025-03-26 0:37 UTC (permalink / raw)
To: Jeff Layton, linux-nfs; +Cc: Josef Bacik, Benjamin Coddington
On Tue, 2025-03-25 at 20:15 -0400, Jeff Layton wrote:
> On Tue, 2025-03-25 at 18:35 -0400, trondmy@kernel.org wrote:
> > From: Trond Myklebust <trond.myklebust@hammerspace.com>
> >
> > The nfs_client manages state for all the superblocks in the
> > "cl_superblocks" list, so it must not be shut down until all of
> > them are
> > gone.
> >
> > Fixes: 7d3e26a054c8 ("NFS: Cancel all existing RPC tasks when
> > shutdown")
> > Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> > ---
> > fs/nfs/sysfs.c | 22 +++++++++++++++++++++-
> > 1 file changed, 21 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/nfs/sysfs.c b/fs/nfs/sysfs.c
> > index b30401b2c939..37cb2b776435 100644
> > --- a/fs/nfs/sysfs.c
> > +++ b/fs/nfs/sysfs.c
> > @@ -14,6 +14,7 @@
> > #include <linux/rcupdate.h>
> > #include <linux/lockd/lockd.h>
> >
> > +#include "internal.h"
> > #include "nfs4_fs.h"
> > #include "netns.h"
> > #include "sysfs.h"
> > @@ -228,6 +229,25 @@ static void shutdown_client(struct rpc_clnt
> > *clnt)
> > rpc_cancel_tasks(clnt, -EIO, shutdown_match_client, NULL);
> > }
> >
> > +/*
> > + * Shut down the nfs_client only once all the superblocks
> > + * have been shut down.
> > + */
> > +static void shutdown_nfs_client(struct nfs_client *clp)
> > +{
> > + struct nfs_server *server;
> > + rcu_read_lock();
> > + list_for_each_entry_rcu(server, &clp->cl_superblocks,
> > client_link) {
> > + if (!(server->flags & NFS_MOUNT_SHUTDOWN)) {
> > + rcu_read_unlock();
> > + return;
> > + }
> > + }
> > + rcu_read_unlock();
> > + nfs_mark_client_ready(clp, -EIO);
> > + shutdown_client(clp->cl_rpcclient);
> > +}
> > +
>
> Isn't the upshot of this that a mount won't actually get shutdown
> until
> you shutdown all of the mounts to the same server? Personally, I find
> that acceptable, but we should note that it is a change in behavior.
It means that the other mounts to the same server won't inevitably and
irrevocably lock up. I'm unhappy that I missed this bug when I applied
the patch, but that's not a reason to not fix it.
As I said in the above changelog, the nfs_client's RPC client is there
to manage state for all mounts to the same server in the same network
namespace. If you shut down that client while there are still mounts
that depend on it, then not only have you shot yourself in the foot,
but you're going to spend a lot of electrons to just loop madly when
those other mounts need to recover state but can't because you've
permanently shut down the only way for recovery threads to communicate
with the server.
>
>
> > static ssize_t
> > shutdown_show(struct kobject *kobj, struct kobj_attribute *attr,
> > char *buf)
> > @@ -259,7 +279,6 @@ shutdown_store(struct kobject *kobj, struct
> > kobj_attribute *attr,
> >
> > server->flags |= NFS_MOUNT_SHUTDOWN;
> > shutdown_client(server->client);
> > - shutdown_client(server->nfs_client->cl_rpcclient);
> >
> > if (!IS_ERR(server->client_acl))
> > shutdown_client(server->client_acl);
> > @@ -267,6 +286,7 @@ shutdown_store(struct kobject *kobj, struct
> > kobj_attribute *attr,
> > if (server->nlm_host)
> > shutdown_client(server->nlm_host->h_rpcclnt);
> > out:
> > + shutdown_nfs_client(server->nfs_client);
> > return count;
> > }
> >
>
* Re: [PATCH v3 3/6] NFS: Shut down the nfs_client only after all the superblocks
2025-03-26 0:37 ` Trond Myklebust
@ 2025-03-26 10:07 ` Benjamin Coddington
0 siblings, 0 replies; 24+ messages in thread
From: Benjamin Coddington @ 2025-03-26 10:07 UTC (permalink / raw)
To: Trond Myklebust; +Cc: Jeff Layton, linux-nfs, Josef Bacik
On 25 Mar 2025, at 20:37, Trond Myklebust wrote:
> On Tue, 2025-03-25 at 20:15 -0400, Jeff Layton wrote:
>> On Tue, 2025-03-25 at 18:35 -0400, trondmy@kernel.org wrote:
>>> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>>>
>>> The nfs_client manages state for all the superblocks in the
>>> "cl_superblocks" list, so it must not be shut down until all of
>>> them are
>>> gone.
>>>
>>> Fixes: 7d3e26a054c8 ("NFS: Cancel all existing RPC tasks when
>>> shutdown")
>>> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
>>> ---
>>> fs/nfs/sysfs.c | 22 +++++++++++++++++++++-
>>> 1 file changed, 21 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/nfs/sysfs.c b/fs/nfs/sysfs.c
>>> index b30401b2c939..37cb2b776435 100644
>>> --- a/fs/nfs/sysfs.c
>>> +++ b/fs/nfs/sysfs.c
>>> @@ -14,6 +14,7 @@
>>> #include <linux/rcupdate.h>
>>> #include <linux/lockd/lockd.h>
>>>
>>> +#include "internal.h"
>>> #include "nfs4_fs.h"
>>> #include "netns.h"
>>> #include "sysfs.h"
>>> @@ -228,6 +229,25 @@ static void shutdown_client(struct rpc_clnt
>>> *clnt)
>>> rpc_cancel_tasks(clnt, -EIO, shutdown_match_client, NULL);
>>> }
>>>
>>> +/*
>>> + * Shut down the nfs_client only once all the superblocks
>>> + * have been shut down.
>>> + */
>>> +static void shutdown_nfs_client(struct nfs_client *clp)
>>> +{
>>> + struct nfs_server *server;
>>> + rcu_read_lock();
>>> + list_for_each_entry_rcu(server, &clp->cl_superblocks,
>>> client_link) {
>>> + if (!(server->flags & NFS_MOUNT_SHUTDOWN)) {
>>> + rcu_read_unlock();
>>> + return;
>>> + }
>>> + }
>>> + rcu_read_unlock();
>>> + nfs_mark_client_ready(clp, -EIO);
>>> + shutdown_client(clp->cl_rpcclient);
>>> +}
>>> +
>>
>> Isn't the upshot of this that a mount won't actually get shutdown
>> until
>> you shutdown all of the mounts to the same server? Personally, I find
>> that acceptable, but we should note that it is a change in behavior.
>
> It means that the other mounts to the same server won't inevitably and
> irrevocably lock up. I'm unhappy that I missed this bug when I applied
> the patch, but that's not a reason to not fix it.
>
> As I said in the above changelog, the nfs_client's RPC client is there
> to manage state for all mounts to the same server in the same network
> namespace. If you shut down that client while there are still mounts
> that depend on it, then not only have you shot yourself in the foot,
> but you're going to spend a lot of electrons to just loop madly when
> those other mounts need to recover state but can't because you've
> permanently shut down the only way for recovery threads to communicate
> with the server.
The only use for shutdown so far has been when the server is permanently
unavailable, so breaking mounts to the same server didn't matter. I agree
this fix is the right thing to do now.
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Ben
* Re: [PATCH v3 4/6] NFSv4: Further cleanups to shutdown loops
2025-03-25 22:35 ` [PATCH v3 4/6] NFSv4: Further cleanups to shutdown loops trondmy
2025-03-26 0:17 ` Jeff Layton
@ 2025-03-26 10:13 ` Jeff Layton
2025-03-26 10:46 ` Benjamin Coddington
2025-03-26 18:11 ` Trond Myklebust
1 sibling, 2 replies; 24+ messages in thread
From: Jeff Layton @ 2025-03-26 10:13 UTC (permalink / raw)
To: trondmy, linux-nfs; +Cc: Josef Bacik, Benjamin Coddington
On Tue, 2025-03-25 at 18:35 -0400, trondmy@kernel.org wrote:
> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>
> Replace the tests for the RPC client being shut down with tests for
> whether the nfs_client is in an error state.
>
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> ---
> fs/nfs/nfs4proc.c | 2 +-
> fs/nfs/nfs4state.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 889511650ceb..50be54e0f578 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -9580,7 +9580,7 @@ static void nfs41_sequence_call_done(struct rpc_task *task, void *data)
> return;
>
> trace_nfs4_sequence(clp, task->tk_status);
> - if (task->tk_status < 0 && !task->tk_client->cl_shutdown) {
> + if (task->tk_status < 0 && clp->cl_cons_state >= 0) {
> dprintk("%s ERROR %d\n", __func__, task->tk_status);
> if (refcount_read(&clp->cl_count) == 1)
> return;
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index 542cdf71229f..f1f7eaa97973 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -1198,7 +1198,7 @@ void nfs4_schedule_state_manager(struct nfs_client *clp)
> struct rpc_clnt *clnt = clp->cl_rpcclient;
> bool swapon = false;
>
> - if (clnt->cl_shutdown)
> + if (clp->cl_cons_state < 0)
> return;
>
> set_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state);
One more thing:
Do we need cl_shutdown at all? If we can replace these checks here with
a check for cl_cons_state < 0, why not do the same in call_start()?
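For reference, a rough sketch of the call_start() check being referred
to (simplified, not a verbatim excerpt from net/sunrpc/clnt.c):

static void call_start(struct rpc_task *task)
{
	if (task->tk_client->cl_shutdown) {
		/* the rpc_clnt is shut down: fail the RPC immediately */
		rpc_call_rpcerror(task, -EIO);
		return;
	}
	/* ... normal dispatch continues ... */
}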
--
Jeff Layton <jlayton@kernel.org>
* Re: [PATCH v3 6/6] NFSv4: Treat ENETUNREACH errors as fatal for state recovery
2025-03-25 22:35 ` [PATCH v3 6/6] NFSv4: Treat ENETUNREACH errors as fatal for state recovery trondmy
@ 2025-03-26 10:39 ` Benjamin Coddington
2025-03-26 11:18 ` Jeff Layton
0 siblings, 1 reply; 24+ messages in thread
From: Benjamin Coddington @ 2025-03-26 10:39 UTC (permalink / raw)
To: trondmy; +Cc: linux-nfs, Jeff Layton, Josef Bacik
On 25 Mar 2025, at 18:35, trondmy@kernel.org wrote:
> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>
> If a containerised process is killed and causes an ENETUNREACH or
> ENETDOWN error to be propagated to the state manager, then mark the
> nfs_client as being dead so that we don't loop in functions that are
> expecting recovery to succeed.
>
> Reviewed-by: Jeff Layton <jlayton@kernel.org>
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> ---
> fs/nfs/nfs4state.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index 272d2ebdae0f..7612e977e80b 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -2739,7 +2739,15 @@ static void nfs4_state_manager(struct nfs_client *clp)
> pr_warn_ratelimited("NFS: state manager%s%s failed on NFSv4 server %s"
> " with error %d\n", section_sep, section,
> clp->cl_hostname, -status);
> - ssleep(1);
> + switch (status) {
> + case -ENETDOWN:
> + case -ENETUNREACH:
> + nfs_mark_client_ready(clp, -EIO);
> + break;
> + default:
> + ssleep(1);
> + break;
> + }
> out_drain:
> memalloc_nofs_restore(memflags);
> nfs4_end_drain_session(clp);
> --
> 2.49.0
Doesn't this have the same bug as the sysfs shutdown - in that a mount with
fatal_neterrors=ENETDOWN:ENETUNREACH can take down the state manager for a
mount without it? I think the same consideration applies as shutdown so
far: in practical use, you're not going to care.
Another thought - it's pretty subtle that the only way those errors
might/should reach us here is if that mount option is in play.
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Ben
* Re: [PATCH v3 1/6] SUNRPC: rpcbind should never reset the port to the value '0'
2025-03-25 22:35 ` [PATCH v3 1/6] SUNRPC: rpcbind should never reset the port to the value '0' trondmy
@ 2025-03-26 10:43 ` Benjamin Coddington
0 siblings, 0 replies; 24+ messages in thread
From: Benjamin Coddington @ 2025-03-26 10:43 UTC (permalink / raw)
To: trondmy; +Cc: linux-nfs, Jeff Layton, Josef Bacik
On 25 Mar 2025, at 18:35, trondmy@kernel.org wrote:
> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>
> If we already had a valid port number for the RPC service, then we
> should not allow the rpcbind client to set it to the invalid value '0'.
>
> Reviewed-by: Jeff Layton <jlayton@kernel.org>
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> ---
> net/sunrpc/rpcb_clnt.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
> index 102c3818bc54..53bcca365fb1 100644
> --- a/net/sunrpc/rpcb_clnt.c
> +++ b/net/sunrpc/rpcb_clnt.c
> @@ -820,9 +820,10 @@ static void rpcb_getport_done(struct rpc_task *child, void *data)
> }
>
> trace_rpcb_setport(child, map->r_status, map->r_port);
> - xprt->ops->set_port(xprt, map->r_port);
> - if (map->r_port)
> + if (map->r_port) {
> + xprt->ops->set_port(xprt, map->r_port);
> xprt_set_bound(xprt);
> + }
> }
>
> /*
> --
> 2.49.0
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Ben
* Re: [PATCH v3 2/6] SUNRPC: rpc_clnt_set_transport() must not change the autobind setting
2025-03-25 22:35 ` [PATCH v3 2/6] SUNRPC: rpc_clnt_set_transport() must not change the autobind setting trondmy
@ 2025-03-26 10:43 ` Benjamin Coddington
0 siblings, 0 replies; 24+ messages in thread
From: Benjamin Coddington @ 2025-03-26 10:43 UTC (permalink / raw)
To: trondmy; +Cc: linux-nfs, Jeff Layton, Josef Bacik
On 25 Mar 2025, at 18:35, trondmy@kernel.org wrote:
> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>
> The autobind setting was supposed to be determined in rpc_create(),
> since commit c2866763b402 ("SUNRPC: use sockaddr + size when creating
> remote transport endpoints").
>
> Reviewed-by: Jeff Layton <jlayton@kernel.org>
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> ---
> net/sunrpc/clnt.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> index 1cfaf93ceec1..6f75862d9782 100644
> --- a/net/sunrpc/clnt.c
> +++ b/net/sunrpc/clnt.c
> @@ -270,9 +270,6 @@ static struct rpc_xprt *rpc_clnt_set_transport(struct rpc_clnt *clnt,
> old = rcu_dereference_protected(clnt->cl_xprt,
> lockdep_is_held(&clnt->cl_lock));
>
> - if (!xprt_bound(xprt))
> - clnt->cl_autobind = 1;
> -
> clnt->cl_timeout = timeout;
> rcu_assign_pointer(clnt->cl_xprt, xprt);
> spin_unlock(&clnt->cl_lock);
> --
> 2.49.0
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Ben
* Re: [PATCH v3 4/6] NFSv4: Further cleanups to shutdown loops
2025-03-26 10:13 ` Jeff Layton
@ 2025-03-26 10:46 ` Benjamin Coddington
2025-03-26 18:11 ` Trond Myklebust
1 sibling, 0 replies; 24+ messages in thread
From: Benjamin Coddington @ 2025-03-26 10:46 UTC (permalink / raw)
To: Jeff Layton; +Cc: trondmy, linux-nfs, Josef Bacik
On 26 Mar 2025, at 6:13, Jeff Layton wrote:
> On Tue, 2025-03-25 at 18:35 -0400, trondmy@kernel.org wrote:
>> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>>
>> Replace the tests for the RPC client being shut down with tests for
>> whether the nfs_client is in an error state.
>>
>> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
>> ---
>> fs/nfs/nfs4proc.c | 2 +-
>> fs/nfs/nfs4state.c | 2 +-
>> 2 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index 889511650ceb..50be54e0f578 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -9580,7 +9580,7 @@ static void nfs41_sequence_call_done(struct rpc_task *task, void *data)
>> return;
>>
>> trace_nfs4_sequence(clp, task->tk_status);
>> - if (task->tk_status < 0 && !task->tk_client->cl_shutdown) {
>> + if (task->tk_status < 0 && clp->cl_cons_state >= 0) {
>> dprintk("%s ERROR %d\n", __func__, task->tk_status);
>> if (refcount_read(&clp->cl_count) == 1)
>> return;
>> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
>> index 542cdf71229f..f1f7eaa97973 100644
>> --- a/fs/nfs/nfs4state.c
>> +++ b/fs/nfs/nfs4state.c
>> @@ -1198,7 +1198,7 @@ void nfs4_schedule_state_manager(struct nfs_client *clp)
>> struct rpc_clnt *clnt = clp->cl_rpcclient;
>> bool swapon = false;
>>
>> - if (clnt->cl_shutdown)
>> + if (clp->cl_cons_state < 0)
>> return;
>>
>> set_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state);
>
> One more thing:
>
> Do we need cl_shutdown at all? If we can replace these checks here with
> a check for cl_cons_state < 0, why not do the same in call_start()?
> --
> Jeff Layton <jlayton@kernel.org>
Agree - I don't see why not.
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Ben
* Re: [PATCH v3 5/6] NFSv4: clp->cl_cons_state < 0 signifies an invalid nfs_client
2025-03-25 22:35 ` [PATCH v3 5/6] NFSv4: clp->cl_cons_state < 0 signifies an invalid nfs_client trondmy
2025-03-26 0:17 ` Jeff Layton
@ 2025-03-26 10:46 ` Benjamin Coddington
1 sibling, 0 replies; 24+ messages in thread
From: Benjamin Coddington @ 2025-03-26 10:46 UTC (permalink / raw)
To: trondmy; +Cc: linux-nfs, Jeff Layton, Josef Bacik
On 25 Mar 2025, at 18:35, trondmy@kernel.org wrote:
> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>
> If someone calls nfs_mark_client_ready(clp, status) with a negative
> value for status, then that should signal that the nfs_client is no
> longer valid.
>
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> ---
> fs/nfs/nfs4state.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index f1f7eaa97973..272d2ebdae0f 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -1403,7 +1403,7 @@ int nfs4_schedule_stateid_recovery(const struct nfs_server *server, struct nfs4_
> dprintk("%s: scheduling stateid recovery for server %s\n", __func__,
> clp->cl_hostname);
> nfs4_schedule_state_manager(clp);
> - return 0;
> + return clp->cl_cons_state < 0 ? clp->cl_cons_state : 0;
> }
> EXPORT_SYMBOL_GPL(nfs4_schedule_stateid_recovery);
>
> --
> 2.49.0
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Ben
* Re: [PATCH v3 6/6] NFSv4: Treat ENETUNREACH errors as fatal for state recovery
2025-03-26 10:39 ` Benjamin Coddington
@ 2025-03-26 11:18 ` Jeff Layton
2025-03-26 13:10 ` Trond Myklebust
0 siblings, 1 reply; 24+ messages in thread
From: Jeff Layton @ 2025-03-26 11:18 UTC (permalink / raw)
To: Benjamin Coddington, trondmy; +Cc: linux-nfs, Josef Bacik
On Wed, 2025-03-26 at 06:39 -0400, Benjamin Coddington wrote:
> On 25 Mar 2025, at 18:35, trondmy@kernel.org wrote:
>
> > From: Trond Myklebust <trond.myklebust@hammerspace.com>
> >
> > If a containerised process is killed and causes an ENETUNREACH or
> > ENETDOWN error to be propagated to the state manager, then mark the
> > nfs_client as being dead so that we don't loop in functions that are
> > expecting recovery to succeed.
> >
> > Reviewed-by: Jeff Layton <jlayton@kernel.org>
> > Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> > ---
> > fs/nfs/nfs4state.c | 10 +++++++++-
> > 1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> > index 272d2ebdae0f..7612e977e80b 100644
> > --- a/fs/nfs/nfs4state.c
> > +++ b/fs/nfs/nfs4state.c
> > @@ -2739,7 +2739,15 @@ static void nfs4_state_manager(struct nfs_client *clp)
> > pr_warn_ratelimited("NFS: state manager%s%s failed on NFSv4 server %s"
> > " with error %d\n", section_sep, section,
> > clp->cl_hostname, -status);
> > - ssleep(1);
> > + switch (status) {
> > + case -ENETDOWN:
> > + case -ENETUNREACH:
> > + nfs_mark_client_ready(clp, -EIO);
> > + break;
> > + default:
> > + ssleep(1);
> > + break;
> > + }
> > out_drain:
> > memalloc_nofs_restore(memflags);
> > nfs4_end_drain_session(clp);
> > --
> > 2.49.0
>
> Doesn't this have the same bug as the sysfs shutdown - in that a mount with
> fatal_neterrors=ENETDOWN:ENETUNREACH can take down the state manager for a
> mount without it? I think the same consideration applies as shutdown so
> far: in practical use, you're not going to care.
>
True. In principle we probably ought to avoid sharing superblocks
between mounts with different fatal_errors= options. In practice, I
agree that no one will care since this means that the server is borked.
> Another thought - its pretty subtle that the only way those errors
> might/should reach us here is if that mount option is in play.
>
> Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
>
> Ben
>
--
Jeff Layton <jlayton@kernel.org>
* Re: [PATCH v3 6/6] NFSv4: Treat ENETUNREACH errors as fatal for state recovery
2025-03-26 11:18 ` Jeff Layton
@ 2025-03-26 13:10 ` Trond Myklebust
0 siblings, 0 replies; 24+ messages in thread
From: Trond Myklebust @ 2025-03-26 13:10 UTC (permalink / raw)
To: jlayton@kernel.org, bcodding@redhat.com
Cc: linux-nfs@vger.kernel.org, josef@toxicpanda.com
On Wed, 2025-03-26 at 07:18 -0400, Jeff Layton wrote:
> On Wed, 2025-03-26 at 06:39 -0400, Benjamin Coddington wrote:
> > On 25 Mar 2025, at 18:35, trondmy@kernel.org wrote:
> >
> > > From: Trond Myklebust <trond.myklebust@hammerspace.com>
> > >
> > > If a containerised process is killed and causes an ENETUNREACH or
> > > ENETDOWN error to be propagated to the state manager, then mark
> > > the
> > > nfs_client as being dead so that we don't loop in functions that
> > > are
> > > expecting recovery to succeed.
> > >
> > > Reviewed-by: Jeff Layton <jlayton@kernel.org>
> > > Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> > > ---
> > > fs/nfs/nfs4state.c | 10 +++++++++-
> > > 1 file changed, 9 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> > > index 272d2ebdae0f..7612e977e80b 100644
> > > --- a/fs/nfs/nfs4state.c
> > > +++ b/fs/nfs/nfs4state.c
> > > @@ -2739,7 +2739,15 @@ static void nfs4_state_manager(struct
> > > nfs_client *clp)
> > > pr_warn_ratelimited("NFS: state manager%s%s failed on
> > > NFSv4 server %s"
> > > " with error %d\n", section_sep,
> > > section,
> > > clp->cl_hostname, -status);
> > > - ssleep(1);
> > > + switch (status) {
> > > + case -ENETDOWN:
> > > + case -ENETUNREACH:
> > > + nfs_mark_client_ready(clp, -EIO);
> > > + break;
> > > + default:
> > > + ssleep(1);
> > > + break;
> > > + }
> > > out_drain:
> > > memalloc_nofs_restore(memflags);
> > > nfs4_end_drain_session(clp);
> > > --
> > > 2.49.0
> >
> > Doesn't this have the same bug as the sysfs shutdown - in that a
> > mount with
> > fatal_neterrors=ENETDOWN:ENETUNREACH can take down the state
> > manager for a
> > mount without it? I think the same consideration applies as
> > shutdown so
> > far: in practical use, you're not going to care.
> >
>
> True. In principle we probably ought to avoid sharing superblocks
> between mounts with different fatal_errors= options. In practice, I
> agree that no one will care since this means that the server is
> borked.
>
> > Another thought - its pretty subtle that the only way those errors
> > might/should reach us here is if that mount option is in play.
> >
> > Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
> >
> > Ben
> >
>
The difference here is that the fatal_neterrors option as it stands
today only looks at whether or not your request is routable. It is not
intended to function as a mechanism for dealing with servers being
down. It only works as a last resort for when the host's orchestrator
software destroys a container's local virtual network devices without
first ensuring that its mounts have been safely shut down.
IOW: yes there are some subtleties here, which is why we need a mount
option in the first place to allow override of the default behaviours.
However those default settings as they stand are hopefully sufficiently
conservative that admins should only rarely need to override them.
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
* Re: [PATCH v3 4/6] NFSv4: Further cleanups to shutdown loops
2025-03-26 10:13 ` Jeff Layton
2025-03-26 10:46 ` Benjamin Coddington
@ 2025-03-26 18:11 ` Trond Myklebust
2025-03-26 18:21 ` Jeff Layton
1 sibling, 1 reply; 24+ messages in thread
From: Trond Myklebust @ 2025-03-26 18:11 UTC (permalink / raw)
To: linux-nfs@vger.kernel.org, jlayton@kernel.org
Cc: josef@toxicpanda.com, bcodding@redhat.com
On Wed, 2025-03-26 at 06:13 -0400, Jeff Layton wrote:
> On Tue, 2025-03-25 at 18:35 -0400, trondmy@kernel.org wrote:
> > From: Trond Myklebust <trond.myklebust@hammerspace.com>
> >
> > Replace the tests for the RPC client being shut down with tests for
> > whether the nfs_client is in an error state.
> >
> > Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> > ---
> > fs/nfs/nfs4proc.c | 2 +-
> > fs/nfs/nfs4state.c | 2 +-
> > 2 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > index 889511650ceb..50be54e0f578 100644
> > --- a/fs/nfs/nfs4proc.c
> > +++ b/fs/nfs/nfs4proc.c
> > @@ -9580,7 +9580,7 @@ static void nfs41_sequence_call_done(struct
> > rpc_task *task, void *data)
> > return;
> >
> > trace_nfs4_sequence(clp, task->tk_status);
> > - if (task->tk_status < 0 && !task->tk_client->cl_shutdown)
> > {
> > + if (task->tk_status < 0 && clp->cl_cons_state >= 0) {
> > dprintk("%s ERROR %d\n", __func__, task-
> > >tk_status);
> > if (refcount_read(&clp->cl_count) == 1)
> > return;
> > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> > index 542cdf71229f..f1f7eaa97973 100644
> > --- a/fs/nfs/nfs4state.c
> > +++ b/fs/nfs/nfs4state.c
> > @@ -1198,7 +1198,7 @@ void nfs4_schedule_state_manager(struct
> > nfs_client *clp)
> > struct rpc_clnt *clnt = clp->cl_rpcclient;
> > bool swapon = false;
> >
> > - if (clnt->cl_shutdown)
> > + if (clp->cl_cons_state < 0)
> > return;
> >
> > set_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state);
>
> One more thing:
>
> Do we need cl_shutdown at all? If we can replace these checks here
> with
> a check for cl_cons_state < 0, why not do the same in call_start()?
The struct nfs_client is an NFS-level object. It can't be moved to the
RPC layer.
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
* Re: [PATCH v3 4/6] NFSv4: Further cleanups to shutdown loops
2025-03-26 18:11 ` Trond Myklebust
@ 2025-03-26 18:21 ` Jeff Layton
2025-03-26 18:24 ` Trond Myklebust
0 siblings, 1 reply; 24+ messages in thread
From: Jeff Layton @ 2025-03-26 18:21 UTC (permalink / raw)
To: Trond Myklebust, linux-nfs@vger.kernel.org
Cc: josef@toxicpanda.com, bcodding@redhat.com
On Wed, 2025-03-26 at 18:11 +0000, Trond Myklebust wrote:
> On Wed, 2025-03-26 at 06:13 -0400, Jeff Layton wrote:
> > On Tue, 2025-03-25 at 18:35 -0400, trondmy@kernel.org wrote:
> > > From: Trond Myklebust <trond.myklebust@hammerspace.com>
> > >
> > > Replace the tests for the RPC client being shut down with tests for
> > > whether the nfs_client is in an error state.
> > >
> > > Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> > > ---
> > > fs/nfs/nfs4proc.c | 2 +-
> > > fs/nfs/nfs4state.c | 2 +-
> > > 2 files changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > > index 889511650ceb..50be54e0f578 100644
> > > --- a/fs/nfs/nfs4proc.c
> > > +++ b/fs/nfs/nfs4proc.c
> > > @@ -9580,7 +9580,7 @@ static void nfs41_sequence_call_done(struct
> > > rpc_task *task, void *data)
> > > return;
> > >
> > > trace_nfs4_sequence(clp, task->tk_status);
> > > - if (task->tk_status < 0 && !task->tk_client->cl_shutdown)
> > > {
> > > + if (task->tk_status < 0 && clp->cl_cons_state >= 0) {
> > > dprintk("%s ERROR %d\n", __func__, task-
> > > > tk_status);
> > > if (refcount_read(&clp->cl_count) == 1)
> > > return;
> > > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> > > index 542cdf71229f..f1f7eaa97973 100644
> > > --- a/fs/nfs/nfs4state.c
> > > +++ b/fs/nfs/nfs4state.c
> > > @@ -1198,7 +1198,7 @@ void nfs4_schedule_state_manager(struct
> > > nfs_client *clp)
> > > struct rpc_clnt *clnt = clp->cl_rpcclient;
> > > bool swapon = false;
> > >
> > > - if (clnt->cl_shutdown)
> > > + if (clp->cl_cons_state < 0)
> > > return;
> > >
> > > set_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state);
> >
> > One more thing:
> >
> > Do we need cl_shutdown at all? If we can replace these checks here
> > with
> > a check for cl_cons_state < 0, why not do the same in call_start()?
>
> The struct nfs_client is a NFS level object. It can't be moved to the
> RPC layer.
>
But...cl_shutdown is an rpc_clnt field.
--
Jeff Layton <jlayton@kernel.org>
* Re: [PATCH v3 4/6] NFSv4: Further cleanups to shutdown loops
2025-03-26 18:21 ` Jeff Layton
@ 2025-03-26 18:24 ` Trond Myklebust
2025-03-27 0:35 ` Jeff Layton
0 siblings, 1 reply; 24+ messages in thread
From: Trond Myklebust @ 2025-03-26 18:24 UTC (permalink / raw)
To: linux-nfs@vger.kernel.org, jlayton@kernel.org
Cc: josef@toxicpanda.com, bcodding@redhat.com
On Wed, 2025-03-26 at 14:21 -0400, Jeff Layton wrote:
> On Wed, 2025-03-26 at 18:11 +0000, Trond Myklebust wrote:
> > On Wed, 2025-03-26 at 06:13 -0400, Jeff Layton wrote:
> > > On Tue, 2025-03-25 at 18:35 -0400, trondmy@kernel.org wrote:
> > > > From: Trond Myklebust <trond.myklebust@hammerspace.com>
> > > >
> > > > Replace the tests for the RPC client being shut down with tests
> > > > for
> > > > whether the nfs_client is in an error state.
> > > >
> > > > Signed-off-by: Trond Myklebust
> > > > <trond.myklebust@hammerspace.com>
> > > > ---
> > > > fs/nfs/nfs4proc.c | 2 +-
> > > > fs/nfs/nfs4state.c | 2 +-
> > > > 2 files changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > > > index 889511650ceb..50be54e0f578 100644
> > > > --- a/fs/nfs/nfs4proc.c
> > > > +++ b/fs/nfs/nfs4proc.c
> > > > @@ -9580,7 +9580,7 @@ static void
> > > > nfs41_sequence_call_done(struct
> > > > rpc_task *task, void *data)
> > > > return;
> > > >
> > > > trace_nfs4_sequence(clp, task->tk_status);
> > > > - if (task->tk_status < 0 && !task->tk_client-
> > > > >cl_shutdown)
> > > > {
> > > > + if (task->tk_status < 0 && clp->cl_cons_state >= 0) {
> > > > dprintk("%s ERROR %d\n", __func__, task-
> > > > > tk_status);
> > > > if (refcount_read(&clp->cl_count) == 1)
> > > > return;
> > > > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> > > > index 542cdf71229f..f1f7eaa97973 100644
> > > > --- a/fs/nfs/nfs4state.c
> > > > +++ b/fs/nfs/nfs4state.c
> > > > @@ -1198,7 +1198,7 @@ void nfs4_schedule_state_manager(struct
> > > > nfs_client *clp)
> > > > struct rpc_clnt *clnt = clp->cl_rpcclient;
> > > > bool swapon = false;
> > > >
> > > > - if (clnt->cl_shutdown)
> > > > + if (clp->cl_cons_state < 0)
> > > > return;
> > > >
> > > > set_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state);
> > >
> > > One more thing:
> > >
> > > Do we need cl_shutdown at all? If we can replace these checks
> > > here
> > > with
> > > a check for cl_cons_state < 0, why not do the same in
> > > call_start()?
> >
> > The struct nfs_client is a NFS level object. It can't be moved to
> > the
> > RPC layer.
> >
>
> But...cl_shutdown is an rpc_clnt field.
Right, but cl_cons_state is a field in struct nfs_client, hence that
check cannot be moved into call_start().
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
* Re: [PATCH v3 4/6] NFSv4: Further cleanups to shutdown loops
2025-03-26 18:24 ` Trond Myklebust
@ 2025-03-27 0:35 ` Jeff Layton
0 siblings, 0 replies; 24+ messages in thread
From: Jeff Layton @ 2025-03-27 0:35 UTC (permalink / raw)
To: Trond Myklebust, linux-nfs@vger.kernel.org
Cc: josef@toxicpanda.com, bcodding@redhat.com
On Wed, 2025-03-26 at 18:24 +0000, Trond Myklebust wrote:
> On Wed, 2025-03-26 at 14:21 -0400, Jeff Layton wrote:
> > On Wed, 2025-03-26 at 18:11 +0000, Trond Myklebust wrote:
> > > On Wed, 2025-03-26 at 06:13 -0400, Jeff Layton wrote:
> > > > On Tue, 2025-03-25 at 18:35 -0400, trondmy@kernel.org wrote:
> > > > > From: Trond Myklebust <trond.myklebust@hammerspace.com>
> > > > >
> > > > > Replace the tests for the RPC client being shut down with tests
> > > > > for
> > > > > whether the nfs_client is in an error state.
> > > > >
> > > > > Signed-off-by: Trond Myklebust
> > > > > <trond.myklebust@hammerspace.com>
> > > > > ---
> > > > > fs/nfs/nfs4proc.c | 2 +-
> > > > > fs/nfs/nfs4state.c | 2 +-
> > > > > 2 files changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > > > > index 889511650ceb..50be54e0f578 100644
> > > > > --- a/fs/nfs/nfs4proc.c
> > > > > +++ b/fs/nfs/nfs4proc.c
> > > > > @@ -9580,7 +9580,7 @@ static void
> > > > > nfs41_sequence_call_done(struct
> > > > > rpc_task *task, void *data)
> > > > > return;
> > > > >
> > > > > trace_nfs4_sequence(clp, task->tk_status);
> > > > > - if (task->tk_status < 0 && !task->tk_client-
> > > > > > cl_shutdown)
> > > > > {
> > > > > + if (task->tk_status < 0 && clp->cl_cons_state >= 0) {
> > > > > dprintk("%s ERROR %d\n", __func__, task-
> > > > > > tk_status);
> > > > > if (refcount_read(&clp->cl_count) == 1)
> > > > > return;
> > > > > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> > > > > index 542cdf71229f..f1f7eaa97973 100644
> > > > > --- a/fs/nfs/nfs4state.c
> > > > > +++ b/fs/nfs/nfs4state.c
> > > > > @@ -1198,7 +1198,7 @@ void nfs4_schedule_state_manager(struct
> > > > > nfs_client *clp)
> > > > > struct rpc_clnt *clnt = clp->cl_rpcclient;
> > > > > bool swapon = false;
> > > > >
> > > > > - if (clnt->cl_shutdown)
> > > > > + if (clp->cl_cons_state < 0)
> > > > > return;
> > > > >
> > > > > set_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state);
> > > >
> > > > One more thing:
> > > >
> > > > Do we need cl_shutdown at all? If we can replace these checks
> > > > here
> > > > with
> > > > a check for cl_cons_state < 0, why not do the same in
> > > > call_start()?
> > >
> > > The struct nfs_client is a NFS level object. It can't be moved to
> > > the
> > > RPC layer.
> > >
> >
> > But...cl_shutdown is an rpc_clnt field.
>
> Right, but cl_cons_state is a field in struct nfs_client, hence that
> check cannot be moved into call_start()
Doh! Good point. Nevermind!
--
Jeff Layton <jlayton@kernel.org>