[PATCH 0/4] Ensure that ENETUNREACH terminates state recovery

Linux NFS development
 help / color / mirror / Atom feed

* [PATCH 0/4] Ensure that ENETUNREACH terminates state recovery
@ 2025-03-25  0:46 trondmy
  2025-03-25  0:46 ` [PATCH 1/4] SUNRPC: rpcbind should never reset the port to the value '0' trondmy
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: trondmy @ 2025-03-25  0:46 UTC (permalink / raw)
  To: linux-nfs; +Cc: Jeff Layton, Josef Bacik

From: Trond Myklebust <trond.myklebust@hammerspace.com>

With the recent patch series that caused containerised mounts which
return ENETUNREACH or ENETDOWN errors to report fatal errors, we also
want to ensure that the state manager thread also triggers fatal errors
in the processes or threads that are waiting for recovery to complete.

Trond Myklebust (4):
  SUNRPC: rpcbind should never reset the port to the value '0'
  SUNRPC: rpc_clnt_set_transport() must not change the autobind setting
  NFSv4: clp->cl_cons_state < 0 signifies an invalid nfs_client
  NFSv4: Treat ENETUNREACH errors as fatal for state recovery

 fs/nfs/nfs4state.c     | 14 +++++++++++---
 net/sunrpc/clnt.c      |  3 ---
 net/sunrpc/rpcb_clnt.c |  5 +++--
 3 files changed, 14 insertions(+), 8 deletions(-)

-- 
2.49.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/4] SUNRPC: rpcbind should never reset the port to the value '0'
  2025-03-25  0:46 [PATCH 0/4] Ensure that ENETUNREACH terminates state recovery trondmy
@ 2025-03-25  0:46 ` trondmy
  2025-03-25  0:46 ` [PATCH 2/4] SUNRPC: rpc_clnt_set_transport() must not change the autobind setting trondmy
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: trondmy @ 2025-03-25  0:46 UTC (permalink / raw)
  To: linux-nfs; +Cc: Jeff Layton, Josef Bacik

From: Trond Myklebust <trond.myklebust@hammerspace.com>

If we already had a valid port number for the RPC service, then we
should not allow the rpcbind client to set it to the invalid value '0'.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 net/sunrpc/rpcb_clnt.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
index 102c3818bc54..53bcca365fb1 100644
--- a/net/sunrpc/rpcb_clnt.c
+++ b/net/sunrpc/rpcb_clnt.c
@@ -820,9 +820,10 @@ static void rpcb_getport_done(struct rpc_task *child, void *data)
 	}
 
 	trace_rpcb_setport(child, map->r_status, map->r_port);
-	xprt->ops->set_port(xprt, map->r_port);
-	if (map->r_port)
+	if (map->r_port) {
+		xprt->ops->set_port(xprt, map->r_port);
 		xprt_set_bound(xprt);
+	}
 }
 
 /*
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/4] SUNRPC: rpc_clnt_set_transport() must not change the autobind setting
  2025-03-25  0:46 [PATCH 0/4] Ensure that ENETUNREACH terminates state recovery trondmy
  2025-03-25  0:46 ` [PATCH 1/4] SUNRPC: rpcbind should never reset the port to the value '0' trondmy
@ 2025-03-25  0:46 ` trondmy
  2025-03-25  0:46 ` [PATCH 3/4] NFSv4: clp->cl_cons_state < 0 signifies an invalid nfs_client trondmy
  2025-03-25  0:46 ` [PATCH 4/4] NFSv4: Treat ENETUNREACH errors as fatal for state recovery trondmy
  3 siblings, 0 replies; 6+ messages in thread
From: trondmy @ 2025-03-25  0:46 UTC (permalink / raw)
  To: linux-nfs; +Cc: Jeff Layton, Josef Bacik

From: Trond Myklebust <trond.myklebust@hammerspace.com>

The autobind setting was supposed to be determined in rpc_create(),
since commit c2866763b402 ("SUNRPC: use sockaddr + size when creating
remote transport endpoints").

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 net/sunrpc/clnt.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 1cfaf93ceec1..6f75862d9782 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -270,9 +270,6 @@ static struct rpc_xprt *rpc_clnt_set_transport(struct rpc_clnt *clnt,
 	old = rcu_dereference_protected(clnt->cl_xprt,
 			lockdep_is_held(&clnt->cl_lock));
 
-	if (!xprt_bound(xprt))
-		clnt->cl_autobind = 1;
-
 	clnt->cl_timeout = timeout;
 	rcu_assign_pointer(clnt->cl_xprt, xprt);
 	spin_unlock(&clnt->cl_lock);
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/4] NFSv4: clp->cl_cons_state < 0 signifies an invalid nfs_client
  2025-03-25  0:46 [PATCH 0/4] Ensure that ENETUNREACH terminates state recovery trondmy
  2025-03-25  0:46 ` [PATCH 1/4] SUNRPC: rpcbind should never reset the port to the value '0' trondmy
  2025-03-25  0:46 ` [PATCH 2/4] SUNRPC: rpc_clnt_set_transport() must not change the autobind setting trondmy
@ 2025-03-25  0:46 ` trondmy
  2025-03-25  0:46 ` [PATCH 4/4] NFSv4: Treat ENETUNREACH errors as fatal for state recovery trondmy
  3 siblings, 0 replies; 6+ messages in thread
From: trondmy @ 2025-03-25  0:46 UTC (permalink / raw)
  To: linux-nfs; +Cc: Jeff Layton, Josef Bacik

From: Trond Myklebust <trond.myklebust@hammerspace.com>

If someone calls nfs_mark_client_ready(clp, status) with a negative
value for status, then that should signal that the nfs_client is no
longer valid.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/nfs4state.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 542cdf71229f..738eb2789266 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1198,7 +1198,7 @@ void nfs4_schedule_state_manager(struct nfs_client *clp)
 	struct rpc_clnt *clnt = clp->cl_rpcclient;
 	bool swapon = false;
 
-	if (clnt->cl_shutdown)
+	if (clnt->cl_shutdown || clp->cl_cons_state < 0)
 		return;
 
 	set_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state);
@@ -1403,7 +1403,7 @@ int nfs4_schedule_stateid_recovery(const struct nfs_server *server, struct nfs4_
 	dprintk("%s: scheduling stateid recovery for server %s\n", __func__,
 			clp->cl_hostname);
 	nfs4_schedule_state_manager(clp);
-	return 0;
+	return clp->cl_cons_state < 0 ? clp->cl_cons_state : 0;
 }
 EXPORT_SYMBOL_GPL(nfs4_schedule_stateid_recovery);
 
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 4/4] NFSv4: Treat ENETUNREACH errors as fatal for state recovery
  2025-03-25  0:46 [PATCH 0/4] Ensure that ENETUNREACH terminates state recovery trondmy
                   ` (2 preceding siblings ...)
  2025-03-25  0:46 ` [PATCH 3/4] NFSv4: clp->cl_cons_state < 0 signifies an invalid nfs_client trondmy
@ 2025-03-25  0:46 ` trondmy
  2025-03-25 11:36   ` Trond Myklebust
  3 siblings, 1 reply; 6+ messages in thread
From: trondmy @ 2025-03-25  0:46 UTC (permalink / raw)
  To: linux-nfs; +Cc: Jeff Layton, Josef Bacik

From: Trond Myklebust <trond.myklebust@hammerspace.com>

If a containerised process is killed and causes an ENETUNREACH or
ENETDOWN error to be propagated to the state manager, then mark the
nfs_client as being dead so that we don't loop in functions that are
expecting recovery to succeed.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/nfs4state.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 738eb2789266..629578dd4a42 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -2739,7 +2739,15 @@ static void nfs4_state_manager(struct nfs_client *clp)
 	pr_warn_ratelimited("NFS: state manager%s%s failed on NFSv4 server %s"
 			" with error %d\n", section_sep, section,
 			clp->cl_hostname, -status);
-	ssleep(1);
+	switch (status) {
+	case -ENETDOWN:
+	case -ENETUNREACH:
+		nfs_mark_client_ready(clp, status);
+		break;
+	default:
+		ssleep(1);
+		break;
+	}
 out_drain:
 	memalloc_nofs_restore(memflags);
 	nfs4_end_drain_session(clp);
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 4/4] NFSv4: Treat ENETUNREACH errors as fatal for state recovery
  2025-03-25  0:46 ` [PATCH 4/4] NFSv4: Treat ENETUNREACH errors as fatal for state recovery trondmy
@ 2025-03-25 11:36   ` Trond Myklebust
  0 siblings, 0 replies; 6+ messages in thread
From: Trond Myklebust @ 2025-03-25 11:36 UTC (permalink / raw)
  To: linux-nfs@vger.kernel.org; +Cc: josef@toxicpanda.com, jlayton@kernel.org

On Mon, 2025-03-24 at 20:46 -0400, trondmy@kernel.org wrote:
> From: Trond Myklebust <trond.myklebust@hammerspace.com>
> 
> If a containerised process is killed and causes an ENETUNREACH or
> ENETDOWN error to be propagated to the state manager, then mark the
> nfs_client as being dead so that we don't loop in functions that are
> expecting recovery to succeed.
> 
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> ---
>  fs/nfs/nfs4state.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index 738eb2789266..629578dd4a42 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -2739,7 +2739,15 @@ static void nfs4_state_manager(struct
> nfs_client *clp)
>  	pr_warn_ratelimited("NFS: state manager%s%s failed on NFSv4
> server %s"
>  			" with error %d\n", section_sep, section,
>  			clp->cl_hostname, -status);
> -	ssleep(1);
> +	switch (status) {
> +	case -ENETDOWN:
> +	case -ENETUNREACH:
> +		nfs_mark_client_ready(clp, status);

Actually, that probably needs to be nfs_mark_client_ready(clp, -EIO) in
order to make the return value a fatal I/O error.

> +		break;
> +	default:
> +		ssleep(1);
> +		break;
> +	}
>  out_drain:
>  	memalloc_nofs_restore(memflags);
>  	nfs4_end_drain_session(clp);

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-03-25 11:36 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-25  0:46 [PATCH 0/4] Ensure that ENETUNREACH terminates state recovery trondmy
2025-03-25  0:46 ` [PATCH 1/4] SUNRPC: rpcbind should never reset the port to the value '0' trondmy
2025-03-25  0:46 ` [PATCH 2/4] SUNRPC: rpc_clnt_set_transport() must not change the autobind setting trondmy
2025-03-25  0:46 ` [PATCH 3/4] NFSv4: clp->cl_cons_state < 0 signifies an invalid nfs_client trondmy
2025-03-25  0:46 ` [PATCH 4/4] NFSv4: Treat ENETUNREACH errors as fatal for state recovery trondmy
2025-03-25 11:36   ` Trond Myklebust

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox