* [PATCH 1/4] SUNRPC: rpcbind should never reset the port to the value '0'
2025-03-25 0:46 [PATCH 0/4] Ensure that ENETUNREACH terminates state recovery trondmy
@ 2025-03-25 0:46 ` trondmy
2025-03-25 0:46 ` [PATCH 2/4] SUNRPC: rpc_clnt_set_transport() must not change the autobind setting trondmy
` (2 subsequent siblings)
3 siblings, 0 replies; 6+ messages in thread
From: trondmy @ 2025-03-25 0:46 UTC (permalink / raw)
To: linux-nfs; +Cc: Jeff Layton, Josef Bacik
From: Trond Myklebust <trond.myklebust@hammerspace.com>
If we already had a valid port number for the RPC service, then we
should not allow the rpcbind client to set it to the invalid value '0'.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
net/sunrpc/rpcb_clnt.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
index 102c3818bc54..53bcca365fb1 100644
--- a/net/sunrpc/rpcb_clnt.c
+++ b/net/sunrpc/rpcb_clnt.c
@@ -820,9 +820,10 @@ static void rpcb_getport_done(struct rpc_task *child, void *data)
}
trace_rpcb_setport(child, map->r_status, map->r_port);
- xprt->ops->set_port(xprt, map->r_port);
- if (map->r_port)
+ if (map->r_port) {
+ xprt->ops->set_port(xprt, map->r_port);
xprt_set_bound(xprt);
+ }
}
/*
--
2.49.0
^ permalink raw reply related [flat|nested] 6+ messages in thread* [PATCH 2/4] SUNRPC: rpc_clnt_set_transport() must not change the autobind setting
2025-03-25 0:46 [PATCH 0/4] Ensure that ENETUNREACH terminates state recovery trondmy
2025-03-25 0:46 ` [PATCH 1/4] SUNRPC: rpcbind should never reset the port to the value '0' trondmy
@ 2025-03-25 0:46 ` trondmy
2025-03-25 0:46 ` [PATCH 3/4] NFSv4: clp->cl_cons_state < 0 signifies an invalid nfs_client trondmy
2025-03-25 0:46 ` [PATCH 4/4] NFSv4: Treat ENETUNREACH errors as fatal for state recovery trondmy
3 siblings, 0 replies; 6+ messages in thread
From: trondmy @ 2025-03-25 0:46 UTC (permalink / raw)
To: linux-nfs; +Cc: Jeff Layton, Josef Bacik
From: Trond Myklebust <trond.myklebust@hammerspace.com>
The autobind setting was supposed to be determined in rpc_create(),
since commit c2866763b402 ("SUNRPC: use sockaddr + size when creating
remote transport endpoints").
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
net/sunrpc/clnt.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 1cfaf93ceec1..6f75862d9782 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -270,9 +270,6 @@ static struct rpc_xprt *rpc_clnt_set_transport(struct rpc_clnt *clnt,
old = rcu_dereference_protected(clnt->cl_xprt,
lockdep_is_held(&clnt->cl_lock));
- if (!xprt_bound(xprt))
- clnt->cl_autobind = 1;
-
clnt->cl_timeout = timeout;
rcu_assign_pointer(clnt->cl_xprt, xprt);
spin_unlock(&clnt->cl_lock);
--
2.49.0
^ permalink raw reply related [flat|nested] 6+ messages in thread* [PATCH 3/4] NFSv4: clp->cl_cons_state < 0 signifies an invalid nfs_client
2025-03-25 0:46 [PATCH 0/4] Ensure that ENETUNREACH terminates state recovery trondmy
2025-03-25 0:46 ` [PATCH 1/4] SUNRPC: rpcbind should never reset the port to the value '0' trondmy
2025-03-25 0:46 ` [PATCH 2/4] SUNRPC: rpc_clnt_set_transport() must not change the autobind setting trondmy
@ 2025-03-25 0:46 ` trondmy
2025-03-25 0:46 ` [PATCH 4/4] NFSv4: Treat ENETUNREACH errors as fatal for state recovery trondmy
3 siblings, 0 replies; 6+ messages in thread
From: trondmy @ 2025-03-25 0:46 UTC (permalink / raw)
To: linux-nfs; +Cc: Jeff Layton, Josef Bacik
From: Trond Myklebust <trond.myklebust@hammerspace.com>
If someone calls nfs_mark_client_ready(clp, status) with a negative
value for status, then that should signal that the nfs_client is no
longer valid.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
fs/nfs/nfs4state.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 542cdf71229f..738eb2789266 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1198,7 +1198,7 @@ void nfs4_schedule_state_manager(struct nfs_client *clp)
struct rpc_clnt *clnt = clp->cl_rpcclient;
bool swapon = false;
- if (clnt->cl_shutdown)
+ if (clnt->cl_shutdown || clp->cl_cons_state < 0)
return;
set_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state);
@@ -1403,7 +1403,7 @@ int nfs4_schedule_stateid_recovery(const struct nfs_server *server, struct nfs4_
dprintk("%s: scheduling stateid recovery for server %s\n", __func__,
clp->cl_hostname);
nfs4_schedule_state_manager(clp);
- return 0;
+ return clp->cl_cons_state < 0 ? clp->cl_cons_state : 0;
}
EXPORT_SYMBOL_GPL(nfs4_schedule_stateid_recovery);
--
2.49.0
^ permalink raw reply related [flat|nested] 6+ messages in thread* [PATCH 4/4] NFSv4: Treat ENETUNREACH errors as fatal for state recovery
2025-03-25 0:46 [PATCH 0/4] Ensure that ENETUNREACH terminates state recovery trondmy
` (2 preceding siblings ...)
2025-03-25 0:46 ` [PATCH 3/4] NFSv4: clp->cl_cons_state < 0 signifies an invalid nfs_client trondmy
@ 2025-03-25 0:46 ` trondmy
2025-03-25 11:36 ` Trond Myklebust
3 siblings, 1 reply; 6+ messages in thread
From: trondmy @ 2025-03-25 0:46 UTC (permalink / raw)
To: linux-nfs; +Cc: Jeff Layton, Josef Bacik
From: Trond Myklebust <trond.myklebust@hammerspace.com>
If a containerised process is killed and causes an ENETUNREACH or
ENETDOWN error to be propagated to the state manager, then mark the
nfs_client as being dead so that we don't loop in functions that are
expecting recovery to succeed.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
fs/nfs/nfs4state.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 738eb2789266..629578dd4a42 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -2739,7 +2739,15 @@ static void nfs4_state_manager(struct nfs_client *clp)
pr_warn_ratelimited("NFS: state manager%s%s failed on NFSv4 server %s"
" with error %d\n", section_sep, section,
clp->cl_hostname, -status);
- ssleep(1);
+ switch (status) {
+ case -ENETDOWN:
+ case -ENETUNREACH:
+ nfs_mark_client_ready(clp, status);
+ break;
+ default:
+ ssleep(1);
+ break;
+ }
out_drain:
memalloc_nofs_restore(memflags);
nfs4_end_drain_session(clp);
--
2.49.0
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH 4/4] NFSv4: Treat ENETUNREACH errors as fatal for state recovery
2025-03-25 0:46 ` [PATCH 4/4] NFSv4: Treat ENETUNREACH errors as fatal for state recovery trondmy
@ 2025-03-25 11:36 ` Trond Myklebust
0 siblings, 0 replies; 6+ messages in thread
From: Trond Myklebust @ 2025-03-25 11:36 UTC (permalink / raw)
To: linux-nfs@vger.kernel.org; +Cc: josef@toxicpanda.com, jlayton@kernel.org
On Mon, 2025-03-24 at 20:46 -0400, trondmy@kernel.org wrote:
> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>
> If a containerised process is killed and causes an ENETUNREACH or
> ENETDOWN error to be propagated to the state manager, then mark the
> nfs_client as being dead so that we don't loop in functions that are
> expecting recovery to succeed.
>
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> ---
> fs/nfs/nfs4state.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index 738eb2789266..629578dd4a42 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -2739,7 +2739,15 @@ static void nfs4_state_manager(struct
> nfs_client *clp)
> pr_warn_ratelimited("NFS: state manager%s%s failed on NFSv4
> server %s"
> " with error %d\n", section_sep, section,
> clp->cl_hostname, -status);
> - ssleep(1);
> + switch (status) {
> + case -ENETDOWN:
> + case -ENETUNREACH:
> + nfs_mark_client_ready(clp, status);
Actually, that probably needs to be nfs_mark_client_ready(clp, -EIO) in
order to make the return value a fatal I/O error.
> + break;
> + default:
> + ssleep(1);
> + break;
> + }
> out_drain:
> memalloc_nofs_restore(memflags);
> nfs4_end_drain_session(clp);
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
^ permalink raw reply [flat|nested] 6+ messages in thread