[PATCH 0/8] nfsd: CB_SEQUENCE error handling fixes and cleanups

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/8] nfsd: CB_SEQUENCE error handling fixes and cleanups
@ 2025-01-23 20:25 Jeff Layton
  2025-01-23 20:25 ` [PATCH 1/8] nfsd: don't restart v4.1+ callback when RPC_SIGNALLED is set Jeff Layton
                   ` (7 more replies)
  0 siblings, 8 replies; 35+ messages in thread
From: Jeff Layton @ 2025-01-23 20:25 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev, Jeff Layton

The first four patches fix bugs in the CB_SEQUENCE error handling. The
last patches are cleanups.

These are only lightly tested, mostly because we don't have a great way
to test backchannel error handling. I tried to keep these very small so
that we could bisect if there are problems.

These should probably go in via Chuck's tree, but the last patch touches
some NFS client code, so it'd be good to have R-b's or A-b's from Trond
and/or Anna on that one.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
Jeff Layton (8):
      nfsd: don't restart v4.1+ callback when RPC_SIGNALLED is set
      nfsd: fix CB_SEQUENCE error handling of NFS4ERR_{BADSLOT,BADSESSION,SEQ_MISORDERED}
      nfsd: when CB_SEQUENCE gets NFS4ERR_DELAY, release the slot
      nfsd: fix default case in nfsd4_cb_sequence_done()
      nfsd: reverse default of "ret" variable in nfsd4_cb_sequence_done()
      nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault()
      nfsd: clean up and amend comments around nfsd4_cb_sequence_done()
      sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return

 fs/nfs/nfs4proc.c           | 12 ++++----
 fs/nfsd/nfs4callback.c      | 69 +++++++++++++++++++++++----------------------
 include/linux/sunrpc/clnt.h |  4 +--
 net/sunrpc/clnt.c           |  7 ++---
 4 files changed, 45 insertions(+), 47 deletions(-)
---
base-commit: 0ab8e05a5a694a1e4c6854a98f08a477d16b6aeb
change-id: 20250123-nfsd-6-14-b0797e385dc0

Best regards,
-- 
Jeff Layton <jlayton@kernel.org>


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 1/8] nfsd: don't restart v4.1+ callback when RPC_SIGNALLED is set
  2025-01-23 20:25 [PATCH 0/8] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
@ 2025-01-23 20:25 ` Jeff Layton
  2025-01-25 16:24   ` Chuck Lever
  2025-01-25 23:01   ` NeilBrown
  2025-01-23 20:25 ` [PATCH 2/8] nfsd: fix CB_SEQUENCE error handling of NFS4ERR_{BADSLOT,BADSESSION,SEQ_MISORDERED} Jeff Layton
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 35+ messages in thread
From: Jeff Layton @ 2025-01-23 20:25 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev, Jeff Layton

This is problematic, since the RPC might have been entirely successful.
There is no point in restarting a v4.1+ callback just because
RPC_SIGNALLED is true. The v4.1+ error handling has other mechanisms for
detecting when it should retransmit the RPC.

Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4callback.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 50e468bdb8d4838b5217346dcc2bd0fec1765c1a..e12205ef16ca932ffbcc86d67b0817aec2436c89 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1403,9 +1403,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
 	}
 	trace_nfsd_cb_free_slot(task, cb);
 	nfsd41_cb_release_slot(cb);
-
-	if (RPC_SIGNALLED(task))
-		goto need_restart;
 out:
 	return ret;
 retry_nowait:

-- 
2.48.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 2/8] nfsd: fix CB_SEQUENCE error handling of NFS4ERR_{BADSLOT,BADSESSION,SEQ_MISORDERED}
  2025-01-23 20:25 [PATCH 0/8] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
  2025-01-23 20:25 ` [PATCH 1/8] nfsd: don't restart v4.1+ callback when RPC_SIGNALLED is set Jeff Layton
@ 2025-01-23 20:25 ` Jeff Layton
  2025-01-24 14:32   ` Chuck Lever
  2025-01-23 20:25 ` [PATCH 3/8] nfsd: when CB_SEQUENCE gets NFS4ERR_DELAY, release the slot Jeff Layton
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 35+ messages in thread
From: Jeff Layton @ 2025-01-23 20:25 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev, Jeff Layton

The current error handling has some problems:

BADSLOT and BADSESSION: don't release the slot before retrying the call

SEQ_MISORDERED: does some sketchy resetting of the seqid? I can't find any
recommendation about doing that in the spec, and it seems wrong.

Handle all three errors the same way: release the slot, but then handle
it just like we would as if we hadn't gotten a reply; mark the session
as faulty, and retry the call.

Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4callback.c | 27 +++++++++++----------------
 1 file changed, 11 insertions(+), 16 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index e12205ef16ca932ffbcc86d67b0817aec2436c89..bfc9de1fcb67b4f05ed2f7a28038cd8290809c17 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1371,17 +1371,24 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
 		nfsd4_mark_cb_fault(cb->cb_clp);
 		ret = false;
 		break;
+	case -NFS4ERR_BADSESSION:
+	case -NFS4ERR_BADSLOT:
+	case -NFS4ERR_SEQ_MISORDERED:
+		/*
+		 * These errors indicate that something has gone wrong
+		 * with the server and client's synchronization. Release
+		 * the slot, but handle it as if we hadn't gotten a reply.
+		 */
+		nfsd41_cb_release_slot(cb);
+		fallthrough;
 	case 1:
 		/*
 		 * cb_seq_status remains 1 if an RPC Reply was never
 		 * received. NFSD can't know if the client processed
 		 * the CB_SEQUENCE operation. Ask the client to send a
-		 * DESTROY_SESSION to recover.
+		 * DESTROY_SESSION to recover, but keep the slot.
 		 */
-		fallthrough;
-	case -NFS4ERR_BADSESSION:
 		nfsd4_mark_cb_fault(cb->cb_clp);
-		ret = false;
 		goto need_restart;
 	case -NFS4ERR_DELAY:
 		cb->cb_seq_status = 1;
@@ -1390,14 +1397,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
 
 		rpc_delay(task, 2 * HZ);
 		return false;
-	case -NFS4ERR_BADSLOT:
-		goto retry_nowait;
-	case -NFS4ERR_SEQ_MISORDERED:
-		if (session->se_cb_seq_nr[cb->cb_held_slot] != 1) {
-			session->se_cb_seq_nr[cb->cb_held_slot] = 1;
-			goto retry_nowait;
-		}
-		break;
 	default:
 		nfsd4_mark_cb_fault(cb->cb_clp);
 	}
@@ -1405,10 +1404,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
 	nfsd41_cb_release_slot(cb);
 out:
 	return ret;
-retry_nowait:
-	if (rpc_restart_call_prepare(task))
-		ret = false;
-	goto out;
 need_restart:
 	if (!test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
 		trace_nfsd_cb_restart(clp, cb);

-- 
2.48.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 3/8] nfsd: when CB_SEQUENCE gets NFS4ERR_DELAY, release the slot
  2025-01-23 20:25 [PATCH 0/8] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
  2025-01-23 20:25 ` [PATCH 1/8] nfsd: don't restart v4.1+ callback when RPC_SIGNALLED is set Jeff Layton
  2025-01-23 20:25 ` [PATCH 2/8] nfsd: fix CB_SEQUENCE error handling of NFS4ERR_{BADSLOT,BADSESSION,SEQ_MISORDERED} Jeff Layton
@ 2025-01-23 20:25 ` Jeff Layton
  2025-01-23 22:18   ` Chuck Lever
  2025-01-23 20:25 ` [PATCH 4/8] nfsd: fix default case in nfsd4_cb_sequence_done() Jeff Layton
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 35+ messages in thread
From: Jeff Layton @ 2025-01-23 20:25 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev, Jeff Layton

RFC8881, 15.1.1.3 says this about NFS4ERR_DELAY:

"For any of a number of reasons, the replier could not process this
 operation in what was deemed a reasonable time. The client should wait
 and then try the request with a new slot and sequence value."

This is CB_SEQUENCE, but I believe the same rule applies. Release the
slot before submitting the delayed RPC.

Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4callback.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index bfc9de1fcb67b4f05ed2f7a28038cd8290809c17..c26ccb9485b95499fc908833a384d741e966a8db 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1392,6 +1392,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
 		goto need_restart;
 	case -NFS4ERR_DELAY:
 		cb->cb_seq_status = 1;
+		nfsd41_cb_release_slot(cb);
 		if (!rpc_restart_call(task))
 			goto out;
 

-- 
2.48.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 4/8] nfsd: fix default case in nfsd4_cb_sequence_done()
  2025-01-23 20:25 [PATCH 0/8] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
                   ` (2 preceding siblings ...)
  2025-01-23 20:25 ` [PATCH 3/8] nfsd: when CB_SEQUENCE gets NFS4ERR_DELAY, release the slot Jeff Layton
@ 2025-01-23 20:25 ` Jeff Layton
  2025-01-23 20:25 ` [PATCH 5/8] nfsd: reverse default of "ret" variable " Jeff Layton
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 35+ messages in thread
From: Jeff Layton @ 2025-01-23 20:25 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev, Jeff Layton

If the switch hits the default: case, then it's likely that the client
sent an unexpected error. In that case, CB_COMPOUND reply processing
should stop. Ensure that the function returns false in that case.

Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4callback.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index c26ccb9485b95499fc908833a384d741e966a8db..dcd1c16ca5e6cc1928cae74b89ff4b36912503df 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1399,6 +1399,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
 		rpc_delay(task, 2 * HZ);
 		return false;
 	default:
+		ret = false;
 		nfsd4_mark_cb_fault(cb->cb_clp);
 	}
 	trace_nfsd_cb_free_slot(task, cb);

-- 
2.48.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 5/8] nfsd: reverse default of "ret" variable in nfsd4_cb_sequence_done()
  2025-01-23 20:25 [PATCH 0/8] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
                   ` (3 preceding siblings ...)
  2025-01-23 20:25 ` [PATCH 4/8] nfsd: fix default case in nfsd4_cb_sequence_done() Jeff Layton
@ 2025-01-23 20:25 ` Jeff Layton
  2025-01-23 20:25 ` [PATCH 6/8] nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault() Jeff Layton
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 35+ messages in thread
From: Jeff Layton @ 2025-01-23 20:25 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev, Jeff Layton

Currently it's set to true and which must be overridden in each error
case. The only time that it should return true however is if the client
returned 0. Change it to default to false, and only set it to true if
the CB_SEQUENCE request succeeded.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4callback.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index dcd1c16ca5e6cc1928cae74b89ff4b36912503df..258bda1193f664f048e7b802082c8307b0a88821 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1332,7 +1332,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
 {
 	struct nfs4_client *clp = cb->cb_clp;
 	struct nfsd4_session *session = clp->cl_cb_session;
-	bool ret = true;
+	bool ret = false;
 
 	if (!clp->cl_minorversion) {
 		/*
@@ -1365,11 +1365,11 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
 		 * (sequence ID, cached reply) MUST NOT change.
 		 */
 		++session->se_cb_seq_nr[cb->cb_held_slot];
+		ret = true;
 		break;
 	case -ESERVERFAULT:
 		++session->se_cb_seq_nr[cb->cb_held_slot];
 		nfsd4_mark_cb_fault(cb->cb_clp);
-		ret = false;
 		break;
 	case -NFS4ERR_BADSESSION:
 	case -NFS4ERR_BADSLOT:
@@ -1399,7 +1399,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
 		rpc_delay(task, 2 * HZ);
 		return false;
 	default:
-		ret = false;
 		nfsd4_mark_cb_fault(cb->cb_clp);
 	}
 	trace_nfsd_cb_free_slot(task, cb);

-- 
2.48.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 6/8] nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault()
  2025-01-23 20:25 [PATCH 0/8] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
                   ` (4 preceding siblings ...)
  2025-01-23 20:25 ` [PATCH 5/8] nfsd: reverse default of "ret" variable " Jeff Layton
@ 2025-01-23 20:25 ` Jeff Layton
  2025-01-23 20:25 ` [PATCH 7/8] nfsd: clean up and amend comments around nfsd4_cb_sequence_done() Jeff Layton
  2025-01-23 20:25 ` [PATCH 8/8] sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return Jeff Layton
  7 siblings, 0 replies; 35+ messages in thread
From: Jeff Layton @ 2025-01-23 20:25 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev, Jeff Layton

This isn't needed.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4callback.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 258bda1193f664f048e7b802082c8307b0a88821..6e0561f3b21bd850b0387b5af7084eb05e818231 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -45,9 +45,6 @@
 #include "nfs4xdr_gen.h"
 
 #define NFSDDBG_FACILITY                NFSDDBG_PROC
-
-static void nfsd4_mark_cb_fault(struct nfs4_client *clp);
-
 #define NFSPROC4_CB_NULL 0
 #define NFSPROC4_CB_COMPOUND 1
 

-- 
2.48.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 7/8] nfsd: clean up and amend comments around nfsd4_cb_sequence_done()
  2025-01-23 20:25 [PATCH 0/8] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
                   ` (5 preceding siblings ...)
  2025-01-23 20:25 ` [PATCH 6/8] nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault() Jeff Layton
@ 2025-01-23 20:25 ` Jeff Layton
  2025-01-24 14:43   ` Chuck Lever
  2025-01-23 20:25 ` [PATCH 8/8] sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return Jeff Layton
  7 siblings, 1 reply; 35+ messages in thread
From: Jeff Layton @ 2025-01-23 20:25 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev, Jeff Layton

Add a new kerneldoc header, and clean up the comments a bit.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4callback.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 6e0561f3b21bd850b0387b5af7084eb05e818231..415fc8aae0f47c36f00b2384805c7a996fb1feb0 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1325,6 +1325,17 @@ static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
 	rpc_call_start(task);
 }
 
+/**
+ * nfsd4_cb_sequence_done - process the result of a CB_SEQUENCE
+ * @task: rpc_task
+ * @cb: nfsd4_callback for this call
+ *
+ * For minorversion 0, there is no CB_SEQUENCE. Only restart the call
+ * if the callback RPC client was killed. For v4.1+ the error handling
+ * is more sophisticated.
+ *
+ * Returns true if reply processing should continue.
+ */
 static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback *cb)
 {
 	struct nfs4_client *clp = cb->cb_clp;
@@ -1334,11 +1345,11 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
 	if (!clp->cl_minorversion) {
 		/*
 		 * If the backchannel connection was shut down while this
-		 * task was queued, we need to resubmit it after setting up
-		 * a new backchannel connection.
+		 * task was queued, resubmit it after setting up a new
+		 * backchannel connection.
 		 *
-		 * Note that if we lost our callback connection permanently
-		 * the submission code will error out, so we don't need to
+		 * Note that if the callback connection is permanently lost,
+		 * the submission code will error out. There is no need to
 		 * handle that case here.
 		 */
 		if (RPC_SIGNALLED(task))
@@ -1355,8 +1366,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
 	switch (cb->cb_seq_status) {
 	case 0:
 		/*
-		 * No need for lock, access serialized in nfsd4_cb_prepare
-		 *
 		 * RFC5661 20.9.3
 		 * If CB_SEQUENCE returns an error, then the state of the slot
 		 * (sequence ID, cached reply) MUST NOT change.
@@ -1365,6 +1374,11 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
 		ret = true;
 		break;
 	case -ESERVERFAULT:
+		/*
+		 * Client returned NFS4_OK, but decoding failed. Mark the
+		 * backchannel as faulty, but don't retransmit since the
+		 * call was successful.
+		 */
 		++session->se_cb_seq_nr[cb->cb_held_slot];
 		nfsd4_mark_cb_fault(cb->cb_clp);
 		break;

-- 
2.48.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 8/8] sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return
  2025-01-23 20:25 [PATCH 0/8] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
                   ` (6 preceding siblings ...)
  2025-01-23 20:25 ` [PATCH 7/8] nfsd: clean up and amend comments around nfsd4_cb_sequence_done() Jeff Layton
@ 2025-01-23 20:25 ` Jeff Layton
  7 siblings, 0 replies; 35+ messages in thread
From: Jeff Layton @ 2025-01-23 20:25 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev, Jeff Layton

These functions always return 1. Make them void return and fix up the
places that check the return code.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfs/nfs4proc.c           | 12 +++++-------
 fs/nfsd/nfs4callback.c      |  5 +----
 include/linux/sunrpc/clnt.h |  4 ++--
 net/sunrpc/clnt.c           |  7 +++----
 4 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 405f17e6e0b45b26cebae06c5bbe932895af9a56..cda20bfeca56db1ef8c51e524d08908b93bfeba6 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -968,15 +968,13 @@ static int nfs41_sequence_process(struct rpc_task *task,
 retry_new_seq:
 	++slot->seq_nr;
 retry_nowait:
-	if (rpc_restart_call_prepare(task)) {
-		nfs41_sequence_free_slot(res);
-		task->tk_status = 0;
-		ret = 0;
-	}
+	rpc_restart_call_prepare(task);
+	nfs41_sequence_free_slot(res);
+	task->tk_status = 0;
+	ret = 0;
 	goto out;
 out_retry:
-	if (!rpc_restart_call(task))
-		goto out;
+	rpc_restart_call(task);
 	rpc_delay(task, NFS4_POLL_RETRY_MAX);
 	return 0;
 }
diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 415fc8aae0f47c36f00b2384805c7a996fb1feb0..fa8049d031f7dd15dfb901263ae1bf7aa2f2dd41 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1404,9 +1404,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
 	case -NFS4ERR_DELAY:
 		cb->cb_seq_status = 1;
 		nfsd41_cb_release_slot(cb);
-		if (!rpc_restart_call(task))
-			goto out;
-
+		rpc_restart_call(task);
 		rpc_delay(task, 2 * HZ);
 		return false;
 	default:
@@ -1414,7 +1412,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
 	}
 	trace_nfsd_cb_free_slot(task, cb);
 	nfsd41_cb_release_slot(cb);
-out:
 	return ret;
 need_restart:
 	if (!test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
index 5321585c778fcc1fef0e0420cb481786c02a7aac..e56f15c97fa24c735090c21c51ef312bfd877cfd 100644
--- a/include/linux/sunrpc/clnt.h
+++ b/include/linux/sunrpc/clnt.h
@@ -213,8 +213,8 @@ int		rpc_call_sync(struct rpc_clnt *clnt,
 			      const struct rpc_message *msg, int flags);
 struct rpc_task *rpc_call_null(struct rpc_clnt *clnt, struct rpc_cred *cred,
 			       int flags);
-int		rpc_restart_call_prepare(struct rpc_task *);
-int		rpc_restart_call(struct rpc_task *);
+void		rpc_restart_call_prepare(struct rpc_task *task);
+void		rpc_restart_call(struct rpc_task *task);
 void		rpc_setbufsize(struct rpc_clnt *, unsigned int, unsigned int);
 struct net *	rpc_net_ns(struct rpc_clnt *);
 size_t		rpc_max_payload(struct rpc_clnt *);
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 0090162ee8c350568c91f1bcd951675ac3ae141c..3d2989120599ccee32e8827b1790d4be7d7a565a 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1670,20 +1670,19 @@ void rpc_force_rebind(struct rpc_clnt *clnt)
 }
 EXPORT_SYMBOL_GPL(rpc_force_rebind);
 
-static int
+static void
 __rpc_restart_call(struct rpc_task *task, void (*action)(struct rpc_task *))
 {
 	task->tk_status = 0;
 	task->tk_rpc_status = 0;
 	task->tk_action = action;
-	return 1;
 }
 
 /*
  * Restart an (async) RPC call. Usually called from within the
  * exit handler.
  */
-int
+void
 rpc_restart_call(struct rpc_task *task)
 {
 	return __rpc_restart_call(task, call_start);
@@ -1694,7 +1693,7 @@ EXPORT_SYMBOL_GPL(rpc_restart_call);
  * Restart an (async) RPC call from the call_prepare state.
  * Usually called from within the exit handler.
  */
-int
+void
 rpc_restart_call_prepare(struct rpc_task *task)
 {
 	if (task->tk_ops->rpc_call_prepare != NULL)

-- 
2.48.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 3/8] nfsd: when CB_SEQUENCE gets NFS4ERR_DELAY, release the slot
  2025-01-23 20:25 ` [PATCH 3/8] nfsd: when CB_SEQUENCE gets NFS4ERR_DELAY, release the slot Jeff Layton
@ 2025-01-23 22:18   ` Chuck Lever
  2025-01-23 23:20     ` Jeff Layton
  0 siblings, 1 reply; 35+ messages in thread
From: Chuck Lever @ 2025-01-23 22:18 UTC (permalink / raw)
  To: Jeff Layton, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev

On 1/23/25 3:25 PM, Jeff Layton wrote:
> RFC8881, 15.1.1.3 says this about NFS4ERR_DELAY:
> 
> "For any of a number of reasons, the replier could not process this
>   operation in what was deemed a reasonable time. The client should wait
>   and then try the request with a new slot and sequence value."

A little farther down, Section 15.1.1.3 says this:

"If NFS4ERR_DELAY is returned on a SEQUENCE operation, the request is
  retried in full with the SEQUENCE operation containing the same slot
  and sequence values."

And:

"If NFS4ERR_DELAY is returned on an operation other than the first in
  the request, the request when retried MUST contain a SEQUENCE operation
  that is different than the original one, with either the slot ID or the
  sequence value different from that in the original request."

My impression is that the slot needs to be held and used again only if
the server responded with NFS4ERR_DELAY on the SEQUENCE operation. If
the NFS4ERR_DELAY was the status of the 2nd or later operation in the
COMPOUND, then yes, a different slot, or the same slot with a bumped
sequence number, must be used.

The current code in nfsd4_cb_sequence_done() appears to be correct in
this regard.


> This is CB_SEQUENCE, but I believe the same rule applies. Release the
> slot before submitting the delayed RPC.
> 
> Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>   fs/nfsd/nfs4callback.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index bfc9de1fcb67b4f05ed2f7a28038cd8290809c17..c26ccb9485b95499fc908833a384d741e966a8db 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -1392,6 +1392,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>   		goto need_restart;
>   	case -NFS4ERR_DELAY:
>   		cb->cb_seq_status = 1;
> +		nfsd41_cb_release_slot(cb);
>   		if (!rpc_restart_call(task))
>   			goto out;
>   
> 


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 3/8] nfsd: when CB_SEQUENCE gets NFS4ERR_DELAY, release the slot
  2025-01-23 22:18   ` Chuck Lever
@ 2025-01-23 23:20     ` Jeff Layton
  2025-01-24  1:30       ` Tom Talpey
  2025-01-24 14:00       ` J. Bruce Fields
  0 siblings, 2 replies; 35+ messages in thread
From: Jeff Layton @ 2025-01-23 23:20 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev

On Thu, 2025-01-23 at 17:18 -0500, Chuck Lever wrote:
> On 1/23/25 3:25 PM, Jeff Layton wrote:
> > RFC8881, 15.1.1.3 says this about NFS4ERR_DELAY:
> > 
> > "For any of a number of reasons, the replier could not process this
> >   operation in what was deemed a reasonable time. The client should wait
> >   and then try the request with a new slot and sequence value."
> 
> A little farther down, Section 15.1.1.3 says this:
> 
> "If NFS4ERR_DELAY is returned on a SEQUENCE operation, the request is
>   retried in full with the SEQUENCE operation containing the same slot
>   and sequence values."
> 
> And:
> 
> "If NFS4ERR_DELAY is returned on an operation other than the first in
>   the request, the request when retried MUST contain a SEQUENCE operation
>   that is different than the original one, with either the slot ID or the
>   sequence value different from that in the original request."
> 
> My impression is that the slot needs to be held and used again only if
> the server responded with NFS4ERR_DELAY on the SEQUENCE operation. If
> the NFS4ERR_DELAY was the status of the 2nd or later operation in the
> COMPOUND, then yes, a different slot, or the same slot with a bumped
> sequence number, must be used.
> 
> The current code in nfsd4_cb_sequence_done() appears to be correct in
> this regard.
> 

Ok! I stand corrected. We should be able to just drop this patch, but
some of the later patches may need some trivial merge conflicts fixed
up.

Any idea why SEQUENCE is different in this regard? This rule seems a
bit arbitrary. If the response is NFS4ERR_DELAY, then why would it
matter which slot you use when retransmitting? The responder is just
saying "go away and come back later".

What if the responder repeatedly returns NFS4ERR_DELAY (maybe because
it's under resource pressure), and also shrinks the slot table in the
meantime? It seems like that might put the requestor in an untenable
position.

Maybe we should lobby to get this changed in the spec?

> 
> > This is CB_SEQUENCE, but I believe the same rule applies. Release the
> > slot before submitting the delayed RPC.
> > 
> > Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >   fs/nfsd/nfs4callback.c | 1 +
> >   1 file changed, 1 insertion(+)
> > 
> > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > index bfc9de1fcb67b4f05ed2f7a28038cd8290809c17..c26ccb9485b95499fc908833a384d741e966a8db 100644
> > --- a/fs/nfsd/nfs4callback.c
> > +++ b/fs/nfsd/nfs4callback.c
> > @@ -1392,6 +1392,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> >   		goto need_restart;
> >   	case -NFS4ERR_DELAY:
> >   		cb->cb_seq_status = 1;
> > +		nfsd41_cb_release_slot(cb);
> >   		if (!rpc_restart_call(task))
> >   			goto out;
> >   
> > 
> 
> 

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 3/8] nfsd: when CB_SEQUENCE gets NFS4ERR_DELAY, release the slot
  2025-01-23 23:20     ` Jeff Layton
@ 2025-01-24  1:30       ` Tom Talpey
  2025-01-24 14:00       ` J. Bruce Fields
  1 sibling, 0 replies; 35+ messages in thread
From: Tom Talpey @ 2025-01-24  1:30 UTC (permalink / raw)
  To: Jeff Layton, Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev

On 1/23/2025 6:20 PM, Jeff Layton wrote:
> On Thu, 2025-01-23 at 17:18 -0500, Chuck Lever wrote:
>> On 1/23/25 3:25 PM, Jeff Layton wrote:
>>> RFC8881, 15.1.1.3 says this about NFS4ERR_DELAY:
>>>
>>> "For any of a number of reasons, the replier could not process this
>>>    operation in what was deemed a reasonable time. The client should wait
>>>    and then try the request with a new slot and sequence value."
>>
>> A little farther down, Section 15.1.1.3 says this:
>>
>> "If NFS4ERR_DELAY is returned on a SEQUENCE operation, the request is
>>    retried in full with the SEQUENCE operation containing the same slot
>>    and sequence values."
>>
>> And:
>>
>> "If NFS4ERR_DELAY is returned on an operation other than the first in
>>    the request, the request when retried MUST contain a SEQUENCE operation
>>    that is different than the original one, with either the slot ID or the
>>    sequence value different from that in the original request."
>>
>> My impression is that the slot needs to be held and used again only if
>> the server responded with NFS4ERR_DELAY on the SEQUENCE operation. If
>> the NFS4ERR_DELAY was the status of the 2nd or later operation in the
>> COMPOUND, then yes, a different slot, or the same slot with a bumped
>> sequence number, must be used.
>>
>> The current code in nfsd4_cb_sequence_done() appears to be correct in
>> this regard.
>>
> 
> Ok! I stand corrected. We should be able to just drop this patch, but
> some of the later patches may need some trivial merge conflicts fixed
> up.
> 
> Any idea why SEQUENCE is different in this regard? This rule seems a
> bit arbitrary. If the response is NFS4ERR_DELAY, then why would it
> matter which slot you use when retransmitting? The responder is just
> saying "go away and come back later".

SEQUENCE is special because of the minor versioning rules. It is
prepended to COMPOUNDs in a way that allows it to be optional in
order to comply with the "no new required operations" rule.

As such, it's not handled quite the same as other compounded ops,
and has some exceptional handling. It's not beautiful, honestly.
But it allowed the session to be introduced in v4.1.

> What if the responder repeatedly returns NFS4ERR_DELAY (maybe because
> it's under resource pressure), and also shrinks the slot table in the
> meantime? It seems like that might put the requestor in an untenable
> position.
> 
> Maybe we should lobby to get this changed in the spec?

If there's a defect, absolutely. There's an IETF errata process.

Tom.

>>
>>> This is CB_SEQUENCE, but I believe the same rule applies. Release the
>>> slot before submitting the delayed RPC.
>>>
>>> Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
>>> Signed-off-by: Jeff Layton <jlayton@kernel.org>
>>> ---
>>>    fs/nfsd/nfs4callback.c | 1 +
>>>    1 file changed, 1 insertion(+)
>>>
>>> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
>>> index bfc9de1fcb67b4f05ed2f7a28038cd8290809c17..c26ccb9485b95499fc908833a384d741e966a8db 100644
>>> --- a/fs/nfsd/nfs4callback.c
>>> +++ b/fs/nfsd/nfs4callback.c
>>> @@ -1392,6 +1392,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>>>    		goto need_restart;
>>>    	case -NFS4ERR_DELAY:
>>>    		cb->cb_seq_status = 1;
>>> +		nfsd41_cb_release_slot(cb);
>>>    		if (!rpc_restart_call(task))
>>>    			goto out;
>>>    
>>>
>>
>>
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 3/8] nfsd: when CB_SEQUENCE gets NFS4ERR_DELAY, release the slot
  2025-01-23 23:20     ` Jeff Layton
  2025-01-24  1:30       ` Tom Talpey
@ 2025-01-24 14:00       ` J. Bruce Fields
  2025-01-24 14:11         ` Jeff Layton
  2025-01-24 17:45         ` Olga Kornievskaia
  1 sibling, 2 replies; 35+ messages in thread
From: J. Bruce Fields @ 2025-01-24 14:00 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Kinglong Mee, Trond Myklebust, Anna Schumaker, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	linux-nfs, linux-kernel, netdev

On Thu, Jan 23, 2025 at 06:20:08PM -0500, Jeff Layton wrote:
> On Thu, 2025-01-23 at 17:18 -0500, Chuck Lever wrote:
> > On 1/23/25 3:25 PM, Jeff Layton wrote:
> > > RFC8881, 15.1.1.3 says this about NFS4ERR_DELAY:
> > > 
> > > "For any of a number of reasons, the replier could not process this
> > >   operation in what was deemed a reasonable time. The client should wait
> > >   and then try the request with a new slot and sequence value."
> > 
> > A little farther down, Section 15.1.1.3 says this:
> > 
> > "If NFS4ERR_DELAY is returned on a SEQUENCE operation, the request is
> >   retried in full with the SEQUENCE operation containing the same slot
> >   and sequence values."
> > 
> > And:
> > 
> > "If NFS4ERR_DELAY is returned on an operation other than the first in
> >   the request, the request when retried MUST contain a SEQUENCE operation
> >   that is different than the original one, with either the slot ID or the
> >   sequence value different from that in the original request."
> > 
> > My impression is that the slot needs to be held and used again only if
> > the server responded with NFS4ERR_DELAY on the SEQUENCE operation. If
> > the NFS4ERR_DELAY was the status of the 2nd or later operation in the
> > COMPOUND, then yes, a different slot, or the same slot with a bumped
> > sequence number, must be used.
> > 
> > The current code in nfsd4_cb_sequence_done() appears to be correct in
> > this regard.
> > 
> 
> Ok! I stand corrected. We should be able to just drop this patch, but
> some of the later patches may need some trivial merge conflicts fixed
> up.
> 
> Any idea why SEQUENCE is different in this regard?

Isn't DELAY on SEQUENCE an indication that the operation is still in
progress?  The difference between retrying the same slot or not is
whether you're asking the server again for the result of the previous
operation, or whether you're asking it to perform a new one.

If you get DELAY on a later op and then keep retrying with the same
seqid/slot then I'd expect you to get stuck in an infinite loop getting
a DELAY response out of the reply cache.

--b.


> This rule seems a
> bit arbitrary. If the response is NFS4ERR_DELAY, then why would it
> matter which slot you use when retransmitting? The responder is just
> saying "go away and come back later".
> 
> What if the responder repeatedly returns NFS4ERR_DELAY (maybe because
> it's under resource pressure), and also shrinks the slot table in the
> meantime? It seems like that might put the requestor in an untenable
> position.
> 
> Maybe we should lobby to get this changed in the spec?
> 
> > 
> > > This is CB_SEQUENCE, but I believe the same rule applies. Release the
> > > slot before submitting the delayed RPC.
> > > 
> > > Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
> > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > ---
> > >   fs/nfsd/nfs4callback.c | 1 +
> > >   1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > > index bfc9de1fcb67b4f05ed2f7a28038cd8290809c17..c26ccb9485b95499fc908833a384d741e966a8db 100644
> > > --- a/fs/nfsd/nfs4callback.c
> > > +++ b/fs/nfsd/nfs4callback.c
> > > @@ -1392,6 +1392,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> > >   		goto need_restart;
> > >   	case -NFS4ERR_DELAY:
> > >   		cb->cb_seq_status = 1;
> > > +		nfsd41_cb_release_slot(cb);
> > >   		if (!rpc_restart_call(task))
> > >   			goto out;
> > >   
> > > 
> > 
> > 
> 
> -- 
> Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 3/8] nfsd: when CB_SEQUENCE gets NFS4ERR_DELAY, release the slot
  2025-01-24 14:00       ` J. Bruce Fields
@ 2025-01-24 14:11         ` Jeff Layton
  2025-01-24 20:29           ` Tom Talpey
  2025-01-24 17:45         ` Olga Kornievskaia
  1 sibling, 1 reply; 35+ messages in thread
From: Jeff Layton @ 2025-01-24 14:11 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Kinglong Mee, Trond Myklebust, Anna Schumaker, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	linux-nfs, linux-kernel, netdev

On Fri, 2025-01-24 at 09:00 -0500, J. Bruce Fields wrote:
> On Thu, Jan 23, 2025 at 06:20:08PM -0500, Jeff Layton wrote:
> > On Thu, 2025-01-23 at 17:18 -0500, Chuck Lever wrote:
> > > On 1/23/25 3:25 PM, Jeff Layton wrote:
> > > > RFC8881, 15.1.1.3 says this about NFS4ERR_DELAY:
> > > > 
> > > > "For any of a number of reasons, the replier could not process this
> > > >   operation in what was deemed a reasonable time. The client should wait
> > > >   and then try the request with a new slot and sequence value."
> > > 
> > > A little farther down, Section 15.1.1.3 says this:
> > > 
> > > "If NFS4ERR_DELAY is returned on a SEQUENCE operation, the request is
> > >   retried in full with the SEQUENCE operation containing the same slot
> > >   and sequence values."
> > > 
> > > And:
> > > 
> > > "If NFS4ERR_DELAY is returned on an operation other than the first in
> > >   the request, the request when retried MUST contain a SEQUENCE operation
> > >   that is different than the original one, with either the slot ID or the
> > >   sequence value different from that in the original request."
> > > 
> > > My impression is that the slot needs to be held and used again only if
> > > the server responded with NFS4ERR_DELAY on the SEQUENCE operation. If
> > > the NFS4ERR_DELAY was the status of the 2nd or later operation in the
> > > COMPOUND, then yes, a different slot, or the same slot with a bumped
> > > sequence number, must be used.
> > > 
> > > The current code in nfsd4_cb_sequence_done() appears to be correct in
> > > this regard.
> > > 
> > 
> > Ok! I stand corrected. We should be able to just drop this patch, but
> > some of the later patches may need some trivial merge conflicts fixed
> > up.
> > 
> > Any idea why SEQUENCE is different in this regard?
> 
> Isn't DELAY on SEQUENCE an indication that the operation is still in
> progress?  The difference between retrying the same slot or not is
> whether you're asking the server again for the result of the previous
> operation, or whether you're asking it to perform a new one.
> 
> If you get DELAY on a later op and then keep retrying with the same
> seqid/slot then I'd expect you to get stuck in an infinite loop getting
> a DELAY response out of the reply cache.
> 

Hi Bruce!

That may be the case. From RFC8881, section 2.10.6.2:

"A retry might be sent while the original request is still in progress
on the replier. The replier SHOULD deal with the issue by returning
NFS4ERR_DELAY as the reply to SEQUENCE or CB_SEQUENCE operation, but
implementations MAY return NFS4ERR_MISORDERED. Since errors from
SEQUENCE and CB_SEQUENCE are never recorded in the reply cache, this
approach allows the results of the execution of the original request to
be properly recorded in the reply cache (assuming that the requester
specified the reply to be cached)."

> 
> 
> > This rule seems a
> > bit arbitrary. If the response is NFS4ERR_DELAY, then why would it
> > matter which slot you use when retransmitting? The responder is just
> > saying "go away and come back later".
> > 
> > What if the responder repeatedly returns NFS4ERR_DELAY (maybe because
> > it's under resource pressure), and also shrinks the slot table in the
> > meantime? It seems like that might put the requestor in an untenable
> > position.
> > 
> > Maybe we should lobby to get this changed in the spec?
> > 
> > > 
> > > > This is CB_SEQUENCE, but I believe the same rule applies. Release the
> > > > slot before submitting the delayed RPC.
> > > > 
> > > > Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
> > > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > > ---
> > > >   fs/nfsd/nfs4callback.c | 1 +
> > > >   1 file changed, 1 insertion(+)
> > > > 
> > > > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > > > index bfc9de1fcb67b4f05ed2f7a28038cd8290809c17..c26ccb9485b95499fc908833a384d741e966a8db 100644
> > > > --- a/fs/nfsd/nfs4callback.c
> > > > +++ b/fs/nfsd/nfs4callback.c
> > > > @@ -1392,6 +1392,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> > > >   		goto need_restart;
> > > >   	case -NFS4ERR_DELAY:
> > > >   		cb->cb_seq_status = 1;
> > > > +		nfsd41_cb_release_slot(cb);
> > > >   		if (!rpc_restart_call(task))
> > > >   			goto out;
> > > >   
> > > > 
> > > 
> > > 
> > 
> > -- 
> > Jeff Layton <jlayton@kernel.org>

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/8] nfsd: fix CB_SEQUENCE error handling of NFS4ERR_{BADSLOT,BADSESSION,SEQ_MISORDERED}
  2025-01-23 20:25 ` [PATCH 2/8] nfsd: fix CB_SEQUENCE error handling of NFS4ERR_{BADSLOT,BADSESSION,SEQ_MISORDERED} Jeff Layton
@ 2025-01-24 14:32   ` Chuck Lever
  2025-01-24 14:46     ` Jeff Layton
  0 siblings, 1 reply; 35+ messages in thread
From: Chuck Lever @ 2025-01-24 14:32 UTC (permalink / raw)
  To: Jeff Layton, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev

On 1/23/25 3:25 PM, Jeff Layton wrote:
> The current error handling has some problems:
> 
> BADSLOT and BADSESSION: don't release the slot before retrying the call
> 
> SEQ_MISORDERED: does some sketchy resetting of the seqid? I can't find any
> recommendation about doing that in the spec, and it seems wrong.

Random thought: You might use the Linux NFS client's forechannel session
implementation as a code reference.


> Handle all three errors the same way: release the slot, but then handle
> it just like we would as if we hadn't gotten a reply; mark the session
> as faulty, and retry the call.

Some questions:

Why does it matter whether NFSD keeps the slot if both sides plan to
destroy the session?

Also, AFAICT marking CB_FAULT does not destroy the session, it simply
tries to recreate backchannel's rpc_clnt. Perhaps NFSD's callback code
should actively destroy the session and let the client drive a fresh
CREATE_SESSION to recover?


> Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>   fs/nfsd/nfs4callback.c | 27 +++++++++++----------------
>   1 file changed, 11 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index e12205ef16ca932ffbcc86d67b0817aec2436c89..bfc9de1fcb67b4f05ed2f7a28038cd8290809c17 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -1371,17 +1371,24 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>   		nfsd4_mark_cb_fault(cb->cb_clp);
>   		ret = false;
>   		break;
> +	case -NFS4ERR_BADSESSION:
> +	case -NFS4ERR_BADSLOT:
> +	case -NFS4ERR_SEQ_MISORDERED:
> +		/*
> +		 * These errors indicate that something has gone wrong
> +		 * with the server and client's synchronization. Release
> +		 * the slot, but handle it as if we hadn't gotten a reply.
> +		 */
> +		nfsd41_cb_release_slot(cb);
> +		fallthrough;
>   	case 1:
>   		/*
>   		 * cb_seq_status remains 1 if an RPC Reply was never
>   		 * received. NFSD can't know if the client processed
>   		 * the CB_SEQUENCE operation. Ask the client to send a
> -		 * DESTROY_SESSION to recover.
> +		 * DESTROY_SESSION to recover, but keep the slot.
>   		 */
> -		fallthrough;
> -	case -NFS4ERR_BADSESSION:
>   		nfsd4_mark_cb_fault(cb->cb_clp);
> -		ret = false;
>   		goto need_restart;
>   	case -NFS4ERR_DELAY:
>   		cb->cb_seq_status = 1;
> @@ -1390,14 +1397,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>   
>   		rpc_delay(task, 2 * HZ);
>   		return false;
> -	case -NFS4ERR_BADSLOT:
> -		goto retry_nowait;
> -	case -NFS4ERR_SEQ_MISORDERED:
> -		if (session->se_cb_seq_nr[cb->cb_held_slot] != 1) {
> -			session->se_cb_seq_nr[cb->cb_held_slot] = 1;
> -			goto retry_nowait;
> -		}
> -		break;
>   	default:
>   		nfsd4_mark_cb_fault(cb->cb_clp);
>   	}
> @@ -1405,10 +1404,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>   	nfsd41_cb_release_slot(cb);
>   out:
>   	return ret;
> -retry_nowait:
> -	if (rpc_restart_call_prepare(task))
> -		ret = false;
> -	goto out;
>   need_restart:
>   	if (!test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
>   		trace_nfsd_cb_restart(clp, cb);
> 


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 7/8] nfsd: clean up and amend comments around nfsd4_cb_sequence_done()
  2025-01-23 20:25 ` [PATCH 7/8] nfsd: clean up and amend comments around nfsd4_cb_sequence_done() Jeff Layton
@ 2025-01-24 14:43   ` Chuck Lever
  2025-01-24 14:50     ` Jeff Layton
  0 siblings, 1 reply; 35+ messages in thread
From: Chuck Lever @ 2025-01-24 14:43 UTC (permalink / raw)
  To: Jeff Layton, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev

On 1/23/25 3:25 PM, Jeff Layton wrote:
> Add a new kerneldoc header, and clean up the comments a bit.

Usually I'm in favor of kdoc headers, but here, it's a static function
whose address is not shared outside of this source file. The only
documentation need is the meaning of the return code, IMO.


> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>   fs/nfsd/nfs4callback.c | 26 ++++++++++++++++++++------
>   1 file changed, 20 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index 6e0561f3b21bd850b0387b5af7084eb05e818231..415fc8aae0f47c36f00b2384805c7a996fb1feb0 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -1325,6 +1325,17 @@ static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
>   	rpc_call_start(task);
>   }
>   
> +/**
> + * nfsd4_cb_sequence_done - process the result of a CB_SEQUENCE
> + * @task: rpc_task
> + * @cb: nfsd4_callback for this call
> + *
> + * For minorversion 0, there is no CB_SEQUENCE. Only restart the call
> + * if the callback RPC client was killed. For v4.1+ the error handling
> + * is more sophisticated.

It would be much clearer to pull the 4.0 error handling out of this
function, which is named "cb_/sequence/_done".

Perhaps the need_restart label can be hoisted into nfsd4_cb_done() ?


> + *
> + * Returns true if reply processing should continue.
> + */
>   static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback *cb)
>   {
>   	struct nfs4_client *clp = cb->cb_clp;
> @@ -1334,11 +1345,11 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>   	if (!clp->cl_minorversion) {
>   		/*
>   		 * If the backchannel connection was shut down while this
> -		 * task was queued, we need to resubmit it after setting up
> -		 * a new backchannel connection.
> +		 * task was queued, resubmit it after setting up a new
> +		 * backchannel connection.
>   		 *
> -		 * Note that if we lost our callback connection permanently
> -		 * the submission code will error out, so we don't need to
> +		 * Note that if the callback connection is permanently lost,
> +		 * the submission code will error out. There is no need to
>   		 * handle that case here.
>   		 */
>   		if (RPC_SIGNALLED(task))
> @@ -1355,8 +1366,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>   	switch (cb->cb_seq_status) {
>   	case 0:
>   		/*
> -		 * No need for lock, access serialized in nfsd4_cb_prepare
> -		 *
>   		 * RFC5661 20.9.3
>   		 * If CB_SEQUENCE returns an error, then the state of the slot
>   		 * (sequence ID, cached reply) MUST NOT change.
> @@ -1365,6 +1374,11 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>   		ret = true;
>   		break;
>   	case -ESERVERFAULT:
> +		/*
> +		 * Client returned NFS4_OK, but decoding failed. Mark the
> +		 * backchannel as faulty, but don't retransmit since the
> +		 * call was successful.
> +		 */
>   		++session->se_cb_seq_nr[cb->cb_held_slot];
>   		nfsd4_mark_cb_fault(cb->cb_clp);
>   		break;

This old code abuses the meaning of ESERVERFAULT IMO. NFS4ERR_BADXDR is
a better choice. But why call mark_cb_fault in this case?

Maybe split this clean-up into a separate patch.


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/8] nfsd: fix CB_SEQUENCE error handling of NFS4ERR_{BADSLOT,BADSESSION,SEQ_MISORDERED}
  2025-01-24 14:32   ` Chuck Lever
@ 2025-01-24 14:46     ` Jeff Layton
  2025-01-24 15:31       ` Chuck Lever
  0 siblings, 1 reply; 35+ messages in thread
From: Jeff Layton @ 2025-01-24 14:46 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev

On Fri, 2025-01-24 at 09:32 -0500, Chuck Lever wrote:
> On 1/23/25 3:25 PM, Jeff Layton wrote:
> > The current error handling has some problems:
> > 
> > BADSLOT and BADSESSION: don't release the slot before retrying the call
> > 
> > SEQ_MISORDERED: does some sketchy resetting of the seqid? I can't find any
> > recommendation about doing that in the spec, and it seems wrong.
> 
> Random thought: You might use the Linux NFS client's forechannel session
> implementation as a code reference.
> 
> 
> > Handle all three errors the same way: release the slot, but then handle
> > it just like we would as if we hadn't gotten a reply; mark the session
> > as faulty, and retry the call.
> 
> Some questions:
> 
> Why does it matter whether NFSD keeps the slot if both sides plan to
> destroy the session?
> 

It may not be required, but there is no reason to hold onto the slot in
these cases. Also, at this point, only nfsd has declared that it needs
a new session (see below).

> Also, AFAICT marking CB_FAULT does not destroy the session, it simply
> tries to recreate backchannel's rpc_clnt. Perhaps NFSD's callback code
> should actively destroy the session and let the client drive a fresh
> CREATE_SESSION to recover?
> 

Marking it with a fault just sets the cl_cb_state to NFSD4_CB_FAULT.
Then, on the next SEQUENCE call, that makes nfsd set
SEQ4_STATUS_BACKCHANNEL_FAULT, which should make the client recreate
the session. Obviously, there is some delay involved there since we
might have to wait for the client to do a lease renewal before this
happens.

> 
> > Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >   fs/nfsd/nfs4callback.c | 27 +++++++++++----------------
> >   1 file changed, 11 insertions(+), 16 deletions(-)
> > 
> > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > index e12205ef16ca932ffbcc86d67b0817aec2436c89..bfc9de1fcb67b4f05ed2f7a28038cd8290809c17 100644
> > --- a/fs/nfsd/nfs4callback.c
> > +++ b/fs/nfsd/nfs4callback.c
> > @@ -1371,17 +1371,24 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> >   		nfsd4_mark_cb_fault(cb->cb_clp);
> >   		ret = false;
> >   		break;
> > +	case -NFS4ERR_BADSESSION:
> > +	case -NFS4ERR_BADSLOT:
> > +	case -NFS4ERR_SEQ_MISORDERED:
> > +		/*
> > +		 * These errors indicate that something has gone wrong
> > +		 * with the server and client's synchronization. Release
> > +		 * the slot, but handle it as if we hadn't gotten a reply.
> > +		 */
> > +		nfsd41_cb_release_slot(cb);
> > +		fallthrough;
> >   	case 1:
> >   		/*
> >   		 * cb_seq_status remains 1 if an RPC Reply was never
> >   		 * received. NFSD can't know if the client processed
> >   		 * the CB_SEQUENCE operation. Ask the client to send a
> > -		 * DESTROY_SESSION to recover.
> > +		 * DESTROY_SESSION to recover, but keep the slot.
> >   		 */
> > -		fallthrough;
> > -	case -NFS4ERR_BADSESSION:
> >   		nfsd4_mark_cb_fault(cb->cb_clp);
> > -		ret = false;
> >   		goto need_restart;
> >   	case -NFS4ERR_DELAY:
> >   		cb->cb_seq_status = 1;
> > @@ -1390,14 +1397,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> >   
> >   		rpc_delay(task, 2 * HZ);
> >   		return false;
> > -	case -NFS4ERR_BADSLOT:
> > -		goto retry_nowait;
> > -	case -NFS4ERR_SEQ_MISORDERED:
> > -		if (session->se_cb_seq_nr[cb->cb_held_slot] != 1) {
> > -			session->se_cb_seq_nr[cb->cb_held_slot] = 1;
> > -			goto retry_nowait;
> > -		}
> > -		break;
> >   	default:
> >   		nfsd4_mark_cb_fault(cb->cb_clp);
> >   	}
> > @@ -1405,10 +1404,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> >   	nfsd41_cb_release_slot(cb);
> >   out:
> >   	return ret;
> > -retry_nowait:
> > -	if (rpc_restart_call_prepare(task))
> > -		ret = false;
> > -	goto out;
> >   need_restart:
> >   	if (!test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
> >   		trace_nfsd_cb_restart(clp, cb);
> > 
> 
> 

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 7/8] nfsd: clean up and amend comments around nfsd4_cb_sequence_done()
  2025-01-24 14:43   ` Chuck Lever
@ 2025-01-24 14:50     ` Jeff Layton
  2025-01-24 15:05       ` Chuck Lever
  2025-01-26 16:50       ` Chuck Lever
  0 siblings, 2 replies; 35+ messages in thread
From: Jeff Layton @ 2025-01-24 14:50 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev

On Fri, 2025-01-24 at 09:43 -0500, Chuck Lever wrote:
> On 1/23/25 3:25 PM, Jeff Layton wrote:
> > Add a new kerneldoc header, and clean up the comments a bit.
> 
> Usually I'm in favor of kdoc headers, but here, it's a static function
> whose address is not shared outside of this source file. The only
> documentation need is the meaning of the return code, IMO.
> 

If you like. I figured it wouldn't hurt to do a full kdoc comment.

> 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >   fs/nfsd/nfs4callback.c | 26 ++++++++++++++++++++------
> >   1 file changed, 20 insertions(+), 6 deletions(-)
> > 
> > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > index 6e0561f3b21bd850b0387b5af7084eb05e818231..415fc8aae0f47c36f00b2384805c7a996fb1feb0 100644
> > --- a/fs/nfsd/nfs4callback.c
> > +++ b/fs/nfsd/nfs4callback.c
> > @@ -1325,6 +1325,17 @@ static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
> >   	rpc_call_start(task);
> >   }
> >   
> > +/**
> > + * nfsd4_cb_sequence_done - process the result of a CB_SEQUENCE
> > + * @task: rpc_task
> > + * @cb: nfsd4_callback for this call
> > + *
> > + * For minorversion 0, there is no CB_SEQUENCE. Only restart the call
> > + * if the callback RPC client was killed. For v4.1+ the error handling
> > + * is more sophisticated.
> 
> It would be much clearer to pull the 4.0 error handling out of this
> function, which is named "cb_/sequence/_done".
> 
> Perhaps the need_restart label can be hoisted into nfsd4_cb_done() ?
> 

If we do that then we'll need to change this function to return
something other than a bool, and that's a larger change than I wanted
to make here. I really wanted to keep these as small, targeted patches
that can be backported easily.

I wouldn't object to further cleanup here on top of that though.


> 
> > + *
> > + * Returns true if reply processing should continue.
> > + */
> >   static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback *cb)
> >   {
> >   	struct nfs4_client *clp = cb->cb_clp;
> > @@ -1334,11 +1345,11 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> >   	if (!clp->cl_minorversion) {
> >   		/*
> >   		 * If the backchannel connection was shut down while this
> > -		 * task was queued, we need to resubmit it after setting up
> > -		 * a new backchannel connection.
> > +		 * task was queued, resubmit it after setting up a new
> > +		 * backchannel connection.
> >   		 *
> > -		 * Note that if we lost our callback connection permanently
> > -		 * the submission code will error out, so we don't need to
> > +		 * Note that if the callback connection is permanently lost,
> > +		 * the submission code will error out. There is no need to
> >   		 * handle that case here.
> >   		 */
> >   		if (RPC_SIGNALLED(task))
> > @@ -1355,8 +1366,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> >   	switch (cb->cb_seq_status) {
> >   	case 0:
> >   		/*
> > -		 * No need for lock, access serialized in nfsd4_cb_prepare
> > -		 *
> >   		 * RFC5661 20.9.3
> >   		 * If CB_SEQUENCE returns an error, then the state of the slot
> >   		 * (sequence ID, cached reply) MUST NOT change.
> > @@ -1365,6 +1374,11 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> >   		ret = true;
> >   		break;
> >   	case -ESERVERFAULT:
> > +		/*
> > +		 * Client returned NFS4_OK, but decoding failed. Mark the
> > +		 * backchannel as faulty, but don't retransmit since the
> > +		 * call was successful.
> > +		 */
> >   		++session->se_cb_seq_nr[cb->cb_held_slot];
> >   		nfsd4_mark_cb_fault(cb->cb_clp);
> >   		break;
> 
> This old code abuses the meaning of ESERVERFAULT IMO. NFS4ERR_BADXDR is
> a better choice. But why call mark_cb_fault in this case?
> 
> Maybe split this clean-up into a separate patch.
> 
> 

I'm only altering comments in this patch. Do you really want separate
patches for the different comments?

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 7/8] nfsd: clean up and amend comments around nfsd4_cb_sequence_done()
  2025-01-24 14:50     ` Jeff Layton
@ 2025-01-24 15:05       ` Chuck Lever
  2025-01-24 15:31         ` Jeff Layton
  2025-01-26 16:50       ` Chuck Lever
  1 sibling, 1 reply; 35+ messages in thread
From: Chuck Lever @ 2025-01-24 15:05 UTC (permalink / raw)
  To: Jeff Layton, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev

On 1/24/25 9:50 AM, Jeff Layton wrote:
> On Fri, 2025-01-24 at 09:43 -0500, Chuck Lever wrote:
>> On 1/23/25 3:25 PM, Jeff Layton wrote:
>>> Add a new kerneldoc header, and clean up the comments a bit.
>>
>> Usually I'm in favor of kdoc headers, but here, it's a static function
>> whose address is not shared outside of this source file. The only
>> documentation need is the meaning of the return code, IMO.
>>
> 
> If you like. I figured it wouldn't hurt to do a full kdoc comment.

Kdoc comments are pretty noisy. This one doesn't seem to me to add
much real value -- its callers are all right here in the same file.


>>> Signed-off-by: Jeff Layton <jlayton@kernel.org>
>>> ---
>>>    fs/nfsd/nfs4callback.c | 26 ++++++++++++++++++++------
>>>    1 file changed, 20 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
>>> index 6e0561f3b21bd850b0387b5af7084eb05e818231..415fc8aae0f47c36f00b2384805c7a996fb1feb0 100644
>>> --- a/fs/nfsd/nfs4callback.c
>>> +++ b/fs/nfsd/nfs4callback.c
>>> @@ -1325,6 +1325,17 @@ static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
>>>    	rpc_call_start(task);
>>>    }
>>>    
>>> +/**
>>> + * nfsd4_cb_sequence_done - process the result of a CB_SEQUENCE
>>> + * @task: rpc_task
>>> + * @cb: nfsd4_callback for this call
>>> + *
>>> + * For minorversion 0, there is no CB_SEQUENCE. Only restart the call
>>> + * if the callback RPC client was killed. For v4.1+ the error handling
>>> + * is more sophisticated.
>>
>> It would be much clearer to pull the 4.0 error handling out of this
>> function, which is named "cb_/sequence/_done".
>>
>> Perhaps the need_restart label can be hoisted into nfsd4_cb_done() ?
>>
> 
> If we do that then we'll need to change this function to return
> something other than a bool, and that's a larger change than I wanted
> to make here. I really wanted to keep these as small, targeted patches
> that can be backported easily.
> 
> I wouldn't object to further cleanup here on top of that though.

There's no reason to document the 4.0 logic if it's about to be moved
out. I strongly prefer making the code more self-documenting. Adding
a comment here about 4.0 then adding a patch on top moving the code
somewhere else seems silly to me.


>>> + *
>>> + * Returns true if reply processing should continue.
>>> + */
>>>    static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback *cb)
>>>    {
>>>    	struct nfs4_client *clp = cb->cb_clp;
>>> @@ -1334,11 +1345,11 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>>>    	if (!clp->cl_minorversion) {
>>>    		/*
>>>    		 * If the backchannel connection was shut down while this
>>> -		 * task was queued, we need to resubmit it after setting up
>>> -		 * a new backchannel connection.
>>> +		 * task was queued, resubmit it after setting up a new
>>> +		 * backchannel connection.
>>>    		 *
>>> -		 * Note that if we lost our callback connection permanently
>>> -		 * the submission code will error out, so we don't need to
>>> +		 * Note that if the callback connection is permanently lost,
>>> +		 * the submission code will error out. There is no need to
>>>    		 * handle that case here.
>>>    		 */
>>>    		if (RPC_SIGNALLED(task))
>>> @@ -1355,8 +1366,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>>>    	switch (cb->cb_seq_status) {
>>>    	case 0:
>>>    		/*
>>> -		 * No need for lock, access serialized in nfsd4_cb_prepare
>>> -		 *
>>>    		 * RFC5661 20.9.3
>>>    		 * If CB_SEQUENCE returns an error, then the state of the slot
>>>    		 * (sequence ID, cached reply) MUST NOT change.
>>> @@ -1365,6 +1374,11 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>>>    		ret = true;
>>>    		break;
>>>    	case -ESERVERFAULT:
>>> +		/*
>>> +		 * Client returned NFS4_OK, but decoding failed. Mark the
>>> +		 * backchannel as faulty, but don't retransmit since the
>>> +		 * call was successful.
>>> +		 */
>>>    		++session->se_cb_seq_nr[cb->cb_held_slot];
>>>    		nfsd4_mark_cb_fault(cb->cb_clp);
>>>    		break;
>>
>> This old code abuses the meaning of ESERVERFAULT IMO. NFS4ERR_BADXDR is
>> a better choice. But why call mark_cb_fault in this case?
>>
>> Maybe split this clean-up into a separate patch.
>>
>>
> 
> I'm only altering comments in this patch. Do you really want separate
> patches for the different comments?

Why call mark_cb_fault here? If NFSD retransmits this operation on a
fresh session/transport it will just fail to decode the reply again.

Do we believe that the decoding failure means there was a transport
problem of some kind?

It's clear we do not understand this code well enough to update the
existing comment, so my review comment above suggests a broader code
change is necessary.


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 7/8] nfsd: clean up and amend comments around nfsd4_cb_sequence_done()
  2025-01-24 15:05       ` Chuck Lever
@ 2025-01-24 15:31         ` Jeff Layton
  2025-01-24 15:42           ` Chuck Lever
  0 siblings, 1 reply; 35+ messages in thread
From: Jeff Layton @ 2025-01-24 15:31 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev

On Fri, 2025-01-24 at 10:05 -0500, Chuck Lever wrote:
> On 1/24/25 9:50 AM, Jeff Layton wrote:
> > On Fri, 2025-01-24 at 09:43 -0500, Chuck Lever wrote:
> > > On 1/23/25 3:25 PM, Jeff Layton wrote:
> > > > Add a new kerneldoc header, and clean up the comments a bit.
> > > 
> > > Usually I'm in favor of kdoc headers, but here, it's a static function
> > > whose address is not shared outside of this source file. The only
> > > documentation need is the meaning of the return code, IMO.
> > > 
> > 
> > If you like. I figured it wouldn't hurt to do a full kdoc comment.
> 
> Kdoc comments are pretty noisy. This one doesn't seem to me to add
> much real value -- its callers are all right here in the same file.
> 
> 
> > > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > > ---
> > > >    fs/nfsd/nfs4callback.c | 26 ++++++++++++++++++++------
> > > >    1 file changed, 20 insertions(+), 6 deletions(-)
> > > > 
> > > > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > > > index 6e0561f3b21bd850b0387b5af7084eb05e818231..415fc8aae0f47c36f00b2384805c7a996fb1feb0 100644
> > > > --- a/fs/nfsd/nfs4callback.c
> > > > +++ b/fs/nfsd/nfs4callback.c
> > > > @@ -1325,6 +1325,17 @@ static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
> > > >    	rpc_call_start(task);
> > > >    }
> > > >    
> > > > +/**
> > > > + * nfsd4_cb_sequence_done - process the result of a CB_SEQUENCE
> > > > + * @task: rpc_task
> > > > + * @cb: nfsd4_callback for this call
> > > > + *
> > > > + * For minorversion 0, there is no CB_SEQUENCE. Only restart the call
> > > > + * if the callback RPC client was killed. For v4.1+ the error handling
> > > > + * is more sophisticated.
> > > 
> > > It would be much clearer to pull the 4.0 error handling out of this
> > > function, which is named "cb_/sequence/_done".
> > > 
> > > Perhaps the need_restart label can be hoisted into nfsd4_cb_done() ?
> > > 
> > 
> > If we do that then we'll need to change this function to return
> > something other than a bool, and that's a larger change than I wanted
> > to make here. I really wanted to keep these as small, targeted patches
> > that can be backported easily.
> > 
> > I wouldn't object to further cleanup here on top of that though.
> 
> There's no reason to document the 4.0 logic if it's about to be moved
> out. I strongly prefer making the code more self-documenting. Adding
> a comment here about 4.0 then adding a patch on top moving the code
> somewhere else seems silly to me.
> 

Ok.

> 
> > > > + *
> > > > + * Returns true if reply processing should continue.
> > > > + */
> > > >    static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback *cb)
> > > >    {
> > > >    	struct nfs4_client *clp = cb->cb_clp;
> > > > @@ -1334,11 +1345,11 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> > > >    	if (!clp->cl_minorversion) {
> > > >    		/*
> > > >    		 * If the backchannel connection was shut down while this
> > > > -		 * task was queued, we need to resubmit it after setting up
> > > > -		 * a new backchannel connection.
> > > > +		 * task was queued, resubmit it after setting up a new
> > > > +		 * backchannel connection.
> > > >    		 *
> > > > -		 * Note that if we lost our callback connection permanently
> > > > -		 * the submission code will error out, so we don't need to
> > > > +		 * Note that if the callback connection is permanently lost,
> > > > +		 * the submission code will error out. There is no need to
> > > >    		 * handle that case here.
> > > >    		 */
> > > >    		if (RPC_SIGNALLED(task))
> > > > @@ -1355,8 +1366,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> > > >    	switch (cb->cb_seq_status) {
> > > >    	case 0:
> > > >    		/*
> > > > -		 * No need for lock, access serialized in nfsd4_cb_prepare
> > > > -		 *
> > > >    		 * RFC5661 20.9.3
> > > >    		 * If CB_SEQUENCE returns an error, then the state of the slot
> > > >    		 * (sequence ID, cached reply) MUST NOT change.
> > > > @@ -1365,6 +1374,11 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> > > >    		ret = true;
> > > >    		break;
> > > >    	case -ESERVERFAULT:
> > > > +		/*
> > > > +		 * Client returned NFS4_OK, but decoding failed. Mark the
> > > > +		 * backchannel as faulty, but don't retransmit since the
> > > > +		 * call was successful.
> > > > +		 */
> > > >    		++session->se_cb_seq_nr[cb->cb_held_slot];
> > > >    		nfsd4_mark_cb_fault(cb->cb_clp);
> > > >    		break;
> > > 
> > > This old code abuses the meaning of ESERVERFAULT IMO. NFS4ERR_BADXDR is
> > > a better choice. But why call mark_cb_fault in this case?
> > > 

I can fix that up. BADXDR is more descriptive.

> > > Maybe split this clean-up into a separate patch.
> > > 
> > > 
> > 
> > I'm only altering comments in this patch. Do you really want separate
> > patches for the different comments?
> 
> Why call mark_cb_fault here? If NFSD retransmits this operation on a
> fresh session/transport it will just fail to decode the reply again.
> 
> Do we believe that the decoding failure means there was a transport
> problem of some kind?
> 
> It's clear we do not understand this code well enough to update the
> existing comment, so my review comment above suggests a broader code
> change is necessary.
> 
> 

It won't be retransmitted in the ESERVERFAULT case. It just fails.

I can't speak definitively, but my guess is that this is an
extraordinary situation and the author (Kinglong Mee?) figured "might
as well mark the cb faulty too". I don't think that's necessarily
unreasonable, but I agree that it's unlikely to help.

Unfortunately as the server in this situation, our options for alerting
about callback problems are limited. Marking the CB channel as faulty
isn't a great response, but it is at least something. The problem here
is that these are callbacks, and if they fail, there is zero indication
that there is a problem.

Stepping back, when we do find failing callbacks, should we be doing
more to alert the admin? What would be an appropriate response?
Ratelimited pr_notice()? Conditional tracepoints? Something else?
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/8] nfsd: fix CB_SEQUENCE error handling of NFS4ERR_{BADSLOT,BADSESSION,SEQ_MISORDERED}
  2025-01-24 14:46     ` Jeff Layton
@ 2025-01-24 15:31       ` Chuck Lever
  2025-01-24 16:04         ` Jeff Layton
  2025-01-24 16:08         ` Jeff Layton
  0 siblings, 2 replies; 35+ messages in thread
From: Chuck Lever @ 2025-01-24 15:31 UTC (permalink / raw)
  To: Jeff Layton, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev

On 1/24/25 9:46 AM, Jeff Layton wrote:
> On Fri, 2025-01-24 at 09:32 -0500, Chuck Lever wrote:
>> On 1/23/25 3:25 PM, Jeff Layton wrote:
>>> The current error handling has some problems:
>>>
>>> BADSLOT and BADSESSION: don't release the slot before retrying the call
>>>
>>> SEQ_MISORDERED: does some sketchy resetting of the seqid? I can't find any
>>> recommendation about doing that in the spec, and it seems wrong.
>>
>> Random thought: You might use the Linux NFS client's forechannel session
>> implementation as a code reference.
>>
>>
>>> Handle all three errors the same way: release the slot, but then handle
>>> it just like we would as if we hadn't gotten a reply; mark the session
>>> as faulty, and retry the call.
>>
>> Some questions:
>>
>> Why does it matter whether NFSD keeps the slot if both sides plan to
>> destroy the session?
>>
> 
> It may not be required, but there is no reason to hold onto the slot in
> these cases.

In the BADSLOT case, if the slot is released, then another session
consumer on the NFS server can use it and will encounter the same error.
Best to keep it in the penalty box, IMO.

If there are other slots, they are likely still usable. An
implementation can choose to continue using the session rather than
scuttling it immediately. In the past, with a single backchannel slot,
NFSD had no choice but to replace the session. But now it can be more
conservative.


> Also, at this point, only nfsd has declared that it needs
> a new session (see below).

If the client's backchannel service has returned BADSESSION, then the
client already knows this session is unusable.


>> Also, AFAICT marking CB_FAULT does not destroy the session, it simply
>> tries to recreate backchannel's rpc_clnt. Perhaps NFSD's callback code
>> should actively destroy the session and let the client drive a fresh
>> CREATE_SESSION to recover?
>>
> 
> Marking it with a fault just sets the cl_cb_state to NFSD4_CB_FAULT.
> Then, on the next SEQUENCE call, that makes nfsd set
> SEQ4_STATUS_BACKCHANNEL_FAULT, which should make the client recreate
> the session. Obviously, there is some delay involved there since we
> might have to wait for the client to do a lease renewal before this
> happens.
> 
>>
>>> Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
>>> Signed-off-by: Jeff Layton <jlayton@kernel.org>
>>> ---
>>>    fs/nfsd/nfs4callback.c | 27 +++++++++++----------------
>>>    1 file changed, 11 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
>>> index e12205ef16ca932ffbcc86d67b0817aec2436c89..bfc9de1fcb67b4f05ed2f7a28038cd8290809c17 100644
>>> --- a/fs/nfsd/nfs4callback.c
>>> +++ b/fs/nfsd/nfs4callback.c
>>> @@ -1371,17 +1371,24 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>>>    		nfsd4_mark_cb_fault(cb->cb_clp);
>>>    		ret = false;
>>>    		break;
>>> +	case -NFS4ERR_BADSESSION:
>>> +	case -NFS4ERR_BADSLOT:
>>> +	case -NFS4ERR_SEQ_MISORDERED:
>>> +		/*
>>> +		 * These errors indicate that something has gone wrong
>>> +		 * with the server and client's synchronization. Release
>>> +		 * the slot, but handle it as if we hadn't gotten a reply.
>>> +		 */
>>> +		nfsd41_cb_release_slot(cb);
>>> +		fallthrough;
>>>    	case 1:
>>>    		/*
>>>    		 * cb_seq_status remains 1 if an RPC Reply was never
>>>    		 * received. NFSD can't know if the client processed
>>>    		 * the CB_SEQUENCE operation. Ask the client to send a
>>> -		 * DESTROY_SESSION to recover.
>>> +		 * DESTROY_SESSION to recover, but keep the slot.
>>>    		 */
>>> -		fallthrough;
>>> -	case -NFS4ERR_BADSESSION:
>>>    		nfsd4_mark_cb_fault(cb->cb_clp);
>>> -		ret = false;
>>>    		goto need_restart;
>>>    	case -NFS4ERR_DELAY:
>>>    		cb->cb_seq_status = 1;
>>> @@ -1390,14 +1397,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>>>    
>>>    		rpc_delay(task, 2 * HZ);
>>>    		return false;
>>> -	case -NFS4ERR_BADSLOT:
>>> -		goto retry_nowait;
>>> -	case -NFS4ERR_SEQ_MISORDERED:
>>> -		if (session->se_cb_seq_nr[cb->cb_held_slot] != 1) {
>>> -			session->se_cb_seq_nr[cb->cb_held_slot] = 1;
>>> -			goto retry_nowait;
>>> -		}
>>> -		break;
>>>    	default:
>>>    		nfsd4_mark_cb_fault(cb->cb_clp);
>>>    	}
>>> @@ -1405,10 +1404,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>>>    	nfsd41_cb_release_slot(cb);
>>>    out:
>>>    	return ret;
>>> -retry_nowait:
>>> -	if (rpc_restart_call_prepare(task))
>>> -		ret = false;
>>> -	goto out;
>>>    need_restart:
>>>    	if (!test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
>>>    		trace_nfsd_cb_restart(clp, cb);
>>>
>>
>>
> 


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 7/8] nfsd: clean up and amend comments around nfsd4_cb_sequence_done()
  2025-01-24 15:31         ` Jeff Layton
@ 2025-01-24 15:42           ` Chuck Lever
  0 siblings, 0 replies; 35+ messages in thread
From: Chuck Lever @ 2025-01-24 15:42 UTC (permalink / raw)
  To: Jeff Layton, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev

On 1/24/25 10:31 AM, Jeff Layton wrote:
> Stepping back, when we do find failing callbacks, should we be doing
> more to alert the admin? What would be an appropriate response?
> Ratelimited pr_notice()? Conditional tracepoints? Something else?

If we can identify a reasonable action that an admin can take, then
pr_ratelimited (or once per client instance) to report a decoding
failure makes sense. Report the IP address of the client that is sending
us weird crap.

For the more conventional session failures, I'm less enthusiastic
about logging these because frequently they are the result of
transport failures (or a suspended client, or ...). We'll have to
sort through the various cases to understand where a logged error
might be helpful/actionable.

-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/8] nfsd: fix CB_SEQUENCE error handling of NFS4ERR_{BADSLOT,BADSESSION,SEQ_MISORDERED}
  2025-01-24 15:31       ` Chuck Lever
@ 2025-01-24 16:04         ` Jeff Layton
  2025-01-24 16:08         ` Jeff Layton
  1 sibling, 0 replies; 35+ messages in thread
From: Jeff Layton @ 2025-01-24 16:04 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev

On Fri, 2025-01-24 at 10:31 -0500, Chuck Lever wrote:
> On 1/24/25 9:46 AM, Jeff Layton wrote:
> > On Fri, 2025-01-24 at 09:32 -0500, Chuck Lever wrote:
> > > On 1/23/25 3:25 PM, Jeff Layton wrote:
> > > > The current error handling has some problems:
> > > > 
> > > > BADSLOT and BADSESSION: don't release the slot before retrying the call
> > > > 
> > > > SEQ_MISORDERED: does some sketchy resetting of the seqid? I can't find any
> > > > recommendation about doing that in the spec, and it seems wrong.
> > > 
> > > Random thought: You might use the Linux NFS client's forechannel session
> > > implementation as a code reference.
> > > 
> > > 
> > > > Handle all three errors the same way: release the slot, but then handle
> > > > it just like we would as if we hadn't gotten a reply; mark the session
> > > > as faulty, and retry the call.
> > > 
> > > Some questions:
> > > 
> > > Why does it matter whether NFSD keeps the slot if both sides plan to
> > > destroy the session?
> > > 
> > 
> > It may not be required, but there is no reason to hold onto the slot in
> > these cases.
> 
> In the BADSLOT case, if the slot is released, then another session
> consumer on the NFS server can use it and will encounter the same error.
> Best to keep it in the penalty box, IMO.
> 
> If there are other slots, they are likely still usable. An
> implementation can choose to continue using the session rather than
> scuttling it immediately. In the past, with a single backchannel slot,
> NFSD had no choice but to replace the session. But now it can be more
> conservative.
> 

It may still be usable, but if we got BADSLOT, then it's still
indicative of a faulty session. Best to mark it as such, IMO.

It's worth noting that these things should really _never_ happen, so
being somewhat heavy-handed in this situation seems reasonable.

> 
> > Also, at this point, only nfsd has declared that it needs
> > a new session (see below).
> 
> If the client's backchannel service has returned BADSESSION, then the
> client already knows this session is unusable.
> 

Fair point. I still think it's reasonable to mark it CB_FAULT in that
case. That status is also displayed via the client_info file, so that's
helpful info.

> 
> > > Also, AFAICT marking CB_FAULT does not destroy the session, it simply
> > > tries to recreate backchannel's rpc_clnt. Perhaps NFSD's callback code
> > > should actively destroy the session and let the client drive a fresh
> > > CREATE_SESSION to recover?
> > > 
> > 
> > Marking it with a fault just sets the cl_cb_state to NFSD4_CB_FAULT.
> > Then, on the next SEQUENCE call, that makes nfsd set
> > SEQ4_STATUS_BACKCHANNEL_FAULT, which should make the client recreate
> > the session. Obviously, there is some delay involved there since we
> > might have to wait for the client to do a lease renewal before this
> > happens.
> > 
> > > 
> > > > Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
> > > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > > ---
> > > >    fs/nfsd/nfs4callback.c | 27 +++++++++++----------------
> > > >    1 file changed, 11 insertions(+), 16 deletions(-)
> > > > 
> > > > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > > > index e12205ef16ca932ffbcc86d67b0817aec2436c89..bfc9de1fcb67b4f05ed2f7a28038cd8290809c17 100644
> > > > --- a/fs/nfsd/nfs4callback.c
> > > > +++ b/fs/nfsd/nfs4callback.c
> > > > @@ -1371,17 +1371,24 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> > > >    		nfsd4_mark_cb_fault(cb->cb_clp);
> > > >    		ret = false;
> > > >    		break;
> > > > +	case -NFS4ERR_BADSESSION:
> > > > +	case -NFS4ERR_BADSLOT:
> > > > +	case -NFS4ERR_SEQ_MISORDERED:
> > > > +		/*
> > > > +		 * These errors indicate that something has gone wrong
> > > > +		 * with the server and client's synchronization. Release
> > > > +		 * the slot, but handle it as if we hadn't gotten a reply.
> > > > +		 */
> > > > +		nfsd41_cb_release_slot(cb);
> > > > +		fallthrough;
> > > >    	case 1:
> > > >    		/*
> > > >    		 * cb_seq_status remains 1 if an RPC Reply was never
> > > >    		 * received. NFSD can't know if the client processed
> > > >    		 * the CB_SEQUENCE operation. Ask the client to send a
> > > > -		 * DESTROY_SESSION to recover.
> > > > +		 * DESTROY_SESSION to recover, but keep the slot.
> > > >    		 */
> > > > -		fallthrough;
> > > > -	case -NFS4ERR_BADSESSION:
> > > >    		nfsd4_mark_cb_fault(cb->cb_clp);
> > > > -		ret = false;
> > > >    		goto need_restart;
> > > >    	case -NFS4ERR_DELAY:
> > > >    		cb->cb_seq_status = 1;
> > > > @@ -1390,14 +1397,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> > > >    
> > > >    		rpc_delay(task, 2 * HZ);
> > > >    		return false;
> > > > -	case -NFS4ERR_BADSLOT:
> > > > -		goto retry_nowait;
> > > > -	case -NFS4ERR_SEQ_MISORDERED:
> > > > -		if (session->se_cb_seq_nr[cb->cb_held_slot] != 1) {
> > > > -			session->se_cb_seq_nr[cb->cb_held_slot] = 1;
> > > > -			goto retry_nowait;
> > > > -		}
> > > > -		break;
> > > >    	default:
> > > >    		nfsd4_mark_cb_fault(cb->cb_clp);
> > > >    	}
> > > > @@ -1405,10 +1404,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> > > >    	nfsd41_cb_release_slot(cb);
> > > >    out:
> > > >    	return ret;
> > > > -retry_nowait:
> > > > -	if (rpc_restart_call_prepare(task))
> > > > -		ret = false;
> > > > -	goto out;
> > > >    need_restart:
> > > >    	if (!test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
> > > >    		trace_nfsd_cb_restart(clp, cb);
> > > > 
> > > 
> > > 
> > 
> 
> 

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/8] nfsd: fix CB_SEQUENCE error handling of NFS4ERR_{BADSLOT,BADSESSION,SEQ_MISORDERED}
  2025-01-24 15:31       ` Chuck Lever
  2025-01-24 16:04         ` Jeff Layton
@ 2025-01-24 16:08         ` Jeff Layton
  1 sibling, 0 replies; 35+ messages in thread
From: Jeff Layton @ 2025-01-24 16:08 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev

On Fri, 2025-01-24 at 10:31 -0500, Chuck Lever wrote:
> On 1/24/25 9:46 AM, Jeff Layton wrote:
> > On Fri, 2025-01-24 at 09:32 -0500, Chuck Lever wrote:
> > > On 1/23/25 3:25 PM, Jeff Layton wrote:
> > > > The current error handling has some problems:
> > > > 
> > > > BADSLOT and BADSESSION: don't release the slot before retrying the call
> > > > 
> > > > SEQ_MISORDERED: does some sketchy resetting of the seqid? I can't find any
> > > > recommendation about doing that in the spec, and it seems wrong.
> > > 
> > > Random thought: You might use the Linux NFS client's forechannel session
> > > implementation as a code reference.
> > > 
> > > 
> > > > Handle all three errors the same way: release the slot, but then handle
> > > > it just like we would as if we hadn't gotten a reply; mark the session
> > > > as faulty, and retry the call.
> > > 
> > > Some questions:
> > > 
> > > Why does it matter whether NFSD keeps the slot if both sides plan to
> > > destroy the session?
> > > 
> > 
> > It may not be required, but there is no reason to hold onto the slot in
> > these cases.
> 
> In the BADSLOT case, if the slot is released, then another session
> consumer on the NFS server can use it and will encounter the same error.
> Best to keep it in the penalty box, IMO.
> 

There is another problem here too. Once the session is reconstituted,
there is no guarantee that the slot that the call is sitting on will
still be valid. The new CB slot table may be smaller than before. I
think we do need to release the slot in these cases for that reason
alone.

> If there are other slots, they are likely still usable. An
> implementation can choose to continue using the session rather than
> scuttling it immediately. In the past, with a single backchannel slot,
> NFSD had no choice but to replace the session. But now it can be more
> conservative.
> 
> 
> > Also, at this point, only nfsd has declared that it needs
> > a new session (see below).
> 
> If the client's backchannel service has returned BADSESSION, then the
> client already knows this session is unusable.
> 
> 
> > > Also, AFAICT marking CB_FAULT does not destroy the session, it simply
> > > tries to recreate backchannel's rpc_clnt. Perhaps NFSD's callback code
> > > should actively destroy the session and let the client drive a fresh
> > > CREATE_SESSION to recover?
> > > 
> > 
> > Marking it with a fault just sets the cl_cb_state to NFSD4_CB_FAULT.
> > Then, on the next SEQUENCE call, that makes nfsd set
> > SEQ4_STATUS_BACKCHANNEL_FAULT, which should make the client recreate
> > the session. Obviously, there is some delay involved there since we
> > might have to wait for the client to do a lease renewal before this
> > happens.
> > 
> > > 
> > > > Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
> > > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > > ---
> > > >    fs/nfsd/nfs4callback.c | 27 +++++++++++----------------
> > > >    1 file changed, 11 insertions(+), 16 deletions(-)
> > > > 
> > > > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > > > index e12205ef16ca932ffbcc86d67b0817aec2436c89..bfc9de1fcb67b4f05ed2f7a28038cd8290809c17 100644
> > > > --- a/fs/nfsd/nfs4callback.c
> > > > +++ b/fs/nfsd/nfs4callback.c
> > > > @@ -1371,17 +1371,24 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> > > >    		nfsd4_mark_cb_fault(cb->cb_clp);
> > > >    		ret = false;
> > > >    		break;
> > > > +	case -NFS4ERR_BADSESSION:
> > > > +	case -NFS4ERR_BADSLOT:
> > > > +	case -NFS4ERR_SEQ_MISORDERED:
> > > > +		/*
> > > > +		 * These errors indicate that something has gone wrong
> > > > +		 * with the server and client's synchronization. Release
> > > > +		 * the slot, but handle it as if we hadn't gotten a reply.
> > > > +		 */
> > > > +		nfsd41_cb_release_slot(cb);
> > > > +		fallthrough;
> > > >    	case 1:
> > > >    		/*
> > > >    		 * cb_seq_status remains 1 if an RPC Reply was never
> > > >    		 * received. NFSD can't know if the client processed
> > > >    		 * the CB_SEQUENCE operation. Ask the client to send a
> > > > -		 * DESTROY_SESSION to recover.
> > > > +		 * DESTROY_SESSION to recover, but keep the slot.
> > > >    		 */
> > > > -		fallthrough;
> > > > -	case -NFS4ERR_BADSESSION:
> > > >    		nfsd4_mark_cb_fault(cb->cb_clp);
> > > > -		ret = false;
> > > >    		goto need_restart;
> > > >    	case -NFS4ERR_DELAY:
> > > >    		cb->cb_seq_status = 1;
> > > > @@ -1390,14 +1397,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> > > >    
> > > >    		rpc_delay(task, 2 * HZ);
> > > >    		return false;
> > > > -	case -NFS4ERR_BADSLOT:
> > > > -		goto retry_nowait;
> > > > -	case -NFS4ERR_SEQ_MISORDERED:
> > > > -		if (session->se_cb_seq_nr[cb->cb_held_slot] != 1) {
> > > > -			session->se_cb_seq_nr[cb->cb_held_slot] = 1;
> > > > -			goto retry_nowait;
> > > > -		}
> > > > -		break;
> > > >    	default:
> > > >    		nfsd4_mark_cb_fault(cb->cb_clp);
> > > >    	}
> > > > @@ -1405,10 +1404,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> > > >    	nfsd41_cb_release_slot(cb);
> > > >    out:
> > > >    	return ret;
> > > > -retry_nowait:
> > > > -	if (rpc_restart_call_prepare(task))
> > > > -		ret = false;
> > > > -	goto out;
> > > >    need_restart:
> > > >    	if (!test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
> > > >    		trace_nfsd_cb_restart(clp, cb);
> > > > 
> > > 
> > > 
> > 
> 
> 

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 3/8] nfsd: when CB_SEQUENCE gets NFS4ERR_DELAY, release the slot
  2025-01-24 14:00       ` J. Bruce Fields
  2025-01-24 14:11         ` Jeff Layton
@ 2025-01-24 17:45         ` Olga Kornievskaia
  2025-01-24 17:47           ` Olga Kornievskaia
  1 sibling, 1 reply; 35+ messages in thread
From: Olga Kornievskaia @ 2025-01-24 17:45 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Jeff Layton, Chuck Lever, Neil Brown, Dai Ngo, Tom Talpey,
	Kinglong Mee, Trond Myklebust, Anna Schumaker, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	linux-nfs, linux-kernel, netdev

On Fri, Jan 24, 2025 at 9:08 AM J. Bruce Fields <bfields@fieldses.org> wrote:
>
> On Thu, Jan 23, 2025 at 06:20:08PM -0500, Jeff Layton wrote:
> > On Thu, 2025-01-23 at 17:18 -0500, Chuck Lever wrote:
> > > On 1/23/25 3:25 PM, Jeff Layton wrote:
> > > > RFC8881, 15.1.1.3 says this about NFS4ERR_DELAY:
> > > >
> > > > "For any of a number of reasons, the replier could not process this
> > > >   operation in what was deemed a reasonable time. The client should wait
> > > >   and then try the request with a new slot and sequence value."
> > >
> > > A little farther down, Section 15.1.1.3 says this:
> > >
> > > "If NFS4ERR_DELAY is returned on a SEQUENCE operation, the request is
> > >   retried in full with the SEQUENCE operation containing the same slot
> > >   and sequence values."
> > >
> > > And:
> > >
> > > "If NFS4ERR_DELAY is returned on an operation other than the first in
> > >   the request, the request when retried MUST contain a SEQUENCE operation
> > >   that is different than the original one, with either the slot ID or the
> > >   sequence value different from that in the original request."
> > >
> > > My impression is that the slot needs to be held and used again only if
> > > the server responded with NFS4ERR_DELAY on the SEQUENCE operation. If
> > > the NFS4ERR_DELAY was the status of the 2nd or later operation in the
> > > COMPOUND, then yes, a different slot, or the same slot with a bumped
> > > sequence number, must be used.
> > >
> > > The current code in nfsd4_cb_sequence_done() appears to be correct in
> > > this regard.
> > >
> >
> > Ok! I stand corrected. We should be able to just drop this patch, but
> > some of the later patches may need some trivial merge conflicts fixed
> > up.
> >
> > Any idea why SEQUENCE is different in this regard?
>
> Isn't DELAY on SEQUENCE an indication that the operation is still in
> progress?  The difference between retrying the same slot or not is
> whether you're asking the server again for the result of the previous
> operation, or whether you're asking it to perform a new one.
>
> If you get DELAY on a later op and then keep retrying with the same
> seqid/slot then I'd expect you to get stuck in an infinite loop getting
> a DELAY response out of the reply cache.

If the client would keep re-using the same seqid for the operation it
got a DELAY on then it would be a broken client. When the linux client
gets ERR_DELAY, it retries the operation but it increments the seqid.

>
> --b.
>
>
> > This rule seems a
> > bit arbitrary. If the response is NFS4ERR_DELAY, then why would it
> > matter which slot you use when retransmitting? The responder is just
> > saying "go away and come back later".
> >
> > What if the responder repeatedly returns NFS4ERR_DELAY (maybe because
> > it's under resource pressure), and also shrinks the slot table in the
> > meantime? It seems like that might put the requestor in an untenable
> > position.
> >
> > Maybe we should lobby to get this changed in the spec?
> >
> > >
> > > > This is CB_SEQUENCE, but I believe the same rule applies. Release the
> > > > slot before submitting the delayed RPC.
> > > >
> > > > Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
> > > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > > ---
> > > >   fs/nfsd/nfs4callback.c | 1 +
> > > >   1 file changed, 1 insertion(+)
> > > >
> > > > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > > > index bfc9de1fcb67b4f05ed2f7a28038cd8290809c17..c26ccb9485b95499fc908833a384d741e966a8db 100644
> > > > --- a/fs/nfsd/nfs4callback.c
> > > > +++ b/fs/nfsd/nfs4callback.c
> > > > @@ -1392,6 +1392,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> > > >                   goto need_restart;
> > > >           case -NFS4ERR_DELAY:
> > > >                   cb->cb_seq_status = 1;
> > > > +         nfsd41_cb_release_slot(cb);
> > > >                   if (!rpc_restart_call(task))
> > > >                           goto out;
> > > >
> > > >
> > >
> > >
> >
> > --
> > Jeff Layton <jlayton@kernel.org>
>


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 3/8] nfsd: when CB_SEQUENCE gets NFS4ERR_DELAY, release the slot
  2025-01-24 17:45         ` Olga Kornievskaia
@ 2025-01-24 17:47           ` Olga Kornievskaia
  0 siblings, 0 replies; 35+ messages in thread
From: Olga Kornievskaia @ 2025-01-24 17:47 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Jeff Layton, Chuck Lever, Neil Brown, Dai Ngo, Tom Talpey,
	Kinglong Mee, Trond Myklebust, Anna Schumaker, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	linux-nfs, linux-kernel, netdev

On Fri, Jan 24, 2025 at 12:45 PM Olga Kornievskaia <okorniev@redhat.com> wrote:
>
> On Fri, Jan 24, 2025 at 9:08 AM J. Bruce Fields <bfields@fieldses.org> wrote:
> >
> > On Thu, Jan 23, 2025 at 06:20:08PM -0500, Jeff Layton wrote:
> > > On Thu, 2025-01-23 at 17:18 -0500, Chuck Lever wrote:
> > > > On 1/23/25 3:25 PM, Jeff Layton wrote:
> > > > > RFC8881, 15.1.1.3 says this about NFS4ERR_DELAY:
> > > > >
> > > > > "For any of a number of reasons, the replier could not process this
> > > > >   operation in what was deemed a reasonable time. The client should wait
> > > > >   and then try the request with a new slot and sequence value."
> > > >
> > > > A little farther down, Section 15.1.1.3 says this:
> > > >
> > > > "If NFS4ERR_DELAY is returned on a SEQUENCE operation, the request is
> > > >   retried in full with the SEQUENCE operation containing the same slot
> > > >   and sequence values."
> > > >
> > > > And:
> > > >
> > > > "If NFS4ERR_DELAY is returned on an operation other than the first in
> > > >   the request, the request when retried MUST contain a SEQUENCE operation
> > > >   that is different than the original one, with either the slot ID or the
> > > >   sequence value different from that in the original request."
> > > >
> > > > My impression is that the slot needs to be held and used again only if
> > > > the server responded with NFS4ERR_DELAY on the SEQUENCE operation. If
> > > > the NFS4ERR_DELAY was the status of the 2nd or later operation in the
> > > > COMPOUND, then yes, a different slot, or the same slot with a bumped
> > > > sequence number, must be used.
> > > >
> > > > The current code in nfsd4_cb_sequence_done() appears to be correct in
> > > > this regard.
> > > >
> > >
> > > Ok! I stand corrected. We should be able to just drop this patch, but
> > > some of the later patches may need some trivial merge conflicts fixed
> > > up.
> > >
> > > Any idea why SEQUENCE is different in this regard?
> >
> > Isn't DELAY on SEQUENCE an indication that the operation is still in
> > progress?  The difference between retrying the same slot or not is
> > whether you're asking the server again for the result of the previous
> > operation, or whether you're asking it to perform a new one.
> >
> > If you get DELAY on a later op and then keep retrying with the same
> > seqid/slot then I'd expect you to get stuck in an infinite loop getting
> > a DELAY response out of the reply cache.
>
> If the client would keep re-using the same seqid for the operation it
> got a DELAY on then it would be a broken client. When the linux client
> gets ERR_DELAY, it retries the operation but it increments the seqid.

Urgh I stand corrected this is on the SEQUENCE not an operation.
>
> >
> > --b.
> >
> >
> > > This rule seems a
> > > bit arbitrary. If the response is NFS4ERR_DELAY, then why would it
> > > matter which slot you use when retransmitting? The responder is just
> > > saying "go away and come back later".
> > >
> > > What if the responder repeatedly returns NFS4ERR_DELAY (maybe because
> > > it's under resource pressure), and also shrinks the slot table in the
> > > meantime? It seems like that might put the requestor in an untenable
> > > position.
> > >
> > > Maybe we should lobby to get this changed in the spec?
> > >
> > > >
> > > > > This is CB_SEQUENCE, but I believe the same rule applies. Release the
> > > > > slot before submitting the delayed RPC.
> > > > >
> > > > > Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
> > > > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > > > ---
> > > > >   fs/nfsd/nfs4callback.c | 1 +
> > > > >   1 file changed, 1 insertion(+)
> > > > >
> > > > > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > > > > index bfc9de1fcb67b4f05ed2f7a28038cd8290809c17..c26ccb9485b95499fc908833a384d741e966a8db 100644
> > > > > --- a/fs/nfsd/nfs4callback.c
> > > > > +++ b/fs/nfsd/nfs4callback.c
> > > > > @@ -1392,6 +1392,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> > > > >                   goto need_restart;
> > > > >           case -NFS4ERR_DELAY:
> > > > >                   cb->cb_seq_status = 1;
> > > > > +         nfsd41_cb_release_slot(cb);
> > > > >                   if (!rpc_restart_call(task))
> > > > >                           goto out;
> > > > >
> > > > >
> > > >
> > > >
> > >
> > > --
> > > Jeff Layton <jlayton@kernel.org>
> >


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 3/8] nfsd: when CB_SEQUENCE gets NFS4ERR_DELAY, release the slot
  2025-01-24 14:11         ` Jeff Layton
@ 2025-01-24 20:29           ` Tom Talpey
  0 siblings, 0 replies; 35+ messages in thread
From: Tom Talpey @ 2025-01-24 20:29 UTC (permalink / raw)
  To: Jeff Layton, J. Bruce Fields
  Cc: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Kinglong Mee,
	Trond Myklebust, Anna Schumaker, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, linux-nfs,
	linux-kernel, netdev

On 1/24/2025 9:11 AM, Jeff Layton wrote:
> On Fri, 2025-01-24 at 09:00 -0500, J. Bruce Fields wrote:
>> On Thu, Jan 23, 2025 at 06:20:08PM -0500, Jeff Layton wrote:
>>> On Thu, 2025-01-23 at 17:18 -0500, Chuck Lever wrote:
>>>> On 1/23/25 3:25 PM, Jeff Layton wrote:
>>>>> RFC8881, 15.1.1.3 says this about NFS4ERR_DELAY:
>>>>>
>>>>> "For any of a number of reasons, the replier could not process this
>>>>>    operation in what was deemed a reasonable time. The client should wait
>>>>>    and then try the request with a new slot and sequence value."
>>>>
>>>> A little farther down, Section 15.1.1.3 says this:
>>>>
>>>> "If NFS4ERR_DELAY is returned on a SEQUENCE operation, the request is
>>>>    retried in full with the SEQUENCE operation containing the same slot
>>>>    and sequence values."
>>>>
>>>> And:
>>>>
>>>> "If NFS4ERR_DELAY is returned on an operation other than the first in
>>>>    the request, the request when retried MUST contain a SEQUENCE operation
>>>>    that is different than the original one, with either the slot ID or the
>>>>    sequence value different from that in the original request."
>>>>
>>>> My impression is that the slot needs to be held and used again only if
>>>> the server responded with NFS4ERR_DELAY on the SEQUENCE operation. If
>>>> the NFS4ERR_DELAY was the status of the 2nd or later operation in the
>>>> COMPOUND, then yes, a different slot, or the same slot with a bumped
>>>> sequence number, must be used.
>>>>
>>>> The current code in nfsd4_cb_sequence_done() appears to be correct in
>>>> this regard.
>>>>
>>>
>>> Ok! I stand corrected. We should be able to just drop this patch, but
>>> some of the later patches may need some trivial merge conflicts fixed
>>> up.
>>>
>>> Any idea why SEQUENCE is different in this regard?
>>
>> Isn't DELAY on SEQUENCE an indication that the operation is still in
>> progress?  The difference between retrying the same slot or not is
>> whether you're asking the server again for the result of the previous
>> operation, or whether you're asking it to perform a new one.
>>
>> If you get DELAY on a later op and then keep retrying with the same
>> seqid/slot then I'd expect you to get stuck in an infinite loop getting
>> a DELAY response out of the reply cache.
>>
> 
> Hi Bruce!
> 
> That may be the case. From RFC8881, section 2.10.6.2:
> 
> "A retry might be sent while the original request is still in progress
> on the replier. The replier SHOULD deal with the issue by returning
> NFS4ERR_DELAY as the reply to SEQUENCE or CB_SEQUENCE operation, but
> implementations MAY return NFS4ERR_MISORDERED. Since errors from
> SEQUENCE and CB_SEQUENCE are never recorded in the reply cache, this
> approach allows the results of the execution of the original request to
> be properly recorded in the reply cache (assuming that the requester
> specified the reply to be cached)."

It's true, but note that the NFS4ERR_DELAY response is a SHOULD, so it's
not the complete picture. It seems rude to return MISORDERED as suggested,
but it's just as possible that the server may not respond at all, and
simply wait for the in-progress operation to complete. We certainly
can't count on all servers to do the same thing.

Tom.

>>> This rule seems a
>>> bit arbitrary. If the response is NFS4ERR_DELAY, then why would it
>>> matter which slot you use when retransmitting? The responder is just
>>> saying "go away and come back later".
>>>
>>> What if the responder repeatedly returns NFS4ERR_DELAY (maybe because
>>> it's under resource pressure), and also shrinks the slot table in the
>>> meantime? It seems like that might put the requestor in an untenable
>>> position.
>>>
>>> Maybe we should lobby to get this changed in the spec?
>>>
>>>>
>>>>> This is CB_SEQUENCE, but I believe the same rule applies. Release the
>>>>> slot before submitting the delayed RPC.
>>>>>
>>>>> Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
>>>>> Signed-off-by: Jeff Layton <jlayton@kernel.org>
>>>>> ---
>>>>>    fs/nfsd/nfs4callback.c | 1 +
>>>>>    1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
>>>>> index bfc9de1fcb67b4f05ed2f7a28038cd8290809c17..c26ccb9485b95499fc908833a384d741e966a8db 100644
>>>>> --- a/fs/nfsd/nfs4callback.c
>>>>> +++ b/fs/nfsd/nfs4callback.c
>>>>> @@ -1392,6 +1392,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>>>>>    		goto need_restart;
>>>>>    	case -NFS4ERR_DELAY:
>>>>>    		cb->cb_seq_status = 1;
>>>>> +		nfsd41_cb_release_slot(cb);
>>>>>    		if (!rpc_restart_call(task))
>>>>>    			goto out;
>>>>>    
>>>>>
>>>>
>>>>
>>>
>>> -- 
>>> Jeff Layton <jlayton@kernel.org>
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/8] nfsd: don't restart v4.1+ callback when RPC_SIGNALLED is set
  2025-01-23 20:25 ` [PATCH 1/8] nfsd: don't restart v4.1+ callback when RPC_SIGNALLED is set Jeff Layton
@ 2025-01-25 16:24   ` Chuck Lever
  2025-01-25 22:04     ` Jeff Layton
  2025-01-25 23:01   ` NeilBrown
  1 sibling, 1 reply; 35+ messages in thread
From: Chuck Lever @ 2025-01-25 16:24 UTC (permalink / raw)
  To: Jeff Layton, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev

On 1/23/25 3:25 PM, Jeff Layton wrote:
> This is problematic, since the RPC might have been entirely successful.
> There is no point in restarting a v4.1+ callback just because
> RPC_SIGNALLED is true. The v4.1+ error handling has other mechanisms for
> detecting when it should retransmit the RPC.
> 
> Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>   fs/nfsd/nfs4callback.c | 3 ---
>   1 file changed, 3 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index 50e468bdb8d4838b5217346dcc2bd0fec1765c1a..e12205ef16ca932ffbcc86d67b0817aec2436c89 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -1403,9 +1403,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>   	}
>   	trace_nfsd_cb_free_slot(task, cb);
>   	nfsd41_cb_release_slot(cb);
> -
> -	if (RPC_SIGNALLED(task))
> -		goto need_restart;
>   out:
>   	return ret;
>   retry_nowait:
> 

I too am skeptical about this logic, but I don't entirely understand it
yet. More importantly, though, I don't recall seeing (mis)behavior that
I can directly attribute to it, so I can't yet confirm or deny your 
assertion that "This is problematic".

Before making a code change here, let's gather a little evidence of a
real problem. For instance, we might want to replace this logic with
something better rather than wholesale removing it.

You might start by enabling aggressive disconnect injection to see how
backchannel recovery works (or that it doesn't work!). I'm trying this
on my kdevops NFSD while running fstests:

cd /sys/kernel/debug/fail_sunrpc/
echo Y > ignore-cache-wait
echo Y > ignore-client-disconnect
echo 24847 > interval
echo 97 > times
echo 100 > probability


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/8] nfsd: don't restart v4.1+ callback when RPC_SIGNALLED is set
  2025-01-25 16:24   ` Chuck Lever
@ 2025-01-25 22:04     ` Jeff Layton
  0 siblings, 0 replies; 35+ messages in thread
From: Jeff Layton @ 2025-01-25 22:04 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev

On Sat, 2025-01-25 at 11:24 -0500, Chuck Lever wrote:
> On 1/23/25 3:25 PM, Jeff Layton wrote:
> > This is problematic, since the RPC might have been entirely successful.
> > There is no point in restarting a v4.1+ callback just because
> > RPC_SIGNALLED is true. The v4.1+ error handling has other mechanisms for
> > detecting when it should retransmit the RPC.
> > 
> > Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >   fs/nfsd/nfs4callback.c | 3 ---
> >   1 file changed, 3 deletions(-)
> > 
> > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > index 50e468bdb8d4838b5217346dcc2bd0fec1765c1a..e12205ef16ca932ffbcc86d67b0817aec2436c89 100644
> > --- a/fs/nfsd/nfs4callback.c
> > +++ b/fs/nfsd/nfs4callback.c
> > @@ -1403,9 +1403,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> >   	}
> >   	trace_nfsd_cb_free_slot(task, cb);
> >   	nfsd41_cb_release_slot(cb);
> > -
> > -	if (RPC_SIGNALLED(task))
> > -		goto need_restart;
> >   out:
> >   	return ret;
> >   retry_nowait:
> > 
> 
> I too am skeptical about this logic, but I don't entirely understand it
> yet. More importantly, though, I don't recall seeing (mis)behavior that
> I can directly attribute to it, so I can't yet confirm or deny your 
> assertion that "This is problematic".
> 

I haven't seen behavior that I can directly attribute to this either,
but we have seen a number of strange panics and weird behaviors in the
callback code over the years that may be related.

At this point, I think you're correct that we will probably need to do
more than just small, incremental changes here.
 
> Before making a code change here, let's gather a little evidence of a
> real problem. For instance, we might want to replace this logic with
> something better rather than wholesale removing it.
> 
> You might start by enabling aggressive disconnect injection to see how
> backchannel recovery works (or that it doesn't work!). I'm trying this
> on my kdevops NFSD while running fstests:
> 
> cd /sys/kernel/debug/fail_sunrpc/
> echo Y > ignore-cache-wait
> echo Y > ignore-client-disconnect
> echo 24847 > interval
> echo 97 > times
> echo 100 > probability
> 
> 

Unfortunately, I've found an even bigger problem in the callback code.

It accesses the clp->cl_cb_session pointer when processing the call and
reply, but that pointer doesn't imply a reference and nothing else
ensures that the nfsd4_session object will stick around while this
happens. I think a callback can race with a DESTROY_SESSION and cause a
UAF. I started working on patches to fix this up, but it's a bit
complex and will take some time.

Please don't apply any of these until I get a better picture of what
will need to be changed. Stay tuned!
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/8] nfsd: don't restart v4.1+ callback when RPC_SIGNALLED is set
  2025-01-23 20:25 ` [PATCH 1/8] nfsd: don't restart v4.1+ callback when RPC_SIGNALLED is set Jeff Layton
  2025-01-25 16:24   ` Chuck Lever
@ 2025-01-25 23:01   ` NeilBrown
  2025-01-26 11:18     ` Jeff Layton
  1 sibling, 1 reply; 35+ messages in thread
From: NeilBrown @ 2025-01-25 23:01 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Chuck Lever, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, linux-nfs, linux-kernel, netdev, Jeff Layton

On Fri, 24 Jan 2025, Jeff Layton wrote:
> This is problematic, since the RPC might have been entirely successful.
> There is no point in restarting a v4.1+ callback just because
> RPC_SIGNALLED is true. The v4.1+ error handling has other mechanisms for
> detecting when it should retransmit the RPC.

But why might RPC_SIGNALLED() ever be true?
The flag RPC_TASK_SIGNALLED is only ever set by rpc_signal_task() which
is only called from rpc_killall_tasks() and __rpc_execute() for
non-async tasks which doesn't apply to nfsd callbacks as they are
started with rpc_call_async().

rpc_killall_tasks() is called by fs/nfs/ which isn't relevant for us,
and from rpc_shutdown_client().  In those cases we certainly don't want
the request to be retried, though the nfsd4_process_cb_update() case is
a little interesting as we do want it to be retried, but in a different
client.

So the code you are removing is either dead code because something else
will prevent the restart when a client is being shut down, or it is bad
code because it would delay rpc_shutdown_client() while the request is
retried. 

I haven't dug the extra step to figure out which, but either way I think
the code should go.

NeilBrown

> 
> Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/nfsd/nfs4callback.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index 50e468bdb8d4838b5217346dcc2bd0fec1765c1a..e12205ef16ca932ffbcc86d67b0817aec2436c89 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -1403,9 +1403,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>  	}
>  	trace_nfsd_cb_free_slot(task, cb);
>  	nfsd41_cb_release_slot(cb);
> -
> -	if (RPC_SIGNALLED(task))
> -		goto need_restart;
>  out:
>  	return ret;
>  retry_nowait:
> 
> -- 
> 2.48.1
> 
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/8] nfsd: don't restart v4.1+ callback when RPC_SIGNALLED is set
  2025-01-25 23:01   ` NeilBrown
@ 2025-01-26 11:18     ` Jeff Layton
  2025-01-26 16:41       ` Chuck Lever
  0 siblings, 1 reply; 35+ messages in thread
From: Jeff Layton @ 2025-01-26 11:18 UTC (permalink / raw)
  To: NeilBrown
  Cc: Chuck Lever, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, linux-nfs, linux-kernel, netdev

On Sun, 2025-01-26 at 10:01 +1100, NeilBrown wrote:
> On Fri, 24 Jan 2025, Jeff Layton wrote:
> > This is problematic, since the RPC might have been entirely successful.
> > There is no point in restarting a v4.1+ callback just because
> > RPC_SIGNALLED is true. The v4.1+ error handling has other mechanisms for
> > detecting when it should retransmit the RPC.
> 
> But why might RPC_SIGNALLED() ever be true?
> The flag RPC_TASK_SIGNALLED is only ever set by rpc_signal_task() which
> is only called from rpc_killall_tasks() and __rpc_execute() for
> non-async tasks which doesn't apply to nfsd callbacks as they are
> started with rpc_call_async().
> 
> rpc_killall_tasks() is called by fs/nfs/ which isn't relevant for us,
> and from rpc_shutdown_client().  In those cases we certainly don't want
> the request to be retried, though the nfsd4_process_cb_update() case is
> a little interesting as we do want it to be retried, but in a different
> client.
>
> So the code you are removing is either dead code because something else
> will prevent the restart when a client is being shut down, or it is bad
> code because it would delay rpc_shutdown_client() while the request is
> retried. 
> 
> I haven't dug the extra step to figure out which, but either way I think
> the code should go.
> 
> 

Thanks. That was my analysis too.

rpc_shutdown_client() is called when we tear down and rebuild the
rpc_client. nfsd does this in setup_callback_client(), which gets
called from nfsd4_process_cb_update() (basically when we detect that
the backchannel is having problems).

There are really only two states: We either got a reply from the server
before the client went down, or we didn't. In the case where we got a
reply, there is no need to retry anything. In the case where we didn't,
the tk_status will be '1', so there is no need to check RPC_SIGNALLED()
at all here.

The existing code could lead to the call being retried when we had
already gotten a perfectly valid reply.

> > 
> > Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >  fs/nfsd/nfs4callback.c | 3 ---
> >  1 file changed, 3 deletions(-)
> > 
> > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > index 50e468bdb8d4838b5217346dcc2bd0fec1765c1a..e12205ef16ca932ffbcc86d67b0817aec2436c89 100644
> > --- a/fs/nfsd/nfs4callback.c
> > +++ b/fs/nfsd/nfs4callback.c
> > @@ -1403,9 +1403,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> >  	}
> >  	trace_nfsd_cb_free_slot(task, cb);
> >  	nfsd41_cb_release_slot(cb);
> > -
> > -	if (RPC_SIGNALLED(task))
> > -		goto need_restart;
> >  out:
> >  	return ret;
> >  retry_nowait:
> > 
> > -- 
> > 2.48.1
> > 
> > 
> 

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/8] nfsd: don't restart v4.1+ callback when RPC_SIGNALLED is set
  2025-01-26 11:18     ` Jeff Layton
@ 2025-01-26 16:41       ` Chuck Lever
  2025-01-27 15:43         ` Jeff Layton
  0 siblings, 1 reply; 35+ messages in thread
From: Chuck Lever @ 2025-01-26 16:41 UTC (permalink / raw)
  To: Jeff Layton, NeilBrown
  Cc: Olga Kornievskaia, Dai Ngo, Tom Talpey, J. Bruce Fields,
	Kinglong Mee, Trond Myklebust, Anna Schumaker, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	linux-nfs, linux-kernel, netdev

On 1/26/25 6:18 AM, Jeff Layton wrote:
> On Sun, 2025-01-26 at 10:01 +1100, NeilBrown wrote:
>> On Fri, 24 Jan 2025, Jeff Layton wrote:
>>> This is problematic, since the RPC might have been entirely successful.
>>> There is no point in restarting a v4.1+ callback just because
>>> RPC_SIGNALLED is true. The v4.1+ error handling has other mechanisms for
>>> detecting when it should retransmit the RPC.
>>
>> But why might RPC_SIGNALLED() ever be true?
>> The flag RPC_TASK_SIGNALLED is only ever set by rpc_signal_task() which
>> is only called from rpc_killall_tasks() and __rpc_execute() for
>> non-async tasks which doesn't apply to nfsd callbacks as they are
>> started with rpc_call_async().
>>
>> rpc_killall_tasks() is called by fs/nfs/ which isn't relevant for us,
>> and from rpc_shutdown_client().  In those cases we certainly don't want
>> the request to be retried, though the nfsd4_process_cb_update() case is
>> a little interesting as we do want it to be retried, but in a different
>> client.
>>
>> So the code you are removing is either dead code because something else
>> will prevent the restart when a client is being shut down, or it is bad
>> code because it would delay rpc_shutdown_client() while the request is
>> retried.
>>
>> I haven't dug the extra step to figure out which, but either way I think
>> the code should go.
> 
> Thanks. That was my analysis too.

Agreed, this code is problematic, but it appears to me that some of
these problems are not resolved by simply removing the signal check.


> rpc_shutdown_client() is called when we tear down and rebuild the
> rpc_client. nfsd does this in setup_callback_client(), which gets
> called from nfsd4_process_cb_update() (basically when we detect that
> the backchannel is having problems).
> 
> There are really only two states: We either got a reply from the server
> before the client went down, or we didn't. In the case where we got a
> reply, there is no need to retry anything. In the case where we didn't,
> the tk_status will be '1', so there is no need to check RPC_SIGNALLED()
> at all here.

Fwiw, the "cb_seq_status == 1" arm skips the signal check in the current
code.


> The existing code could lead to the call being retried when we had
> already gotten a perfectly valid reply.

Here's a case-by-case audit:

  - NFS_OK: SEQUENCE was decoded and passed sanity checks. So this should
    not ever requeue in here. It might be requeued during subsequent
    processing.

  - ESERVERFAULT: SEQUENCE was decoded but failed sanity checking. The
    reply should be dropped now, and the session marked FAULT. No requeue
    is ever needed here.

    [ I question whether the sequence number should be bumped in this
      case -- the client's callback server replied with the identity of
      some other slot. And anyway, this session is about to become
      toast. ]

  - 1: The timeout case. We want a fresh session here, so it falls
    through to BADSESSION.

  - NFS4ERR_BADSESSION: This needs a requeue so that the operation can
    be retried with a fresh session. But it should always check if the
    rpc_clnt is shutting down before doing so. This is a problem in the
    current code.

  - NFS4ERR_DELAY: Skips the signal check, but shouldn't. If the rpc_clnt
    is shutting down, this RPC should not be requeued.

  - NFS4ERR_BAD_SLOT: Skips the signal check, but shouldn't. I need to
    think more about BAD_SLOT recovery best practice.

  - NFS4ERR_SEQ_MISORDERED: Does one retry with a seq_nr of 1. It
    probably should terminate if that fails. IMO this should check for
    rpc_clnt shutdown before requeuing the retry.

  - default: I don't think this case should ever be requeued, but it
    appears that it could be if the rpc_clnt is shutting down.


>>> Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
>>> Signed-off-by: Jeff Layton <jlayton@kernel.org>
>>> ---
>>>   fs/nfsd/nfs4callback.c | 3 ---
>>>   1 file changed, 3 deletions(-)
>>>
>>> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
>>> index 50e468bdb8d4838b5217346dcc2bd0fec1765c1a..e12205ef16ca932ffbcc86d67b0817aec2436c89 100644
>>> --- a/fs/nfsd/nfs4callback.c
>>> +++ b/fs/nfsd/nfs4callback.c
>>> @@ -1403,9 +1403,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>>>   	}
>>>   	trace_nfsd_cb_free_slot(task, cb);
>>>   	nfsd41_cb_release_slot(cb);
>>> -
>>> -	if (RPC_SIGNALLED(task))
>>> -		goto need_restart;
>>>   out:
>>>   	return ret;
>>>   retry_nowait:
>>>
>>> -- 
>>> 2.48.1
>>>
>>>
>>
> 


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 7/8] nfsd: clean up and amend comments around nfsd4_cb_sequence_done()
  2025-01-24 14:50     ` Jeff Layton
  2025-01-24 15:05       ` Chuck Lever
@ 2025-01-26 16:50       ` Chuck Lever
  1 sibling, 0 replies; 35+ messages in thread
From: Chuck Lever @ 2025-01-26 16:50 UTC (permalink / raw)
  To: Jeff Layton, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: linux-nfs, linux-kernel, netdev

On 1/24/25 9:50 AM, Jeff Layton wrote:
> On Fri, 2025-01-24 at 09:43 -0500, Chuck Lever wrote:
>> On 1/23/25 3:25 PM, Jeff Layton wrote:
>>> Add a new kerneldoc header, and clean up the comments a bit.
>>
>> Usually I'm in favor of kdoc headers, but here, it's a static function
>> whose address is not shared outside of this source file. The only
>> documentation need is the meaning of the return code, IMO.
>>
> 
> If you like. I figured it wouldn't hurt to do a full kdoc comment.
> 
>>
>>> Signed-off-by: Jeff Layton <jlayton@kernel.org>
>>> ---
>>>    fs/nfsd/nfs4callback.c | 26 ++++++++++++++++++++------
>>>    1 file changed, 20 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
>>> index 6e0561f3b21bd850b0387b5af7084eb05e818231..415fc8aae0f47c36f00b2384805c7a996fb1feb0 100644
>>> --- a/fs/nfsd/nfs4callback.c
>>> +++ b/fs/nfsd/nfs4callback.c
>>> @@ -1325,6 +1325,17 @@ static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
>>>    	rpc_call_start(task);
>>>    }
>>>    
>>> +/**
>>> + * nfsd4_cb_sequence_done - process the result of a CB_SEQUENCE
>>> + * @task: rpc_task
>>> + * @cb: nfsd4_callback for this call
>>> + *
>>> + * For minorversion 0, there is no CB_SEQUENCE. Only restart the call
>>> + * if the callback RPC client was killed. For v4.1+ the error handling
>>> + * is more sophisticated.
>>
>> It would be much clearer to pull the 4.0 error handling out of this
>> function, which is named "cb_/sequence/_done".
>>
>> Perhaps the need_restart label can be hoisted into nfsd4_cb_done() ?
>>
> 
> If we do that then we'll need to change this function to return
> something other than a bool, and that's a larger change than I wanted
> to make here.

I don't think that's needed. If you create a helper like so:

static bool nfsd4_cb_requeue(struct rpc_task *task,
			     struct nfsd4_callback *cb)
{ 

         struct nfs4_client *clp = cb->cb_clp;

         if (!test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) { 

                 trace_nfsd_cb_restart(clp, cb); 

                 task->tk_status = 0; 

                 cb->cb_need_restart = true; 

         } 

         return false;
}

Then you can replace the "goto need_restart;" sites in both functions
with a tail call to this helper:

	return nfsd4_cb_requeue(task, cb);


> I really wanted to keep these as small, targeted patches
> that can be backported easily.

Clean-ups are generally not back-portable, so I don't mind if you go to
a little extra trouble.


> I wouldn't object to further cleanup here on top of that though.
> 
> 
>>
>>> + *
>>> + * Returns true if reply processing should continue.
>>> + */
>>>    static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback *cb)
>>>    {
>>>    	struct nfs4_client *clp = cb->cb_clp;
>>> @@ -1334,11 +1345,11 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>>>    	if (!clp->cl_minorversion) {
>>>    		/*
>>>    		 * If the backchannel connection was shut down while this
>>> -		 * task was queued, we need to resubmit it after setting up
>>> -		 * a new backchannel connection.
>>> +		 * task was queued, resubmit it after setting up a new
>>> +		 * backchannel connection.
>>>    		 *
>>> -		 * Note that if we lost our callback connection permanently
>>> -		 * the submission code will error out, so we don't need to
>>> +		 * Note that if the callback connection is permanently lost,
>>> +		 * the submission code will error out. There is no need to
>>>    		 * handle that case here.
>>>    		 */
>>>    		if (RPC_SIGNALLED(task))
>>> @@ -1355,8 +1366,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>>>    	switch (cb->cb_seq_status) {
>>>    	case 0:
>>>    		/*
>>> -		 * No need for lock, access serialized in nfsd4_cb_prepare
>>> -		 *
>>>    		 * RFC5661 20.9.3
>>>    		 * If CB_SEQUENCE returns an error, then the state of the slot
>>>    		 * (sequence ID, cached reply) MUST NOT change.
>>> @@ -1365,6 +1374,11 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>>>    		ret = true;
>>>    		break;
>>>    	case -ESERVERFAULT:
>>> +		/*
>>> +		 * Client returned NFS4_OK, but decoding failed. Mark the
>>> +		 * backchannel as faulty, but don't retransmit since the
>>> +		 * call was successful.
>>> +		 */
>>>    		++session->se_cb_seq_nr[cb->cb_held_slot];
>>>    		nfsd4_mark_cb_fault(cb->cb_clp);
>>>    		break;
>>
>> This old code abuses the meaning of ESERVERFAULT IMO. NFS4ERR_BADXDR is
>> a better choice. But why call mark_cb_fault in this case?
>>
>> Maybe split this clean-up into a separate patch.
>>
>>
> 
> I'm only altering comments in this patch. Do you really want separate
> patches for the different comments?
> 


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/8] nfsd: don't restart v4.1+ callback when RPC_SIGNALLED is set
  2025-01-26 16:41       ` Chuck Lever
@ 2025-01-27 15:43         ` Jeff Layton
  2025-01-27 17:00           ` Chuck Lever
  0 siblings, 1 reply; 35+ messages in thread
From: Jeff Layton @ 2025-01-27 15:43 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown
  Cc: Olga Kornievskaia, Dai Ngo, Tom Talpey, J. Bruce Fields,
	Kinglong Mee, Trond Myklebust, Anna Schumaker, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	linux-nfs, linux-kernel, netdev

On Sun, 2025-01-26 at 11:41 -0500, Chuck Lever wrote:
> On 1/26/25 6:18 AM, Jeff Layton wrote:
> > On Sun, 2025-01-26 at 10:01 +1100, NeilBrown wrote:
> > > On Fri, 24 Jan 2025, Jeff Layton wrote:
> > > > This is problematic, since the RPC might have been entirely successful.
> > > > There is no point in restarting a v4.1+ callback just because
> > > > RPC_SIGNALLED is true. The v4.1+ error handling has other mechanisms for
> > > > detecting when it should retransmit the RPC.
> > > 
> > > But why might RPC_SIGNALLED() ever be true?
> > > The flag RPC_TASK_SIGNALLED is only ever set by rpc_signal_task() which
> > > is only called from rpc_killall_tasks() and __rpc_execute() for
> > > non-async tasks which doesn't apply to nfsd callbacks as they are
> > > started with rpc_call_async().
> > > 
> > > rpc_killall_tasks() is called by fs/nfs/ which isn't relevant for us,
> > > and from rpc_shutdown_client().  In those cases we certainly don't want
> > > the request to be retried, though the nfsd4_process_cb_update() case is
> > > a little interesting as we do want it to be retried, but in a different
> > > client.
> > > 
> > > So the code you are removing is either dead code because something else
> > > will prevent the restart when a client is being shut down, or it is bad
> > > code because it would delay rpc_shutdown_client() while the request is
> > > retried.
> > > 
> > > I haven't dug the extra step to figure out which, but either way I think
> > > the code should go.
> > 
> > Thanks. That was my analysis too.
> 
> Agreed, this code is problematic, but it appears to me that some of
> these problems are not resolved by simply removing the signal check.
> 
> 
> > rpc_shutdown_client() is called when we tear down and rebuild the
> > rpc_client. nfsd does this in setup_callback_client(), which gets
> > called from nfsd4_process_cb_update() (basically when we detect that
> > the backchannel is having problems).
> > 
> > There are really only two states: We either got a reply from the server
> > before the client went down, or we didn't. In the case where we got a
> > reply, there is no need to retry anything. In the case where we didn't,
> > the tk_status will be '1', so there is no need to check RPC_SIGNALLED()
> > at all here.
> 
> Fwiw, the "cb_seq_status == 1" arm skips the signal check in the current
> code.
> 
> 
> > The existing code could lead to the call being retried when we had
> > already gotten a perfectly valid reply.
> 
> Here's a case-by-case audit:
> 
>   - NFS_OK: SEQUENCE was decoded and passed sanity checks. So this should
>     not ever requeue in here. It might be requeued during subsequent
>     processing.
> 
>   - ESERVERFAULT: SEQUENCE was decoded but failed sanity checking. The
>     reply should be dropped now, and the session marked FAULT. No requeue
>     is ever needed here.
> 
>     [ I question whether the sequence number should be bumped in this
>       case -- the client's callback server replied with the identity of
>       some other slot. And anyway, this session is about to become
>       toast. ]
> 

It didn't necessarily reply with the ID of a different slot. It's just
that the decoding failed in some way. It could have been any of the
cases in decode_cb_sequence4resok(). Maybe that function needs to
return more distinct error codes so we know what was mangled.

>   - 1: The timeout case. We want a fresh session here, so it falls
>     through to BADSESSION.
> 

Ok.

>   - NFS4ERR_BADSESSION: This needs a requeue so that the operation can
>     be retried with a fresh session. But it should always check if the
>     rpc_clnt is shutting down before doing so. This is a problem in the
>     current code.
> 

I'm not sure I understand the problem you see with that in the existing
code. There's a rather complicated dance in nfsd4_process_cb_update(),
but if the nfs4_client is shutting down, then clp->cl_cb_client will be
NULL after it, and the callback will end.

You said "rpc_clnt" though, so I'm not sure I understand the scenario
you mean.

>   - NFS4ERR_DELAY: Skips the signal check, but shouldn't. If the rpc_clnt
>     is shutting down, this RPC should not be requeued.
> 

Good point -- ot sure how we deal with that in a non-racy way. I'll
think about it.

>   - NFS4ERR_BAD_SLOT: Skips the signal check, but shouldn't. I need to
>     think more about BAD_SLOT recovery best practice.
> 

RPC_SIGNALLED() is irrelevant here. I think what we want to do is mark
the backchannel as faulty, _leak_ the slot and retry via the workqueue
(not just requeue the rpc_task). That should just cause the callback to
exit once it runs again.

We should also mark the backchannel as faulty, since the client and
server no longer agree on the size of the slot table.

>   - NFS4ERR_SEQ_MISORDERED: Does one retry with a seq_nr of 1. It
>     probably should terminate if that fails. IMO this should check for
>     rpc_clnt shutdown before requeuing the retry.
> 

Fair enough. There is a frustrating lack of guidance in the spec about
SEQ_MISORDERED. We should probably mark the BC as having a FAULT too if
the retry fails.

>   - default: I don't think this case should ever be requeued, but it
>     appears that it could be if the rpc_clnt is shutting down.
> 

Yeah. Might not hurt to throw a pr_warn() here too. I think we never
want to fall into this case.

In any case, my intention is to fix up the cb_session lifetime problem
first, and then we can rework the error handling from the callbacks on
top of that.
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/8] nfsd: don't restart v4.1+ callback when RPC_SIGNALLED is set
  2025-01-27 15:43         ` Jeff Layton
@ 2025-01-27 17:00           ` Chuck Lever
  0 siblings, 0 replies; 35+ messages in thread
From: Chuck Lever @ 2025-01-27 17:00 UTC (permalink / raw)
  To: Jeff Layton, NeilBrown
  Cc: Olga Kornievskaia, Dai Ngo, Tom Talpey, J. Bruce Fields,
	Kinglong Mee, Trond Myklebust, Anna Schumaker, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	linux-nfs, linux-kernel, netdev

On 1/27/25 10:43 AM, Jeff Layton wrote:
> On Sun, 2025-01-26 at 11:41 -0500, Chuck Lever wrote:

>>    - ESERVERFAULT: SEQUENCE was decoded but failed sanity checking. The
>>      reply should be dropped now, and the session marked FAULT. No requeue
>>      is ever needed here.
>>
>>      [ I question whether the sequence number should be bumped in this
>>        case -- the client's callback server replied with the identity of
>>        some other slot. And anyway, this session is about to become
>>        toast. ]
> 
> It didn't necessarily reply with the ID of a different slot. It's just
> that the decoding failed in some way. 

My read is that if the XDR decode failed in any way, the decoder sets
cb_seq_status to -EIO.

-ESERVERFAULT means the decoding went fine, but one or more of the
session ID, slot number, or sequence did not match what NFSD's callback
client expected.

It's not the same slot if either the session ID or slot number doesn't
match what the server sent in its CB_SEQUENCE Call. That seems
equivalent to BAD_SLOT without any question.

If the sequence number is wrong, then it's equivalent to SEQ_MISORDERED,
maybe?


> It could have been any of the
> cases in decode_cb_sequence4resok(). Maybe that function needs to
> return more distinct error codes so we know what was mangled.

My preference would be that decode_cb_sequence() should simply
decode these fields, and let nfsd4_cb_sequence_done() do the sanity
checking. I don't think decode_cb_sequence4resok() should be doing
any sanity checking beyond "does this unmarshal in the space allowed?"


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2025-01-27 17:01 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-23 20:25 [PATCH 0/8] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
2025-01-23 20:25 ` [PATCH 1/8] nfsd: don't restart v4.1+ callback when RPC_SIGNALLED is set Jeff Layton
2025-01-25 16:24   ` Chuck Lever
2025-01-25 22:04     ` Jeff Layton
2025-01-25 23:01   ` NeilBrown
2025-01-26 11:18     ` Jeff Layton
2025-01-26 16:41       ` Chuck Lever
2025-01-27 15:43         ` Jeff Layton
2025-01-27 17:00           ` Chuck Lever
2025-01-23 20:25 ` [PATCH 2/8] nfsd: fix CB_SEQUENCE error handling of NFS4ERR_{BADSLOT,BADSESSION,SEQ_MISORDERED} Jeff Layton
2025-01-24 14:32   ` Chuck Lever
2025-01-24 14:46     ` Jeff Layton
2025-01-24 15:31       ` Chuck Lever
2025-01-24 16:04         ` Jeff Layton
2025-01-24 16:08         ` Jeff Layton
2025-01-23 20:25 ` [PATCH 3/8] nfsd: when CB_SEQUENCE gets NFS4ERR_DELAY, release the slot Jeff Layton
2025-01-23 22:18   ` Chuck Lever
2025-01-23 23:20     ` Jeff Layton
2025-01-24  1:30       ` Tom Talpey
2025-01-24 14:00       ` J. Bruce Fields
2025-01-24 14:11         ` Jeff Layton
2025-01-24 20:29           ` Tom Talpey
2025-01-24 17:45         ` Olga Kornievskaia
2025-01-24 17:47           ` Olga Kornievskaia
2025-01-23 20:25 ` [PATCH 4/8] nfsd: fix default case in nfsd4_cb_sequence_done() Jeff Layton
2025-01-23 20:25 ` [PATCH 5/8] nfsd: reverse default of "ret" variable " Jeff Layton
2025-01-23 20:25 ` [PATCH 6/8] nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault() Jeff Layton
2025-01-23 20:25 ` [PATCH 7/8] nfsd: clean up and amend comments around nfsd4_cb_sequence_done() Jeff Layton
2025-01-24 14:43   ` Chuck Lever
2025-01-24 14:50     ` Jeff Layton
2025-01-24 15:05       ` Chuck Lever
2025-01-24 15:31         ` Jeff Layton
2025-01-24 15:42           ` Chuck Lever
2025-01-26 16:50       ` Chuck Lever
2025-01-23 20:25 ` [PATCH 8/8] sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return Jeff Layton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).