public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups
@ 2025-01-29 13:39 Jeff Layton
  2025-01-29 13:39 ` [PATCH v2 1/7] nfsd: add routines to get/put session references for callbacks Jeff Layton
                   ` (8 more replies)
  0 siblings, 9 replies; 17+ messages in thread
From: Jeff Layton @ 2025-01-29 13:39 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, linux-kernel, Jeff Layton

While looking over the CB_SEQUENCE error handling, I discovered that
callbacks don't hold a reference to a session, and the
clp->cl_cb_session could easily change between request and response.
If that happens at an inopportune time, there could be UAFs or weird
slot/sequence handling problems.

This series changes the nfsd4_session to be RCU-freed, and then adds a
new method of session refcounting that is compatible with the old.
nfsd4_callback RPCs will now hold a lightweight reference to the session
in addition to the slot. Then, all of the callback handling is switched
to use that session instead of dereferencing clp->cb_cb_session.
I've also reworked the error handling in nfsd4_cb_sequence_done()
based on review comments, and lifted the v4.0 handing out of that
function.

This passes pynfs, nfstests, and fstests for me, but I'm not sure how
much any of that stresses the backchannel's error handling.

These should probably go in via Chuck's tree, but the last patch touches
some NFS cnd sunrpc client code, so it'd be good to have R-b's or A-b's
from Trond and/or Anna on that one.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
Changes in v2:
- make nfsd4_session be RCU-freed
- change code to keep reference to session over callback RPCs
- rework error handling in nfsd4_cb_sequence_done()
- move NFSv4.0 handling out of nfsd4_cb_sequence_done()
- Link to v1: https://lore.kernel.org/r/20250123-nfsd-6-14-v1-0-c1137a4fa2ae@kernel.org

---
Jeff Layton (7):
      nfsd: add routines to get/put session references for callbacks
      nfsd: make clp->cl_cb_session be an RCU managed pointer
      nfsd: add a cb_ses pointer to nfsd4_callback and use it instead of clp->cb_cb_session
      nfsd: overhaul CB_SEQUENCE error handling
      nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault()
      nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done()
      sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return

 fs/nfs/nfs4proc.c           |  12 ++-
 fs/nfsd/nfs4callback.c      | 212 ++++++++++++++++++++++++++++++++------------
 fs/nfsd/nfs4state.c         |  43 ++++++++-
 fs/nfsd/state.h             |   6 +-
 fs/nfsd/trace.h             |   6 +-
 include/linux/sunrpc/clnt.h |   4 +-
 net/sunrpc/clnt.c           |   7 +-
 7 files changed, 210 insertions(+), 80 deletions(-)
---
base-commit: a05af3c6103b703d1d38d8180b3ebbe0a03c2f07
change-id: 20250123-nfsd-6-14-b0797e385dc0

Best regards,
-- 
Jeff Layton <jlayton@kernel.org>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 1/7] nfsd: add routines to get/put session references for callbacks
  2025-01-29 13:39 [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
@ 2025-01-29 13:39 ` Jeff Layton
  2025-01-29 13:39 ` [PATCH v2 2/7] nfsd: make clp->cl_cb_session be an RCU managed pointer Jeff Layton
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Jeff Layton @ 2025-01-29 13:39 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, linux-kernel, Jeff Layton

The existing session reference counting is too heavyweight for
callbacks. There is an atomic refcount in nfsd4_session (se_ref), but
the existing functions take a client reference alongside it, and require
the nn->client_lock. This is unnecessary for callbacks as they are
already owned by the client.

Add new nfsd4_cb_get_session() and nfsd4_cb_put_session() calls that
take and put a session reference on behalf of a callback.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4state.c | 32 ++++++++++++++++++++++++++++++--
 fs/nfsd/state.h     |  2 ++
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index cc819b8e8acdf5dcfe44c5bae45c6233f7b695e9..2c26c6aaea93e3e1eb438e7e23dc881c7bf35fe2 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -234,6 +234,35 @@ static void put_client_renew(struct nfs4_client *clp)
 	spin_unlock(&nn->client_lock);
 }
 
+/**
+ * nfsd4_cb_get_session - get a session reference for a callback
+ * @ses: session of which to get a reference
+ *
+ * Callbacks are different than client-driven RPCs. The caller doesn't
+ * need a reference to the nfs4_client, and doesn't want to renew the
+ * lease when putting the reference.
+ */
+bool nfsd4_cb_get_session(struct nfsd4_session *ses)
+{
+	if (is_session_dead(ses))
+		return false;
+	return atomic_inc_not_zero(&ses->se_ref);
+}
+
+/**
+ * nfsd4_cb_put_session - put a session reference for a callback
+ * @ses: session of which to put a reference
+ *
+ * Callbacks are different than client-driven RPCs. The caller doesn't
+ * need a reference to the nfs4_client, and doesn't want to renew the
+ * lease when putting the reference.
+ */
+void nfsd4_cb_put_session(struct nfsd4_session *ses)
+{
+	if (ses && atomic_dec_and_test(&ses->se_ref) && is_session_dead(ses))
+		free_session(ses);
+}
+
 static __be32 nfsd4_get_session_locked(struct nfsd4_session *ses)
 {
 	__be32 status;
@@ -254,8 +283,7 @@ static void nfsd4_put_session_locked(struct nfsd4_session *ses)
 
 	lockdep_assert_held(&nn->client_lock);
 
-	if (atomic_dec_and_test(&ses->se_ref) && is_session_dead(ses))
-		free_session(ses);
+	nfsd4_cb_put_session(ses);
 	put_client_renew_locked(clp);
 }
 
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 74d2d7b42676d907bec9159b927aeed223d668c3..79d985d2a656e1a5b22a6a9c88f309515725e847 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -753,6 +753,8 @@ struct nfsd4_compound_state;
 struct nfsd_net;
 struct nfsd4_copy;
 
+bool nfsd4_cb_get_session(struct nfsd4_session *ses);
+void nfsd4_cb_put_session(struct nfsd4_session *ses);
 extern __be32 nfs4_preprocess_stateid_op(struct svc_rqst *rqstp,
 		struct nfsd4_compound_state *cstate, struct svc_fh *fhp,
 		stateid_t *stateid, int flags, struct nfsd_file **filp,

-- 
2.48.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 2/7] nfsd: make clp->cl_cb_session be an RCU managed pointer
  2025-01-29 13:39 [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
  2025-01-29 13:39 ` [PATCH v2 1/7] nfsd: add routines to get/put session references for callbacks Jeff Layton
@ 2025-01-29 13:39 ` Jeff Layton
  2025-01-29 13:39 ` [PATCH v2 3/7] nfsd: add a cb_ses pointer to nfsd4_callback and use it instead of clp->cb_cb_session Jeff Layton
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Jeff Layton @ 2025-01-29 13:39 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, linux-kernel, Jeff Layton

Currently, this is just a pointer to the most recent session, but
there is no guarantee that the session is still valid and in memory.
It's possible for this pointer go NULL or replaced.

First, embed a struct rcu in nfsd4_session and free it via free_rcu.
Ensure that when clp->cl_cb_session pointer is changed, that it is done
via RCU-safe methods.

This will allow callbacks to access the cl_cb_session pointer safely via
RCU.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4callback.c | 21 ++++++++++++++++++---
 fs/nfsd/nfs4state.c    | 11 +++++++++--
 fs/nfsd/state.h        |  3 ++-
 3 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 50e468bdb8d4838b5217346dcc2bd0fec1765c1a..e55bf66a33d6efb56d2f75f0a49a60307e3807ac 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1122,6 +1122,7 @@ static int setup_callback_client(struct nfs4_client *clp, struct nfs4_cb_conn *c
 	};
 	struct rpc_clnt *client;
 	const struct cred *cred;
+	int ret;
 
 	if (clp->cl_minorversion == 0) {
 		if (!clp->cl_cred.cr_principal &&
@@ -1137,7 +1138,9 @@ static int setup_callback_client(struct nfs4_client *clp, struct nfs4_cb_conn *c
 	} else {
 		if (!conn->cb_xprt || !ses)
 			return -EINVAL;
-		clp->cl_cb_session = ses;
+		if (!nfsd4_cb_get_session(ses))
+			return -EINVAL;
+		rcu_assign_pointer(clp->cl_cb_session, ses);
 		args.bc_xprt = conn->cb_xprt;
 		args.prognumber = clp->cl_cb_session->se_cb_prog;
 		args.protocol = conn->cb_xprt->xpt_class->xcl_ident |
@@ -1148,13 +1151,15 @@ static int setup_callback_client(struct nfs4_client *clp, struct nfs4_cb_conn *c
 	client = rpc_create(&args);
 	if (IS_ERR(client)) {
 		trace_nfsd_cb_setup_err(clp, PTR_ERR(client));
-		return PTR_ERR(client);
+		ret = PTR_ERR(client);
+		goto out_put_ses;
 	}
 	cred = get_backchannel_cred(clp, client, ses);
 	if (!cred) {
 		trace_nfsd_cb_setup_err(clp, -ENOMEM);
 		rpc_shutdown_client(client);
-		return -ENOMEM;
+		ret = -ENOMEM;
+		goto out_put_ses;
 	}
 
 	if (clp->cl_minorversion != 0)
@@ -1166,6 +1171,12 @@ static int setup_callback_client(struct nfs4_client *clp, struct nfs4_cb_conn *c
 			    args.authflavor);
 	rcu_read_unlock();
 	return 0;
+out_put_ses:
+	if (clp->cl_minorversion != 0) {
+		rcu_assign_pointer(clp->cl_cb_session, NULL);
+		nfsd4_cb_put_session(ses);
+	}
+	return ret;
 }
 
 static void nfsd4_mark_cb_state(struct nfs4_client *clp, int newstate)
@@ -1529,11 +1540,15 @@ static void nfsd4_process_cb_update(struct nfsd4_callback *cb)
 	 * kill the old client:
 	 */
 	if (clp->cl_cb_client) {
+		struct nfsd4_session *ses;
+
 		trace_nfsd_cb_bc_shutdown(clp, cb);
 		rpc_shutdown_client(clp->cl_cb_client);
 		clp->cl_cb_client = NULL;
 		put_cred(clp->cl_cb_cred);
 		clp->cl_cb_cred = NULL;
+		ses = rcu_replace_pointer(clp->cl_cb_session, NULL, true);
+		nfsd4_cb_put_session(ses);
 	}
 	if (clp->cl_cb_conn.cb_xprt) {
 		svc_xprt_put(clp->cl_cb_conn.cb_xprt);
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 2c26c6aaea93e3e1eb438e7e23dc881c7bf35fe2..59d3111f558396ec46f7d286b2c90500bda642d9 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2180,7 +2180,7 @@ static void __free_session(struct nfsd4_session *ses)
 {
 	free_session_slots(ses, 0);
 	xa_destroy(&ses->se_slots);
-	kfree(ses);
+	kfree_rcu(ses, se_rcu);
 }
 
 static void free_session(struct nfsd4_session *ses)
@@ -3283,7 +3283,7 @@ static struct nfs4_client *create_client(struct xdr_netobj name,
 	clp->cl_time = ktime_get_boottime_seconds();
 	copy_verf(clp, verf);
 	memcpy(&clp->cl_addr, sa, sizeof(struct sockaddr_storage));
-	clp->cl_cb_session = NULL;
+	rcu_assign_pointer(clp->cl_cb_session, NULL);
 	clp->net = net;
 	clp->cl_nfsd_dentry = nfsd_client_mkdir(
 		nn, &clp->cl_nfsdfs,
@@ -4251,6 +4251,13 @@ nfsd4_destroy_session(struct svc_rqst *r, struct nfsd4_compound_state *cstate,
 	status = nfserr_wrong_cred;
 	if (!nfsd4_mach_creds_match(ses->se_client, r))
 		goto out_put_session;
+
+	/*
+	 * Is this session the backchannel session? Count an extra
+	 * reference if so.
+	 */
+	if (ses == rcu_access_pointer(ses->se_client->cl_cb_session))
+		ref_held_by_me++;
 	status = mark_session_dead_locked(ses, 1 + ref_held_by_me);
 	if (status)
 		goto out_put_session;
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 79d985d2a656e1a5b22a6a9c88f309515725e847..0faa367c9fa3280fa4a8a982f974804bb89f2035 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -354,6 +354,7 @@ struct nfsd4_session {
 	u16			se_slot_gen;
 	bool			se_dead;
 	u32			se_target_maxslots;
+	struct rcu_head		se_rcu;
 };
 
 /* formatted contents of nfs4_sessionid */
@@ -465,7 +466,7 @@ struct nfs4_client {
 #define NFSD4_CB_FAULT		3
 	int			cl_cb_state;
 	struct nfsd4_callback	cl_cb_null;
-	struct nfsd4_session	*cl_cb_session;
+	struct nfsd4_session	__rcu *cl_cb_session;
 
 	/* for all client information that callback code might need: */
 	spinlock_t		cl_lock;

-- 
2.48.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 3/7] nfsd: add a cb_ses pointer to nfsd4_callback and use it instead of clp->cb_cb_session
  2025-01-29 13:39 [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
  2025-01-29 13:39 ` [PATCH v2 1/7] nfsd: add routines to get/put session references for callbacks Jeff Layton
  2025-01-29 13:39 ` [PATCH v2 2/7] nfsd: make clp->cl_cb_session be an RCU managed pointer Jeff Layton
@ 2025-01-29 13:39 ` Jeff Layton
  2025-01-29 13:39 ` [PATCH v2 4/7] nfsd: overhaul CB_SEQUENCE error handling Jeff Layton
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Jeff Layton @ 2025-01-29 13:39 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, linux-kernel, Jeff Layton

Once a callback workqueue job has completed, the cl_cb_session could
change to a completely different session, if there is more than one
callback in flight at a time.

Have the callback hold its own cb reference to the session, and fix the
slot acquisition routines to get/put a session reference. This ensures
that the call and reply handling are working with the same session.

In the event that rpc_prepare can't get a session reference, allow the
rpc_task to sleep until the session reference changes.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4callback.c | 67 +++++++++++++++++++++++++++++++++++++++++---------
 fs/nfsd/state.h        |  1 +
 fs/nfsd/trace.h        |  6 ++---
 3 files changed, 58 insertions(+), 16 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index e55bf66a33d6efb56d2f75f0a49a60307e3807ac..9f4838a6d9c668cdf66a77793f63c064586f2b22 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -435,7 +435,7 @@ static void encode_cb_sequence4args(struct xdr_stream *xdr,
 				    const struct nfsd4_callback *cb,
 				    struct nfs4_cb_compound_hdr *hdr)
 {
-	struct nfsd4_session *session = cb->cb_clp->cl_cb_session;
+	struct nfsd4_session *session = cb->cb_ses;
 	__be32 *p;
 
 	if (hdr->minorversion == 0)
@@ -503,7 +503,7 @@ static void update_cb_slot_table(struct nfsd4_session *ses, u32 target)
 static int decode_cb_sequence4resok(struct xdr_stream *xdr,
 				    struct nfsd4_callback *cb)
 {
-	struct nfsd4_session *session = cb->cb_clp->cl_cb_session;
+	struct nfsd4_session *session = cb->cb_ses;
 	int status = -ESERVERFAULT;
 	__be32 *p;
 	u32 seqid, slotid, target;
@@ -1142,7 +1142,7 @@ static int setup_callback_client(struct nfs4_client *clp, struct nfs4_cb_conn *c
 			return -EINVAL;
 		rcu_assign_pointer(clp->cl_cb_session, ses);
 		args.bc_xprt = conn->cb_xprt;
-		args.prognumber = clp->cl_cb_session->se_cb_prog;
+		args.prognumber = ses->se_cb_prog;
 		args.protocol = conn->cb_xprt->xpt_class->xcl_ident |
 				XPRT_TRANSPORT_BC;
 		args.authflavor = ses->se_cb_sec.flavor;
@@ -1161,9 +1161,10 @@ static int setup_callback_client(struct nfs4_client *clp, struct nfs4_cb_conn *c
 		ret = -ENOMEM;
 		goto out_put_ses;
 	}
-
-	if (clp->cl_minorversion != 0)
+	if (clp->cl_minorversion != 0) {
 		clp->cl_cb_conn.cb_xprt = conn->cb_xprt;
+		rpc_wake_up(&clp->cl_cb_waitq);
+	}
 	clp->cl_cb_client = client;
 	clp->cl_cb_cred = cred;
 	rcu_read_lock();
@@ -1252,6 +1253,34 @@ void nfsd4_change_callback(struct nfs4_client *clp, struct nfs4_cb_conn *conn)
 	spin_unlock(&clp->cl_lock);
 }
 
+static bool grab_cb_ses(struct nfsd4_callback *cb)
+{
+	struct nfs4_client *clp = cb->cb_clp;
+	struct nfsd4_session *ses;
+	bool ret = false;
+
+	if (cb->cb_ses)
+		return true;
+
+	rcu_read_lock();
+	ses = rcu_dereference(clp->cl_cb_session);
+	if (ses && nfsd4_cb_get_session(ses)) {
+		cb->cb_ses = ses;
+		ret = true;
+	}
+	rcu_read_unlock();
+
+	return ret;
+}
+
+static void put_cb_ses(struct nfsd4_callback *cb)
+{
+	if (cb->cb_ses) {
+		nfsd4_cb_put_session(cb->cb_ses);
+		cb->cb_ses = NULL;
+	}
+}
+
 static int grab_slot(struct nfsd4_session *ses)
 {
 	int idx;
@@ -1269,24 +1298,33 @@ static int grab_slot(struct nfsd4_session *ses)
 }
 
 /*
- * There's currently a single callback channel slot.
- * If the slot is available, then mark it busy.  Otherwise, set the
- * thread for sleeping on the callback RPC wait queue.
+ * Get both a session reference and a slot.
  */
 static bool nfsd41_cb_get_slot(struct nfsd4_callback *cb, struct rpc_task *task)
 {
 	struct nfs4_client *clp = cb->cb_clp;
-	struct nfsd4_session *ses = clp->cl_cb_session;
+	struct nfsd4_session *ses;
+
+	if (!grab_cb_ses(cb)) {
+		rpc_sleep_on(&clp->cl_cb_waitq, task, NULL);
+		if (!grab_cb_ses(cb))
+			return false;
+		rpc_wake_up_queued_task(&clp->cl_cb_waitq, task);
+	}
 
 	if (cb->cb_held_slot >= 0)
 		return true;
+
+	ses = cb->cb_ses;
 	cb->cb_held_slot = grab_slot(ses);
 	if (cb->cb_held_slot < 0) {
 		rpc_sleep_on(&clp->cl_cb_waitq, task, NULL);
 		/* Race breaker */
 		cb->cb_held_slot = grab_slot(ses);
-		if (cb->cb_held_slot < 0)
+		if (cb->cb_held_slot < 0) {
+			put_cb_ses(cb);
 			return false;
+		}
 		rpc_wake_up_queued_task(&clp->cl_cb_waitq, task);
 	}
 	return true;
@@ -1295,7 +1333,10 @@ static bool nfsd41_cb_get_slot(struct nfsd4_callback *cb, struct rpc_task *task)
 static void nfsd41_cb_release_slot(struct nfsd4_callback *cb)
 {
 	struct nfs4_client *clp = cb->cb_clp;
-	struct nfsd4_session *ses = clp->cl_cb_session;
+	struct nfsd4_session *ses = cb->cb_ses;
+
+	if (!ses)
+		return;
 
 	if (cb->cb_held_slot >= 0) {
 		spin_lock(&ses->se_lock);
@@ -1304,6 +1345,7 @@ static void nfsd41_cb_release_slot(struct nfsd4_callback *cb)
 		cb->cb_held_slot = -1;
 		rpc_wake_up_next(&clp->cl_cb_waitq);
 	}
+	put_cb_ses(cb);
 }
 
 static void nfsd41_destroy_cb(struct nfsd4_callback *cb)
@@ -1342,7 +1384,7 @@ static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
 static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback *cb)
 {
 	struct nfs4_client *clp = cb->cb_clp;
-	struct nfsd4_session *session = clp->cl_cb_session;
+	struct nfsd4_session *session = cb->cb_ses;
 	bool ret = true;
 
 	if (!clp->cl_minorversion) {
@@ -1629,6 +1671,7 @@ void nfsd4_init_cb(struct nfsd4_callback *cb, struct nfs4_client *clp,
 		const struct nfsd4_callback_ops *ops, enum nfsd4_cb_op op)
 {
 	cb->cb_clp = clp;
+	cb->cb_ses = NULL;
 	cb->cb_msg.rpc_proc = &nfs4_cb_procedures[op];
 	cb->cb_msg.rpc_argp = cb;
 	cb->cb_msg.rpc_resp = cb;
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 0faa367c9fa3280fa4a8a982f974804bb89f2035..56fe34d8dd90344404512114113c00a027aeb6a4 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -66,6 +66,7 @@ typedef struct {
 
 struct nfsd4_callback {
 	struct nfs4_client *cb_clp;
+	struct nfsd4_session *cb_ses;
 	struct rpc_message cb_msg;
 	const struct nfsd4_callback_ops *cb_ops;
 	struct work_struct cb_work;
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index ad2c0c432d08705bcebf00f7309f19267afcae03..fff665bac3b252387f92139b5f868cf1b034d1c9 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -1644,8 +1644,7 @@ TRACE_EVENT(nfsd_cb_seq_status,
 		__field(int, seq_status)
 	),
 	TP_fast_assign(
-		const struct nfs4_client *clp = cb->cb_clp;
-		const struct nfsd4_session *session = clp->cl_cb_session;
+		const struct nfsd4_session *session = cb->cb_ses;
 		const struct nfsd4_sessionid *sid =
 			(struct nfsd4_sessionid *)&session->se_sessionid;
 
@@ -1684,8 +1683,7 @@ TRACE_EVENT(nfsd_cb_free_slot,
 		__field(u32, slot_seqno)
 	),
 	TP_fast_assign(
-		const struct nfs4_client *clp = cb->cb_clp;
-		const struct nfsd4_session *session = clp->cl_cb_session;
+		const struct nfsd4_session *session = cb->cb_ses;
 		const struct nfsd4_sessionid *sid =
 			(struct nfsd4_sessionid *)&session->se_sessionid;
 

-- 
2.48.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 4/7] nfsd: overhaul CB_SEQUENCE error handling
  2025-01-29 13:39 [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
                   ` (2 preceding siblings ...)
  2025-01-29 13:39 ` [PATCH v2 3/7] nfsd: add a cb_ses pointer to nfsd4_callback and use it instead of clp->cb_cb_session Jeff Layton
@ 2025-01-29 13:39 ` Jeff Layton
  2025-01-29 13:39 ` [PATCH v2 5/7] nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault() Jeff Layton
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Jeff Layton @ 2025-01-29 13:39 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, linux-kernel, Jeff Layton

Some of the existing error handling in nfsd4_cb_sequence_done is
incorrect. This patch first does some general cleanup and then reworks
some of the error handling cases.

There is only one case where we want to proceed with processing the rest
of the CB_COMPOUND, and that's when the cb_seq_status is 0. Make the
default return value be false, and only set it to true in that case.

Now that there is a clear record of the session used to handle the
CB_COMPOUND, we can take changes to the cl_cb_session into account
when reattempting a call.

When restarting the RPC (but not the entire callback), test
RPC_SIGNALLED(). If it has been, then fall through to restart the
callback instead of just restarting the RPC.

Whenever restarting the entire callback, release the slot and session,
in case there have been changes in the interim.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4callback.c | 78 +++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 58 insertions(+), 20 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 9f4838a6d9c668cdf66a77793f63c064586f2b22..e70a7a03933b1f8a48d31b0ef226c3f157d14ed1 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1381,11 +1381,16 @@ static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
 	rpc_call_start(task);
 }
 
+static bool cb_session_changed(struct nfsd4_callback *cb)
+{
+	return cb->cb_ses != rcu_access_pointer(cb->cb_clp->cl_cb_session);
+}
+
 static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback *cb)
 {
 	struct nfs4_client *clp = cb->cb_clp;
 	struct nfsd4_session *session = cb->cb_ses;
-	bool ret = true;
+	bool ret = false;
 
 	if (!clp->cl_minorversion) {
 		/*
@@ -1418,11 +1423,16 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
 		 * (sequence ID, cached reply) MUST NOT change.
 		 */
 		++session->se_cb_seq_nr[cb->cb_held_slot];
+		ret = true;
 		break;
 	case -ESERVERFAULT:
+		/*
+		 * Call succeeded, but CB_SEQUENCE reply failed sanity checks.
+		 * The client has gone insane. Mark the BC faulty, since there
+		 * isn't much else we can do.
+		 */
 		++session->se_cb_seq_nr[cb->cb_held_slot];
 		nfsd4_mark_cb_fault(cb->cb_clp);
-		ret = false;
 		break;
 	case 1:
 		/*
@@ -1433,39 +1443,67 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
 		 */
 		fallthrough;
 	case -NFS4ERR_BADSESSION:
-		nfsd4_mark_cb_fault(cb->cb_clp);
-		ret = false;
+		/*
+		 * If the session hasn't changed, mark it faulty. Restart
+		 * the call.
+		 */
+		if (!cb_session_changed(cb))
+			nfsd4_mark_cb_fault(cb->cb_clp);
 		goto need_restart;
 	case -NFS4ERR_DELAY:
+		/*
+		 * If the rpc_clnt is being torn down, then we must restart
+		 * the call from scratch.
+		 */
+		if (RPC_SIGNALLED(task))
+			goto need_restart;
 		cb->cb_seq_status = 1;
-		if (!rpc_restart_call(task))
-			goto out;
-
-		rpc_delay(task, 2 * HZ);
+		if (rpc_restart_call(task))
+			rpc_delay(task, 2 * HZ);
 		return false;
-	case -NFS4ERR_BADSLOT:
-		goto retry_nowait;
 	case -NFS4ERR_SEQ_MISORDERED:
-		if (session->se_cb_seq_nr[cb->cb_held_slot] != 1) {
-			session->se_cb_seq_nr[cb->cb_held_slot] = 1;
+		/*
+		 * Reattempt once with seq_nr 1. If that fails, treat this
+		 * like BADSLOT.
+		 */
+		if (!cb_session_changed(cb)) {
+			if (session->se_cb_seq_nr[cb->cb_held_slot] != 1) {
+				session->se_cb_seq_nr[cb->cb_held_slot] = 1;
+				goto retry_nowait;
+			}
+		}
+		fallthrough;
+	case -NFS4ERR_BADSLOT:
+		/*
+		 * BADSLOT means that the client and server are out of sync
+		 * on the BC parameters. In this case, we want to mark the
+		 * backchannel faulty and then try it again, but _leak_ the
+		 * slot so no one uses it. If the callback session has
+		 * changed since then though, don't mark it.
+		 */
+		if (!cb_session_changed(cb)) {
+			nfsd4_mark_cb_fault(cb->cb_clp);
+			cb->cb_held_slot = -1;
 			goto retry_nowait;
 		}
-		break;
+		goto need_restart;
 	default:
 		nfsd4_mark_cb_fault(cb->cb_clp);
 	}
 	trace_nfsd_cb_free_slot(task, cb);
 	nfsd41_cb_release_slot(cb);
-
-	if (RPC_SIGNALLED(task))
-		goto need_restart;
-out:
 	return ret;
 retry_nowait:
-	if (rpc_restart_call_prepare(task))
-		ret = false;
-	goto out;
+	/*
+	 * RPC_SIGNALLED() means that the rpc_client is being torn down and
+	 * (possibly) recreated. Restart the call completely in that case.
+	 */
+	if (!RPC_SIGNALLED(task)) {
+		rpc_restart_call_prepare(task);
+		return false;
+	}
 need_restart:
+	nfsd41_cb_release_slot(cb);
 	if (!test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
 		trace_nfsd_cb_restart(clp, cb);
 		task->tk_status = 0;

-- 
2.48.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 5/7] nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault()
  2025-01-29 13:39 [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
                   ` (3 preceding siblings ...)
  2025-01-29 13:39 ` [PATCH v2 4/7] nfsd: overhaul CB_SEQUENCE error handling Jeff Layton
@ 2025-01-29 13:39 ` Jeff Layton
  2025-01-29 13:39 ` [PATCH v2 6/7] nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done() Jeff Layton
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Jeff Layton @ 2025-01-29 13:39 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, linux-kernel, Jeff Layton

This isn't needed.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4callback.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index e70a7a03933b1f8a48d31b0ef226c3f157d14ed1..4d4e0c95a36724027a63b53487fd36260a9ab1cd 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -45,9 +45,6 @@
 #include "nfs4xdr_gen.h"
 
 #define NFSDDBG_FACILITY                NFSDDBG_PROC
-
-static void nfsd4_mark_cb_fault(struct nfs4_client *clp);
-
 #define NFSPROC4_CB_NULL 0
 #define NFSPROC4_CB_COMPOUND 1
 

-- 
2.48.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 6/7] nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done()
  2025-01-29 13:39 [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
                   ` (4 preceding siblings ...)
  2025-01-29 13:39 ` [PATCH v2 5/7] nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault() Jeff Layton
@ 2025-01-29 13:39 ` Jeff Layton
  2025-01-29 13:40 ` [PATCH v2 7/7] sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return Jeff Layton
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Jeff Layton @ 2025-01-29 13:39 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, linux-kernel, Jeff Layton

It's a bit strange to call nfsd4_cb_sequence_done() on a callback with no
CB_SEQUENCE. Lift the handling of restarting a call into a new helper,
and move the handling of NFSv4.0 into nfsd4_cb_done().

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4callback.c | 53 ++++++++++++++++++++++++++------------------------
 1 file changed, 28 insertions(+), 25 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 4d4e0c95a36724027a63b53487fd36260a9ab1cd..2cfff984b42f0ef7fe885d57d57b3d318ed966e0 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1383,27 +1383,22 @@ static bool cb_session_changed(struct nfsd4_callback *cb)
 	return cb->cb_ses != rcu_access_pointer(cb->cb_clp->cl_cb_session);
 }
 
-static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback *cb)
+static void restart_callback(struct rpc_task *task, struct nfsd4_callback *cb)
 {
 	struct nfs4_client *clp = cb->cb_clp;
-	struct nfsd4_session *session = cb->cb_ses;
-	bool ret = false;
 
-	if (!clp->cl_minorversion) {
-		/*
-		 * If the backchannel connection was shut down while this
-		 * task was queued, we need to resubmit it after setting up
-		 * a new backchannel connection.
-		 *
-		 * Note that if we lost our callback connection permanently
-		 * the submission code will error out, so we don't need to
-		 * handle that case here.
-		 */
-		if (RPC_SIGNALLED(task))
-			goto need_restart;
-
-		return true;
+	nfsd41_cb_release_slot(cb);
+	if (!test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
+		trace_nfsd_cb_restart(clp, cb);
+		task->tk_status = 0;
+		cb->cb_need_restart = true;
 	}
+}
+
+static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback *cb)
+{
+	struct nfsd4_session *session = cb->cb_ses;
+	bool ret = false;
 
 	if (cb->cb_held_slot < 0)
 		goto need_restart;
@@ -1493,19 +1488,14 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
 retry_nowait:
 	/*
 	 * RPC_SIGNALLED() means that the rpc_client is being torn down and
-	 * (possibly) recreated. Restart the call completely in that case.
+	 * (possibly) recreated. Restart the whole callback in that case.
 	 */
 	if (!RPC_SIGNALLED(task)) {
 		rpc_restart_call_prepare(task);
 		return false;
 	}
 need_restart:
-	nfsd41_cb_release_slot(cb);
-	if (!test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
-		trace_nfsd_cb_restart(clp, cb);
-		task->tk_status = 0;
-		cb->cb_need_restart = true;
-	}
+	restart_callback(task, cb);
 	return false;
 }
 
@@ -1516,8 +1506,21 @@ static void nfsd4_cb_done(struct rpc_task *task, void *calldata)
 
 	trace_nfsd_cb_rpc_done(clp);
 
-	if (!nfsd4_cb_sequence_done(task, cb))
+	if (!clp->cl_minorversion) {
+		/*
+		 * If the backchannel connection was shut down while this
+		 * task was queued, we need to resubmit it after setting up
+		 * a new backchannel connection.
+		 *
+		 * Note that if we lost our callback connection permanently
+		 * the submission code will error out, so we don't need to
+		 * handle that case here.
+		 */
+		if (RPC_SIGNALLED(task))
+			restart_callback(task, cb);
+	} else if (!nfsd4_cb_sequence_done(task, cb)) {
 		return;
+	}
 
 	if (cb->cb_status) {
 		WARN_ONCE(task->tk_status,

-- 
2.48.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 7/7] sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return
  2025-01-29 13:39 [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
                   ` (5 preceding siblings ...)
  2025-01-29 13:39 ` [PATCH v2 6/7] nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done() Jeff Layton
@ 2025-01-29 13:40 ` Jeff Layton
  2025-01-29 14:21 ` [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups J. Bruce Fields
  2025-01-29 14:22 ` Chuck Lever
  8 siblings, 0 replies; 17+ messages in thread
From: Jeff Layton @ 2025-01-29 13:40 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, linux-kernel, Jeff Layton

These functions always return 1. Make them void return and fix up the
places that check the return code.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfs/nfs4proc.c           | 12 +++++-------
 fs/nfsd/nfs4callback.c      |  4 ++--
 include/linux/sunrpc/clnt.h |  4 ++--
 net/sunrpc/clnt.c           |  7 +++----
 4 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 405f17e6e0b45b26cebae06c5bbe932895af9a56..cda20bfeca56db1ef8c51e524d08908b93bfeba6 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -968,15 +968,13 @@ static int nfs41_sequence_process(struct rpc_task *task,
 retry_new_seq:
 	++slot->seq_nr;
 retry_nowait:
-	if (rpc_restart_call_prepare(task)) {
-		nfs41_sequence_free_slot(res);
-		task->tk_status = 0;
-		ret = 0;
-	}
+	rpc_restart_call_prepare(task);
+	nfs41_sequence_free_slot(res);
+	task->tk_status = 0;
+	ret = 0;
 	goto out;
 out_retry:
-	if (!rpc_restart_call(task))
-		goto out;
+	rpc_restart_call(task);
 	rpc_delay(task, NFS4_POLL_RETRY_MAX);
 	return 0;
 }
diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 2cfff984b42f0ef7fe885d57d57b3d318ed966e0..d101f18012bfe89169d8218b46183cab6c06d1bf 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1450,8 +1450,8 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
 		if (RPC_SIGNALLED(task))
 			goto need_restart;
 		cb->cb_seq_status = 1;
-		if (rpc_restart_call(task))
-			rpc_delay(task, 2 * HZ);
+		rpc_restart_call(task);
+		rpc_delay(task, 2 * HZ);
 		return false;
 	case -NFS4ERR_SEQ_MISORDERED:
 		/*
diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
index 5321585c778fcc1fef0e0420cb481786c02a7aac..e56f15c97fa24c735090c21c51ef312bfd877cfd 100644
--- a/include/linux/sunrpc/clnt.h
+++ b/include/linux/sunrpc/clnt.h
@@ -213,8 +213,8 @@ int		rpc_call_sync(struct rpc_clnt *clnt,
 			      const struct rpc_message *msg, int flags);
 struct rpc_task *rpc_call_null(struct rpc_clnt *clnt, struct rpc_cred *cred,
 			       int flags);
-int		rpc_restart_call_prepare(struct rpc_task *);
-int		rpc_restart_call(struct rpc_task *);
+void		rpc_restart_call_prepare(struct rpc_task *task);
+void		rpc_restart_call(struct rpc_task *task);
 void		rpc_setbufsize(struct rpc_clnt *, unsigned int, unsigned int);
 struct net *	rpc_net_ns(struct rpc_clnt *);
 size_t		rpc_max_payload(struct rpc_clnt *);
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 0090162ee8c350568c91f1bcd951675ac3ae141c..3d2989120599ccee32e8827b1790d4be7d7a565a 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1670,20 +1670,19 @@ void rpc_force_rebind(struct rpc_clnt *clnt)
 }
 EXPORT_SYMBOL_GPL(rpc_force_rebind);
 
-static int
+static void
 __rpc_restart_call(struct rpc_task *task, void (*action)(struct rpc_task *))
 {
 	task->tk_status = 0;
 	task->tk_rpc_status = 0;
 	task->tk_action = action;
-	return 1;
 }
 
 /*
  * Restart an (async) RPC call. Usually called from within the
  * exit handler.
  */
-int
+void
 rpc_restart_call(struct rpc_task *task)
 {
 	return __rpc_restart_call(task, call_start);
@@ -1694,7 +1693,7 @@ EXPORT_SYMBOL_GPL(rpc_restart_call);
  * Restart an (async) RPC call from the call_prepare state.
  * Usually called from within the exit handler.
  */
-int
+void
 rpc_restart_call_prepare(struct rpc_task *task)
 {
 	if (task->tk_ops->rpc_call_prepare != NULL)

-- 
2.48.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups
  2025-01-29 13:39 [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
                   ` (6 preceding siblings ...)
  2025-01-29 13:40 ` [PATCH v2 7/7] sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return Jeff Layton
@ 2025-01-29 14:21 ` J. Bruce Fields
  2025-01-29 14:27   ` Jeff Layton
  2025-01-29 14:22 ` Chuck Lever
  8 siblings, 1 reply; 17+ messages in thread
From: J. Bruce Fields @ 2025-01-29 14:21 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Kinglong Mee, Trond Myklebust, Anna Schumaker, linux-nfs,
	linux-kernel

On Wed, Jan 29, 2025 at 08:39:53AM -0500, Jeff Layton wrote:
> While looking over the CB_SEQUENCE error handling, I discovered that
> callbacks don't hold a reference to a session, and the
> clp->cl_cb_session could easily change between request and response.
> If that happens at an inopportune time, there could be UAFs or weird
> slot/sequence handling problems.

Nobody should place too much faith in my understanding of how any of
this works at this point, but....  My vague memory is that a lot of
things are serialized simply by being run only on the cl_callback_wq.
Modifying clp->cl_cb_session is such a thing.

--b.

> This series changes the nfsd4_session to be RCU-freed, and then adds a
> new method of session refcounting that is compatible with the old.
> nfsd4_callback RPCs will now hold a lightweight reference to the session
> in addition to the slot. Then, all of the callback handling is switched
> to use that session instead of dereferencing clp->cb_cb_session.
> I've also reworked the error handling in nfsd4_cb_sequence_done()
> based on review comments, and lifted the v4.0 handing out of that
> function.
> 
> This passes pynfs, nfstests, and fstests for me, but I'm not sure how
> much any of that stresses the backchannel's error handling.
> 
> These should probably go in via Chuck's tree, but the last patch touches
> some NFS cnd sunrpc client code, so it'd be good to have R-b's or A-b's
> from Trond and/or Anna on that one.
> 
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
> Changes in v2:
> - make nfsd4_session be RCU-freed
> - change code to keep reference to session over callback RPCs
> - rework error handling in nfsd4_cb_sequence_done()
> - move NFSv4.0 handling out of nfsd4_cb_sequence_done()
> - Link to v1: https://lore.kernel.org/r/20250123-nfsd-6-14-v1-0-c1137a4fa2ae@kernel.org
> 
> ---
> Jeff Layton (7):
>       nfsd: add routines to get/put session references for callbacks
>       nfsd: make clp->cl_cb_session be an RCU managed pointer
>       nfsd: add a cb_ses pointer to nfsd4_callback and use it instead of clp->cb_cb_session
>       nfsd: overhaul CB_SEQUENCE error handling
>       nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault()
>       nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done()
>       sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return
> 
>  fs/nfs/nfs4proc.c           |  12 ++-
>  fs/nfsd/nfs4callback.c      | 212 ++++++++++++++++++++++++++++++++------------
>  fs/nfsd/nfs4state.c         |  43 ++++++++-
>  fs/nfsd/state.h             |   6 +-
>  fs/nfsd/trace.h             |   6 +-
>  include/linux/sunrpc/clnt.h |   4 +-
>  net/sunrpc/clnt.c           |   7 +-
>  7 files changed, 210 insertions(+), 80 deletions(-)
> ---
> base-commit: a05af3c6103b703d1d38d8180b3ebbe0a03c2f07
> change-id: 20250123-nfsd-6-14-b0797e385dc0
> 
> Best regards,
> -- 
> Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups
  2025-01-29 13:39 [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
                   ` (7 preceding siblings ...)
  2025-01-29 14:21 ` [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups J. Bruce Fields
@ 2025-01-29 14:22 ` Chuck Lever
  2025-01-29 14:39   ` Jeff Layton
  8 siblings, 1 reply; 17+ messages in thread
From: Chuck Lever @ 2025-01-29 14:22 UTC (permalink / raw)
  To: Jeff Layton, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, linux-kernel

On 1/29/25 8:39 AM, Jeff Layton wrote:
> While looking over the CB_SEQUENCE error handling, I discovered that
> callbacks don't hold a reference to a session, and the
> clp->cl_cb_session could easily change between request and response.
> If that happens at an inopportune time, there could be UAFs or weird
> slot/sequence handling problems.
> 
> This series changes the nfsd4_session to be RCU-freed, and then adds a
> new method of session refcounting that is compatible with the old.
> nfsd4_callback RPCs will now hold a lightweight reference to the session
> in addition to the slot. Then, all of the callback handling is switched
> to use that session instead of dereferencing clp->cb_cb_session.
> I've also reworked the error handling in nfsd4_cb_sequence_done()
> based on review comments, and lifted the v4.0 handing out of that
> function.
> 
> This passes pynfs, nfstests, and fstests for me, but I'm not sure how
> much any of that stresses the backchannel's error handling.
> 
> These should probably go in via Chuck's tree, but the last patch touches
> some NFS cnd sunrpc client code, so it'd be good to have R-b's or A-b's
> from Trond and/or Anna on that one.

A few initial reactions as I get to know this new revision.

- I have no objection to 7/7, but it does seem a bit out of place in
   this series. Maybe hold it back and send it separately after this
   series goes in?

- The fact that the session can be replaced while a callback operation
   is pending suggests that, IIUC, decode_cb_sequence() sanity checking
   will fail in such cases, and it's not because of a bug in the client's
   callback server. Or maybe I'm overthinking it - that is exactly what
   you are trying to prevent?

- In 1/7, the kdoc comment for "get" should enumerate the return values
   and their meanings.

- cb_session_changed => nfsd4_cb_session_changed.

- I'm still not convinced it's wise to bump the slot number in the
   ESERVERFAULT case.

- IMO the cb_sequence_done rework should rename "need_restart" to
   "need_requeue" or just "requeue" -- there is a call to
   rpc_restart_call_prepare() here that is a little confusing and
   could do with some disambiguation.


> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
> Changes in v2:
> - make nfsd4_session be RCU-freed
> - change code to keep reference to session over callback RPCs
> - rework error handling in nfsd4_cb_sequence_done()
> - move NFSv4.0 handling out of nfsd4_cb_sequence_done()
> - Link to v1: https://lore.kernel.org/r/20250123-nfsd-6-14-v1-0-c1137a4fa2ae@kernel.org
> 
> ---
> Jeff Layton (7):
>        nfsd: add routines to get/put session references for callbacks
>        nfsd: make clp->cl_cb_session be an RCU managed pointer
>        nfsd: add a cb_ses pointer to nfsd4_callback and use it instead of clp->cb_cb_session
>        nfsd: overhaul CB_SEQUENCE error handling
>        nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault()
>        nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done()
>        sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return
> 
>   fs/nfs/nfs4proc.c           |  12 ++-
>   fs/nfsd/nfs4callback.c      | 212 ++++++++++++++++++++++++++++++++------------
>   fs/nfsd/nfs4state.c         |  43 ++++++++-
>   fs/nfsd/state.h             |   6 +-
>   fs/nfsd/trace.h             |   6 +-
>   include/linux/sunrpc/clnt.h |   4 +-
>   net/sunrpc/clnt.c           |   7 +-
>   7 files changed, 210 insertions(+), 80 deletions(-)
> ---
> base-commit: a05af3c6103b703d1d38d8180b3ebbe0a03c2f07
> change-id: 20250123-nfsd-6-14-b0797e385dc0
> 
> Best regards,


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups
  2025-01-29 14:21 ` [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups J. Bruce Fields
@ 2025-01-29 14:27   ` Jeff Layton
  2025-01-29 14:40     ` J. Bruce Fields
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff Layton @ 2025-01-29 14:27 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Kinglong Mee, Trond Myklebust, Anna Schumaker, linux-nfs,
	linux-kernel

On Wed, 2025-01-29 at 09:21 -0500, J. Bruce Fields wrote:
> On Wed, Jan 29, 2025 at 08:39:53AM -0500, Jeff Layton wrote:
> > While looking over the CB_SEQUENCE error handling, I discovered that
> > callbacks don't hold a reference to a session, and the
> > clp->cl_cb_session could easily change between request and response.
> > If that happens at an inopportune time, there could be UAFs or weird
> > slot/sequence handling problems.
> 
> Nobody should place too much faith in my understanding of how any of
> this works at this point, but....  My vague memory is that a lot of
> things are serialized simply by being run only on the cl_callback_wq.
> Modifying clp->cl_cb_session is such a thing.
> 
> 

It is, but that doesn't save us here. The workqueue is just there to
submit jobs to the RPC client. Once that happens they are run via
rpciod's workqueue (and in parallel with one another since they're
async RPC calls).

So, it's possible that while we're waiting for a response from one
callback, another is submitted, and that workqueue job changes the
clp->cl_cb_session.

Thanks for taking a look, Bruce!

> 
> > This series changes the nfsd4_session to be RCU-freed, and then adds a
> > new method of session refcounting that is compatible with the old.
> > nfsd4_callback RPCs will now hold a lightweight reference to the session
> > in addition to the slot. Then, all of the callback handling is switched
> > to use that session instead of dereferencing clp->cb_cb_session.
> > I've also reworked the error handling in nfsd4_cb_sequence_done()
> > based on review comments, and lifted the v4.0 handing out of that
> > function.
> > 
> > This passes pynfs, nfstests, and fstests for me, but I'm not sure how
> > much any of that stresses the backchannel's error handling.
> > 
> > These should probably go in via Chuck's tree, but the last patch touches
> > some NFS cnd sunrpc client code, so it'd be good to have R-b's or A-b's
> > from Trond and/or Anna on that one.
> > 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> > Changes in v2:
> > - make nfsd4_session be RCU-freed
> > - change code to keep reference to session over callback RPCs
> > - rework error handling in nfsd4_cb_sequence_done()
> > - move NFSv4.0 handling out of nfsd4_cb_sequence_done()
> > - Link to v1: https://lore.kernel.org/r/20250123-nfsd-6-14-v1-0-c1137a4fa2ae@kernel.org
> > 
> > ---
> > Jeff Layton (7):
> >       nfsd: add routines to get/put session references for callbacks
> >       nfsd: make clp->cl_cb_session be an RCU managed pointer
> >       nfsd: add a cb_ses pointer to nfsd4_callback and use it instead of clp->cb_cb_session
> >       nfsd: overhaul CB_SEQUENCE error handling
> >       nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault()
> >       nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done()
> >       sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return
> > 
> >  fs/nfs/nfs4proc.c           |  12 ++-
> >  fs/nfsd/nfs4callback.c      | 212 ++++++++++++++++++++++++++++++++------------
> >  fs/nfsd/nfs4state.c         |  43 ++++++++-
> >  fs/nfsd/state.h             |   6 +-
> >  fs/nfsd/trace.h             |   6 +-
> >  include/linux/sunrpc/clnt.h |   4 +-
> >  net/sunrpc/clnt.c           |   7 +-
> >  7 files changed, 210 insertions(+), 80 deletions(-)
> > ---
> > base-commit: a05af3c6103b703d1d38d8180b3ebbe0a03c2f07
> > change-id: 20250123-nfsd-6-14-b0797e385dc0
> > 
> > Best regards,
> > -- 
> > Jeff Layton <jlayton@kernel.org>

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups
  2025-01-29 14:22 ` Chuck Lever
@ 2025-01-29 14:39   ` Jeff Layton
  2025-01-29 14:42     ` Chuck Lever
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff Layton @ 2025-01-29 14:39 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, linux-kernel

On Wed, 2025-01-29 at 09:22 -0500, Chuck Lever wrote:
> On 1/29/25 8:39 AM, Jeff Layton wrote:
> > While looking over the CB_SEQUENCE error handling, I discovered that
> > callbacks don't hold a reference to a session, and the
> > clp->cl_cb_session could easily change between request and response.
> > If that happens at an inopportune time, there could be UAFs or weird
> > slot/sequence handling problems.
> > 
> > This series changes the nfsd4_session to be RCU-freed, and then adds a
> > new method of session refcounting that is compatible with the old.
> > nfsd4_callback RPCs will now hold a lightweight reference to the session
> > in addition to the slot. Then, all of the callback handling is switched
> > to use that session instead of dereferencing clp->cb_cb_session.
> > I've also reworked the error handling in nfsd4_cb_sequence_done()
> > based on review comments, and lifted the v4.0 handing out of that
> > function.
> > 
> > This passes pynfs, nfstests, and fstests for me, but I'm not sure how
> > much any of that stresses the backchannel's error handling.
> > 
> > These should probably go in via Chuck's tree, but the last patch touches
> > some NFS cnd sunrpc client code, so it'd be good to have R-b's or A-b's
> > from Trond and/or Anna on that one.
> 
> A few initial reactions as I get to know this new revision.
> 
> - I have no objection to 7/7, but it does seem a bit out of place in
>    this series. Maybe hold it back and send it separately after this
>    series goes in?
> 
> - The fact that the session can be replaced while a callback operation
>    is pending suggests that, IIUC, decode_cb_sequence() sanity checking
>    will fail in such cases, and it's not because of a bug in the client's
>    callback server. Or maybe I'm overthinking it - that is exactly what
>    you are trying to prevent?
> 

That's the best-case scenario, but callbacks can run at any time. If
this happens at the wrong time this could crash or cause more subtle
problems than just a spurious ESERVERFAULT. IOW, we could pass
decode_cb_sequence(), then the pointer changes and then
nfsd4_cb_sequence_done() ends up working with a different session.

> - In 1/7, the kdoc comment for "get" should enumerate the return values
>    and their meanings.
> 

Ack

> - cb_session_changed => nfsd4_cb_session_changed.
> 

Ack

> - I'm still not convinced it's wise to bump the slot number in the
>    ESERVERFAULT case.
> 

It's debatable for sure. The client _did_ respond with NFS4_OK in this
case, but it is a bit sketchy since something else didn't match. I'm
fine with removing that seq bump if you prefer.

> - IMO the cb_sequence_done rework should rename "need_restart" to
>    "need_requeue" or just "requeue" -- there is a call to
>    rpc_restart_call_prepare() here that is a little confusing and
>    could do with some disambiguation.
> 

Good point. I'll change that too.

Thanks for the review!

> 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> > Changes in v2:
> > - make nfsd4_session be RCU-freed
> > - change code to keep reference to session over callback RPCs
> > - rework error handling in nfsd4_cb_sequence_done()
> > - move NFSv4.0 handling out of nfsd4_cb_sequence_done()
> > - Link to v1: https://lore.kernel.org/r/20250123-nfsd-6-14-v1-0-c1137a4fa2ae@kernel.org
> > 
> > ---
> > Jeff Layton (7):
> >        nfsd: add routines to get/put session references for callbacks
> >        nfsd: make clp->cl_cb_session be an RCU managed pointer
> >        nfsd: add a cb_ses pointer to nfsd4_callback and use it instead of clp->cb_cb_session
> >        nfsd: overhaul CB_SEQUENCE error handling
> >        nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault()
> >        nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done()
> >        sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return
> > 
> >   fs/nfs/nfs4proc.c           |  12 ++-
> >   fs/nfsd/nfs4callback.c      | 212 ++++++++++++++++++++++++++++++++------------
> >   fs/nfsd/nfs4state.c         |  43 ++++++++-
> >   fs/nfsd/state.h             |   6 +-
> >   fs/nfsd/trace.h             |   6 +-
> >   include/linux/sunrpc/clnt.h |   4 +-
> >   net/sunrpc/clnt.c           |   7 +-
> >   7 files changed, 210 insertions(+), 80 deletions(-)
> > ---
> > base-commit: a05af3c6103b703d1d38d8180b3ebbe0a03c2f07
> > change-id: 20250123-nfsd-6-14-b0797e385dc0
> > 
> > Best regards,
> 
> 

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups
  2025-01-29 14:27   ` Jeff Layton
@ 2025-01-29 14:40     ` J. Bruce Fields
  2025-01-29 15:01       ` Jeff Layton
  0 siblings, 1 reply; 17+ messages in thread
From: J. Bruce Fields @ 2025-01-29 14:40 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Kinglong Mee, Trond Myklebust, Anna Schumaker, linux-nfs,
	linux-kernel

On Wed, Jan 29, 2025 at 09:27:02AM -0500, Jeff Layton wrote:
> On Wed, 2025-01-29 at 09:21 -0500, J. Bruce Fields wrote:
> > On Wed, Jan 29, 2025 at 08:39:53AM -0500, Jeff Layton wrote:
> > > While looking over the CB_SEQUENCE error handling, I discovered that
> > > callbacks don't hold a reference to a session, and the
> > > clp->cl_cb_session could easily change between request and response.
> > > If that happens at an inopportune time, there could be UAFs or weird
> > > slot/sequence handling problems.
> > 
> > Nobody should place too much faith in my understanding of how any of
> > this works at this point, but....  My vague memory is that a lot of
> > things are serialized simply by being run only on the cl_callback_wq.
> > Modifying clp->cl_cb_session is such a thing.
> 
> It is, but that doesn't save us here. The workqueue is just there to
> submit jobs to the RPC client. Once that happens they are run via
> rpciod's workqueue (and in parallel with one another since they're
> async RPC calls).
> 
> So, it's possible that while we're waiting for a response from one
> callback, another is submitted, and that workqueue job changes the
> clp->cl_cb_session.

I think it calls rpc_shutdown_client() before changing
clp->cl_cb_session.

(Though I'm not sure whether rpc_shutdown_client guarantees that all rpc
processing for the client is completed before returning?)

--b.

> 
> Thanks for taking a look, Bruce!
> 
> > 
> > > This series changes the nfsd4_session to be RCU-freed, and then adds a
> > > new method of session refcounting that is compatible with the old.
> > > nfsd4_callback RPCs will now hold a lightweight reference to the session
> > > in addition to the slot. Then, all of the callback handling is switched
> > > to use that session instead of dereferencing clp->cb_cb_session.
> > > I've also reworked the error handling in nfsd4_cb_sequence_done()
> > > based on review comments, and lifted the v4.0 handing out of that
> > > function.
> > > 
> > > This passes pynfs, nfstests, and fstests for me, but I'm not sure how
> > > much any of that stresses the backchannel's error handling.
> > > 
> > > These should probably go in via Chuck's tree, but the last patch touches
> > > some NFS cnd sunrpc client code, so it'd be good to have R-b's or A-b's
> > > from Trond and/or Anna on that one.
> > > 
> > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > ---
> > > Changes in v2:
> > > - make nfsd4_session be RCU-freed
> > > - change code to keep reference to session over callback RPCs
> > > - rework error handling in nfsd4_cb_sequence_done()
> > > - move NFSv4.0 handling out of nfsd4_cb_sequence_done()
> > > - Link to v1: https://lore.kernel.org/r/20250123-nfsd-6-14-v1-0-c1137a4fa2ae@kernel.org
> > > 
> > > ---
> > > Jeff Layton (7):
> > >       nfsd: add routines to get/put session references for callbacks
> > >       nfsd: make clp->cl_cb_session be an RCU managed pointer
> > >       nfsd: add a cb_ses pointer to nfsd4_callback and use it instead of clp->cb_cb_session
> > >       nfsd: overhaul CB_SEQUENCE error handling
> > >       nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault()
> > >       nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done()
> > >       sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return
> > > 
> > >  fs/nfs/nfs4proc.c           |  12 ++-
> > >  fs/nfsd/nfs4callback.c      | 212 ++++++++++++++++++++++++++++++++------------
> > >  fs/nfsd/nfs4state.c         |  43 ++++++++-
> > >  fs/nfsd/state.h             |   6 +-
> > >  fs/nfsd/trace.h             |   6 +-
> > >  include/linux/sunrpc/clnt.h |   4 +-
> > >  net/sunrpc/clnt.c           |   7 +-
> > >  7 files changed, 210 insertions(+), 80 deletions(-)
> > > ---
> > > base-commit: a05af3c6103b703d1d38d8180b3ebbe0a03c2f07
> > > change-id: 20250123-nfsd-6-14-b0797e385dc0
> > > 
> > > Best regards,
> > > -- 
> > > Jeff Layton <jlayton@kernel.org>
> 
> -- 
> Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups
  2025-01-29 14:39   ` Jeff Layton
@ 2025-01-29 14:42     ` Chuck Lever
  0 siblings, 0 replies; 17+ messages in thread
From: Chuck Lever @ 2025-01-29 14:42 UTC (permalink / raw)
  To: Jeff Layton, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	J. Bruce Fields, Kinglong Mee, Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, linux-kernel

On 1/29/25 9:39 AM, Jeff Layton wrote:
> On Wed, 2025-01-29 at 09:22 -0500, Chuck Lever wrote:
>> On 1/29/25 8:39 AM, Jeff Layton wrote:
>>> While looking over the CB_SEQUENCE error handling, I discovered that
>>> callbacks don't hold a reference to a session, and the
>>> clp->cl_cb_session could easily change between request and response.
>>> If that happens at an inopportune time, there could be UAFs or weird
>>> slot/sequence handling problems.
>>>
>>> This series changes the nfsd4_session to be RCU-freed, and then adds a
>>> new method of session refcounting that is compatible with the old.
>>> nfsd4_callback RPCs will now hold a lightweight reference to the session
>>> in addition to the slot. Then, all of the callback handling is switched
>>> to use that session instead of dereferencing clp->cb_cb_session.
>>> I've also reworked the error handling in nfsd4_cb_sequence_done()
>>> based on review comments, and lifted the v4.0 handing out of that
>>> function.
>>>
>>> This passes pynfs, nfstests, and fstests for me, but I'm not sure how
>>> much any of that stresses the backchannel's error handling.
>>>
>>> These should probably go in via Chuck's tree, but the last patch touches
>>> some NFS cnd sunrpc client code, so it'd be good to have R-b's or A-b's
>>> from Trond and/or Anna on that one.
>>
>> A few initial reactions as I get to know this new revision.
>>
>> - I have no objection to 7/7, but it does seem a bit out of place in
>>     this series. Maybe hold it back and send it separately after this
>>     series goes in?
>>
>> - The fact that the session can be replaced while a callback operation
>>     is pending suggests that, IIUC, decode_cb_sequence() sanity checking
>>     will fail in such cases, and it's not because of a bug in the client's
>>     callback server. Or maybe I'm overthinking it - that is exactly what
>>     you are trying to prevent?
>>
> 
> That's the best-case scenario, but callbacks can run at any time. If
> this happens at the wrong time this could crash or cause more subtle
> problems than just a spurious ESERVERFAULT. IOW, we could pass
> decode_cb_sequence(), then the pointer changes and then
> nfsd4_cb_sequence_done() ends up working with a different session.
> 
>> - In 1/7, the kdoc comment for "get" should enumerate the return values
>>     and their meanings.
>>
> 
> Ack
> 
>> - cb_session_changed => nfsd4_cb_session_changed.
>>
> 
> Ack
> 
>> - I'm still not convinced it's wise to bump the slot number in the
>>     ESERVERFAULT case.
>>
> 
> It's debatable for sure. The client _did_ respond with NFS4_OK in this
> case, but it is a bit sketchy since something else didn't match. I'm
> fine with removing that seq bump if you prefer.

I'd remove it: if the session/slot/seq number don't match, the NFS4_OK
is pretty meaningless.

Flag a session fault, and requeue the RPC.


>> - IMO the cb_sequence_done rework should rename "need_restart" to
>>     "need_requeue" or just "requeue" -- there is a call to
>>     rpc_restart_call_prepare() here that is a little confusing and
>>     could do with some disambiguation.
>>
> 
> Good point. I'll change that too.
> 
> Thanks for the review!
> 
>>
>>> Signed-off-by: Jeff Layton <jlayton@kernel.org>
>>> ---
>>> Changes in v2:
>>> - make nfsd4_session be RCU-freed
>>> - change code to keep reference to session over callback RPCs
>>> - rework error handling in nfsd4_cb_sequence_done()
>>> - move NFSv4.0 handling out of nfsd4_cb_sequence_done()
>>> - Link to v1: https://lore.kernel.org/r/20250123-nfsd-6-14-v1-0-c1137a4fa2ae@kernel.org
>>>
>>> ---
>>> Jeff Layton (7):
>>>         nfsd: add routines to get/put session references for callbacks
>>>         nfsd: make clp->cl_cb_session be an RCU managed pointer
>>>         nfsd: add a cb_ses pointer to nfsd4_callback and use it instead of clp->cb_cb_session
>>>         nfsd: overhaul CB_SEQUENCE error handling
>>>         nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault()
>>>         nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done()
>>>         sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return
>>>
>>>    fs/nfs/nfs4proc.c           |  12 ++-
>>>    fs/nfsd/nfs4callback.c      | 212 ++++++++++++++++++++++++++++++++------------
>>>    fs/nfsd/nfs4state.c         |  43 ++++++++-
>>>    fs/nfsd/state.h             |   6 +-
>>>    fs/nfsd/trace.h             |   6 +-
>>>    include/linux/sunrpc/clnt.h |   4 +-
>>>    net/sunrpc/clnt.c           |   7 +-
>>>    7 files changed, 210 insertions(+), 80 deletions(-)
>>> ---
>>> base-commit: a05af3c6103b703d1d38d8180b3ebbe0a03c2f07
>>> change-id: 20250123-nfsd-6-14-b0797e385dc0
>>>
>>> Best regards,
>>
>>
> 


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups
  2025-01-29 14:40     ` J. Bruce Fields
@ 2025-01-29 15:01       ` Jeff Layton
  2025-01-29 15:09         ` Chuck Lever
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff Layton @ 2025-01-29 15:01 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Kinglong Mee, Trond Myklebust, Anna Schumaker, linux-nfs,
	linux-kernel

On Wed, 2025-01-29 at 09:40 -0500, J. Bruce Fields wrote:
> On Wed, Jan 29, 2025 at 09:27:02AM -0500, Jeff Layton wrote:
> > On Wed, 2025-01-29 at 09:21 -0500, J. Bruce Fields wrote:
> > > On Wed, Jan 29, 2025 at 08:39:53AM -0500, Jeff Layton wrote:
> > > > While looking over the CB_SEQUENCE error handling, I discovered that
> > > > callbacks don't hold a reference to a session, and the
> > > > clp->cl_cb_session could easily change between request and response.
> > > > If that happens at an inopportune time, there could be UAFs or weird
> > > > slot/sequence handling problems.
> > > 
> > > Nobody should place too much faith in my understanding of how any of
> > > this works at this point, but....  My vague memory is that a lot of
> > > things are serialized simply by being run only on the cl_callback_wq.
> > > Modifying clp->cl_cb_session is such a thing.
> > 
> > It is, but that doesn't save us here. The workqueue is just there to
> > submit jobs to the RPC client. Once that happens they are run via
> > rpciod's workqueue (and in parallel with one another since they're
> > async RPC calls).
> > 
> > So, it's possible that while we're waiting for a response from one
> > callback, another is submitted, and that workqueue job changes the
> > clp->cl_cb_session.
> 
> I think it calls rpc_shutdown_client() before changing
> clp->cl_cb_session.
> 

It does, but the cl_cb_session doesn't carry a reference. My worry was
that the client could call a DESTROY_SESSION at any time.

Now that I look though, you may be right that that's enough to ensure
it because nfsd4_destroy_session() calls nfsd4_probe_callback_sync()
before putting the session reference.

Still, that is a lot of reliance on these things happening in a
particular order.

> (Though I'm not sure whether rpc_shutdown_client guarantees that all rpc
> processing for the client is completed before returning?)
> 

FWIW, it does wait for them to be killed:

        while (!list_empty(&clnt->cl_tasks)) {
                rpc_killall_tasks(clnt);
                wait_event_timeout(destroy_wait,
                        list_empty(&clnt->cl_tasks), 1*HZ);
        }

I'm not crazy about the fact that it does that synchronously in the
workqueue job, but I guess not much else can be happening with
callbacks until this completes.

> --b.
> 
> > 
> > Thanks for taking a look, Bruce!
> > 
> > > 
> > > > This series changes the nfsd4_session to be RCU-freed, and then adds a
> > > > new method of session refcounting that is compatible with the old.
> > > > nfsd4_callback RPCs will now hold a lightweight reference to the session
> > > > in addition to the slot. Then, all of the callback handling is switched
> > > > to use that session instead of dereferencing clp->cb_cb_session.
> > > > I've also reworked the error handling in nfsd4_cb_sequence_done()
> > > > based on review comments, and lifted the v4.0 handing out of that
> > > > function.
> > > > 
> > > > This passes pynfs, nfstests, and fstests for me, but I'm not sure how
> > > > much any of that stresses the backchannel's error handling.
> > > > 
> > > > These should probably go in via Chuck's tree, but the last patch touches
> > > > some NFS cnd sunrpc client code, so it'd be good to have R-b's or A-b's
> > > > from Trond and/or Anna on that one.
> > > > 
> > > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > > ---
> > > > Changes in v2:
> > > > - make nfsd4_session be RCU-freed
> > > > - change code to keep reference to session over callback RPCs
> > > > - rework error handling in nfsd4_cb_sequence_done()
> > > > - move NFSv4.0 handling out of nfsd4_cb_sequence_done()
> > > > - Link to v1: https://lore.kernel.org/r/20250123-nfsd-6-14-v1-0-c1137a4fa2ae@kernel.org
> > > > 
> > > > ---
> > > > Jeff Layton (7):
> > > >       nfsd: add routines to get/put session references for callbacks
> > > >       nfsd: make clp->cl_cb_session be an RCU managed pointer
> > > >       nfsd: add a cb_ses pointer to nfsd4_callback and use it instead of clp->cb_cb_session
> > > >       nfsd: overhaul CB_SEQUENCE error handling
> > > >       nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault()
> > > >       nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done()
> > > >       sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return
> > > > 
> > > >  fs/nfs/nfs4proc.c           |  12 ++-
> > > >  fs/nfsd/nfs4callback.c      | 212 ++++++++++++++++++++++++++++++++------------
> > > >  fs/nfsd/nfs4state.c         |  43 ++++++++-
> > > >  fs/nfsd/state.h             |   6 +-
> > > >  fs/nfsd/trace.h             |   6 +-
> > > >  include/linux/sunrpc/clnt.h |   4 +-
> > > >  net/sunrpc/clnt.c           |   7 +-
> > > >  7 files changed, 210 insertions(+), 80 deletions(-)
> > > > ---
> > > > base-commit: a05af3c6103b703d1d38d8180b3ebbe0a03c2f07
> > > > change-id: 20250123-nfsd-6-14-b0797e385dc0
> > > > 
> > > > Best regards,
> > > > -- 
> > > > Jeff Layton <jlayton@kernel.org>
> > 
> > -- 
> > Jeff Layton <jlayton@kernel.org>

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups
  2025-01-29 15:01       ` Jeff Layton
@ 2025-01-29 15:09         ` Chuck Lever
  2025-01-29 15:25           ` J. Bruce Fields
  0 siblings, 1 reply; 17+ messages in thread
From: Chuck Lever @ 2025-01-29 15:09 UTC (permalink / raw)
  To: Jeff Layton, J. Bruce Fields
  Cc: Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey, Kinglong Mee,
	Trond Myklebust, Anna Schumaker, linux-nfs, linux-kernel

On 1/29/25 10:01 AM, Jeff Layton wrote:
> On Wed, 2025-01-29 at 09:40 -0500, J. Bruce Fields wrote:
>> On Wed, Jan 29, 2025 at 09:27:02AM -0500, Jeff Layton wrote:
>>> On Wed, 2025-01-29 at 09:21 -0500, J. Bruce Fields wrote:
>>>> On Wed, Jan 29, 2025 at 08:39:53AM -0500, Jeff Layton wrote:
>>>>> While looking over the CB_SEQUENCE error handling, I discovered that
>>>>> callbacks don't hold a reference to a session, and the
>>>>> clp->cl_cb_session could easily change between request and response.
>>>>> If that happens at an inopportune time, there could be UAFs or weird
>>>>> slot/sequence handling problems.
>>>>
>>>> Nobody should place too much faith in my understanding of how any of
>>>> this works at this point, but....  My vague memory is that a lot of
>>>> things are serialized simply by being run only on the cl_callback_wq.
>>>> Modifying clp->cl_cb_session is such a thing.
>>>
>>> It is, but that doesn't save us here. The workqueue is just there to
>>> submit jobs to the RPC client. Once that happens they are run via
>>> rpciod's workqueue (and in parallel with one another since they're
>>> async RPC calls).
>>>
>>> So, it's possible that while we're waiting for a response from one
>>> callback, another is submitted, and that workqueue job changes the
>>> clp->cl_cb_session.
>>
>> I think it calls rpc_shutdown_client() before changing
>> clp->cl_cb_session.
>>
> 
> It does, but the cl_cb_session doesn't carry a reference. My worry was
> that the client could call a DESTROY_SESSION at any time.
> 
> Now that I look though, you may be right that that's enough to ensure
> it because nfsd4_destroy_session() calls nfsd4_probe_callback_sync()
> before putting the session reference.
> 
> Still, that is a lot of reliance on these things happening in a
> particular order.
> 
>> (Though I'm not sure whether rpc_shutdown_client guarantees that all rpc
>> processing for the client is completed before returning?)
>>
> 
> FWIW, it does wait for them to be killed:
> 
>          while (!list_empty(&clnt->cl_tasks)) {
>                  rpc_killall_tasks(clnt);
>                  wait_event_timeout(destroy_wait,
>                          list_empty(&clnt->cl_tasks), 1*HZ);
>          }
> 
> I'm not crazy about the fact that it does that synchronously in the
> workqueue job, but I guess not much else can be happening with
> callbacks until this completes.

Bruce, note rpc_shutdown_client() can block indefinitely due to
bugs in NFSD's callback completion handlers. That's one of the
main reasons this is getting attention right not.


>>>>> This series changes the nfsd4_session to be RCU-freed, and then adds a
>>>>> new method of session refcounting that is compatible with the old.
>>>>> nfsd4_callback RPCs will now hold a lightweight reference to the session
>>>>> in addition to the slot. Then, all of the callback handling is switched
>>>>> to use that session instead of dereferencing clp->cb_cb_session.
>>>>> I've also reworked the error handling in nfsd4_cb_sequence_done()
>>>>> based on review comments, and lifted the v4.0 handing out of that
>>>>> function.
>>>>>
>>>>> This passes pynfs, nfstests, and fstests for me, but I'm not sure how
>>>>> much any of that stresses the backchannel's error handling.
>>>>>
>>>>> These should probably go in via Chuck's tree, but the last patch touches
>>>>> some NFS cnd sunrpc client code, so it'd be good to have R-b's or A-b's
>>>>> from Trond and/or Anna on that one.
>>>>>
>>>>> Signed-off-by: Jeff Layton <jlayton@kernel.org>
>>>>> ---
>>>>> Changes in v2:
>>>>> - make nfsd4_session be RCU-freed
>>>>> - change code to keep reference to session over callback RPCs
>>>>> - rework error handling in nfsd4_cb_sequence_done()
>>>>> - move NFSv4.0 handling out of nfsd4_cb_sequence_done()
>>>>> - Link to v1: https://lore.kernel.org/r/20250123-nfsd-6-14-v1-0-c1137a4fa2ae@kernel.org
>>>>>
>>>>> ---
>>>>> Jeff Layton (7):
>>>>>        nfsd: add routines to get/put session references for callbacks
>>>>>        nfsd: make clp->cl_cb_session be an RCU managed pointer
>>>>>        nfsd: add a cb_ses pointer to nfsd4_callback and use it instead of clp->cb_cb_session
>>>>>        nfsd: overhaul CB_SEQUENCE error handling
>>>>>        nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault()
>>>>>        nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done()
>>>>>        sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return
>>>>>
>>>>>   fs/nfs/nfs4proc.c           |  12 ++-
>>>>>   fs/nfsd/nfs4callback.c      | 212 ++++++++++++++++++++++++++++++++------------
>>>>>   fs/nfsd/nfs4state.c         |  43 ++++++++-
>>>>>   fs/nfsd/state.h             |   6 +-
>>>>>   fs/nfsd/trace.h             |   6 +-
>>>>>   include/linux/sunrpc/clnt.h |   4 +-
>>>>>   net/sunrpc/clnt.c           |   7 +-
>>>>>   7 files changed, 210 insertions(+), 80 deletions(-)
>>>>> ---
>>>>> base-commit: a05af3c6103b703d1d38d8180b3ebbe0a03c2f07
>>>>> change-id: 20250123-nfsd-6-14-b0797e385dc0
>>>>>
>>>>> Best regards,
>>>>> -- 
>>>>> Jeff Layton <jlayton@kernel.org>
>>>
>>> -- 
>>> Jeff Layton <jlayton@kernel.org>
> 


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups
  2025-01-29 15:09         ` Chuck Lever
@ 2025-01-29 15:25           ` J. Bruce Fields
  0 siblings, 0 replies; 17+ messages in thread
From: J. Bruce Fields @ 2025-01-29 15:25 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Jeff Layton, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Kinglong Mee, Trond Myklebust, Anna Schumaker, linux-nfs,
	linux-kernel

On Wed, Jan 29, 2025 at 10:09:14AM -0500, Chuck Lever wrote:
> On 1/29/25 10:01 AM, Jeff Layton wrote:
> > On Wed, 2025-01-29 at 09:40 -0500, J. Bruce Fields wrote:
> > > On Wed, Jan 29, 2025 at 09:27:02AM -0500, Jeff Layton wrote:
> > > > On Wed, 2025-01-29 at 09:21 -0500, J. Bruce Fields wrote:
> > > > > On Wed, Jan 29, 2025 at 08:39:53AM -0500, Jeff Layton wrote:
> > > > > > While looking over the CB_SEQUENCE error handling, I discovered that
> > > > > > callbacks don't hold a reference to a session, and the
> > > > > > clp->cl_cb_session could easily change between request and response.
> > > > > > If that happens at an inopportune time, there could be UAFs or weird
> > > > > > slot/sequence handling problems.
> > > > > 
> > > > > Nobody should place too much faith in my understanding of how any of
> > > > > this works at this point, but....  My vague memory is that a lot of
> > > > > things are serialized simply by being run only on the cl_callback_wq.
> > > > > Modifying clp->cl_cb_session is such a thing.
> > > > 
> > > > It is, but that doesn't save us here. The workqueue is just there to
> > > > submit jobs to the RPC client. Once that happens they are run via
> > > > rpciod's workqueue (and in parallel with one another since they're
> > > > async RPC calls).
> > > > 
> > > > So, it's possible that while we're waiting for a response from one
> > > > callback, another is submitted, and that workqueue job changes the
> > > > clp->cl_cb_session.
> > > 
> > > I think it calls rpc_shutdown_client() before changing
> > > clp->cl_cb_session.
> > > 
> > 
> > It does, but the cl_cb_session doesn't carry a reference. My worry was
> > that the client could call a DESTROY_SESSION at any time.
> > 
> > Now that I look though, you may be right that that's enough to ensure
> > it because nfsd4_destroy_session() calls nfsd4_probe_callback_sync()
> > before putting the session reference.
> > 
> > Still, that is a lot of reliance on these things happening in a
> > particular order.
> > 
> > > (Though I'm not sure whether rpc_shutdown_client guarantees that all rpc
> > > processing for the client is completed before returning?)
> > > 
> > 
> > FWIW, it does wait for them to be killed:
> > 
> >          while (!list_empty(&clnt->cl_tasks)) {
> >                  rpc_killall_tasks(clnt);
> >                  wait_event_timeout(destroy_wait,
> >                          list_empty(&clnt->cl_tasks), 1*HZ);
> >          }
> > 
> > I'm not crazy about the fact that it does that synchronously in the
> > workqueue job, but I guess not much else can be happening with
> > callbacks until this completes.
> 
> Bruce, note rpc_shutdown_client() can block indefinitely due to
> bugs in NFSD's callback completion handlers. That's one of the
> main reasons this is getting attention right not.

Got it.

I mean, my apologies for that code, I'm not at all attached to the way
it works right now.  But I wonder whether what's mainly needed is some
test writing.

It feels like a lot of subtle code designed to handle cases that
probably aren't much exercised by real clients.

--b.

> > > > > > This series changes the nfsd4_session to be RCU-freed, and then adds a
> > > > > > new method of session refcounting that is compatible with the old.
> > > > > > nfsd4_callback RPCs will now hold a lightweight reference to the session
> > > > > > in addition to the slot. Then, all of the callback handling is switched
> > > > > > to use that session instead of dereferencing clp->cb_cb_session.
> > > > > > I've also reworked the error handling in nfsd4_cb_sequence_done()
> > > > > > based on review comments, and lifted the v4.0 handing out of that
> > > > > > function.
> > > > > > 
> > > > > > This passes pynfs, nfstests, and fstests for me, but I'm not sure how
> > > > > > much any of that stresses the backchannel's error handling.
> > > > > > 
> > > > > > These should probably go in via Chuck's tree, but the last patch touches
> > > > > > some NFS cnd sunrpc client code, so it'd be good to have R-b's or A-b's
> > > > > > from Trond and/or Anna on that one.
> > > > > > 
> > > > > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > > > > ---
> > > > > > Changes in v2:
> > > > > > - make nfsd4_session be RCU-freed
> > > > > > - change code to keep reference to session over callback RPCs
> > > > > > - rework error handling in nfsd4_cb_sequence_done()
> > > > > > - move NFSv4.0 handling out of nfsd4_cb_sequence_done()
> > > > > > - Link to v1: https://lore.kernel.org/r/20250123-nfsd-6-14-v1-0-c1137a4fa2ae@kernel.org
> > > > > > 
> > > > > > ---
> > > > > > Jeff Layton (7):
> > > > > >        nfsd: add routines to get/put session references for callbacks
> > > > > >        nfsd: make clp->cl_cb_session be an RCU managed pointer
> > > > > >        nfsd: add a cb_ses pointer to nfsd4_callback and use it instead of clp->cb_cb_session
> > > > > >        nfsd: overhaul CB_SEQUENCE error handling
> > > > > >        nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault()
> > > > > >        nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done()
> > > > > >        sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return
> > > > > > 
> > > > > >   fs/nfs/nfs4proc.c           |  12 ++-
> > > > > >   fs/nfsd/nfs4callback.c      | 212 ++++++++++++++++++++++++++++++++------------
> > > > > >   fs/nfsd/nfs4state.c         |  43 ++++++++-
> > > > > >   fs/nfsd/state.h             |   6 +-
> > > > > >   fs/nfsd/trace.h             |   6 +-
> > > > > >   include/linux/sunrpc/clnt.h |   4 +-
> > > > > >   net/sunrpc/clnt.c           |   7 +-
> > > > > >   7 files changed, 210 insertions(+), 80 deletions(-)
> > > > > > ---
> > > > > > base-commit: a05af3c6103b703d1d38d8180b3ebbe0a03c2f07
> > > > > > change-id: 20250123-nfsd-6-14-b0797e385dc0
> > > > > > 
> > > > > > Best regards,
> > > > > > -- 
> > > > > > Jeff Layton <jlayton@kernel.org>
> > > > 
> > > > -- 
> > > > Jeff Layton <jlayton@kernel.org>
> > 
> 
> 
> -- 
> Chuck Lever

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2025-01-29 15:25 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-29 13:39 [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
2025-01-29 13:39 ` [PATCH v2 1/7] nfsd: add routines to get/put session references for callbacks Jeff Layton
2025-01-29 13:39 ` [PATCH v2 2/7] nfsd: make clp->cl_cb_session be an RCU managed pointer Jeff Layton
2025-01-29 13:39 ` [PATCH v2 3/7] nfsd: add a cb_ses pointer to nfsd4_callback and use it instead of clp->cb_cb_session Jeff Layton
2025-01-29 13:39 ` [PATCH v2 4/7] nfsd: overhaul CB_SEQUENCE error handling Jeff Layton
2025-01-29 13:39 ` [PATCH v2 5/7] nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault() Jeff Layton
2025-01-29 13:39 ` [PATCH v2 6/7] nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done() Jeff Layton
2025-01-29 13:40 ` [PATCH v2 7/7] sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return Jeff Layton
2025-01-29 14:21 ` [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups J. Bruce Fields
2025-01-29 14:27   ` Jeff Layton
2025-01-29 14:40     ` J. Bruce Fields
2025-01-29 15:01       ` Jeff Layton
2025-01-29 15:09         ` Chuck Lever
2025-01-29 15:25           ` J. Bruce Fields
2025-01-29 14:22 ` Chuck Lever
2025-01-29 14:39   ` Jeff Layton
2025-01-29 14:42     ` Chuck Lever

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox