[PATCH 0/3] nfsd: don't allow concurrent queueing of workqueue jobs

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/3] nfsd: don't allow concurrent queueing of workqueue jobs
@ 2025-02-18 21:28 Jeff Layton
  2025-02-18 21:28 ` [PATCH 1/3] nfsd: prevent callback tasks running concurrently Jeff Layton
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Jeff Layton @ 2025-02-18 21:28 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: Li Lingfeng, linux-nfs, linux-kernel, Jeff Layton

While looking at the problem that Li Lingfeng reported [1] around
callback queueing failures, I noticed that there were potential
scenarios where the callback workqueue jobs could run concurrently with
an rpc_task. Since they touch some of the same fields, this is incorrect
at best and potentially dangerous.

This patchset adds a new mechanism for ensuring that the same
nfsd4_callback can't run concurrently with itself, regardless of where
it is in its execution. This also gives us a more sure mechanism for
handling the places where we need to take and hold a reference on an
object while the callback is running.

This should also fix the problem that Li Lingfeng reported, since
queueing the work from nfsd4_cb_release() should never fail. Note that
the patch they sent earlier (fdf5c9413ea) should be dropped from
nfsd-testing before this will apply cleanly.

[1]: https://lore.kernel.org/linux-nfs/20250218135423.1487309-1-lilingfeng3@huawei.com/

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
Jeff Layton (3):
      nfsd: prevent callback tasks running concurrently
      nfsd: eliminate cl_ra_cblist and NFSD4_CLIENT_CB_RECALL_ANY
      nfsd: move cb_need_restart flag into cb_flags

 fs/nfsd/nfs4callback.c | 12 ++++++------
 fs/nfsd/nfs4layouts.c  |  7 ++++---
 fs/nfsd/nfs4proc.c     |  2 +-
 fs/nfsd/nfs4state.c    | 26 +++++++++++---------------
 fs/nfsd/state.h        | 13 ++++++++++---
 fs/nfsd/trace.h        |  2 +-
 6 files changed, 33 insertions(+), 29 deletions(-)
---
base-commit: 4a52e5e49d1b50fcb584e434cced6d0547ddea42
change-id: 20250218-nfsd-callback-f723b8498c78

Best regards,
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/3] nfsd: prevent callback tasks running concurrently
  2025-02-18 21:28 [PATCH 0/3] nfsd: don't allow concurrent queueing of workqueue jobs Jeff Layton
@ 2025-02-18 21:28 ` Jeff Layton
  2025-02-18 21:28 ` [PATCH 2/3] nfsd: eliminate cl_ra_cblist and NFSD4_CLIENT_CB_RECALL_ANY Jeff Layton
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Jeff Layton @ 2025-02-18 21:28 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: Li Lingfeng, linux-nfs, linux-kernel, Jeff Layton

The nfsd4_callback workqueue jobs exist to queue backchannel RPCs to
rpciod. Because they run in different workqueue contexts, the rpc_task
can run concurrently with the workqueue job itself, should it become
requeued. This is problematic as there is no locking when accessing the
fields in the nfsd4_callback.

Add a new unsigned long to nfsd4_callback and declare a new
NFSD4_CALLBACK_RUNNING flag to be set in it. When attempting to run a
workqueue job, do a test_and_set_bit() on that flag first, and don't
queue the workqueue job if it returns true. Clear NFSD4_CALLBACK_RUNNING
in nfsd41_destroy_cb().

This also gives us a more reliable mechanism for handling queueing
failures in codepaths where we have to take references under spinlocks.
We can now do the test_and_set_bit on NFSD4_CALLBACK_RUNNING first, and
only take references to the objects if that returns false.

Most of the nfsd4_run_cb() callers are converted to use this new flag or
the nfsd4_try_run_cb() wrapper. The main exception is the callback
channel probe, which has its own synchronization.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4callback.c |  2 ++
 fs/nfsd/nfs4layouts.c  |  7 ++++---
 fs/nfsd/nfs4proc.c     |  2 +-
 fs/nfsd/nfs4state.c    | 14 ++++++++++----
 fs/nfsd/state.h        |  9 +++++++++
 5 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index ae4b7b6df47ff054db197bd2f6083905e4149bee..1f26c811e5f73c2e745ee68d0b6e668d1dd7c704 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1312,6 +1312,7 @@ static void nfsd41_destroy_cb(struct nfsd4_callback *cb)
 
 	trace_nfsd_cb_destroy(clp, cb);
 	nfsd41_cb_release_slot(cb);
+	clear_bit(NFSD4_CALLBACK_RUNNING, &cb->cb_flags);
 	if (cb->cb_ops && cb->cb_ops->release)
 		cb->cb_ops->release(cb);
 	nfsd41_cb_inflight_end(clp);
@@ -1632,6 +1633,7 @@ void nfsd4_init_cb(struct nfsd4_callback *cb, struct nfs4_client *clp,
 	cb->cb_msg.rpc_proc = &nfs4_cb_procedures[op];
 	cb->cb_msg.rpc_argp = cb;
 	cb->cb_msg.rpc_resp = cb;
+	cb->cb_flags = 0;
 	cb->cb_ops = ops;
 	INIT_WORK(&cb->cb_work, nfsd4_run_cb_work);
 	cb->cb_status = 0;
diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
index fbfddd3c4c943c34aa3991328eacb5759e311cc4..290271ac424540e4405a5fd0eacc8db9f47603cd 100644
--- a/fs/nfsd/nfs4layouts.c
+++ b/fs/nfsd/nfs4layouts.c
@@ -344,9 +344,10 @@ nfsd4_recall_file_layout(struct nfs4_layout_stateid *ls)
 	atomic_inc(&ls->ls_stid.sc_file->fi_lo_recalls);
 	trace_nfsd_layout_recall(&ls->ls_stid.sc_stateid);
 
-	refcount_inc(&ls->ls_stid.sc_count);
-	nfsd4_run_cb(&ls->ls_recall);
-
+	if (!test_and_set_bit(NFSD4_CALLBACK_RUNNING, &ls->ls_recall.cb_flags)) {
+		refcount_inc(&ls->ls_stid.sc_count);
+		nfsd4_run_cb(&ls->ls_recall);
+	}
 out_unlock:
 	spin_unlock(&ls->ls_lock);
 }
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index f6e06c779d09dacdcea81fb3b4135bf600f6cc63..b397246dae7b7e8c2a0ba436bb3813213cfe4fa2 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1847,7 +1847,7 @@ static void nfsd4_send_cb_offload(struct nfsd4_copy *copy)
 		      NFSPROC4_CLNT_CB_OFFLOAD);
 	trace_nfsd_cb_offload(copy->cp_clp, &cbo->co_res.cb_stateid,
 			      &cbo->co_fh, copy->cp_count, copy->nfserr);
-	nfsd4_run_cb(&cbo->co_cb);
+	nfsd4_try_run_cb(&cbo->co_cb);
 }
 
 /**
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index e806fd97cca972559d1929d37c1a24e760d9d6d8..fabcd979c40695ebcc795cfd2d8a035b7d589a37 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3232,8 +3232,10 @@ static void nfs4_cb_getattr(struct nfs4_cb_fattr *ncf)
 	/* set to proper status when nfsd4_cb_getattr_done runs */
 	ncf->ncf_cb_status = NFS4ERR_IO;
 
-	refcount_inc(&dp->dl_stid.sc_count);
-	nfsd4_run_cb(&ncf->ncf_getattr);
+	if (!test_and_set_bit(NFSD4_CALLBACK_RUNNING, &ncf->ncf_getattr.cb_flags)) {
+		refcount_inc(&dp->dl_stid.sc_count);
+		nfsd4_run_cb(&ncf->ncf_getattr);
+	}
 }
 
 static struct nfs4_client *create_client(struct xdr_netobj name,
@@ -5422,6 +5424,10 @@ static const struct nfsd4_callback_ops nfsd4_cb_recall_ops = {
 static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
 {
 	bool queued;
+
+	if (test_and_set_bit(NFSD4_CALLBACK_RUNNING, &dp->dl_recall.cb_flags))
+		return;
+
 	/*
 	 * We're assuming the state code never drops its reference
 	 * without first removing the lease.  Since we're in this lease
@@ -6910,7 +6916,7 @@ deleg_reaper(struct nfsd_net *nn)
 		clp->cl_ra->ra_bmval[0] = BIT(RCA4_TYPE_MASK_RDATA_DLG) |
 						BIT(RCA4_TYPE_MASK_WDATA_DLG);
 		trace_nfsd_cb_recall_any(clp->cl_ra);
-		nfsd4_run_cb(&clp->cl_ra->ra_cb);
+		nfsd4_try_run_cb(&clp->cl_ra->ra_cb);
 	}
 }
 
@@ -7839,7 +7845,7 @@ nfsd4_lm_notify(struct file_lock *fl)
 
 	if (queue) {
 		trace_nfsd_cb_notify_lock(lo, nbl);
-		nfsd4_run_cb(&nbl->nbl_cb);
+		nfsd4_try_run_cb(&nbl->nbl_cb);
 	}
 }
 
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 74d2d7b42676d907bec9159b927aeed223d668c3..dc7105e685b8057ca4e2fcc5ceb85754e96981a2 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -67,6 +67,8 @@ typedef struct {
 struct nfsd4_callback {
 	struct nfs4_client *cb_clp;
 	struct rpc_message cb_msg;
+#define NFSD4_CALLBACK_RUNNING	BIT(0)
+	unsigned long cb_flags;
 	const struct nfsd4_callback_ops *cb_ops;
 	struct work_struct cb_work;
 	int cb_seq_status;
@@ -780,6 +782,13 @@ extern void nfsd4_change_callback(struct nfs4_client *clp, struct nfs4_cb_conn *
 extern void nfsd4_init_cb(struct nfsd4_callback *cb, struct nfs4_client *clp,
 		const struct nfsd4_callback_ops *ops, enum nfsd4_cb_op op);
 extern bool nfsd4_run_cb(struct nfsd4_callback *cb);
+
+static inline void nfsd4_try_run_cb(struct nfsd4_callback *cb)
+{
+	if (!test_and_set_bit(NFSD4_CALLBACK_RUNNING, &cb->cb_flags))
+		WARN_ON_ONCE(!nfsd4_run_cb(cb));
+}
+
 extern void nfsd4_shutdown_callback(struct nfs4_client *);
 extern void nfsd4_shutdown_copy(struct nfs4_client *clp);
 void nfsd4_async_copy_reaper(struct nfsd_net *nn);

-- 
2.48.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/3] nfsd: eliminate cl_ra_cblist and NFSD4_CLIENT_CB_RECALL_ANY
  2025-02-18 21:28 [PATCH 0/3] nfsd: don't allow concurrent queueing of workqueue jobs Jeff Layton
  2025-02-18 21:28 ` [PATCH 1/3] nfsd: prevent callback tasks running concurrently Jeff Layton
@ 2025-02-18 21:28 ` Jeff Layton
  2025-02-18 21:28 ` [PATCH 3/3] nfsd: move cb_need_restart flag into cb_flags Jeff Layton
  2025-02-18 22:26 ` [PATCH 0/3] nfsd: don't allow concurrent queueing of workqueue jobs NeilBrown
  3 siblings, 0 replies; 6+ messages in thread
From: Jeff Layton @ 2025-02-18 21:28 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: Li Lingfeng, linux-nfs, linux-kernel, Jeff Layton

deleg_reaper() will walk the client_lru list and put any suitable
entries onto "cblist" using the cl_ra_cblist pointer. It then walks the
objects outside the spinlock and queues callbacks for them.

None of the operations that deleg_reaper() does outside the
nn->client_lock are blocking operations. Just queue their workqueue jobs
under the nn->client_lock instead.

Also, the NFSD4_CLIENT_CB_RECALL_ANY and NFSD4_CALLBACK_RUNNING flags
serve an identical purpose now. Drop the NFSD4_CLIENT_CB_RECALL_ANY flag
and just use the one in the callback.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4state.c | 16 +++-------------
 fs/nfsd/state.h     |  2 --
 2 files changed, 3 insertions(+), 15 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index fabcd979c40695ebcc795cfd2d8a035b7d589a37..422439a46ffd03926524b8463cfdabfb866281b3 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3175,7 +3175,6 @@ nfsd4_cb_recall_any_release(struct nfsd4_callback *cb)
 {
 	struct nfs4_client *clp = cb->cb_clp;
 
-	clear_bit(NFSD4_CLIENT_CB_RECALL_ANY, &clp->cl_flags);
 	drop_client(clp);
 }
 
@@ -6881,7 +6880,6 @@ deleg_reaper(struct nfsd_net *nn)
 {
 	struct list_head *pos, *next;
 	struct nfs4_client *clp;
-	LIST_HEAD(cblist);
 
 	spin_lock(&nn->client_lock);
 	list_for_each_safe(pos, next, &nn->client_lru) {
@@ -6893,31 +6891,23 @@ deleg_reaper(struct nfsd_net *nn)
 			continue;
 		if (atomic_read(&clp->cl_delegs_in_recall))
 			continue;
-		if (test_bit(NFSD4_CLIENT_CB_RECALL_ANY, &clp->cl_flags))
+		if (test_and_set_bit(NFSD4_CALLBACK_RUNNING, &clp->cl_ra->ra_cb.cb_flags))
 			continue;
 		if (ktime_get_boottime_seconds() - clp->cl_ra_time < 5)
 			continue;
 		if (clp->cl_cb_state != NFSD4_CB_UP)
 			continue;
-		list_add(&clp->cl_ra_cblist, &cblist);
 
 		/* release in nfsd4_cb_recall_any_release */
 		kref_get(&clp->cl_nfsdfs.cl_ref);
-		set_bit(NFSD4_CLIENT_CB_RECALL_ANY, &clp->cl_flags);
 		clp->cl_ra_time = ktime_get_boottime_seconds();
-	}
-	spin_unlock(&nn->client_lock);
-
-	while (!list_empty(&cblist)) {
-		clp = list_first_entry(&cblist, struct nfs4_client,
-					cl_ra_cblist);
-		list_del_init(&clp->cl_ra_cblist);
 		clp->cl_ra->ra_keep = 0;
 		clp->cl_ra->ra_bmval[0] = BIT(RCA4_TYPE_MASK_RDATA_DLG) |
 						BIT(RCA4_TYPE_MASK_WDATA_DLG);
 		trace_nfsd_cb_recall_any(clp->cl_ra);
-		nfsd4_try_run_cb(&clp->cl_ra->ra_cb);
+		nfsd4_run_cb(&clp->cl_ra->ra_cb);
 	}
+	spin_unlock(&nn->client_lock);
 }
 
 static void
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index dc7105e685b8057ca4e2fcc5ceb85754e96981a2..d1a8f074885aa6576843baf46de3a55de530d8d9 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -454,7 +454,6 @@ struct nfs4_client {
 #define NFSD4_CLIENT_UPCALL_LOCK	(5)	/* upcall serialization */
 #define NFSD4_CLIENT_CB_FLAG_MASK	(1 << NFSD4_CLIENT_CB_UPDATE | \
 					 1 << NFSD4_CLIENT_CB_KILL)
-#define NFSD4_CLIENT_CB_RECALL_ANY	(6)
 	unsigned long		cl_flags;
 
 	struct workqueue_struct *cl_callback_wq;
@@ -500,7 +499,6 @@ struct nfs4_client {
 
 	struct nfsd4_cb_recall_any	*cl_ra;
 	time64_t		cl_ra_time;
-	struct list_head	cl_ra_cblist;
 };
 
 /* struct nfs4_client_reset

-- 
2.48.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/3] nfsd: move cb_need_restart flag into cb_flags
  2025-02-18 21:28 [PATCH 0/3] nfsd: don't allow concurrent queueing of workqueue jobs Jeff Layton
  2025-02-18 21:28 ` [PATCH 1/3] nfsd: prevent callback tasks running concurrently Jeff Layton
  2025-02-18 21:28 ` [PATCH 2/3] nfsd: eliminate cl_ra_cblist and NFSD4_CLIENT_CB_RECALL_ANY Jeff Layton
@ 2025-02-18 21:28 ` Jeff Layton
  2025-02-18 22:26 ` [PATCH 0/3] nfsd: don't allow concurrent queueing of workqueue jobs NeilBrown
  3 siblings, 0 replies; 6+ messages in thread
From: Jeff Layton @ 2025-02-18 21:28 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: Li Lingfeng, linux-nfs, linux-kernel, Jeff Layton

Since there is now a cb_flags word, use a new NFSD4_CALLBACK_RESTART
flag in that instead of a separate boolean.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4callback.c | 10 ++++------
 fs/nfsd/state.h        |  2 +-
 fs/nfsd/trace.h        |  2 +-
 3 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 1f26c811e5f73c2e745ee68d0b6e668d1dd7c704..0b94fccbabb49097e23881ab170d38e0eeef90e2 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1071,7 +1071,7 @@ static void nfsd4_requeue_cb(struct rpc_task *task, struct nfsd4_callback *cb)
 	if (!test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
 		trace_nfsd_cb_restart(clp, cb);
 		task->tk_status = 0;
-		cb->cb_need_restart = true;
+		set_bit(NFSD4_CALLBACK_RESTART, &cb->cb_flags);
 	}
 }
 
@@ -1475,7 +1475,7 @@ static void nfsd4_cb_release(void *calldata)
 
 	trace_nfsd_cb_rpc_release(cb->cb_clp);
 
-	if (cb->cb_need_restart)
+	if (test_bit(NFSD4_CALLBACK_RESTART, &cb->cb_flags))
 		nfsd4_queue_cb(cb);
 	else
 		nfsd41_destroy_cb(cb);
@@ -1614,12 +1614,11 @@ nfsd4_run_cb_work(struct work_struct *work)
 		return;
 	}
 
-	if (cb->cb_need_restart) {
-		cb->cb_need_restart = false;
-	} else {
+	if (!test_and_clear_bit(NFSD4_CALLBACK_RESTART, &cb->cb_flags)) {
 		if (cb->cb_ops && cb->cb_ops->prepare)
 			cb->cb_ops->prepare(cb);
 	}
+
 	cb->cb_msg.rpc_cred = clp->cl_cb_cred;
 	flags = clp->cl_minorversion ? RPC_TASK_NOCONNECT : RPC_TASK_SOFTCONN;
 	rpc_call_async(clnt, &cb->cb_msg, RPC_TASK_SOFT | flags,
@@ -1637,7 +1636,6 @@ void nfsd4_init_cb(struct nfsd4_callback *cb, struct nfs4_client *clp,
 	cb->cb_ops = ops;
 	INIT_WORK(&cb->cb_work, nfsd4_run_cb_work);
 	cb->cb_status = 0;
-	cb->cb_need_restart = false;
 	cb->cb_held_slot = -1;
 }
 
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index d1a8f074885aa6576843baf46de3a55de530d8d9..f75b77dceb47c0b3795df9dbf9131b8c0ce4525f 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -68,13 +68,13 @@ struct nfsd4_callback {
 	struct nfs4_client *cb_clp;
 	struct rpc_message cb_msg;
 #define NFSD4_CALLBACK_RUNNING	BIT(0)
+#define NFSD4_CALLBACK_RESTART	BIT(1)
 	unsigned long cb_flags;
 	const struct nfsd4_callback_ops *cb_ops;
 	struct work_struct cb_work;
 	int cb_seq_status;
 	int cb_status;
 	int cb_held_slot;
-	bool cb_need_restart;
 };
 
 struct nfsd4_callback_ops {
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 49bbd26ffcdb36173047569b8d4b41efdec4880b..00140556d50aab4bab8900fb1890cd920d5124c6 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -1613,7 +1613,7 @@ DECLARE_EVENT_CLASS(nfsd_cb_lifetime_class,
 		__entry->cl_id = clp->cl_clientid.cl_id;
 		__entry->cb = cb;
 		__entry->opcode = cb->cb_ops ? cb->cb_ops->opcode : _CB_NULL;
-		__entry->need_restart = cb->cb_need_restart;
+		__entry->need_restart = test_bit(NFSD4_CALLBACK_RESTART, &cb->cb_flags);
 		__assign_sockaddr(addr, &clp->cl_cb_conn.cb_addr,
 				  clp->cl_cb_conn.cb_addrlen)
 	),

-- 
2.48.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/3] nfsd: don't allow concurrent queueing of workqueue jobs
  2025-02-18 21:28 [PATCH 0/3] nfsd: don't allow concurrent queueing of workqueue jobs Jeff Layton
                   ` (2 preceding siblings ...)
  2025-02-18 21:28 ` [PATCH 3/3] nfsd: move cb_need_restart flag into cb_flags Jeff Layton
@ 2025-02-18 22:26 ` NeilBrown
  2025-02-19  0:01   ` Jeff Layton
  3 siblings, 1 reply; 6+ messages in thread
From: NeilBrown @ 2025-02-18 22:26 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Chuck Lever, Olga Kornievskaia, Dai Ngo, Tom Talpey, Li Lingfeng,
	linux-nfs, linux-kernel, Jeff Layton

On Wed, 19 Feb 2025, Jeff Layton wrote:
> While looking at the problem that Li Lingfeng reported [1] around
> callback queueing failures, I noticed that there were potential
> scenarios where the callback workqueue jobs could run concurrently with
> an rpc_task. Since they touch some of the same fields, this is incorrect
> at best and potentially dangerous.

If the problem is that workqueue jobs might run concurrently with
rpc_tasks and that we don't want that, could we simply run all the cb
tasks as "sync" rpc tasks in the workqueue??

i.e. change rpc_call_async() in nfsd4_run_cb_work() to rpc_call_sync ...
and fix any breakage because I doubt it is really as simple as that.

This would fully serialise all the callbacks.  Is that what to goal is
here, or is the goal more subtle?

Thanks,
NeilBrown


> 
> This patchset adds a new mechanism for ensuring that the same
> nfsd4_callback can't run concurrently with itself, regardless of where
> it is in its execution. This also gives us a more sure mechanism for
> handling the places where we need to take and hold a reference on an
> object while the callback is running.
> 
> This should also fix the problem that Li Lingfeng reported, since
> queueing the work from nfsd4_cb_release() should never fail. Note that
> the patch they sent earlier (fdf5c9413ea) should be dropped from
> nfsd-testing before this will apply cleanly.
> 
> [1]: https://lore.kernel.org/linux-nfs/20250218135423.1487309-1-lilingfeng3@huawei.com/
> 
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
> Jeff Layton (3):
>       nfsd: prevent callback tasks running concurrently
>       nfsd: eliminate cl_ra_cblist and NFSD4_CLIENT_CB_RECALL_ANY
>       nfsd: move cb_need_restart flag into cb_flags
> 
>  fs/nfsd/nfs4callback.c | 12 ++++++------
>  fs/nfsd/nfs4layouts.c  |  7 ++++---
>  fs/nfsd/nfs4proc.c     |  2 +-
>  fs/nfsd/nfs4state.c    | 26 +++++++++++---------------
>  fs/nfsd/state.h        | 13 ++++++++++---
>  fs/nfsd/trace.h        |  2 +-
>  6 files changed, 33 insertions(+), 29 deletions(-)
> ---
> base-commit: 4a52e5e49d1b50fcb584e434cced6d0547ddea42
> change-id: 20250218-nfsd-callback-f723b8498c78
> 
> Best regards,
> -- 
> Jeff Layton <jlayton@kernel.org>
> 
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/3] nfsd: don't allow concurrent queueing of workqueue jobs
  2025-02-18 22:26 ` [PATCH 0/3] nfsd: don't allow concurrent queueing of workqueue jobs NeilBrown
@ 2025-02-19  0:01   ` Jeff Layton
  0 siblings, 0 replies; 6+ messages in thread
From: Jeff Layton @ 2025-02-19  0:01 UTC (permalink / raw)
  To: NeilBrown
  Cc: Chuck Lever, Olga Kornievskaia, Dai Ngo, Tom Talpey, Li Lingfeng,
	linux-nfs, linux-kernel

On Wed, 2025-02-19 at 09:26 +1100, NeilBrown wrote:
> On Wed, 19 Feb 2025, Jeff Layton wrote:
> > While looking at the problem that Li Lingfeng reported [1] around
> > callback queueing failures, I noticed that there were potential
> > scenarios where the callback workqueue jobs could run concurrently with
> > an rpc_task. Since they touch some of the same fields, this is incorrect
> > at best and potentially dangerous.
> 
> If the problem is that workqueue jobs might run concurrently with
> rpc_tasks and that we don't want that, could we simply run all the cb
> tasks as "sync" rpc tasks in the workqueue??
> 
> i.e. change rpc_call_async() in nfsd4_run_cb_work() to rpc_call_sync ...
> and fix any breakage because I doubt it is really as simple as that.
> 
> This would fully serialise all the callbacks.  Is that what to goal is
> here, or is the goal more subtle?
> 

I think we need to serialize callbacks that use the same struct
nfsd4_callback. Today we can end up with the same callback running
concurrently with some of them:

The callback workqueue jobs only run until the rpc_task has been
submitted. After that point the workqueue job can run again and submit
a second rpc_task. That can end up with multiple RPCs racing to set and
fetch the cb_status, cb_slot, etc. in the nfsd4_callback.

Note that a few of the callbacks have their own mechanism for
serialization (CB_RECALL_ANY, CB_GETATTR, and the callback channel
probes), but the others don't.

Making all of the RPCs synchronous would mean that all callbacks to the
client would be serialized, which would be overkill. We'd be going back
to a single-slot callback channel.

I think serializing on the nfsd4_callback is right. The part I'm not
yet certain about is whether to just ignore attempts to queue up the
callback while one is running, or if we need to ensure that it runs
again after the current callback has finished if that happens.

The latter seems more correct, but I can't quite come up with a
situation where it matters. Maybe a multiple CB_LAYOUTRECALL callback
attempts can race with a layout stateid morphing?


> 
> 
> > 
> > This patchset adds a new mechanism for ensuring that the same
> > nfsd4_callback can't run concurrently with itself, regardless of where
> > it is in its execution. This also gives us a more sure mechanism for
> > handling the places where we need to take and hold a reference on an
> > object while the callback is running.
> > 
> > This should also fix the problem that Li Lingfeng reported, since
> > queueing the work from nfsd4_cb_release() should never fail. Note that
> > the patch they sent earlier (fdf5c9413ea) should be dropped from
> > nfsd-testing before this will apply cleanly.
> > 
> > [1]: https://lore.kernel.org/linux-nfs/20250218135423.1487309-1-lilingfeng3@huawei.com/
> > 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> > Jeff Layton (3):
> >       nfsd: prevent callback tasks running concurrently
> >       nfsd: eliminate cl_ra_cblist and NFSD4_CLIENT_CB_RECALL_ANY
> >       nfsd: move cb_need_restart flag into cb_flags
> > 
> >  fs/nfsd/nfs4callback.c | 12 ++++++------
> >  fs/nfsd/nfs4layouts.c  |  7 ++++---
> >  fs/nfsd/nfs4proc.c     |  2 +-
> >  fs/nfsd/nfs4state.c    | 26 +++++++++++---------------
> >  fs/nfsd/state.h        | 13 ++++++++++---
> >  fs/nfsd/trace.h        |  2 +-
> >  6 files changed, 33 insertions(+), 29 deletions(-)
> > ---
> > base-commit: 4a52e5e49d1b50fcb584e434cced6d0547ddea42
> > change-id: 20250218-nfsd-callback-f723b8498c78
> > 
> > Best regards,
> > -- 
> > Jeff Layton <jlayton@kernel.org>
> > 
> > 
> 

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-02-19  0:01 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-18 21:28 [PATCH 0/3] nfsd: don't allow concurrent queueing of workqueue jobs Jeff Layton
2025-02-18 21:28 ` [PATCH 1/3] nfsd: prevent callback tasks running concurrently Jeff Layton
2025-02-18 21:28 ` [PATCH 2/3] nfsd: eliminate cl_ra_cblist and NFSD4_CLIENT_CB_RECALL_ANY Jeff Layton
2025-02-18 21:28 ` [PATCH 3/3] nfsd: move cb_need_restart flag into cb_flags Jeff Layton
2025-02-18 22:26 ` [PATCH 0/3] nfsd: don't allow concurrent queueing of workqueue jobs NeilBrown
2025-02-19  0:01   ` Jeff Layton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox