[PATCH v3 0/2] provide locking for v4_end_grace


* [PATCH v3 0/2] provide locking for v4_end_grace
@ 2025-12-13 18:41 Chuck Lever
  2025-12-13 18:41 ` [PATCH v3 1/2] nfsd: " Chuck Lever
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Chuck Lever @ 2025-12-13 18:41 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, lilingfeng3, yangerkun, yi.zhang, houtao1,
	chengzhihao1, yukuai3, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Following up on:

https://lore.kernel.org/linux-nfs/175136659151.565058.6474755472267609432@noble.neil.brown.name/#r

This is now two patches: one that can be backported, and one that
simplifies the fix based on mechanisms available only in recent
kernels. I've also addressed all the review comments I could find.

These patches have been compile-tested only.

NeilBrown (2):
  nfsd: provide locking for v4_end_grace
  nfsd: use workqueue enable/disable APIs for v4_end_grace sync

 fs/nfsd/netns.h     |  1 +
 fs/nfsd/nfs4state.c | 40 +++++++++++++++++++++++++++++++++++++---
 fs/nfsd/nfsctl.c    |  3 +--
 fs/nfsd/state.h     |  2 +-
 4 files changed, 40 insertions(+), 6 deletions(-)

-- 
2.52.0



* [PATCH v3 1/2] nfsd: provide locking for v4_end_grace
  2025-12-13 18:41 [PATCH v3 0/2] provide locking for v4_end_grace Chuck Lever
@ 2025-12-13 18:41 ` Chuck Lever
  2025-12-14  1:05   ` Jeff Layton
  2025-12-15  7:28   ` Li Lingfeng
  2025-12-13 18:42 ` [PATCH v3 2/2] nfsd: use workqueue enable/disable APIs for v4_end_grace sync Chuck Lever
  2025-12-14  1:08 ` [PATCH v3 0/2] provide locking for v4_end_grace Jeff Layton
  2 siblings, 2 replies; 8+ messages in thread
From: Chuck Lever @ 2025-12-13 18:41 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, lilingfeng3, yangerkun, yi.zhang, houtao1,
	chengzhihao1, yukuai3

From: NeilBrown <neil@brown.name>

Writing to v4_end_grace can race with server shutdown and result in
memory being accessed after it was freed - reclaim_str_hashtbl in
particular.

We cannot hold nfsd_mutex across the nfsd4_end_grace() call as that is
held while client_tracking_ops->init() is called and that can wait for
an upcall to nfsdcltrack which can write to v4_end_grace, resulting in a
deadlock.

nfsd4_end_grace() is also called by the laundromat work queue and this
doesn't require locking as server shutdown will stop the work and wait
for it before freeing anything that nfsd4_end_grace() might access.

However, we must be sure that writing to v4_end_grace doesn't restart
the work item after shutdown has already waited for it.  For this we
add a new flag protected with nn->client_lock.  It is set only while it
is safe to make client tracking calls, and v4_end_grace only schedules
work while the flag is set with the spinlock held.

So this patch adds a nfsd_net field "client_tracking_active" which is
set as described.  Another field, "grace_end_forced", is set when
v4_end_grace is written.  After this is set, and provided
client_tracking_active is set, the laundromat is scheduled.
This "grace_end_forced" field bypasses other checks for whether the
grace period has finished.

This resolves a race which can result in use-after-free.
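
The gating described above can be modelled in userspace roughly as
follows. This is a sketch only, not kernel code: struct model_net, the
model_*() helpers, the atomic_flag spinlock and the laundromat_queued
counter are all hypothetical stand-ins for nn->client_lock and
mod_delayed_work(). It shows that a forced grace end is accepted only
between start and shutdown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the nfsd_net fields the patch touches. */
struct model_net {
	atomic_flag client_lock;	/* stands in for nn->client_lock */
	bool grace_ended;
	bool grace_end_forced;
	bool client_tracking_active;
	int laundromat_queued;		/* counts stand-in mod_delayed_work() calls */
};

static void model_lock(struct model_net *nn)
{
	/* Busy-wait until the flag is acquired, like a simple spinlock. */
	while (atomic_flag_test_and_set_explicit(&nn->client_lock,
						 memory_order_acquire))
		;
}

static void model_unlock(struct model_net *nn)
{
	atomic_flag_clear_explicit(&nn->client_lock, memory_order_release);
}

/* Mirrors the field initialisation added to nfs4_state_create_net(). */
void model_create(struct model_net *nn)
{
	atomic_flag_clear(&nn->client_lock);
	nn->grace_ended = false;
	nn->grace_end_forced = false;
	nn->client_tracking_active = false;
	nn->laundromat_queued = 0;
}

/* Mirrors nfs4_state_start_net(): tracking is initialised, open the gate. */
void model_start(struct model_net *nn)
{
	model_lock(nn);
	assert(!nn->client_tracking_active);	/* gate starts out closed */
	nn->client_tracking_active = true;
	model_unlock(nn);
}

/* Mirrors nfs4_state_shutdown_net(): close the gate before cancelling work. */
void model_shutdown(struct model_net *nn)
{
	model_lock(nn);
	nn->client_tracking_active = false;
	model_unlock(nn);
	/* The kernel code calls cancel_delayed_work_sync() at this point. */
}

/* Mirrors nfsd4_force_end_grace(): only queue work while the gate is open. */
bool model_force_end_grace(struct model_net *nn)
{
	model_lock(nn);
	if (nn->grace_ended || !nn->client_tracking_active) {
		model_unlock(nn);
		return false;
	}
	nn->grace_end_forced = true;
	nn->laundromat_queued++;	/* stand-in for mod_delayed_work(..., 0) */
	model_unlock(nn);
	return true;
}
```

Because the flag is cleared under the same lock that the writer holds
while queueing, a writer either queues before shutdown cancels the work
or sees the flag cleared and backs off.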

Reported-by: Li Lingfeng <lilingfeng3@huawei.com>
Closes: https://lore.kernel.org/linux-nfs/20250623030015.2353515-1-neil@brown.name/T/#t
Fixes: 7f5ef2e900d9 ("nfsd: add a v4_end_grace file to /proc/fs/nfsd")
X-Cc: stable@vger.kernel.org
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/netns.h     |  2 ++
 fs/nfsd/nfs4state.c | 42 ++++++++++++++++++++++++++++++++++++++++--
 fs/nfsd/nfsctl.c    |  3 +--
 fs/nfsd/state.h     |  2 +-
 4 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index 3e2d0fde80a7..fe8338735e7c 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -66,6 +66,8 @@ struct nfsd_net {
 
 	struct lock_manager nfsd4_manager;
 	bool grace_ended;
+	bool grace_end_forced;
+	bool client_tracking_active;
 	time64_t boot_time;
 
 	struct dentry *nfsd_client_dir;
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index d0efa3e0965f..1d307cc533d9 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -84,7 +84,7 @@ static u64 current_sessionid = 1;
 /* forward declarations */
 static bool check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner);
 static void nfs4_free_ol_stateid(struct nfs4_stid *stid);
-void nfsd4_end_grace(struct nfsd_net *nn);
+static void nfsd4_end_grace(struct nfsd_net *nn);
 static void _free_cpntf_state_locked(struct nfsd_net *nn, struct nfs4_cpntf_state *cps);
 static void nfsd4_file_hash_remove(struct nfs4_file *fi);
 static void deleg_reaper(struct nfsd_net *nn);
@@ -6570,7 +6570,7 @@ nfsd4_renew(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	return nfs_ok;
 }
 
-void
+static void
 nfsd4_end_grace(struct nfsd_net *nn)
 {
 	/* do nothing if grace period already ended */
@@ -6603,6 +6603,33 @@ nfsd4_end_grace(struct nfsd_net *nn)
 	 */
 }
 
+/**
+ * nfsd4_force_end_grace - forcibly end the NFSv4 grace period
+ * @nn: network namespace for the server instance to be updated
+ *
+ * Forces bypass of normal grace period completion, then schedules
+ * the laundromat to end the grace period immediately. Does not wait
+ * for the grace period to fully terminate before returning.
+ *
+ * Return values:
+ *   %true: Grace termination scheduled
+ *   %false: No action was taken
+ */
+bool nfsd4_force_end_grace(struct nfsd_net *nn)
+{
+	if (!nn->client_tracking_ops)
+		return false;
+	spin_lock(&nn->client_lock);
+	if (nn->grace_ended || !nn->client_tracking_active) {
+		spin_unlock(&nn->client_lock);
+		return false;
+	}
+	WRITE_ONCE(nn->grace_end_forced, true);
+	mod_delayed_work(laundry_wq, &nn->laundromat_work, 0);
+	spin_unlock(&nn->client_lock);
+	return true;
+}
+
 /*
  * If we've waited a lease period but there are still clients trying to
  * reclaim, wait a little longer to give them a chance to finish.
@@ -6612,6 +6639,8 @@ static bool clients_still_reclaiming(struct nfsd_net *nn)
 	time64_t double_grace_period_end = nn->boot_time +
 					   2 * nn->nfsd4_lease;
 
+	if (READ_ONCE(nn->grace_end_forced))
+		return false;
 	if (nn->track_reclaim_completes &&
 			atomic_read(&nn->nr_reclaim_complete) ==
 			nn->reclaim_str_hashtbl_size)
@@ -8932,6 +8961,8 @@ static int nfs4_state_create_net(struct net *net)
 	nn->unconf_name_tree = RB_ROOT;
 	nn->boot_time = ktime_get_real_seconds();
 	nn->grace_ended = false;
+	nn->grace_end_forced = false;
+	nn->client_tracking_active = false;
 	nn->nfsd4_manager.block_opens = true;
 	INIT_LIST_HEAD(&nn->nfsd4_manager.list);
 	INIT_LIST_HEAD(&nn->client_lru);
@@ -9012,6 +9043,10 @@ nfs4_state_start_net(struct net *net)
 		return ret;
 	locks_start_grace(net, &nn->nfsd4_manager);
 	nfsd4_client_tracking_init(net);
+	/* safe for laundromat to run now */
+	spin_lock(&nn->client_lock);
+	nn->client_tracking_active = true;
+	spin_unlock(&nn->client_lock);
 	if (nn->track_reclaim_completes && nn->reclaim_str_hashtbl_size == 0)
 		goto skip_grace;
 	printk(KERN_INFO "NFSD: starting %lld-second grace period (net %x)\n",
@@ -9060,6 +9095,9 @@ nfs4_state_shutdown_net(struct net *net)
 
 	shrinker_free(nn->nfsd_client_shrinker);
 	cancel_work_sync(&nn->nfsd_shrinker_work);
+	spin_lock(&nn->client_lock);
+	nn->client_tracking_active = false;
+	spin_unlock(&nn->client_lock);
 	cancel_delayed_work_sync(&nn->laundromat_work);
 	locks_end_grace(&nn->nfsd4_manager);
 
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 5ce9a49e76ba..242fcbd958f1 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1082,10 +1082,9 @@ static ssize_t write_v4_end_grace(struct file *file, char *buf, size_t size)
 		case 'Y':
 		case 'y':
 		case '1':
-			if (!nn->nfsd_serv)
+			if (!nfsd4_force_end_grace(nn))
 				return -EBUSY;
 			trace_nfsd_end_grace(netns(file));
-			nfsd4_end_grace(nn);
 			break;
 		default:
 			return -EINVAL;
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index b052c1effdc5..848c5383d782 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -849,7 +849,7 @@ static inline void nfsd4_revoke_states(struct net *net, struct super_block *sb)
 #endif
 
 /* grace period management */
-void nfsd4_end_grace(struct nfsd_net *nn);
+bool nfsd4_force_end_grace(struct nfsd_net *nn);
 
 /* nfs4recover operations */
 extern int nfsd4_client_tracking_init(struct net *net);
-- 
2.52.0



* [PATCH v3 2/2] nfsd: use workqueue enable/disable APIs for v4_end_grace sync
  2025-12-13 18:41 [PATCH v3 0/2] provide locking for v4_end_grace Chuck Lever
  2025-12-13 18:41 ` [PATCH v3 1/2] nfsd: " Chuck Lever
@ 2025-12-13 18:42 ` Chuck Lever
  2025-12-14  1:07   ` Jeff Layton
  2025-12-15  8:00   ` Li Lingfeng
  2025-12-14  1:08 ` [PATCH v3 0/2] provide locking for v4_end_grace Jeff Layton
  2 siblings, 2 replies; 8+ messages in thread
From: Chuck Lever @ 2025-12-13 18:42 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, lilingfeng3, yangerkun, yi.zhang, houtao1,
	chengzhihao1, yukuai3

From: NeilBrown <neil@brown.name>

"nfsd: provide locking for v4_end_grace" introduced a
client_tracking_active flag protected by nn->client_lock to prevent
the laundromat from being scheduled before client tracking
initialization or after shutdown begins. That commit is suitable for
backporting to LTS kernels that predate commit 86898fa6b8cd
("workqueue: Implement disable/enable for (delayed) work items").

However, the workqueue subsystem in recent kernels provides
enable_delayed_work() and disable_delayed_work_sync() for this
purpose. Using this mechanism enables us to remove the
client_tracking_active flag and associated spinlock operations
while preserving the same synchronization guarantees, which is
a cleaner long-term approach.
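
As a rough illustration of the semantics relied on here, the sketch
below models a work item with a disable depth in plain userspace C.
It is a simplified model under stated assumptions, not the workqueue
implementation: struct model_dwork and the model_*() helpers are
hypothetical, and the return value of model_queue() does not match
that of mod_delayed_work().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of a delayed work item's disable state. */
struct model_dwork {
	int disable_depth;	/* > 0 means queueing attempts are ignored */
	bool pending;		/* true once the work has been queued */
};

/* Stand-in for INIT_DELAYED_WORK(): starts enabled and idle. */
void model_init_dwork(struct model_dwork *w)
{
	w->disable_depth = 0;
	w->pending = false;
}

/* Stand-in for disable_delayed_work(): cancel pending work, bump the depth. */
void model_disable(struct model_dwork *w)
{
	w->pending = false;
	w->disable_depth++;
}

/* Stand-in for enable_delayed_work(): returns true once fully enabled. */
bool model_enable(struct model_dwork *w)
{
	assert(w->disable_depth > 0);	/* enabling an enabled item is a bug */
	return --w->disable_depth == 0;
}

/*
 * Stand-in for mod_delayed_work(..., 0): a no-op while disabled.
 * Returns whether the work is now queued (model-only convention).
 */
bool model_queue(struct model_dwork *w)
{
	if (w->disable_depth > 0)
		return false;
	w->pending = true;
	return true;
}
```

In this model the create/start/shutdown sequence from the patch maps
to model_disable(), model_enable() and model_disable() respectively,
so a queue attempt outside the start..shutdown window does nothing.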

Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/netns.h     |  1 -
 fs/nfsd/nfs4state.c | 22 +++++++++-------------
 2 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index fe8338735e7c..d83c68872c4c 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -67,7 +67,6 @@ struct nfsd_net {
 	struct lock_manager nfsd4_manager;
 	bool grace_ended;
 	bool grace_end_forced;
-	bool client_tracking_active;
 	time64_t boot_time;
 
 	struct dentry *nfsd_client_dir;
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 1d307cc533d9..c9be724c48d0 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -6619,14 +6619,14 @@ bool nfsd4_force_end_grace(struct nfsd_net *nn)
 {
 	if (!nn->client_tracking_ops)
 		return false;
-	spin_lock(&nn->client_lock);
-	if (nn->grace_ended || !nn->client_tracking_active) {
-		spin_unlock(&nn->client_lock);
+	if (READ_ONCE(nn->grace_ended))
 		return false;
-	}
+	/* laundromat_work must be initialised now, though it might be disabled */
 	WRITE_ONCE(nn->grace_end_forced, true);
+	/* mod_delayed_work() doesn't queue work after
+	 * nfs4_state_shutdown_net() has called disable_delayed_work_sync()
+	 */
 	mod_delayed_work(laundry_wq, &nn->laundromat_work, 0);
-	spin_unlock(&nn->client_lock);
 	return true;
 }
 
@@ -8962,7 +8962,6 @@ static int nfs4_state_create_net(struct net *net)
 	nn->boot_time = ktime_get_real_seconds();
 	nn->grace_ended = false;
 	nn->grace_end_forced = false;
-	nn->client_tracking_active = false;
 	nn->nfsd4_manager.block_opens = true;
 	INIT_LIST_HEAD(&nn->nfsd4_manager.list);
 	INIT_LIST_HEAD(&nn->client_lru);
@@ -8977,6 +8976,8 @@ static int nfs4_state_create_net(struct net *net)
 	INIT_LIST_HEAD(&nn->blocked_locks_lru);
 
 	INIT_DELAYED_WORK(&nn->laundromat_work, laundromat_main);
+	/* Make sure this cannot run until client tracking is initialised */
+	disable_delayed_work(&nn->laundromat_work);
 	INIT_WORK(&nn->nfsd_shrinker_work, nfsd4_state_shrinker_worker);
 	get_net(net);
 
@@ -9044,9 +9045,7 @@ nfs4_state_start_net(struct net *net)
 	locks_start_grace(net, &nn->nfsd4_manager);
 	nfsd4_client_tracking_init(net);
 	/* safe for laundromat to run now */
-	spin_lock(&nn->client_lock);
-	nn->client_tracking_active = true;
-	spin_unlock(&nn->client_lock);
+	enable_delayed_work(&nn->laundromat_work);
 	if (nn->track_reclaim_completes && nn->reclaim_str_hashtbl_size == 0)
 		goto skip_grace;
 	printk(KERN_INFO "NFSD: starting %lld-second grace period (net %x)\n",
@@ -9095,10 +9094,7 @@ nfs4_state_shutdown_net(struct net *net)
 
 	shrinker_free(nn->nfsd_client_shrinker);
 	cancel_work_sync(&nn->nfsd_shrinker_work);
-	spin_lock(&nn->client_lock);
-	nn->client_tracking_active = false;
-	spin_unlock(&nn->client_lock);
-	cancel_delayed_work_sync(&nn->laundromat_work);
+	disable_delayed_work_sync(&nn->laundromat_work);
 	locks_end_grace(&nn->nfsd4_manager);
 
 	INIT_LIST_HEAD(&reaplist);
-- 
2.52.0



* Re: [PATCH v3 1/2] nfsd: provide locking for v4_end_grace
  2025-12-13 18:41 ` [PATCH v3 1/2] nfsd: " Chuck Lever
@ 2025-12-14  1:05   ` Jeff Layton
  2025-12-15  7:28   ` Li Lingfeng
  1 sibling, 0 replies; 8+ messages in thread
From: Jeff Layton @ 2025-12-14  1:05 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, lilingfeng3, yangerkun, yi.zhang, houtao1,
	chengzhihao1, yukuai3

On Sat, 2025-12-13 at 13:41 -0500, Chuck Lever wrote:
> From: NeilBrown <neil@brown.name>
> 
> Writing to v4_end_grace can race with server shutdown and result in
> memory being accessed after it was freed - reclaim_str_hashtbl in
> particular.
> 

In hindsight, allowing that to be forced from userland was a bad idea.
The latest nfsdcld slurps in the whole list of clients and can make
this decision internally. Once we deprecate the legacy client tracking
upcalls, we should be able to deprecate this control at the same time.


> We cannot hold nfsd_mutex across the nfsd4_end_grace() call as that is
> held while client_tracking_ops->init() is called and that can wait for
> an upcall to nfsdcltrack which can write to v4_end_grace, resulting in a
> deadlock.
> 
> nfsd4_end_grace() is also called by the laundromat work queue and this
> doesn't require locking as server shutdown will stop the work and wait
> for it before freeing anything that nfsd4_end_grace() might access.
> 
> However, we must be sure that writing to v4_end_grace doesn't restart
> the work item after shutdown has already waited for it.  For this we
> add a new flag protected with nn->client_lock.  It is set only while it
> is safe to make client tracking calls, and v4_end_grace only schedules
> work while the flag is set with the spinlock held.
> 
> So this patch adds a nfsd_net field "client_tracking_active" which is
> set as described.  Another field, "grace_end_forced", is set when
> v4_end_grace is written.  After this is set, and provided
> client_tracking_active is set, the laundromat is scheduled.
> This "grace_end_forced" field bypasses other checks for whether the
> grace period has finished.
> 
> This resolves a race which can result in use-after-free.
> 
> Reported-by: Li Lingfeng <lilingfeng3@huawei.com>
> Closes: https://lore.kernel.org/linux-nfs/20250623030015.2353515-1-neil@brown.name/T/#t
> Fixes: 7f5ef2e900d9 ("nfsd: add a v4_end_grace file to /proc/fs/nfsd")
> X-Cc: stable@vger.kernel.org
> Signed-off-by: NeilBrown <neil@brown.name>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfsd/netns.h     |  2 ++
>  fs/nfsd/nfs4state.c | 42 ++++++++++++++++++++++++++++++++++++++++--
>  fs/nfsd/nfsctl.c    |  3 +--
>  fs/nfsd/state.h     |  2 +-
>  4 files changed, 44 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> index 3e2d0fde80a7..fe8338735e7c 100644
> --- a/fs/nfsd/netns.h
> +++ b/fs/nfsd/netns.h
> @@ -66,6 +66,8 @@ struct nfsd_net {
>  
>  	struct lock_manager nfsd4_manager;
>  	bool grace_ended;
> +	bool grace_end_forced;
> +	bool client_tracking_active;
>  	time64_t boot_time;
>  
>  	struct dentry *nfsd_client_dir;
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index d0efa3e0965f..1d307cc533d9 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -84,7 +84,7 @@ static u64 current_sessionid = 1;
>  /* forward declarations */
>  static bool check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner);
>  static void nfs4_free_ol_stateid(struct nfs4_stid *stid);
> -void nfsd4_end_grace(struct nfsd_net *nn);
> +static void nfsd4_end_grace(struct nfsd_net *nn);
>  static void _free_cpntf_state_locked(struct nfsd_net *nn, struct nfs4_cpntf_state *cps);
>  static void nfsd4_file_hash_remove(struct nfs4_file *fi);
>  static void deleg_reaper(struct nfsd_net *nn);
> @@ -6570,7 +6570,7 @@ nfsd4_renew(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  	return nfs_ok;
>  }
>  
> -void
> +static void
>  nfsd4_end_grace(struct nfsd_net *nn)
>  {
>  	/* do nothing if grace period already ended */
> @@ -6603,6 +6603,33 @@ nfsd4_end_grace(struct nfsd_net *nn)
>  	 */
>  }
>  
> +/**
> + * nfsd4_force_end_grace - forcibly end the NFSv4 grace period
> + * @nn: network namespace for the server instance to be updated
> + *
> + * Forces bypass of normal grace period completion, then schedules
> + * the laundromat to end the grace period immediately. Does not wait
> + * for the grace period to fully terminate before returning.
> + *
> + * Return values:
> + *   %true: Grace termination scheduled
> + *   %false: No action was taken
> + */
> +bool nfsd4_force_end_grace(struct nfsd_net *nn)
> +{
> +	if (!nn->client_tracking_ops)
> +		return false;
> +	spin_lock(&nn->client_lock);
> +	if (nn->grace_ended || !nn->client_tracking_active) {
> +		spin_unlock(&nn->client_lock);
> +		return false;
> +	}
> +	WRITE_ONCE(nn->grace_end_forced, true);

The client_lock is already held here. Is it necessary to force this
store before mod_delayed_work() is called?

I think it'd be simpler to not bother with WRITE_ONCE() here and just
take the spinlock in clients_still_reclaiming() when checking it. That
function is only called in the context of the laundromat, which isn't
performance critical anyway.

> +	mod_delayed_work(laundry_wq, &nn->laundromat_work, 0);
> +	spin_unlock(&nn->client_lock);
> +	return true;
> +}
> +
>  /*
>   * If we've waited a lease period but there are still clients trying to
>   * reclaim, wait a little longer to give them a chance to finish.
> @@ -6612,6 +6639,8 @@ static bool clients_still_reclaiming(struct nfsd_net *nn)
>  	time64_t double_grace_period_end = nn->boot_time +
>  					   2 * nn->nfsd4_lease;
>  
> +	if (READ_ONCE(nn->grace_end_forced))
> +		return false;
>  	if (nn->track_reclaim_completes &&
>  			atomic_read(&nn->nr_reclaim_complete) ==
>  			nn->reclaim_str_hashtbl_size)
> @@ -8932,6 +8961,8 @@ static int nfs4_state_create_net(struct net *net)
>  	nn->unconf_name_tree = RB_ROOT;
>  	nn->boot_time = ktime_get_real_seconds();
>  	nn->grace_ended = false;
> +	nn->grace_end_forced = false;
> +	nn->client_tracking_active = false;
>  	nn->nfsd4_manager.block_opens = true;
>  	INIT_LIST_HEAD(&nn->nfsd4_manager.list);
>  	INIT_LIST_HEAD(&nn->client_lru);
> @@ -9012,6 +9043,10 @@ nfs4_state_start_net(struct net *net)
>  		return ret;
>  	locks_start_grace(net, &nn->nfsd4_manager);
>  	nfsd4_client_tracking_init(net);
> +	/* safe for laundromat to run now */
> +	spin_lock(&nn->client_lock);
> +	nn->client_tracking_active = true;
> +	spin_unlock(&nn->client_lock);
>  	if (nn->track_reclaim_completes && nn->reclaim_str_hashtbl_size == 0)
>  		goto skip_grace;
>  	printk(KERN_INFO "NFSD: starting %lld-second grace period (net %x)\n",
> @@ -9060,6 +9095,9 @@ nfs4_state_shutdown_net(struct net *net)
>  
>  	shrinker_free(nn->nfsd_client_shrinker);
>  	cancel_work_sync(&nn->nfsd_shrinker_work);
> +	spin_lock(&nn->client_lock);
> +	nn->client_tracking_active = false;
> +	spin_unlock(&nn->client_lock);
>  	cancel_delayed_work_sync(&nn->laundromat_work);
>  	locks_end_grace(&nn->nfsd4_manager);
>  
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index 5ce9a49e76ba..242fcbd958f1 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -1082,10 +1082,9 @@ static ssize_t write_v4_end_grace(struct file *file, char *buf, size_t size)
>  		case 'Y':
>  		case 'y':
>  		case '1':
> -			if (!nn->nfsd_serv)
> +			if (!nfsd4_force_end_grace(nn))
>  				return -EBUSY;
>  			trace_nfsd_end_grace(netns(file));
> -			nfsd4_end_grace(nn);
>  			break;
>  		default:
>  			return -EINVAL;
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index b052c1effdc5..848c5383d782 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -849,7 +849,7 @@ static inline void nfsd4_revoke_states(struct net *net, struct super_block *sb)
>  #endif
>  
>  /* grace period management */
> -void nfsd4_end_grace(struct nfsd_net *nn);
> +bool nfsd4_force_end_grace(struct nfsd_net *nn);
>  
>  /* nfs4recover operations */
>  extern int nfsd4_client_tracking_init(struct net *net);

-- 
Jeff Layton <jlayton@kernel.org>


* Re: [PATCH v3 2/2] nfsd: use workqueue enable/disable APIs for v4_end_grace sync
  2025-12-13 18:42 ` [PATCH v3 2/2] nfsd: use workqueue enable/disable APIs for v4_end_grace sync Chuck Lever
@ 2025-12-14  1:07   ` Jeff Layton
  2025-12-15  8:00   ` Li Lingfeng
  1 sibling, 0 replies; 8+ messages in thread
From: Jeff Layton @ 2025-12-14  1:07 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, lilingfeng3, yangerkun, yi.zhang, houtao1,
	chengzhihao1, yukuai3

On Sat, 2025-12-13 at 13:42 -0500, Chuck Lever wrote:
> From: NeilBrown <neil@brown.name>
> 
> "nfsd: provide locking for v4_end_grace" introduced a
> client_tracking_active flag protected by nn->client_lock to prevent
> the laundromat from being scheduled before client tracking
> initialization or after shutdown begins. That commit is suitable for
> backporting to LTS kernels that predate commit 86898fa6b8cd
> ("workqueue: Implement disable/enable for (delayed) work items").
> 
> However, the workqueue subsystem in recent kernels provides
> enable_delayed_work() and disable_delayed_work_sync() for this
> purpose. Using this mechanism enables us to remove the
> client_tracking_active flag and associated spinlock operations
> while preserving the same synchronization guarantees, which is
> a cleaner long-term approach.
> 
> Signed-off-by: NeilBrown <neil@brown.name>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfsd/netns.h     |  1 -
>  fs/nfsd/nfs4state.c | 22 +++++++++-------------
>  2 files changed, 9 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> index fe8338735e7c..d83c68872c4c 100644
> --- a/fs/nfsd/netns.h
> +++ b/fs/nfsd/netns.h
> @@ -67,7 +67,6 @@ struct nfsd_net {
>  	struct lock_manager nfsd4_manager;
>  	bool grace_ended;
>  	bool grace_end_forced;
> -	bool client_tracking_active;
>  	time64_t boot_time;
>  
>  	struct dentry *nfsd_client_dir;
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 1d307cc533d9..c9be724c48d0 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -6619,14 +6619,14 @@ bool nfsd4_force_end_grace(struct nfsd_net *nn)
>  {
>  	if (!nn->client_tracking_ops)
>  		return false;
> -	spin_lock(&nn->client_lock);
> -	if (nn->grace_ended || !nn->client_tracking_active) {
> -		spin_unlock(&nn->client_lock);
> +	if (READ_ONCE(nn->grace_ended))
>  		return false;
> -	}
> +	/* laundromat_work must be initialised now, though it might be disabled */
>  	WRITE_ONCE(nn->grace_end_forced, true);

Ahh ok, I get it now. This is much cleaner, but you do need the
READ/WRITE_ONCE semantics once you get rid of the spinlock. I withdraw
my objection to patch #1.
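
A userspace analog of the lockless pattern, for illustration only. C11
relaxed atomics stand in for READ_ONCE()/WRITE_ONCE() here; they are
not the kernel macros. struct model_net and the model_*() helpers are
hypothetical, and the reader is a much-reduced version of
clients_still_reclaiming() that omits the time-based checks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in; fields loosely follow struct nfsd_net. */
struct model_net {
	atomic_bool grace_end_forced;	/* written by the v4_end_grace writer */
	int nr_reclaim_complete;	/* clients that finished reclaim */
	int reclaim_table_size;		/* clients expected to reclaim */
};

void model_create(struct model_net *nn, int expected_clients)
{
	assert(expected_clients >= 0);
	atomic_init(&nn->grace_end_forced, false);
	nn->nr_reclaim_complete = 0;
	nn->reclaim_table_size = expected_clients;
}

/* Writer side: analog of WRITE_ONCE(nn->grace_end_forced, true). */
void model_force(struct model_net *nn)
{
	atomic_store_explicit(&nn->grace_end_forced, true,
			      memory_order_relaxed);
}

/* Reader side: analog of the READ_ONCE() check in the laundromat path. */
bool model_clients_still_reclaiming(struct model_net *nn)
{
	if (atomic_load_explicit(&nn->grace_end_forced,
				 memory_order_relaxed))
		return false;	/* forced: skip the remaining checks */
	return nn->nr_reclaim_complete < nn->reclaim_table_size;
}
```

With no lock shared between writer and reader, the marked accesses are
what keep the compiler from tearing or caching the flag.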

> +	/* mod_delayed_work() doesn't queue work after
> +	 * nfs4_state_shutdown_net() has called disable_delayed_work_sync()
> +	 */
>  	mod_delayed_work(laundry_wq, &nn->laundromat_work, 0);
> -	spin_unlock(&nn->client_lock);
>  	return true;
>  }
>  
> @@ -8962,7 +8962,6 @@ static int nfs4_state_create_net(struct net *net)
>  	nn->boot_time = ktime_get_real_seconds();
>  	nn->grace_ended = false;
>  	nn->grace_end_forced = false;
> -	nn->client_tracking_active = false;
>  	nn->nfsd4_manager.block_opens = true;
>  	INIT_LIST_HEAD(&nn->nfsd4_manager.list);
>  	INIT_LIST_HEAD(&nn->client_lru);
> @@ -8977,6 +8976,8 @@ static int nfs4_state_create_net(struct net *net)
>  	INIT_LIST_HEAD(&nn->blocked_locks_lru);
>  
>  	INIT_DELAYED_WORK(&nn->laundromat_work, laundromat_main);
> +	/* Make sure this cannot run until client tracking is initialised */
> +	disable_delayed_work(&nn->laundromat_work);
>  	INIT_WORK(&nn->nfsd_shrinker_work, nfsd4_state_shrinker_worker);
>  	get_net(net);
>  
> @@ -9044,9 +9045,7 @@ nfs4_state_start_net(struct net *net)
>  	locks_start_grace(net, &nn->nfsd4_manager);
>  	nfsd4_client_tracking_init(net);
>  	/* safe for laundromat to run now */
> -	spin_lock(&nn->client_lock);
> -	nn->client_tracking_active = true;
> -	spin_unlock(&nn->client_lock);
> +	enable_delayed_work(&nn->laundromat_work);
>  	if (nn->track_reclaim_completes && nn->reclaim_str_hashtbl_size == 0)
>  		goto skip_grace;
>  	printk(KERN_INFO "NFSD: starting %lld-second grace period (net %x)\n",
> @@ -9095,10 +9094,7 @@ nfs4_state_shutdown_net(struct net *net)
>  
>  	shrinker_free(nn->nfsd_client_shrinker);
>  	cancel_work_sync(&nn->nfsd_shrinker_work);
> -	spin_lock(&nn->client_lock);
> -	nn->client_tracking_active = false;
> -	spin_unlock(&nn->client_lock);
> -	cancel_delayed_work_sync(&nn->laundromat_work);
> +	disable_delayed_work_sync(&nn->laundromat_work);
>  	locks_end_grace(&nn->nfsd4_manager);
>  
>  	INIT_LIST_HEAD(&reaplist);

-- 
Jeff Layton <jlayton@kernel.org>


* Re: [PATCH v3 0/2] provide locking for v4_end_grace
  2025-12-13 18:41 [PATCH v3 0/2] provide locking for v4_end_grace Chuck Lever
  2025-12-13 18:41 ` [PATCH v3 1/2] nfsd: " Chuck Lever
  2025-12-13 18:42 ` [PATCH v3 2/2] nfsd: use workqueue enable/disable APIs for v4_end_grace sync Chuck Lever
@ 2025-12-14  1:08 ` Jeff Layton
  2 siblings, 0 replies; 8+ messages in thread
From: Jeff Layton @ 2025-12-14  1:08 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, lilingfeng3, yangerkun, yi.zhang, houtao1,
	chengzhihao1, yukuai3, Chuck Lever

On Sat, 2025-12-13 at 13:41 -0500, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> Following up on:
> 
> https://lore.kernel.org/linux-nfs/175136659151.565058.6474755472267609432@noble.neil.brown.name/#r
> 
> This is now two patches: one that can be backported, and one that
> simplifies the fix based on mechanisms available only in recent
> kernels. I've also addressed all the review comments I could find.
> 
> These patches have been compile-tested only.
> 
> NeilBrown (2):
>   nfsd: provide locking for v4_end_grace
>   nfsd: use workqueue enable/disable APIs for v4_end_grace sync
> 
>  fs/nfsd/netns.h     |  1 +
>  fs/nfsd/nfs4state.c | 40 +++++++++++++++++++++++++++++++++++++---
>  fs/nfsd/nfsctl.c    |  3 +--
>  fs/nfsd/state.h     |  2 +-
>  4 files changed, 40 insertions(+), 6 deletions(-)

Reviewed-by: Jeff Layton <jlayton@kernel.org>


* Re: [PATCH v3 1/2] nfsd: provide locking for v4_end_grace
  2025-12-13 18:41 ` [PATCH v3 1/2] nfsd: " Chuck Lever
  2025-12-14  1:05   ` Jeff Layton
@ 2025-12-15  7:28   ` Li Lingfeng
  1 sibling, 0 replies; 8+ messages in thread
From: Li Lingfeng @ 2025-12-15  7:28 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo,
	Tom Talpey
  Cc: linux-nfs, yangerkun, yi.zhang, houtao1, chengzhihao1

Hi,

On 2025/12/14 2:41, Chuck Lever wrote:
> From: NeilBrown <neil@brown.name>
>
> Writing to v4_end_grace can race with server shutdown and result in
> memory being accessed after it was freed - reclaim_str_hashtbl in
> particular.
>
> We cannot hold nfsd_mutex across the nfsd4_end_grace() call as that is
> held while client_tracking_ops->init() is called and that can wait for
> an upcall to nfsdcltrack which can write to v4_end_grace, resulting in a
> deadlock.
>
> nfsd4_end_grace() is also called by the laundromat work queue and this
> doesn't require locking as server shutdown will stop the work and wait
> for it before freeing anything that nfsd4_end_grace() might access.
>
> However, we must be sure that writing to v4_end_grace doesn't restart
> the work item after shutdown has already waited for it.  For this we
> add a new flag protected with nn->client_lock.  It is set only while it
> is safe to make client tracking calls, and v4_end_grace only schedules
> work while the flag is set with the spinlock held.
>
> So this patch adds a nfsd_net field "client_tracking_active" which is
> set as described.  Another field, "grace_end_forced", is set when
> v4_end_grace is written.  After this is set, and provided
> client_tracking_active is set, the laundromat is scheduled.
> This "grace_end_forced" field bypasses other checks for whether the
> grace period has finished.
>
> This resolves a race which can result in use-after-free.
>
> Reported-by: Li Lingfeng <lilingfeng3@huawei.com>
> Closes: https://lore.kernel.org/linux-nfs/20250623030015.2353515-1-neil@brown.name/T/#t
> Fixes: 7f5ef2e900d9 ("nfsd: add a v4_end_grace file to /proc/fs/nfsd")
> X-Cc: stable@vger.kernel.org
> Signed-off-by: NeilBrown <neil@brown.name>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>   fs/nfsd/netns.h     |  2 ++
>   fs/nfsd/nfs4state.c | 42 ++++++++++++++++++++++++++++++++++++++++--
>   fs/nfsd/nfsctl.c    |  3 +--
>   fs/nfsd/state.h     |  2 +-
>   4 files changed, 44 insertions(+), 5 deletions(-)
>
> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> index 3e2d0fde80a7..fe8338735e7c 100644
> --- a/fs/nfsd/netns.h
> +++ b/fs/nfsd/netns.h
> @@ -66,6 +66,8 @@ struct nfsd_net {
>   
>   	struct lock_manager nfsd4_manager;
>   	bool grace_ended;
> +	bool grace_end_forced;
> +	bool client_tracking_active;
>   	time64_t boot_time;
>   
>   	struct dentry *nfsd_client_dir;
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index d0efa3e0965f..1d307cc533d9 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -84,7 +84,7 @@ static u64 current_sessionid = 1;
>   /* forward declarations */
>   static bool check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner);
>   static void nfs4_free_ol_stateid(struct nfs4_stid *stid);
> -void nfsd4_end_grace(struct nfsd_net *nn);
> +static void nfsd4_end_grace(struct nfsd_net *nn);
>   static void _free_cpntf_state_locked(struct nfsd_net *nn, struct nfs4_cpntf_state *cps);
>   static void nfsd4_file_hash_remove(struct nfs4_file *fi);
>   static void deleg_reaper(struct nfsd_net *nn);
> @@ -6570,7 +6570,7 @@ nfsd4_renew(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>   	return nfs_ok;
>   }
>   
> -void
> +static void
>   nfsd4_end_grace(struct nfsd_net *nn)
>   {
>   	/* do nothing if grace period already ended */
> @@ -6603,6 +6603,33 @@ nfsd4_end_grace(struct nfsd_net *nn)
>   	 */
>   }
>   
> +/**
> + * nfsd4_force_end_grace - forcibly end the NFSv4 grace period
> + * @nn: network namespace for the server instance to be updated
> + *
> + * Forces bypass of normal grace period completion, then schedules
> + * the laundromat to end the grace period immediately. Does not wait
> + * for the grace period to fully terminate before returning.
> + *
> + * Return values:
> + *   %true: Grace termination scheduled
> + *   %false: No action was taken
> + */
> +bool nfsd4_force_end_grace(struct nfsd_net *nn)
> +{
> +	if (!nn->client_tracking_ops)
> +		return false;
> +	spin_lock(&nn->client_lock);
> +	if (nn->grace_ended || !nn->client_tracking_active) {
> +		spin_unlock(&nn->client_lock);
> +		return false;
> +	}
> +	WRITE_ONCE(nn->grace_end_forced, true);
> +	mod_delayed_work(laundry_wq, &nn->laundromat_work, 0);
> +	spin_unlock(&nn->client_lock);
> +	return true;
> +}
> +
>   /*
>    * If we've waited a lease period but there are still clients trying to
>    * reclaim, wait a little longer to give them a chance to finish.
> @@ -6612,6 +6639,8 @@ static bool clients_still_reclaiming(struct nfsd_net *nn)
>   	time64_t double_grace_period_end = nn->boot_time +
>   					   2 * nn->nfsd4_lease;
>   
> +	if (READ_ONCE(nn->grace_end_forced))
> +		return false;
>   	if (nn->track_reclaim_completes &&
>   			atomic_read(&nn->nr_reclaim_complete) ==
>   			nn->reclaim_str_hashtbl_size)
> @@ -8932,6 +8961,8 @@ static int nfs4_state_create_net(struct net *net)
>   	nn->unconf_name_tree = RB_ROOT;
>   	nn->boot_time = ktime_get_real_seconds();
>   	nn->grace_ended = false;
> +	nn->grace_end_forced = false;
> +	nn->client_tracking_active = false;
>   	nn->nfsd4_manager.block_opens = true;
>   	INIT_LIST_HEAD(&nn->nfsd4_manager.list);
>   	INIT_LIST_HEAD(&nn->client_lru);
> @@ -9012,6 +9043,10 @@ nfs4_state_start_net(struct net *net)
>   		return ret;
>   	locks_start_grace(net, &nn->nfsd4_manager);
>   	nfsd4_client_tracking_init(net);
> +	/* safe for laundromat to run now */
> +	spin_lock(&nn->client_lock);
> +	nn->client_tracking_active = true;
> +	spin_unlock(&nn->client_lock);
>   	if (nn->track_reclaim_completes && nn->reclaim_str_hashtbl_size == 0)
>   		goto skip_grace;
>   	printk(KERN_INFO "NFSD: starting %lld-second grace period (net %x)\n",
> @@ -9060,6 +9095,9 @@ nfs4_state_shutdown_net(struct net *net)
>   
>   	shrinker_free(nn->nfsd_client_shrinker);
>   	cancel_work_sync(&nn->nfsd_shrinker_work);
> +	spin_lock(&nn->client_lock);
> +	nn->client_tracking_active = false;
> +	spin_unlock(&nn->client_lock);
>   	cancel_delayed_work_sync(&nn->laundromat_work);
>   	locks_end_grace(&nn->nfsd4_manager);
>   
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index 5ce9a49e76ba..242fcbd958f1 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -1082,10 +1082,9 @@ static ssize_t write_v4_end_grace(struct file *file, char *buf, size_t size)
>   		case 'Y':
>   		case 'y':
>   		case '1':
> -			if (!nn->nfsd_serv)
> +			if (!nfsd4_force_end_grace(nn))
>   				return -EBUSY;
>   			trace_nfsd_end_grace(netns(file));
> -			nfsd4_end_grace(nn);
>   			break;
>   		default:
>   			return -EINVAL;
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index b052c1effdc5..848c5383d782 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -849,7 +849,7 @@ static inline void nfsd4_revoke_states(struct net *net, struct super_block *sb)
>   #endif
>   
>   /* grace period management */
> -void nfsd4_end_grace(struct nfsd_net *nn);
> +bool nfsd4_force_end_grace(struct nfsd_net *nn);
>   
>   /* nfs4recover operations */
>   extern int nfsd4_client_tracking_init(struct net *net);
Thank you for your patch. I reproduced and verified the issue using the
following steps:
mkfs.ext4 -F /dev/sdb
mount /dev/sdb /mnt/sdb
echo "/mnt *(rw,no_root_squash,fsid=0)" > /etc/exports
echo "/mnt/sdb *(rw,no_root_squash,fsid=1)" >> /etc/exports
systemctl restart nfs-server
mount -t nfs -o rw,vers=4.2 127.0.0.1:/sdb /mnt/sdbb
systemctl restart nfs-server
echo 1 > /proc/fs/nfsd/v4_end_grace &
echo 0 > /proc/fs/nfsd/threads


Based on: master at commit 416f99c3b16f582a3fc6d64a1f77f39d94b76de5


Debug diff (adds delays to widen the race window):
diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index b39d4cbdfd35..339718af9be3 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -1421,6 +1421,10 @@ nfsd4_cld_grace_done(struct nfsd_net *nn)

         free_cld_upcall(cup);
  out_err:
+       printk("%s %d\n", __func__, __LINE__);
+       printk("%s sleep before release reclaim...\n", __func__);
+       msleep(5 * 1000);
+       printk("%s sleep before release reclaim done\n", __func__);
         nfs4_release_reclaim(nn);
         if (ret)
                 printk(KERN_ERR "NFSD: Unable to end grace period: %d\n", ret);
@@ -1454,6 +1458,10 @@ nfs4_cld_state_shutdown(struct net *net)

         nn->track_reclaim_completes = false;
         kfree(nn->reclaim_str_hashtbl);
+       printk("%s free nn->reclaim_str_hashtbl %px done\n", __func__, nn->reclaim_str_hashtbl);
+       printk("%s sleep after free...\n", __func__);
+       msleep(10 * 1000);
+       printk("%s sleep done\n", __func__);
  }

  static bool


The original problematic execution flow looks like this:
                 T1                            T2
// echo 1 > /proc/fs/nfsd/v4_end_grace
write_v4_end_grace
  nfsd4_end_grace
   nfsd4_record_grace_done
    nfsd4_cld_grace_done
                                 // echo 0 > /proc/fs/nfsd/threads
                                 write_threads
                                  nfsd_svc
                                   nfsd_destroy_serv
                                    nfsd_shutdown_net
                                     nfs4_state_shutdown_net
                                      nfsd4_client_tracking_exit
                                       nfsd4_cld_tracking_exit
                                        nfs4_cld_state_shutdown
                                         kfree // nn->reclaim_str_hashtbl
     nfs4_release_reclaim
      &nn->reclaim_str_hashtbl[i] // UAF


This patch moves the handling of nfsd4_end_grace() triggered by writing to
v4_end_grace into laundromat_work. As a result, the concurrently executing
T2 path will be blocked by cancel_delayed_work_sync() in
nfs4_state_shutdown_net(), preventing T2 from freeing reclaim_str_hashtbl
while laundromat_work is still running.
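
The ordering described above can be modelled in userspace. The following is
only a rough sketch, not kernel code: it assumes a pthread stands in for
laundromat_work, a mutex stands in for nn->client_lock, and pthread_join()
plays the role of cancel_delayed_work_sync(). Every model_* name is
hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the relevant fields of struct nfsd_net. */
struct model_net {
	pthread_mutex_t lock;     /* role of nn->client_lock */
	bool tracking_active;     /* role of nn->client_tracking_active */
	bool grace_ended;
	bool grace_end_forced;
	bool worker_started;      /* a "laundromat" thread exists */
	pthread_t worker;
	int *reclaim_tbl;         /* role of nn->reclaim_str_hashtbl */
	size_t reclaim_tbl_size;
};

/* Role of laundromat_main(): the only context that touches the table. */
static void *model_laundromat(void *arg)
{
	struct model_net *nn = arg;
	size_t i;

	/* Shutdown joins this thread before freeing, so the table is valid. */
	assert(nn->reclaim_tbl != NULL);
	for (i = 0; i < nn->reclaim_tbl_size; i++)
		nn->reclaim_tbl[i] = 0;

	pthread_mutex_lock(&nn->lock);
	nn->grace_ended = true;
	pthread_mutex_unlock(&nn->lock);
	return NULL;
}

/* Role of nfs4_state_create_net(): everything starts out inactive. */
void model_init(struct model_net *nn)
{
	pthread_mutex_init(&nn->lock, NULL);
	nn->tracking_active = false;
	nn->grace_ended = false;
	nn->grace_end_forced = false;
	nn->worker_started = false;
	nn->reclaim_tbl = NULL;
	nn->reclaim_tbl_size = 0;
}

/* Role of nfs4_state_start_net(): set up the table, then allow the worker. */
int model_start(struct model_net *nn, size_t size)
{
	int *tbl = calloc(size, sizeof(*tbl));

	if (!tbl)
		return -1;
	nn->reclaim_tbl = tbl;
	nn->reclaim_tbl_size = size;
	pthread_mutex_lock(&nn->lock);
	nn->tracking_active = true;
	pthread_mutex_unlock(&nn->lock);
	return 0;
}

/* Role of nfsd4_force_end_grace(): only schedule, never end grace inline. */
bool model_force_end_grace(struct model_net *nn)
{
	bool scheduled = false;

	pthread_mutex_lock(&nn->lock);
	if (!nn->grace_ended && nn->tracking_active) {
		nn->grace_end_forced = true;
		if (!nn->worker_started)
			nn->worker_started = pthread_create(&nn->worker, NULL,
							    model_laundromat, nn) == 0;
		scheduled = nn->worker_started;
	}
	pthread_mutex_unlock(&nn->lock);
	return scheduled;
}

/* Role of nfs4_state_shutdown_net(): block new work, wait, then free. */
void model_shutdown(struct model_net *nn)
{
	bool started;

	pthread_mutex_lock(&nn->lock);
	nn->tracking_active = false;
	started = nn->worker_started;
	nn->worker_started = false;
	pthread_mutex_unlock(&nn->lock);

	/* Unlike cancel_delayed_work_sync(), this can only wait, not cancel. */
	if (started)
		pthread_join(nn->worker, NULL);

	free(nn->reclaim_tbl);
	nn->reclaim_tbl = NULL;
	nn->reclaim_tbl_size = 0;
}
```

In this model a caller would run model_init(), model_start(), then
model_force_end_grace() and model_shutdown() in any interleaving. The free
always happens after the worker has finished, and a forced end of grace that
arrives after shutdown has begun is refused, which is the property the patch
is meant to provide.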

Tested-by: Li Lingfeng <lilingfeng3@huawei.com>


* Re: [PATCH v3 2/2] nfsd: use workqueue enable/disable APIs for v4_end_grace sync
  2025-12-13 18:42 ` [PATCH v3 2/2] nfsd: use workqueue enable/disable APIs for v4_end_grace sync Chuck Lever
  2025-12-14  1:07   ` Jeff Layton
@ 2025-12-15  8:00   ` Li Lingfeng
  1 sibling, 0 replies; 8+ messages in thread
From: Li Lingfeng @ 2025-12-15  8:00 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo,
	Tom Talpey
  Cc: linux-nfs, yangerkun, yi.zhang, houtao1, chengzhihao1

Hi,

On 2025/12/14 2:42, Chuck Lever wrote:
> From: NeilBrown <neil@brown.name>
>
> "nfsd: provide locking for v4_end_grace" introduced a
> client_tracking_active flag protected by nn->client_lock to prevent
> the laundromat from being scheduled before client tracking
> initialization or after shutdown begins. That commit is suitable for
> backporting to LTS kernels that predate commit 86898fa6b8cd
> ("workqueue: Implement disable/enable for (delayed) work items").
>
> However, the workqueue subsystem in recent kernels provides
> enable_delayed_work() and disable_delayed_work_sync() for this
> purpose. Using this mechanism enables us to remove the
> client_tracking_active flag and associated spinlock operations
> while preserving the same synchronization guarantees, which is
> a cleaner long-term approach.
>
> Signed-off-by: NeilBrown <neil@brown.name>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>   fs/nfsd/netns.h     |  1 -
>   fs/nfsd/nfs4state.c | 22 +++++++++-------------
>   2 files changed, 9 insertions(+), 14 deletions(-)
>
> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> index fe8338735e7c..d83c68872c4c 100644
> --- a/fs/nfsd/netns.h
> +++ b/fs/nfsd/netns.h
> @@ -67,7 +67,6 @@ struct nfsd_net {
>   	struct lock_manager nfsd4_manager;
>   	bool grace_ended;
>   	bool grace_end_forced;
> -	bool client_tracking_active;
>   	time64_t boot_time;
>   
>   	struct dentry *nfsd_client_dir;
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 1d307cc533d9..c9be724c48d0 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -6619,14 +6619,14 @@ bool nfsd4_force_end_grace(struct nfsd_net *nn)
>   {
>   	if (!nn->client_tracking_ops)
>   		return false;
> -	spin_lock(&nn->client_lock);
> -	if (nn->grace_ended || !nn->client_tracking_active) {
> -		spin_unlock(&nn->client_lock);
> +	if (READ_ONCE(nn->grace_ended))
>   		return false;
> -	}
> +	/* laundromat_work must be initialised now, though it might be disabled */
>   	WRITE_ONCE(nn->grace_end_forced, true);
> +	/* mod_delayed_work() doesn't queue work after
> +	 * nfs4_state_shutdown_net() has called disable_delayed_work_sync()
> +	 */
>   	mod_delayed_work(laundry_wq, &nn->laundromat_work, 0);
> -	spin_unlock(&nn->client_lock);
>   	return true;
>   }
>   
> @@ -8962,7 +8962,6 @@ static int nfs4_state_create_net(struct net *net)
>   	nn->boot_time = ktime_get_real_seconds();
>   	nn->grace_ended = false;
>   	nn->grace_end_forced = false;
> -	nn->client_tracking_active = false;
>   	nn->nfsd4_manager.block_opens = true;
>   	INIT_LIST_HEAD(&nn->nfsd4_manager.list);
>   	INIT_LIST_HEAD(&nn->client_lru);
> @@ -8977,6 +8976,8 @@ static int nfs4_state_create_net(struct net *net)
>   	INIT_LIST_HEAD(&nn->blocked_locks_lru);
>   
>   	INIT_DELAYED_WORK(&nn->laundromat_work, laundromat_main);
> +	/* Make sure this cannot run until client tracking is initialised */
> +	disable_delayed_work(&nn->laundromat_work);
>   	INIT_WORK(&nn->nfsd_shrinker_work, nfsd4_state_shrinker_worker);
>   	get_net(net);
>   
> @@ -9044,9 +9045,7 @@ nfs4_state_start_net(struct net *net)
>   	locks_start_grace(net, &nn->nfsd4_manager);
>   	nfsd4_client_tracking_init(net);
>   	/* safe for laundromat to run now */
> -	spin_lock(&nn->client_lock);
> -	nn->client_tracking_active = true;
> -	spin_unlock(&nn->client_lock);
> +	enable_delayed_work(&nn->laundromat_work);
>   	if (nn->track_reclaim_completes && nn->reclaim_str_hashtbl_size == 0)
>   		goto skip_grace;
>   	printk(KERN_INFO "NFSD: starting %lld-second grace period (net %x)\n",
> @@ -9095,10 +9094,7 @@ nfs4_state_shutdown_net(struct net *net)
>   
>   	shrinker_free(nn->nfsd_client_shrinker);
>   	cancel_work_sync(&nn->nfsd_shrinker_work);
> -	spin_lock(&nn->client_lock);
> -	nn->client_tracking_active = false;
> -	spin_unlock(&nn->client_lock);
> -	cancel_delayed_work_sync(&nn->laundromat_work);
> +	disable_delayed_work_sync(&nn->laundromat_work);
>   	locks_end_grace(&nn->nfsd4_manager);
>   
>   	INIT_LIST_HEAD(&reaplist);
I've also tested the second patch. After applying both patches, I was no
longer able to reproduce the issue, and everything looks fine from my
testing.
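
For readers less familiar with the workqueue disable/enable APIs, here is a
small userspace model of the behaviour the second patch relies on, as I
understand it: a work item carries a disable count, and attempts to queue it
are ignored while that count is non-zero. This is an assumption-laden sketch,
not the kernel implementation. The model_* names are hypothetical, the
return value of model_queue() does not follow the mod_delayed_work()
convention, and waiting for a running instance (the "_sync" part) is not
modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a delayed work item with a disable count. */
struct model_dwork {
	int disable_depth;   /* > 0 means queueing is ignored */
	bool pending;        /* the item is queued and will run */
};

void model_init_dwork(struct model_dwork *w)
{
	w->disable_depth = 0;
	w->pending = false;
}

/*
 * Role of disable_delayed_work(): bump the count and cancel any pending
 * instance. Returns true if the item was pending.
 */
bool model_disable(struct model_dwork *w)
{
	bool was_pending = w->pending;

	w->disable_depth++;
	w->pending = false;
	return was_pending;
}

/*
 * Role of enable_delayed_work(): drop the count. Returns true once the
 * count reaches zero, i.e. the item can be queued again.
 */
bool model_enable(struct model_dwork *w)
{
	assert(w->disable_depth > 0);   /* an unbalanced enable is a bug */
	w->disable_depth--;
	return w->disable_depth == 0;
}

/*
 * Role of mod_delayed_work(..., 0) in this model: returns true if the
 * item ends up pending, false if the request was ignored.
 */
bool model_queue(struct model_dwork *w)
{
	if (w->disable_depth > 0)
		return false;
	w->pending = true;
	return true;
}
```

Mapped onto the patch: nfs4_state_create_net() corresponds to
model_init_dwork() followed by model_disable(), nfs4_state_start_net() to
model_enable(), and nfs4_state_shutdown_net() to a second model_disable().
A write to v4_end_grace outside the enabled window then becomes a no-op
without needing the client_tracking_active flag.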

Tested-by: Li Lingfeng <lilingfeng3@huawei.com>



end of thread, other threads:[~2025-12-15  8:00 UTC | newest]

Thread overview: 8+ messages
2025-12-13 18:41 [PATCH v3 0/2] provide locking for v4_end_grace Chuck Lever
2025-12-13 18:41 ` [PATCH v3 1/2] nfsd: " Chuck Lever
2025-12-14  1:05   ` Jeff Layton
2025-12-15  7:28   ` Li Lingfeng
2025-12-13 18:42 ` [PATCH v3 2/2] nfsd: use workqueue enable/disable APIs for v4_end_grace sync Chuck Lever
2025-12-14  1:07   ` Jeff Layton
2025-12-15  8:00   ` Li Lingfeng
2025-12-14  1:08 ` [PATCH v3 0/2] provide locking for v4_end_grace Jeff Layton
