From: NeilBrown <neil@brown.name>
To: Chuck Lever <chuck.lever@oracle.com>, Jeff Layton <jlayton@kernel.org>
Cc: linux-nfs@vger.kernel.org,
Olga Kornievskaia <okorniev@redhat.com>,
Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
Li Lingfeng <lilingfeng3@huawei.com>
Subject: [PATCH 1/2] nfsd: provide locking for v4_end_grace
Date: Fri, 4 Jul 2025 17:20:17 +1000 [thread overview]
Message-ID: <20250704072332.3278129-2-neil@brown.name> (raw)
In-Reply-To: <20250704072332.3278129-1-neil@brown.name>
Writing to v4_end_grace can race with server shutdown and result in
memory being accessed after it was freed - reclaim_str_hashtbl in
particular.
We cannot hold nfsd_mutex across the nfsd4_end_grace() call as that is
held while client_tracking_op->init() is called and that can wait for
an upcall to nfsdcltrack which can write to v4_end_grace, resulting in a
deadlock.
nfsd4_end_grace() is also called by the landromat work queue and this
doesn't require locking as server shutdown will stop the work and wait
for it before freeing anything that nfsd4_end_grace() might access.
However, we must be sure that writing to v4_end_grace doesn't restart
the work item after shutdown has already waited for it. For this we add
a new flag protected with a spin_lock, and nn->client_lock is suitable.
It is set only while it is safe to make client tracking calls, and
v4_end_grace only schedules work while the flag is set and with the
spin_lock held.
So this patch adds an nfsd_net field "client_tracking_active" which is
set as described. Another field "grace_end_forced", is set when
v4_end_grace is written. After this is set, and providing
client_tracking_active is set, the laundromat is scheduled.
This "grace_end_forced" field bypasses other checks for whether the
grace period has finished.
This resolves a race which can result in use-after-free.
Reported-and-tested-by: Li Lingfeng <lilingfeng3@huawei.com>
Closes: https://lore.kernel.org/linux-nfs/20250513074305.3362209-1-lilingfeng3@huawei.com
Fixes: 7f5ef2e900d9 ("nfsd: add a v4_end_grace file to /proc/fs/nfsd")
Cc: stable@vger.kernel.org
Signed-off-by: NeilBrown <neil@brown.name>
---
fs/nfsd/netns.h | 2 ++
fs/nfsd/nfs4state.c | 45 +++++++++++++++++++++++++++++++++++++++++++--
fs/nfsd/nfsctl.c | 6 +++---
fs/nfsd/state.h | 2 +-
4 files changed, 49 insertions(+), 6 deletions(-)
diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index 3e2d0fde80a7..fe8338735e7c 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -66,6 +66,8 @@ struct nfsd_net {
struct lock_manager nfsd4_manager;
bool grace_ended;
+ bool grace_end_forced;
+ bool client_tracking_active;
time64_t boot_time;
struct dentry *nfsd_client_dir;
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index d5694987f86f..124fe4f669aa 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -84,7 +84,7 @@ static u64 current_sessionid = 1;
/* forward declarations */
static bool check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner);
static void nfs4_free_ol_stateid(struct nfs4_stid *stid);
-void nfsd4_end_grace(struct nfsd_net *nn);
+static void nfsd4_end_grace(struct nfsd_net *nn);
static void _free_cpntf_state_locked(struct nfsd_net *nn, struct nfs4_cpntf_state *cps);
static void nfsd4_file_hash_remove(struct nfs4_file *fi);
static void deleg_reaper(struct nfsd_net *nn);
@@ -6458,7 +6458,7 @@ nfsd4_renew(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
return nfs_ok;
}
-void
+static void
nfsd4_end_grace(struct nfsd_net *nn)
{
/* do nothing if grace period already ended */
@@ -6491,6 +6491,36 @@ nfsd4_end_grace(struct nfsd_net *nn)
*/
}
+/**
+ * nfsd4_force_end_grace - forcibly end the grace period
+ * @nn: nfsd_net in which the grace period must end.
+ *
+ *
+ * An nfsv4 grace period can be terminated early if it is known that
+ * no more client could reclaim state. Sometimes user-space can provide
+ * that information - which will potentially be provided asychnronously
+ * w.r.t. server startup or shutdown.
+ *
+ * nfsd4_force_end_grace() causing the grace period to end and takes
+ * care to ensure races with server start/stop are not problematic.
+ *
+ * Return value: %false if the NFS server was not active and
+ * %true if the server was, or may have been, active.
+ */
+bool
+nfsd4_force_end_grace(struct nfsd_net *nn)
+{
+ if (!nn->client_tracking_ops)
+ return false;
+ spin_lock(&nn->client_lock);
+ if (nn->client_tracking_active) {
+ nn->grace_end_forced = true;
+ mod_delayed_work(laundry_wq, &nn->laundromat_work, 0);
+ }
+ spin_unlock(&nn->client_lock);
+ return true;
+}
+
/*
* If we've waited a lease period but there are still clients trying to
* reclaim, wait a little longer to give them a chance to finish.
@@ -6500,6 +6530,8 @@ static bool clients_still_reclaiming(struct nfsd_net *nn)
time64_t double_grace_period_end = nn->boot_time +
2 * nn->nfsd4_lease;
+ if (nn->grace_end_forced)
+ return false;
if (nn->track_reclaim_completes &&
atomic_read(&nn->nr_reclaim_complete) ==
nn->reclaim_str_hashtbl_size)
@@ -8807,6 +8839,8 @@ static int nfs4_state_create_net(struct net *net)
nn->unconf_name_tree = RB_ROOT;
nn->boot_time = ktime_get_real_seconds();
nn->grace_ended = false;
+ nn->grace_end_forced = false;
+ nn->client_tracking_active = false;
nn->nfsd4_manager.block_opens = true;
INIT_LIST_HEAD(&nn->nfsd4_manager.list);
INIT_LIST_HEAD(&nn->client_lru);
@@ -8887,6 +8921,10 @@ nfs4_state_start_net(struct net *net)
return ret;
locks_start_grace(net, &nn->nfsd4_manager);
nfsd4_client_tracking_init(net);
+ /* safe for laundromat to run now */
+ spin_lock(&nn->client_lock);
+ nn->client_tracking_active = true;
+ spin_unlock(&nn->client_lock);
if (nn->track_reclaim_completes && nn->reclaim_str_hashtbl_size == 0)
goto skip_grace;
printk(KERN_INFO "NFSD: starting %lld-second grace period (net %x)\n",
@@ -8935,6 +8973,9 @@ nfs4_state_shutdown_net(struct net *net)
shrinker_free(nn->nfsd_client_shrinker);
cancel_work_sync(&nn->nfsd_shrinker_work);
+ spin_lock(&nn->client_lock);
+ nn->client_tracking_active = false;
+ spin_unlock(&nn->client_lock);
cancel_delayed_work_sync(&nn->laundromat_work);
locks_end_grace(&nn->nfsd4_manager);
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 3f3e9f6c4250..658f3f86a59f 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1082,10 +1082,10 @@ static ssize_t write_v4_end_grace(struct file *file, char *buf, size_t size)
case 'Y':
case 'y':
case '1':
- if (!nn->nfsd_serv)
- return -EBUSY;
trace_nfsd_end_grace(netns(file));
- nfsd4_end_grace(nn);
+ if (!nfsd4_force_end_grace(nn))
+ return -EBUSY;
+
break;
default:
return -EINVAL;
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 1995bca158b8..05eabc69de40 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -836,7 +836,7 @@ static inline void nfsd4_revoke_states(struct net *net, struct super_block *sb)
#endif
/* grace period management */
-void nfsd4_end_grace(struct nfsd_net *nn);
+bool nfsd4_force_end_grace(struct nfsd_net *nn);
/* nfs4recover operations */
extern int nfsd4_client_tracking_init(struct net *net);
--
2.49.0
next prev parent reply other threads:[~2025-07-04 7:24 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-04 7:20 [PATCH 0/2] nfsd: provide locking for v4_end_grace NeilBrown
2025-07-04 7:20 ` NeilBrown [this message]
2025-07-05 11:15 ` [PATCH 1/2] " Jeff Layton
2025-07-06 5:56 ` NeilBrown
2025-07-04 7:20 ` [PATCH 2/2] nfsd: discard client_tracking_active and instead disable laundromat_work NeilBrown
2025-07-04 15:10 ` Chuck Lever
2025-07-04 15:13 ` Chuck Lever
2025-07-05 11:20 ` Jeff Layton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250704072332.3278129-2-neil@brown.name \
--to=neil@brown.name \
--cc=Dai.Ngo@oracle.com \
--cc=chuck.lever@oracle.com \
--cc=jlayton@kernel.org \
--cc=lilingfeng3@huawei.com \
--cc=linux-nfs@vger.kernel.org \
--cc=okorniev@redhat.com \
--cc=tom@talpey.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox