From: Zhihao Cheng <chengzhihao1@huawei.com>
To: <trondmy@kernel.org>, <anna@kernel.org>
Cc: <linux-nfs@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
<yangerkun@huawei.com>, Li Lingfeng <lilingfeng3@huawei.com>
Subject: Re: [PATCH] NFSv4: Fix state recovery deadlock when server misses grace period
Date: Wed, 22 Apr 2026 14:55:46 +0800
Message-ID: <e8c5a503-e8b4-11ef-68fb-a0195ce07b07@huawei.com>
In-Reply-To: <20260422064447.358447-1-chengzhihao1@huawei.com>
On 2026/4/22 14:44, Zhihao Cheng wrote:
Adding lilingfeng3@huawei.com to Cc.
> NFS server restart causes client to enter an infinite loop during state
> recovery. The state manager gets stuck in NFS4CLNT_RECLAIM_NOGRACE processing,
> with the server repeatedly returning NFS4ERR_GRACE for each file iteration.
> This problem is reported in [1].
>
> Trigger sequence:
> 1. Client opens 2 files. After server reboot, client enters
> nfs4_do_reclaim(RECLAIM_REBOOT). Server misses grace period and returns
> NFS4ERR_NO_GRACE, causing client to set NFS4CLNT_RECLAIM_NOGRACE.
> 2. Client enters nfs4_do_reclaim(RECLAIM_NOGRACE) to recover first file.
> Server reboots again, open request returns NFS4ERR_BADSESSION, client
> sets NFS4CLNT_SESSION_RESET.
> 3. nfs4_reset_session calls nfs4_proc_create_session which fails with
> ETIMEDOUT due to a network failure; nfs4_handle_reclaim_lease_error sets
> NFS4CLNT_LEASE_EXPIRED but does NOT set NFS4CLNT_RECLAIM_REBOOT.
> 4. When nfs4_reclaim_lease runs, because NFS4CLNT_RECLAIM_NOGRACE is already
> set, it skips setting NFS4CLNT_RECLAIM_REBOOT (the bug, modified by
> commit b42353ff8d346 ("NFSv4.1: Clean up nfs4_reclaim_lease")).
> 5. Server never receives RECLAIM_COMPLETE, so cl_flags lacks
> NFSD4_CLIENT_RECLAIM_COMPLETE. When processing subsequent files,
> server always returns nfserr_grace, causing infinite retry loop.
>
> Fix it by setting NFS4CLNT_RECLAIM_REBOOT in nfs4_reclaim_lease if
> NFS4CLNT_SERVER_SCOPE_MISMATCH is not set, so that the client sends
> RECLAIM_COMPLETE to the server first, allowing subsequent nograce
> recovery to proceed.
>
> A reproducer is available in [2].
>
> [1] https://lore.kernel.org/linux-nfs/55da00d4-a656-4ed2-ae57-7f881297a1b2@huawei.com/
> [2] https://bugzilla.kernel.org/show_bug.cgi?id=221399
>
> Fixes: b42353ff8d346 ("NFSv4.1: Clean up nfs4_reclaim_lease")
> Cc: stable@vger.kernel.org
> Reported-by: Li Lingfeng <lilingfeng3@huawei.com>
> Closes: https://lore.kernel.org/linux-nfs/55da00d4-a656-4ed2-ae57-7f881297a1b2@huawei.com/
> Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
> ---
> fs/nfs/nfs4state.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index 305a772e5497..817327e73d88 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -2012,7 +2012,7 @@ static int nfs4_reclaim_lease(struct nfs_client *clp)
> return nfs4_handle_reclaim_lease_error(clp, status);
> if (test_and_clear_bit(NFS4CLNT_SERVER_SCOPE_MISMATCH, &clp->cl_state))
> nfs4_state_start_reclaim_nograce(clp);
> - if (!test_bit(NFS4CLNT_RECLAIM_NOGRACE, &clp->cl_state))
> + else
> set_bit(NFS4CLNT_RECLAIM_REBOOT, &clp->cl_state);
> clear_bit(NFS4CLNT_CHECK_LEASE, &clp->cl_state);
> clear_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state);
>
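
For readers following the flag logic, here is a minimal userspace sketch of
the decision at the end of nfs4_reclaim_lease(), before and after the change.
It is a model only, not the kernel code: the enum values and
reclaim_lease_tail() are hypothetical stand-ins for the NFS4CLNT_* bits in
clp->cl_state, and nfs4_state_start_reclaim_nograce() is simplified to just
setting the NOGRACE bit.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the NFS4CLNT_* bits in clp->cl_state. */
enum {
	RECLAIM_REBOOT        = 1u << 0,
	RECLAIM_NOGRACE       = 1u << 1,
	SERVER_SCOPE_MISMATCH = 1u << 2,
	CHECK_LEASE           = 1u << 3,
	LEASE_EXPIRED         = 1u << 4,
};

/*
 * Model of the tail of nfs4_reclaim_lease() once the lease has been
 * re-established. 'patched' selects the new behaviour from the diff.
 */
static unsigned int reclaim_lease_tail(unsigned int state, bool patched)
{
	bool mismatch = (state & SERVER_SCOPE_MISMATCH) != 0;

	if (mismatch) {
		/* test_and_clear_bit() followed by the nograce path
		 * (simplified: assume it only sets RECLAIM_NOGRACE). */
		state &= ~SERVER_SCOPE_MISMATCH;
		state |= RECLAIM_NOGRACE;
	}

	if (patched) {
		/* New: schedule reboot reclaim unless the scope changed. */
		if (!mismatch)
			state |= RECLAIM_REBOOT;
	} else {
		/* Old: skipped whenever NOGRACE was already pending. */
		if (!(state & RECLAIM_NOGRACE))
			state |= RECLAIM_REBOOT;
	}

	state &= ~(CHECK_LEASE | LEASE_EXPIRED);
	return state;
}
```

With the state from step 4 of the trigger sequence (RECLAIM_NOGRACE and
LEASE_EXPIRED set, no scope mismatch), the old branch returns without
RECLAIM_REBOOT, while the patched branch sets it, which is what lets the
client go through the reboot-reclaim pass and send RECLAIM_COMPLETE.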