[PATCH] NFSv4: Fix state recovery deadlock when server misses grace period

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH] NFSv4: Fix state recovery deadlock when server misses grace period
@ 2026-04-22  6:44 Zhihao Cheng
  2026-04-22  6:55 ` Zhihao Cheng
  0 siblings, 1 reply; 4+ messages in thread
From: Zhihao Cheng @ 2026-04-22  6:44 UTC (permalink / raw)
  To: trondmy, anna; +Cc: linux-nfs, linux-kernel, chengzhihao1, yangerkun

NFS server restart causes client to enter an infinite loop during state
recovery. The state manager gets stuck in NFS4CLNT_RECLAIM_NOGRACE processing,
with the server repeatedly returning NFS4ERR_GRACE for each file iteration.
This problem is reported in [1].

Trigger sequence:
 1. Client opens 2 files. After server reboot, client enters
    nfs4_do_reclaim(RECLAIM_REBOOT). Server misses grace period and returns
    NFS4ERR_NO_GRACE, causing client to set NFS4CLNT_RECLAIM_NOGRACE.
 2. Client enters nfs4_do_reclaim(RECLAIM_NOGRACE) to recover first file.
    Server reboots again, open request returns NFS4ERR_BADSESSION, client
    sets NFS4CLNT_SESSION_RESET.
 3. nfs4_reset_session calls nfs4_proc_create_session which fails with
    ETIMEDOUT due to network¹ÊÕÏ, nfs4_handle_reclaim_lease_error sets
    NFS4CLNT_LEASE_EXPIRED but does NOT set NFS4CLNT_RECLAIM_REBOOT.
 4. When nfs4_reclaim_lease runs, because NFS4CLNT_RECLAIM_NOGRACE is already
    set, it skips setting NFS4CLNT_RECLAIM_REBOOT (the bug, modified by
    commit b42353ff8d346 ("NFSv4.1: Clean up nfs4_reclaim_lease")).
 5. Server never receives RECLAIM_COMPLETE, so cl_flags lacks
    NFSD4_CLIENT_RECLAIM_COMPLETE. When processing subsequent files,
    server always returns nfserr_grace, causing infinite retry loop.

Fix it by setting NFS4CLNT_RECLAIM_REBOOT in nfs4_reclaim_lease if
NFS4CLNT_SERVER_SCOPE_MISMATCH is not set, so that the client sends
RECLAIM_COMPLETE to the server first, allowing subsequent nograce
recovery to proceed.

Fetch a reproducer in [2].

[1] https://lore.kernel.org/linux-nfs/55da00d4-a656-4ed2-ae57-7f881297a1b2@huawei.com/
[2] https://bugzilla.kernel.org/show_bug.cgi?id=221399

Fixes: b42353ff8d346 ("NFSv4.1: Clean up nfs4_reclaim_lease")
Cc: stable@vger.kernel.org
Reported-by: Li Lingfeng <lilingfeng3@huawei.com>
Closes: https://lore.kernel.org/linux-nfs/55da00d4-a656-4ed2-ae57-7f881297a1b2@huawei.com/
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
---
 fs/nfs/nfs4state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 305a772e5497..817327e73d88 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -2012,7 +2012,7 @@ static int nfs4_reclaim_lease(struct nfs_client *clp)
 		return nfs4_handle_reclaim_lease_error(clp, status);
 	if (test_and_clear_bit(NFS4CLNT_SERVER_SCOPE_MISMATCH, &clp->cl_state))
 		nfs4_state_start_reclaim_nograce(clp);
-	if (!test_bit(NFS4CLNT_RECLAIM_NOGRACE, &clp->cl_state))
+	else
 		set_bit(NFS4CLNT_RECLAIM_REBOOT, &clp->cl_state);
 	clear_bit(NFS4CLNT_CHECK_LEASE, &clp->cl_state);
 	clear_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state);
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] NFSv4: Fix state recovery deadlock when server misses grace period
  2026-04-22  6:44 [PATCH] NFSv4: Fix state recovery deadlock when server misses grace period Zhihao Cheng
@ 2026-04-22  6:55 ` Zhihao Cheng
  2026-04-22 12:38   ` Trond Myklebust
  0 siblings, 1 reply; 4+ messages in thread
From: Zhihao Cheng @ 2026-04-22  6:55 UTC (permalink / raw)
  To: trondmy, anna; +Cc: linux-nfs, linux-kernel, yangerkun, Li Lingfeng

在 2026/4/22 14:44, Zhihao Cheng 写道:
Add lilingfeng3@huawei.com
> NFS server restart causes client to enter an infinite loop during state
> recovery. The state manager gets stuck in NFS4CLNT_RECLAIM_NOGRACE processing,
> with the server repeatedly returning NFS4ERR_GRACE for each file iteration.
> This problem is reported in [1].
> 
> Trigger sequence:
>   1. Client opens 2 files. After server reboot, client enters
>      nfs4_do_reclaim(RECLAIM_REBOOT). Server misses grace period and returns
>      NFS4ERR_NO_GRACE, causing client to set NFS4CLNT_RECLAIM_NOGRACE.
>   2. Client enters nfs4_do_reclaim(RECLAIM_NOGRACE) to recover first file.
>      Server reboots again, open request returns NFS4ERR_BADSESSION, client
>      sets NFS4CLNT_SESSION_RESET.
>   3. nfs4_reset_session calls nfs4_proc_create_session which fails with
>      ETIMEDOUT due to network¹ÊÕÏ, nfs4_handle_reclaim_lease_error sets
>      NFS4CLNT_LEASE_EXPIRED but does NOT set NFS4CLNT_RECLAIM_REBOOT.
>   4. When nfs4_reclaim_lease runs, because NFS4CLNT_RECLAIM_NOGRACE is already
>      set, it skips setting NFS4CLNT_RECLAIM_REBOOT (the bug, modified by
>      commit b42353ff8d346 ("NFSv4.1: Clean up nfs4_reclaim_lease")).
>   5. Server never receives RECLAIM_COMPLETE, so cl_flags lacks
>      NFSD4_CLIENT_RECLAIM_COMPLETE. When processing subsequent files,
>      server always returns nfserr_grace, causing infinite retry loop.
> 
> Fix it by setting NFS4CLNT_RECLAIM_REBOOT in nfs4_reclaim_lease if
> NFS4CLNT_SERVER_SCOPE_MISMATCH is not set, so that the client sends
> RECLAIM_COMPLETE to the server first, allowing subsequent nograce
> recovery to proceed.
> 
> Fetch a reproducer in [2].
> 
> [1] https://lore.kernel.org/linux-nfs/55da00d4-a656-4ed2-ae57-7f881297a1b2@huawei.com/
> [2] https://bugzilla.kernel.org/show_bug.cgi?id=221399
> 
> Fixes: b42353ff8d346 ("NFSv4.1: Clean up nfs4_reclaim_lease")
> Cc: stable@vger.kernel.org
> Reported-by: Li Lingfeng <lilingfeng3@huawei.com>
> Closes: https://lore.kernel.org/linux-nfs/55da00d4-a656-4ed2-ae57-7f881297a1b2@huawei.com/
> Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
> ---
>   fs/nfs/nfs4state.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index 305a772e5497..817327e73d88 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -2012,7 +2012,7 @@ static int nfs4_reclaim_lease(struct nfs_client *clp)
>   		return nfs4_handle_reclaim_lease_error(clp, status);
>   	if (test_and_clear_bit(NFS4CLNT_SERVER_SCOPE_MISMATCH, &clp->cl_state))
>   		nfs4_state_start_reclaim_nograce(clp);
> -	if (!test_bit(NFS4CLNT_RECLAIM_NOGRACE, &clp->cl_state))
> +	else
>   		set_bit(NFS4CLNT_RECLAIM_REBOOT, &clp->cl_state);
>   	clear_bit(NFS4CLNT_CHECK_LEASE, &clp->cl_state);
>   	clear_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state);
> 


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] NFSv4: Fix state recovery deadlock when server misses grace period
  2026-04-22  6:55 ` Zhihao Cheng
@ 2026-04-22 12:38   ` Trond Myklebust
  2026-04-23  9:05     ` Zhihao Cheng
  0 siblings, 1 reply; 4+ messages in thread
From: Trond Myklebust @ 2026-04-22 12:38 UTC (permalink / raw)
  To: Zhihao Cheng, anna; +Cc: linux-nfs, linux-kernel, yangerkun, Li Lingfeng

On Wed, 2026-04-22 at 14:55 +0800, Zhihao Cheng wrote:
> 在 2026/4/22 14:44, Zhihao Cheng 写道:
> Add lilingfeng3@huawei.com
> > NFS server restart causes client to enter an infinite loop during
> > state
> > recovery. The state manager gets stuck in NFS4CLNT_RECLAIM_NOGRACE
> > processing,
> > with the server repeatedly returning NFS4ERR_GRACE for each file
> > iteration.
> > This problem is reported in [1].
> > 
> > Trigger sequence:
> >   1. Client opens 2 files. After server reboot, client enters
> >      nfs4_do_reclaim(RECLAIM_REBOOT). Server misses grace period
> > and returns
> >      NFS4ERR_NO_GRACE, causing client to set
> > NFS4CLNT_RECLAIM_NOGRACE.
> >   2. Client enters nfs4_do_reclaim(RECLAIM_NOGRACE) to recover
> > first file.
> >      Server reboots again, open request returns NFS4ERR_BADSESSION,
> > client
> >      sets NFS4CLNT_SESSION_RESET.
> >   3. nfs4_reset_session calls nfs4_proc_create_session which fails
> > with
> >      ETIMEDOUT due to network¹ÊÕÏ, nfs4_handle_reclaim_lease_error
> > sets
> >      NFS4CLNT_LEASE_EXPIRED but does NOT set
> > NFS4CLNT_RECLAIM_REBOOT.
> >   4. When nfs4_reclaim_lease runs, because NFS4CLNT_RECLAIM_NOGRACE
> > is already
> >      set, it skips setting NFS4CLNT_RECLAIM_REBOOT (the bug,
> > modified by
> >      commit b42353ff8d346 ("NFSv4.1: Clean up
> > nfs4_reclaim_lease")).
> >   5. Server never receives RECLAIM_COMPLETE, so cl_flags lacks
> >      NFSD4_CLIENT_RECLAIM_COMPLETE. When processing subsequent
> > files,
> >      server always returns nfserr_grace, causing infinite retry
> > loop.
> > 
> > Fix it by setting NFS4CLNT_RECLAIM_REBOOT in nfs4_reclaim_lease if
> > NFS4CLNT_SERVER_SCOPE_MISMATCH is not set, so that the client sends
> > RECLAIM_COMPLETE to the server first, allowing subsequent nograce
> > recovery to proceed.
> > 
> > Fetch a reproducer in [2].
> > 
> > [1]
> > https://lore.kernel.org/linux-nfs/55da00d4-a656-4ed2-ae57-7f881297a1b2@huawei.com/
> > [2] https://bugzilla.kernel.org/show_bug.cgi?id=221399
> > 
> > Fixes: b42353ff8d346 ("NFSv4.1: Clean up nfs4_reclaim_lease")
> > Cc: stable@vger.kernel.org
> > Reported-by: Li Lingfeng <lilingfeng3@huawei.com>
> > Closes:
> > https://lore.kernel.org/linux-nfs/55da00d4-a656-4ed2-ae57-7f881297a1b2@huawei.com/
> > Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
> > ---
> >   fs/nfs/nfs4state.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> > index 305a772e5497..817327e73d88 100644
> > --- a/fs/nfs/nfs4state.c
> > +++ b/fs/nfs/nfs4state.c
> > @@ -2012,7 +2012,7 @@ static int nfs4_reclaim_lease(struct
> > nfs_client *clp)
> >   		return nfs4_handle_reclaim_lease_error(clp,
> > status);
> >   	if (test_and_clear_bit(NFS4CLNT_SERVER_SCOPE_MISMATCH,
> > &clp->cl_state))
> >   		nfs4_state_start_reclaim_nograce(clp);
> > -	if (!test_bit(NFS4CLNT_RECLAIM_NOGRACE, &clp->cl_state))
> > +	else
> >   		set_bit(NFS4CLNT_RECLAIM_REBOOT, &clp->cl_state);
> >   	clear_bit(NFS4CLNT_CHECK_LEASE, &clp->cl_state);
> >   	clear_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state);
> > 
> 
This will cause the client to try to do reboot recovery in a situation
where it isn't allowed to do so by the spec. We should never be setting
NFS4CLNT_RECLAIM_REBOOT if NFS4CLNT_RECLAIM_NOGRACE is already set.

One solution would be to just immediately call
nfs4_state_end_reclaim_reboot() if NFS4CLNT_RECLAIM_NOGRACE is set.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trondmy@kernel.org, trond.myklebust@hammerspace.com

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] NFSv4: Fix state recovery deadlock when server misses grace period
  2026-04-22 12:38   ` Trond Myklebust
@ 2026-04-23  9:05     ` Zhihao Cheng
  0 siblings, 0 replies; 4+ messages in thread
From: Zhihao Cheng @ 2026-04-23  9:05 UTC (permalink / raw)
  To: Trond Myklebust, anna; +Cc: linux-nfs, linux-kernel, yangerkun, Li Lingfeng

在 2026/4/22 20:38, Trond Myklebust 写道:
> On Wed, 2026-04-22 at 14:55 +0800, Zhihao Cheng wrote:
>> 在 2026/4/22 14:44, Zhihao Cheng 写道:
>> Add lilingfeng3@huawei.com
>>> NFS server restart causes client to enter an infinite loop during
>>> state
>>> recovery. The state manager gets stuck in NFS4CLNT_RECLAIM_NOGRACE
>>> processing,
>>> with the server repeatedly returning NFS4ERR_GRACE for each file
>>> iteration.
>>> This problem is reported in [1].
>>>
>>> Trigger sequence:
>>>    1. Client opens 2 files. After server reboot, client enters
>>>       nfs4_do_reclaim(RECLAIM_REBOOT). Server misses grace period
>>> and returns
>>>       NFS4ERR_NO_GRACE, causing client to set
>>> NFS4CLNT_RECLAIM_NOGRACE.
>>>    2. Client enters nfs4_do_reclaim(RECLAIM_NOGRACE) to recover
>>> first file.
>>>       Server reboots again, open request returns NFS4ERR_BADSESSION,
>>> client
>>>       sets NFS4CLNT_SESSION_RESET.
>>>    3. nfs4_reset_session calls nfs4_proc_create_session which fails
>>> with
>>>       ETIMEDOUT due to network¹ÊÕÏ, nfs4_handle_reclaim_lease_error
>>> sets
>>>       NFS4CLNT_LEASE_EXPIRED but does NOT set
>>> NFS4CLNT_RECLAIM_REBOOT.
>>>    4. When nfs4_reclaim_lease runs, because NFS4CLNT_RECLAIM_NOGRACE
>>> is already
>>>       set, it skips setting NFS4CLNT_RECLAIM_REBOOT (the bug,
>>> modified by
>>>       commit b42353ff8d346 ("NFSv4.1: Clean up
>>> nfs4_reclaim_lease")).
>>>    5. Server never receives RECLAIM_COMPLETE, so cl_flags lacks
>>>       NFSD4_CLIENT_RECLAIM_COMPLETE. When processing subsequent
>>> files,
>>>       server always returns nfserr_grace, causing infinite retry
>>> loop.
>>>
>>> Fix it by setting NFS4CLNT_RECLAIM_REBOOT in nfs4_reclaim_lease if
>>> NFS4CLNT_SERVER_SCOPE_MISMATCH is not set, so that the client sends
>>> RECLAIM_COMPLETE to the server first, allowing subsequent nograce
>>> recovery to proceed.
>>>
>>> Fetch a reproducer in [2].
>>>
>>> [1]
>>> https://lore.kernel.org/linux-nfs/55da00d4-a656-4ed2-ae57-7f881297a1b2@huawei.com/
>>> [2] https://bugzilla.kernel.org/show_bug.cgi?id=221399
>>>
>>> Fixes: b42353ff8d346 ("NFSv4.1: Clean up nfs4_reclaim_lease")
>>> Cc: stable@vger.kernel.org
>>> Reported-by: Li Lingfeng <lilingfeng3@huawei.com>
>>> Closes:
>>> https://lore.kernel.org/linux-nfs/55da00d4-a656-4ed2-ae57-7f881297a1b2@huawei.com/
>>> Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
>>> ---
>>>    fs/nfs/nfs4state.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
>>> index 305a772e5497..817327e73d88 100644
>>> --- a/fs/nfs/nfs4state.c
>>> +++ b/fs/nfs/nfs4state.c
>>> @@ -2012,7 +2012,7 @@ static int nfs4_reclaim_lease(struct
>>> nfs_client *clp)
>>>    		return nfs4_handle_reclaim_lease_error(clp,
>>> status);
>>>    	if (test_and_clear_bit(NFS4CLNT_SERVER_SCOPE_MISMATCH,
>>> &clp->cl_state))
>>>    		nfs4_state_start_reclaim_nograce(clp);
>>> -	if (!test_bit(NFS4CLNT_RECLAIM_NOGRACE, &clp->cl_state))
>>> +	else
>>>    		set_bit(NFS4CLNT_RECLAIM_REBOOT, &clp->cl_state);
>>>    	clear_bit(NFS4CLNT_CHECK_LEASE, &clp->cl_state);
>>>    	clear_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state);
>>>
>>
> This will cause the client to try to do reboot recovery in a situation
> where it isn't allowed to do so by the spec. We should never be setting
> NFS4CLNT_RECLAIM_REBOOT if NFS4CLNT_RECLAIM_NOGRACE is already set.
Hi Trond, the client can only do reclaim reboot during the grace 
period(after server reboot), according to [1], any reclaim type messages 
should be rejected by server for NOGRACE state, am I understanding right?
[1] https://datatracker.ietf.org/doc/html/rfc5661#section-8.4.2.1
> 
> One solution would be to just immediately call
> nfs4_state_end_reclaim_reboot() if NFS4CLNT_RECLAIM_NOGRACE is set.
> 
I tried with your suggestion, and the problem still happens:
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 5044bb4c870f..5b6cb3f7ff2e 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -2014,6 +2014,8 @@ static int nfs4_reclaim_lease(struct nfs_client *clp)
                 nfs4_state_start_reclaim_nograce(clp);
         if (!test_bit(NFS4CLNT_RECLAIM_NOGRACE, &clp->cl_state))
                 set_bit(NFS4CLNT_RECLAIM_REBOOT, &clp->cl_state);
+       else
+               nfs4_state_end_reclaim_reboot(clp);
         clear_bit(NFS4CLNT_CHECK_LEASE, &clp->cl_state);
         clear_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state);
         return 0;

The nfs4_state_end_reclaim_reboot sends RECLAIM_COMPLETE only if the clp 
has NFS4CLNT_RECLAIM_REBOOT flag, but it does not.

The root cause is that the client has identified an incorrect server 
status after the server's second reboot(since step 2). I think the 
client should do reclaim reboot every time the server restarts, so we 
should let client know the server reboots again, can we take following 
methods?
1. Clear NFS4CLNT_RECLAIM_NOGRACE flag if client knows that the server 
restart while doing nfs4_do_reclaim(NFS4CLNT_RECLAIM_NOGRACE). Client 
could identify the server state by NFS4ERR_BADSESSION retcode:
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 5044bb4c870f..3f73dc3919a2 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1875,6 +1875,9 @@ static int nfs4_do_reclaim(struct nfs_client *clp, 
const struct nfs4_state_recov
  						clp->cl_hostname, lost_locks);
  				set_bit(ops->owner_flag_bit, &sp->so_flags);
  				nfs4_put_state_owner(sp);
+				if (ops == clp->cl_mvops->nograce_recovery_ops &&
+				    status == -NFS4ERR_BADSESSION)
+					clear_bit(NFS4CLNT_RECLAIM_NOGRACE, &clp->cl_state);
  				status = nfs4_recovery_handle_error(clp, status);
  				nfs4_free_state_owners(&freeme);
  				return (status != 0) ? status : -EAGAIN;

2. Client knows the server restarts by error code 
NFS4ERR_STALE_CLIENTID(nfs4_proc_create_session -> 
nfs4_handle_reclaim_lease_error), but nfs4_reset_session won't retry if 
the error code is ETIMEDOUT(eg. network failure in step 3), we should 
let client retry nfs4_reset_session if network failure.
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 5044bb4c870f..62fd57e602d6 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -2459,6 +2459,8 @@ static int nfs4_reset_session(struct nfs_client *clp)
  	if (status) {
  		dprintk("%s: session reset failed with status %d for server %s!\n",
  			__func__, status, clp->cl_hostname);
+		if (status == -ETIMEDOUT)
+			goto out;
  		status = nfs4_handle_reclaim_lease_error(clp, status);
  		goto out;
  	}
@@ -2552,6 +2554,8 @@ static void nfs4_state_manager(struct nfs_client *clp)
  		if (test_and_clear_bit(NFS4CLNT_SESSION_RESET, &clp->cl_state)) {
  			section = "reset session";
  			status = nfs4_reset_session(clp);
+			if (status == -ETIMEDOUT)
+				continue;
  			if (test_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state))
  				continue;
  			if (status < 0)
3. Conditionaly set NFS4CLNT_RECLAIM_REBOOT in nfs4_reclaim_lease, if 
the client has ever been recovered failed by 
nfs4_do_reclaim(NFS4CLNT_RECLAIM_NOGRACE)->BAD_SESSION, mark clp with 
NFS4CLNT_RECLAIM_REBOOT.
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 5044bb4c870f..82eb650cc135 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1875,6 +1875,9 @@ static int nfs4_do_reclaim(struct nfs_client *clp, 
const struct nfs4_state_recov
  						clp->cl_hostname, lost_locks);
  				set_bit(ops->owner_flag_bit, &sp->so_flags);
  				nfs4_put_state_owner(sp);
+				if (ops == clp->cl_mvops->nograce_recovery_ops &&
+				    status == -NFS4ERR_BADSESSION)
+					__set_bit(NFS_CS_NOGRACE_REBOOT, &clp->cl_flags);
  				status = nfs4_recovery_handle_error(clp, status);
  				nfs4_free_state_owners(&freeme);
  				return (status != 0) ? status : -EAGAIN;
@@ -2012,7 +2015,7 @@ static int nfs4_reclaim_lease(struct nfs_client *clp)
  		return nfs4_handle_reclaim_lease_error(clp, status);
  	if (test_and_clear_bit(NFS4CLNT_SERVER_SCOPE_MISMATCH, &clp->cl_state))
  		nfs4_state_start_reclaim_nograce(clp);
-	if (!test_bit(NFS4CLNT_RECLAIM_NOGRACE, &clp->cl_state))
+	if (!test_bit(NFS4CLNT_RECLAIM_NOGRACE, &clp->cl_state) || 
test_bit(NFS_CS_NOGRACE_REBOOT, &clp->cl_flags))
  		set_bit(NFS4CLNT_RECLAIM_REBOOT, &clp->cl_state);
  	clear_bit(NFS4CLNT_CHECK_LEASE, &clp->cl_state);
  	clear_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state);
@@ -2624,6 +2627,7 @@ static void nfs4_state_manager(struct nfs_client *clp)
  				continue;
  			if (status < 0)
  				goto out_error;
+			clear_bit(NFS_CS_NOGRACE_REBOOT, &clp->cl_flags);
  			clear_bit(NFS4CLNT_RECLAIM_NOGRACE, &clp->cl_state);
  		}

diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 4daee27fa5eb..8d36a727513c 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -51,6 +51,7 @@ struct nfs_client {
  #define NFS_CS_REUSEPORT	8		/* - reuse src port on reconnect */
  #define NFS_CS_PNFS		9		/* - Server used for pnfs */
  #define NFS_CS_NETUNREACH_FATAL	10		/* - ENETUNREACH errors are fatal */
+#define NFS_CS_NOGRACE_REBOOT	11		/* - Server reboot during nograce 
recovery */
  	struct sockaddr_storage	cl_addr;	/* server identifier */
  	size_t			cl_addrlen;
  	char *			cl_hostname;	/* hostname of server */

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-04-23  9:05 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-22  6:44 [PATCH] NFSv4: Fix state recovery deadlock when server misses grace period Zhihao Cheng
2026-04-22  6:55 ` Zhihao Cheng
2026-04-22 12:38   ` Trond Myklebust
2026-04-23  9:05     ` Zhihao Cheng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox