* [PATCH 0/5] Reclaim Stage Bug Fixes Version 2 @ 2009-12-07 8:21 Ricardo Labiaga 2009-12-07 8:21 ` [PATCH 1/5] nfs41: Mark stateids in need of reclaim if state manager gets stale clientid Ricardo Labiaga 0 siblings, 1 reply; 10+ messages in thread From: Ricardo Labiaga @ 2009-12-07 8:21 UTC (permalink / raw) To: trond.myklebust; +Cc: linux-nfs The following five patches fix problems during state reclaim. They incorporate Trond's comments and apply to Trond's nfs-for-next branch. Patches 2 and 5 handle session errors by scheduling the state recovery. They do not differentiate between DEAD/BADSESSION and other session errors in order to be consistent with the existing error handling. If this needs to be different, I can submit a separate patch to address all of the areas that handle this. Patch 4 is new in the series. It results from the changes requested to Patch 1. [PATCH 1/5] nfs41: Mark stateids in need of reclaim if state manager gets stale clientid [PATCH 2/5] nfs41: Handle session errors during delegation return [PATCH 3/5] nfs41: Retry delegation return if it failed with session error [PATCH 4/5] nfs41: New NFS4CLNT_RECLAIM_COMPLETE_PENDING state [PATCH 5/5] nfs41: Handle NFSv4.1 session errors in the delegation recall code - ricardo ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/5] nfs41: Mark stateids in need of reclaim if state manager gets stale clientid 2009-12-07 8:21 [PATCH 0/5] Reclaim Stage Bug Fixes Version 2 Ricardo Labiaga @ 2009-12-07 8:21 ` Ricardo Labiaga 2009-12-07 8:21 ` [PATCH 2/5] nfs41: Handle session errors during delegation return Ricardo Labiaga 0 siblings, 1 reply; 10+ messages in thread From: Ricardo Labiaga @ 2009-12-07 8:21 UTC (permalink / raw) To: trond.myklebust; +Cc: linux-nfs, Ricardo Labiaga The state manager was not marking the stateids as needing to be reclaimed after reestablishing the clientid. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> --- fs/nfs/nfs4state.c | 12 ++---------- 1 files changed, 2 insertions(+), 10 deletions(-) diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c index d3c9c9e..d741ec6 100644 --- a/fs/nfs/nfs4state.c +++ b/fs/nfs/nfs4state.c @@ -1256,14 +1256,6 @@ void nfs41_handle_sequence_flag_errors(struct nfs_client *clp, u32 flags) nfs_expire_all_delegations(clp); } -static void nfs4_session_recovery_handle_error(struct nfs_client *clp, int err) -{ - switch (err) { - case -NFS4ERR_STALE_CLIENTID: - set_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state); - } -} - static int nfs4_reset_session(struct nfs_client *clp) { struct nfs4_session *ses = clp->cl_session; @@ -1276,14 +1268,14 @@ static int nfs4_reset_session(struct nfs_client *clp) status = nfs4_proc_destroy_session(clp->cl_session); if (status && status != -NFS4ERR_BADSESSION && status != -NFS4ERR_DEADSESSION) { - nfs4_session_recovery_handle_error(clp, status); + nfs4_recovery_handle_error(clp, status); goto out; } memset(clp->cl_session->sess_id.data, 0, NFS4_MAX_SESSIONID_LEN); status = nfs4_proc_create_session(clp); if (status) - nfs4_session_recovery_handle_error(clp, status); + nfs4_recovery_handle_error(clp, status); out: /* * Let the state manager reestablish state -- 1.5.4.3 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 2/5] nfs41: Handle session errors during delegation return 2009-12-07 8:21 ` [PATCH 1/5] nfs41: Mark stateids in need of reclaim if state manager gets stale clientid Ricardo Labiaga @ 2009-12-07 8:21 ` Ricardo Labiaga 2009-12-07 8:21 ` [PATCH 3/5] nfs41: Retry delegation return if it failed with session error Ricardo Labiaga 0 siblings, 1 reply; 10+ messages in thread From: Ricardo Labiaga @ 2009-12-07 8:21 UTC (permalink / raw) To: trond.myklebust; +Cc: linux-nfs, Ricardo Labiaga Add session error handling to nfs4_open_delegation_recall() Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> --- fs/nfs/nfs4proc.c | 8 ++++++++ 1 files changed, 8 insertions(+), 0 deletions(-) diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index fbae2c9..6a8861c 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -1179,6 +1179,14 @@ int nfs4_open_delegation_recall(struct nfs_open_context *ctx, struct nfs4_state case -ENOENT: case -ESTALE: goto out; + case -NFS4ERR_BADSESSION: + case -NFS4ERR_BADSLOT: + case -NFS4ERR_BAD_HIGH_SLOT: + case -NFS4ERR_CONN_NOT_BOUND_TO_SESSION: + case -NFS4ERR_DEADSESSION: + nfs4_schedule_state_recovery( + server->nfs_client); + goto out; case -NFS4ERR_STALE_CLIENTID: case -NFS4ERR_STALE_STATEID: case -NFS4ERR_EXPIRED: -- 1.5.4.3 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 3/5] nfs41: Retry delegation return if it failed with session error 2009-12-07 8:21 ` [PATCH 2/5] nfs41: Handle session errors during delegation return Ricardo Labiaga @ 2009-12-07 8:21 ` Ricardo Labiaga 2009-12-07 8:21 ` [PATCH 4/5] nfs41: New NFS4CLNT_RECLAIM_COMPLETE_PENDING state Ricardo Labiaga 0 siblings, 1 reply; 10+ messages in thread From: Ricardo Labiaga @ 2009-12-07 8:21 UTC (permalink / raw) To: trond.myklebust; +Cc: linux-nfs, Ricardo Labiaga Update nfs4_delegreturn_done() to retry the operation after setting the NFS4CLNT_SESSION_SETUP bit to indicate the need to reset the session. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> --- fs/nfs/nfs4proc.c | 15 +++++++++++++-- 1 files changed, 13 insertions(+), 2 deletions(-) diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 6a8861c..53b0a7d 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -3490,9 +3490,20 @@ static void nfs4_delegreturn_done(struct rpc_task *task, void *calldata) nfs4_sequence_done(data->res.server, &data->res.seq_res, task->tk_status); - data->rpc_status = task->tk_status; - if (data->rpc_status == 0) + switch (task->tk_status) { + case -NFS4ERR_STALE_STATEID: + case -NFS4ERR_EXPIRED: + case 0: renew_lease(data->res.server, data->timestamp); + break; + default: + if (nfs4_async_handle_error(task, data->res.server, NULL) == + -EAGAIN) { + nfs4_restart_rpc(task, data->res.server->nfs_client); + return; + } + } + data->rpc_status = task->tk_status; } static void nfs4_delegreturn_release(void *calldata) -- 1.5.4.3 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 4/5] nfs41: New NFS4CLNT_RECLAIM_COMPLETE_PENDING state 2009-12-07 8:21 ` [PATCH 3/5] nfs41: Retry delegation return if it failed with session error Ricardo Labiaga @ 2009-12-07 8:21 ` Ricardo Labiaga 2009-12-07 8:21 ` [PATCH 5/5] nfs41: Handle NFSv4.1 session errors in the delegation recall code Ricardo Labiaga 2009-12-07 14:47 ` [PATCH 4/5] nfs41: New NFS4CLNT_RECLAIM_COMPLETE_PENDING state Trond Myklebust 0 siblings, 2 replies; 10+ messages in thread From: Ricardo Labiaga @ 2009-12-07 8:21 UTC (permalink / raw) To: trond.myklebust; +Cc: linux-nfs, Ricardo Labiaga nfs4_state_end_reclaim_reboot() can also be invoked as a result of error processing, so it is not a safe place to invoke RECLAIM_COMPLETE. Instead, create a new state flag that tracks the fact that a RECLAIM_COMPLETE needs to be issued when all state has been reclaimed, or when we're done establishing the session for the first time. If an error occurs in the main state manager loop, just clear the flag. No sense in checking if the flag is set in order to clear it. We're not going to issue the RECLAIM_COMPLETE since there's a high probability that we had some kind of communication or session problem which is s how we ended up in the error case. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> --- fs/nfs/nfs4_fs.h | 1 + fs/nfs/nfs4state.c | 12 +++++++++--- 2 files changed, 10 insertions(+), 3 deletions(-) diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h index 19f04cc..9962e28 100644 --- a/fs/nfs/nfs4_fs.h +++ b/fs/nfs/nfs4_fs.h @@ -46,6 +46,7 @@ enum nfs4_client_state { NFS4CLNT_DELEGRETURN, NFS4CLNT_SESSION_RESET, NFS4CLNT_SESSION_DRAINING, + NFS4CLNT_RECLAIM_COMPLETE_PENDING, }; /* diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c index d741ec6..3bb43df 100644 --- a/fs/nfs/nfs4state.c +++ b/fs/nfs/nfs4state.c @@ -1096,9 +1096,6 @@ static void nfs4_state_end_reclaim_reboot(struct nfs_client *clp) struct rb_node *pos; struct nfs4_state *state; - nfs4_reclaim_complete(clp, - nfs4_reboot_recovery_ops[clp->cl_minorversion]); - if (!test_and_clear_bit(NFS4CLNT_RECLAIM_REBOOT, &clp->cl_state)) return; @@ -1335,6 +1332,8 @@ static void nfs4_state_manager(struct nfs_client *clp) goto out_error; } clear_bit(NFS4CLNT_CHECK_LEASE, &clp->cl_state); + set_bit(NFS4CLNT_RECLAIM_COMPLETE_PENDING, + &clp->cl_state); } if (test_and_clear_bit(NFS4CLNT_CHECK_LEASE, &clp->cl_state)) { @@ -1386,6 +1385,12 @@ static void nfs4_state_manager(struct nfs_client *clp) continue; } + if (test_and_clear_bit(NFS4CLNT_RECLAIM_COMPLETE_PENDING, + &clp->cl_state)) { + nfs4_reclaim_complete(clp, + nfs4_reboot_recovery_ops[clp->cl_minorversion]); + } + nfs4_clear_state_manager_bit(clp); /* Did we race with an attempt to give us more work? */ if (clp->cl_state == 0) @@ -1397,6 +1402,7 @@ static void nfs4_state_manager(struct nfs_client *clp) out_error: printk(KERN_WARNING "Error: state manager failed on NFSv4 server %s" " with error %d\n", clp->cl_hostname, -status); + clear_bit(NFS4CLNT_RECLAIM_COMPLETE_PENDING, &clp->cl_state); nfs4_clear_state_manager_bit(clp); } -- 1.5.4.3 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 5/5] nfs41: Handle NFSv4.1 session errors in the delegation recall code 2009-12-07 8:21 ` [PATCH 4/5] nfs41: New NFS4CLNT_RECLAIM_COMPLETE_PENDING state Ricardo Labiaga @ 2009-12-07 8:21 ` Ricardo Labiaga 2009-12-07 14:47 ` [PATCH 4/5] nfs41: New NFS4CLNT_RECLAIM_COMPLETE_PENDING state Trond Myklebust 1 sibling, 0 replies; 10+ messages in thread From: Ricardo Labiaga @ 2009-12-07 8:21 UTC (permalink / raw) To: trond.myklebust; +Cc: linux-nfs, Ricardo Labiaga Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> --- fs/nfs/nfs4proc.c | 5 +++++ 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 53b0a7d..a286d70 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -4191,6 +4191,11 @@ int nfs4_lock_delegation_recall(struct nfs4_state *state, struct file_lock *fl) case -NFS4ERR_EXPIRED: case -NFS4ERR_STALE_CLIENTID: case -NFS4ERR_STALE_STATEID: + case -NFS4ERR_BADSESSION: + case -NFS4ERR_BADSLOT: + case -NFS4ERR_BAD_HIGH_SLOT: + case -NFS4ERR_CONN_NOT_BOUND_TO_SESSION: + case -NFS4ERR_DEADSESSION: nfs4_schedule_state_recovery(server->nfs_client); goto out; case -ERESTARTSYS: -- 1.5.4.3 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 4/5] nfs41: New NFS4CLNT_RECLAIM_COMPLETE_PENDING state 2009-12-07 8:21 ` [PATCH 4/5] nfs41: New NFS4CLNT_RECLAIM_COMPLETE_PENDING state Ricardo Labiaga 2009-12-07 8:21 ` [PATCH 5/5] nfs41: Handle NFSv4.1 session errors in the delegation recall code Ricardo Labiaga @ 2009-12-07 14:47 ` Trond Myklebust 2009-12-07 17:51 ` Labiaga, Ricardo 1 sibling, 1 reply; 10+ messages in thread From: Trond Myklebust @ 2009-12-07 14:47 UTC (permalink / raw) To: Ricardo Labiaga; +Cc: linux-nfs On Mon, 2009-12-07 at 00:21 -0800, Ricardo Labiaga wrote: > nfs4_state_end_reclaim_reboot() can also be invoked as a > result of error processing, so it is not a safe place to > invoke RECLAIM_COMPLETE. Instead, create a new state > flag that tracks the fact that a RECLAIM_COMPLETE needs > to be issued when all state has been reclaimed, or when > we're done establishing the session for the first time. > > If an error occurs in the main state manager loop, just clear the > flag. No sense in checking if the flag is set in order to clear it. > We're not going to issue the RECLAIM_COMPLETE since there's a high > probability that we had some kind of communication or session problem > which is s how we ended up in the error case. This patch looks wrong for two reasons. 1. We only want to call RECLAIM_COMPLETE if we saw a STALE_CLIENTID error prior to the last attempt to re-establish the client id. 2. We have to call it before we can start the no-grace reclaims. Looking at the code, I'm not convinced that we need a separate 'RECLAIM_COMPLETE_PENDING' state. It should be pretty much identical to the existing NFS4CLNT_RECLAIM_REBOOT state. The only difference is that in the NFSv4.1 case we want to be able to call RECLAIM_COMPLETE even in the case where we have no state to reclaim. Cheers Trond ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 4/5] nfs41: New NFS4CLNT_RECLAIM_COMPLETE_PENDING state 2009-12-07 14:47 ` [PATCH 4/5] nfs41: New NFS4CLNT_RECLAIM_COMPLETE_PENDING state Trond Myklebust @ 2009-12-07 17:51 ` Labiaga, Ricardo 2009-12-07 18:11 ` Trond Myklebust 0 siblings, 1 reply; 10+ messages in thread From: Labiaga, Ricardo @ 2009-12-07 17:51 UTC (permalink / raw) To: Trond Myklebust; +Cc: linux-nfs On 12/7/09 6:47 AM, "Trond Myklebust" <Trond.Myklebust@netapp.com> wrote: > On Mon, 2009-12-07 at 00:21 -0800, Ricardo Labiaga wrote: >> nfs4_state_end_reclaim_reboot() can also be invoked as a >> result of error processing, so it is not a safe place to >> invoke RECLAIM_COMPLETE. Instead, create a new state >> flag that tracks the fact that a RECLAIM_COMPLETE needs >> to be issued when all state has been reclaimed, or when >> we're done establishing the session for the first time. >> >> If an error occurs in the main state manager loop, just clear the >> flag. No sense in checking if the flag is set in order to clear it. >> We're not going to issue the RECLAIM_COMPLETE since there's a high >> probability that we had some kind of communication or session problem >> which is s how we ended up in the error case. > > This patch looks wrong for two reasons. > > 1. We only want to call RECLAIM_COMPLETE if we saw a STALE_CLIENTID > error prior to the last attempt to re-establish the client id. Section 18.51.3 "Whenever a client establishes a new client ID and before it does the first non-reclaim operation that obtains a lock, it MUST send a RECLAIM_COMPLETE with rca_one_fs set to FALSE, even if there are no locks to reclaim. If non-reclaim locking operations are done before the RECLAIM_COMPLETE, an NFS4ERR_GRACE error will be returned." I interpreted the spec as you did, but I was talked into interpreting the previous statement as: "The client doesn't always know if the server has rebooted, so send it a RECLAIM_COMPLETE after your initial EXCHANGE_ID/ CREATE_SESSION as well and all will be happy". Let me send an email to the NFSv4 IETF list to discuss it there. > 2. We have to call it before we can start the no-grace reclaims. My oversight. I should have placed the check for the new state under RECLAIM_REBOOT and before RECLAIM_NOGRACE - provided we determine that we need to send RECLAIM_COMPLETE even in the case where we didn't see STALE_CLIENTID > > Looking at the code, I'm not convinced that we need a separate > 'RECLAIM_COMPLETE_PENDING' state. It should be pretty much identical to > the existing NFS4CLNT_RECLAIM_REBOOT state. > The only difference is that in the NFSv4.1 case we want to be able to > call RECLAIM_COMPLETE even in the case where we have no state to > reclaim. > Yes, this would be the case if my interpretation of the spec is incorrect. - ricardo > Cheers > Trond ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 4/5] nfs41: New NFS4CLNT_RECLAIM_COMPLETE_PENDING state 2009-12-07 17:51 ` Labiaga, Ricardo @ 2009-12-07 18:11 ` Trond Myklebust 2009-12-07 21:41 ` Labiaga, Ricardo 0 siblings, 1 reply; 10+ messages in thread From: Trond Myklebust @ 2009-12-07 18:11 UTC (permalink / raw) To: Labiaga, Ricardo; +Cc: linux-nfs On Mon, 2009-12-07 at 09:51 -0800, Labiaga, Ricardo wrote: > On 12/7/09 6:47 AM, "Trond Myklebust" <Trond.Myklebust@netapp.com> wrote: > > > Looking at the code, I'm not convinced that we need a separate > > 'RECLAIM_COMPLETE_PENDING' state. It should be pretty much identical to > > the existing NFS4CLNT_RECLAIM_REBOOT state. > > The only difference is that in the NFSv4.1 case we want to be able to > > call RECLAIM_COMPLETE even in the case where we have no state to > > reclaim. > > > > Yes, this would be the case if my interpretation of the spec is incorrect. I don't see how your interpretation changes anything w.r.t the question of whether we need a new state or not. You can still set NFS4CLNT_RECLAIM_REBOOT and get it to do the right thing... Trond ^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH 4/5] nfs41: New NFS4CLNT_RECLAIM_COMPLETE_PENDING state 2009-12-07 18:11 ` Trond Myklebust @ 2009-12-07 21:41 ` Labiaga, Ricardo 0 siblings, 0 replies; 10+ messages in thread From: Labiaga, Ricardo @ 2009-12-07 21:41 UTC (permalink / raw) To: Myklebust, Trond; +Cc: linux-nfs > -----Original Message----- > From: Myklebust, Trond > Sent: Monday, December 07, 2009 10:12 AM > To: Labiaga, Ricardo > Cc: linux-nfs@vger.kernel.org > Subject: Re: [PATCH 4/5] nfs41: New NFS4CLNT_RECLAIM_COMPLETE_PENDING > state > > On Mon, 2009-12-07 at 09:51 -0800, Labiaga, Ricardo wrote: > > On 12/7/09 6:47 AM, "Trond Myklebust" <Trond.Myklebust@netapp.com> > wrote: > > > > > Looking at the code, I'm not convinced that we need a separate > > > 'RECLAIM_COMPLETE_PENDING' state. It should be pretty much identical > to > > > the existing NFS4CLNT_RECLAIM_REBOOT state. > > > The only difference is that in the NFSv4.1 case we want to be able to > > > call RECLAIM_COMPLETE even in the case where we have no state to > > > reclaim. > > > > > > > Yes, this would be the case if my interpretation of the spec is > incorrect. > > I don't see how your interpretation changes anything w.r.t the question > of whether we need a new state or not. You can still set > NFS4CLNT_RECLAIM_REBOOT and get it to do the right thing... > Agreed, I was trying to avoid calling nfs4_do_reclaim() and friends when there was no work to do. I agree the optimization does not warrant the extra state complexity since NFS4CLNT_RECLAIM_REBOOT will indeed do the job. I'll make the change after we reach agreement on the NFSv4 IETF list. - ricardo ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2009-12-07 21:40 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-12-07 8:21 [PATCH 0/5] Reclaim Stage Bug Fixes Version 2 Ricardo Labiaga 2009-12-07 8:21 ` [PATCH 1/5] nfs41: Mark stateids in need of reclaim if state manager gets stale clientid Ricardo Labiaga 2009-12-07 8:21 ` [PATCH 2/5] nfs41: Handle session errors during delegation return Ricardo Labiaga 2009-12-07 8:21 ` [PATCH 3/5] nfs41: Retry delegation return if it failed with session error Ricardo Labiaga 2009-12-07 8:21 ` [PATCH 4/5] nfs41: New NFS4CLNT_RECLAIM_COMPLETE_PENDING state Ricardo Labiaga 2009-12-07 8:21 ` [PATCH 5/5] nfs41: Handle NFSv4.1 session errors in the delegation recall code Ricardo Labiaga 2009-12-07 14:47 ` [PATCH 4/5] nfs41: New NFS4CLNT_RECLAIM_COMPLETE_PENDING state Trond Myklebust 2009-12-07 17:51 ` Labiaga, Ricardo 2009-12-07 18:11 ` Trond Myklebust 2009-12-07 21:41 ` Labiaga, Ricardo
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox