* Re: NFS4 clients cannot reclaim locks
From: Sachin Prabhu @ 2010-10-06 16:01 UTC
To: Trond Myklebust; +Cc: linux-nfs

----- "Trond Myklebust" <Trond.Myklebust@netapp.com> wrote:
> ...Here is the second patch.
>
> Cheers
>   Trond
> ------------------------------------------------------------------------
> NFSv4: Don't call nfs4_state_mark_reclaim_reboot() from error handlers
>
> From: Trond Myklebust <Trond.Myklebust@netapp.com>
>
> In the case of a server reboot, the state recovery thread starts by
> calling nfs4_state_end_reclaim_reboot() in order to avoid edge
> conditions when the server reboots while the client is in the middle
> of recovery.
>
> However, if the client has already marked the nfs4_state as requiring
> reboot recovery, then the above behaviour will cause the recovery
> thread to treat the open as if it was part of such an edge condition:
> the open will be recovered as if it was part of a lease expiration
> (and all the locks will be lost).
> Fix is to remove the call to nfs4_state_mark_reclaim_reboot from
> nfs4_async_handle_error(), and nfs4_handle_exception(). Instead we
> leave it to the recovery thread to do this for us.
>
> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> ---
>
>  fs/nfs/nfs4proc.c |    6 ------
>  1 files changed, 0 insertions(+), 6 deletions(-)
>
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 01b4817..74aa54e 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -255,9 +255,6 @@ static int nfs4_handle_exception(const struct nfs_server *server, int errorcode,
>  			nfs4_state_mark_reclaim_nograce(clp, state);
>  			goto do_state_recovery;
>  		case -NFS4ERR_STALE_STATEID:
> -			if (state == NULL)
> -				break;
> -			nfs4_state_mark_reclaim_reboot(clp, state);
>  		case -NFS4ERR_STALE_CLIENTID:
>  		case -NFS4ERR_EXPIRED:
>  			goto do_state_recovery;
> @@ -3493,9 +3490,6 @@ nfs4_async_handle_error(struct rpc_task *task, const struct nfs_server *server,
>  			nfs4_state_mark_reclaim_nograce(clp, state);
>  			goto do_state_recovery;
>  		case -NFS4ERR_STALE_STATEID:
> -			if (state == NULL)
> -				break;
> -			nfs4_state_mark_reclaim_reboot(clp, state);
>  		case -NFS4ERR_STALE_CLIENTID:
>  		case -NFS4ERR_EXPIRED:
>  			goto do_state_recovery;

Yes. The patch works for me. An open call is made to the server with
claim-type set to claim previous. This resets the stateid and the write
operation can continue.

Thank You
Sachin Prabhu

* Re: NFS4 clients cannot reclaim locks
From: Sachin Prabhu @ 2010-10-04 10:03 UTC
To: Trond Myklebust; +Cc: linux-nfs

----- "Trond Myklebust" <Trond.Myklebust@netapp.com> wrote:
> On Fri, 2010-10-01 at 07:30 -0400, Sachin Prabhu wrote:
> > NFS4 clients appear to have problems reclaiming locks after a server
> > reboot. I can recreate the issue on 2.6.34.7-56.fc13.x86_64 on a
> > Fedora system.
> >
> > The problem appears to happen in cases where after a reboot, a WRITE
> > call is made just before the RENEW call. In that case, the
> > NFS4ERR_STALE_STATEID is returned for the WRITE call which results in
> > NFS_STATE_RECLAIM_REBOOT being set in the state flags. However the
> > NFS4ERR_STALE_CLIENTID returned for the subsequent RENEW call is
> > handled by
> > nfs4_recovery_handle_error() -> nfs4_state_end_reclaim_reboot(clp);
> > which ends up setting the state flag to NFS_STATE_RECLAIM_NOGRACE
> > and clearing the NFS_STATE_RECLAIM_REBOOT in
> > nfs4_state_mark_reclaim_nograce().
>
> Yup. I don't think we should call nfs4_state_mark_reclaim_reboot()
> here.
>
> > The process of reclaiming the locks then seems to hit another
> > roadblock in nfs4_open_expired() where it fails to open the file and
> > reset the state. It ends up calling nfs4_reclaim_locks() in a loop
> > with the old stateid in nfs4_reclaim_open_state().
>
> Any idea how nfs4_open_expired() is failing? It seems that if it does,
> we should see an error, which would cause the lock reclaim to fail.
>
> Also, why is the call to nfs4_reclaim_locks() looping? That too should
> exit in case of an error.

From instrumentation, the problem appears to happen at nfs4_open_prepare

static void nfs4_open_prepare(struct rpc_task *task, void *calldata)
{
..
	/*
	 * Check if we still need to send an OPEN call, or if we can use
	 * a delegation instead.
	 */
	if (data->state != NULL) {
		struct nfs_delegation *delegation;

		if (can_open_cached(data->state, data->o_arg.fmode, data->o_arg.open_flags))
			goto out_no_action;
..
out_no_action:
	task->tk_action = NULL;

}

Here, can_open_cached returns true. The open call is never made and the
old state is used.

static int nfs4_reclaim_open_state(struct nfs4_state_owner *sp, const struct nfs4_state_recovery_ops *ops)
{
..
restart:
..
		status = ops->recover_open(sp, state); <-- This call attempts to use cached state and status is set to 0
		if (status >= 0) {
			status = nfs4_reclaim_locks(state, ops); <-- Attempts to reclaim locks using old stateid
			-- Here status is set to -NFS4ERR_BAD_STATEID --
..
		}
		switch (status) {
..
			case -NFS4ERR_BAD_STATEID:
			case -NFS4ERR_RECLAIM_BAD:
			case -NFS4ERR_RECLAIM_CONFLICT:
				nfs4_state_mark_reclaim_nograce(sp->so_client, state);
				break;
..
		}
		nfs4_put_open_state(state);
		goto restart;
..
}

The call to ops->recover_open() calls nfs4_open_expired(). While
preparing the RPC call to OPEN, in nfs4_open_prepare(), it decides that
the cached copy is valid and attempts to use it. So nfs4_open_expired()
returns 0. The subsequent call to reclaim locks using
nfs4_reclaim_locks() fails with -NFS4ERR_BAD_STATEID. A goto statement
in nfs4_reclaim_open_state() results in it looping with the same results
as before.

Sachin Prabhu

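The loop described in the message above can be modelled in a few lines of standalone C. This is only a toy sketch, not kernel code: every `toy_*` name, the `struct toy_state` fields and the `max_passes` bound are assumptions made up for illustration (the loop as described has no such bound). It shows why the sequence never makes progress: the "open" step reports success without sending anything, so the lock reclaim keeps seeing the stale stateid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; none of these are the kernel's types or values. */
enum toy_status { TOY_OK = 0, TOY_BAD_STATEID = -1 };

struct toy_state {
	bool open_flag_cached;  /* models an open-mode bit still set in state->flags */
	bool stateid_fresh;     /* true once an OPEN has really reached the server */
	int  opens_on_wire;     /* number of OPEN calls actually sent */
};

/* Models recover_open(): skip the OPEN when the cached flag says it is unnecessary. */
static int toy_recover_open(struct toy_state *st)
{
	assert(st != NULL);
	if (st->open_flag_cached)
		return TOY_OK;  /* the "out_no_action" path: nothing sent, stateid unchanged */
	st->opens_on_wire++;
	st->stateid_fresh = true;
	st->open_flag_cached = true;
	return TOY_OK;
}

/* Models nfs4_reclaim_locks(): a stale stateid is rejected by the server. */
static int toy_reclaim_locks(const struct toy_state *st)
{
	return st->stateid_fresh ? TOY_OK : TOY_BAD_STATEID;
}

/*
 * Models the restart loop. Returns the pass on which recovery succeeded,
 * or -1 if max_passes (an artificial bound for this sketch) is exhausted.
 */
static int toy_reclaim_open_state(struct toy_state *st, int max_passes)
{
	for (int pass = 1; pass <= max_passes; pass++) {
		int status = toy_recover_open(st);
		if (status >= 0)
			status = toy_reclaim_locks(st);
		if (status == TOY_OK)
			return pass;
		/* BAD_STATEID: the state is marked again and the loop restarts */
	}
	return -1;
}
```

With `open_flag_cached` set and `stateid_fresh` clear, `toy_reclaim_open_state(&st, 5)` gives up after five identical passes with `opens_on_wire` still at zero; with `open_flag_cached` clear it succeeds on the first pass.
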
* Re: NFS4 clients cannot reclaim locks
From: Trond Myklebust @ 2010-10-05 13:37 UTC
To: Sachin Prabhu; +Cc: linux-nfs

On Mon, 2010-10-04 at 06:03 -0400, Sachin Prabhu wrote:
> From instrumentation, the problem appears to happen at nfs4_open_prepare
>
> static void nfs4_open_prepare(struct rpc_task *task, void *calldata)
> {
> ..
> 	/*
> 	 * Check if we still need to send an OPEN call, or if we can use
> 	 * a delegation instead.
> 	 */
>
> 	if (data->state != NULL) {
> 		struct nfs_delegation *delegation;
>
> 		if (can_open_cached(data->state, data->o_arg.fmode, data->o_arg.open_flags))
> 			goto out_no_action;
> ..
> out_no_action:
> 	task->tk_action = NULL;
>
> }
>
> Here, can_open_cached returns true. The open call is never made and the
> old state is used.
>
> static int nfs4_reclaim_open_state(struct nfs4_state_owner *sp, const struct nfs4_state_recovery_ops *ops)
> {
> ..
> restart:
> ..
> 		status = ops->recover_open(sp, state); <-- This call attempts to use cached state and status is set to 0
> 		if (status >= 0) {
> 			status = nfs4_reclaim_locks(state, ops); <-- Attempts to reclaim locks using old stateid
> 			-- Here status is set to -NFS4ERR_BAD_STATEID --
> ..
> 		}
> 		switch (status) {
> ..
> 			case -NFS4ERR_BAD_STATEID:
> 			case -NFS4ERR_RECLAIM_BAD:
> 			case -NFS4ERR_RECLAIM_CONFLICT:
> 				nfs4_state_mark_reclaim_nograce(sp->so_client, state);
> 				break;
> ..
> 		}
> 		nfs4_put_open_state(state);
> 		goto restart;
> ..
> }
>
> The call to ops->recover_open() calls nfs4_open_expired(). While
> preparing the RPC call to OPEN, in nfs4_open_prepare(), it decides that
> the cached copy is valid and attempts to use it. So nfs4_open_expired()
> returns 0. The subsequent call to reclaim locks using
> nfs4_reclaim_locks() fails with -NFS4ERR_BAD_STATEID. A goto statement
> in nfs4_reclaim_open_state() results in it looping with the same
> results as before.

Yup. That makes sense. Does the following patch help?

Cheers
  Trond
------------------------------------------------------------------------
NFSv4: Fix open recovery

From: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4 open recovery is currently broken: since we do not clear the
state->flags states before attempting recovery, we end up with the
'can_open_cached()' function triggering. This again leads to no OPEN call
being put on the wire.

Reported-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 fs/nfs/nfs4proc.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)


diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 089da5b..01b4817 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1120,6 +1120,7 @@ static int nfs4_open_recover(struct nfs4_opendata *opendata, struct nfs4_state *
 	clear_bit(NFS_DELEGATED_STATE, &state->flags);
 	smp_rmb();
 	if (state->n_rdwr != 0) {
+		clear_bit(NFS_O_RDWR_STATE, &state->flags);
 		ret = nfs4_open_recover_helper(opendata, FMODE_READ|FMODE_WRITE, &newstate);
 		if (ret != 0)
 			return ret;
@@ -1127,6 +1128,7 @@ static int nfs4_open_recover(struct nfs4_opendata *opendata, struct nfs4_state *
 			return -ESTALE;
 	}
 	if (state->n_wronly != 0) {
+		clear_bit(NFS_O_WRONLY_STATE, &state->flags);
 		ret = nfs4_open_recover_helper(opendata, FMODE_WRITE, &newstate);
 		if (ret != 0)
 			return ret;
@@ -1134,6 +1136,7 @@ static int nfs4_open_recover(struct nfs4_opendata *opendata, struct nfs4_state *
 			return -ESTALE;
 	}
 	if (state->n_rdonly != 0) {
+		clear_bit(NFS_O_RDONLY_STATE, &state->flags);
 		ret = nfs4_open_recover_helper(opendata, FMODE_READ, &newstate);
 		if (ret != 0)
 			return ret;

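The effect of the three added clear_bit() calls can be sketched outside the kernel as follows. This is an illustrative model under stated assumptions, not the actual implementation: the `TOY_*` bit values, `struct toy_state`, and the `clear_first` switch are hypothetical, and the helper is reduced to "send an OPEN unless the cache check says the mode is already held".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for this sketch; not the kernel's values. */
#define TOY_O_RDONLY_STATE (1u << 0)
#define TOY_O_WRONLY_STATE (1u << 1)
#define TOY_O_RDWR_STATE   (1u << 2)

struct toy_state {
	unsigned int flags;              /* open modes believed to be held */
	int n_rdonly, n_wronly, n_rdwr;  /* open counts per mode */
	int opens_on_wire;               /* OPEN calls actually sent */
};

/* Models can_open_cached(): true when the bit for this mode is already set. */
static bool toy_can_open_cached(const struct toy_state *st, unsigned int mode_bit)
{
	return (st->flags & mode_bit) != 0;
}

/* Models the recovery helper: send an OPEN only if the cache check fails. */
static void toy_open_recover_helper(struct toy_state *st, unsigned int mode_bit)
{
	if (toy_can_open_cached(st, mode_bit))
		return;             /* no OPEN goes on the wire */
	st->opens_on_wire++;
	st->flags |= mode_bit;  /* a successful reply re-establishes the mode */
}

/*
 * Models the per-mode recovery sequence. clear_first selects the patched
 * behaviour (clear the mode bit before recovering it). Returns the number
 * of OPEN calls sent so far.
 */
static int toy_open_recover(struct toy_state *st, bool clear_first)
{
	assert(st != NULL);
	const unsigned int bits[3] = { TOY_O_RDWR_STATE, TOY_O_WRONLY_STATE, TOY_O_RDONLY_STATE };
	const int counts[3] = { st->n_rdwr, st->n_wronly, st->n_rdonly };

	for (int i = 0; i < 3; i++) {
		if (counts[i] == 0)
			continue;               /* nothing open in this mode */
		if (clear_first)
			st->flags &= ~bits[i];  /* the step the patch adds */
		toy_open_recover_helper(st, bits[i]);
	}
	return st->opens_on_wire;
}
```

For a state holding read-only and read-write opens with both bits still set, `toy_open_recover(&st, false)` sends no OPEN at all, while `toy_open_recover(&st, true)` sends one per mode in use and leaves the same bits set afterwards.
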
* Re: NFS4 clients cannot reclaim locks
From: Sachin Prabhu @ 2010-10-06 15:59 UTC
To: Trond Myklebust; +Cc: linux-nfs

----- "Trond Myklebust" <Trond.Myklebust@netapp.com> wrote:
> Yup. That makes sense. Does the following patch help?
>
> Cheers
>   Trond
> ------------------------------------------------------------------------
> NFSv4: Fix open recovery
>
> From: Trond Myklebust <Trond.Myklebust@netapp.com>
>
> NFSv4 open recovery is currently broken: since we do not clear the
> state->flags states before attempting recovery, we end up with the
> 'can_open_cached()' function triggering. This again leads to no OPEN
> call being put on the wire.
>
> Reported-by: Sachin Prabhu <sprabhu@redhat.com>
> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> ---
>
>  fs/nfs/nfs4proc.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
>
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 089da5b..01b4817 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -1120,6 +1120,7 @@ static int nfs4_open_recover(struct nfs4_opendata *opendata, struct nfs4_state *
>  	clear_bit(NFS_DELEGATED_STATE, &state->flags);
>  	smp_rmb();
>  	if (state->n_rdwr != 0) {
> +		clear_bit(NFS_O_RDWR_STATE, &state->flags);
>  		ret = nfs4_open_recover_helper(opendata, FMODE_READ|FMODE_WRITE, &newstate);
>  		if (ret != 0)
>  			return ret;
> @@ -1127,6 +1128,7 @@ static int nfs4_open_recover(struct nfs4_opendata *opendata, struct nfs4_state *
>  			return -ESTALE;
>  	}
>  	if (state->n_wronly != 0) {
> +		clear_bit(NFS_O_WRONLY_STATE, &state->flags);
>  		ret = nfs4_open_recover_helper(opendata, FMODE_WRITE, &newstate);
>  		if (ret != 0)
>  			return ret;
> @@ -1134,6 +1136,7 @@ static int nfs4_open_recover(struct nfs4_opendata *opendata, struct nfs4_state *
>  			return -ESTALE;
>  	}
>  	if (state->n_rdonly != 0) {
> +		clear_bit(NFS_O_RDONLY_STATE, &state->flags);
>  		ret = nfs4_open_recover_helper(opendata, FMODE_READ, &newstate);
>  		if (ret != 0)
>  			return ret;

Yes. The patch works. As expected, repeated open calls are made with
claim-type set to NULL. For each of these calls, the server returns
NFS4ERR_GRACE as long as it is in its grace period. Once the grace
period has completed, the open call succeeds, a new stateid is set and
the write operation continues.

Thank You
Sachin Prabhu

* Re: NFS4 clients cannot reclaim locks
From: Trond Myklebust @ 2010-10-05 13:38 UTC
To: Sachin Prabhu; +Cc: linux-nfs

On Mon, 2010-10-04 at 06:03 -0400, Sachin Prabhu wrote:
> ----- "Trond Myklebust" <Trond.Myklebust@netapp.com> wrote:
> > On Fri, 2010-10-01 at 07:30 -0400, Sachin Prabhu wrote:
> > > NFS4 clients appear to have problems reclaiming locks after a server
> > > reboot. I can recreate the issue on 2.6.34.7-56.fc13.x86_64 on a
> > > Fedora system.
> > >
> > > The problem appears to happen in cases where after a reboot, a WRITE
> > > call is made just before the RENEW call. In that case, the
> > > NFS4ERR_STALE_STATEID is returned for the WRITE call which results in
> > > NFS_STATE_RECLAIM_REBOOT being set in the state flags. However the
> > > NFS4ERR_STALE_CLIENTID returned for the subsequent RENEW call is
> > > handled by
> > > nfs4_recovery_handle_error() -> nfs4_state_end_reclaim_reboot(clp);
> > > which ends up setting the state flag to NFS_STATE_RECLAIM_NOGRACE
> > > and clearing the NFS_STATE_RECLAIM_REBOOT in
> > > nfs4_state_mark_reclaim_nograce().
> >
> > Yup. I don't think we should call nfs4_state_mark_reclaim_reboot()
> > here.

...Here is the second patch.

Cheers
  Trond
------------------------------------------------------------------------
NFSv4: Don't call nfs4_state_mark_reclaim_reboot() from error handlers

From: Trond Myklebust <Trond.Myklebust@netapp.com>

In the case of a server reboot, the state recovery thread starts by calling
nfs4_state_end_reclaim_reboot() in order to avoid edge conditions when
the server reboots while the client is in the middle of recovery.

However, if the client has already marked the nfs4_state as requiring
reboot recovery, then the above behaviour will cause the recovery thread to
treat the open as if it was part of such an edge condition: the open will
be recovered as if it was part of a lease expiration (and all the locks
will be lost).
Fix is to remove the call to nfs4_state_mark_reclaim_reboot from
nfs4_async_handle_error(), and nfs4_handle_exception(). Instead we leave it
to the recovery thread to do this for us.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 fs/nfs/nfs4proc.c |    6 ------
 1 files changed, 0 insertions(+), 6 deletions(-)


diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 01b4817..74aa54e 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -255,9 +255,6 @@ static int nfs4_handle_exception(const struct nfs_server *server, int errorcode,
 			nfs4_state_mark_reclaim_nograce(clp, state);
 			goto do_state_recovery;
 		case -NFS4ERR_STALE_STATEID:
-			if (state == NULL)
-				break;
-			nfs4_state_mark_reclaim_reboot(clp, state);
 		case -NFS4ERR_STALE_CLIENTID:
 		case -NFS4ERR_EXPIRED:
 			goto do_state_recovery;
@@ -3493,9 +3490,6 @@ nfs4_async_handle_error(struct rpc_task *task, const struct nfs_server *server,
 			nfs4_state_mark_reclaim_nograce(clp, state);
 			goto do_state_recovery;
 		case -NFS4ERR_STALE_STATEID:
-			if (state == NULL)
-				break;
-			nfs4_state_mark_reclaim_reboot(clp, state);
 		case -NFS4ERR_STALE_CLIENTID:
 		case -NFS4ERR_EXPIRED:
 			goto do_state_recovery;

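The ordering problem this patch addresses can be illustrated with a small standalone model of the two reclaim flags. It is a sketch under explicit assumptions rather than the kernel's logic: the `TOY_*` flag values and all `toy_*` functions are hypothetical, and the rule that the recovery thread does not mark a state for reboot reclaim once it is flagged NOGRACE is an assumption of this model, not something stated in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for this sketch; not the kernel's values. */
#define TOY_RECLAIM_REBOOT  (1u << 0)
#define TOY_RECLAIM_NOGRACE (1u << 1)

struct toy_state { unsigned int flags; };

/* What the error handlers did on NFS4ERR_STALE_STATEID before the patch. */
static void toy_error_handler_mark_reboot(struct toy_state *st)
{
	assert(st != NULL);
	st->flags |= TOY_RECLAIM_REBOOT;
}

/* Models nfs4_state_end_reclaim_reboot(): a leftover REBOOT mark becomes NOGRACE. */
static void toy_end_reclaim_reboot(struct toy_state *st)
{
	if (st->flags & TOY_RECLAIM_REBOOT) {
		st->flags &= ~TOY_RECLAIM_REBOOT;
		st->flags |= TOY_RECLAIM_NOGRACE;
	}
}

/* Assumed recovery-thread step: mark for reboot reclaim unless already NOGRACE. */
static void toy_recovery_thread_mark_reboot(struct toy_state *st)
{
	if (!(st->flags & TOY_RECLAIM_NOGRACE))
		st->flags |= TOY_RECLAIM_REBOOT;
}

/*
 * Replays the reported sequence: WRITE fails with STALE_STATEID, then RENEW
 * fails with STALE_CLIENTID and the recovery thread runs. Returns the final
 * flags of the open state.
 */
static unsigned int toy_run(bool handler_marks_reboot)
{
	struct toy_state st = { 0 };

	if (handler_marks_reboot)
		toy_error_handler_mark_reboot(&st);  /* pre-patch behaviour */
	toy_end_reclaim_reboot(&st);
	toy_recovery_thread_mark_reboot(&st);
	return st.flags;
}
```

In this model `toy_run(true)` ends with only the NOGRACE flag set (the open is treated like a lease expiration), while `toy_run(false)` ends with only the REBOOT flag set, which is the case where locks can be reclaimed during the grace period.
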
* NFS4 clients cannot reclaim locks
From: Sachin Prabhu @ 2010-10-01 11:30 UTC
To: linux-nfs

NFS4 clients appear to have problems reclaiming locks after a server
reboot. I can recreate the issue on 2.6.34.7-56.fc13.x86_64 on a Fedora
system.

The problem appears to happen in cases where after a reboot, a WRITE
call is made just before the RENEW call. In that case, the
NFS4ERR_STALE_STATEID is returned for the WRITE call which results in
NFS_STATE_RECLAIM_REBOOT being set in the state flags. However the
NFS4ERR_STALE_CLIENTID returned for the subsequent RENEW call is handled
by
nfs4_recovery_handle_error() -> nfs4_state_end_reclaim_reboot(clp);
which ends up setting the state flag to NFS_STATE_RECLAIM_NOGRACE and
clearing the NFS_STATE_RECLAIM_REBOOT in
nfs4_state_mark_reclaim_nograce().

The process of reclaiming the locks then seems to hit another roadblock
in nfs4_open_expired() where it fails to open the file and reset the
state. It ends up calling nfs4_reclaim_locks() in a loop with the old
stateid in nfs4_reclaim_open_state().

By commenting out the call to nfs4_state_end_reclaim_reboot(clp) in
nfs4_recovery_handle_error(), the client was able to handle this
particular scenario properly.

Has anyone else seen this issue?

Sachin Prabhu

* Re: NFS4 clients cannot reclaim locks
From: Trond Myklebust @ 2010-10-01 20:46 UTC
To: Sachin Prabhu; +Cc: linux-nfs

On Fri, 2010-10-01 at 07:30 -0400, Sachin Prabhu wrote:
> NFS4 clients appear to have problems reclaiming locks after a server
> reboot. I can recreate the issue on 2.6.34.7-56.fc13.x86_64 on a Fedora
> system.
>
> The problem appears to happen in cases where after a reboot, a WRITE
> call is made just before the RENEW call. In that case, the
> NFS4ERR_STALE_STATEID is returned for the WRITE call which results in
> NFS_STATE_RECLAIM_REBOOT being set in the state flags. However the
> NFS4ERR_STALE_CLIENTID returned for the subsequent RENEW call is
> handled by
> nfs4_recovery_handle_error() -> nfs4_state_end_reclaim_reboot(clp);
> which ends up setting the state flag to NFS_STATE_RECLAIM_NOGRACE and
> clearing the NFS_STATE_RECLAIM_REBOOT in
> nfs4_state_mark_reclaim_nograce().

Yup. I don't think we should call nfs4_state_mark_reclaim_reboot() here.

> The process of reclaiming the locks then seems to hit another roadblock
> in nfs4_open_expired() where it fails to open the file and reset the
> state. It ends up calling nfs4_reclaim_locks() in a loop with the old
> stateid in nfs4_reclaim_open_state().

Any idea how nfs4_open_expired() is failing? It seems that if it does,
we should see an error, which would cause the lock reclaim to fail.

Also, why is the call to nfs4_reclaim_locks() looping? That too should
exit in case of an error.

> By commenting out the call to nfs4_state_end_reclaim_reboot(clp) in
> nfs4_recovery_handle_error(), the client was able to handle this
> particular scenario properly.

We do need to keep the nfs4_state_end_reclaim_reboot() there. Otherwise,
we have a problem if the server reboots again while we're in the middle
of reclaiming state.

Cheers
  Trond

* Re: NFS4 clients cannot reclaim locks
From: Timo Aaltonen @ 2010-10-05 15:03 UTC
To: Sachin Prabhu; +Cc: linux-nfs

On Fri, 1 Oct 2010, Sachin Prabhu wrote:

> NFS4 clients appear to have problems reclaiming locks after a server
> reboot. I can recreate the issue on 2.6.34.7-56.fc13.x86_64 on a Fedora
> system.
>
> The problem appears to happen in cases where after a reboot, a WRITE
> call is made just before the RENEW call. In that case, the
> NFS4ERR_STALE_STATEID is returned for the WRITE call which results in
> NFS_STATE_RECLAIM_REBOOT being set in the state flags. However the
> NFS4ERR_STALE_CLIENTID returned for the subsequent RENEW call is
> handled by
> nfs4_recovery_handle_error() -> nfs4_state_end_reclaim_reboot(clp);
> which ends up setting the state flag to NFS_STATE_RECLAIM_NOGRACE and
> clearing the NFS_STATE_RECLAIM_REBOOT in
> nfs4_state_mark_reclaim_nograce().
>
> The process of reclaiming the locks then seems to hit another roadblock
> in nfs4_open_expired() where it fails to open the file and reset the
> state. It ends up calling nfs4_reclaim_locks() in a loop with the old
> stateid in nfs4_reclaim_open_state().
>
> By commenting out the call to nfs4_state_end_reclaim_reboot(clp) in
> nfs4_recovery_handle_error(), the client was able to handle this
> particular scenario properly.
>
> Has anyone else seen this issue?

Could this be related to the bug I was seeing with nfsv4 (now using v3
with success):

https://bugzilla.kernel.org/show_bug.cgi?id=15973

though the error returned by the server there is BAD_STATEID.

--
Timo Aaltonen
Systems Specialist, Aalto IT

* Re: NFS4 clients cannot reclaim locks
From: Timo Aaltonen @ 2010-11-22 16:02 UTC
To: linux-nfs

On Tue, 5 Oct 2010, Timo Aaltonen wrote:

> On Fri, 1 Oct 2010, Sachin Prabhu wrote:
>
>> NFS4 clients appear to have problems reclaiming locks after a server
>> reboot. I can recreate the issue on 2.6.34.7-56.fc13.x86_64 on a Fedora
>> system.
>>
>> The problem appears to happen in cases where after a reboot, a WRITE call
>> is made just before the RENEW call. In that case, the NFS4ERR_STALE_STATEID
>> is returned for the WRITE call which results in NFS_STATE_RECLAIM_REBOOT
>> being set in the state flags. However the NFS4ERR_STALE_CLIENTID returned
>> for the subsequent RENEW call is handled by
>> nfs4_recovery_handle_error() -> nfs4_state_end_reclaim_reboot(clp);
>> which ends up setting the state flag to NFS_STATE_RECLAIM_NOGRACE and
>> clearing the NFS_STATE_RECLAIM_REBOOT in nfs4_state_mark_reclaim_nograce().
>>
>> The process of reclaiming the locks then seems to hit another roadblock in
>> nfs4_open_expired() where it fails to open the file and reset the state. It
>> ends up calling nfs4_reclaim_locks() in a loop with the old stateid in
>> nfs4_reclaim_open_state().
>>
>> By commenting out the call to nfs4_state_end_reclaim_reboot(clp) in
>> nfs4_recovery_handle_error(), the client was able to handle this particular
>> scenario properly.
>>
>> Has anyone else seen this issue?
>
> could this be related to the bug I was seeing with nfsv4 (now using v3 with
> success):
>
> https://bugzilla.kernel.org/show_bug.cgi?id=15973
>
> though the error returned by the server is BAD_STATEID..

At least testing .37rc2 has so far been positive, suggesting that the
bug is fixed there.

--
Timo Aaltonen
Systems Specialist, Aalto IT
