* [PATCH 1/1] Fixing infinite state recovery loop due to failed delegation return
@ 2014-10-24 20:13 Olga Kornievskaia
0 siblings, 0 replies; 8+ messages in thread
From: Olga Kornievskaia @ 2014-10-24 20:13 UTC (permalink / raw)
To: Trond.Myklebust; +Cc: linux-nfs
If we get a bad-stateid-type of error when we send OPEN with delegate_cur
to return currently held delegation, we shouldn't be trying to reclaim locks
associated with that delegation state_id because we don't have an
open_stateid to be used for the LOCK operation. Thus, we should
return an error from the nfs4_open_delegation_recall() in that case.
Furthermore, if an error occurs the delegation code will call
nfs_abort_delegation_return() which sets again the NFS4CLNT_DELEGRETURN
flags in the state and it leads the state manager to into an infinite loop
for trying to reclaim the delegated state.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
---
fs/nfs/delegation.c | 5 +++--
fs/nfs/nfs4proc.c | 2 +-
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 5853f53..8016d89 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -394,7 +394,7 @@ static int nfs_end_delegation_return(struct inode *inode, struct nfs_delegation
err = nfs4_wait_clnt_recover(clp);
} while (err == 0);
- if (err) {
+ if (err && err != -EIO) {
nfs_abort_delegation_return(delegation, clp);
goto out;
}
@@ -458,7 +458,8 @@ restart:
iput(inode);
if (!err)
goto restart;
- set_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state);
+ if (err != -EIO)
+ set_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state);
return err;
}
}
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 5aa55c1..6871055 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1655,7 +1655,7 @@ static int nfs4_handle_delegation_recall_error(struct nfs_server *server, struct
nfs_inode_find_state_and_recover(state->inode,
stateid);
nfs4_schedule_stateid_recovery(server, state);
- return 0;
+ return -EIO;
case -NFS4ERR_DELAY:
case -NFS4ERR_GRACE:
set_bit(NFS_DELEGATED_STATE, &state->flags);
--
1.7.1
^ permalink raw reply related [flat|nested] 8+ messages in thread* [PATCH 1/1] Fixing infinite state recovery loop due to failed delegation return
@ 2014-11-03 14:36 Olga Kornievskaia
2014-11-04 23:22 ` Olga Kornievskaia
0 siblings, 1 reply; 8+ messages in thread
From: Olga Kornievskaia @ 2014-11-03 14:36 UTC (permalink / raw)
To: Trond.Myklebust; +Cc: linux-nfs
If we get a bad-stateid-type of error when we send OPEN with delegate_cur
to return currently held delegation, we shouldn't be trying to reclaim locks
associated with that delegation state_id because we don't have an
open_stateid to be used for the LOCK operation. Thus, we should
return an error from the nfs4_open_delegation_recall() in that case.
Furthermore, if an error occurs the delegation code will call
nfs_abort_delegation_return() which sets again the NFS4CLNT_DELEGRETURN
flags in the state and it leads the state manager to into an infinite loop
for trying to reclaim the delegated state.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
---
fs/nfs/delegation.c | 5 +++--
fs/nfs/nfs4proc.c | 2 +-
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 5853f53..8016d89 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -394,7 +394,7 @@ static int nfs_end_delegation_return(struct inode *inode, struct nfs_delegation
err = nfs4_wait_clnt_recover(clp);
} while (err == 0);
- if (err) {
+ if (err && err != -EIO) {
nfs_abort_delegation_return(delegation, clp);
goto out;
}
@@ -458,7 +458,8 @@ restart:
iput(inode);
if (!err)
goto restart;
- set_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state);
+ if (err != -EIO)
+ set_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state);
return err;
}
}
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 5aa55c1..6871055 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1655,7 +1655,7 @@ static int nfs4_handle_delegation_recall_error(struct nfs_server *server, struct
nfs_inode_find_state_and_recover(state->inode,
stateid);
nfs4_schedule_stateid_recovery(server, state);
- return 0;
+ return -EIO;
case -NFS4ERR_DELAY:
case -NFS4ERR_GRACE:
set_bit(NFS_DELEGATED_STATE, &state->flags);
--
1.7.1
^ permalink raw reply related [flat|nested] 8+ messages in thread* Re: [PATCH 1/1] Fixing infinite state recovery loop due to failed delegation return
2014-11-03 14:36 Olga Kornievskaia
@ 2014-11-04 23:22 ` Olga Kornievskaia
2014-11-05 0:14 ` Trond Myklebust
0 siblings, 1 reply; 8+ messages in thread
From: Olga Kornievskaia @ 2014-11-04 23:22 UTC (permalink / raw)
To: Trond Myklebust; +Cc: linux-nfs
Trond, can you please respond to the patch?
As per earlier conversation, in this solution, state recovery is
initiated which marks the locks lost.
Please either accept this patch or let me know what needs to be fixed.
Thank you.
On Mon, Nov 3, 2014 at 9:36 AM, Olga Kornievskaia <kolga@netapp.com> wrote:
> If we get a bad-stateid-type of error when we send OPEN with delegate_cur
> to return currently held delegation, we shouldn't be trying to reclaim locks
> associated with that delegation state_id because we don't have an
> open_stateid to be used for the LOCK operation. Thus, we should
> return an error from the nfs4_open_delegation_recall() in that case.
>
> Furthermore, if an error occurs the delegation code will call
> nfs_abort_delegation_return() which sets again the NFS4CLNT_DELEGRETURN
> flags in the state and it leads the state manager to into an infinite loop
> for trying to reclaim the delegated state.
>
> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> ---
> fs/nfs/delegation.c | 5 +++--
> fs/nfs/nfs4proc.c | 2 +-
> 2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
> index 5853f53..8016d89 100644
> --- a/fs/nfs/delegation.c
> +++ b/fs/nfs/delegation.c
> @@ -394,7 +394,7 @@ static int nfs_end_delegation_return(struct inode *inode, struct nfs_delegation
> err = nfs4_wait_clnt_recover(clp);
> } while (err == 0);
>
> - if (err) {
> + if (err && err != -EIO) {
> nfs_abort_delegation_return(delegation, clp);
> goto out;
> }
> @@ -458,7 +458,8 @@ restart:
> iput(inode);
> if (!err)
> goto restart;
> - set_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state);
> + if (err != -EIO)
> + set_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state);
> return err;
> }
> }
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 5aa55c1..6871055 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -1655,7 +1655,7 @@ static int nfs4_handle_delegation_recall_error(struct nfs_server *server, struct
> nfs_inode_find_state_and_recover(state->inode,
> stateid);
> nfs4_schedule_stateid_recovery(server, state);
> - return 0;
> + return -EIO;
> case -NFS4ERR_DELAY:
> case -NFS4ERR_GRACE:
> set_bit(NFS_DELEGATED_STATE, &state->flags);
> --
> 1.7.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH 1/1] Fixing infinite state recovery loop due to failed delegation return
2014-11-04 23:22 ` Olga Kornievskaia
@ 2014-11-05 0:14 ` Trond Myklebust
2014-11-05 17:53 ` Olga Kornievskaia
0 siblings, 1 reply; 8+ messages in thread
From: Trond Myklebust @ 2014-11-05 0:14 UTC (permalink / raw)
To: Olga Kornievskaia; +Cc: linux-nfs
On Tue, Nov 4, 2014 at 6:22 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> Trond, can you please respond to the patch?
>
> As per earlier conversation, in this solution, state recovery is
> initiated which marks the locks lost.
>
> Please either accept this patch or let me know what needs to be fixed.
>
Please see the 3 fixes I just sent out concerning delegation recovery
w.r.t. NFSv4+NFSv4.1. In addition, we need to handle the case you
patch attempts to address (however see the question I have below).
> Thank you.
>
> On Mon, Nov 3, 2014 at 9:36 AM, Olga Kornievskaia <kolga@netapp.com> wrote:
>> If we get a bad-stateid-type of error when we send OPEN with delegate_cur
>> to return currently held delegation, we shouldn't be trying to reclaim locks
>> associated with that delegation state_id because we don't have an
>> open_stateid to be used for the LOCK operation. Thus, we should
>> return an error from the nfs4_open_delegation_recall() in that case.
>>
>> Furthermore, if an error occurs the delegation code will call
>> nfs_abort_delegation_return() which sets again the NFS4CLNT_DELEGRETURN
>> flags in the state and it leads the state manager to into an infinite loop
>> for trying to reclaim the delegated state.
>>
>> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
>> ---
>> fs/nfs/delegation.c | 5 +++--
>> fs/nfs/nfs4proc.c | 2 +-
>> 2 files changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
>> index 5853f53..8016d89 100644
>> --- a/fs/nfs/delegation.c
>> +++ b/fs/nfs/delegation.c
>> @@ -394,7 +394,7 @@ static int nfs_end_delegation_return(struct inode *inode, struct nfs_delegation
>> err = nfs4_wait_clnt_recover(clp);
>> } while (err == 0);
>>
>> - if (err) {
>> + if (err && err != -EIO) {
>> nfs_abort_delegation_return(delegation, clp);
This exception for EIO now has me worried. If we detach the
delegation, then it looks to me as if we will never send a
FREE_STATEID, as required for the case of NFSv4.1.
>> goto out;
>> }
>> @@ -458,7 +458,8 @@ restart:
>> iput(inode);
>> if (!err)
>> goto restart;
>> - set_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state);
>> + if (err != -EIO)
>> + set_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state);
Please explain why this is needed. If we've cleared the bad
delegation, then why should we not attempt to return any others that
may be pending?
>> return err;
>> }
>> }
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index 5aa55c1..6871055 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -1655,7 +1655,7 @@ static int nfs4_handle_delegation_recall_error(struct nfs_server *server, struct
>> nfs_inode_find_state_and_recover(state->inode,
>> stateid);
>> nfs4_schedule_stateid_recovery(server, state);
>> - return 0;
>> + return -EIO;
>> case -NFS4ERR_DELAY:
>> case -NFS4ERR_GRACE:
>> set_bit(NFS_DELEGATED_STATE, &state->flags);
>> --
>> 1.7.1
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH 1/1] Fixing infinite state recovery loop due to failed delegation return
2014-11-05 0:14 ` Trond Myklebust
@ 2014-11-05 17:53 ` Olga Kornievskaia
2014-11-05 18:38 ` Trond Myklebust
0 siblings, 1 reply; 8+ messages in thread
From: Olga Kornievskaia @ 2014-11-05 17:53 UTC (permalink / raw)
To: Trond Myklebust; +Cc: linux-nfs
Thank you for looking into this!
On Tue, Nov 4, 2014 at 7:14 PM, Trond Myklebust
<trond.myklebust@primarydata.com> wrote:
> On Tue, Nov 4, 2014 at 6:22 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>> Trond, can you please respond to the patch?
>>
>> As per earlier conversation, in this solution, state recovery is
>> initiated which marks the locks lost.
>>
>> Please either accept this patch or let me know what needs to be fixed.
>>
>
> Please see the 3 fixes I just sent out concerning delegation recovery
> w.r.t. NFSv4+NFSv4.1. In addition, we need to handle the case you
> patch attempts to address (however see the question I have below).
>
>> Thank you.
>>
>> On Mon, Nov 3, 2014 at 9:36 AM, Olga Kornievskaia <kolga@netapp.com> wrote:
>>> If we get a bad-stateid-type of error when we send OPEN with delegate_cur
>>> to return currently held delegation, we shouldn't be trying to reclaim locks
>>> associated with that delegation state_id because we don't have an
>>> open_stateid to be used for the LOCK operation. Thus, we should
>>> return an error from the nfs4_open_delegation_recall() in that case.
>>>
>>> Furthermore, if an error occurs the delegation code will call
>>> nfs_abort_delegation_return() which sets again the NFS4CLNT_DELEGRETURN
>>> flags in the state and it leads the state manager to into an infinite loop
>>> for trying to reclaim the delegated state.
>>>
>>> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
>>> ---
>>> fs/nfs/delegation.c | 5 +++--
>>> fs/nfs/nfs4proc.c | 2 +-
>>> 2 files changed, 4 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
>>> index 5853f53..8016d89 100644
>>> --- a/fs/nfs/delegation.c
>>> +++ b/fs/nfs/delegation.c
>>> @@ -394,7 +394,7 @@ static int nfs_end_delegation_return(struct inode *inode, struct nfs_delegation
>>> err = nfs4_wait_clnt_recover(clp);
>>> } while (err == 0);
>>>
>>> - if (err) {
>>> + if (err && err != -EIO) {
>>> nfs_abort_delegation_return(delegation, clp);
>
> This exception for EIO now has me worried. If we detach the
> delegation, then it looks to me as if we will never send a
> FREE_STATEID, as required for the case of NFSv4.1.
I'd need to brush up on the FREE_STATEID op to confidently answer
that. But can't this be taken care of in state_recovery when locks are
marked lost?
>
>>> goto out;
>>> }
>>> @@ -458,7 +458,8 @@ restart:
>>> iput(inode);
>>> if (!err)
>>> goto restart;
>>> - set_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state);
>>> + if (err != -EIO)
>>> + set_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state);
>
> Please explain why this is needed. If we've cleared the bad
> delegation, then why should we not attempt to return any others that
> may be pending?
You are right, I think this is unnecessary because after we return the
delegation the bit can be set again for any other leftover
delegations. I have tested my patch without that chuck and it still
fixes the issues.
>
>>> return err;
>>> }
>>> }
>>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>>> index 5aa55c1..6871055 100644
>>> --- a/fs/nfs/nfs4proc.c
>>> +++ b/fs/nfs/nfs4proc.c
>>> @@ -1655,7 +1655,7 @@ static int nfs4_handle_delegation_recall_error(struct nfs_server *server, struct
>>> nfs_inode_find_state_and_recover(state->inode,
>>> stateid);
>>> nfs4_schedule_stateid_recovery(server, state);
>>> - return 0;
>>> + return -EIO;
>>> case -NFS4ERR_DELAY:
>>> case -NFS4ERR_GRACE:
>>> set_bit(NFS_DELEGATED_STATE, &state->flags);
>>> --
>>> 1.7.1
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
>
> --
> Trond Myklebust
>
> Linux NFS client maintainer, PrimaryData
>
> trond.myklebust@primarydata.com
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH 1/1] Fixing infinite state recovery loop due to failed delegation return
2014-11-05 17:53 ` Olga Kornievskaia
@ 2014-11-05 18:38 ` Trond Myklebust
2014-11-06 18:54 ` Olga Kornievskaia
0 siblings, 1 reply; 8+ messages in thread
From: Trond Myklebust @ 2014-11-05 18:38 UTC (permalink / raw)
To: Olga Kornievskaia; +Cc: linux-nfs
On Wed, Nov 5, 2014 at 12:53 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> Thank you for looking into this!
>
> On Tue, Nov 4, 2014 at 7:14 PM, Trond Myklebust
> <trond.myklebust@primarydata.com> wrote:
>> On Tue, Nov 4, 2014 at 6:22 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>> Trond, can you please respond to the patch?
>>>
>>> As per earlier conversation, in this solution, state recovery is
>>> initiated which marks the locks lost.
>>>
>>> Please either accept this patch or let me know what needs to be fixed.
>>>
>>
>> Please see the 3 fixes I just sent out concerning delegation recovery
>> w.r.t. NFSv4+NFSv4.1. In addition, we need to handle the case you
>> patch attempts to address (however see the question I have below).
>>
>>> Thank you.
>>>
>>> On Mon, Nov 3, 2014 at 9:36 AM, Olga Kornievskaia <kolga@netapp.com> wrote:
>>>> If we get a bad-stateid-type of error when we send OPEN with delegate_cur
>>>> to return currently held delegation, we shouldn't be trying to reclaim locks
>>>> associated with that delegation state_id because we don't have an
>>>> open_stateid to be used for the LOCK operation. Thus, we should
>>>> return an error from the nfs4_open_delegation_recall() in that case.
>>>>
>>>> Furthermore, if an error occurs the delegation code will call
>>>> nfs_abort_delegation_return() which sets again the NFS4CLNT_DELEGRETURN
>>>> flags in the state and it leads the state manager to into an infinite loop
>>>> for trying to reclaim the delegated state.
>>>>
>>>> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
>>>> ---
>>>> fs/nfs/delegation.c | 5 +++--
>>>> fs/nfs/nfs4proc.c | 2 +-
>>>> 2 files changed, 4 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
>>>> index 5853f53..8016d89 100644
>>>> --- a/fs/nfs/delegation.c
>>>> +++ b/fs/nfs/delegation.c
>>>> @@ -394,7 +394,7 @@ static int nfs_end_delegation_return(struct inode *inode, struct nfs_delegation
>>>> err = nfs4_wait_clnt_recover(clp);
>>>> } while (err == 0);
>>>>
>>>> - if (err) {
>>>> + if (err && err != -EIO) {
>>>> nfs_abort_delegation_return(delegation, clp);
>>
>> This exception for EIO now has me worried. If we detach the
>> delegation, then it looks to me as if we will never send a
>> FREE_STATEID, as required for the case of NFSv4.1.
>
> I'd need to brush up on the FREE_STATEID op to confidently answer
> that. But can't this be taken care of in state_recovery when locks are
> marked lost?
No, because in the case of EIO we now fall through to
nfs_detach_delegation() and nfs_do_return_delegation(). What we really
want to do is call TEST_STATEID + FREE_STATEID (see RFC5661 sections
8.2.4, 8.5 and 18.38.3), which will now be done as part of state
recovery.
--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH 1/1] Fixing infinite state recovery loop due to failed delegation return
2014-11-05 18:38 ` Trond Myklebust
@ 2014-11-06 18:54 ` Olga Kornievskaia
2014-11-06 19:15 ` Trond Myklebust
0 siblings, 1 reply; 8+ messages in thread
From: Olga Kornievskaia @ 2014-11-06 18:54 UTC (permalink / raw)
To: Trond Myklebust; +Cc: linux-nfs
On Wed, Nov 5, 2014 at 1:38 PM, Trond Myklebust
<trond.myklebust@primarydata.com> wrote:
> On Wed, Nov 5, 2014 at 12:53 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>> Thank you for looking into this!
>>
>> On Tue, Nov 4, 2014 at 7:14 PM, Trond Myklebust
>> <trond.myklebust@primarydata.com> wrote:
>>> On Tue, Nov 4, 2014 at 6:22 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>> Trond, can you please respond to the patch?
>>>>
>>>> As per earlier conversation, in this solution, state recovery is
>>>> initiated which marks the locks lost.
>>>>
>>>> Please either accept this patch or let me know what needs to be fixed.
>>>>
>>>
>>> Please see the 3 fixes I just sent out concerning delegation recovery
>>> w.r.t. NFSv4+NFSv4.1. In addition, we need to handle the case you
>>> patch attempts to address (however see the question I have below).
>>>
>>>> Thank you.
>>>>
>>>> On Mon, Nov 3, 2014 at 9:36 AM, Olga Kornievskaia <kolga@netapp.com> wrote:
>>>>> If we get a bad-stateid-type of error when we send OPEN with delegate_cur
>>>>> to return currently held delegation, we shouldn't be trying to reclaim locks
>>>>> associated with that delegation state_id because we don't have an
>>>>> open_stateid to be used for the LOCK operation. Thus, we should
>>>>> return an error from the nfs4_open_delegation_recall() in that case.
>>>>>
>>>>> Furthermore, if an error occurs the delegation code will call
>>>>> nfs_abort_delegation_return() which sets again the NFS4CLNT_DELEGRETURN
>>>>> flags in the state and it leads the state manager to into an infinite loop
>>>>> for trying to reclaim the delegated state.
>>>>>
>>>>> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
>>>>> ---
>>>>> fs/nfs/delegation.c | 5 +++--
>>>>> fs/nfs/nfs4proc.c | 2 +-
>>>>> 2 files changed, 4 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
>>>>> index 5853f53..8016d89 100644
>>>>> --- a/fs/nfs/delegation.c
>>>>> +++ b/fs/nfs/delegation.c
>>>>> @@ -394,7 +394,7 @@ static int nfs_end_delegation_return(struct inode *inode, struct nfs_delegation
>>>>> err = nfs4_wait_clnt_recover(clp);
>>>>> } while (err == 0);
>>>>>
>>>>> - if (err) {
>>>>> + if (err && err != -EIO) {
>>>>> nfs_abort_delegation_return(delegation, clp);
>>>
>>> This exception for EIO now has me worried. If we detach the
>>> delegation, then it looks to me as if we will never send a
>>> FREE_STATEID, as required for the case of NFSv4.1.
>>
>> I'd need to brush up on the FREE_STATEID op to confidently answer
>> that. But can't this be taken care of in state_recovery when locks are
>> marked lost?
>
> No, because in the case of EIO we now fall through to
> nfs_detach_delegation() and nfs_do_return_delegation(). What we really
> want to do is call TEST_STATEID + FREE_STATEID (see RFC5661 sections
> 8.2.4, 8.5 and 18.38.3), which will now be done as part of state
> recovery.
I don't see it. :-/ Is it because there is a race between returning a
delegation and state recovery that was initiated? Because as far as I
can see, the stateid recovery initiated by the
handle_delegation_recall_error() will call nfs4_test_stateid() via
nfs41_open_expire().
>
> --
> Trond Myklebust
>
> Linux NFS client maintainer, PrimaryData
>
> trond.myklebust@primarydata.com
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH 1/1] Fixing infinite state recovery loop due to failed delegation return
2014-11-06 18:54 ` Olga Kornievskaia
@ 2014-11-06 19:15 ` Trond Myklebust
0 siblings, 0 replies; 8+ messages in thread
From: Trond Myklebust @ 2014-11-06 19:15 UTC (permalink / raw)
To: Olga Kornievskaia; +Cc: linux-nfs
On Thu, Nov 6, 2014 at 1:54 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> On Wed, Nov 5, 2014 at 1:38 PM, Trond Myklebust
> <trond.myklebust@primarydata.com> wrote:
>> On Wed, Nov 5, 2014 at 12:53 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>> Thank you for looking into this!
>>>
>>> On Tue, Nov 4, 2014 at 7:14 PM, Trond Myklebust
>>> <trond.myklebust@primarydata.com> wrote:
>>>> On Tue, Nov 4, 2014 at 6:22 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>> Trond, can you please respond to the patch?
>>>>>
>>>>> As per earlier conversation, in this solution, state recovery is
>>>>> initiated which marks the locks lost.
>>>>>
>>>>> Please either accept this patch or let me know what needs to be fixed.
>>>>>
>>>>
>>>> Please see the 3 fixes I just sent out concerning delegation recovery
>>>> w.r.t. NFSv4+NFSv4.1. In addition, we need to handle the case you
>>>> patch attempts to address (however see the question I have below).
>>>>
>>>>> Thank you.
>>>>>
>>>>> On Mon, Nov 3, 2014 at 9:36 AM, Olga Kornievskaia <kolga@netapp.com> wrote:
>>>>>> If we get a bad-stateid-type of error when we send OPEN with delegate_cur
>>>>>> to return currently held delegation, we shouldn't be trying to reclaim locks
>>>>>> associated with that delegation state_id because we don't have an
>>>>>> open_stateid to be used for the LOCK operation. Thus, we should
>>>>>> return an error from the nfs4_open_delegation_recall() in that case.
>>>>>>
>>>>>> Furthermore, if an error occurs the delegation code will call
>>>>>> nfs_abort_delegation_return() which sets again the NFS4CLNT_DELEGRETURN
>>>>>> flags in the state and it leads the state manager to into an infinite loop
>>>>>> for trying to reclaim the delegated state.
>>>>>>
>>>>>> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
>>>>>> ---
>>>>>> fs/nfs/delegation.c | 5 +++--
>>>>>> fs/nfs/nfs4proc.c | 2 +-
>>>>>> 2 files changed, 4 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
>>>>>> index 5853f53..8016d89 100644
>>>>>> --- a/fs/nfs/delegation.c
>>>>>> +++ b/fs/nfs/delegation.c
>>>>>> @@ -394,7 +394,7 @@ static int nfs_end_delegation_return(struct inode *inode, struct nfs_delegation
>>>>>> err = nfs4_wait_clnt_recover(clp);
>>>>>> } while (err == 0);
>>>>>>
>>>>>> - if (err) {
>>>>>> + if (err && err != -EIO) {
>>>>>> nfs_abort_delegation_return(delegation, clp);
>>>>
>>>> This exception for EIO now has me worried. If we detach the
>>>> delegation, then it looks to me as if we will never send a
>>>> FREE_STATEID, as required for the case of NFSv4.1.
>>>
>>> I'd need to brush up on the FREE_STATEID op to confidently answer
>>> that. But can't this be taken care of in state_recovery when locks are
>>> marked lost?
>>
>> No, because in the case of EIO we now fall through to
>> nfs_detach_delegation() and nfs_do_return_delegation(). What we really
>> want to do is call TEST_STATEID + FREE_STATEID (see RFC5661 sections
>> 8.2.4, 8.5 and 18.38.3), which will now be done as part of state
>> recovery.
>
> I don't see it. :-/ Is it because there is a race between returning a
> delegation and state recovery that was initiated? Because as far as I
> can see, the stateid recovery initiated by the
> handle_delegation_recall_error() will call nfs4_test_stateid() via
> nfs41_open_expire().
>
No. My point is that once we call nfs_detach_delegation(), then nobody
will be able to find the delegation from either the struct nfs_inode
or from the server->delegations list.
IOW: the question is how would the state recovery thread find it in
the first place?
--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2014-11-06 19:16 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-10-24 20:13 [PATCH 1/1] Fixing infinite state recovery loop due to failed delegation return Olga Kornievskaia
-- strict thread matches above, loose matches on Subject: below --
2014-11-03 14:36 Olga Kornievskaia
2014-11-04 23:22 ` Olga Kornievskaia
2014-11-05 0:14 ` Trond Myklebust
2014-11-05 17:53 ` Olga Kornievskaia
2014-11-05 18:38 ` Trond Myklebust
2014-11-06 18:54 ` Olga Kornievskaia
2014-11-06 19:15 ` Trond Myklebust
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox