* [PATCH 0/2 v2] a stateid race and a cleanup
@ 2020-09-22 19:15 Benjamin Coddington
2020-09-22 19:15 ` [PATCH 1/2 v2] NFSv4: Fix a livelock when CLOSE pre-emptively bumps state sequence Benjamin Coddington
2020-09-22 19:15 ` [PATCH 2/2 v2] NFSv4: cleanup unused zero_stateid copy Benjamin Coddington
0 siblings, 2 replies; 8+ messages in thread
From: Benjamin Coddington @ 2020-09-22 19:15 UTC (permalink / raw)
To: trond.myklebust, anna.schumaker; +Cc: linux-nfs
Cover letter this time explaining the v2: Anna helped me find that the
first version's stable fix was wrong, but was fixed by the refactor patch
that followed.
After putting in the logic to fix the stable version, it was messy enough
that it made more sense to squash the two patches together. So, this time
the first patch does rewrite nfs_need_update_open_stateid a bit more in
order to handle both cases:
- where two OPENs race to NFS_OPEN_STATE and the second wins
- where an OPEN and CLOSE+1 race to update nfs4_state and CLOSE+1 wins
The end result is that these two patches are code-equivalent to the first
three. (It is still getting one final run through my testing, but I haven't
delayed posting for that).
Benjamin Coddington (2):
NFSv4: Fix a livelock when CLOSE pre-emptively bumps state sequence
NFSv4: cleanup unused zero_stateid copy
fs/nfs/nfs4proc.c | 27 +++++++++++++++++----------
fs/nfs/nfs4state.c | 8 ++------
2 files changed, 19 insertions(+), 16 deletions(-)
--
2.20.1
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 1/2 v2] NFSv4: Fix a livelock when CLOSE pre-emptively bumps state sequence
2020-09-22 19:15 [PATCH 0/2 v2] a stateid race and a cleanup Benjamin Coddington
@ 2020-09-22 19:15 ` Benjamin Coddington
2020-09-22 19:48 ` Trond Myklebust
2020-09-22 19:15 ` [PATCH 2/2 v2] NFSv4: cleanup unused zero_stateid copy Benjamin Coddington
1 sibling, 1 reply; 8+ messages in thread
From: Benjamin Coddington @ 2020-09-22 19:15 UTC (permalink / raw)
To: trond.myklebust, anna.schumaker; +Cc: linux-nfs
Since commit 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in
CLOSE/OPEN_DOWNGRADE") the following livelock may occur if a CLOSE races
with the update of the nfs_state:
Process 1 Process 2 Server
========= ========= ========
OPEN file
OPEN file
Reply OPEN (1)
Reply OPEN (2)
Update state (1)
CLOSE file (1)
Reply OLD_STATEID (1)
CLOSE file (2)
Reply CLOSE (-1)
Update state (2)
wait for state change
OPEN file
wake
CLOSE file
OPEN file
wake
CLOSE file
...
...
As long as the first process continues updating state, the second process
will fail to exit the loop in nfs_set_open_stateid_locked(). This livelock
has been observed in generic/168.
Fix this by detecting the case in nfs_need_update_open_stateid() and
then exit the loop if:
- the state is NFS_OPEN_STATE, and
- the stateid sequence is > 1, and
- the stateid doesn't match the current open stateid
Fixes: 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE")
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
fs/nfs/nfs4proc.c | 27 +++++++++++++++++----------
1 file changed, 17 insertions(+), 10 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 6e95c85fe395..9db29a438540 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1588,18 +1588,25 @@ static void nfs_test_and_clear_all_open_stateid(struct nfs4_state *state)
static bool nfs_need_update_open_stateid(struct nfs4_state *state,
const nfs4_stateid *stateid)
{
- if (test_bit(NFS_OPEN_STATE, &state->flags) == 0 ||
- !nfs4_stateid_match_other(stateid, &state->open_stateid)) {
- if (stateid->seqid == cpu_to_be32(1))
+ bool state_matches_other = nfs4_stateid_match_other(stateid, &state->open_stateid);
+ bool seqid_one = stateid->seqid == cpu_to_be32(1);
+
+ if (test_bit(NFS_OPEN_STATE, &state->flags)) {
+ /* The common case - we're updating to a new sequence number */
+ if (state_matches_other && nfs4_stateid_is_newer(stateid, &state->open_stateid)) {
+ nfs_state_log_out_of_order_open_stateid(state, stateid);
+ return true;
+ }
+ } else {
+ /* This is the first OPEN */
+ if (!state_matches_other && seqid_one) {
nfs_state_log_update_open_stateid(state);
- else
+ return true;
+ }
+ if (!state_matches_other && !seqid_one) {
set_bit(NFS_STATE_CHANGE_WAIT, &state->flags);
- return true;
- }
-
- if (nfs4_stateid_is_newer(stateid, &state->open_stateid)) {
- nfs_state_log_out_of_order_open_stateid(state, stateid);
- return true;
+ return true;
+ }
}
return false;
}
--
2.20.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 2/2 v2] NFSv4: cleanup unused zero_stateid copy
2020-09-22 19:15 [PATCH 0/2 v2] a stateid race and a cleanup Benjamin Coddington
2020-09-22 19:15 ` [PATCH 1/2 v2] NFSv4: Fix a livelock when CLOSE pre-emptively bumps state sequence Benjamin Coddington
@ 2020-09-22 19:15 ` Benjamin Coddington
1 sibling, 0 replies; 8+ messages in thread
From: Benjamin Coddington @ 2020-09-22 19:15 UTC (permalink / raw)
To: trond.myklebust, anna.schumaker; +Cc: linux-nfs
Since commit d9aba2b40de6 ("NFSv4: Don't use the zero stateid with
layoutget") the zero stateid will never be used.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
fs/nfs/nfs4state.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 4bf10792cb5b..06bbe19c8b2c 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1018,18 +1018,14 @@ static int nfs4_copy_lock_stateid(nfs4_stateid *dst,
bool nfs4_copy_open_stateid(nfs4_stateid *dst, struct nfs4_state *state)
{
bool ret;
- const nfs4_stateid *src;
int seq;
do {
ret = false;
- src = &zero_stateid;
seq = read_seqbegin(&state->seqlock);
- if (test_bit(NFS_OPEN_STATE, &state->flags)) {
- src = &state->open_stateid;
+ if (test_bit(NFS_OPEN_STATE, &state->flags))
ret = true;
- }
- nfs4_stateid_copy(dst, src);
+ nfs4_stateid_copy(dst, &state->open_stateid);
} while (read_seqretry(&state->seqlock, seq));
return ret;
}
--
2.20.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2 v2] NFSv4: Fix a livelock when CLOSE pre-emptively bumps state sequence
2020-09-22 19:15 ` [PATCH 1/2 v2] NFSv4: Fix a livelock when CLOSE pre-emptively bumps state sequence Benjamin Coddington
@ 2020-09-22 19:48 ` Trond Myklebust
2020-09-22 21:08 ` Benjamin Coddington
0 siblings, 1 reply; 8+ messages in thread
From: Trond Myklebust @ 2020-09-22 19:48 UTC (permalink / raw)
To: bcodding@redhat.com, anna.schumaker@netapp.com; +Cc: linux-nfs@vger.kernel.org
On Tue, 2020-09-22 at 15:15 -0400, Benjamin Coddington wrote:
> Since commit 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in
> CLOSE/OPEN_DOWNGRADE") the following livelock may occur if a CLOSE
> races
> with the update of the nfs_state:
>
> Process 1 Process 2 Server
> ========= ========= ========
> OPEN file
> OPEN file
> Reply OPEN (1)
> Reply OPEN (2)
> Update state (1)
> CLOSE file (1)
> Reply OLD_STATEID (1)
> CLOSE file (2)
> Reply CLOSE (-1)
> Update state (2)
> wait for state change
> OPEN file
> wake
> CLOSE file
> OPEN file
> wake
> CLOSE file
> ...
> ...
>
> As long as the first process continues updating state, the second
> process
> will fail to exit the loop in nfs_set_open_stateid_locked(). This
> livelock
> has been observed in generic/168.
>
> Fix this by detecting the case in nfs_need_update_open_stateid() and
> then exit the loop if:
> - the state is NFS_OPEN_STATE, and
> - the stateid sequence is > 1, and
> - the stateid doesn't match the current open stateid
>
> Fixes: 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in
> CLOSE/OPEN_DOWNGRADE")
> Cc: stable@vger.kernel.org # v5.4+
> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> ---
> fs/nfs/nfs4proc.c | 27 +++++++++++++++++----------
> 1 file changed, 17 insertions(+), 10 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 6e95c85fe395..9db29a438540 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -1588,18 +1588,25 @@ static void
> nfs_test_and_clear_all_open_stateid(struct nfs4_state *state)
> static bool nfs_need_update_open_stateid(struct nfs4_state *state,
> const nfs4_stateid *stateid)
> {
> - if (test_bit(NFS_OPEN_STATE, &state->flags) == 0 ||
> - !nfs4_stateid_match_other(stateid, &state->open_stateid)) {
> - if (stateid->seqid == cpu_to_be32(1))
> + bool state_matches_other = nfs4_stateid_match_other(stateid,
> &state->open_stateid);
> + bool seqid_one = stateid->seqid == cpu_to_be32(1);
> +
> + if (test_bit(NFS_OPEN_STATE, &state->flags)) {
> + /* The common case - we're updating to a new sequence
> number */
> + if (state_matches_other &&
> nfs4_stateid_is_newer(stateid, &state->open_stateid)) {
> + nfs_state_log_out_of_order_open_stateid(state,
> stateid);
> + return true;
> + }
> + } else {
> + /* This is the first OPEN */
> + if (!state_matches_other && seqid_one) {
Why are we looking at the value of state_matches_other here? If the
NFS_OPEN_STATE flags is not set, then the state->open_stateid is not
initialised, so how does it make sense to look at a comparison with the
incoming stateid?
> nfs_state_log_update_open_stateid(state);
> - else
> + return true;
> + }
> + if (!state_matches_other && !seqid_one) {
Ditto.
> set_bit(NFS_STATE_CHANGE_WAIT, &state->flags);
> - return true;
> - }
> -
> - if (nfs4_stateid_is_newer(stateid, &state->open_stateid)) {
> - nfs_state_log_out_of_order_open_stateid(state,
> stateid);
> - return true;
> + return true;
> + }
> }
> return false;
> }
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2 v2] NFSv4: Fix a livelock when CLOSE pre-emptively bumps state sequence
2020-09-22 19:48 ` Trond Myklebust
@ 2020-09-22 21:08 ` Benjamin Coddington
2020-09-22 21:15 ` Trond Myklebust
0 siblings, 1 reply; 8+ messages in thread
From: Benjamin Coddington @ 2020-09-22 21:08 UTC (permalink / raw)
To: Trond Myklebust; +Cc: anna.schumaker, linux-nfs
On 22 Sep 2020, at 15:48, Trond Myklebust wrote:
> On Tue, 2020-09-22 at 15:15 -0400, Benjamin Coddington wrote:
>> Since commit 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in
>> CLOSE/OPEN_DOWNGRADE") the following livelock may occur if a CLOSE
>> races
>> with the update of the nfs_state:
>>
>> Process 1 Process 2 Server
>> ========= ========= ========
>> OPEN file
>> OPEN file
>> Reply OPEN (1)
>> Reply OPEN (2)
>> Update state (1)
>> CLOSE file (1)
>> Reply OLD_STATEID (1)
>> CLOSE file (2)
>> Reply CLOSE (-1)
>> Update state (2)
>> wait for state change
>> OPEN file
>> wake
>> CLOSE file
>> OPEN file
>> wake
>> CLOSE file
>> ...
>> ...
>>
>> As long as the first process continues updating state, the second
>> process
>> will fail to exit the loop in nfs_set_open_stateid_locked(). This
>> livelock
>> has been observed in generic/168.
>>
>> Fix this by detecting the case in nfs_need_update_open_stateid() and
>> then exit the loop if:
>> - the state is NFS_OPEN_STATE, and
>> - the stateid sequence is > 1, and
>> - the stateid doesn't match the current open stateid
>>
>> Fixes: 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in
>> CLOSE/OPEN_DOWNGRADE")
>> Cc: stable@vger.kernel.org # v5.4+
>> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
>> ---
>> fs/nfs/nfs4proc.c | 27 +++++++++++++++++----------
>> 1 file changed, 17 insertions(+), 10 deletions(-)
>>
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index 6e95c85fe395..9db29a438540 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -1588,18 +1588,25 @@ static void
>> nfs_test_and_clear_all_open_stateid(struct nfs4_state *state)
>> static bool nfs_need_update_open_stateid(struct nfs4_state *state,
>> const nfs4_stateid *stateid)
>> {
>> - if (test_bit(NFS_OPEN_STATE, &state->flags) == 0 ||
>> - !nfs4_stateid_match_other(stateid, &state->open_stateid)) {
>> - if (stateid->seqid == cpu_to_be32(1))
>> + bool state_matches_other = nfs4_stateid_match_other(stateid,
>> &state->open_stateid);
>> + bool seqid_one = stateid->seqid == cpu_to_be32(1);
>> +
>> + if (test_bit(NFS_OPEN_STATE, &state->flags)) {
>> + /* The common case - we're updating to a new sequence
>> number */
>> + if (state_matches_other &&
>> nfs4_stateid_is_newer(stateid, &state->open_stateid)) {
>> + nfs_state_log_out_of_order_open_stateid(state,
>> stateid);
>> + return true;
>> + }
>> + } else {
>> + /* This is the first OPEN */
>> + if (!state_matches_other && seqid_one) {
>
> Why are we looking at the value of state_matches_other here? If the
> NFS_OPEN_STATE flags is not set, then the state->open_stateid is not
> initialised, so how does it make sense to look at a comparison with
> the
> incoming stateid?
You're right - it doesn't make sense. I failed to clean it up out of my
decision matrix. I'll send a v3 after it gets tested overnight.
Thanks for the look, and if you have another moment - what do you think
about not bumping the seqid in the CLOSE/OPEN_DOWNGRADE path on
OLD_STATEID
if the state's refcount > 1? This would optimize away a lot of recovery
for
races like this, but I wonder if it would be possible to end up with two
OPEN_DOWNGRADEs dueling that would never recover.
If you don't have the moment, no problem. Thanks again for looking at
my patch.
Ben
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2 v2] NFSv4: Fix a livelock when CLOSE pre-emptively bumps state sequence
2020-09-22 21:08 ` Benjamin Coddington
@ 2020-09-22 21:15 ` Trond Myklebust
2020-09-22 21:46 ` Trond Myklebust
0 siblings, 1 reply; 8+ messages in thread
From: Trond Myklebust @ 2020-09-22 21:15 UTC (permalink / raw)
To: bcodding@redhat.com; +Cc: linux-nfs@vger.kernel.org, anna.schumaker@netapp.com
On Tue, 2020-09-22 at 17:08 -0400, Benjamin Coddington wrote:
> On 22 Sep 2020, at 15:48, Trond Myklebust wrote:
>
> > On Tue, 2020-09-22 at 15:15 -0400, Benjamin Coddington wrote:
> > > Since commit 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in
> > > CLOSE/OPEN_DOWNGRADE") the following livelock may occur if a
> > > CLOSE
> > > races
> > > with the update of the nfs_state:
> > >
> > > Process 1 Process 2 Server
> > > ========= ========= ========
> > > OPEN file
> > > OPEN file
> > > Reply OPEN (1)
> > > Reply OPEN (2)
> > > Update state (1)
> > > CLOSE file (1)
> > > Reply OLD_STATEID (1)
> > > CLOSE file (2)
> > > Reply CLOSE (-1)
> > > Update state (2)
> > > wait for state change
> > > OPEN file
> > > wake
> > > CLOSE file
> > > OPEN file
> > > wake
> > > CLOSE file
> > > ...
> > > ...
> > >
> > > As long as the first process continues updating state, the second
> > > process
> > > will fail to exit the loop in
> > > nfs_set_open_stateid_locked(). This
> > > livelock
> > > has been observed in generic/168.
> > >
> > > Fix this by detecting the case in nfs_need_update_open_stateid()
> > > and
> > > then exit the loop if:
> > > - the state is NFS_OPEN_STATE, and
> > > - the stateid sequence is > 1, and
> > > - the stateid doesn't match the current open stateid
> > >
> > > Fixes: 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in
> > > CLOSE/OPEN_DOWNGRADE")
> > > Cc: stable@vger.kernel.org # v5.4+
> > > Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> > > ---
> > > fs/nfs/nfs4proc.c | 27 +++++++++++++++++----------
> > > 1 file changed, 17 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > > index 6e95c85fe395..9db29a438540 100644
> > > --- a/fs/nfs/nfs4proc.c
> > > +++ b/fs/nfs/nfs4proc.c
> > > @@ -1588,18 +1588,25 @@ static void
> > > nfs_test_and_clear_all_open_stateid(struct nfs4_state *state)
> > > static bool nfs_need_update_open_stateid(struct nfs4_state
> > > *state,
> > > const nfs4_stateid *stateid)
> > > {
> > > - if (test_bit(NFS_OPEN_STATE, &state->flags) == 0 ||
> > > - !nfs4_stateid_match_other(stateid, &state->open_stateid)) {
> > > - if (stateid->seqid == cpu_to_be32(1))
> > > + bool state_matches_other = nfs4_stateid_match_other(stateid,
> > > &state->open_stateid);
> > > + bool seqid_one = stateid->seqid == cpu_to_be32(1);
> > > +
> > > + if (test_bit(NFS_OPEN_STATE, &state->flags)) {
> > > + /* The common case - we're updating to a new sequence
> > > number */
> > > + if (state_matches_other &&
> > > nfs4_stateid_is_newer(stateid, &state->open_stateid)) {
> > > + nfs_state_log_out_of_order_open_stateid(state,
> > > stateid);
> > > + return true;
> > > + }
> > > + } else {
> > > + /* This is the first OPEN */
> > > + if (!state_matches_other && seqid_one) {
> >
> > Why are we looking at the value of state_matches_other here? If the
> > NFS_OPEN_STATE flags is not set, then the state->open_stateid is
> > not
> > initialised, so how does it make sense to look at a comparison
> > with
> > the
> > incoming stateid?
>
> You're right - it doesn't make sense. I failed to clean it up out of
> my
> decision matrix. I'll send a v3 after it gets tested overnight.
>
> Thanks for the look, and if you have another moment - what do you
> think
> about not bumping the seqid in the CLOSE/OPEN_DOWNGRADE path on
> OLD_STATEID
> if the state's refcount > 1? This would optimize away a lot of
> recovery
> for
> races like this, but I wonder if it would be possible to end up with
> two
> OPEN_DOWNGRADEs dueling that would never recover.
>
It would lead to a return of the infinite loop in cases where the
client misses a reply to an OPEN or CLOSE due to a soft timeout (which
is an important use case for us).
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2 v2] NFSv4: Fix a livelock when CLOSE pre-emptively bumps state sequence
2020-09-22 21:15 ` Trond Myklebust
@ 2020-09-22 21:46 ` Trond Myklebust
2020-09-22 22:38 ` Benjamin Coddington
0 siblings, 1 reply; 8+ messages in thread
From: Trond Myklebust @ 2020-09-22 21:46 UTC (permalink / raw)
To: bcodding@redhat.com; +Cc: linux-nfs@vger.kernel.org, anna.schumaker@netapp.com
On Tue, 2020-09-22 at 17:15 -0400, Trond Myklebust wrote:
> On Tue, 2020-09-22 at 17:08 -0400, Benjamin Coddington wrote:
> > On 22 Sep 2020, at 15:48, Trond Myklebust wrote:
> >
> > > On Tue, 2020-09-22 at 15:15 -0400, Benjamin Coddington wrote:
> > > > Since commit 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID
> > > > in
> > > > CLOSE/OPEN_DOWNGRADE") the following livelock may occur if a
> > > > CLOSE
> > > > races
> > > > with the update of the nfs_state:
> > > >
> > > > Process 1 Process 2 Server
> > > > ========= ========= ========
> > > > OPEN file
> > > > OPEN file
> > > > Reply OPEN (1)
> > > > Reply OPEN (2)
> > > > Update state (1)
> > > > CLOSE file (1)
> > > > Reply OLD_STATEID (1)
> > > > CLOSE file (2)
> > > > Reply CLOSE (-1)
> > > > Update state (2)
> > > > wait for state change
> > > > OPEN file
> > > > wake
> > > > CLOSE file
> > > > OPEN file
> > > > wake
> > > > CLOSE file
> > > > ...
> > > > ...
> > > >
> > > > As long as the first process continues updating state, the
> > > > second
> > > > process
> > > > will fail to exit the loop in
> > > > nfs_set_open_stateid_locked(). This
> > > > livelock
> > > > has been observed in generic/168.
> > > >
> > > > Fix this by detecting the case in
> > > > nfs_need_update_open_stateid()
> > > > and
> > > > then exit the loop if:
> > > > - the state is NFS_OPEN_STATE, and
> > > > - the stateid sequence is > 1, and
> > > > - the stateid doesn't match the current open stateid
> > > >
> > > > Fixes: 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in
> > > > CLOSE/OPEN_DOWNGRADE")
> > > > Cc: stable@vger.kernel.org # v5.4+
> > > > Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> > > > ---
> > > > fs/nfs/nfs4proc.c | 27 +++++++++++++++++----------
> > > > 1 file changed, 17 insertions(+), 10 deletions(-)
> > > >
> > > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > > > index 6e95c85fe395..9db29a438540 100644
> > > > --- a/fs/nfs/nfs4proc.c
> > > > +++ b/fs/nfs/nfs4proc.c
> > > > @@ -1588,18 +1588,25 @@ static void
> > > > nfs_test_and_clear_all_open_stateid(struct nfs4_state *state)
> > > > static bool nfs_need_update_open_stateid(struct nfs4_state
> > > > *state,
> > > > const nfs4_stateid *stateid)
> > > > {
> > > > - if (test_bit(NFS_OPEN_STATE, &state->flags) == 0 ||
> > > > - !nfs4_stateid_match_other(stateid, &state-
> > > > >open_stateid)) {
> > > > - if (stateid->seqid == cpu_to_be32(1))
> > > > + bool state_matches_other =
> > > > nfs4_stateid_match_other(stateid,
> > > > &state->open_stateid);
> > > > + bool seqid_one = stateid->seqid == cpu_to_be32(1);
> > > > +
> > > > + if (test_bit(NFS_OPEN_STATE, &state->flags)) {
> > > > + /* The common case - we're updating to a new
> > > > sequence
> > > > number */
> > > > + if (state_matches_other &&
> > > > nfs4_stateid_is_newer(stateid, &state->open_stateid)) {
> > > > + nfs_state_log_out_of_order_open_stateid
> > > > (state,
> > > > stateid);
> > > > + return true;
> > > > + }
> > > > + } else {
> > > > + /* This is the first OPEN */
> > > > + if (!state_matches_other && seqid_one) {
> > >
> > > Why are we looking at the value of state_matches_other here? If
> > > the
> > > NFS_OPEN_STATE flags is not set, then the state->open_stateid is
> > > not
> > > initialised, so how does it make sense to look at a comparison
> > > with
> > > the
> > > incoming stateid?
> >
> > You're right - it doesn't make sense. I failed to clean it up out
> > of
> > my
> > decision matrix. I'll send a v3 after it gets tested overnight.
> >
> > Thanks for the look, and if you have another moment - what do you
> > think
> > about not bumping the seqid in the CLOSE/OPEN_DOWNGRADE path on
> > OLD_STATEID
> > if the state's refcount > 1? This would optimize away a lot of
> > recovery
> > for
> > races like this, but I wonder if it would be possible to end up
> > with
> > two
> > OPEN_DOWNGRADEs dueling that would never recover.
> >
>
> It would lead to a return of the infinite loop in cases where the
> client misses a reply to an OPEN or CLOSE due to a soft timeout
> (which
> is an important use case for us).
>
In principle, RFC5661 says that the client is supposed to process all
stateid operations in the order dictated by the seqid. The one problem
with that mandate is that open-by-filename doesn't allow us to know
whether or not a seqid bump is outstanding. This is why we have the 5
second timeout wait in nfs_set_open_stateid_locked().
We could do something similar to that for CLOSE by setting the seqid to
0, and then ensuring we process the CLOSE in the order the server ends
up processing it. Unfortunately, that would require us to replay any
OPEN calls that got shut down because we used seqid 0 (it would also
not work for NFSv4.0)... So perhaps the best solution would be to
instead allow CLOSE to wait for outstanding OPENs to complete, just
like we do in nfs_set_open_stateid_locked()? Thoughts?
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2 v2] NFSv4: Fix a livelock when CLOSE pre-emptively bumps state sequence
2020-09-22 21:46 ` Trond Myklebust
@ 2020-09-22 22:38 ` Benjamin Coddington
0 siblings, 0 replies; 8+ messages in thread
From: Benjamin Coddington @ 2020-09-22 22:38 UTC (permalink / raw)
To: Trond Myklebust; +Cc: linux-nfs, anna.schumaker
On 22 Sep 2020, at 17:46, Trond Myklebust wrote:
> On Tue, 2020-09-22 at 17:15 -0400, Trond Myklebust wrote:
>> On Tue, 2020-09-22 at 17:08 -0400, Benjamin Coddington wrote:
>>> On 22 Sep 2020, at 15:48, Trond Myklebust wrote:
>>>
>>>> On Tue, 2020-09-22 at 15:15 -0400, Benjamin Coddington wrote:
>>>>> Since commit 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID
>>>>> in
>>>>> CLOSE/OPEN_DOWNGRADE") the following livelock may occur if a
>>>>> CLOSE
>>>>> races
>>>>> with the update of the nfs_state:
>>>>>
>>>>> Process 1 Process 2 Server
>>>>> ========= ========= ========
>>>>> OPEN file
>>>>> OPEN file
>>>>> Reply OPEN (1)
>>>>> Reply OPEN (2)
>>>>> Update state (1)
>>>>> CLOSE file (1)
>>>>> Reply OLD_STATEID (1)
>>>>> CLOSE file (2)
>>>>> Reply CLOSE (-1)
>>>>> Update state (2)
>>>>> wait for state change
>>>>> OPEN file
>>>>> wake
>>>>> CLOSE file
>>>>> OPEN file
>>>>> wake
>>>>> CLOSE file
>>>>> ...
>>>>> ...
>>>>>
>>>>> As long as the first process continues updating state, the
>>>>> second
>>>>> process
>>>>> will fail to exit the loop in
>>>>> nfs_set_open_stateid_locked(). This
>>>>> livelock
>>>>> has been observed in generic/168.
>>>>>
>>>>> Fix this by detecting the case in
>>>>> nfs_need_update_open_stateid()
>>>>> and
>>>>> then exit the loop if:
>>>>> - the state is NFS_OPEN_STATE, and
>>>>> - the stateid sequence is > 1, and
>>>>> - the stateid doesn't match the current open stateid
>>>>>
>>>>> Fixes: 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in
>>>>> CLOSE/OPEN_DOWNGRADE")
>>>>> Cc: stable@vger.kernel.org # v5.4+
>>>>> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
>>>>> ---
>>>>> fs/nfs/nfs4proc.c | 27 +++++++++++++++++----------
>>>>> 1 file changed, 17 insertions(+), 10 deletions(-)
>>>>>
>>>>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>>>>> index 6e95c85fe395..9db29a438540 100644
>>>>> --- a/fs/nfs/nfs4proc.c
>>>>> +++ b/fs/nfs/nfs4proc.c
>>>>> @@ -1588,18 +1588,25 @@ static void
>>>>> nfs_test_and_clear_all_open_stateid(struct nfs4_state *state)
>>>>> static bool nfs_need_update_open_stateid(struct nfs4_state
>>>>> *state,
>>>>> const nfs4_stateid *stateid)
>>>>> {
>>>>> - if (test_bit(NFS_OPEN_STATE, &state->flags) == 0 ||
>>>>> - !nfs4_stateid_match_other(stateid, &state-
>>>>>> open_stateid)) {
>>>>> - if (stateid->seqid == cpu_to_be32(1))
>>>>> + bool state_matches_other =
>>>>> nfs4_stateid_match_other(stateid,
>>>>> &state->open_stateid);
>>>>> + bool seqid_one = stateid->seqid == cpu_to_be32(1);
>>>>> +
>>>>> + if (test_bit(NFS_OPEN_STATE, &state->flags)) {
>>>>> + /* The common case - we're updating to a new
>>>>> sequence
>>>>> number */
>>>>> + if (state_matches_other &&
>>>>> nfs4_stateid_is_newer(stateid, &state->open_stateid)) {
>>>>> + nfs_state_log_out_of_order_open_stateid
>>>>> (state,
>>>>> stateid);
>>>>> + return true;
>>>>> + }
>>>>> + } else {
>>>>> + /* This is the first OPEN */
>>>>> + if (!state_matches_other && seqid_one) {
>>>>
>>>> Why are we looking at the value of state_matches_other here? If
>>>> the
>>>> NFS_OPEN_STATE flags is not set, then the state->open_stateid is
>>>> not
>>>> initialised, so how does it make sense to look at a comparison
>>>> with
>>>> the
>>>> incoming stateid?
>>>
>>> You're right - it doesn't make sense. I failed to clean it up out
>>> of
>>> my
>>> decision matrix. I'll send a v3 after it gets tested overnight.
>>>
>>> Thanks for the look, and if you have another moment - what do you
>>> think
>>> about not bumping the seqid in the CLOSE/OPEN_DOWNGRADE path on
>>> OLD_STATEID
>>> if the state's refcount > 1? This would optimize away a lot of
>>> recovery
>>> for
>>> races like this, but I wonder if it would be possible to end up
>>> with
>>> two
>>> OPEN_DOWNGRADEs dueling that would never recover.
>>>
>>
>> It would lead to a return of the infinite loop in cases where the
>> client misses a reply to an OPEN or CLOSE due to a soft timeout
>> (which
>> is an important use case for us).
I hadn't thought about that, but doesn't the task return and decrement the
refcount on nfs4_state?
> In principle, RFC5661 says that the client is supposed to process all
> stateid operations in the order dictated by the seqid. The one problem
> with that mandate is that open-by-filename doesn't allow us to know
> whether or not a seqid bump is outstanding. This is why we have the 5
> second timeout wait in nfs_set_open_stateid_locked().
>
> We could do something similar to that for CLOSE by setting the seqid to
> 0, and then ensuring we process the CLOSE in the order the server ends
> up processing it. Unfortunately, that would require us to replay any
> OPEN calls that got shut down because we used seqid 0 (it would also
> not work for NFSv4.0)... So perhaps the best solution would be to
> instead allow CLOSE to wait for outstanding OPENs to complete, just
> like we do in nfs_set_open_stateid_locked()? Thoughts?
How can you know if there are outstanding OPENs? I thought that was the
whole intention of the CLOSE -> OLD_STATEID retry bump -- you cannot know if
the server was able to process an interrupted OPEN.
I had a dumb patch that just caused CLOSE to delay a bit before retrying if
it received OLD_STATEID, but I figured that was vulnerable to another
livelock where another process could just keep doing open(), close() as
well.
Ben
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2020-09-22 22:38 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-09-22 19:15 [PATCH 0/2 v2] a stateid race and a cleanup Benjamin Coddington
2020-09-22 19:15 ` [PATCH 1/2 v2] NFSv4: Fix a livelock when CLOSE pre-emptively bumps state sequence Benjamin Coddington
2020-09-22 19:48 ` Trond Myklebust
2020-09-22 21:08 ` Benjamin Coddington
2020-09-22 21:15 ` Trond Myklebust
2020-09-22 21:46 ` Trond Myklebust
2020-09-22 22:38 ` Benjamin Coddington
2020-09-22 19:15 ` [PATCH 2/2 v2] NFSv4: cleanup unused zero_stateid copy Benjamin Coddington
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.