linux-nfs.vger.kernel.org archive mirror
From: Anna Schumaker <Anna.Schumaker@netapp.com>
To: Benjamin Coddington <bcodding@redhat.com>
Cc: Trond Myklebust <trondmy@primarydata.com>,
	List Linux NFS Mailing <linux-nfs@vger.kernel.org>,
	Oleg Drokin <green@linuxhacker.ru>
Subject: Re: [PATCH v7 13/31] NFSv4.1: Ensure we always run TEST/FREE_STATEID on locks
Date: Thu, 10 Nov 2016 15:54:46 -0500
Message-ID: <50b6aeb9-cb21-9f46-dadd-e7ba0f5d86ed@Netapp.com>
In-Reply-To: <BDCCD810-781F-4DD6-91E8-279A2C3377EF@redhat.com>

On 11/10/2016 03:18 PM, Benjamin Coddington wrote:
> 
> On 10 Nov 2016, at 10:58, Benjamin Coddington wrote:
> 
>> Hi Anna,
>>
>> On 10 Nov 2016, at 10:01, Anna Schumaker wrote:
>>> Do you have an estimate for when this patch will be ready?  I want to include it in my next bugfix pull request for 4.9.
>>
>> I haven't posted because I am still trying to get to the bottom of another
>> problem where the client gets stuck in a loop sending the same stateid over
>> and over on NFS4ERR_OLD_STATEID.  I want to make sure this problem isn't
>> caused by this fix -- which I don't think it is, but I'd rather make sure.
>> If I don't make any progress on this problem by the end of today, I'll post
>> what I have.
>>
>> Read on if interested in this new problem:
>>
>> It looks like racing opens with the same openowner can be returned out of
>> order by the server, so the client sees stateid seqid of 2 before 1.  Then a
>> LOCK sent with seqid 1 is endlessly retried if sent while doing recovery.
>>
>> It's hard to tell whether I captured all the moving parts needed to
>> describe this problem, though: it takes a very long time for me to
>> reproduce, and the packet captures were dropping frames.  I'm working on
>> manually reproducing it now.
> 
> Anna,
> 
> I haven't gotten to the bottom of it, and so I'm not confident it isn't a
> problem created by the fix I've been testing, which is:
> 
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index e809498..2aa9d86 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -2564,12 +2564,15 @@ static void nfs41_check_delegation_stateid(struct nfs4_state *state)
>  static int nfs41_check_expired_locks(struct nfs4_state *state)
>  {
>         int status, ret = NFS_OK;
> -       struct nfs4_lock_state *lsp;
> +       struct nfs4_lock_state *lsp, *tmp;
>         struct nfs_server *server = NFS_SERVER(state->inode);
> 
>         if (!test_bit(LK_STATE_IN_USE, &state->flags))
>                 goto out;
> -       list_for_each_entry(lsp, &state->lock_states, ls_locks) {
> +       spin_lock(&state->state_lock);
> +       list_for_each_entry_safe(lsp, tmp, &state->lock_states, ls_locks) {
> +               atomic_inc(&lsp->ls_count);
> +               spin_unlock(&state->state_lock);
>                 if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
>                         struct rpc_cred *cred = lsp->ls_state->owner->so_cred;
> 
> @@ -2588,7 +2591,10 @@ static int nfs41_check_expired_locks(struct nfs4_state *state)
>                                 break;
>                         }
>                 }
> -       };
> +               nfs4_put_lock_state(lsp);
> +               spin_lock(&state->state_lock);
> +       }
> +       spin_unlock(&state->state_lock);
>  out:
>         return ret;
>  }
> 
> http://people.redhat.com/bcodding/old_stateid_loop is tshark output of my
> only good wire capture of the problem.  Without this patch, generic/089
> crashes long before this problem is reproduced, so I am stuck figuring it
> out, I'm afraid.  Don't wait on my account.
> 
> I plan on trying a bit more to reproduce tomorrow, and if I cannot, I'll
> write about it under separate cover.

Sounds good.  Thanks for the update!

Anna

> 
> Ben
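
The out-of-order arrival described in the quoted message (stateid seqid 2
seen before seqid 1) can be illustrated with a wraparound-aware seqid
comparison.  This is only a sketch under assumptions: the function names
below are hypothetical, it is not the client code, and it does not claim to
reproduce or explain the NFS4ERR_OLD_STATEID retry loop, whose cause is
still unknown in this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: nonzero if seqid a is newer than seqid b.
 * The signed difference keeps the comparison meaningful across 32-bit
 * wraparound (this relies on the usual two's-complement conversion).
 */
int seqid_is_newer(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}

/*
 * Hypothetical update rule: accept the seqid carried by an OPEN reply
 * only if it is newer than the one already cached.
 */
uint32_t apply_open_reply(uint32_t current, uint32_t incoming)
{
	return seqid_is_newer(incoming, current) ? incoming : current;
}

/* The reply carrying seqid 2 is processed first, then the one with 1. */
uint32_t demo_out_of_order(void)
{
	uint32_t current = 2;

	current = apply_open_reply(current, 1);	/* late reply is ignored */
	assert(current == 2);
	return current;
}
```

Under this rule the late reply does not move the cached seqid backwards, so
any request that was already built with seqid 1 is the one the server would
consider old.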

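
The locking pattern in the patch quoted above (take a reference on the
entry under the lock, drop the lock for the call that may block, then put
the reference and retake the lock) can be sketched in userspace.  This is a
minimal illustration assuming pthreads; struct lock_state, struct
open_state, slow_check() and check_all() are hypothetical stand-ins, not
the kernel's types or functions.  Like list_for_each_entry_safe(), it
samples the next pointer before the lock is dropped, so it is only safe if
that next entry cannot be freed in the meantime; the sketch does not
address that case.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-lock state entry. */
struct lock_state {
	struct lock_state *next;
	int refcount;		/* protected by open_state.lock here */
	int checked;
};

/* Hypothetical stand-in for the open state that owns the list. */
struct open_state {
	pthread_mutex_t lock;
	struct lock_state *head;
};

/* Stands in for a call that may block, such as a round trip to a server. */
void slow_check(struct lock_state *ls)
{
	ls->checked = 1;
}

/* Walk the list, never holding the lock across slow_check(). */
int check_all(struct open_state *st)
{
	struct lock_state *ls, *next;
	int n = 0;

	pthread_mutex_lock(&st->lock);
	for (ls = st->head; ls != NULL; ls = next) {
		next = ls->next;	/* sampled while the lock is held */
		ls->refcount++;		/* pin the entry */
		pthread_mutex_unlock(&st->lock);

		slow_check(ls);		/* lock is not held here */
		n++;

		pthread_mutex_lock(&st->lock);
		assert(ls->refcount > 1);
		ls->refcount--;		/* unpin; the list still holds one */
	}
	pthread_mutex_unlock(&st->lock);
	return n;
}

/* Build a three-entry list, walk it, and report how many were checked. */
int demo_check_all(void)
{
	struct lock_state c = { NULL, 1, 0 };
	struct lock_state b = { &c, 1, 0 };
	struct lock_state a = { &b, 1, 0 };
	struct open_state st;
	int n;

	pthread_mutex_init(&st.lock, NULL);
	st.head = &a;
	n = check_all(&st);
	pthread_mutex_destroy(&st.lock);

	assert(a.checked && b.checked && c.checked);
	assert(a.refcount == 1 && b.refcount == 1 && c.refcount == 1);
	return n;
}
```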