linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ben Greear <greearb@candelatech.com>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: Question on nfs40_discover_server_trunking.
Date: Fri, 18 Jan 2013 14:34:07 -0800	[thread overview]
Message-ID: <50F9CDDF.4070609@candelatech.com> (raw)
In-Reply-To: <1A56CC87-38FF-47B4-9CA9-7BAE394AF0D2@oracle.com>

On 01/18/2013 02:29 PM, Chuck Lever wrote:
>
> On Jan 18, 2013, at 4:59 PM, Ben Greear <greearb@candelatech.com> wrote:
>
>> On 01/18/2013 01:33 PM, Chuck Lever wrote:
>>>
>>> On Jan 18, 2013, at 4:28 PM, Ben Greear <greearb@candelatech.com> wrote:
>>>
>>>> Any chance the STALE_CLIENTID case needs a 'break'?
>>>
>>> I don't think so.  LEASE_CONFIRM is set, and we want to wake the state renewal thread.
>>>
>>>>
>>>> Twice I've seen kernel crashes after the nfs40_walk_client_list
>>>> failed (though code comments say it should never fail).
>>>
>>> nfs40_walk_client_list() is looking for an nfs_client that is supposed to already be in the nfs_client list.  If the search fails, that's a bug.
>>>
>>> Eyeball the contents of your nfs_client list.  You should find an appropriate nfs_client in there, and then figure out why the search doesn't find it.
>>
>> Ok, I think I found another issue.
>>
>> nfs4_client_init does not initialize 'old', but passes it to nfs4_discover_server_trunking.
>>
>> That gets passed to the detect_trunking operation, which is nfs40_discover_server_trunking
>> or nfs41_discover_server_trunking.
>
> Yes, *result is an output parameter, not an input parameter.
>
>> This will call walk_client_list, which also may not ever assign a value to
>> 'result'.
>
> It should always assign *result in the SUCCESS case.
>
>> The code in walk_client_list always dereferences result, however.
>>
>> So, that is probably why my system blows up shortly after the 'impossible'
>> error message...
>
> In particular, nfs4_walk_client_list() does not set *result when it returns NFS4ERR_STALE_CLIENTID.  In nfs4_discover_server_trunking(), should we therefore have this:
>
>       status = nfs40_walk_client_list(clp, result, cred);
>       switch (status) {
>       case -NFS4ERR_STALE_CLIENTID:
>               set_bit(NFS4CLNT_LEASE_CONFIRM, &clp->cl_state);
>>>            nfs4_schedule_state_renewal(clp);
>>>            break;
>       case 0:
>
> I'm not sure the nfs4_schedule_state_renewal() call is necessary here.


I'm not sure I follow you entirely..but I'm going to start testing this patch in
a minute.  Looks like it should fix my crash, and maybe give clues about
why we get to the error case in the first place.


diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
index d6b39a9..3188283 100644
--- a/fs/nfs/nfs4client.c
+++ b/fs/nfs/nfs4client.c
@@ -307,6 +307,8 @@ int nfs40_walk_client_list(struct nfs_client *new,
         };
         int status;

+       *result = NULL;
+
         spin_lock(&nn->nfs_client_lock);
         list_for_each_entry_safe(pos, n, &nn->nfs_client_list, cl_share_link) {
                 /* If "pos" isn't marked ready, we can't trust the
@@ -364,6 +366,7 @@ int nfs40_walk_client_list(struct nfs_client *new,
                 nfs_put_client(prev);
         spin_unlock(&nn->nfs_client_lock);
         pr_err("NFS: %s Error: no matching nfs_client found\n", __func__);
+       WARN_ON(1);
         return -NFS4ERR_STALE_CLIENTID;
  }

@@ -433,6 +436,8 @@ int nfs41_walk_client_list(struct nfs_client *new,
         struct nfs_client *pos, *n, *prev = NULL;
         int error;

+       *result = NULL;
+
         spin_lock(&nn->nfs_client_lock);
         list_for_each_entry_safe(pos, n, &nn->nfs_client_list, cl_share_link) {
                 /* If "pos" isn't marked ready, we can't trust the
@@ -489,6 +494,7 @@ int nfs41_walk_client_list(struct nfs_client *new,
          */
         spin_unlock(&nn->nfs_client_lock);
         pr_err("NFS: %s Error: no matching nfs_client found\n", __func__);
+       WARN_ON(1);
         return -NFS4ERR_STALE_CLIENTID;
  }
  #endif /* CONFIG_NFS_V4_1 */
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index c351e6b..54d29e2 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -142,7 +142,8 @@ int nfs40_discover_server_trunking(struct nfs_client *clp,
         case 0:
                 /* Sustain the lease, even if it's empty.  If the clientid4
                  * goes stale it's of no use for trunking discovery. */
-               nfs4_schedule_state_renewal(*result);
+               if (*result)
+                       nfs4_schedule_state_renewal(*result);
                 break;
         }




-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


  reply	other threads:[~2013-01-18 22:34 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-01-18 21:28 Question on nfs40_discover_server_trunking Ben Greear
2013-01-18 21:33 ` Chuck Lever
2013-01-18 21:36   ` Ben Greear
2013-01-18 21:59   ` Ben Greear
2013-01-18 22:29     ` Chuck Lever
2013-01-18 22:34       ` Ben Greear [this message]
2013-01-18 22:43         ` Chuck Lever
2013-01-18 22:51           ` Ben Greear
2013-01-18 23:01             ` Ben Greear
2013-01-18 22:03   ` Myklebust, Trond
     [not found]   ` <1358546604.2872.6.camel@leira.trondhjem.org>
2013-01-18 22:59     ` Myklebust, Trond
2013-01-18 23:06       ` Ben Greear
2013-01-18 23:14       ` Chuck Lever
2013-01-18 23:21         ` Ben Greear
2013-01-19  0:44         ` Myklebust, Trond
     [not found]         ` <1358556248.2835.10.camel@leira.trondhjem.org>
2013-01-19  1:01           ` Myklebust, Trond
2013-01-19  1:27             ` Chuck Lever
2013-01-19  4:11               ` Myklebust, Trond
2013-01-21 17:32                 ` Ben Greear

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50F9CDDF.4070609@candelatech.com \
    --to=greearb@candelatech.com \
    --cc=chuck.lever@oracle.com \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).