public inbox for linux-nfs@vger.kernel.org
From: Lukas Razik <linux@razik.name>
To: Chuck Lever <chuck.lever@oracle.com>, Jim Rees <rees@umich.edu>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [BUG?] Maybe NFS bug since 2.6.37 on SPARC64
Date: Sun, 13 Nov 2011 01:03:50 +0000 (GMT)
Message-ID: <1321146230.5436.YahooMailNeo@web24710.mail.ird.yahoo.com>
In-Reply-To: <1E7FF4C1-B7BA-4429-92ED-DC90D6B269C4@oracle.com>

Chuck Lever <chuck.lever@oracle.com> wrote:

> On Nov 12, 2011, at 1:49 PM, Jim Rees wrote:
> 
>>  The question for us is how long should an nfsroot client wait for the
>>  server to reply.  It sounds like the client used to wait longer than it
>>  does now.
> 
> Before, the client performed the GETPORT(NFS) step synchronously, first.  This 
> took 30 seconds or so to time out.  When it did, the client decided to proceed 
> with port 2049.  Then it went on to do the other mount tasks, and at that point 
> had waited long enough that these tasks did not time out while waiting for the 
> switch port.
> 
>>  It seems to me the client should wait at least 90 seconds so that the
>>  situation you're in (servers on non-portfast ports) will work.  I would
>>  think they should wait indefinitely, since there's not much else they
>>  can do.
> 
> It should be simple to wrap the (MNT(mnt), NFS(getroot)) steps in a while(true) 
> loop.  Would mount_root_nfs() be the right place for this?
> 

I thought this would be harder, and I had not had time to look inside the kernel, but I have now written a patch.
The kernel now tries to create the MNT RPC client up to three times instead of once, and only then gives up.
Third time lucky... ;-)
In my case the second MNT request succeeds:
---
[   71.594744] ADDRCONF(NETDEV_UP): eth0: link is not ready
[   72.617007] IP-Config: Complete:
[   72.617077]      device=eth0, addr=137.226.167.242, mask=255.255.255.224, gw=137.226.167.225,
[   72.617278]      host=137.226.167.242, domain=, nis-domain=(none),
[   72.617393]      bootserver=255.255.255.255, rootserver=137.226.167.241, rootpath=
[   72.617741] Root-NFS: nfsroot=/srv/nfs/cluster2
[   72.618010] NFS: nfs mount opts='udp,nolock,addr=137.226.167.241'
[   72.618147] NFS:   parsing nfs mount option 'udp'
[   72.618187] NFS:   parsing nfs mount option 'nolock'
[   72.618233] NFS:   parsing nfs mount option 'addr=137.226.167.241'
[   72.618301] NFS: MNTPATH: '/srv/nfs/cluster2'
[   72.618335] NFS: sending MNT request for 137.226.167.241:/srv/nfs/cluster2
[   72.618383] NFS: 1. MNT request
[   73.691872] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx
[   73.711988] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[  107.697332] NFS: 2. MNT request
[  107.704591] NFS: received 1 auth flavors
[  107.704653] NFS:   auth flavor[0]: 1
[  107.704834] NFS: MNT request succeeded
[  107.704897] NFS: using auth flavor 1
[  107.711857] VFS: Mounted root (nfs filesystem) on device 0:13.
INIT: version 2.88 booting
---

Many thanks again for your help and your very useful hints!

Regards,
Lukas


PS: This is what I changed:
--- linux-2.6.39.4/fs/nfs/mount_clnt.c  2011-08-03 21:43:28.000000000 +0200
+++ linux-2.6.39.4-fix/fs/nfs/mount_clnt.c      2011-11-13 01:58:13.000000000 +0100
@@ -164,6 +164,7 @@
        };
        struct rpc_clnt         *mnt_clnt;
        int                     status;
+       int                     attempt = 0;
 
        dprintk("NFS: sending MNT request for %s:%s\n",
                (info->hostname ? info->hostname : "server"),
@@ -172,7 +173,13 @@
        if (info->noresvport)
                args.flags |= RPC_CLNT_CREATE_NONPRIVPORT;
 
-       mnt_clnt = rpc_create(&args);
+       do {
+               attempt++;
+               dprintk("NFS: %d. MNT request\n", attempt);
+               mnt_clnt = rpc_create(&args);
+       } while (IS_ERR(mnt_clnt) && attempt < 3);
+
+
        if (IS_ERR(mnt_clnt))
                goto out_clnt_err;
--
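
For anyone who wants to try the retry logic outside the kernel, here is a minimal user-space sketch of the same do/while pattern. It is only an illustration: fake_rpc_create(), struct fake_link and create_with_retry() are hypothetical names that do not exist in the kernel, and the stub merely pretends that the link becomes usable on the n-th call. In the real patch, the pause between attempts comes from rpc_create() itself timing out (about 35 seconds between the first and second request in the log above); the sketch does not model that delay.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the state of the switch port. */
struct fake_link {
	int calls;        /* how many create attempts were made so far */
	int ready_after;  /* number of the first attempt that succeeds */
};

/*
 * Hypothetical stand-in for rpc_create(): returns 0 on success and -1
 * while the simulated link is not yet forwarding traffic.
 */
static int fake_rpc_create(struct fake_link *link)
{
	link->calls++;
	return (link->calls >= link->ready_after) ? 0 : -1;
}

/*
 * Same shape as the loop in the patch: try at most max_attempts times
 * and stop as soon as one attempt succeeds.
 * Returns the number of the successful attempt, or 0 if all failed.
 */
static int create_with_retry(struct fake_link *link, int max_attempts)
{
	int attempt = 0;
	int err;

	assert(max_attempts > 0);

	do {
		attempt++;
		printf("NFS: %d. MNT request\n", attempt);
		err = fake_rpc_create(link);
	} while (err != 0 && attempt < max_attempts);

	return (err == 0) ? attempt : 0;
}
```

With ready_after = 2 and max_attempts = 3, create_with_retry() returns 2, which mirrors the log above where the second request succeeds. If the link never comes up within three attempts it returns 0, corresponding to the patched kernel falling through to out_clnt_err.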

