public inbox for linux-nfs@vger.kernel.org
From: Lukas Razik <linux@razik.name>
To: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: [BUG?] Maybe NFS bug since 2.6.37 on SPARC64
Date: Thu, 3 Nov 2011 21:18:07 +0000 (GMT)
Message-ID: <1320355087.59657.YahooMailNeo@web24703.mail.ird.yahoo.com>
In-Reply-To: <1320353685.18396.119.camel@lade.trondhjem.org>

> On Thu, 2011-11-03 at 19:43 +0000, Lukas Razik wrote: 

>>  Hello together!
>> 
>>  My OS: Debian 6.0.3 (squeeze)
>>  Machines: SUN Enterprise T5120 (USPARC64)
>>  ---
>>  Issue description:
>> 
>>  I have an NFS server (cluster1=137.226.167.241) and a client
>>  (cluster2=137.226.167.242) which should mount its nfsroot from cluster1.
>> 
>>  The linux-2.6.32 kernel on cluster2 shows this during startup:
>>  [ 528.982985] IP-Config: Complete:
>>  [ 528.983049] device=eth0, addr=137.226.167.242, mask=255.255.255.224, gw=137.226.167.225,
>>  [ 528.983299] host=cluster2, domain=, nis-domain=(none),
>>  [ 528.983383] bootserver=255.255.255.255, rootserver=137.226.167.241, rootpath=
>>  [ 528.983633] Looking up port of RPC 100003/2 on 137.226.167.241
>>  [ 530.037059] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx
>>  [ 530.056881] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
>>  [ 564.002113] rpcbind: server 137.226.167.241 not responding, timed out
>>  [ 564.002295] Root-NFS: Unable to get nfsd port number from server, using default
>>  [ 564.002412] Looking up port of RPC 100005/1 on 137.226.167.241
>>  [ 564.104137] VFS: Mounted root (nfs filesystem) on device 0:15.
>> 
>>  In the end it does mount the nfsroot.
>> 
>>  But if I use kernel linux-2.6.39.4 on cluster2, it can't mount its nfsroot
>>  (I've added "nfsdebug" to the kernel arguments for more debug info):
>>  [ 407.571521] IP-Config: Complete:
>>  [ 407.571589] device=eth0, addr=137.226.167.242, mask=255.255.255.224, gw=137.226.167.225,
>>  [ 407.571793] host=cluster2, domain=, nis-domain=(none),
>>  [ 407.571907] bootserver=255.255.255.255, rootserver=137.226.167.241, rootpath=
>>  [ 407.572332] Root-NFS: nfsroot=/srv/nfs/cluster2
>>  [ 407.572726] NFS: nfs mount opts='udp,nolock,addr=137.226.167.241'
>>  [ 407.572927] NFS: parsing nfs mount option 'udp'
>>  [ 407.572995] NFS: parsing nfs mount option 'nolock'
>>  [ 407.573071] NFS: parsing nfs mount option 'addr=137.226.167.241'
>>  [ 407.573139] NFS: MNTPATH: '/srv/nfs/cluster2'
>>  [ 407.573203] NFS: sending MNT request for 137.226.167.241:/srv/nfs/cluster2
>>  [ 408.617894] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx
>>  [ 408.638319] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
>>  [ 442.666622] NFS: failed to create MNT RPC client, status=-60
>>  [ 442.666732] NFS: unable to mount server 137.226.167.241, error -60
>>  [ 442.666868] VFS: Unable to mount root fs via NFS, trying floppy.
>>  [ 442.667032] VFS: Insert root floppy and press ENTER
>> 
> Error 60 is ETIMEDOUT on SPARC, so it seems that the problem is
> basically the same one that you see in your 2.6.32 trace (rpcbind:
> server 137.226.167.241 not responding, timed out) except that now it is
> a fatal error.
> 
> Any idea why the first RPC calls might be failing here? A switch
> misconfiguration or something like that perhaps?
> 

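A side note for anyone decoding the "status=-60" above on another machine: Linux errno values are architecture-specific. SPARC defines ETIMEDOUT as 60, while the generic table used by x86 and most other ports defines ETIMEDOUT as 110 and uses 60 for ENOSTR. A minimal sketch of that lookup, assuming a hand-written table that covers only the values relevant to this thread (the names ERRNO_NAMES and decode_status are made up for illustration):

```python
# Hypothetical lookup table: only the entries needed for this thread,
# taken from the kernel's per-architecture errno headers.
ERRNO_NAMES = {
    "sparc": {60: "ETIMEDOUT"},
    "generic": {60: "ENOSTR", 110: "ETIMEDOUT"},
}


def decode_status(status, arch):
    """Return the symbolic errno name for a (negative) kernel status code."""
    return ERRNO_NAMES[arch].get(abs(status), "unknown")


if __name__ == "__main__":
    # The same number means different things on different architectures.
    print(decode_status(-60, "sparc"))
    print(decode_status(-60, "generic"))
```

So the -60 in my 2.6.39.4 log should be read with the SPARC table, not the one from an x86 box.
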
To be honest, I also suspected that some hardware between the nodes in our computing centre might be causing this fault. I therefore want to connect the nodes directly, but that will take a few days (because of bureaucracy)... :(

Another observation: all working kernels (<=2.6.36.4) first print
 Looking up port of RPC 100003/2 on 137.226.167.241
then
 Looking up port of RPC 100005/1 on 137.226.167.241
and then the mount succeeds:
 VFS: Mounted root (nfs filesystem) on device 0:15.

So what about >=2.6.37?
Why don't these kernels try other ports, too?
Or why do the old kernels try more than one port?
Why is there no output (even in nfsdebug mode) showing that the kernel tries to contact the RPC service?
Is there an "easy" way to change port 100003 to 100005 in >=2.6.37?
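
For reference, the pairs in those "Looking up port of RPC" lines appear to be ONC RPC program/version numbers rather than port numbers: 100003 is the program number registered for NFS and 100005 is the one for the mount protocol, with 100000 being the portmapper/rpcbind service that answers the lookup. A minimal sketch for decoding such log lines, assuming the message format shown in my 2.6.32 trace (the table and the function name decode_lookup are my own illustration, not kernel code):

```python
import re

# Well-known ONC RPC program numbers relevant to NFS root mounting.
RPC_PROGRAMS = {
    100000: "portmapper/rpcbind",
    100003: "nfs",
    100005: "mountd",
}

# Matches e.g. "Looking up port of RPC 100003/2 on 137.226.167.241".
LOOKUP_RE = re.compile(r"Looking up port of RPC (\d+)/(\d+) on (\S+)")


def decode_lookup(line):
    """Return (program name, version, host) for a lookup line, or None."""
    match = LOOKUP_RE.search(line)
    if match is None:
        return None
    program = int(match.group(1))
    version = int(match.group(2))
    host = match.group(3)
    return RPC_PROGRAMS.get(program, "unknown"), version, host


if __name__ == "__main__":
    print(decode_lookup("[ 528.983633] Looking up port of RPC 100003/2 on 137.226.167.241"))
    print(decode_lookup("[ 564.002412] Looking up port of RPC 100005/1 on 137.226.167.241"))
```

Read this way, the old kernels first ask rpcbind for the NFS service (version 2) and then for the mount service (version 1).
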

Many thanks for your fast answer!

Regards,
Lukas

