NFS Server Not Responding after hw change (svc: transport busy, not enqueued)

public inbox for linux-nfs@vger.kernel.org

* NFS Server Not Responding after hw change (svc: transport busy, not enqueued)
From: Scott Sturdivant @ 2010-01-06  3:25 UTC
  To: linux-nfs

I'm not sure what level of detail is appropriate here, so I apologize in 
advance.

This past weekend I swapped some hardware on my NFS server.  I swapped in
a new motherboard, processor, and RAM, and am now using the on-board LAN.
My hard drives did not change, and upon booting up everything seemed to be
working just fine.  The problems start when a client mounts a share and
attempts to access a file.  The server is running Ubuntu 9.10 32-bit
server edition.  uname -a: Linux blargh-server
2.6.31-16-generic-pae #53-Ubuntu SMP Tue Dec 8 05:20:21 UTC 2009 i686
GNU/Linux.  The 1:1.2.0-2ubuntu8 nfs-kernel-server package is installed.

On the clients, I can mount the shares with "mount -t nfs
file-server:/home/scott/Videos/ ~/Videos".  The server's dmesg shows "Jan
5 07:49:23 file-server mountd[1606]: authenticated mount request from
192.168.1.100:802 for /home/scott/Videos/ (/home/scott/Videos/)".  I can
then "ls" that directory and retrieve the directory listing.  But if I
access a file (cp ~/Videos/*.avi /tmp), only a portion of a single file
copies before the I/O blocks.  Eventually dmesg on the client gives the
following error:  nfs: server file-server not responding, still trying

At this point, executing 'rpcinfo -p file-server' from the client still 
seems to indicate that NFS is running just fine on the server.

(scott) file-client:~
507 -> rpcinfo -p file-server
    program vers proto   port
     100000    2   tcp    111  portmapper
     100000    2   udp    111  portmapper
     100024    1   udp  41238  status
     100024    1   tcp  55833  status
     100021    1   udp  38360  nlockmgr
     100021    3   udp  38360  nlockmgr
     100021    4   udp  38360  nlockmgr
     100021    1   tcp  59774  nlockmgr
     100021    3   tcp  59774  nlockmgr
     100021    4   tcp  59774  nlockmgr
     100003    2   udp   2049  nfs
     100003    3   udp   2049  nfs
     100003    4   udp   2049  nfs
     100003    2   tcp   2049  nfs
     100003    3   tcp   2049  nfs
     100003    4   tcp   2049  nfs
     100005    1   udp  42451  mountd
     100005    1   tcp  57648  mountd
     100005    2   udp  42451  mountd
     100005    2   tcp  57648  mountd
     100005    3   udp  42451  mountd
     100005    3   tcp  57648  mountd
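
A registration listing like the one above only shows which services have registered with the portmapper; it does not show that nfsd is actually answering requests.  As a minimal sketch (the function name parse_rpcinfo and the shortened sample are illustrative assumptions, not part of the original report), the listing can be checked mechanically for an NFSv3-over-TCP entry:

```python
from collections import defaultdict


def parse_rpcinfo(text):
    """Map service name -> set of (version, proto, port) from `rpcinfo -p` text."""
    services = defaultdict(set)
    for line in text.splitlines():
        fields = line.split()
        # Data rows have five fields and start with a numeric program number;
        # the four-field header row is skipped by this check.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        _program, vers, proto, port, name = fields
        services[name].add((int(vers), proto, int(port)))
    return dict(services)


# Shortened sample taken from the listing in this message.
SAMPLE = """\
    program vers proto   port
     100000    2   tcp    111  portmapper
     100003    3   udp   2049  nfs
     100003    3   tcp   2049  nfs
     100005    3   tcp  57648  mountd
"""

registered = parse_rpcinfo(SAMPLE)
print((3, "tcp", 2049) in registered.get("nfs", set()))
```

A passing check here narrows the problem to the data path rather than to service registration, which matches what the rest of the report describes.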

As you can see though, the I/O is blocked.

(scott) file-client:~
504 -> ps aux | grep " D"
scott     4405  0.0  0.0   3428   920 pts/1    D+   08:04   0:00 cp Videos/*.avi /tmp/
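
Grepping for " D" can also match unrelated text in the command column.  A slightly stricter sketch, assuming the usual eleven-column `ps aux` layout with a header row (the function name and the second sample row are hypothetical), looks only at the STAT field:

```python
def uninterruptible(ps_text):
    """Return (pid, command) for `ps aux` rows whose STAT field starts with D."""
    rows = []
    for line in ps_text.splitlines()[1:]:  # skip the header row
        # Columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) == 11 and fields[7].startswith("D"):
            rows.append((int(fields[1]), fields[10].strip()))
    return rows


# First data row is from this message; the second is a made-up healthy process.
SAMPLE = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
scott     4405  0.0  0.0   3428   920 pts/1    D+   08:04   0:00 cp Videos/*.avi /tmp/
scott     4410  0.0  0.0   3336   800 pts/2    S+   08:05   0:00 sleep 60
"""

print(uninterruptible(SAMPLE))
```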

On the server's end, I did not see any errors in dmesg, syslog, or
messages until I increased the logging level using rpcdebug.  (I'm not
sure I did this correctly, but I ran 'rpcdebug -m module -s all' for each
of the modules listed by rpcdebug -vh.)
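
The rpcdebug step described above can be sketched as a small helper.  This assumes the four module names that rpcdebug(8) documents, as far as I know (rpc, nfs, nfsd, nlm), and the -s/-c flags for setting and clearing; the helper names are hypothetical.  The commands need root, and nothing runs until set_debug() is called:

```python
import subprocess

# Module names as documented for rpcdebug(8); treat this list as an assumption.
MODULES = ("rpc", "nfs", "nfsd", "nlm")


def rpcdebug_argv(module, enable=True, flags=("all",)):
    """Build the argv that sets (-s) or clears (-c) debug flags for one module."""
    action = "-s" if enable else "-c"
    return ["rpcdebug", "-m", module, action, *flags]


def set_debug(enable, run=subprocess.run):
    """Run rpcdebug once per module; `run` is injectable so it can be dry-run."""
    for module in MODULES:
        run(rpcdebug_argv(module, enable), check=True)


# Dry run: print the commands instead of executing them.
set_debug(True, run=lambda argv, check: print(" ".join(argv)))
```

Calling set_debug(False) afterwards clears the flags again, which is worth doing because the debug output is very verbose.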

In the snippet below from the server's dmesg, there are many "svc:
transport %p busy, not enqueued" messages:

[ 6588.481185] nfsd_dispatch: vers 3 proc 6
[ 6588.481211] nfsd: READ(3) 36: 01070001 0141401d 00000000 e12f98aa 1c4965f0 0d4e5b93 131072 bytes at 22282240
[ 6588.481231] nfsd: fh_verify(36: 01070001 0141401d 00000000 e12f98aa 1c4965f0 0d4e5b93)
[ 6588.481747] svc: socket f45f8e00 sendto([ed215000 132... ], 131204) = 131204 (addr 192.168.1.100, port=915)
[ 6588.481776] svc: socket f45f8e00 recvfrom(f45f8f70, 0) = 4
[ 6588.481792] svc: TCP record, 156 bytes
[ 6588.481821] svc: server f6ccd000 waiting for data (to = 900000)
[ 6588.482701] svc: socket f45f8e00 sendto([ea53a000 132... ], 131204) = 131204 (addr 192.168.1.100, port=915)
[ 6588.482727] svc: socket f45f8e00 recvfrom(c7ab109c, 3940) = 156
[ 6588.482732] svc: TCP complete record (156 bytes)
[ 6588.482739] svc: transport f45f8e00 served by daemon f6ccd000
[ 6588.482752] svc: transport f45f8e00 busy, not enqueued
[ 6588.482766] svc: got len=156
[ 6588.482781] svc: server f6cca000 waiting for data (to = 900000)
[ 6588.482787] svc: svc_authenticate (1)
[ 6588.482798] svc: calling dispatcher
[ 6588.482806] nfsd_dispatch: vers 3 proc 6
[ 6588.482831] nfsd: READ(3) 36: 01070001 0141401d 00000000 e12f98aa 1c4965f0 0d4e5b93 131072 bytes at 22151168
[ 6588.482854] svc: transport f45f8e00 busy, not enqueued
[ 6588.482870] nfsd: fh_verify(36: 01070001 0141401d 00000000 e12f98aa 1c4965f0 0d4e5b93)
[ 6588.483499] svc: socket f45f8e00 sendto([cddbc000 132... ], 131204) = 131204 (addr 192.168.1.100, port=915)
[ 6588.483531] svc: transport f45f8e00 busy, not enqueued
[ 6588.483543] svc: server de5f6000 waiting for data (to = 900000)
[ 6588.483639] svc: socket f45f8e00 sendto([f4dbd000 132... ], 131204) = 131204 (addr 192.168.1.100, port=915)
[ 6588.483667] svc: transport f45f8e00 busy, not enqueued
[ 6588.483674] svc: server f45d4000 waiting for data (to = 900000)
[ 6588.483904] svc: socket f45f8e00 sendto([ea445000 132... ], 131204) = 131204 (addr 192.168.1.100, port=915)
[ 6588.483931] svc: transport f45f8e00 busy, not enqueued
[ 6588.483937] svc: server de5f0000 waiting for data (to = 900000)
[ 6588.483987] svc: server f6ccd000, pool 0, transport f45f8e00, inuse=2
[ 6588.484004] svc: tcp_recv f45f8e00 data 1 conn 0 close 0
[ 6588.484018] svc: socket f45f8e00 recvfrom(f45f8f70, 0) = 4
[ 6588.484023] svc: TCP record, 156 bytes
[ 6588.484036] svc: socket f45f8e00 recvfrom(cdc2f09c, 3940) = 156
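
To see how the "busy, not enqueued" lines are distributed, a saved copy of the debug log can be tallied per transport.  This is only a sketch: the regular expression is written against the lines shown above, and the function name is hypothetical.  As far as I can tell, the message only says that the transport was already marked busy when another enqueue was attempted, so on its own it may not indicate a failure.

```python
import re
from collections import Counter

# Pattern written against the debug lines quoted in this message.
BUSY = re.compile(r"svc: transport ([0-9a-f]+) busy, not enqueued")


def busy_counts(log_text):
    """Count 'busy, not enqueued' messages per transport address."""
    return Counter(BUSY.findall(log_text))


# Three lines copied from the log in this message.
SAMPLE = """\
[ 6588.482739] svc: transport f45f8e00 served by daemon f6ccd000
[ 6588.482752] svc: transport f45f8e00 busy, not enqueued
[ 6588.482854] svc: transport f45f8e00 busy, not enqueued
"""

print(busy_counts(SAMPLE))
```

In the excerpt above every such line names the same transport, f45f8e00, which is the single TCP connection to the client at 192.168.1.100.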

While I obviously suspect the hardware, since that is what changed, I can
ssh to the server, scp large files between the two machines, and share the
same directories over Samba without any problems.  On the server I can even
mount an NFS share locally and manipulate the files just fine.  NFS over
the network seems to be the only thing giving me problems.

Thanks for any help, and please let me know if there's more detail that I 
can add to assist debugging.

Scott

* Re: NFS Server Not Responding after hw change (svc: transport busy, not enqueued)
From: J. Bruce Fields @ 2010-01-06 20:40 UTC
  To: Scott Sturdivant; +Cc: linux-nfs

On Tue, Jan 05, 2010 at 10:25:34PM -0500, Scott Sturdivant wrote:
> I'm not sure what level of detail is appropriate here, so I apologize in  
> advance.
>
> This past weekend I swapped some hardware on my NFS server.  I swapped in 
> a new motherboard, processor, ram, and am now using the on-board LAN.  My 
> hard drives did not change, and upon booting up everything seemed to be  
> working just fine.  The problems start coming when my clients mount a  
> share and attempt to access a file.  The server is running Ubuntu 9.10  
> 32-bit server edition.  Uname -a: Linux blargh-server  
> 2.6.31-16-generic-pae #53-Ubuntu SMP Tue Dec 8 05:20:21 UTC 2009 i686  
> GNU/Linux The 1:1.2.0-2ubuntu8 nfs-kernel-server package is installed.

On a quick skim I don't see an obvious reason; one approach (if you're
*positive* there weren't also any software changes) might be just to try
swapping the hardware back (starting with the LAN?) and see if you can
reliably turn the problem on/off with just one hardware change.

--b.

* Re: NFS Server Not Responding after hw change (svc: transport busy, not enqueued)
From: Scott Sturdivant @ 2010-01-07  1:02 UTC
  To: J. Bruce Fields; +Cc: linux-nfs



On Wed, 6 Jan 2010, J. Bruce Fields wrote:

> On Tue, Jan 05, 2010 at 10:25:34PM -0500, Scott Sturdivant wrote:
>> I'm not sure what level of detail is appropriate here, so I apologize in
>> advance.
>>
>> This past weekend I swapped some hardware on my NFS server.  I swapped in
>> a new motherboard, processor, ram, and am now using the on-board LAN.  My
>> hard drives did not change, and upon booting up everything seemed to be
>> working just fine.  The problems start coming when my clients mount a
>> share and attempt to access a file.  The server is running Ubuntu 9.10
>> 32-bit server edition.  Uname -a: Linux blargh-server
>> 2.6.31-16-generic-pae #53-Ubuntu SMP Tue Dec 8 05:20:21 UTC 2009 i686
>> GNU/Linux The 1:1.2.0-2ubuntu8 nfs-kernel-server package is installed.
>
> On a quick skim I don't see an obvious reason; one approach (if you're
> *positive* there weren't also any software changes) might be just to try
> swapping the hardware back (starting with the LAN?) and see if you can
> reliably turn the problem on/off with just one hardware change.
>
> --b.

Thank you for the good suggestion! I have done this and have verified that
the onboard LAN is indeed the root of the problem.  However, since the
onboard LAN handles Samba and scp but fails with NFS, I'm curious whether
this is an actual hardware problem or a driver issue.  Does anyone know
the appropriate place to report this problem?  Is there an atl1c list?

Thanks again,

Scott

* Re: NFS Server Not Responding after hw change (svc: transport busy, not enqueued)
From: J. Bruce Fields @ 2010-01-07 14:35 UTC
  To: Scott Sturdivant; +Cc: linux-nfs, netdev

On Wed, Jan 06, 2010 at 08:02:14PM -0500, Scott Sturdivant wrote:
> On Wed, 6 Jan 2010, J. Bruce Fields wrote:
>> On a quick skim I don't see an obvious reason; one approach (if you're
>> *positive* there weren't also any software changes) might be just to try
>> swapping the hardware back (starting with the LAN?) and see if you can
>> reliably turn the problem on/off with just one hardware change.
>>
>> --b.
>
> Thank you for the good suggestion! I have done this and have verified 
> that indeed the onboard LAN is the root of the problem.

Woo-hoo!

> However, as the  
> onboard LAN is able to handle Samba / scp but fails with NFS, I'm curious 
> if this is an actual hardware problem or a driver issue?  Does anyone 
> know where the appropriate place for this problem would be?  Is there an 
> atl1c list?

Adding netdev@vger.kernel.org....  Could you repeat any details about
the exact models of the network interfaces?

--b.

* Re: NFS Server Not Responding after hw change (svc: transport busy, not enqueued)
From: J. Bruce Fields @ 2010-01-07 14:41 UTC
  To: Scott Sturdivant; +Cc: linux-nfs, netdev

Oops--trying again with address fixed.

On Thu, Jan 07, 2010 at 09:35:58AM -0500, J. Bruce Fields wrote:
> On Wed, Jan 06, 2010 at 08:02:14PM -0500, Scott Sturdivant wrote:
> > On Wed, 6 Jan 2010, J. Bruce Fields wrote:
> >> On a quick skim I don't see an obvious reason; one approach (if you're
> >> *positive* there weren't also any software changes) might be just to try
> >> swapping the hardware back (starting with the LAN?) and see if you can
> >> reliably turn the problem on/off with just one hardware change.
> >>
> >> --b.
> >
> > Thank you for the good suggestion! I have done this and have verified 
> > that indeed the onboard LAN is the root of the problem.
> 
> Woo-hoo!
> 
> > However, as the  
> > onboard LAN is able to handle Samba / scp but fails with NFS, I'm curious 
> > if this is an actual hardware problem or a driver issue?  Does anyone 
> > know where the appropriate place for this problem would be?  Is there an 
> > atl1c list?
> 
> Adding netdev@vger.kernel.org....  Could you repeat any details about
> the exact models of the network interfaces?
> 
> --b.

* Re: NFS Server Not Responding after hw change (svc: transport busy, not enqueued)
From: Scott Sturdivant @ 2010-01-07 14:53 UTC
  To: J. Bruce Fields; +Cc: linux-nfs, netdev

>>> However, as the
>>> onboard LAN is able to handle Samba / scp but fails with NFS, I'm curious
>>> if this is an actual hardware problem or a driver issue?  Does anyone
>>> know where the appropriate place for this problem would be?  Is there an
>>> atl1c list?
>>
>> Adding netdev@vger.kernel.org....  Could you repeat any details about
>> the exact models of the network interfaces?
>>
>> --b.
>

For those of you just joining us, I have some new hardware that handles
ssh, scp, and Samba fine, but NFS does not work over it.  With the old NIC
put back in, all services are functional.

The onboard LAN (the bad one!) as seen from lspci -vvv:

01:00.0 Ethernet controller [0200]: Attansic Technology Corp. Device [1969:1063] (rev c0)
 	Subsystem: Elitegroup Computer Systems Device [1019:8131]
 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 	Latency: 0, Cache Line Size: 32 bytes
 	Interrupt: pin A routed to IRQ 25
 	Region 0: Memory at feac0000 (64-bit, non-prefetchable) [size=256K]
 	Region 2: I/O ports at dc00 [size=128]
 	Capabilities: [40] Power Management version 3
 		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
 		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
 	Capabilities: [48] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
 		Address: 00000000fee0f00c  Data: 4171
 	Capabilities: [58] Express (v1) Endpoint, MSI 00
 		DevCap:	MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
 			ExtTag- AttnBtn+ AttnInd+ PwrInd+ RBE+ FLReset-
 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
 			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
 			MaxPayload 128 bytes, MaxReadReq 512 bytes
 		DevSta:	CorrErr+ UncorrErr+ FatalErr- UnsuppReq+ AuxPwr+ TransPend-
 		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 unlimited, L1 unlimited
 			ClockPM+ Suprise- LLActRep- BwNot-
 		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
 			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
 		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
 	Capabilities: [6c] Vital Product Data <?>
 	Kernel driver in use: atl1c
 	Kernel modules: atl1c

This is on an ECS 945GCD-M motherboard:

dmi.bios.date: 07/21/2009
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 080013
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: 945GCD-M
dmi.board.vendor: ECS
dmi.board.version: V1.0
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: ECS
dmi.chassis.version: V1.0
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr080013:bd07/21/2009:svnECS:pn945GCD-M:pvrV1.0:rvnECS:rn945GCD-M:rvrV1.0:cvnECS:ct3:cvrV1.0:
dmi.product.name: 945GCD-M
dmi.product.version: V1.0
dmi.sys.vendor: ECS

The PCI add-on NIC (the good one!):

02:01.0 Ethernet controller: Intel Corporation 82541PI Gigabit Ethernet Controller (rev 05)
         Subsystem: Intel Corporation Device 1376
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
         Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 64 (63750ns min), Cache Line Size: 32 bytes
         Interrupt: pin A routed to IRQ 18
         Region 0: Memory at febe0000 (32-bit, non-prefetchable) [size=128K]
         Region 1: Memory at febc0000 (32-bit, non-prefetchable) [size=128K]
         Region 2: I/O ports at ec00 [size=64]
         Expansion ROM at 40000000 [disabled] [size=128K]
         Capabilities: [dc] Power Management version 2
                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                 Status: D0 PME-Enable- DSel=0 DScale=1 PME-
         Capabilities: [e4] PCI-X non-bridge device
                 Command: DPERE- ERO+ RBC=512 OST=1
                 Status: Dev=00:00.0 64bit- 133MHz- SCD- USC- DC=simple DMMRBC=2048 DMOST=1 DMCRS=8 RSCEM- 266MHz- 533MHz-
         Kernel driver in use: e1000
         Kernel modules: e1000
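
For comparing the two interfaces, the bound driver can be pulled out of saved `lspci -vvv` text.  A minimal sketch, assuming each device starts with an unindented bus:device.function line and that the driver appears on an indented "Kernel driver in use:" line as in the output above (the function name is hypothetical):

```python
import re

# Device header lines start at column 0 with a bus:device.function address.
DEVICE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]) ")
DRIVER = re.compile(r"^\s+Kernel driver in use: (\S+)\s*$")


def drivers_in_use(lspci_text):
    """Map PCI slot -> kernel driver name from `lspci -v`-style text."""
    result = {}
    slot = None
    for line in lspci_text.splitlines():
        header = DEVICE.match(line)
        if header:
            slot = header.group(1)
            continue
        driver = DRIVER.match(line)
        if driver and slot is not None:
            result[slot] = driver.group(1)
    return result


# Abbreviated from the two listings in this message.
SAMPLE = """\
01:00.0 Ethernet controller [0200]: Attansic Technology Corp. Device [1969:1063] (rev c0)
        Kernel driver in use: atl1c
02:01.0 Ethernet controller: Intel Corporation 82541PI Gigabit Ethernet Controller (rev 05)
        Kernel driver in use: e1000
"""

print(drivers_in_use(SAMPLE))
```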

I've filed a bug here: 
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/503484

Basically I see this in dmesg with the buggy onboard LAN:

[ 6588.482854] svc: transport f45f8e00 busy, not enqueued
[ 6588.482870] nfsd: fh_verify(36: 01070001 0141401d 00000000 e12f98aa 1c4965f0 0d4e5b93)
[ 6588.483499] svc: socket f45f8e00 sendto([cddbc000 132... ], 131204) = 131204 (addr 192.168.1.100, port=915)
[ 6588.483531] svc: transport f45f8e00 busy, not enqueued
[ 6588.483543] svc: server de5f6000 waiting for data (to = 900000)
[ 6588.483639] svc: socket f45f8e00 sendto([f4dbd000 132... ], 131204) = 131204 (addr 192.168.1.100, port=915)
[ 6588.483667] svc: transport f45f8e00 busy, not enqueued
[ 6588.483674] svc: server f45d4000 waiting for data (to = 900000)
[ 6588.483904] svc: socket f45f8e00 sendto([ea445000 132... ], 131204) = 131204 (addr 192.168.1.100, port=915)
[ 6588.483931] svc: transport f45f8e00 busy, not enqueued

Thank you!

Scott
