linux-nfs.vger.kernel.org archive mirror
From: Emmanuel Florac <eflorac@intellique.com>
To: linux-nfs@vger.kernel.org
Subject: Hard to debug NFS loss of connectivity
Date: Thu, 5 Sep 2013 19:18:00 +0200
Message-ID: <20130905191800.1c75b2fb@harpe.intellique.com>


Hi list, I have a serious problem I've never met before. Here is the
setup:

The NFS server is running Debian 6 amd64 but with a plain vanilla 3.2.50
kernel. It shares a large 81 TB volume (XFS over LVM on hardware RAID6)
through NFS without any particular options. Here is a glimpse
of /etc/exports:

/mnt/raid 10.1.1.0/255.255.255.128(fsid=1,rw,no_root_squash,async,no_subtree_check)

On the other side is a VMware ESX VM running Ubuntu 12.04 LTS (Ubuntu kernel
3.2.0-52, amd64) mounting the share. From the fstab:

10.1.1.99:/mnt/raid          /server         nfs  rw,hard,intr            0    0
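
The fstab line only shows the options that were requested; the client
negotiates the rest (NFS version, transport, rsize/wsize). A minimal sketch
for listing the effective options, assuming a Linux client that exposes
/proc/mounts. The function name nfs_mount_opts is made up for illustration:

```shell
# Print the mount point and effective options of every NFS mount
# found in a mounts table. The table defaults to /proc/mounts;
# another file in the same format can be passed as the first argument.
nfs_mount_opts() {
    # Field 3 is the filesystem type (nfs or nfs4),
    # field 2 the mount point, field 4 the option list.
    awk '$3 ~ /^nfs4?$/ { print $2 ": " $4 }' "${1:-/proc/mounts}"
}
```

Running nfs_mount_opts with no argument on the VM reads /proc/mounts;
nfsstat -m from nfs-utils should report similar information.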

The problem is as follows: stat'ing files on the VM makes the
NFS connection drop. For instance:

find /server -type f -ls

It works for a while, then stops responding: the NFS mount is frozen.
The network link is OK; I can still ssh from the server to the VM
and back, I can wget from the VM to the server, ping the server
from the VM, etc. Only NFS is affected.

Restarting NFS on the server does nothing to unfreeze the mount.
Switching from NFSv3 to NFSv4 makes no difference. The only remedy is to
reboot the VM. There is no error in dmesg, /var/log/syslog or
/var/log/messages on either the VM or the server.

I've tried rebooting the server on a 3.9.7 kernel. Same thing. 
Of course there isn't any data corruption of any sort. 
Running "find /mnt/raid -type f -ls" on the server works 
perfectly and lists about 25000 files without the slightest trouble.

It works equally well if I mount the NFS share on the server itself.


Now it gets crazier: when I run the find command shown
above, it always freezes on the same file, for
instance:

/server/folder1/folder2/folder3/folder4/.svn/somefile

However, if after a fresh reboot I do

stat /server/folder1/folder2/folder3/folder4/.svn/somefile

no problem. Even doing this:

cd /server/folder1/folder2/folder3/folder4/ && find . -type f -ls

works. However this

cd /server/folder1/folder2/folder3/ && find . -type f -ls

doesn't fly. It freezes at exactly the same point.
In the first test (running directly from /server) it
freezes after successfully listing 10000 files. In the last
test it freezes after only 25 files. 
So apparently it's not about the number of files.


Now I'm stuck. Short of going through tcpdump, I haven't
the faintest idea about what's going on, except that I tend to
think it's some Ubuntu kernel bug.

Any hint, idea, etc. would be extremely welcome. Even some
debugging method less painful than digging through huge
tcpdumps would be nice :)
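
If a capture turns out to be unavoidable, it can at least be kept small. A
minimal sketch, assuming the client interface is eth0, that NFS runs on
the standard port 2049, and that 256 bytes per packet are enough to see the
RPC headers. The function name and the /tmp/nfs.pcap path are hypothetical:

```shell
# Build a bounded tcpdump command for NFS traffic to one server.
# Arguments: server IP, then interface (defaults to eth0, an assumption).
# The command is only printed, not run, so it can be reviewed first.
nfs_capture_cmd() {
    server=$1
    iface=${2:-eth0}
    # -s 256 truncates each packet; -C 100 -W 5 rotates through
    # five files of about 100 MB each, so the capture cannot grow unbounded.
    printf 'tcpdump -i %s -s 256 -C 100 -W 5 -w /tmp/nfs.pcap host %s and port 2049\n' \
        "$iface" "$server"
}

nfs_capture_cmd 10.1.1.99
```

Started as root on the VM just before the find, the printed command should
leave the last RPC exchanges before the freeze in the most recent file.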

-- 
------------------------------------------------------------------------
Emmanuel Florac     |   Direction technique
                    |   Intellique
                    |	<eflorac@intellique.com>
                    |   +33 1 78 94 84 02
------------------------------------------------------------------------
