From: Simon Kirby <sim@hostway.ca>
To: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: linux-nfs@vger.kernel.org
Subject: Re: NFS client/sunrpc getting stuck on 2.6.36
Date: Fri, 19 Nov 2010 12:20:05 -0800
Message-ID: <20101119202004.GA3270@hostway.ca>
In-Reply-To: <1289452967.4062.10.camel@heimdal.trondhjem.org>
On Thu, Nov 11, 2010 at 01:22:47PM +0800, Trond Myklebust wrote:
> On Wed, 2010-11-10 at 18:35 -0800, Simon Kirby wrote:
> > Still seeing all sorts of boxes fall over with 2.6.35 and 2.6.36 NFS.
> > Unfortunately, it doesn't happen all the time...only certain load
> > patterns seem to start it off. Once it starts, I can't find a way to
> > make it recover without rebooting.
> >...
> > NFS: permission(0:4c/5284877), mask=0x1, res=0
> > NFS: revalidating (0:4c/3247737045)
> >
> > 900ms matches the probably-silly nfs mount settings we're currently using:
> >
> > rw,hard,intr,tcp,timeo=9,retrans=3,rsize=8192,wsize=8192
> >
> > Full kernel log here: http://0x.ca/sim/ref/2.6.36_stuck_nfs/
>
> timeo=9 is a completely insane retransmit value for a tcp connection.
>
> Please use the default timeo=600, and all will work correctly.
Ok, so, we were running with timeo=300 instead on a number of servers,
and we were still seeing the problem on 2.6.36. I've uploaded a new
kernel log (lsh1051) here:
http://0x.ca/sim/ref/2.6.36_stuck_nfs/
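
For reference, nfs(5) gives timeo in tenths of a second, so timeo=9 is the 900ms seen earlier, timeo=300 is 30 seconds, and the TCP default of 600 is 60 seconds. A minimal Python sketch of that conversion follows. The helper name timeo_seconds is made up for illustration, and it only parses an option string; it does not query the kernel or the live mount table.

```python
def timeo_seconds(mount_opts: str) -> float:
    """Return the timeo= value of an NFS mount option string, in seconds.

    timeo is expressed in deciseconds (tenths of a second), per nfs(5).
    Raises KeyError if the string carries no timeo= option.
    """
    for opt in mount_opts.split(","):
        key, _, value = opt.strip().partition("=")
        if key == "timeo":
            return int(value) / 10.0
    raise KeyError("no timeo= option present")


# Option string quoted earlier in the thread.
old_opts = "rw,hard,intr,tcp,timeo=9,retrans=3,rsize=8192,wsize=8192"

print(timeo_seconds(old_opts))     # 0.9
print(timeo_seconds("timeo=300"))  # 30.0
print(timeo_seconds("timeo=600"))  # 60.0
```
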
The log starts out with the hung task warnings occurring after
otherwise-normal operation. Once I noticed, I set rpc/nfs_debug to 1,
and then later set it to 255.
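
One way to flip those debug masks is through the sysctl files under /proc/sys/sunrpc. The sketch below assumes that interface was used (the message does not say whether it was this or the rpcdebug tool), and the function name set_debug_mask is hypothetical. The value is a bitmask, so 255 simply enables more debug classes than 1.

```python
import os

# Standard location of the rpc_debug and nfs_debug sysctl files.
SUNRPC_SYSCTL_DIR = "/proc/sys/sunrpc"


def set_debug_mask(mask, facilities=("rpc", "nfs"), base=SUNRPC_SYSCTL_DIR):
    """Write a debug bitmask to <base>/<facility>_debug for each facility.

    Returns the list of paths written. Writing under /proc needs root.
    The base directory is a parameter so the function can be exercised
    against a scratch directory.
    """
    written = []
    for name in facilities:
        path = os.path.join(base, f"{name}_debug")
        with open(path, "w") as f:
            f.write(f"{mask}\n")
        written.append(path)
    return written


# Usage, as root:
#   set_debug_mask(1)    # light tracing first
#   set_debug_mask(255)  # then a wider mask
#   set_debug_mask(0)    # switch tracing back off
```
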
Since several servers were stuck at the same time and we were losing
quorum, I decided to try something more drastic and booted into
2.6.37-rc2-git3. This kernel hasn't got stuck yet! However, it's
spitting out some new errors which may be worth looking into:
[ 1574.088812] NFS: server 10.10.52.222 error: fileid changed
[ 1574.088814] fsid 0:18: expected fileid 0x4c081940, got 0x4c081950
[11340.409447] NFS: server 10.10.52.228 error: fileid changed
[11340.409450] fsid 0:45: expected fileid 0x696ff82, got 0x16a98bd7
[20832.579912] NFS: server 10.10.52.225 error: fileid changed
[20832.579914] fsid 0:2a: expected fileid 0x8c67ebab, got 0x8c6811e5
[32775.957351] NFS: server 10.10.52.230 error: fileid changed
[32775.957354] fsid 0:52: expected fileid 0x919041fd, got 0x93f1962d
These are also in the same kernel log. The code that prints this error
isn't new, so something else seems to have changed to trigger it.
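
To pull these out of a long kernel log, the two-line "fileid changed" reports can be paired up mechanically. The following is a rough Python sketch, assuming the messages always arrive as a "server ... error" line followed by an "fsid ... expected fileid" line, as they do in the excerpt above. The function name and the regular expressions are illustrative, not taken from any existing tool.

```python
import re

SERVER_RE = re.compile(r"NFS: server (\S+) error: fileid changed")
FILEID_RE = re.compile(
    r"fsid (\S+): expected fileid (0x[0-9a-f]+), got (0x[0-9a-f]+)"
)


def parse_fileid_changes(lines):
    """Pair each 'fileid changed' line with the fsid line that follows it."""
    events = []
    server = None
    for line in lines:
        m = SERVER_RE.search(line)
        if m:
            server = m.group(1)
            continue
        m = FILEID_RE.search(line)
        if m and server is not None:
            fsid, expected, got = m.groups()
            events.append({
                "server": server,
                "fsid": fsid,
                "expected": int(expected, 16),
                "got": int(got, 16),
            })
            server = None  # wait for the next server line
    return events


# Sample input: the log lines quoted in this message.
sample = """\
[ 1574.088812] NFS: server 10.10.52.222 error: fileid changed
[ 1574.088814] fsid 0:18: expected fileid 0x4c081940, got 0x4c081950
[11340.409447] NFS: server 10.10.52.228 error: fileid changed
[11340.409450] fsid 0:45: expected fileid 0x696ff82, got 0x16a98bd7
[20832.579912] NFS: server 10.10.52.225 error: fileid changed
[20832.579914] fsid 0:2a: expected fileid 0x8c67ebab, got 0x8c6811e5
[32775.957351] NFS: server 10.10.52.230 error: fileid changed
[32775.957354] fsid 0:52: expected fileid 0x919041fd, got 0x93f1962d
""".splitlines()

events = parse_fileid_changes(sample)
for e in events:
    print(e["server"], e["fsid"], hex(e["expected"]), "->", hex(e["got"]))
```

Grouping the result by server or fsid would show whether the mismatches cluster on one export or are spread across all of them.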
Simon-
Thread overview: 15+ messages
2010-11-11 2:35 NFS client/sunrpc getting stuck on 2.6.36 Simon Kirby
2010-11-11 5:22 ` Trond Myklebust
2010-11-11 8:49 ` Simon Kirby
2010-11-19 20:20 ` Simon Kirby [this message]
2010-11-19 21:24 ` Trond Myklebust
2010-11-19 22:03 ` Simon Kirby
2010-11-19 22:17 ` Trond Myklebust
2010-11-19 22:58 ` Simon Kirby
2010-11-19 23:17 ` Trond Myklebust
2010-11-21 6:43 ` Simon Kirby
2010-11-21 19:55 ` Trond Myklebust
2010-11-21 6:40 ` Simon Kirby
2010-11-21 19:54 ` Trond Myklebust
2010-11-24 5:18 ` Simon Kirby
2010-11-24 15:05 ` Trond Myklebust