From: Frank van Maarseveen <frankvm@frankvm.com>
To: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Linux NFS mailing list <linux-nfs@vger.kernel.org>
Subject: Re: 3.1.4: NFSv3 RPC scheduling issue?
Date: Sun, 11 Dec 2011 19:10:42 +0100
Message-ID: <20111211181042.GA13425@janus>
In-Reply-To: <20111211124008.GA10460@janus>
On Sun, Dec 11, 2011 at 01:40:08PM +0100, Frank van Maarseveen wrote:
> On Fri, Dec 09, 2011 at 10:10:01PM -0500, Trond Myklebust wrote:
> > [...]
> > I'm still mystified as to what is going on here...
> >
> > Would it be possible to upgrade some of your clients to 3.1.5 (which
> > contains a fix for a sunrpc socket buffer problem) and then to add the
> > following patch?
>
> Did so, the mount locked up and still is, ready for some more
> experimentation. I don't see any difference however. Did a
> echo 0 >/proc/sys/sunrpc/rpc_debug afterwards (see below).
>
> A recipe which seems to trigger the issue (at least occasionally) is
>
> cd /mount-point
> ssh server echo 3 \>/proc/sys/vm/drop_caches
> echo 3 >/proc/sys/vm/drop_caches
> for i in `seq 100`
> do
> du >/dev/null 2>&1 &
> done
>
> I'll try it on a pristine kernel to rule out some kernel patches (unlikely to
> be the cause or trigger but just to be sure).
Tried, same result: my own NFS client patches seem to make no
difference, as I expected. The ICMP port unreachable messages (see my
other mail) go away when I stop ypbind, and they are also triggered by
"ypwhich" commands, so I no longer consider them relevant.

Not much output this time after "echo 0 >/proc/sys/sunrpc/rpc_debug". I
tried twice:
-pid- flgs status -client- --rqstp- -timeout ---ops--
16020 0080 -11 f4778230 f325d0a0 0 c191b4ac nfsv3 GETATTR a:call_status q:xprt_sending
16038 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:none
16041 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
16045 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
16048 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 READDIRPLUS a:call_reserveresult q:xprt_sending
16060 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 ACCESS a:call_reserveresult q:xprt_sending
16062 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
16069 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
-pid- flgs status -client- --rqstp- -timeout ---ops--
16020 0080 -11 f4778230 f325d0a0 0 c191b4ac nfsv3 GETATTR a:call_status q:xprt_sending
16038 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:none
16041 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
16045 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
16048 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 READDIRPLUS a:call_reserveresult q:xprt_sending
16060 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 ACCESS a:call_reserveresult q:xprt_sending
16062 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
16069 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
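
For anyone who wants to tally dumps like the two above, here is a minimal
sketch in Python. It assumes the eleven whitespace-separated columns seen
in this output (pid, flags, status, client, rqstp, timeout, ops, program,
procedure, a:action, q:queue); the function names parse_tasks and
summarize are made up for illustration and are not part of any kernel
tooling.

```python
from collections import Counter

# One of the dumps from this mail, copied verbatim.
SAMPLE = """\
-pid- flgs status -client- --rqstp- -timeout ---ops--
16020 0080 -11 f4778230 f325d0a0 0 c191b4ac nfsv3 GETATTR a:call_status q:xprt_sending
16038 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:none
16041 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
16045 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
16048 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 READDIRPLUS a:call_reserveresult q:xprt_sending
16060 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 ACCESS a:call_reserveresult q:xprt_sending
16062 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
16069 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
"""


def parse_tasks(text):
    """Parse RPC task lines; header lines and anything unexpected are skipped."""
    tasks = []
    for line in text.splitlines():
        fields = line.split()
        # Assumed layout: exactly 11 columns, the first being a numeric pid.
        if len(fields) != 11 or not fields[0].isdigit():
            continue
        pid, _flags, status, _client, rqstp, _timeout, _ops, prog, proc, action, queue = fields
        if not (action.startswith("a:") and queue.startswith("q:")):
            continue
        tasks.append({
            "pid": int(pid),
            "status": int(status),
            "has_slot": rqstp != "(null)",  # a non-null rqstp pointer
            "program": prog,
            "procedure": proc,
            "action": action[2:],
            "queue": queue[2:],
        })
    return tasks


def summarize(tasks):
    """Count tasks per (action, wait queue) pair."""
    return Counter((t["action"], t["queue"]) for t in tasks)


if __name__ == "__main__":
    tasks = parse_tasks(SAMPLE)
    for (action, queue), count in summarize(tasks).most_common():
        print(f"{count:3d}  a:{action}  q:{queue}")
    print("pids with a non-null rqstp:", [t["pid"] for t in tasks if t["has_slot"]])
```

On the dump above this reports six tasks in call_reserveresult on
xprt_sending, one in call_reserveresult on no queue, and one (pid 16020)
in call_status, which is the only task with a non-null rqstp.
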

The NFS client mounts from a machine hosting many virtual NFS servers,
using a separate IP address for every export. When access on the client
hangs, the same export is still mountable on this NFS client using
a different server IP address (one NIC on both sides, btw.). The dead
virtual server IP address seems dead only for NFS RPC and only from the
client in question: there is no traffic going out. Ping, rpcinfo et al.
just work. A mount on the troubled client using the dead IP address but
specifying a different virtual server export produces some traffic and
then gets stuck too, I guess at the point where the kernel needs to do
NFS RPC.

So, kernel NFS RPC from the client drops dead for a specific server IP
address.

--
Frank