From: Simon Kirby <sim@hostway.ca>
To: linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [3.1-rc4] NFSv3 client hang
Date: Fri, 9 Sep 2011 12:45:10 -0700
Message-ID: <20110909194509.GB6195@hostway.ca>

The 3.1-rc4 NFSv3 client hung on another box (separate from the one
which Oopsed in vfs_rmdir() under a similar workload). This build was also
of 9e79e3e9dd9672b37ac9412e9a926714306551fe (slightly past 3.1-rc4), and
"git log 9e79e3e9dd96.. fs/nfs net/sunrpc" is empty.
All mounts to one server IP have hung, while all other mounts work fine.
I ran "cd /proc/sys/sunrpc; echo 255 > rpc_debug; echo 255 > nfs_debug"
for a while, then kill -9'd all D-state processes to simplify the
debugging, and was left with one that was not interruptible:
28612 D /usr/local/apache2/bin/http sleep_on_page
# cat /proc/28612/stack
[<ffffffff810bdf49>] sleep_on_page+0x9/0x10
[<ffffffff810bdf34>] __lock_page+0x64/0x70
[<ffffffff8112a9e5>] __generic_file_splice_read+0x2d5/0x500
[<ffffffff8112ac5a>] generic_file_splice_read+0x4a/0x90
[<ffffffff812030e5>] nfs_file_splice_read+0x85/0xd0
[<ffffffff81128fb2>] do_splice_to+0x72/0xa0
[<ffffffff811297e4>] splice_direct_to_actor+0xc4/0x1d0
[<ffffffff81129942>] do_splice_direct+0x52/0x70
[<ffffffff81100096>] do_sendfile+0x166/0x1d0
[<ffffffff81100185>] sys_sendfile64+0x85/0xb0
[<ffffffff816af57b>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
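
For anyone trying to reproduce the triage step above (find what is still in D state, then dump its kernel stack), here is a minimal Python sketch. It is not what was run on the affected box. The helper names parse_stat() and d_state_processes() are made up for illustration, and reading /proc/<pid>/stack normally requires root.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = stat_text.index("(")
    rparen = stat_text.rindex(")")
    pid = int(stat_text[:lparen])
    comm = stat_text[lparen + 1:rparen]
    state = stat_text[rparen + 1:].split()[0]
    return pid, comm, state


def d_state_processes(proc_root="/proc"):
    """List (pid, comm) for every process in uninterruptible sleep."""
    if not os.path.isdir(proc_root):
        return []
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while we were scanning
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in d_state_processes():
        print(pid, comm)
        try:
            with open(f"/proc/{pid}/stack") as f:
                print(f.read())
        except OSError as exc:
            print("  (stack unavailable: %s)" % exc)
```
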
echo 1 > /proc/sys/sunrpc/rpc_debug emits:
-pid- flgs status -client- --rqstp- -timeout ---ops--
37163 0001 -11 ffff8802251bca00 (null) 0 ffffffff816e4110 nfsv3 READ a:call_reserveresult q:xprt_sending
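
A rough Python sketch for picking apart a task line like the one above, assuming the column layout shown in the header (pid, flags, status, client, rqstp, timeout, ops, then program, procedure, a:<action>, q:<queue>). The function name parse_task() and the dictionary keys are illustrative only. For reference, status -11 is -EAGAIN on Linux, and rqstp "(null)" means the task holds no request slot yet.

```python
def parse_task(line):
    """Split one line of the rpc_debug task listing into named fields."""
    fields = line.split()
    if len(fields) < 11:
        raise ValueError("not a task line: %r" % line)
    pid, flags, status, client, rqstp, timeout, ops, prog, proc = fields[:9]
    action, queue = fields[9], fields[10]
    if not action.startswith("a:") or not queue.startswith("q:"):
        raise ValueError("unexpected action/queue columns: %r" % line)
    return {
        "pid": int(pid),
        "flags": int(flags, 16),
        "status": int(status),          # negative errno, e.g. -11 == -EAGAIN
        "client": client,
        "rqstp": None if rqstp == "(null)" else rqstp,
        "timeout": int(timeout),
        "ops": ops,
        "program": prog,
        "procedure": proc,
        "action": action[2:],           # next state-machine step
        "queue": queue[2:],             # wait queue the task sleeps on
    }


if __name__ == "__main__":
    sample = ("37163 0001 -11 ffff8802251bca00 (null) 0 ffffffff816e4110 "
              "nfsv3 READ a:call_reserveresult q:xprt_sending")
    print(parse_task(sample))
```
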
tcpdump to this server shows absolutely no packets to the server IP for
several minutes. netstat shows the socket in CLOSE_WAIT:
# netstat -tan|grep 2049
tcp 0 0 10.10.52.50:806 10.10.52.230:2049 CLOSE_WAIT
This is the only port-2049 socket that still exists.
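
The same check can be scripted without netstat by reading /proc/net/tcp. The sketch below is only an illustration: close_wait_to_port() and decode_addr() are hypothetical helper names, the address decoding assumes a little-endian host such as x86, and only IPv4 is handled. State 08 in that file is CLOSE_WAIT.

```python
import socket
import struct

CLOSE_WAIT = 0x08  # TCP state code used in /proc/net/tcp


def decode_addr(field):
    """Turn 'E6340A0A:0801' into ('10.10.52.230', 2049)."""
    ip_hex, port_hex = field.split(":")
    # Assumption: little-endian host, so the hex word is byte-swapped.
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def close_wait_to_port(lines, port=2049):
    """Return (local, remote) pairs in CLOSE_WAIT whose remote port matches."""
    hits = []
    for line in lines:
        parts = line.split()
        if len(parts) < 4 or not parts[0].endswith(":"):
            continue  # header or malformed line
        if int(parts[3], 16) != CLOSE_WAIT:
            continue
        local = decode_addr(parts[1])
        remote = decode_addr(parts[2])
        if remote[1] == port:
            hits.append((local, remote))
    return hits


if __name__ == "__main__":
    try:
        with open("/proc/net/tcp") as f:
            for local, remote in close_wait_to_port(f):
                print("%s:%d -> %s:%d CLOSE_WAIT" % (local + remote))
    except OSError as exc:
        print("cannot read /proc/net/tcp: %s" % exc)
```
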
"rpcinfo -p 10.10.52.230", "rpcinfo -t 10.10.52.230 lockmgr", etc., all
indicate that the server is fine. rpciod is sleeping in rescuer_thread,
and nothing else is in D state.
Mount options were "rw,hard,intr,tcp,timeo=300,retrans=2,vers=3".
Running another "df" on the mountpoint with rpc_debug = 255 shows:
-pid- flgs status -client- --rqstp- -timeout ---ops--
37163 0001 -11 ffff8802251bca00 (null) 0 ffffffff816e4110 nfsv3 READ a:call_reserveresult q:xprt_sending
RPC: looking up Generic cred
NFS call access
RPC: new task initialized, procpid 30679
RPC: allocated task ffff880030c17a00
RPC: 37133 __rpc_execute flags=0x80
RPC: 37133 call_start nfs3 proc ACCESS (sync)
RPC: 37133 call_reserve (status 0)
RPC: 37133 failed to lock transport ffff880223d0a000
RPC: 37133 sleep_on(queue "xprt_sending" time 4489651610)
RPC: 37133 added to queue ffff880223d0a178 "xprt_sending"
RPC: 37133 sync task going to sleep
So something is not closing the old transport socket here?
Simon-