From: Andrew Martin <amartin@xes-inc.com>
To: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Jim Rees <rees@umich.edu>,
bhawley@luminex.com, Brown Neil <neilb@suse.de>,
linux-nfs-owner@vger.kernel.org, linux-nfs@vger.kernel.org
Subject: Re: Optimal NFS mount options to safely allow interrupts and timeouts on newer kernels
Date: Thu, 6 Mar 2014 14:45:58 -0600 (CST)
Message-ID: <521763040.159828.1394138758307.JavaMail.zimbra@xes-inc.com>
In-Reply-To: <76B038DA-3E86-4C46-BFB6-928BFB8202D8@primarydata.com>
----- Original Message -----
> From: "Trond Myklebust" <trond.myklebust@primarydata.com>
> > I attempted to get a backtrace from one of the uninterruptable apache
> > processes:
> > echo w > /proc/sysrq-trigger
> >
> > Here's one example:
> > [1227348.003904] apache2 D 0000000000000000 0 10175 1773
> > 0x00000004
> > [1227348.003906] ffff8802813178c8 0000000000000082 0000000000015e00
> > 0000000000015e00
> > [1227348.003908] ffff8801d88f03d0 ffff880281317fd8 0000000000015e00
> > ffff8801d88f0000
> > [1227348.003910] 0000000000015e00 ffff880281317fd8 0000000000015e00
> > ffff8801d88f03d0
> > [1227348.003912] Call Trace:
> > [1227348.003918] [<ffffffffa00a5ca0>] ? rpc_wait_bit_killable+0x0/0x40
> > [sunrpc]
> > [1227348.003923] [<ffffffffa00a5cc4>] rpc_wait_bit_killable+0x24/0x40
> > [sunrpc]
> > [1227348.003925] [<ffffffff8156a41f>] __wait_on_bit+0x5f/0x90
> > [1227348.003930] [<ffffffffa00a5ca0>] ? rpc_wait_bit_killable+0x0/0x40
> > [sunrpc]
> > [1227348.003932] [<ffffffff8156a4c8>] out_of_line_wait_on_bit+0x78/0x90
> > [1227348.003934] [<ffffffff81086790>] ? wake_bit_function+0x0/0x40
> > [1227348.003939] [<ffffffffa00a6611>] __rpc_execute+0x191/0x2a0 [sunrpc]
> > [1227348.003945] [<ffffffffa00a6746>] rpc_execute+0x26/0x30 [sunrpc]
>
> That basically means that the process is hanging in the RPC layer, somewhere
> in the state machine. ‘echo 0 >/proc/sys/sunrpc/rpc_debug’ as the ‘root’
> user should give us a dump of which state these RPC calls are in. Can you
> please try that?
Yes, I will definitely run that the next time it happens, but since it occurs
sporadically (and I have not yet found a way to reproduce it on demand), it
could be days before it occurs again. I'll also run "netstat -tn" to check the
TCP connections the next time this happens.
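
For reference, the three diagnostics named in this message and the quoted text could be gathered in one pass with something like the following. This is only a sketch and was not posted in the thread: the function name collect_nfs_diag and the DRY_RUN variable are hypothetical, and the real commands assume a root shell. The sysrq and rpc_debug output goes to the kernel log, so it would be read afterwards with dmesg.

```shell
#!/bin/sh
# Sketch only: run the diagnostics discussed above while a process is
# stuck in D state. collect_nfs_diag and DRY_RUN are hypothetical names.

collect_nfs_diag() {
    # With DRY_RUN=1, print each command instead of executing it.
    run() {
        if [ "${DRY_RUN:-0}" = "1" ]; then
            printf '%s\n' "$*"
        else
            sh -c "$*"
        fi
    }

    # Backtraces of blocked (D state) tasks, written to the kernel log.
    run 'echo w > /proc/sysrq-trigger'
    # Dump of the current RPC task states, written to the kernel log.
    run 'echo 0 > /proc/sys/sunrpc/rpc_debug'
    # TCP connection table, numeric addresses.
    run 'netstat -tn'
}
```

Calling it as `DRY_RUN=1 collect_nfs_diag` only lists the commands, which is a safe way to check the script before the hang recurs.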