From: Nix <nix@esperi.org.uk>
To: NFS list <linux-nfs@vger.kernel.org>
Cc: Linux-Netdev <netdev@vger.kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Unix-domain sockets hanging on 3.14.x (was Re: possible 3.14.1 lockd problem (?) causing nfsd hangs)
Date: Mon, 28 Apr 2014 19:55:51 +0100
Message-ID: <87ha5dp9l4.fsf@spindle.srvr.nix>
In-Reply-To: <20140428154648.GA31788@order.stressinduktion.org> (Hannes Frederic Sowa's message of "Mon, 28 Apr 2014 17:46:48 +0200")
On 28 Apr 2014, Hannes Frederic Sowa uttered the following:
> On Mon, Apr 28, 2014 at 04:35:38PM +0100, Nix wrote:
>> /proc/$pid/stack of the two communicating ssh daemons was instructive:
>>
>> [<ffffffff814e3512>] unix_wait_for_peer+0x9f/0xbc
>> [<ffffffff814e5d48>] unix_dgram_sendmsg+0x41b/0x534
>
> This one is a dgram socket...
>
>> [<ffffffff8146618f>] sock_sendmsg+0x84/0x9e
>> [<ffffffff81467f3d>] SyS_sendto+0x10e/0x13f
>> [<ffffffff815770e2>] system_call_fastpath+0x16/0x1b
>> [<ffffffffffffffff>] 0xffffffffffffffff
>> spindle:/var/log.real/by-facility# cat /proc/5941/stack
>> [<ffffffff814e493a>] unix_stream_recvmsg+0x289/0x6d5
>
> ...and that's a stream receiver.
>
>> [<ffffffff814673e0>] sock_aio_read.part.12+0xf0/0xff
>> [<ffffffff8146740b>] sock_aio_read+0x1c/0x28
>> [<ffffffff811388fd>] do_sync_read+0x59/0x78
>> [<ffffffff81138d8b>] vfs_read+0xa2/0x13f
>> [<ffffffff81139699>] SyS_read+0x47/0x8b
>> [<ffffffff81577259>] tracesys+0xd0/0xd5
>> [<ffffffffffffffff>] 0xffffffffffffffff
>
> Are you sure those are the communicating tasks?

Normally I'd say yes. This time I'd say probably not. I am very much not
at my best right now.

I'll wait for it to implode and try the same thing again. On past form I
won't have to wait very many days...

(this time, there was a six-hour interval between boot and
misbehaviour. Whatever the misbehaviour *is*.)

One more instance I didn't share because I only have one end of it:
starting up ISC dhcpd 4.2.4 also hung unexpectedly (which it did not
after rebooting):

[<ffffffff814e3512>] unix_wait_for_peer+0x9f/0xbc
[<ffffffff814e5d48>] unix_dgram_sendmsg+0x41b/0x534
[<ffffffff8146618f>] sock_sendmsg+0x84/0x9e
[<ffffffff81467f3d>] SyS_sendto+0x10e/0x13f
[<ffffffff815770e2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
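
As background for the trace above: on Linux, a blocking sendto() on an
AF_UNIX datagram socket sleeps in unix_wait_for_peer when the receiving
socket's queue is full, that is, when the receiver exists but is not
reading. The sketch below only illustrates that condition and is not a
reproduction of this hang. The function name, the payload size and the
socket path are made up for the example, and it assumes Linux behaviour,
using a non-blocking sender so that the full queue shows up as an error
rather than as a sleep.

```python
import os
import socket


def fill_dgram_queue(path, limit=100000):
    """Send to a bound but never-read datagram socket until the kernel refuses.

    Returns (datagrams_sent, blocked). 'blocked' is True if a send failed
    with EAGAIN, the point at which a blocking sender would go to sleep.
    """
    receiver = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    sent = 0
    blocked = False
    try:
        receiver.bind(path)        # the receiver exists but never calls recv()
        sender.setblocking(False)  # ask for EAGAIN instead of sleeping
        while sent < limit:
            try:
                sender.sendto(b"x" * 1024, path)
                sent += 1
            except BlockingIOError:
                blocked = True     # queue is full; a blocking send would wait here
                break
    finally:
        sender.close()
        receiver.close()
        if os.path.exists(path):
            os.unlink(path)
    return sent, blocked
```

Calling fill_dgram_queue() with a path in a scratch directory should
return after a handful of datagrams with blocked set to True; how many
get through depends on the queue-length limit configured on the machine.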

The common factor in these hangs is definitely unix_wait_for_peer. The
question is, what are they talking to? (In theory this could be a
userspace bug... but what common userspace factor is there between the
NFS server, OpenSSH, and ISC dhcpd? If it helps, named pipes may well be
suffering from the same malaise: /sbin/shutdown from sysvinit also hangs
and I have to /sbin/reboot -f instead. I'll check next time it hits, but
I'll bet it's trying to talk over /dev/initctl and getting nowhere.)
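
One way to start narrowing down "what are they talking to" is to list
the datagram sockets that have a bound path, since a sendto() by path is
most likely waiting on one of those. A rough sketch, assuming the usual
Linux /proc/net/unix column layout (Num, RefCount, Protocol, Flags,
Type, St, Inode, Path) and that type 0002 means SOCK_DGRAM; the function
and constant names are invented here:

```python
SOCK_DGRAM_TYPE = 2  # assumption: numeric value of SOCK_DGRAM on this machine


def bound_dgram_sockets(proc_net_unix_text):
    """Return (inode, path) pairs for datagram sockets that have a path."""
    found = []
    for line in proc_net_unix_text.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # no Path column: the socket is not bound to a name
        if int(fields[4], 16) == SOCK_DGRAM_TYPE:
            found.append((int(fields[6]), fields[7]))
    return found
```

Feeding it the contents of /proc/net/unix gives the candidate receivers;
each inode can then be matched against the socket:[inode] links under
/proc/<pid>/fd to see which process owns the socket and whether that
process is itself stuck.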