From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754691AbcBEP2j (ORCPT ); Fri, 5 Feb 2016 10:28:39 -0500 Received: from parrot.pmhahn.de ([88.198.50.102]:57047 "EHLO parrot.pmhahn.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754574AbcBEP2d (ORCPT ); Fri, 5 Feb 2016 10:28:33 -0500 Subject: Re: Bug 4.1.16: self-detected stall in net/unix/? To: Hannes Frederic Sowa , Sasha Levin , Rainer Weikusat , Andrey Vagin , Aaron Conole , "David S. Miller" , linux-kernel@vger.kernel.org, Eric Dumazet References: <56B0D88A.1020609@pmhahn.de> <56B15B32.1020701@stressinduktion.org> Cc: Greg Kroah-Hartman From: Philipp Hahn X-Enigmail-Draft-Status: N1110 Message-ID: <56B4BF9D.9070609@pmhahn.de> Date: Fri, 5 Feb 2016 16:28:29 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Icedove/38.5.0 MIME-Version: 1.0 In-Reply-To: <56B15B32.1020701@stressinduktion.org> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hello Hannes, thank you for your previous reply. Am 03.02.2016 um 02:43 schrieb Hannes Frederic Sowa: > On 02.02.2016 17:25, Philipp Hahn wrote: >> we recently updated our kernel to 4.1.16 + patch for "unix: properly >> account for FDs passed over unix sockets" and have since then >> self-detected stalls triggered by the Samba daemon: ... >> We have not yet been able to reproduce the hang, but going back to our >> previous kernel 4.1.12 makes the problem go away. > > Can you remove the patch "unix: properly account for FDs passed over > unix sockets" and see if the problem still happens? I will try. The problem is that I can't trigger the bug reliably. It always happens to "smbd", but I don't know the triggering condition. > I couldn't quickly see any problems with your added patch. I currently > suspect a tight loop because of a SOCK_DEAD flag set but the socket not > removed from unix_socket_table or the vfs. Hmmm... All occurrences of the bug show _raw_spin_lock() as the sleeping function, never anything other thus far. I have seen the bug both on amd64 and i386. If it would spin in > 1097 restart: > 1098 »···»···other = unix_find_other(net, sunaddr, alen, sock->type, hash, &err); > 1099 »···»···if (!other) > 1100 »···»···»···goto out; > 1101 > 1102 »···»···unix_state_double_lock(sk, other); > 1103 > 1104 »···»···/* Apparently VFS overslept socket death. Retry. */ > 1105 »···»···if (sock_flag(other, SOCK_DEAD)) { > 1106 »···»···»···unix_state_double_unlock(sk, other); > 1107 »···»···»···sock_put(other); > 1108 »···»···»···goto restart; > 1109 »···»···} I would expect some other function to show up in the trace, but it's always the call chain "unix_dgram_connect -> unix_state_double_lock -> _raw_spin_lock". The only case I see is that unix_find_other() returns the same "other" again and again. Q: How will that dead "other" be garbage collected finally? The kernel is > # CONFIG_PREEMPT_NONE is not set > CONFIG_PREEMPT_VOLUNTARY=y > # CONFIG_PREEMPT is not set During my hunt I found the following difference between 4.4 and 4.1.16 in net/unix/af_unix.c: > commit 1586a5877db9eee313379738d6581bc7c6ffb5e3 > Author: Eric Dumazet > Date: Fri Oct 23 10:59:16 2015 -0700 > > af_unix: do not report POLLOUT on listeners > > poll(POLLOUT) on a listener should not report fd is ready for > a write(). > > This would break some applications using poll() and pfd.events = -1, > as they would not block in poll() > > Signed-off-by: Eric Dumazet > Reported-by: Alan Burlison > Tested-by: Eric Dumazet > Signed-off-by: David S. Miller ... > -static inline int unix_writable(struct sock *sk) > +static int unix_writable(const struct sock *sk) > { > - return (atomic_read(&sk->sk_wmem_alloc) << 2) <= sk->sk_sndbuf; > + return sk->sk_state != TCP_LISTEN && > + (atomic_read(&sk->sk_wmem_alloc) << 2) <= sk->sk_sndbuf; > } Q: That *TCP*_LISTEN in net/*unix*/ looks strange to my eyes, but I'm not a kernel guru (yet). There is one hunk in 5c77e26862ce604edea05b3442ed765e9756fe0f, which uses the result of that function, which might explain the difference, why it shows with 4.1.16 but not with 4.4? > @@ -2245,14 +2388,16 @@ static unsigned int unix_dgram_poll(struct file *file, struct socket *sock, > writable = unix_writable(sk); > - other = unix_peer_get(sk); > - if (other) { > - if (unix_peer(other) != sk) { > - sock_poll_wait(file, &unix_sk(other)->peer_wait, wait); > - if (unix_recvq_full(other)) > - writable = 0; > - } > - sock_put(other); > + if (writable) { > + unix_state_lock(sk); > + > + other = unix_peer(sk); > + if (other && unix_peer(other) != sk && > + unix_recvq_full(other) && > + unix_dgram_peer_wake_me(sk, other)) > + writable = 0; > + > + unix_state_unlock(sk); > } > if (writable) I will for now build a new kernel with > $ git log --oneline v4.1.12..v4.1.17 -- net/unix > dc6b0ec unix: properly account for FDs passed over unix sockets > cc01a0a af_unix: Revert 'lock_interruptible' in stream receive code > 5c77e26 unix: avoid use-after-free in ep_remove_wait_queue reverted to see if it still happens. The "middle" patch seems harmless, as it only changes a code path for STREAMS, while the bug triggers with DGRAMS only. > The stack trace is rather unreliable, maybe something completely > different happend. Do you happend to see better reports? So far they look all the same. Anything more I can do to prepare for collection better information next time I get that bug? Thanks again. Philipp