From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S933534AbbJAM7o (ORCPT ); Thu, 1 Oct 2015 08:59:44 -0400
Received: from tiger.mobileactivedefense.com ([217.174.251.109]:43633
        "EHLO tiger.mobileactivedefense.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1755000AbbJAM7m (ORCPT );
        Thu, 1 Oct 2015 08:59:42 -0400
From: Rainer Weikusat
To: Jason Baron
Cc: Mathias Krause, netdev@vger.kernel.org,
        "linux-kernel\@vger.kernel.org", Eric Wong, Eric Dumazet,
        Rainer Weikusat, Alexander Viro, Davide Libenzi, Davidlohr Bueso,
        Olivier Mauras, PaX Team, Linus Torvalds, "peterz\@infradead.org",
        "davem\@davemloft.net"
Subject: Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket
In-Reply-To: <87bnciiybf.fsf@doppelsaurus.mobileactivedefense.com> (Rainer
        Weikusat's message of "Thu, 01 Oct 2015 13:10:44 +0100")
References: <20150913195354.GA12352@jig.fritz.box>
        <20150914023949.GA15012@dcvr.yhbt.net> <560AE202.4020402@akamai.com>
        <560C9CFE.6090509@akamai.com>
        <87oagiho88.fsf@doppelsaurus.mobileactivedefense.com>
        <87bnciiybf.fsf@doppelsaurus.mobileactivedefense.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.4 (gnu/linux)
Date: Thu, 01 Oct 2015 13:58:42 +0100
Message-ID: <87y4fm4uf1.fsf@doppelsaurus.mobileactivedefense.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.4.3
        (tiger.mobileactivedefense.com [217.174.251.109]);
        Thu, 01 Oct 2015 13:58:52 +0100 (BST)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Rainer Weikusat writes:
> Rainer Weikusat writes:
>> Jason Baron writes:
>>> On 09/30/2015 01:54 AM, Mathias Krause wrote:
>>>> On 29 September 2015 at 21:09, Jason Baron wrote:
>>>>> However, if we call connect on socket 's', to connect to a new socket 'o2', we
>>>>> drop the reference on the original socket 'o'.
>>>>> Thus, we can now close socket 'o' without unregistering from epoll. Then,
>>>>> when we either close the ep or unregister 'o', we end up with this list
>>>>> corruption. Thus, this is not a race per se, but can be triggered
>>>>> sequentially.

[...]

> Test program (assumes that it can execute itself as ./a.out):
>
> -------------
> #include
> #include
> #include
> #include
> #include
> #include
> #include
> #include
>
> static int sk;
>
> static void *epoller(void *unused)
> {
>     struct epoll_event epev;
>     int epfd;
>
>     epfd = epoll_create(1);
>
>     epev.events = EPOLLOUT;
>     epoll_ctl(epfd, EPOLL_CTL_ADD, sk, &epev);
>     epoll_wait(epfd, &epev, 1, 5000);
>
>     execl("./a.out", "./a.out", (void *)0);
>
>     return NULL;
> }

[...]

Possibly interesting additional bit of information: The list corruption
warnings appear only if the 2nd connect is there and both the sk and
epfd file descriptors are left open across the exec. Closing either of
the two still triggers the _destructor warnings but nothing else (until
the process runs out of file descriptors).