From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752112AbbJTK7Q (ORCPT ); Tue, 20 Oct 2015 06:59:16 -0400
Received: from mx1.redhat.com ([209.132.183.28]:37181 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750832AbbJTK7O (ORCPT ); Tue, 20 Oct 2015 06:59:14 -0400
Date: Tue, 20 Oct 2015 12:55:39 +0200
From: Oleg Nesterov
To: Dmitry Vyukov
Cc: LKML, Roland McGrath, syzkaller@googlegroups.com,
	Kostya Serebryany, Alexander Potapenko, Robert Swiecki,
	Kees Cook, Julien Tinnes, Eric Dumazet
Subject: Re: Unkillable processes due to PTRACE_TRACEME
Message-ID: <20151020105539.GA27706@redhat.com>
References: <20151019194911.GA20063@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.18 (2008-05-17)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 10/20, Dmitry Vyukov wrote:
>
> On Tue, Oct 20, 2015 at 10:34 AM, Dmitry Vyukov wrote:
> > On Mon, Oct 19, 2015 at 10:17 PM, Dmitry Vyukov wrote:
> >> On Mon, Oct 19, 2015 at 9:49 PM, Oleg Nesterov wrote:
> >>>
> >>> So I bet the problem is that your /sbin/init doesn't use __WALL,
> >>> so wait() doesn't reap the traced zombie sub-thread, and thus it
> >>> can't release the non-empty thread group.
> >>>
> >>> Could you please verify? Just do "strace -p1" and send SIGCHLD to
> >>> init.
> >>>
> >>> perhaps eligible_child() should assume WALL if ptrace && ZOMBIE...
> >>
> >>
> >> I am using Ubuntu.
> >> Here strace output from init:
> >>
> >> waitid(P_ALL, 0, {}, WNOHANG|WEXITED|WSTOPPED|WCONTINUED, NULL) = 0
> >>
> >> So what should be fixed here? Kernel of distro init?
> >
> > waitpid(__WALL) indeed joins these processes. Thanks.

And I just checked Fedora 22, it doesn't use __WALL either. So I think
we should change the kernel even if this is not a bug... I'll send
the patch.

> > But __WALL can't be used with waitid and Ubuntu init uses waitid...

Yes, and I never understood why. Perhaps we should change this too.

> #include <errno.h>
> #include <pthread.h>
> #include <signal.h>
> #include <stdio.h>
> #include <unistd.h>
> #include <sys/ptrace.h>
> #include <sys/types.h>
> #include <sys/wait.h>
>
> void *thr(void *arg) {
>         ptrace(PTRACE_TRACEME, 0, 0, 0);
>         return 0;
> }
>
> int main() {
>         int pid = fork();
>         if (pid == 0) {
>                 pthread_t th;
>                 pthread_create(&th, 0, thr, 0);
>                 sleep(1);
>                 return 0;
>         }
>         siginfo_t info = {};
>         int status = 0;
>         int res = waitpid(-1, &status, __WALL);
>         printf("pid=%d res=%d errno=%d\n", pid, res, errno);
>         res = waitpid(-1, &status, __WALL);
>         printf("pid=%d res=%d errno=%d\n", pid, res, errno);
>         return 0;
> }
>
>
> However, I need to wait for a particular child and if I change the
> first waitpid to:
>
> int res = waitpid(pid, &status, __WALL);
>
> then it does not terminate.
> So how can I wait for such child process?

You can't. This is one of the historical oddities. You need to reap the
traced sub-thread first. And PTRACE_DETACH doesn't work.

Oleg.
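
One way to read "reap the traced sub-thread first" is the loop sketched
below: keep calling waitpid(-1, ..., __WALL) and discard whatever comes
back until the wanted pid shows up. This is only a sketch and not code
from the thread. wait_for_child() and example_wait() are hypothetical
names, Linux with glibc is assumed (for __WALL), and the caller is
assumed not to care about the exit status of any other child that gets
reaped along the way.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not from the thread: block until `target` has
 * been reaped, collecting any other child (including an exited traced
 * sub-thread) that waitpid(-1, ..., __WALL) reports first.
 * Returns 0 and fills *status on success, -1 with errno set on failure. */
int wait_for_child(pid_t target, int *status)
{
        assert(target > 0);
        assert(status != NULL);

        for (;;) {
                int st;
                pid_t pid = waitpid(-1, &st, __WALL);

                if (pid == target) {
                        *status = st;
                        return 0;
                }
                if (pid < 0 && errno != EINTR)
                        return -1;      /* e.g. ECHILD: nothing left to wait for */
                /* Otherwise something else was reaped or we were interrupted. */
        }
}

/* Usage sketch with an ordinary child that exits with code 7.
 * Returns the child's exit code, or -1 on error. */
int example_wait(void)
{
        int status;
        pid_t pid = fork();

        if (pid < 0)
                return -1;
        if (pid == 0)
                _exit(7);
        if (wait_for_child(pid, &status) != 0 || !WIFEXITED(status))
                return -1;
        return WEXITSTATUS(status);
}
```

The usage sketch forks a plain child, so the loop returns on its first
pass. With the test program quoted above, the first waitpid() would be
expected to return the traced sub-thread and a later one the child
itself. The cost of this approach is that it also consumes the status
of unrelated children, so it only fits callers that own all of their
children.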