From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753804AbbJZMJG (ORCPT ); Mon, 26 Oct 2015 08:09:06 -0400
Received: from mx1.redhat.com ([209.132.183.28]:34652 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753315AbbJZMJD (ORCPT ); Mon, 26 Oct 2015 08:09:03 -0400
Message-ID: <562E17D8.4000108@redhat.com>
Date: Mon, 26 Oct 2015 12:08:56 +0000
From: Pedro Alves
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: Oleg Nesterov, Denys Vlasenko
CC: Denys Vlasenko, Andrew Morton, Dmitry Vyukov, Alexander Potapenko,
	Eric Dumazet, Jan Kratochvil, Julien Tinnes, Kees Cook,
	Kostya Serebryany, Linus Torvalds, "Michael Kerrisk (man-pages)",
	Robert Swiecki, Roland McGrath, syzkaller@googlegroups.com,
	Linux Kernel Mailing List, "gdb@sourceware.org"
Subject: Re: [PATCH 1/2] wait/ptrace: always assume __WALL if the child is traced
References: <20151020171740.GA29290@redhat.com> <20151020171754.GA29304@redhat.com>
	<20151020153155.e03f4219da4014efe6f810b0@linux-foundation.org>
	<5627EE9E.8040600@redhat.com> <5627F607.4050506@redhat.com>
	<20151021214703.GA1810@redhat.com> <20151025155440.GB2043@redhat.com>
In-Reply-To: <20151025155440.GB2043@redhat.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 10/25/2015 03:54 PM, Oleg Nesterov wrote:
> On 10/22, Denys Vlasenko wrote:
>>
>> On Wed, Oct 21, 2015 at 11:47 PM, Oleg Nesterov wrote:
>>> On 10/21, Denys Vlasenko wrote:
>>>>
>>>> On 10/21/2015 09:59 PM, Denys Vlasenko wrote:
>>>>> On 10/21/2015 12:31 AM, Andrew Morton wrote:
>>>>>> Well, to fix this a distro needs to roll out a new kernel. Or a new
>>>>>> init(8). Is there any reason to believe that distributing/deploying a
>>>>>> new kernel is significantly easier for everyone? Because fixing init
>>>>>> sounds like a much preferable solution to this problem.
>>>>>
>>>>> People will continue to write new init(8) implementations,
>>>>> and they will miss this obscure case.
>>>>>
>>>>> Before this bug was found, it was considered possible to use
>>>>> a shell script as init process. What now, every shell needs to add
>>>>> __WALL to its waitpids?
>>>
>>> Why not? I think it can safely use __WALL too.
>>
>> Because having any userspace program which can happen to be init,
>> which includes all shells out there in the wild
>> (bash, dash, ash, ksh, zsh, msh, hush,...)
>> learn about __WALL is wrong: apart from this wart, they do not have
>> to use any Linux-specific code. It can all be perfectly legitimate POSIX.
>
> Yes, this is true. I meant that they could safely use __WALL to, but I
> understand that this change can be painful.
>
>>> Sure. But people do the things which were never intended to be
>>> used all the time. We simply can not know if this "feature"
>>> already has a creative user or not.
>>
>> It won't be unfixable: they will just have to switch from PTRACE_TRACEME
>> to PTRACE_ATTACH.
>>
>> As of now we do not know any people craz^W creative enough
>> to create a cross between init and strace. If such specimens would
>> materialize, don't they deserve to have to make that change?
>
> This also applies to people who use bash/whatever as /sbin/init and allow
> the untrusted users to run the exploits ;) I do not know who is more crazy.
>
> In any case, the real question is whether we should change the kernel to
> fix the problem, or ask the distros to fix their init's. In the former
> case 1/2 looks simpler/safer to me than the change in ptrace_traceme(),
> and you seem to agree that 1/2 is not that bad.

A risk here seems to be that waitpid will start returning unexpected
(thread) PIDs to parent processes, and it's not unreasonable to assume
that e.g., a program asserts that waitpid either returns error or a
known (process) PID. That's not an init-only issue, but something that
might affect any process that runs a child that happens to decide to
call PTRACE_TRACEME.

The ptrace man page says:

  "A process can initiate a trace by calling fork(2) and having the
  resulting child do a PTRACE_TRACEME, followed (typically) by an
  execve(2)."

Given that, can we instead make the kernel error out on PTRACE_TRACEME
issued from a non-leader thread? Then between PTRACE_TRACEME and the
parent's waitpid, __WALL or !__WALL should make no difference.

(Also, in the original test case, if the child gets/raises a signal or
execs before exiting, the bash/init/whatever process won't be issuing
PTRACE_CONT, and the child will thus end up stuck (though should be
SIGKILLable, I believe). All this because PTRACE_TRACEME is broken by
design by making it be the child's choice whether to be traced...)

Thanks,
Pedro Alves
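
For illustration only, here is a minimal sketch of the fork(2) + PTRACE_TRACEME pattern from the man page quote above, with the kind of waitpid assertion described in the first paragraph. It is not code from this thread or from either patch. The function name trace_forked_child is hypothetical, and raise(SIGSTOP) is an assumed stand-in for the usual execve(2) so that the sketch stays self-contained.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Kill and reap a child we no longer want to trace. */
static void discard_child(pid_t child)
{
    int status;

    kill(child, SIGKILL);
    waitpid(child, &status, __WALL);
}

/*
 * Hypothetical helper: fork a child whose main (leader) thread calls
 * PTRACE_TRACEME, observe its first stop, resume it, and reap it.
 * Returns 0 if the parent saw exactly the stop and exit it expected,
 * -1 otherwise.
 */
int trace_forked_child(void)
{
    int status;
    pid_t got;
    pid_t child = fork();

    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: opt in to being traced by the parent. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(2);
        raise(SIGSTOP);   /* assumed stand-in for execve(2) */
        _exit(0);
    }

    /* The tracee is a thread-group leader, so __WALL vs. 0 is moot here. */
    got = waitpid(child, &status, __WALL);
    /* The kind of assumption a parent might make about waitpid. */
    assert(got == -1 || got == child);
    if (got != child)
        return -1;
    if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGSTOP) {
        if (!WIFEXITED(status) && !WIFSIGNALED(status))
            discard_child(child);
        return -1;
    }

    /* Resume the tracee; data == 0 suppresses the pending SIGSTOP. */
    if (ptrace(PTRACE_CONT, child, NULL, NULL) == -1) {
        discard_child(child);
        return -1;
    }

    got = waitpid(child, &status, __WALL);
    assert(got == -1 || got == child);
    if (got != child || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return 0;
}
```

A parent that skips the PTRACE_CONT step leaves the child sitting in its signal-delivery-stop, which is the "stuck" case mentioned in the parenthetical above.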