From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752193AbdCCRdb (ORCPT ); Fri, 3 Mar 2017 12:33:31 -0500 Received: from mx1.redhat.com ([209.132.183.28]:57700 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752019AbdCCRd3 (ORCPT ); Fri, 3 Mar 2017 12:33:29 -0500 Date: Fri, 3 Mar 2017 18:33:26 +0100 From: Oleg Nesterov To: "Eric W. Biederman" Cc: Andrew Morton , Aleksa Sarai , Andy Lutomirski , Attila Fazekas , Jann Horn , Kees Cook , Michal Hocko , Ulrich Obergfell , linux-kernel@vger.kernel.org Subject: Re: [PATCH 0/2] fix the traced mt-exec deadlock Message-ID: <20170303173326.GA17899@redhat.com> References: <20170213141452.GA30203@redhat.com> <20170224160354.GA845@redhat.com> <87shmv6ufl.fsf@xmission.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <87shmv6ufl.fsf@xmission.com> User-Agent: Mutt/1.5.24 (2015-08-30) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Fri, 03 Mar 2017 17:33:29 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 03/02, Eric W. Biederman wrote: > > Oleg Nesterov writes: > > > our discussion was a bit confusing, and it seems that we did not > > fully convince each other. So let me ask what do you finally think > > about this fix. > > > > Let me repeat. Even if I do not agree with some of your objections, > > I do agree that 1/2 does not look nice and clean. And we seem to > > agree that either way, with or without this fix, we need more changes > > in this area. > > > > But we need a simple and backportable fix for stable trees, say for > > rhel7. This bug was reported many times, and this is the simplest > > solution I was able to find. > > I am not 100% convinced that we need a backportable solution, I think we need, this was already requested, > but > regardless my experience is that good clean solutions are almost always > easier to backport that something we just settle for. Sure. > The patch below needs a little more looking and testing but arguably > it is sufficient. > > It implements autoreaping for non-leader threads always. We might want > to limit this to the case of exec. I should have mentioned this. Of course, this change was proposed from the very beginning, when this problem was reported first time. And of course I like this fix much more than my patch (but yes, we shouldd limit it to the case of exec). The only problem is that it is incompatible change, and I have no idea what can be broken. Trivial test-case: void *thread(void *arg) { for (;;) pause(); return NULL; } int main(void) { pthread_t pt; pthread_create(&pt, NULL, thread, NULL); execlp("true", "true", NULL); } with your patch applied $ strace -f ./test strace: wait4(__WALL): No child processes Yes, this is just a warning, but still. Now we need to change strace. And we can't know what else can be confused/broken by this change. man(ptrace) even documents that all other threads except the thread group leader report death as if they exited via _exit(2). Yes, yes, it also says that other threads stop in PTRACE_EVENT_EXIT stop, so 2/2 (which we need with your change too) is is not compatible too and I am worried, but: - this was never really true, an already exiting thread won't stop if it races with exec - PTRACE_O_TRACEEXEC is not used that often, it never really worked - man(ptrace) also mentions that PTRACE_EVENT_EXIT behaviour may be change in the future. In short. Of course I considered this change. Looks too risky to me. But. I will be happy if you send this change and take all the blame ;) I won't argue (if you limit it to execve case). > --- a/kernel/exit.c > +++ b/kernel/exit.c > @@ -690,7 +690,9 @@ static void exit_notify(struct task_struct *tsk, int group_dead) > thread_group_empty(tsk) && > !ptrace_reparented(tsk) ? > tsk->exit_signal : SIGCHLD; > - autoreap = do_notify_parent(tsk, sig); > + do_notify_parent(tsk, sig); > + /* Autoreap threads even when ptraced */ > + autoreap = !thread_group_leader(tsk); > } else if (thread_group_leader(tsk)) { > autoreap = thread_group_empty(tsk) && > do_notify_parent(tsk, tsk->exit_signal); This is all you need, > @@ -699,8 +701,6 @@ static void exit_notify(struct task_struct *tsk, int group_dead) > } > > tsk->exit_state = autoreap ? EXIT_DEAD : EXIT_ZOMBIE; > - if (tsk->exit_state == EXIT_DEAD) > - list_add(&tsk->ptrace_entry, &dead); > > /* mt-exec, de_thread() is waiting for group leader */ > if (unlikely(tsk->signal->notify_count < 0)) > @@ -711,6 +711,8 @@ static void exit_notify(struct task_struct *tsk, int group_dead) > list_del_init(&p->ptrace_entry); > release_task(p); > } > + if (autoreap) > + release_task(tsk); These 2 changes are not needed. release_task(tsk) will be called by list_for_each_entry_safe() above if autoreap == T. Oleg.