Date: Fri, 20 Jan 2017 19:14:00 +0100
From: Oleg Nesterov
To: Pavel Tikhomirov
Cc: Ingo Molnar, Peter Zijlstra, Andrew Morton, Cyrill Gorcunov, John Stultz, Thomas Gleixner, Nicolas Pitre, Michal Hocko, Stanislav Kinsburskiy, Mateusz Guzik, linux-kernel@vger.kernel.org, Pavel Emelyanov, Konstantin Khorenko
Subject: Re: [PATCH] prctl: propagate has_child_subreaper flag to every descendant
Message-ID: <20170120181359.GA17205@redhat.com>
In-Reply-To: <20170119164346.4214-1-ptikhomirov@virtuozzo.com>

On 01/19, Pavel Tikhomirov wrote:
>
> Having these two
> differently behaving groups can lead to confusion. Also it is
> a problem for CRIU, as when we restore process tree we need to
> somehow determine which descendants belong to which group and
> much harder - to put them exactly to these group.

Hmm. Could you explain how this change helps CRIU? I mean, why can't the
restorer do prctl(PR_SET_CHILD_SUBREAPER) before the first fork?

Anyway, afaics the patch is sub-optimal and not correct...

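For reference, the "prctl before the first fork" alternative asked about above would look roughly like this from userspace. This is only a minimal sketch of the prctl(2) calls; the helper names become_subreaper() and is_subreaper() are made up for illustration and have nothing to do with how the CRIU restorer is actually structured.

```c
#include <sys/prctl.h>

/*
 * Mark the calling process as a child subreaper.
 * Returns 0 on success, -1 on error (errno set by prctl).
 * Hypothetical helper name, not part of any existing tool.
 */
static int become_subreaper(void)
{
	return prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0);
}

/*
 * Read the flag back.
 * Returns 1 if the caller is a subreaper, 0 if not, -1 on error.
 */
static int is_subreaper(void)
{
	int flag = 0;

	/* PR_GET_CHILD_SUBREAPER stores the status through the int pointer in arg2 */
	if (prctl(PR_GET_CHILD_SUBREAPER, &flag, 0, 0, 0) < 0)
		return -1;
	return flag;
}
```

A restorer following this idea would call become_subreaper() first and only then start forking the tree, so that every task it creates is forked after the flag has been set.
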
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1715,6 +1715,8 @@ struct task_struct {
>  	struct signal_struct *signal;
>  	struct sighand_struct *sighand;
>
> +	struct list_head csr_descendant;
> +

You don't need this new member and descendants_lock. task_struct has the
->real_parent pointer so you can walk the tree without recursion.

> +static void prctl_set_child_subreaper(struct task_struct *reaper, bool arg2)
> +{
> +	LIST_HEAD(descendants);
> +
> +	reaper->signal->is_child_subreaper = arg2;
> +	if (!arg2)
> +		return;
> +
> +	spin_lock(&descendants_lock);
> +	read_lock(&tasklist_lock);
> +
> +	list_add(&reaper->csr_descendant, &descendants);
> +
> +	while (!list_empty(&descendants)) {
> +		struct task_struct *tsk;
> +		struct task_struct *p;
> +
> +		tsk = list_first_entry(&descendants, struct task_struct,
> +				csr_descendant);
> +
> +		list_for_each_entry(p, &tsk->children, sibling) {

This is not enough. Every thread has its own ->children list, you need to
walk the sub-threads as well.

> +			 * If we've found child_reaper - skip descendants in
> +			 * it's subtree as they will never get out pidns
> +			 */
> +			if (is_child_reaper(task_pid(p)))
> +				continue;

Again, a child reaper can be multi-threaded, so this check can give a false
negative.

Probably is_child_reaper() should be renamed somehow and a new helper makes
sense... something like

	bool task_is_child_reaper(struct task_struct *p)
	{
		return same_thread_group(p, task_active_pid_ns(p)->child_reaper);
	}

Oleg.
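
To illustrate the "walk the tree without recursion" remark above, here is a small userspace toy model, not kernel code. struct toy_task, toy_add_child() and mark_descendants() are hypothetical names; the model assumes one children list per task (so it ignores the sub-thread issue raised above) and uses a plain is_reaper bool in place of a real pid-namespace check. The point is only that a parent pointer is enough to climb back up, so no extra list member or lock-protected work list is needed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for task_struct; all fields are illustrative only. */
struct toy_task {
	struct toy_task *real_parent;
	struct toy_task *first_child;
	struct toy_task *next_sibling;
	bool is_reaper;			/* models a pid-namespace child reaper */
	bool has_child_subreaper;
};

/* Link child under parent (prepends to the parent's child list). */
static void toy_add_child(struct toy_task *parent, struct toy_task *child)
{
	child->real_parent = parent;
	child->next_sibling = parent->first_child;
	parent->first_child = child;
}

/*
 * Set has_child_subreaper on every descendant of root, skipping the
 * subtree under any task flagged is_reaper. Iterative pre-order walk:
 * descend via first_child, advance via next_sibling, and climb back
 * via real_parent. Returns the number of tasks marked.
 */
static int mark_descendants(struct toy_task *root)
{
	struct toy_task *p = root->first_child;
	int marked = 0;

	while (p && p != root) {
		bool descend = false;

		if (!p->is_reaper) {
			p->has_child_subreaper = true;
			marked++;
			descend = p->first_child != NULL;
		}

		if (descend) {
			p = p->first_child;
			continue;
		}

		/* Climb until there is an unvisited sibling or we are back at root. */
		while (p != root && !p->next_sibling)
			p = p->real_parent;
		if (p != root)
			p = p->next_sibling;
	}

	return marked;
}
```

In this model a reaper and everything below it are left untouched, which mirrors the intent of the quoted is_child_reaper() check; a real implementation would additionally have to visit each thread's ->children list under the appropriate locking.
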