From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 16 Mar 2017 17:31:59 +0100
From: Oleg Nesterov <oleg@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
        Andrew Morton <akpm@linux-foundation.org>,
        Peter Zijlstra <peterz@infradead.org>,
        Thomas Gleixner <tglx@linutronix.de>, Chris Mason <clm@fb.com>,
        linux-kernel@vger.kernel.org, kernel-team@fb.com,
        Li Zefan <lizefan@huawei.com>, Johannes Weiner <hannes@cmpxchg.org>,
        cgroups@vger.kernel.org
Subject: Re: [PATCH 2/2] kthread, cgroup: close race window where new
        kthreads can be migrated to non-root cgroups
Message-ID: <20170316163158.GB27613@redhat.com>
References: <20170315231827.GA13656@htj.duckdns.org> <20170315231920.GB13656@htj.duckdns.org> <20170316150233.GB24478@redhat.com> <20170316153925.GA26391@redhat.com> <20170316160734.GD15810@htj.duckdns.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170316160734.GD15810@htj.duckdns.org>
User-Agent: Mutt/1.5.18 (2008-05-17)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/16, Tejun Heo wrote:
>
> > --- x/kernel/kthread.c
> > +++ x/kernel/kthread.c
> > @@ -226,6 +226,7 @@
> >  	ret = -EINTR;
> >  	if (!test_bit(KTHREAD_SHOULD_STOP, &self->flags)) {
> >  		__kthread_parkme(self);
> > +		current->flags &= ~PF_IDONTLIKECGROUPS;
> >  		ret = threadfn(data);
> >  	}
> >  	do_exit(ret);
> > @@ -537,7 +538,7 @@
> >  	set_cpus_allowed_ptr(tsk, cpu_all_mask);
> >  	set_mems_allowed(node_states[N_MEMORY]);
> >
> > -	current->flags |= PF_NOFREEZE;
> > +	current->flags |= (PF_NOFREEZE | PF_IDONTLIKECGROUPS);
> >
> >  	for (;;) {
> >  		set_current_state(TASK_INTERRUPTIBLE);
> > --- x/kernel/cgroup/cgroup.c
> > +++ x/kernel/cgroup/cgroup.c
> > @@ -2429,7 +2429,7 @@
> >  	 * trapped in a cpuset, or RT worker may be born in a cgroup
> >  	 * with no rt_runtime allocated.  Just say no.
> >  	 */
> > -	if (tsk == kthreadd_task || (tsk->flags & PF_NO_SETAFFINITY)) {
> > +	if (tsk->flags & (PF_NO_SETAFFINITY | PF_IDONTLIKECGROUPS)) {
> >  		ret = -EINVAL;
> >  		goto out_unlock_rcu;
> >  	}
>
> Absolutely.  If we're willing to spend a PF flag on it, we can
> properly wait for it too instead of failing it.

Or we can add another "unsigned no_cgroups:1" bit to task_struct
instead of spending a PF flag; I am not sure which is better.

Anyway, I do not understand the PF_NO_SETAFFINITY check in
__cgroup_procs_write(). task_can_attach() already checks it, so
cgroups can't change the affinity of such a task anyway. IMO
something explicit like no_cgroups makes more sense.
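
To make the no_cgroups idea concrete, here is a rough userspace sketch,
not a patch. It assumes the bit would be set by kthreadd, inherited by a
new kthread, and cleared just before threadfn() runs, mirroring where the
quoted diff touches PF_IDONTLIKECGROUPS. struct task_model and all the
model_*() helpers are hypothetical names made up for illustration; only
the no_cgroups bit itself comes from the suggestion above.

```c
#include <assert.h>
#include <errno.h>

/* Userspace stand-in for the one task_struct field this sketch needs. */
struct task_model {
	unsigned no_cgroups:1;	/* hypothetical: task must stay in the root cgroup */
};

/* kthreadd marks itself once, next to where it sets PF_NOFREEZE. */
void model_kthreadd_init(struct task_model *kthreadd)
{
	kthreadd->no_cgroups = 1;
}

/* A new kthread starts as a copy of kthreadd, so it inherits the bit. */
void model_fork(struct task_model *child, const struct task_model *parent)
{
	*child = *parent;
}

/* The new kthread drops the bit right before calling threadfn(data). */
void model_kthread_start(struct task_model *self)
{
	self->no_cgroups = 0;
}

/* The migration path refuses any task that still has the bit set. */
int model_procs_write_check(const struct task_model *tsk)
{
	return tsk->no_cgroups ? -EINVAL : 0;
}

/* Walk one kthread through its lifecycle; the asserts abort on a failed check. */
int model_demo(void)
{
	struct task_model kthreadd = { 0 }, child;

	model_kthreadd_init(&kthreadd);
	model_fork(&child, &kthreadd);
	assert(model_procs_write_check(&child) == -EINVAL);	/* threadfn not started yet */

	model_kthread_start(&child);
	assert(model_procs_write_check(&child) == 0);		/* may be migrated now */
	assert(model_procs_write_check(&kthreadd) == -EINVAL);	/* kthreadd never clears it */
	return 0;
}
```

The sketch only tests the explicit bit in the migration check. Whether
the PF_NO_SETAFFINITY test should stay there as well is left open, since
the affinity case is already covered by task_can_attach().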

Oleg.