Date: Thu, 10 Nov 2016 14:09:13 +0100
From: Oleg Nesterov
To: Peter Zijlstra
Cc: Ingo Molnar, Linus Torvalds, Mike Galbraith, hartsjc@redhat.com, vbendel@redhat.com, vlovejoy@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: sched/autogroup: race if !sysctl_sched_autogroup_enabled ?
Message-ID: <20161110130913.GA11933@redhat.com>
References: <20161109165933.GA26071@redhat.com> <20161109175005.GS3142@twins.programming.kicks-ass.net>
In-Reply-To: <20161109175005.GS3142@twins.programming.kicks-ass.net>

On 11/09, Peter Zijlstra wrote:
>
> On Wed, Nov 09, 2016 at 05:59:33PM +0100, Oleg Nesterov wrote:
>
> > We need to ensure that autogroup/tg returned by autogroup_task_group()
> > can't go away if we race with autogroup_move_group(), and unless the
> > caller holds ->siglock we rely on fact that autogroup_move_group()
> > will a) see this task and b) do sched_move_task() which needs the same
> > same rq->lock.
> >
> > However. autogroup_move_group() skips for_each_thread/sched_move_task
> > if sysctl_sched_autogroup_enabled == 0.
> >
> > So. Doesn't this mean that cgroup migration to the root cgroup can race
> > with autogroup_move_group() and use the soon-to-be-freed autogroup->tg?
>
> Argh, its too late for this, also jet-lag. But maybe, I can sort of feel
> a hole here but cannot for the life of me still think.

And the 3rd case which I didn't think about yesterday. And now I really hope
it can explain the vmcore we have.

If sysctl_sched_autogroup_enabled was enabled and then disabled, it is
possible that the "autogrouped" process runs with ag->kref.refcount == 1,
and if it does setsid() it frees its active task_group.

> > although this is a bit off-topic. Another question is that I fail to
> > understand why sched_autogroup_create_attach() does autogroup_create()
> > and changes signal->autogroup even if !sysctl_sched_autogroup_enabled.
>
> I really cannot remember back that far, but it could be to allow
> flipping it back on.

Yes, I thought about this too, but I think it is hardly possible to explain
what we actually want when sysctl_sched_autogroup_enabled changes from 0
to 1.

So I am going to send the patch which simply moves the sysctl check from
autogroup_move_group() to sched_autogroup_create_attach(), but perhaps I
should split this change? I mean, the first patch for -stable could just
remove the current check, the 2nd one will add it into
sched_autogroup_create_attach().

Oleg.
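
To make the 3rd case and the proposed split easier to follow, here is a
userspace toy model in C. It is only a sketch, not kernel code: struct group,
struct task, enum check_site, migrate() and task_left_on_freed_group() are
hypothetical names invented for the illustration. The refcounting is
simplified (the new group starts at 0 and move_group() takes the only
reference), and "freeing" just sets a flag so the result can be inspected.
The model also assumes that a migration re-evaluates the effective group and
falls back to the root group when the sysctl is 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Where the sysctl check lives (hypothetical names for this sketch). */
enum check_site {
    CHECK_IN_MOVE_GROUP,    /* behaviour described in the thread as current */
    NO_CHECK,               /* first proposed patch: drop the check */
    CHECK_IN_CREATE_ATTACH  /* both patches: check moved to the caller */
};

struct group {
    int refcount;           /* stands in for ag->kref */
    bool freed;             /* set instead of calling free() */
};

struct task {
    struct group *signal_autogroup; /* models signal->autogroup */
    struct group *sched_group;      /* group the scheduler really uses */
};

static bool autogroup_enabled;      /* models sysctl_sched_autogroup_enabled */
static struct group root_group = { 1, false };

static void group_put(struct group *g)
{
    assert(g->refcount > 0);
    if (--g->refcount == 0)
        g->freed = true;
}

/* Models for_each_thread/sched_move_task: pick the effective group again. */
static void migrate(struct task *p)
{
    p->sched_group = autogroup_enabled ? p->signal_autogroup : &root_group;
}

static void move_group(struct task *p, struct group *ag, enum check_site site)
{
    struct group *prev = p->signal_autogroup;

    ag->refcount++;
    p->signal_autogroup = ag;
    if (site != CHECK_IN_MOVE_GROUP || autogroup_enabled)
        migrate(p);
    group_put(prev);        /* may drop the last reference to prev */
}

/* Models what setsid() ends up calling. */
static void create_attach(struct task *p, struct group *fresh,
                          enum check_site site)
{
    if (site == CHECK_IN_CREATE_ATTACH && !autogroup_enabled)
        return;
    move_group(p, fresh, site);
}

/* Returns true if the task still runs on a group that was freed. */
static bool task_left_on_freed_group(enum check_site site)
{
    struct group ag1 = { 1, false };    /* the task holds the only reference */
    struct group ag2 = { 0, false };
    struct task p = { &ag1, &ag1 };     /* autogrouped while the sysctl was 1 */

    autogroup_enabled = false;          /* the sysctl has since been set to 0 */
    create_attach(&p, &ag2, site);
    return p.sched_group->freed;
}
```

In this model only CHECK_IN_MOVE_GROUP leaves the task on the freed group.
With NO_CHECK the task is migrated before the old group is dropped, and with
CHECK_IN_CREATE_ATTACH the old group is never dropped at all.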