Message-ID: <494079AF.3040704@cn.fujitsu.com>
Date: Thu, 11 Dec 2008 10:23:43 +0800
From: Li Zefan
To: Paul Menage
CC: Andrew Morton, Linux Containers, LKML
Subject: Re: [PATCH] CGroups: Fix a race between rmdir and remount
References: <6599ad830812101533r6cb007eeu5ef19499e278f6b7@mail.gmail.com>
In-Reply-To: <6599ad830812101533r6cb007eeu5ef19499e278f6b7@mail.gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Paul Menage wrote:
> Fix a race between rmdir and remount
>
> When a cgroup is removed, it's unlinked from its parent's children
> list, but not actually freed until the last dentry on it is released
> (at which point cgrp->root->number_of_cgroups is decremented).
>
> Currently rebind_subsystems checks for the top cgroup's child list
> being empty in order to rebind subsystems into or out of a hierarchy -
> this can result in the set of subsystems bound to a hierarchy being
> different than the set of subsystems with state in the
> removed-but-not-freed cgroup.
>
> The simplest fix for this is to forbid remounts that change the set of
> subsystems on a hierarchy that has removed-but-not-freed cgroups.
>
> This bug can be reproduced via:
>

It seems this bug was exposed by my patch:
	cgroups-remove-some-redundant-null-checks.patch
	(http://marc.info/?l=linux-mm-commits&m=122730918427045&w=2)

@@ -611,10 +611,8 @@ static void cgroup_diput(struct dentry *
...
-		for_each_subsys(cgrp->root, ss) {
-			if (cgrp->subsys[ss->subsys_id])
-				ss->destroy(ss, cgrp);
-		}
+		for_each_subsys(cgrp->root, ss)
+			ss->destroy(ss, cgrp);

But that patch is not guilty. :) Because of this race, the original code
silently leaked memory whenever the remount removed some subsystems from
the hierarchy.

> mkdir /mnt/cg
> mount -t cgroup -o ns,freezer cgroup /mnt/cg
> mkdir /mnt/cg/foo
> sleep 1h < /mnt/cg/foo &
> rmdir /mnt/cg/foo
> mount -t cgroup -o remount,ns,devices,freezer cgroup /mnt/cg
> kill $!
>
> Signed-off-by: Paul Menage
>

Reviewed-by: Li Zefan

> ---
>  kernel/cgroup.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: hierarchy_lock-mmotm-2008-12-09/kernel/cgroup.c
> ===================================================================
> --- hierarchy_lock-mmotm-2008-12-09.orig/kernel/cgroup.c
> +++ hierarchy_lock-mmotm-2008-12-09/kernel/cgroup.c
> @@ -702,7 +702,7 @@ static int rebind_subsystems(struct cgro
>  	 * any child cgroups exist. This is theoretically supportable
>  	 * but involves complex error handling, so it's being left until
>  	 * later */
> -	if (!list_empty(&cgrp->children))
> +	if (root->number_of_cgroups > 1)
>  		return -EBUSY;
>
>  	/* Process each subsystem */
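
For anyone following the thread, the window the patch closes can be
illustrated with a small userspace model. This is only a sketch and not
kernel code: struct model_root and the model_*/..._check_busy helpers are
hypothetical names invented for illustration, the children list is reduced
to a counter, and it assumes number_of_cgroups starts at 1 (the top cgroup)
and is incremented on mkdir.

```c
#include <stdbool.h>

/* Toy model of one hierarchy. Hypothetical; not the kernel's cgroupfs_root. */
struct model_root {
	int number_of_cgroups; /* top cgroup plus every cgroup not yet freed */
	int linked_children;   /* stands in for the top cgroup's children list */
};

/* Fresh mount: only the top cgroup exists (assumed initial state). */
static void model_mount(struct model_root *r)
{
	r->number_of_cgroups = 1;
	r->linked_children = 0;
}

/* mkdir: the new cgroup is linked into the list and counted. */
static void model_mkdir(struct model_root *r)
{
	r->linked_children++;
	r->number_of_cgroups++;
}

/* rmdir: unlinked from the children list, but not freed yet. */
static void model_rmdir(struct model_root *r)
{
	r->linked_children--;
}

/* Last dentry released: only now is the counter decremented. */
static void model_last_dput(struct model_root *r)
{
	r->number_of_cgroups--;
}

/* Old check, modelled on !list_empty(&cgrp->children). */
static bool old_check_busy(const struct model_root *r)
{
	return r->linked_children != 0;
}

/* New check, modelled on root->number_of_cgroups > 1. */
static bool new_check_busy(const struct model_root *r)
{
	return r->number_of_cgroups > 1;
}
```

Calling model_mount, model_mkdir and model_rmdir in that order mirrors the
reproduction steps up to the remount: at that point old_check_busy() reports
the hierarchy as idle while new_check_busy() still reports it as busy, and
the two only agree again after model_last_dput() (the "kill $!" step).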