From mboxrd@z Thu Jan 1 00:00:00 1970 From: Li Zefan Subject: Re: [PATCH cgroup 1/2] cgroup: move module ref handling into rebind_subsystems() Date: Fri, 12 Jul 2013 17:08:23 +0800 Message-ID: <51DFC787.7080703@huawei.com> References: <20130629041231.GA31353@htj.dyndns.org> <51DFC650.9010801@huawei.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <51DFC650.9010801-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org Errors-To: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org To: Tejun Heo Cc: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org >> Subject: [PATCH 1/2] cgroup: move module ref handling into rebind_subsystems() >> >> Module ref handling in cgroup is rather weird. >> parse_cgroupfs_options() grabs all the modules for the specified >> subsystems. A module ref is kept if the specified subsystem is newly >> bound to the hierarchy. If not, or the operation fails, the refs are >> dropped. This scatters module ref handling across multiple functions >> making it difficult to track. It also make the function nasty to use >> for dynamic subsystem binding which is necessary for the planned >> unified hierarchy. >> >> There's nothing which requires the subsystem modules to be pinned >> between parse_cgroupfs_options() and rebind_subsystems() in both mount >> and remount paths. parse_cgroupfs_options() can just parse and >> rebind_subsystems() can handle pinning the subsystems that it wants to >> bind, which is a natural part of its task - binding - anyway. >> >> Move module ref handling into rebind_subsystems() which makes the code >> a lot simpler - modules are gotten iff it's gonna be bound and put iff >> unbound or binding fails. >> >> Signed-off-by: Tejun Heo >> --- >> kernel/cgroup.c | 87 +++++++++++++++------------------------------------------ >> 1 file changed, 22 insertions(+), 65 deletions(-) >> >> diff --git a/kernel/cgroup.c b/kernel/cgroup.c >> index 3bc7a1a..a65aff1 100644 >> --- a/kernel/cgroup.c >> +++ b/kernel/cgroup.c >> @@ -1003,6 +1003,7 @@ static int rebind_subsystems(struct cgroupfs_root *root, >> { >> struct cgroup *cgrp = &root->top_cgroup; >> struct cgroup_subsys *ss; >> + unsigned long pinned = 0; >> int i, ret; >> >> BUG_ON(!mutex_is_locked(&cgroup_mutex)); >> @@ -1010,20 +1011,26 @@ static int rebind_subsystems(struct cgroupfs_root *root, >> >> /* Check that any added subsystems are currently free */ >> for_each_subsys(ss, i) { >> - unsigned long bit = 1UL << i; >> - >> - if (!(bit & added_mask)) >> + if (!(added_mask & (1 << i))) >> continue; >> >> + /* is the subsystem mounted elsewhere? */ >> if (ss->root != &cgroup_dummy_root) { >> - /* Subsystem isn't free */ >> - return -EBUSY; >> + ret = -EBUSY; >> + goto out_put; >> } >> + >> + /* pin the module */ >> + if (!try_module_get(ss->module)) { >> + ret = -ENOENT; >> + goto out_put; >> + } >> + pinned |= 1 << i; >> } > > This looks wrong to me. > > cgroup_mount() > { > mutex_lock(cgroup_mutex); > parse_cgroupfs_options(); > mutex_unlock(cgroup_mutex); > ... > > mutex_lock(cgroup_mutex); > ... > rebind_subsystems(); > ... > mutex_unlock(cgroup_mutex); > } > > so a modular cgroup subsystem can be unloaded inbetween, say it's net_cls, and > then it's possible that: > > # mount -t cgroup -o net_cls xxx /cgroup > > The above operation succeeds but it's not binded to cgroupfs as it just got > unloaded. > for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) { ... if (!subsys[i] && (added_mask & (1 << i)) return -EINVAL; ... } This should work.