* [PATCH 1/2] cgroup: no need to check css refs for release notification
@ 2013-03-01 7:06 Li Zefan
[not found] ` <5130535F.7060201-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 4+ messages in thread
From: Li Zefan @ 2013-03-01 7:06 UTC (permalink / raw)
To: Tejun Heo; +Cc: LKML, cgroups
We no longer fail rmdir() when there are still css refs, so we don't
need to check css refs in check_for_release().
This also avoids a bug. cgroup_has_css_refs() accesses subsys[i]
without cgroup_mutex, so it can race with cgroup_unload_subsys().
cgroup_has_css_refs()
...
if (ss == NULL || ss->root != cgrp->root)
If ss points to net_cls_subsys, and the cls_cgroup module is unloaded
right after the former check but before the latter, the memory in which
net_cls_subsys resides has become invalid.
Signed-off-by: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
kernel/cgroup.c | 67 +++++++--------------------------------------------------
1 file changed, 8 insertions(+), 59 deletions(-)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 43ff59e..f4554cc 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4343,47 +4343,6 @@ static int cgroup_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
return cgroup_create(c_parent, dentry, mode | S_IFDIR);
}
-/*
- * Check the reference count on each subsystem. Since we already
- * established that there are no tasks in the cgroup, if the css refcount
- * is also 1, then there should be no outstanding references, so the
- * subsystem is safe to destroy. We scan across all subsystems rather than
- * using the per-hierarchy linked list of mounted subsystems since we can
- * be called via check_for_release() with no synchronization other than
- * RCU, and the subsystem linked list isn't RCU-safe.
- */
-static int cgroup_has_css_refs(struct cgroup *cgrp)
-{
- int i;
-
- /*
- * We won't need to lock the subsys array, because the subsystems
- * we're concerned about aren't going anywhere since our cgroup root
- * has a reference on them.
- */
- for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
- struct cgroup_subsys *ss = subsys[i];
- struct cgroup_subsys_state *css;
-
- /* Skip subsystems not present or not in this hierarchy */
- if (ss == NULL || ss->root != cgrp->root)
- continue;
-
- css = cgrp->subsys[ss->subsys_id];
- /*
- * When called from check_for_release() it's possible
- * that by this point the cgroup has been removed
- * and the css deleted. But a false-positive doesn't
- * matter, since it can only happen if the cgroup
- * has been deleted and hence no longer needs the
- * release agent to be called anyway.
- */
- if (css && css_refcnt(css) > 1)
- return 1;
- }
- return 0;
-}
-
static int cgroup_destroy_locked(struct cgroup *cgrp)
__releases(&cgroup_mutex) __acquires(&cgroup_mutex)
{
@@ -5112,12 +5071,15 @@ static void check_for_release(struct cgroup *cgrp)
{
/* All of these checks rely on RCU to keep the cgroup
* structure alive */
- if (cgroup_is_releasable(cgrp) && !atomic_read(&cgrp->count)
- && list_empty(&cgrp->children) && !cgroup_has_css_refs(cgrp)) {
- /* Control Group is currently removeable. If it's not
+ if (cgroup_is_releasable(cgrp) &&
+ !atomic_read(&cgrp->count) && list_empty(&cgrp->children)) {
+ /*
+ * Control Group is currently removeable. If it's not
* already queued for a userspace notification, queue
- * it now */
+ * it now
+ */
int need_schedule_work = 0;
+
raw_spin_lock(&release_list_lock);
if (!cgroup_is_removed(cgrp) &&
list_empty(&cgrp->release_list)) {
@@ -5150,24 +5112,11 @@ EXPORT_SYMBOL_GPL(__css_tryget);
/* Caller must verify that the css is not for root cgroup */
void __css_put(struct cgroup_subsys_state *css)
{
- struct cgroup *cgrp = css->cgroup;
int v;
- rcu_read_lock();
v = css_unbias_refcnt(atomic_dec_return(&css->refcnt));
-
- switch (v) {
- case 1:
- if (notify_on_release(cgrp)) {
- set_bit(CGRP_RELEASABLE, &cgrp->flags);
- check_for_release(cgrp);
- }
- break;
- case 0:
+ if (v == 0)
schedule_work(&css->dput_work);
- break;
- }
- rcu_read_unlock();
}
EXPORT_SYMBOL_GPL(__css_put);
--
1.8.0.2
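The race described in the changelog above is a check-then-dereference on a slot that another thread may clear and free in between. The patch avoids it by deleting the unlocked reader altogether. The userspace sketch below only illustrates the hazard and what a locked reader would look like. Every name in it (subsys_sketch, sketch_slots, sketch_mutex and the sketch_* functions) is hypothetical, and a pthread mutex stands in for cgroup_mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct cgroup_subsys: only the field the check reads. */
struct subsys_sketch {
	int root_id;
};

#define SKETCH_SLOTS 4

/* Hypothetical stand-ins for the subsys[] array and cgroup_mutex. */
static struct subsys_sketch *sketch_slots[SKETCH_SLOTS];
static pthread_mutex_t sketch_mutex = PTHREAD_MUTEX_INITIALIZER;

/* "Module load": publish a heap-allocated descriptor in slot i. */
static void sketch_load(int i, int root_id)
{
	struct subsys_sketch *ss = malloc(sizeof(*ss));
	struct subsys_sketch *old;

	assert(ss != NULL && i >= 0 && i < SKETCH_SLOTS);
	ss->root_id = root_id;

	pthread_mutex_lock(&sketch_mutex);
	old = sketch_slots[i];
	sketch_slots[i] = ss;
	pthread_mutex_unlock(&sketch_mutex);
	free(old);	/* free(NULL) is a no-op */
}

/* "Module unload": clear the slot under the lock, then free the memory. */
static void sketch_unload(int i)
{
	struct subsys_sketch *ss;

	assert(i >= 0 && i < SKETCH_SLOTS);
	pthread_mutex_lock(&sketch_mutex);
	ss = sketch_slots[i];
	sketch_slots[i] = NULL;
	pthread_mutex_unlock(&sketch_mutex);
	free(ss);
}

/*
 * Locked reader: the NULL check and the dereference sit in one critical
 * section, so sketch_unload() cannot free the descriptor between them.
 * An unlocked "ss == NULL || ss->root_id != ..." has exactly that window.
 */
static int sketch_in_root(int i, int root_id)
{
	struct subsys_sketch *ss;
	int match;

	assert(i >= 0 && i < SKETCH_SLOTS);
	pthread_mutex_lock(&sketch_mutex);
	ss = sketch_slots[i];
	match = (ss != NULL && ss->root_id == root_id);
	pthread_mutex_unlock(&sketch_mutex);
	return match;
}
```

Taking a lock on every release check costs something. Removing the check once rmdir() no longer depends on css refs, as this patch does, sidesteps the question entirely.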
^ permalink raw reply related [flat|nested] 4+ messages in thread
[parent not found: <5130535F.7060201-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>]
* [PATCH 2/2] cgroup: avoid accessing modular cgroup subsys structure without locking
[not found] ` <5130535F.7060201-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2013-03-01 7:06 ` Li Zefan
[not found] ` <5130537C.5010608-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2013-03-04 18:05 ` [PATCH 1/2] cgroup: no need to check css refs for release notification Tejun Heo
1 sibling, 1 reply; 4+ messages in thread
From: Li Zefan @ 2013-03-01 7:06 UTC (permalink / raw)
To: Tejun Heo; +Cc: LKML, cgroups
subsys[i] is set to NULL in cgroup_unload_subsys() at modular unload,
and that's protected by cgroup_mutex, and then the memory *subsys[i]
resides will be freed.
So this is unsafe without any locking:
if (!ss || ss->module)
...
Signed-off-by: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
include/linux/cgroup.h | 11 +++++++++--
kernel/cgroup.c | 32 ++++++++++++++++++--------------
2 files changed, 27 insertions(+), 16 deletions(-)
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 75c6ec1..3ac6bb0 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -46,12 +46,19 @@ extern const struct file_operations proc_cgroup_operations;
/* Define the enumeration of all builtin cgroup subsystems */
#define SUBSYS(_x) _x ## _subsys_id,
-#define IS_SUBSYS_ENABLED(option) IS_ENABLED(option)
enum cgroup_subsys_id {
+#define IS_SUBSYS_ENABLED(option) IS_BUILTIN(option)
#include <linux/cgroup_subsys.h>
+#undef IS_SUBSYS_ENABLED
+ CGROUP_BUILTIN_SUBSYS_COUNT,
+
+ __CGROUP_SUBSYS_TEMP_PLACEHOLDER = CGROUP_BUILTIN_SUBSYS_COUNT - 1,
+
+#define IS_SUBSYS_ENABLED(option) IS_MODULE(option)
+#include <linux/cgroup_subsys.h>
+#undef IS_SUBSYS_ENABLED
CGROUP_SUBSYS_COUNT,
};
-#undef IS_SUBSYS_ENABLED
#undef SUBSYS
/* Per-subsystem/per-cgroup state maintained by the system. */
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index f4554cc..29273db 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4944,17 +4944,17 @@ void cgroup_post_fork(struct task_struct *child)
* and addition to css_set.
*/
if (need_forkexit_callback) {
- for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
+ /*
+ * fork/exit callbacks are supported only for builtin
+ * subsystems, and the builtin section of the subsys
+ * array is immutable, so we don't need to lock the
+ * subsys array here. On the other hand, modular section
+ * of the array can be freed at module unload, so we
+ * can't touch that.
+ */
+ for (i = 0; i < CGROUP_BUILTIN_SUBSYS_COUNT; i++) {
struct cgroup_subsys *ss = subsys[i];
- /*
- * fork/exit callbacks are supported only for
- * builtin subsystems and we don't need further
- * synchronization as they never go away.
- */
- if (!ss || ss->module)
- continue;
-
if (ss->fork)
ss->fork(child);
}
@@ -5019,13 +5019,17 @@ void cgroup_exit(struct task_struct *tsk, int run_callbacks)
tsk->cgroups = &init_css_set;
if (run_callbacks && need_forkexit_callback) {
- for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
+ /*
+ * fork/exit callbacks are supported only for builtin
+ * subsystems, and the builtin section of the subsys
+ * array is immutable, so we don't need to lock the
+ * subsys array here. On the other hand, modular section
+ * of the array can be freed at module unload, so we
+ * can't touch that.
+ */
+ for (i = 0; i < CGROUP_BUILTIN_SUBSYS_COUNT; i++) {
struct cgroup_subsys *ss = subsys[i];
- /* modular subsystems can't use callbacks */
- if (!ss || ss->module)
- continue;
-
if (ss->exit) {
struct cgroup *old_cgrp =
rcu_dereference_raw(cg->subsys[i])->cgroup;
--
1.8.0.2
^ permalink raw reply related [flat|nested] 4+ messages in thread
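The loop change in cgroup_post_fork() and cgroup_exit() relies on the builtin entries occupying the front of the subsys array and never changing after boot. A minimal userspace sketch of that access pattern follows. All names (task_sketch, subsys_sketch, sketch_table, SKETCH_BUILTIN_COUNT and so on) are hypothetical, and the table contents are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for task_struct and cgroup_subsys. */
struct task_sketch {
	int forks_seen;
};

struct subsys_sketch {
	void (*fork)(struct task_sketch *child);	/* optional callback */
};

/* Assumed layout: builtin entries first, modular slots after them. */
enum { SKETCH_BUILTIN_COUNT = 2, SKETCH_TOTAL_COUNT = 4 };
static_assert(SKETCH_BUILTIN_COUNT <= SKETCH_TOTAL_COUNT,
	      "builtin section must fit inside the table");

static void sketch_count_fork(struct task_sketch *child)
{
	child->forks_seen++;
}

static struct subsys_sketch sketch_builtin_a = { .fork = sketch_count_fork };
static struct subsys_sketch sketch_builtin_b = { .fork = NULL };
static struct subsys_sketch sketch_modular_c = { .fork = sketch_count_fork };

static struct subsys_sketch *sketch_table[SKETCH_TOTAL_COUNT] = {
	&sketch_builtin_a,	/* builtin: always present */
	&sketch_builtin_b,	/* builtin: always present, no fork callback */
	&sketch_modular_c,	/* modular: may be unloaded and freed */
	NULL,			/* modular: currently not loaded */
};

/*
 * Walk only [0, SKETCH_BUILTIN_COUNT). Those slots are assumed to be
 * filled at build time and never cleared, so no NULL check and no lock
 * are needed. The modular slots are never read here.
 */
static void sketch_post_fork(struct task_sketch *child)
{
	int i;

	for (i = 0; i < SKETCH_BUILTIN_COUNT; i++) {
		struct subsys_sketch *ss = sketch_table[i];

		if (ss->fork)
			ss->fork(child);
	}
}
```

The modular entry that also has a fork callback is deliberately skipped: bounding the loop replaces the old "if (!ss || ss->module) continue;" test, which had to read through a pointer that might already be freed.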
[parent not found: <5130537C.5010608-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 2/2] cgroup: avoid accessing modular cgroup subsys structure without locking
[not found] ` <5130537C.5010608-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2013-03-04 18:04 ` Tejun Heo
0 siblings, 0 replies; 4+ messages in thread
From: Tejun Heo @ 2013-03-04 18:04 UTC (permalink / raw)
To: Li Zefan; +Cc: LKML, cgroups
Hello, Li.
On Fri, Mar 01, 2013 at 03:06:36PM +0800, Li Zefan wrote:
> /* Define the enumeration of all builtin cgroup subsystems */
> #define SUBSYS(_x) _x ## _subsys_id,
> -#define IS_SUBSYS_ENABLED(option) IS_ENABLED(option)
> enum cgroup_subsys_id {
> +#define IS_SUBSYS_ENABLED(option) IS_BUILTIN(option)
> #include <linux/cgroup_subsys.h>
> +#undef IS_SUBSYS_ENABLED
> + CGROUP_BUILTIN_SUBSYS_COUNT,
> +
> + __CGROUP_SUBSYS_TEMP_PLACEHOLDER = CGROUP_BUILTIN_SUBSYS_COUNT - 1,
> +
> +#define IS_SUBSYS_ENABLED(option) IS_MODULE(option)
> +#include <linux/cgroup_subsys.h>
> +#undef IS_SUBSYS_ENABLED
> CGROUP_SUBSYS_COUNT,
> };
> -#undef IS_SUBSYS_ENABLED
> #undef SUBSYS
Arghh.... can we at least have a comment explaining what we're doing
here? It's ugly and confusing.
> @@ -5019,13 +5019,17 @@ void cgroup_exit(struct task_struct *tsk, int run_callbacks)
> tsk->cgroups = &init_css_set;
>
> if (run_callbacks && need_forkexit_callback) {
> - for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
> + /*
> + * fork/exit callbacks are supported only for builtin
> + * subsystems, and the builtin section of the subsys
> + * array is immutable, so we don't need to lock the
> + * subsys array here. On the other hand, modular section
> + * of the array can be freed at module unload, so we
> + * can't touch that.
> + */
> + for (i = 0; i < CGROUP_BUILTIN_SUBSYS_COUNT; i++) {
Probably enough to say "fork/exit callbacks are supported only for
builtin subsys, see cgroup_for() for details"?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 4+ messages in thread
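For readers puzzled by the enum in the same way, here is a self-contained sketch of the numbering trick. The real header includes <linux/cgroup_subsys.h> twice under different IS_SUBSYS_ENABLED definitions. This sketch imitates that with a two-callback list macro, since one file cannot include itself conditionally. The subsystem names, their builtin/module split and every SKETCH_* identifier are assumptions made up for the example.

```c
#include <assert.h>

/*
 * Hypothetical subsystem list. Each entry is tagged as builtin or module;
 * in the kernel that tag comes from Kconfig, here it is simply hard-coded.
 */
#define SKETCH_SUBSYS_LIST(BUILTIN, MODULE) \
	BUILTIN(cpuset) \
	MODULE(net_cls) \
	BUILTIN(memory) \
	MODULE(blkio)

#define SKETCH_EMIT(name) name##_sketch_id,
#define SKETCH_SKIP(name)

enum sketch_subsys_id {
	/* Pass 1: emit only builtin entries, numbered from 0. */
	SKETCH_SUBSYS_LIST(SKETCH_EMIT, SKETCH_SKIP)
	SKETCH_BUILTIN_COUNT,

	/*
	 * SKETCH_BUILTIN_COUNT consumed one enum value. Stepping back by
	 * one makes the next enumerator reuse that value, so the first
	 * modular id equals SKETCH_BUILTIN_COUNT and no index is left unused.
	 */
	SKETCH_PLACEHOLDER = SKETCH_BUILTIN_COUNT - 1,

	/* Pass 2: emit only modular entries, continuing the numbering. */
	SKETCH_SUBSYS_LIST(SKETCH_SKIP, SKETCH_EMIT)
	SKETCH_TOTAL_COUNT,
};

static_assert(net_cls_sketch_id == SKETCH_BUILTIN_COUNT,
	      "first modular id must follow the last builtin id directly");

/* With this layout, "is builtin" becomes a range check on the id. */
static int sketch_id_is_builtin(int id)
{
	return id >= 0 && id < SKETCH_BUILTIN_COUNT;
}
```

With the list above, cpuset and memory get ids 0 and 1, SKETCH_BUILTIN_COUNT is 2, net_cls and blkio get 2 and 3, and SKETCH_TOTAL_COUNT is 4. If no entry were builtin, the placeholder would evaluate to -1 and the modular ids would still start at 0.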
* Re: [PATCH 1/2] cgroup: no need to check css refs for release notification
[not found] ` <5130535F.7060201-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2013-03-01 7:06 ` [PATCH 2/2] cgroup: avoid accessing modular cgroup subsys structure without locking Li Zefan
@ 2013-03-04 18:05 ` Tejun Heo
1 sibling, 0 replies; 4+ messages in thread
From: Tejun Heo @ 2013-03-04 18:05 UTC (permalink / raw)
To: Li Zefan; +Cc: LKML, cgroups
On Fri, Mar 01, 2013 at 03:06:07PM +0800, Li Zefan wrote:
> We no longer fail rmdir() when there are still css refs, so we don't
> need to check css refs in check_for_release().
>
> This also avoids a bug. cgroup_has_css_refs() accesses subsys[i]
> without cgroup_mutex, so it can race with cgroup_unload_subsys().
>
> cgroup_has_css_refs()
> ...
> if (ss == NULL || ss->root != cgrp->root)
>
> If ss points to net_cls_subsys, and the cls_cgroup module is unloaded
> right after the former check but before the latter, the memory in which
> net_cls_subsys resides has become invalid.
>
> Signed-off-by: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Applied to cgroup/for-3.10.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 4+ messages in thread