* [PATCH 1/2] cgroup: no need to check css refs for release notification
@ 2013-03-01 7:06 Li Zefan
2013-03-01 7:06 ` [PATCH 2/2] cgroup: avoid accessing modular cgroup subsys structure without locking Li Zefan
2013-03-04 18:05 ` [PATCH 1/2] cgroup: no need to check css refs for release notification Tejun Heo
0 siblings, 2 replies; 4+ messages in thread
From: Li Zefan @ 2013-03-01 7:06 UTC
To: Tejun Heo; +Cc: LKML, cgroups
We no longer fail rmdir() when there're still css refs, so we don't
need to check css refs in check_for_release().
This also avoids a bug. cgroup_has_css_refs() accesses subsys[i]
without cgroup_mutex, so it can race with cgroup_unload_subsys().
cgroup_has_css_refs()
...
if (ss == NULL || ss->root != cgrp->root)
If ss points to net_cls_subsys and the cls_cgroup module is unloaded
right after the former check but before the latter, the memory in which
net_cls_subsys resides has become invalid.
Signed-off-by: Li Zefan <lizefan@huawei.com>
---
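Note (not part of the commit message): the interleaving described above
can be modeled in userspace. This is only a rough sketch under stated
assumptions. All demo_* names are hypothetical, a callback stands in for
a concurrent cgroup_unload_subsys(), and a "live" flag stands in for the
module memory being freed so that the stale access can be observed
without undefined behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cgroup_subsys. */
struct demo_subsys {
	int live;	/* 0 once the "module" has been unloaded */
	int root_id;	/* stand-in for ss->root */
};

static struct demo_subsys demo_net_cls = { 1, 7 };
static struct demo_subsys *demo_subsys[1] = { &demo_net_cls };

/* Models module unload: clear the slot, then the memory goes away. */
static void demo_unload(void)
{
	demo_subsys[0] = NULL;
	demo_net_cls.live = 0;
}

/*
 * Models the unlocked two-step check. The optional callback runs
 * between the NULL check and the dereference, which is the window the
 * commit message describes. Returns 1 if the dereference touched an
 * unloaded subsystem.
 */
static int demo_stale_access(void (*interleave)(void))
{
	struct demo_subsys *ss = demo_subsys[0];

	if (ss == NULL)
		return 0;
	if (interleave)
		interleave();
	return !ss->live;	/* in the kernel this would be ss->root */
}

void demo_race_walkthrough(void)
{
	/* No interleaving: the subsystem is still loaded. */
	assert(demo_stale_access(NULL) == 0);
	/* Unload lands between the two checks: stale dereference. */
	assert(demo_stale_access(demo_unload) == 1);
	/* After unload, the NULL check catches the empty slot. */
	assert(demo_stale_access(NULL) == 0);
}
```

Holding a lock across both checks would close the window. This patch
takes the simpler route of removing the function, since its result is no
longer needed.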
kernel/cgroup.c | 67 +++++++--------------------------------------------------
1 file changed, 8 insertions(+), 59 deletions(-)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 43ff59e..f4554cc 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4343,47 +4343,6 @@ static int cgroup_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
return cgroup_create(c_parent, dentry, mode | S_IFDIR);
}
-/*
- * Check the reference count on each subsystem. Since we already
- * established that there are no tasks in the cgroup, if the css refcount
- * is also 1, then there should be no outstanding references, so the
- * subsystem is safe to destroy. We scan across all subsystems rather than
- * using the per-hierarchy linked list of mounted subsystems since we can
- * be called via check_for_release() with no synchronization other than
- * RCU, and the subsystem linked list isn't RCU-safe.
- */
-static int cgroup_has_css_refs(struct cgroup *cgrp)
-{
- int i;
-
- /*
- * We won't need to lock the subsys array, because the subsystems
- * we're concerned about aren't going anywhere since our cgroup root
- * has a reference on them.
- */
- for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
- struct cgroup_subsys *ss = subsys[i];
- struct cgroup_subsys_state *css;
-
- /* Skip subsystems not present or not in this hierarchy */
- if (ss == NULL || ss->root != cgrp->root)
- continue;
-
- css = cgrp->subsys[ss->subsys_id];
- /*
- * When called from check_for_release() it's possible
- * that by this point the cgroup has been removed
- * and the css deleted. But a false-positive doesn't
- * matter, since it can only happen if the cgroup
- * has been deleted and hence no longer needs the
- * release agent to be called anyway.
- */
- if (css && css_refcnt(css) > 1)
- return 1;
- }
- return 0;
-}
-
static int cgroup_destroy_locked(struct cgroup *cgrp)
__releases(&cgroup_mutex) __acquires(&cgroup_mutex)
{
@@ -5112,12 +5071,15 @@ static void check_for_release(struct cgroup *cgrp)
{
/* All of these checks rely on RCU to keep the cgroup
* structure alive */
- if (cgroup_is_releasable(cgrp) && !atomic_read(&cgrp->count)
- && list_empty(&cgrp->children) && !cgroup_has_css_refs(cgrp)) {
- /* Control Group is currently removeable. If it's not
+ if (cgroup_is_releasable(cgrp) &&
+ !atomic_read(&cgrp->count) && list_empty(&cgrp->children)) {
+ /*
+ * Control Group is currently removeable. If it's not
* already queued for a userspace notification, queue
- * it now */
+ * it now
+ */
int need_schedule_work = 0;
+
raw_spin_lock(&release_list_lock);
if (!cgroup_is_removed(cgrp) &&
list_empty(&cgrp->release_list)) {
@@ -5150,24 +5112,11 @@ EXPORT_SYMBOL_GPL(__css_tryget);
/* Caller must verify that the css is not for root cgroup */
void __css_put(struct cgroup_subsys_state *css)
{
- struct cgroup *cgrp = css->cgroup;
int v;
- rcu_read_lock();
v = css_unbias_refcnt(atomic_dec_return(&css->refcnt));
-
- switch (v) {
- case 1:
- if (notify_on_release(cgrp)) {
- set_bit(CGRP_RELEASABLE, &cgrp->flags);
- check_for_release(cgrp);
- }
- break;
- case 0:
+ if (v == 0)
schedule_work(&css->dput_work);
- break;
- }
- rcu_read_unlock();
}
EXPORT_SYMBOL_GPL(__css_put);
--
1.8.0.2
* [PATCH 2/2] cgroup: avoid accessing modular cgroup subsys structure without locking
2013-03-01 7:06 [PATCH 1/2] cgroup: no need to check css refs for release notification Li Zefan
@ 2013-03-01 7:06 ` Li Zefan
2013-03-04 18:04 ` Tejun Heo
2013-03-04 18:05 ` [PATCH 1/2] cgroup: no need to check css refs for release notification Tejun Heo
1 sibling, 1 reply; 4+ messages in thread
From: Li Zefan @ 2013-03-01 7:06 UTC
To: Tejun Heo; +Cc: LKML, cgroups
subsys[i] is set to NULL in cgroup_unload_subsys() at module unload,
and that's protected by cgroup_mutex; the memory in which *subsys[i]
resides is then freed.
So this is unsafe without any locking:
if (!ss || ss->module)
...
Signed-off-by: Li Zefan <lizefan@huawei.com>
---
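Note (not part of the commit message): the enum layout introduced in
include/linux/cgroup.h below can be sketched in standalone C. This is an
illustration under assumptions, not the kernel code. The patch includes
<linux/cgroup_subsys.h> twice with IS_SUBSYS_ENABLED redefined; the
sketch swaps in a hypothetical X-macro list (DEMO_SUBSYS_LIST) expanded
twice to get the same effect, and all demo_* / DEMO_* names are made up.

```c
#include <assert.h>

/* Hypothetical subsystem list; the second field says how each is built. */
#define DEMO_SUBSYS_LIST(X) \
	X(cpu, BUILTIN) \
	X(net_cls, MODULE) \
	X(memory, BUILTIN)

/* Pass 1 emits only builtin entries, pass 2 only modular ones. */
#define DEMO_EMIT_BUILTIN_BUILTIN(name) name##_demo_id,
#define DEMO_EMIT_BUILTIN_MODULE(name)
#define DEMO_EMIT_MODULE_BUILTIN(name)
#define DEMO_EMIT_MODULE_MODULE(name) name##_demo_id,
#define DEMO_PASS_BUILTIN(name, kind) DEMO_EMIT_BUILTIN_##kind(name)
#define DEMO_PASS_MODULE(name, kind) DEMO_EMIT_MODULE_##kind(name)

enum demo_subsys_id {
	DEMO_SUBSYS_LIST(DEMO_PASS_BUILTIN)
	DEMO_BUILTIN_SUBSYS_COUNT,
	/*
	 * Rewind the implicit counter by one so that the first modular
	 * id equals DEMO_BUILTIN_SUBSYS_COUNT instead of skipping a slot.
	 */
	DEMO_TEMP_PLACEHOLDER = DEMO_BUILTIN_SUBSYS_COUNT - 1,
	DEMO_SUBSYS_LIST(DEMO_PASS_MODULE)
	DEMO_SUBSYS_COUNT,
};

/* Ids below the builtin count belong to the immutable builtin section. */
int demo_is_builtin(enum demo_subsys_id id)
{
	return id < DEMO_BUILTIN_SUBSYS_COUNT;
}

void demo_layout_walkthrough(void)
{
	/* Builtin ids come first, regardless of their order in the list. */
	assert(cpu_demo_id == 0);
	assert(memory_demo_id == 1);
	assert(DEMO_BUILTIN_SUBSYS_COUNT == 2);
	/* Modular ids follow directly, with no gap. */
	assert(net_cls_demo_id == 2);
	assert(DEMO_SUBSYS_COUNT == 3);
}
```

With this layout, a loop bounded by the builtin count never reads a slot
that module unload can clear, which is what the cgroup_post_fork() and
cgroup_exit() hunks rely on.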
include/linux/cgroup.h | 11 +++++++++--
kernel/cgroup.c | 32 ++++++++++++++++++--------------
2 files changed, 27 insertions(+), 16 deletions(-)
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 75c6ec1..3ac6bb0 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -46,12 +46,19 @@ extern const struct file_operations proc_cgroup_operations;
/* Define the enumeration of all builtin cgroup subsystems */
#define SUBSYS(_x) _x ## _subsys_id,
-#define IS_SUBSYS_ENABLED(option) IS_ENABLED(option)
enum cgroup_subsys_id {
+#define IS_SUBSYS_ENABLED(option) IS_BUILTIN(option)
#include <linux/cgroup_subsys.h>
+#undef IS_SUBSYS_ENABLED
+ CGROUP_BUILTIN_SUBSYS_COUNT,
+
+ __CGROUP_SUBSYS_TEMP_PLACEHOLDER = CGROUP_BUILTIN_SUBSYS_COUNT - 1,
+
+#define IS_SUBSYS_ENABLED(option) IS_MODULE(option)
+#include <linux/cgroup_subsys.h>
+#undef IS_SUBSYS_ENABLED
CGROUP_SUBSYS_COUNT,
};
-#undef IS_SUBSYS_ENABLED
#undef SUBSYS
/* Per-subsystem/per-cgroup state maintained by the system. */
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index f4554cc..29273db 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4944,17 +4944,17 @@ void cgroup_post_fork(struct task_struct *child)
* and addition to css_set.
*/
if (need_forkexit_callback) {
- for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
+ /*
+ * fork/exit callbacks are supported only for builtin
+ * subsystems, and the builtin section of the subsys
+ * array is immutable, so we don't need to lock the
+ * subsys array here. On the other hand, modular section
+ * of the array can be freed at module unload, so we
+ * can't touch that.
+ */
+ for (i = 0; i < CGROUP_BUILTIN_SUBSYS_COUNT; i++) {
struct cgroup_subsys *ss = subsys[i];
- /*
- * fork/exit callbacks are supported only for
- * builtin subsystems and we don't need further
- * synchronization as they never go away.
- */
- if (!ss || ss->module)
- continue;
-
if (ss->fork)
ss->fork(child);
}
@@ -5019,13 +5019,17 @@ void cgroup_exit(struct task_struct *tsk, int run_callbacks)
tsk->cgroups = &init_css_set;
if (run_callbacks && need_forkexit_callback) {
- for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
+ /*
+ * fork/exit callbacks are supported only for builtin
+ * subsystems, and the builtin section of the subsys
+ * array is immutable, so we don't need to lock the
+ * subsys array here. On the other hand, modular section
+ * of the array can be freed at module unload, so we
+ * can't touch that.
+ */
+ for (i = 0; i < CGROUP_BUILTIN_SUBSYS_COUNT; i++) {
struct cgroup_subsys *ss = subsys[i];
- /* modular subsystems can't use callbacks */
- if (!ss || ss->module)
- continue;
-
if (ss->exit) {
struct cgroup *old_cgrp =
rcu_dereference_raw(cg->subsys[i])->cgroup;
--
1.8.0.2
* Re: [PATCH 2/2] cgroup: avoid accessing modular cgroup subsys structure without locking
2013-03-01 7:06 ` [PATCH 2/2] cgroup: avoid accessing modular cgroup subsys structure without locking Li Zefan
@ 2013-03-04 18:04 ` Tejun Heo
0 siblings, 0 replies; 4+ messages in thread
From: Tejun Heo @ 2013-03-04 18:04 UTC
To: Li Zefan; +Cc: LKML, cgroups
Hello, Li.
On Fri, Mar 01, 2013 at 03:06:36PM +0800, Li Zefan wrote:
> /* Define the enumeration of all builtin cgroup subsystems */
> #define SUBSYS(_x) _x ## _subsys_id,
> -#define IS_SUBSYS_ENABLED(option) IS_ENABLED(option)
> enum cgroup_subsys_id {
> +#define IS_SUBSYS_ENABLED(option) IS_BUILTIN(option)
> #include <linux/cgroup_subsys.h>
> +#undef IS_SUBSYS_ENABLED
> + CGROUP_BUILTIN_SUBSYS_COUNT,
> +
> + __CGROUP_SUBSYS_TEMP_PLACEHOLDER = CGROUP_BUILTIN_SUBSYS_COUNT - 1,
> +
> +#define IS_SUBSYS_ENABLED(option) IS_MODULE(option)
> +#include <linux/cgroup_subsys.h>
> +#undef IS_SUBSYS_ENABLED
> CGROUP_SUBSYS_COUNT,
> };
> -#undef IS_SUBSYS_ENABLED
> #undef SUBSYS
Arghh.... can we at least have a comment explaining what we're doing
here? It's ugly and confusing.
> @@ -5019,13 +5019,17 @@ void cgroup_exit(struct task_struct *tsk, int run_callbacks)
> tsk->cgroups = &init_css_set;
>
> if (run_callbacks && need_forkexit_callback) {
> - for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
> + /*
> + * fork/exit callbacks are supported only for builtin
> + * subsystems, and the builtin section of the subsys
> + * array is immutable, so we don't need to lock the
> + * subsys array here. On the other hand, modular section
> + * of the array can be freed at module unload, so we
> + * can't touch that.
> + */
> + for (i = 0; i < CGROUP_BUILTIN_SUBSYS_COUNT; i++) {
Probably enough to say "fork/exit callbacks are supported only for
builtin subsys, see cgroup_for() for details"?
Thanks.
--
tejun
* Re: [PATCH 1/2] cgroup: no need to check css refs for release notification
2013-03-01 7:06 [PATCH 1/2] cgroup: no need to check css refs for release notification Li Zefan
2013-03-01 7:06 ` [PATCH 2/2] cgroup: avoid accessing modular cgroup subsys structure without locking Li Zefan
@ 2013-03-04 18:05 ` Tejun Heo
1 sibling, 0 replies; 4+ messages in thread
From: Tejun Heo @ 2013-03-04 18:05 UTC
To: Li Zefan; +Cc: LKML, cgroups
On Fri, Mar 01, 2013 at 03:06:07PM +0800, Li Zefan wrote:
> We no longer fail rmdir() when there're still css refs, so we don't
> need to check css refs in check_for_release().
>
> This also avoids a bug. cgroup_has_css_refs() accesses subsys[i]
> without cgroup_mutex, so it can race with cgroup_unload_subsys().
>
> cgroup_has_css_refs()
> ...
> if (ss == NULL || ss->root != cgrp->root)
>
> If ss points to net_cls_subsys and the cls_cgroup module is unloaded
> right after the former check but before the latter, the memory in which
> net_cls_subsys resides has become invalid.
>
> Signed-off-by: Li Zefan <lizefan@huawei.com>
Applied to cgroup/for-3.10.
Thanks.
--
tejun