* [PATCH cgroup/for-3.11 1/3] cgroup: fix RCU accesses to task->cgroups
@ 2013-06-21 22:51 Tejun Heo
[not found] ` <20130621225116.GC3949-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
0 siblings, 1 reply; 11+ messages in thread
From: Tejun Heo @ 2013-06-21 22:51 UTC (permalink / raw)
To: Li Zefan
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA, Fengguang Wu
task->cgroups is a RCU pointer pointing to struct css_set. A task
switches to a different css_set on cgroup migration but a css_set
doesn't change once created and its pointers to cgroup_subsys_states
aren't RCU protected.
task_subsys_state[_check]() is the macro to acquire css given a task
and subsys_id pair. It RCU-dereferences task->cgroups->subsys[] not
task->cgroups, so the RCU pointer task->cgroups ends up being
dereferenced without read_barrier_depends() after it. It's broken.
Fix it by introducing task_css_set[_check]() which does
RCU-dereference on task->cgroups. task_subsys_state[_check]() is
reimplemented to directly dereference ->subsys[] of the css_set
returned from task_css_set[_check]().
This removes some of sparse RCU warnings in cgroup.
Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Reported-by: Fengguang Wu <fengguang.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
---
Hello,
Three RCU fixe patches. The first one fixes an actual bug. The other
two add missing annoations so that sparse doesn't generate spurious
RCU address space warnings.
Thanks!
include/linux/cgroup.h | 58 ++++++++++++++++++++++++++++++++++++++++---------
1 file changed, 48 insertions(+), 10 deletions(-)
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -638,22 +638,60 @@ static inline struct cgroup_subsys_state
return cgrp->subsys[subsys_id];
}
-/*
- * function to get the cgroup_subsys_state which allows for extra
- * rcu_dereference_check() conditions, such as locks used during the
- * cgroup_subsys::attach() methods.
+/**
+ * task_css_set_check - obtain a task's css_set with extra access conditions
+ * @task: the task to obtain css_set for
+ * @__c: extra condition expression to be passed to rcu_dereference_check()
+ *
+ * A task's css_set is RCU protected, initialized and exited while holding
+ * task_lock(), and can only be modified while holding both cgroup_mutex
+ * and task_lock() while the task is alive. This macro verifies that the
+ * caller is inside proper critical section and returns @task's css_set.
+ *
+ * The caller can also specify additional allowed conditions via @__c, such
+ * as locks used during the cgroup_subsys::attach() methods.
*/
#ifdef CONFIG_PROVE_RCU
extern struct mutex cgroup_mutex;
-#define task_subsys_state_check(task, subsys_id, __c) \
- rcu_dereference_check((task)->cgroups->subsys[(subsys_id)], \
- lockdep_is_held(&(task)->alloc_lock) || \
- lockdep_is_held(&cgroup_mutex) || (__c))
+#define task_css_set_check(task, __c) \
+ rcu_dereference_check((task)->cgroups, \
+ lockdep_is_held(&(task)->alloc_lock) || \
+ lockdep_is_held(&cgroup_mutex) || (__c))
#else
-#define task_subsys_state_check(task, subsys_id, __c) \
- rcu_dereference((task)->cgroups->subsys[(subsys_id)])
+#define task_css_set_check(task, __c) \
+ rcu_dereference_raw((task)->cgroups
#endif
+/**
+ * task_subsys_state_check - obtain css for (task, subsys) w/ extra access conds
+ * @task: the target task
+ * @subsys_id: the target subsystem ID
+ * @__c: extra condition expression to be passed to rcu_dereference_check()
+ *
+ * Return the cgroup_subsys_state for the (@task, @subsys_id) pair. The
+ * synchronization rules are the same as task_css_set_check().
+ */
+#define task_subsys_state_check(task, subsys_id, __c) \
+ task_css_set_check((task), (__c))->subsys[(subsys_id)]
+
+/**
+ * task_css_set - obtain a task's css_set
+ * @task: the task to obtain css_set for
+ *
+ * See task_css_set_check().
+ */
+static inline struct css_set *task_css_set(struct task_struct *task)
+{
+ return task_css_set_check(task, false);
+}
+
+/**
+ * task_subsys_state - obtain css for (task, subsys)
+ * @task: the target task
+ * @subsys_id: the target subsystem ID
+ *
+ * See task_subsys_state_check().
+ */
static inline struct cgroup_subsys_state *
task_subsys_state(struct task_struct *task, int subsys_id)
{
^ permalink raw reply [flat|nested] 11+ messages in thread[parent not found: <20130621225116.GC3949-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>]
* [PATCH cgroup/for-3.11 2/3] cgroup: fix RCU accesses around task->cgroups [not found] ` <20130621225116.GC3949-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> @ 2013-06-21 22:52 ` Tejun Heo [not found] ` <20130621225204.GD3949-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> 2013-06-25 1:55 ` [PATCH cgroup/for-3.11 1/3] cgroup: fix RCU accesses to task->cgroups Li Zefan ` (2 subsequent siblings) 3 siblings, 1 reply; 11+ messages in thread From: Tejun Heo @ 2013-06-21 22:52 UTC (permalink / raw) To: Li Zefan Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Fengguang Wu There are several places in kernel/cgroup.c where task->cgroups is accessed and modified without going through proper RCU accessors. None is broken as they're all lock protected accesses; however, this still triggers sparse RCU address space warnings. * Consistently use task_css_set() for task->cgroups dereferencing. * Use RCU_INIT_POINTER() to clear task->cgroups to &init_css_set on exit. * Remove unnecessary rcu_dereference_raw() from cset->subsys[] dereference in cgroup_exit(). Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> Reported-by: Fengguang Wu <fengguang.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> --- kernel/cgroup.c | 24 +++++++++++++----------- 1 file changed, 13 insertions(+), 11 deletions(-) --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -704,7 +704,7 @@ static struct cgroup *task_cgroup_from_r * task can't change groups, so the only thing that can happen * is that it exits and its css is set back to init_css_set. */ - cset = task->cgroups; + cset = task_css_set(task); if (cset == &init_css_set) { res = &root->top_cgroup; } else { @@ -1948,7 +1948,7 @@ static void cgroup_task_migrate(struct c * css_set to init_css_set and dropping the old one. */ WARN_ON_ONCE(tsk->flags & PF_EXITING); - old_cset = tsk->cgroups; + old_cset = task_css_set(tsk); task_lock(tsk); rcu_assign_pointer(tsk->cgroups, new_cset); @@ -2071,8 +2071,11 @@ static int cgroup_attach_task(struct cgr * we use find_css_set, which allocates a new one if necessary. */ for (i = 0; i < group_size; i++) { + struct css_set *old_cset; + tc = flex_array_get(group, i); - tc->cg = find_css_set(tc->task->cgroups, cgrp); + old_cset = task_css_set(tc->task); + tc->cg = find_css_set(old_cset, cgrp); if (!tc->cg) { retval = -ENOMEM; goto out_put_css_set_refs; @@ -2989,7 +2992,7 @@ static void cgroup_enable_task_cg_lists( * entry won't be deleted though the process has exited. */ if (!(p->flags & PF_EXITING) && list_empty(&p->cg_list)) - list_add(&p->cg_list, &p->cgroups->tasks); + list_add(&p->cg_list, &task_css_set(p)->tasks); task_unlock(p); } while_each_thread(g, p); read_unlock(&tasklist_lock); @@ -5046,8 +5049,8 @@ static const struct file_operations proc void cgroup_fork(struct task_struct *child) { task_lock(current); + get_css_set(task_css_set(current)); child->cgroups = current->cgroups; - get_css_set(child->cgroups); task_unlock(current); INIT_LIST_HEAD(&child->cg_list); } @@ -5081,7 +5084,7 @@ void cgroup_post_fork(struct task_struct write_lock(&css_set_lock); task_lock(child); if (list_empty(&child->cg_list)) - list_add(&child->cg_list, &child->cgroups->tasks); + list_add(&child->cg_list, &task_css_set(child)->tasks); task_unlock(child); write_unlock(&css_set_lock); } @@ -5163,8 +5166,8 @@ void cgroup_exit(struct task_struct *tsk /* Reassign the task to the init_css_set. */ task_lock(tsk); - cset = tsk->cgroups; - tsk->cgroups = &init_css_set; + cset = task_css_set(tsk); + RCU_INIT_POINTER(tsk->cgroups, &init_css_set); if (run_callbacks && need_forkexit_callback) { /* @@ -5175,8 +5178,7 @@ void cgroup_exit(struct task_struct *tsk struct cgroup_subsys *ss = subsys[i]; if (ss->exit) { - struct cgroup *old_cgrp = - rcu_dereference_raw(cset->subsys[i])->cgroup; + struct cgroup *old_cgrp = cset->subsys[i]->cgroup; struct cgroup *cgrp = task_cgroup(tsk, i); ss->exit(cgrp, old_cgrp, tsk); } @@ -5546,7 +5548,7 @@ static u64 current_css_set_refcount_read u64 count; rcu_read_lock(); - count = atomic_read(¤t->cgroups->refcount); + count = atomic_read(&task_css_set(current)->refcount); rcu_read_unlock(); return count; } ^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <20130621225204.GD3949-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>]
* [PATCH cgroup/for-3.11 3/3] cgroup: always use RCU accessors for protected accesses [not found] ` <20130621225204.GD3949-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> @ 2013-06-21 22:52 ` Tejun Heo [not found] ` <20130621225233.GE3949-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> 2013-06-25 2:02 ` [PATCH cgroup/for-3.11 2/3] cgroup: fix RCU accesses around task->cgroups Li Zefan 2013-06-26 3:29 ` Li Zefan 2 siblings, 1 reply; 11+ messages in thread From: Tejun Heo @ 2013-06-21 22:52 UTC (permalink / raw) To: Li Zefan Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, cgroups-u79uwXL29TY76Z2rM5mHXA, Fengguang Wu kernel/cgroup.c still has places where a RCU pointer is set and accessed directly without going through RCU_INIT_POINTER() or rcu_dereference_protected(). They're all properly protected accesses so nothing is broken but it leads to spurious sparse RCU address space warnings. Substitute direct accesses with RCU_INIT_POINTER() and rcu_dereference_protected(). Note that %true is specified as the extra condition for all derference updates. This isn't ideal as all it does is suppressing warning without actually policing synchronization rules; however, most are scheduled to be removed pretty soon along with css_id itself, so no reason to be more elaborate. Combined with the previous changes, this removes all RCU related sparse warnings from cgroup. Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> Reported-by: Fengguang Wu <fengguang.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> --- kernel/cgroup.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-) --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -1416,7 +1416,7 @@ static void init_cgroup_root(struct cgro INIT_LIST_HEAD(&root->root_list); root->number_of_cgroups = 1; cgrp->root = root; - cgrp->name = &root_cgroup_name; + RCU_INIT_POINTER(cgrp->name, &root_cgroup_name); init_cgroup_housekeeping(cgrp); } @@ -2535,7 +2535,7 @@ static int cgroup_rename(struct inode *o return ret; } - old_name = cgrp->name; + old_name = rcu_dereference_protected(cgrp->name, true); rcu_assign_pointer(cgrp->name, name); kfree_rcu(old_name, rcu_head); @@ -4154,13 +4154,15 @@ static int cgroup_populate_dir(struct cg /* This cgroup is ready now */ for_each_subsys(cgrp->root, ss) { struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id]; + struct css_id *id = rcu_dereference_protected(css->id, true); + /* * Update id->css pointer and make this css visible from * CSS ID functions. This pointer will be dereferened * from RCU-read-side without locks. */ - if (css->id) - rcu_assign_pointer(css->id->css, css); + if (id) + rcu_assign_pointer(id->css, css); } return 0; @@ -4837,7 +4839,7 @@ int __init cgroup_init_early(void) css_set_count = 1; init_cgroup_root(&rootnode); root_count = 1; - init_task.cgroups = &init_css_set; + RCU_INIT_POINTER(init_task.cgroups, &init_css_set); init_cgrp_cset_link.cset = &init_css_set; init_cgrp_cset_link.cgrp = dummytop; @@ -5371,7 +5373,8 @@ bool css_is_ancestor(struct cgroup_subsy void free_css_id(struct cgroup_subsys *ss, struct cgroup_subsys_state *css) { - struct css_id *id = css->id; + struct css_id *id = rcu_dereference_protected(css->id, true); + /* When this is called before css_id initialization, id can be NULL */ if (!id) return; @@ -5437,8 +5440,8 @@ static int __init_or_module cgroup_init_ return PTR_ERR(newid); newid->stack[0] = newid->id; - newid->css = rootcss; - rootcss->id = newid; + RCU_INIT_POINTER(newid->css, rootcss); + RCU_INIT_POINTER(rootcss->id, newid); return 0; } @@ -5452,7 +5455,7 @@ static int alloc_css_id(struct cgroup_su subsys_id = ss->subsys_id; parent_css = parent->subsys[subsys_id]; child_css = child->subsys[subsys_id]; - parent_id = parent_css->id; + parent_id = rcu_dereference_protected(parent_css->id, true); depth = parent_id->depth + 1; child_id = get_new_cssid(ss, depth); ^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <20130621225233.GE3949-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>]
* Re: [PATCH cgroup/for-3.11 3/3] cgroup: always use RCU accessors for protected accesses [not found] ` <20130621225233.GE3949-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> @ 2013-06-25 2:04 ` Li Zefan 0 siblings, 0 replies; 11+ messages in thread From: Li Zefan @ 2013-06-25 2:04 UTC (permalink / raw) To: Tejun Heo Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Fengguang Wu On 2013/6/22 6:52, Tejun Heo wrote: > kernel/cgroup.c still has places where a RCU pointer is set and > accessed directly without going through RCU_INIT_POINTER() or > rcu_dereference_protected(). They're all properly protected accesses > so nothing is broken but it leads to spurious sparse RCU address space > warnings. > > Substitute direct accesses with RCU_INIT_POINTER() and > rcu_dereference_protected(). Note that %true is specified as the > extra condition for all derference updates. This isn't ideal as all > it does is suppressing warning without actually policing > synchronization rules; however, most are scheduled to be removed > pretty soon along with css_id itself, so no reason to be more > elaborate. > > Combined with the previous changes, this removes all RCU related > sparse warnings from cgroup. > > Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > Reported-by: Fengguang Wu <fengguang.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> > --- > kernel/cgroup.c | 21 ++++++++++++--------- > 1 file changed, 12 insertions(+), 9 deletions(-) > Acked-by; Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH cgroup/for-3.11 2/3] cgroup: fix RCU accesses around task->cgroups [not found] ` <20130621225204.GD3949-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> 2013-06-21 22:52 ` [PATCH cgroup/for-3.11 3/3] cgroup: always use RCU accessors for protected accesses Tejun Heo @ 2013-06-25 2:02 ` Li Zefan [not found] ` <51C8FA3E.9020104-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> 2013-06-26 3:29 ` Li Zefan 2 siblings, 1 reply; 11+ messages in thread From: Li Zefan @ 2013-06-25 2:02 UTC (permalink / raw) To: Tejun Heo Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Fengguang Wu > @@ -5046,8 +5049,8 @@ static const struct file_operations proc > void cgroup_fork(struct task_struct *child) > { > task_lock(current); > + get_css_set(task_css_set(current)); > child->cgroups = current->cgroups; While we use RCU_INIT_POINTER() in cgroup_exit(), we don't need to use it here? > - get_css_set(child->cgroups); > task_unlock(current); > INIT_LIST_HEAD(&child->cg_list); > } > @@ -5081,7 +5084,7 @@ void cgroup_post_fork(struct task_struct > write_lock(&css_set_lock); > task_lock(child); > if (list_empty(&child->cg_list)) > - list_add(&child->cg_list, &child->cgroups->tasks); > + list_add(&child->cg_list, &task_css_set(child)->tasks); > task_unlock(child); > write_unlock(&css_set_lock); > } > @@ -5163,8 +5166,8 @@ void cgroup_exit(struct task_struct *tsk > > /* Reassign the task to the init_css_set. */ > task_lock(tsk); > - cset = tsk->cgroups; > - tsk->cgroups = &init_css_set; > + cset = task_css_set(tsk); > + RCU_INIT_POINTER(tsk->cgroups, &init_css_set); > > if (run_callbacks && need_forkexit_callback) { > /* ^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <51C8FA3E.9020104-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH cgroup/for-3.11 2/3] cgroup: fix RCU accesses around task->cgroups [not found] ` <51C8FA3E.9020104-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> @ 2013-06-25 18:50 ` Tejun Heo 0 siblings, 0 replies; 11+ messages in thread From: Tejun Heo @ 2013-06-25 18:50 UTC (permalink / raw) To: Li Zefan Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, Tejun Heo, Fengguang Wu, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA On Tue, Jun 25, 2013 at 10:02:38AM +0800, Li Zefan wrote: > > @@ -5046,8 +5049,8 @@ static const struct file_operations proc > > void cgroup_fork(struct task_struct *child) > > { > > task_lock(current); > > + get_css_set(task_css_set(current)); > > child->cgroups = current->cgroups; > > While we use RCU_INIT_POINTER() in cgroup_exit(), we don't need to use it here? Yeap, because both are RCU pointers. There's no cross (sparse) address space assignment going on. Thanks. -- tejun ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH cgroup/for-3.11 2/3] cgroup: fix RCU accesses around task->cgroups [not found] ` <20130621225204.GD3949-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> 2013-06-21 22:52 ` [PATCH cgroup/for-3.11 3/3] cgroup: always use RCU accessors for protected accesses Tejun Heo 2013-06-25 2:02 ` [PATCH cgroup/for-3.11 2/3] cgroup: fix RCU accesses around task->cgroups Li Zefan @ 2013-06-26 3:29 ` Li Zefan 2 siblings, 0 replies; 11+ messages in thread From: Li Zefan @ 2013-06-26 3:29 UTC (permalink / raw) To: Tejun Heo Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Fengguang Wu On 2013/6/22 6:52, Tejun Heo wrote: > There are several places in kernel/cgroup.c where task->cgroups is > accessed and modified without going through proper RCU accessors. > None is broken as they're all lock protected accesses; however, this > still triggers sparse RCU address space warnings. > > * Consistently use task_css_set() for task->cgroups dereferencing. > > * Use RCU_INIT_POINTER() to clear task->cgroups to &init_css_set on > exit. > > * Remove unnecessary rcu_dereference_raw() from cset->subsys[] > dereference in cgroup_exit(). > > Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > Reported-by: Fengguang Wu <fengguang.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> Acked-by: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH cgroup/for-3.11 1/3] cgroup: fix RCU accesses to task->cgroups [not found] ` <20130621225116.GC3949-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> 2013-06-21 22:52 ` [PATCH cgroup/for-3.11 2/3] cgroup: fix RCU accesses around task->cgroups Tejun Heo @ 2013-06-25 1:55 ` Li Zefan 2013-06-25 18:48 ` [PATCH v2 " Tejun Heo 2013-06-26 17:49 ` [PATCH " Tejun Heo 3 siblings, 0 replies; 11+ messages in thread From: Li Zefan @ 2013-06-25 1:55 UTC (permalink / raw) To: Tejun Heo Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Fengguang Wu On 2013/6/22 6:51, Tejun Heo wrote: > task->cgroups is a RCU pointer pointing to struct css_set. A task > switches to a different css_set on cgroup migration but a css_set > doesn't change once created and its pointers to cgroup_subsys_states > aren't RCU protected. > > task_subsys_state[_check]() is the macro to acquire css given a task > and subsys_id pair. It RCU-dereferences task->cgroups->subsys[] not > task->cgroups, so the RCU pointer task->cgroups ends up being > dereferenced without read_barrier_depends() after it. It's broken. > > Fix it by introducing task_css_set[_check]() which does > RCU-dereference on task->cgroups. task_subsys_state[_check]() is > reimplemented to directly dereference ->subsys[] of the css_set > returned from task_css_set[_check](). > > This removes some of sparse RCU warnings in cgroup. > > Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > Reported-by: Fengguang Wu <fengguang.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> > Cc: stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > --- > Hello, > > Three RCU fixe patches. The first one fixes an actual bug. The other > two add missing annoations so that sparse doesn't generate spurious > RCU address space warnings. > > Thanks! > > include/linux/cgroup.h | 58 ++++++++++++++++++++++++++++++++++++++++--------- > 1 file changed, 48 insertions(+), 10 deletions(-) > > --- a/include/linux/cgroup.h > +++ b/include/linux/cgroup.h > @@ -638,22 +638,60 @@ static inline struct cgroup_subsys_state > return cgrp->subsys[subsys_id]; > } > > -/* > - * function to get the cgroup_subsys_state which allows for extra > - * rcu_dereference_check() conditions, such as locks used during the > - * cgroup_subsys::attach() methods. > +/** > + * task_css_set_check - obtain a task's css_set with extra access conditions > + * @task: the task to obtain css_set for > + * @__c: extra condition expression to be passed to rcu_dereference_check() > + * > + * A task's css_set is RCU protected, initialized and exited while holding > + * task_lock(), and can only be modified while holding both cgroup_mutex > + * and task_lock() while the task is alive. This macro verifies that the > + * caller is inside proper critical section and returns @task's css_set. > + * > + * The caller can also specify additional allowed conditions via @__c, such > + * as locks used during the cgroup_subsys::attach() methods. > */ > #ifdef CONFIG_PROVE_RCU > extern struct mutex cgroup_mutex; > -#define task_subsys_state_check(task, subsys_id, __c) \ > - rcu_dereference_check((task)->cgroups->subsys[(subsys_id)], \ > - lockdep_is_held(&(task)->alloc_lock) || \ > - lockdep_is_held(&cgroup_mutex) || (__c)) > +#define task_css_set_check(task, __c) \ > + rcu_dereference_check((task)->cgroups, \ > + lockdep_is_held(&(task)->alloc_lock) || \ > + lockdep_is_held(&cgroup_mutex) || (__c)) > #else > -#define task_subsys_state_check(task, subsys_id, __c) \ > - rcu_dereference((task)->cgroups->subsys[(subsys_id)]) > +#define task_css_set_check(task, __c) \ > + rcu_dereference_raw((task)->cgroups parenthesis unmatched. and why not just use rcu_dereference()? I guess it should be equivalent to rcu_derference_raw() if CONFIG_PROVE_RCU=n ? > #endif > ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v2 cgroup/for-3.11 1/3] cgroup: fix RCU accesses to task->cgroups [not found] ` <20130621225116.GC3949-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> 2013-06-21 22:52 ` [PATCH cgroup/for-3.11 2/3] cgroup: fix RCU accesses around task->cgroups Tejun Heo 2013-06-25 1:55 ` [PATCH cgroup/for-3.11 1/3] cgroup: fix RCU accesses to task->cgroups Li Zefan @ 2013-06-25 18:48 ` Tejun Heo [not found] ` <20130625184832.GC20051-9pTldWuhBndy/B6EtB590w@public.gmane.org> 2013-06-26 17:49 ` [PATCH " Tejun Heo 3 siblings, 1 reply; 11+ messages in thread From: Tejun Heo @ 2013-06-25 18:48 UTC (permalink / raw) To: Tejun Heo Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Fengguang Wu task->cgroups is a RCU pointer pointing to struct css_set. A task switches to a different css_set on cgroup migration but a css_set doesn't change once created and its pointers to cgroup_subsys_states aren't RCU protected. task_subsys_state[_check]() is the macro to acquire css given a task and subsys_id pair. It RCU-dereferences task->cgroups->subsys[] not task->cgroups, so the RCU pointer task->cgroups ends up being dereferenced without read_barrier_depends() after it. It's broken. Fix it by introducing task_css_set[_check]() which does RCU-dereference on task->cgroups. task_subsys_state[_check]() is reimplemented to directly dereference ->subsys[] of the css_set returned from task_css_set[_check](). This removes some of sparse RCU warnings in cgroup. v2: Fixed unbalanced parenthsis and there's no need to use rcu_dereference_raw() when !CONFIG_PROVE_RCU. Both spotted by Li. Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> Reported-by: Fengguang Wu <fengguang.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> Cc: stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org --- include/linux/cgroup.h | 58 ++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 48 insertions(+), 10 deletions(-) --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -634,22 +634,60 @@ static inline struct cgroup_subsys_state return cgrp->subsys[subsys_id]; } -/* - * function to get the cgroup_subsys_state which allows for extra - * rcu_dereference_check() conditions, such as locks used during the - * cgroup_subsys::attach() methods. +/** + * task_css_set_check - obtain a task's css_set with extra access conditions + * @task: the task to obtain css_set for + * @__c: extra condition expression to be passed to rcu_dereference_check() + * + * A task's css_set is RCU protected, initialized and exited while holding + * task_lock(), and can only be modified while holding both cgroup_mutex + * and task_lock() while the task is alive. This macro verifies that the + * caller is inside proper critical section and returns @task's css_set. + * + * The caller can also specify additional allowed conditions via @__c, such + * as locks used during the cgroup_subsys::attach() methods. */ #ifdef CONFIG_PROVE_RCU extern struct mutex cgroup_mutex; -#define task_subsys_state_check(task, subsys_id, __c) \ - rcu_dereference_check((task)->cgroups->subsys[(subsys_id)], \ - lockdep_is_held(&(task)->alloc_lock) || \ - lockdep_is_held(&cgroup_mutex) || (__c)) +#define task_css_set_check(task, __c) \ + rcu_dereference_check((task)->cgroups, \ + lockdep_is_held(&(task)->alloc_lock) || \ + lockdep_is_held(&cgroup_mutex) || (__c)) #else -#define task_subsys_state_check(task, subsys_id, __c) \ - rcu_dereference((task)->cgroups->subsys[(subsys_id)]) +#define task_css_set_check(task, __c) \ + rcu_dereference((task)->cgroups) #endif +/** + * task_subsys_state_check - obtain css for (task, subsys) w/ extra access conds + * @task: the target task + * @subsys_id: the target subsystem ID + * @__c: extra condition expression to be passed to rcu_dereference_check() + * + * Return the cgroup_subsys_state for the (@task, @subsys_id) pair. The + * synchronization rules are the same as task_css_set_check(). + */ +#define task_subsys_state_check(task, subsys_id, __c) \ + task_css_set_check((task), (__c))->subsys[(subsys_id)] + +/** + * task_css_set - obtain a task's css_set + * @task: the task to obtain css_set for + * + * See task_css_set_check(). + */ +static inline struct css_set *task_css_set(struct task_struct *task) +{ + return task_css_set_check(task, false); +} + +/** + * task_subsys_state - obtain css for (task, subsys) + * @task: the target task + * @subsys_id: the target subsystem ID + * + * See task_subsys_state_check(). + */ static inline struct cgroup_subsys_state * task_subsys_state(struct task_struct *task, int subsys_id) { ^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <20130625184832.GC20051-9pTldWuhBndy/B6EtB590w@public.gmane.org>]
* Re: [PATCH v2 cgroup/for-3.11 1/3] cgroup: fix RCU accesses to task->cgroups [not found] ` <20130625184832.GC20051-9pTldWuhBndy/B6EtB590w@public.gmane.org> @ 2013-06-26 3:28 ` Li Zefan 0 siblings, 0 replies; 11+ messages in thread From: Li Zefan @ 2013-06-26 3:28 UTC (permalink / raw) To: Tejun Heo Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, Tejun Heo, Fengguang Wu, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA On 2013/6/26 2:48, Tejun Heo wrote: > task->cgroups is a RCU pointer pointing to struct css_set. A task > switches to a different css_set on cgroup migration but a css_set > doesn't change once created and its pointers to cgroup_subsys_states > aren't RCU protected. > > task_subsys_state[_check]() is the macro to acquire css given a task > and subsys_id pair. It RCU-dereferences task->cgroups->subsys[] not > task->cgroups, so the RCU pointer task->cgroups ends up being > dereferenced without read_barrier_depends() after it. It's broken. > > Fix it by introducing task_css_set[_check]() which does > RCU-dereference on task->cgroups. task_subsys_state[_check]() is > reimplemented to directly dereference ->subsys[] of the css_set > returned from task_css_set[_check](). > > This removes some of sparse RCU warnings in cgroup. > > v2: Fixed unbalanced parenthsis and there's no need to use > rcu_dereference_raw() when !CONFIG_PROVE_RCU. Both spotted by Li. > > Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > Reported-by: Fengguang Wu <fengguang.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> > Cc: stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org Acked-by: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH cgroup/for-3.11 1/3] cgroup: fix RCU accesses to task->cgroups [not found] ` <20130621225116.GC3949-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> ` (2 preceding siblings ...) 2013-06-25 18:48 ` [PATCH v2 " Tejun Heo @ 2013-06-26 17:49 ` Tejun Heo 3 siblings, 0 replies; 11+ messages in thread From: Tejun Heo @ 2013-06-26 17:49 UTC (permalink / raw) To: Tejun Heo Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Fengguang Wu On Fri, Jun 21, 2013 at 03:51:17PM -0700, Tejun Heo wrote: > task->cgroups is a RCU pointer pointing to struct css_set. A task > switches to a different css_set on cgroup migration but a css_set > doesn't change once created and its pointers to cgroup_subsys_states > aren't RCU protected. > > task_subsys_state[_check]() is the macro to acquire css given a task > and subsys_id pair. It RCU-dereferences task->cgroups->subsys[] not > task->cgroups, so the RCU pointer task->cgroups ends up being > dereferenced without read_barrier_depends() after it. It's broken. > > Fix it by introducing task_css_set[_check]() which does > RCU-dereference on task->cgroups. task_subsys_state[_check]() is > reimplemented to directly dereference ->subsys[] of the css_set > returned from task_css_set[_check](). > > This removes some of sparse RCU warnings in cgroup. > > Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > Reported-by: Fengguang Wu <fengguang.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> > Cc: stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org Applied 1-3 to cgroup/for-3.11. Thanks. -- tejun ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2013-06-26 17:49 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-06-21 22:51 [PATCH cgroup/for-3.11 1/3] cgroup: fix RCU accesses to task->cgroups Tejun Heo
[not found] ` <20130621225116.GC3949-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2013-06-21 22:52 ` [PATCH cgroup/for-3.11 2/3] cgroup: fix RCU accesses around task->cgroups Tejun Heo
[not found] ` <20130621225204.GD3949-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2013-06-21 22:52 ` [PATCH cgroup/for-3.11 3/3] cgroup: always use RCU accessors for protected accesses Tejun Heo
[not found] ` <20130621225233.GE3949-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2013-06-25 2:04 ` Li Zefan
2013-06-25 2:02 ` [PATCH cgroup/for-3.11 2/3] cgroup: fix RCU accesses around task->cgroups Li Zefan
[not found] ` <51C8FA3E.9020104-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2013-06-25 18:50 ` Tejun Heo
2013-06-26 3:29 ` Li Zefan
2013-06-25 1:55 ` [PATCH cgroup/for-3.11 1/3] cgroup: fix RCU accesses to task->cgroups Li Zefan
2013-06-25 18:48 ` [PATCH v2 " Tejun Heo
[not found] ` <20130625184832.GC20051-9pTldWuhBndy/B6EtB590w@public.gmane.org>
2013-06-26 3:28 ` Li Zefan
2013-06-26 17:49 ` [PATCH " Tejun Heo
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).