* [PATCHSET v2 cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates
@ 2014-05-09 19:32 Tejun Heo
2014-05-09 19:32 ` [PATCH 1/7] cgroup: fix offlining child waiting in cgroup_subtree_control_write() Tejun Heo
` (8 more replies)
0 siblings, 9 replies; 15+ messages in thread
From: Tejun Heo @ 2014-05-09 19:32 UTC (permalink / raw)
To: lizefan; +Cc: cgroups, linux-kernel
Hello,
Changes from the last take[L] are,
* 0002, 0003 and 0007 added.
* Other patches are refreshed without content change.
This patchset contains the following seven patches.
0001-cgroup-fix-offlining-child-waiting-in-cgroup_subtree.patch
0002-cgroup-cgroup_idr_lock-should-be-bh.patch
0003-cgroup-css_release-shouldn-t-clear-cgroup-subsys.patch
0004-cgroup-update-and-fix-parsing-of-cgroup.subtree_cont.patch
0005-cgroup-use-restart_syscall-for-retries-after-offline.patch
0006-cgroup-use-release_agent_path_lock-in-cgroup_release.patch
0007-cgroup-rename-css_tryget-to-css_tryget_online.patch
0001 fixes two bugs in cgroup_subtree_control_write().
0004 fixes and makes subtree_control parsing stricter.
0005 simplifies cgroup_substree_control_write() retry path by using
restart_syscall().
0006 makes cgroup_release_agent_show() use release_path_lock. The
original conversion missed this one.
0007 renames css_tryget() to css_tryget_online(). This patch was
posted separately before - [1] - and acked by Michal and Johannes.
This patchset is on top of cgroup/for-3.16 6e1a046e9458 ("Merge branch
'for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
into for-3.16") and available in the following git branch.
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-post-unified-updates-v2
diffstat follows.
block/blk-cgroup.c | 2 -
fs/bio.c | 2 -
include/linux/cgroup.h | 14 +++----
kernel/cgroup.c | 88 +++++++++++++++++++++++++------------------------
kernel/cpuset.c | 6 +--
kernel/events/core.c | 3 +
mm/hugetlb_cgroup.c | 2 -
mm/memcontrol.c | 46 +++++++++++++------------
8 files changed, 84 insertions(+), 79 deletions(-)
--
tejun
[L] http://lkml.kernel.org/g/1399377044-29873-1-git-send-email-tj@kernel.org
[1] http://lkml.kernel.org/g/20140507160429.GE26540@htj.dyndns.org
^ permalink raw reply [flat|nested] 15+ messages in thread* [PATCH 1/7] cgroup: fix offlining child waiting in cgroup_subtree_control_write() 2014-05-09 19:32 [PATCHSET v2 cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates Tejun Heo @ 2014-05-09 19:32 ` Tejun Heo 2014-05-09 19:32 ` [PATCH 2/7] cgroup: cgroup_idr_lock should be bh Tejun Heo ` (7 subsequent siblings) 8 siblings, 0 replies; 15+ messages in thread From: Tejun Heo @ 2014-05-09 19:32 UTC (permalink / raw) To: lizefan; +Cc: cgroups, linux-kernel, Tejun Heo cgroup_subtree_control_write() waits for offline to complete child-by-child before enabling a controller; however, it has a couple bugs. * It doesn't initialize the wait_queue_t. This can lead to infinite hang on the following schedule() among other things. * It forgets to pin the child before releasing cgroup_tree_mutex and performing schedule(). The child may already be gone by the time it wakes up and invokes finish_wait(). Pin the child being waited on. Signed-off-by: Tejun Heo <tj@kernel.org> --- kernel/cgroup.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/kernel/cgroup.c b/kernel/cgroup.c index 9db1a96..95fc66b 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -2594,16 +2594,18 @@ retry: * cases, wait till it's gone using offline_waitq. */ cgroup_for_each_live_child(child, cgrp) { - wait_queue_t wait; + DEFINE_WAIT(wait); if (!cgroup_css(child, ss)) continue; + cgroup_get(child); prepare_to_wait(&child->offline_waitq, &wait, TASK_UNINTERRUPTIBLE); mutex_unlock(&cgroup_tree_mutex); schedule(); finish_wait(&child->offline_waitq, &wait); + cgroup_put(child); goto retry; } -- 1.9.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 2/7] cgroup: cgroup_idr_lock should be bh 2014-05-09 19:32 [PATCHSET v2 cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates Tejun Heo 2014-05-09 19:32 ` [PATCH 1/7] cgroup: fix offlining child waiting in cgroup_subtree_control_write() Tejun Heo @ 2014-05-09 19:32 ` Tejun Heo 2014-05-09 19:32 ` [PATCH 3/7] cgroup: css_release() shouldn't clear cgroup->subsys[] Tejun Heo ` (6 subsequent siblings) 8 siblings, 0 replies; 15+ messages in thread From: Tejun Heo @ 2014-05-09 19:32 UTC (permalink / raw) To: lizefan; +Cc: cgroups, linux-kernel, Tejun Heo cgroup_idr_remove() can be invoked from bh leading to lockdep detecting possible AA deadlock (IN_BH/ON_BH). Make the lock bh-safe. Signed-off-by: Tejun Heo <tj@kernel.org> --- kernel/cgroup.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/kernel/cgroup.c b/kernel/cgroup.c index 95fc66b..e2ff925 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -203,9 +203,9 @@ static int cgroup_idr_alloc(struct idr *idr, void *ptr, int start, int end, int ret; idr_preload(gfp_mask); - spin_lock(&cgroup_idr_lock); + spin_lock_bh(&cgroup_idr_lock); ret = idr_alloc(idr, ptr, start, end, gfp_mask); - spin_unlock(&cgroup_idr_lock); + spin_unlock_bh(&cgroup_idr_lock); idr_preload_end(); return ret; } @@ -214,17 +214,17 @@ static void *cgroup_idr_replace(struct idr *idr, void *ptr, int id) { void *ret; - spin_lock(&cgroup_idr_lock); + spin_lock_bh(&cgroup_idr_lock); ret = idr_replace(idr, ptr, id); - spin_unlock(&cgroup_idr_lock); + spin_unlock_bh(&cgroup_idr_lock); return ret; } static void cgroup_idr_remove(struct idr *idr, int id) { - spin_lock(&cgroup_idr_lock); + spin_lock_bh(&cgroup_idr_lock); idr_remove(idr, id); - spin_unlock(&cgroup_idr_lock); + spin_unlock_bh(&cgroup_idr_lock); } /** -- 1.9.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 3/7] cgroup: css_release() shouldn't clear cgroup->subsys[] 2014-05-09 19:32 [PATCHSET v2 cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates Tejun Heo 2014-05-09 19:32 ` [PATCH 1/7] cgroup: fix offlining child waiting in cgroup_subtree_control_write() Tejun Heo 2014-05-09 19:32 ` [PATCH 2/7] cgroup: cgroup_idr_lock should be bh Tejun Heo @ 2014-05-09 19:32 ` Tejun Heo 2014-05-09 19:32 ` [PATCH 4/7] cgroup: update and fix parsing of "cgroup.subtree_control" Tejun Heo ` (5 subsequent siblings) 8 siblings, 0 replies; 15+ messages in thread From: Tejun Heo @ 2014-05-09 19:32 UTC (permalink / raw) To: lizefan; +Cc: cgroups, linux-kernel, Tejun Heo c1a71504e971 ("cgroup: don't recycle cgroup id until all csses' have been destroyed") made cgroup ID persist until a cgroup is released and add cgroup->subsys[] clearing to css_release() so that css_from_id() doesn't return a css which has already been released which happens before cgroup release; however, the right change here was updating offline_css() to clear cgroup->subsys[] which was done by e32978031016 ("cgroup: cgroup->subsys[] should be cleared after the css is offlined") instead of clearing it from css_release(). We're now clearing cgroup->subsys[] twice. This is okay for traditional hierarchies as a css's lifetime is the same as its cgroup's; however, this confuses unified hierarchy and turning on and off a controller repeatedly using "cgroup.subtree_control" can lead to an oops like the following which happens because cgroup->subsys[] is incorrectly cleared asynchronously by css_release(). BUG: unable to handle kernel NULL pointer dereference at 00000000000000 08 IP: [<ffffffff81130c11>] kill_css+0x21/0x1c0 PGD 1170d067 PUD f0ab067 PMD 0 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: CPU: 2 PID: 459 Comm: bash Not tainted 3.15.0-rc2-work+ #5 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff880009296710 ti: ffff88000e198000 task.ti: ffff88000e198000 RIP: 0010:[<ffffffff81130c11>] [<ffffffff81130c11>] kill_css+0x21/0x1c0 RSP: 0018:ffff88000e199dc8 EFLAGS: 00010202 RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000001 RDX: 0000000000000001 RSI: ffffffff8238a968 RDI: ffff880009296f98 RBP: ffff88000e199de0 R08: 0000000000000001 R09: 02b0000000000000 R10: 0000000000000000 R11: ffff880009296fc0 R12: 0000000000000001 R13: ffff88000db6fc58 R14: 0000000000000001 R15: ffff8800139dcc00 FS: 00007ff9160c5740(0000) GS:ffff88001fb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 0000000013947000 CR4: 00000000000006e0 Stack: ffff88000e199de0 ffffffff82389160 0000000000000001 ffff88000e199e80 ffffffff8113537f 0000000000000007 ffff88000e74af00 ffff88000e199e48 ffff880009296710 ffff88000db6fc00 ffffffff8239c100 0000000000000002 Call Trace: [<ffffffff8113537f>] cgroup_subtree_control_write+0x85f/0xa00 [<ffffffff8112fd18>] cgroup_file_write+0x38/0x1d0 [<ffffffff8126fc97>] kernfs_fop_write+0xe7/0x170 [<ffffffff811f2ae6>] vfs_write+0xb6/0x1c0 [<ffffffff811f35ad>] SyS_write+0x4d/0xc0 [<ffffffff81d0acd2>] system_call_fastpath+0x16/0x1b Code: 5c 41 5d 41 5e 41 5f 5d c3 90 0f 1f 44 00 00 55 48 89 e5 41 54 53 48 89 fb 48 83 ec 08 8b 05 37 ad 29 01 85 c0 0f 85 df 00 00 00 <48> 8b 43 08 48 8b 3b be 01 00 00 00 8b 48 5c d3 e6 e8 49 ff ff RIP [<ffffffff81130c11>] kill_css+0x21/0x1c0 RSP <ffff88000e199dc8> CR2: 0000000000000008 ---[ end trace e7aae1f877c4e1b4 ]--- Remove the unnecessary cgroup->subsys[] clearing from css_release(). Signed-off-by: Tejun Heo <tj@kernel.org> --- kernel/cgroup.c | 1 - 1 file changed, 1 deletion(-) diff --git a/kernel/cgroup.c b/kernel/cgroup.c index e2ff925..35daf89 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -4102,7 +4102,6 @@ static void css_release(struct percpu_ref *ref) container_of(ref, struct cgroup_subsys_state, refcnt); struct cgroup_subsys *ss = css->ss; - RCU_INIT_POINTER(css->cgroup->subsys[ss->id], NULL); cgroup_idr_remove(&ss->css_idr, css->id); call_rcu(&css->rcu_head, css_free_rcu_fn); -- 1.9.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 4/7] cgroup: update and fix parsing of "cgroup.subtree_control" 2014-05-09 19:32 [PATCHSET v2 cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates Tejun Heo ` (2 preceding siblings ...) 2014-05-09 19:32 ` [PATCH 3/7] cgroup: css_release() shouldn't clear cgroup->subsys[] Tejun Heo @ 2014-05-09 19:32 ` Tejun Heo 2014-05-13 3:51 ` Li Zefan 2014-05-09 19:32 ` [PATCH 5/7] cgroup: use restart_syscall() for retries after offline waits in cgroup_subtree_control_write() Tejun Heo ` (4 subsequent siblings) 8 siblings, 1 reply; 15+ messages in thread From: Tejun Heo @ 2014-05-09 19:32 UTC (permalink / raw) To: lizefan; +Cc: cgroups, linux-kernel, Tejun Heo I was confused that strsep() was equivalent to strtok_r() in skipping over consecutive delimiters. strsep() just splits at the first occurrence of one of the delimiters which makes the parsing very inflexible, which makes allowing multiple whitespace chars as delimters kinda moot. Let's just be consistently strict and require list of tokens separated by spaces. This is what Documentation/cgroups/unified-hierarchy.txt describes too. Also, parsing may access beyond the end of the string if the string ends with spaces or is zero-length. Make sure it skips zero-length tokens. Note that this also ensures that the parser doesn't puke on multiple consecutive spaces. v2: Add zero-length token skipping. Signed-off-by: Tejun Heo <tj@kernel.org> --- kernel/cgroup.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/kernel/cgroup.c b/kernel/cgroup.c index 35daf89..b81e7c0 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -2542,11 +2542,13 @@ static int cgroup_subtree_control_write(struct cgroup_subsys_state *dummy_css, int ssid, ret; /* - * Parse input - white space separated list of subsystem names - * prefixed with either + or -. + * Parse input - space separated list of subsystem names prefixed + * with either + or -. */ p = buffer; - while ((tok = strsep(&p, " \t\n"))) { + while ((tok = strsep(&p, " "))) { + if (tok[0] =='\0') + continue; for_each_subsys(ss, ssid) { if (ss->disabled || strcmp(tok + 1, ss->name)) continue; -- 1.9.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 4/7] cgroup: update and fix parsing of "cgroup.subtree_control" 2014-05-09 19:32 ` [PATCH 4/7] cgroup: update and fix parsing of "cgroup.subtree_control" Tejun Heo @ 2014-05-13 3:51 ` Li Zefan 2014-05-13 15:18 ` Tejun Heo 0 siblings, 1 reply; 15+ messages in thread From: Li Zefan @ 2014-05-13 3:51 UTC (permalink / raw) To: Tejun Heo; +Cc: cgroups, linux-kernel > diff --git a/kernel/cgroup.c b/kernel/cgroup.c > index 35daf89..b81e7c0 100644 > --- a/kernel/cgroup.c > +++ b/kernel/cgroup.c > @@ -2542,11 +2542,13 @@ static int cgroup_subtree_control_write(struct cgroup_subsys_state *dummy_css, > int ssid, ret; > > /* > - * Parse input - white space separated list of subsystem names > - * prefixed with either + or -. > + * Parse input - space separated list of subsystem names prefixed > + * with either + or -. > */ > p = buffer; > - while ((tok = strsep(&p, " \t\n"))) { > + while ((tok = strsep(&p, " "))) { > + if (tok[0] =='\0') if (tok[0] == '\0') > + continue; ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 4/7] cgroup: update and fix parsing of "cgroup.subtree_control" 2014-05-13 3:51 ` Li Zefan @ 2014-05-13 15:18 ` Tejun Heo 0 siblings, 0 replies; 15+ messages in thread From: Tejun Heo @ 2014-05-13 15:18 UTC (permalink / raw) To: Li Zefan; +Cc: cgroups, linux-kernel On Tue, May 13, 2014 at 11:51:51AM +0800, Li Zefan wrote: > > diff --git a/kernel/cgroup.c b/kernel/cgroup.c > > index 35daf89..b81e7c0 100644 > > --- a/kernel/cgroup.c > > +++ b/kernel/cgroup.c > > @@ -2542,11 +2542,13 @@ static int cgroup_subtree_control_write(struct cgroup_subsys_state *dummy_css, > > int ssid, ret; > > > > /* > > - * Parse input - white space separated list of subsystem names > > - * prefixed with either + or -. > > + * Parse input - space separated list of subsystem names prefixed > > + * with either + or -. > > */ > > p = buffer; > > - while ((tok = strsep(&p, " \t\n"))) { > > + while ((tok = strsep(&p, " "))) { > > + if (tok[0] =='\0') > > if (tok[0] == '\0') Oops, updated. Thanks. -- tejun ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 5/7] cgroup: use restart_syscall() for retries after offline waits in cgroup_subtree_control_write() 2014-05-09 19:32 [PATCHSET v2 cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates Tejun Heo ` (3 preceding siblings ...) 2014-05-09 19:32 ` [PATCH 4/7] cgroup: update and fix parsing of "cgroup.subtree_control" Tejun Heo @ 2014-05-09 19:32 ` Tejun Heo 2014-05-13 5:00 ` Li Zefan 2014-05-09 19:32 ` [PATCH 6/7] cgroup: use release_agent_path_lock in cgroup_release_agent_show() Tejun Heo ` (3 subsequent siblings) 8 siblings, 1 reply; 15+ messages in thread From: Tejun Heo @ 2014-05-09 19:32 UTC (permalink / raw) To: lizefan; +Cc: cgroups, linux-kernel, Tejun Heo After waiting for a child to finish offline, cgroup_subtree_control_write() jumps up to retry from after the input parsing and active protection breaking. This retry makes the scheduled locking update more difficult. Let's simplify it by returning with restart_syscall() for retries. Signed-off-by: Tejun Heo <tj@kernel.org> --- kernel/cgroup.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/kernel/cgroup.c b/kernel/cgroup.c index b81e7c0..ff92bac 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -2535,7 +2535,7 @@ out_finish: static int cgroup_subtree_control_write(struct cgroup_subsys_state *dummy_css, struct cftype *cft, char *buffer) { - unsigned int enable_req = 0, disable_req = 0, enable, disable; + unsigned int enable = 0, disable = 0; struct cgroup *cgrp = dummy_css->cgroup, *child; struct cgroup_subsys *ss; char *tok, *p; @@ -2554,11 +2554,11 @@ static int cgroup_subtree_control_write(struct cgroup_subsys_state *dummy_css, continue; if (*tok == '+') { - enable_req |= 1 << ssid; - disable_req &= ~(1 << ssid); + enable |= 1 << ssid; + disable &= ~(1 << ssid); } else if (*tok == '-') { - disable_req |= 1 << ssid; - enable_req &= ~(1 << ssid); + disable |= 1 << ssid; + enable &= ~(1 << ssid); } else { return -EINVAL; } @@ -2576,9 +2576,6 @@ static int cgroup_subtree_control_write(struct cgroup_subsys_state *dummy_css, */ cgroup_get(cgrp); kernfs_break_active_protection(cgrp->control_kn); -retry: - enable = enable_req; - disable = disable_req; mutex_lock(&cgroup_tree_mutex); @@ -2608,7 +2605,9 @@ retry: schedule(); finish_wait(&child->offline_waitq, &wait); cgroup_put(child); - goto retry; + + ret = restart_syscall(); + goto out_unbreak; } /* unavailable or not enabled on the parent? */ @@ -2692,6 +2691,7 @@ out_unlock: mutex_unlock(&cgroup_mutex); out_unlock_tree: mutex_unlock(&cgroup_tree_mutex); +out_unbreak: kernfs_unbreak_active_protection(cgrp->control_kn); cgroup_put(cgrp); return ret; -- 1.9.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 5/7] cgroup: use restart_syscall() for retries after offline waits in cgroup_subtree_control_write() 2014-05-09 19:32 ` [PATCH 5/7] cgroup: use restart_syscall() for retries after offline waits in cgroup_subtree_control_write() Tejun Heo @ 2014-05-13 5:00 ` Li Zefan 2014-05-13 15:19 ` Tejun Heo 0 siblings, 1 reply; 15+ messages in thread From: Li Zefan @ 2014-05-13 5:00 UTC (permalink / raw) To: Tejun Heo; +Cc: cgroups, linux-kernel Hi Tejun, On 2014/5/10 3:32, Tejun Heo wrote: > After waiting for a child to finish offline, > cgroup_subtree_control_write() jumps up to retry from after the input > parsing and active protection breaking. This retry makes the > scheduled locking update more difficult. Could you explain this sentence more specific? I don't understand what "scheduled locking update" means. > Let's simplify it by > returning with restart_syscall() for retries. > > Signed-off-by: Tejun Heo <tj@kernel.org> > --- > kernel/cgroup.c | 18 +++++++++--------- > 1 file changed, 9 insertions(+), 9 deletions(-) ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 5/7] cgroup: use restart_syscall() for retries after offline waits in cgroup_subtree_control_write() 2014-05-13 5:00 ` Li Zefan @ 2014-05-13 15:19 ` Tejun Heo 0 siblings, 0 replies; 15+ messages in thread From: Tejun Heo @ 2014-05-13 15:19 UTC (permalink / raw) To: Li Zefan; +Cc: cgroups, linux-kernel On Tue, May 13, 2014 at 01:00:11PM +0800, Li Zefan wrote: > Hi Tejun, > > On 2014/5/10 3:32, Tejun Heo wrote: > > After waiting for a child to finish offline, > > cgroup_subtree_control_write() jumps up to retry from after the input > > parsing and active protection breaking. This retry makes the > > scheduled locking update more difficult. > > Could you explain this sentence more specific? I don't understand what > "scheduled locking update" means. Updating it to "planned cgroup_tree_mutex removal". Thanks. -- tejun ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 6/7] cgroup: use release_agent_path_lock in cgroup_release_agent_show() 2014-05-09 19:32 [PATCHSET v2 cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates Tejun Heo ` (4 preceding siblings ...) 2014-05-09 19:32 ` [PATCH 5/7] cgroup: use restart_syscall() for retries after offline waits in cgroup_subtree_control_write() Tejun Heo @ 2014-05-09 19:32 ` Tejun Heo 2014-05-09 19:32 ` [PATCH 7/7] cgroup: rename css_tryget*() to css_tryget_online*() Tejun Heo ` (2 subsequent siblings) 8 siblings, 0 replies; 15+ messages in thread From: Tejun Heo @ 2014-05-09 19:32 UTC (permalink / raw) To: lizefan; +Cc: cgroups, linux-kernel, Tejun Heo release_path is now protected by release_agent_path_lock to allow accessing it without grabbing cgroup_mutex; however, cgroup_release_agent_show() was still grabbing cgroup_mutex. Let's convert it to release_agent_path_lock so that we don't have to worry about this one for the planned locking updates. Signed-off-by: Tejun Heo <tj@kernel.org> --- kernel/cgroup.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/kernel/cgroup.c b/kernel/cgroup.c index ff92bac..7b55ca5 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -2373,11 +2373,10 @@ static int cgroup_release_agent_show(struct seq_file *seq, void *v) { struct cgroup *cgrp = seq_css(seq)->cgroup; - if (!cgroup_lock_live_group(cgrp)) - return -ENODEV; + spin_lock(&release_agent_path_lock); seq_puts(seq, cgrp->root->release_agent_path); + spin_unlock(&release_agent_path_lock); seq_putc(seq, '\n'); - mutex_unlock(&cgroup_mutex); return 0; } -- 1.9.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 7/7] cgroup: rename css_tryget*() to css_tryget_online*() 2014-05-09 19:32 [PATCHSET v2 cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates Tejun Heo ` (5 preceding siblings ...) 2014-05-09 19:32 ` [PATCH 6/7] cgroup: use release_agent_path_lock in cgroup_release_agent_show() Tejun Heo @ 2014-05-09 19:32 ` Tejun Heo 2014-05-09 19:47 ` [PATCH v2 " Tejun Heo 2014-05-13 6:35 ` [PATCHSET v2 cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates Li Zefan 2014-05-13 16:12 ` Tejun Heo 8 siblings, 1 reply; 15+ messages in thread From: Tejun Heo @ 2014-05-09 19:32 UTC (permalink / raw) To: lizefan Cc: cgroups, linux-kernel, Tejun Heo, Vivek Goyal, Jens Axboe, Peter Zijlstra, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo Unlike the more usual refcnting, what css_tryget() provides is the distinction between online and offline csses instead of protection against upping a refcnt which already reached zero. cgroup is planning to provide actual tryget which fails if the refcnt already reached zero. Let's rename the existing trygets so that they clearly indicate that they're onliness. I thought about keeping the existing names as-are and introducing new names for the planned actual tryget; however, given that each controller participates in the synchronization of the online state, it seems worthwhile to make it explicit that these functions are about on/offline state. Rename css_tryget() to css_tryget_online() and css_tryget_from_dir() to css_tryget_online_from_dir(). This is pure rename. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Li Zefan <lizefan@huawei.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> --- block/blk-cgroup.c | 2 +- fs/bio.c | 2 +- include/linux/cgroup.h | 14 +++++++------- kernel/cgroup.c | 40 ++++++++++++++++++++-------------------- kernel/cpuset.c | 6 +++--- kernel/events/core.c | 3 ++- mm/hugetlb_cgroup.c | 2 +- mm/memcontrol.c | 46 ++++++++++++++++++++++++---------------------- 8 files changed, 59 insertions(+), 56 deletions(-) diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index e4a4145..4447556 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -185,7 +185,7 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg, lockdep_assert_held(q->queue_lock); /* blkg holds a reference to blkcg */ - if (!css_tryget(&blkcg->css)) { + if (!css_tryget_online(&blkcg->css)) { ret = -EINVAL; goto err_free_blkg; } diff --git a/fs/bio.c b/fs/bio.c index 6f0362b..0608ef9 100644 --- a/fs/bio.c +++ b/fs/bio.c @@ -1970,7 +1970,7 @@ int bio_associate_current(struct bio *bio) /* associate blkcg if exists */ rcu_read_lock(); css = task_css(current, blkio_cgrp_id); - if (css && css_tryget(css)) + if (css && css_tryget_online(css)) bio->bi_css = css; rcu_read_unlock(); diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h index bde4461..c5f3684 100644 --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -95,16 +95,16 @@ static inline void css_get(struct cgroup_subsys_state *css) } /** - * css_tryget - try to obtain a reference on the specified css + * css_tryget_online - try to obtain a reference on the specified css if online * @css: target css * - * Obtain a reference on @css if it's alive. The caller naturally needs to - * ensure that @css is accessible but doesn't have to be holding a + * Obtain a reference on @css if it's online. The caller naturally needs + * to ensure that @css is accessible but doesn't have to be holding a * reference on it - IOW, RCU protected access is good enough for this * function. Returns %true if a reference count was successfully obtained; * %false otherwise. */ -static inline bool css_tryget(struct cgroup_subsys_state *css) +static inline bool css_tryget_online(struct cgroup_subsys_state *css) { if (css->flags & CSS_ROOT) return true; @@ -115,7 +115,7 @@ static inline bool css_tryget(struct cgroup_subsys_state *css) * css_put - put a css reference * @css: target css * - * Put a reference obtained via css_get() and css_tryget(). + * Put a reference obtained via css_get() and css_tryget_online(). */ static inline void css_put(struct cgroup_subsys_state *css) { @@ -905,8 +905,8 @@ void css_task_iter_end(struct css_task_iter *it); int cgroup_attach_task_all(struct task_struct *from, struct task_struct *); int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from); -struct cgroup_subsys_state *css_tryget_from_dir(struct dentry *dentry, - struct cgroup_subsys *ss); +struct cgroup_subsys_state *css_tryget_online_from_dir(struct dentry *dentry, + struct cgroup_subsys *ss); #else /* !CONFIG_CGROUPS */ diff --git a/kernel/cgroup.c b/kernel/cgroup.c index 7b55ca5..8603b3e 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -3771,7 +3771,7 @@ int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry) /* * We aren't being called from kernfs and there's no guarantee on - * @kn->priv's validity. For this and css_tryget_from_dir(), + * @kn->priv's validity. For this and css_tryget_online_from_dir(), * @kn->priv is RCU safe. Let's do the RCU dancing. */ rcu_read_lock(); @@ -4060,9 +4060,9 @@ err: * Implemented in kill_css(). * * 2. When the percpu_ref is confirmed to be visible as killed on all CPUs - * and thus css_tryget() is guaranteed to fail, the css can be offlined - * by invoking offline_css(). After offlining, the base ref is put. - * Implemented in css_killed_work_fn(). + * and thus css_tryget_online() is guaranteed to fail, the css can be + * offlined by invoking offline_css(). After offlining, the base ref is + * put. Implemented in css_killed_work_fn(). * * 3. When the percpu_ref reaches zero, the only possible remaining * accessors are inside RCU read sections. css_release() schedules the @@ -4386,7 +4386,7 @@ static int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name, /* * This is called when the refcnt of a css is confirmed to be killed. - * css_tryget() is now guaranteed to fail. + * css_tryget_online() is now guaranteed to fail. */ static void css_killed_work_fn(struct work_struct *work) { @@ -4398,8 +4398,8 @@ static void css_killed_work_fn(struct work_struct *work) mutex_lock(&cgroup_mutex); /* - * css_tryget() is guaranteed to fail now. Tell subsystems to - * initate destruction. + * css_tryget_online() is guaranteed to fail now. Tell subsystems + * to initate destruction. */ offline_css(css); @@ -4440,8 +4440,8 @@ static void css_killed_ref_fn(struct percpu_ref *ref) * * This function initiates destruction of @css by removing cgroup interface * files and putting its base reference. ->css_offline() will be invoked - * asynchronously once css_tryget() is guaranteed to fail and when the - * reference count reaches zero, @css will be released. + * asynchronously once css_tryget_online() is guaranteed to fail and when + * the reference count reaches zero, @css will be released. */ static void kill_css(struct cgroup_subsys_state *css) { @@ -4462,7 +4462,7 @@ static void kill_css(struct cgroup_subsys_state *css) /* * cgroup core guarantees that, by the time ->css_offline() is * invoked, no new css reference will be given out via - * css_tryget(). We can't simply call percpu_ref_kill() and + * css_tryget_online(). We can't simply call percpu_ref_kill() and * proceed to offlining css's because percpu_ref_kill() doesn't * guarantee that the ref is seen as killed on all CPUs on return. * @@ -4478,9 +4478,9 @@ static void kill_css(struct cgroup_subsys_state *css) * * css's make use of percpu refcnts whose killing latency shouldn't be * exposed to userland and are RCU protected. Also, cgroup core needs to - * guarantee that css_tryget() won't succeed by the time ->css_offline() is - * invoked. To satisfy all the requirements, destruction is implemented in - * the following two steps. + * guarantee that css_tryget_online() won't succeed by the time + * ->css_offline() is invoked. To satisfy all the requirements, + * destruction is implemented in the following two steps. * * s1. Verify @cgrp can be destroyed and mark it dying. Remove all * userland visible parts and start killing the percpu refcnts of @@ -4574,9 +4574,9 @@ static int cgroup_destroy_locked(struct cgroup *cgrp) /* * There are two control paths which try to determine cgroup from * dentry without going through kernfs - cgroupstats_build() and - * css_tryget_from_dir(). Those are supported by RCU protecting - * clearing of cgrp->kn->priv backpointer, which should happen - * after all files under it have been removed. + * css_tryget_online_from_dir(). Those are supported by RCU + * protecting clearing of cgrp->kn->priv backpointer, which should + * happen after all files under it have been removed. */ kernfs_remove(cgrp->kn); /* @cgrp has an extra ref on its kn */ RCU_INIT_POINTER(*(void __rcu __force **)&cgrp->kn->priv, NULL); @@ -5173,7 +5173,7 @@ static int __init cgroup_disable(char *str) __setup("cgroup_disable=", cgroup_disable); /** - * css_tryget_from_dir - get corresponding css from the dentry of a cgroup dir + * css_tryget_online_from_dir - get corresponding css from a cgroup dentry * @dentry: directory dentry of interest * @ss: subsystem of interest * @@ -5181,8 +5181,8 @@ __setup("cgroup_disable=", cgroup_disable); * to get the corresponding css and return it. If such css doesn't exist * or can't be pinned, an ERR_PTR value is returned. */ -struct cgroup_subsys_state *css_tryget_from_dir(struct dentry *dentry, - struct cgroup_subsys *ss) +struct cgroup_subsys_state *css_tryget_online_from_dir(struct dentry *dentry, + struct cgroup_subsys *ss) { struct kernfs_node *kn = kernfs_node_from_dentry(dentry); struct cgroup_subsys_state *css = NULL; @@ -5204,7 +5204,7 @@ struct cgroup_subsys_state *css_tryget_from_dir(struct dentry *dentry, if (cgrp) css = cgroup_css(cgrp, ss); - if (!css || !css_tryget(css)) + if (!css || !css_tryget_online(css)) css = ERR_PTR(-ENOENT); rcu_read_unlock(); diff --git a/kernel/cpuset.c b/kernel/cpuset.c index 7c0e8da..37ca0a5 100644 --- a/kernel/cpuset.c +++ b/kernel/cpuset.c @@ -872,7 +872,7 @@ static void update_tasks_cpumask_hier(struct cpuset *root_cs, bool update_root) continue; } } - if (!css_tryget(&cp->css)) + if (!css_tryget_online(&cp->css)) continue; rcu_read_unlock(); @@ -1108,7 +1108,7 @@ static void update_tasks_nodemask_hier(struct cpuset *root_cs, bool update_root) continue; } } - if (!css_tryget(&cp->css)) + if (!css_tryget_online(&cp->css)) continue; rcu_read_unlock(); @@ -2153,7 +2153,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work) rcu_read_lock(); cpuset_for_each_descendant_pre(cs, pos_css, &top_cpuset) { - if (cs == &top_cpuset || !css_tryget(&cs->css)) + if (cs == &top_cpuset || !css_tryget_online(&cs->css)) continue; rcu_read_unlock(); diff --git a/kernel/events/core.c b/kernel/events/core.c index f83a71a..077968d 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -607,7 +607,8 @@ static inline int perf_cgroup_connect(int fd, struct perf_event *event, if (!f.file) return -EBADF; - css = css_tryget_from_dir(f.file->f_dentry, &perf_event_cgrp_subsys); + css = css_tryget_online_from_dir(f.file->f_dentry, + &perf_event_cgrp_subsys); if (IS_ERR(css)) { ret = PTR_ERR(css); goto out; diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c index 595d7fd..372f1ad 100644 --- a/mm/hugetlb_cgroup.c +++ b/mm/hugetlb_cgroup.c @@ -181,7 +181,7 @@ int hugetlb_cgroup_charge_cgroup(int idx, unsigned long nr_pages, again: rcu_read_lock(); h_cg = hugetlb_cgroup_from_task(current); - if (!css_tryget(&h_cg->css)) { + if (!css_tryget_online(&h_cg->css)) { rcu_read_unlock(); goto again; } diff --git a/mm/memcontrol.c b/mm/memcontrol.c index c3f82f6..5cf3246 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -567,7 +567,8 @@ void sock_update_memcg(struct sock *sk) memcg = mem_cgroup_from_task(current); cg_proto = sk->sk_prot->proto_cgroup(memcg); if (!mem_cgroup_is_root(memcg) && - memcg_proto_active(cg_proto) && css_tryget(&memcg->css)) { + memcg_proto_active(cg_proto) && + css_tryget_online(&memcg->css)) { sk->sk_cgrp = cg_proto; } rcu_read_unlock(); @@ -834,7 +835,7 @@ retry: */ __mem_cgroup_remove_exceeded(mz->memcg, mz, mctz); if (!res_counter_soft_limit_excess(&mz->memcg->res) || - !css_tryget(&mz->memcg->css)) + !css_tryget_online(&mz->memcg->css)) goto retry; done: return mz; @@ -1076,7 +1077,7 @@ static struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm) memcg = mem_cgroup_from_task(rcu_dereference(mm->owner)); if (unlikely(!memcg)) memcg = root_mem_cgroup; - } while (!css_tryget(&memcg->css)); + } while (!css_tryget_online(&memcg->css)); rcu_read_unlock(); return memcg; } @@ -1113,7 +1114,8 @@ skip_node: */ if (next_css) { if ((next_css == &root->css) || - ((next_css->flags & CSS_ONLINE) && css_tryget(next_css))) + ((next_css->flags & CSS_ONLINE) && + css_tryget_online(next_css))) return mem_cgroup_from_css(next_css); prev_css = next_css; @@ -1159,7 +1161,7 @@ mem_cgroup_iter_load(struct mem_cgroup_reclaim_iter *iter, * would be returned all the time. */ if (position && position != root && - !css_tryget(&position->css)) + !css_tryget_online(&position->css)) position = NULL; } return position; @@ -2785,9 +2787,9 @@ static void __mem_cgroup_cancel_local_charge(struct mem_cgroup *memcg, /* * A helper function to get mem_cgroup from ID. must be called under - * rcu_read_lock(). The caller is responsible for calling css_tryget if - * the mem_cgroup is used for charging. (dropping refcnt from swap can be - * called against removed memcg.) + * rcu_read_lock(). The caller is responsible for calling + * css_tryget_online() if the mem_cgroup is used for charging. (dropping + * refcnt from swap can be called against removed memcg.) */ static struct mem_cgroup *mem_cgroup_lookup(unsigned short id) { @@ -2810,14 +2812,14 @@ struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page) lock_page_cgroup(pc); if (PageCgroupUsed(pc)) { memcg = pc->mem_cgroup; - if (memcg && !css_tryget(&memcg->css)) + if (memcg && !css_tryget_online(&memcg->css)) memcg = NULL; } else if (PageSwapCache(page)) { ent.val = page_private(page); id = lookup_swap_cgroup_id(ent); rcu_read_lock(); memcg = mem_cgroup_lookup(id); - if (memcg && !css_tryget(&memcg->css)) + if (memcg && !css_tryget_online(&memcg->css)) memcg = NULL; rcu_read_unlock(); } @@ -3473,7 +3475,7 @@ struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep, } /* The corresponding put will be done in the workqueue. */ - if (!css_tryget(&memcg->css)) + if (!css_tryget_online(&memcg->css)) goto out; rcu_read_unlock(); @@ -4246,8 +4248,8 @@ void mem_cgroup_uncharge_swap(swp_entry_t ent) memcg = mem_cgroup_lookup(id); if (memcg) { /* - * We uncharge this because swap is freed. - * This memcg can be obsolete one. We avoid calling css_tryget + * We uncharge this because swap is freed. This memcg can + * be obsolete one. We avoid calling css_tryget_online(). */ if (!mem_cgroup_is_root(memcg)) res_counter_uncharge(&memcg->memsw, PAGE_SIZE); @@ -5840,10 +5842,10 @@ static void kmem_cgroup_css_offline(struct mem_cgroup *memcg) * which is then paired with css_put during uncharge resp. here. * * Although this might sound strange as this path is called from - * css_offline() when the referencemight have dropped down to 0 - * and shouldn't be incremented anymore (css_tryget would fail) - * we do not have other options because of the kmem allocations - * lifetime. + * css_offline() when the referencemight have dropped down to 0 and + * shouldn't be incremented anymore (css_tryget_online() would + * fail) we do not have other options because of the kmem + * allocations lifetime. */ css_get(&memcg->css); @@ -6051,8 +6053,8 @@ static int memcg_write_event_control(struct cgroup_subsys_state *css, * automatically removed on cgroup destruction but the removal is * asynchronous, so take an extra ref on @css. */ - cfile_css = css_tryget_from_dir(cfile.file->f_dentry->d_parent, - &memory_cgrp_subsys); + cfile_css = css_tryget_online_from_dir(cfile.file->f_dentry->d_parent, + &memory_cgrp_subsys); ret = -EINVAL; if (IS_ERR(cfile_css)) goto out_put_cfile; @@ -6496,7 +6498,7 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css) /* * XXX: css_offline() would be where we should reparent all * memory to prepare the cgroup for destruction. However, - * memcg does not do css_tryget() and res_counter charging + * memcg does not do css_tryget_online() and res_counter charging * under the same RCU lock region, which means that charging * could race with offlining. Offlining only happens to * cgroups with no tasks in them but charges can show up @@ -6510,9 +6512,9 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css) * lookup_swap_cgroup_id() * rcu_read_lock() * mem_cgroup_lookup() - * css_tryget() + * css_tryget_online() * rcu_read_unlock() - * disable css_tryget() + * disable css_tryget_online() * call_rcu() * offline_css() * reparent_charges() -- 1.9.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v2 7/7] cgroup: rename css_tryget*() to css_tryget_online*() 2014-05-09 19:32 ` [PATCH 7/7] cgroup: rename css_tryget*() to css_tryget_online*() Tejun Heo @ 2014-05-09 19:47 ` Tejun Heo 0 siblings, 0 replies; 15+ messages in thread From: Tejun Heo @ 2014-05-09 19:47 UTC (permalink / raw) To: lizefan Cc: cgroups, linux-kernel, Vivek Goyal, Jens Axboe, Peter Zijlstra, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo >From 23d99584b0e6fd209f6106866836e6cdadba9c03 Mon Sep 17 00:00:00 2001 From: Tejun Heo <tj@kernel.org> Date: Fri, 9 May 2014 15:46:38 -0400 Unlike the more usual refcnting, what css_tryget() provides is the distinction between online and offline csses instead of protection against upping a refcnt which already reached zero. cgroup is planning to provide actual tryget which fails if the refcnt already reached zero. Let's rename the existing trygets so that they clearly indicate that they're onliness. I thought about keeping the existing names as-are and introducing new names for the planned actual tryget; however, given that each controller participates in the synchronization of the online state, it seems worthwhile to make it explicit that these functions are about on/offline state. Rename css_tryget() to css_tryget_online() and css_tryget_from_dir() to css_tryget_online_from_dir(). This is pure rename. v2: cgroup_freezer grew new usages of css_tryget(). Update accordingly. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Li Zefan <lizefan@huawei.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> --- block/blk-cgroup.c | 2 +- fs/bio.c | 2 +- include/linux/cgroup.h | 14 +++++++------- kernel/cgroup.c | 40 ++++++++++++++++++++-------------------- kernel/cgroup_freezer.c | 4 ++-- kernel/cpuset.c | 6 +++--- kernel/events/core.c | 3 ++- mm/hugetlb_cgroup.c | 2 +- mm/memcontrol.c | 46 ++++++++++++++++++++++++---------------------- 9 files changed, 61 insertions(+), 58 deletions(-) diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index e4a4145..4447556 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -185,7 +185,7 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg, lockdep_assert_held(q->queue_lock); /* blkg holds a reference to blkcg */ - if (!css_tryget(&blkcg->css)) { + if (!css_tryget_online(&blkcg->css)) { ret = -EINVAL; goto err_free_blkg; } diff --git a/fs/bio.c b/fs/bio.c index 6f0362b..0608ef9 100644 --- a/fs/bio.c +++ b/fs/bio.c @@ -1970,7 +1970,7 @@ int bio_associate_current(struct bio *bio) /* associate blkcg if exists */ rcu_read_lock(); css = task_css(current, blkio_cgrp_id); - if (css && css_tryget(css)) + if (css && css_tryget_online(css)) bio->bi_css = css; rcu_read_unlock(); diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h index bde4461..c5f3684 100644 --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -95,16 +95,16 @@ static inline void css_get(struct cgroup_subsys_state *css) } /** - * css_tryget - try to obtain a reference on the specified css + * css_tryget_online - try to obtain a reference on the specified css if online * @css: target css * - * Obtain a reference on @css if it's alive. The caller naturally needs to - * ensure that @css is accessible but doesn't have to be holding a + * Obtain a reference on @css if it's online. The caller naturally needs + * to ensure that @css is accessible but doesn't have to be holding a * reference on it - IOW, RCU protected access is good enough for this * function. Returns %true if a reference count was successfully obtained; * %false otherwise. */ -static inline bool css_tryget(struct cgroup_subsys_state *css) +static inline bool css_tryget_online(struct cgroup_subsys_state *css) { if (css->flags & CSS_ROOT) return true; @@ -115,7 +115,7 @@ static inline bool css_tryget(struct cgroup_subsys_state *css) * css_put - put a css reference * @css: target css * - * Put a reference obtained via css_get() and css_tryget(). + * Put a reference obtained via css_get() and css_tryget_online(). */ static inline void css_put(struct cgroup_subsys_state *css) { @@ -905,8 +905,8 @@ void css_task_iter_end(struct css_task_iter *it); int cgroup_attach_task_all(struct task_struct *from, struct task_struct *); int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from); -struct cgroup_subsys_state *css_tryget_from_dir(struct dentry *dentry, - struct cgroup_subsys *ss); +struct cgroup_subsys_state *css_tryget_online_from_dir(struct dentry *dentry, + struct cgroup_subsys *ss); #else /* !CONFIG_CGROUPS */ diff --git a/kernel/cgroup.c b/kernel/cgroup.c index 7b55ca5..8603b3e 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -3771,7 +3771,7 @@ int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry) /* * We aren't being called from kernfs and there's no guarantee on - * @kn->priv's validity. For this and css_tryget_from_dir(), + * @kn->priv's validity. For this and css_tryget_online_from_dir(), * @kn->priv is RCU safe. Let's do the RCU dancing. */ rcu_read_lock(); @@ -4060,9 +4060,9 @@ err: * Implemented in kill_css(). * * 2. When the percpu_ref is confirmed to be visible as killed on all CPUs - * and thus css_tryget() is guaranteed to fail, the css can be offlined - * by invoking offline_css(). After offlining, the base ref is put. - * Implemented in css_killed_work_fn(). + * and thus css_tryget_online() is guaranteed to fail, the css can be + * offlined by invoking offline_css(). After offlining, the base ref is + * put. Implemented in css_killed_work_fn(). * * 3. When the percpu_ref reaches zero, the only possible remaining * accessors are inside RCU read sections. css_release() schedules the @@ -4386,7 +4386,7 @@ static int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name, /* * This is called when the refcnt of a css is confirmed to be killed. - * css_tryget() is now guaranteed to fail. + * css_tryget_online() is now guaranteed to fail. */ static void css_killed_work_fn(struct work_struct *work) { @@ -4398,8 +4398,8 @@ static void css_killed_work_fn(struct work_struct *work) mutex_lock(&cgroup_mutex); /* - * css_tryget() is guaranteed to fail now. Tell subsystems to - * initate destruction. + * css_tryget_online() is guaranteed to fail now. Tell subsystems + * to initate destruction. */ offline_css(css); @@ -4440,8 +4440,8 @@ static void css_killed_ref_fn(struct percpu_ref *ref) * * This function initiates destruction of @css by removing cgroup interface * files and putting its base reference. ->css_offline() will be invoked - * asynchronously once css_tryget() is guaranteed to fail and when the - * reference count reaches zero, @css will be released. + * asynchronously once css_tryget_online() is guaranteed to fail and when + * the reference count reaches zero, @css will be released. */ static void kill_css(struct cgroup_subsys_state *css) { @@ -4462,7 +4462,7 @@ static void kill_css(struct cgroup_subsys_state *css) /* * cgroup core guarantees that, by the time ->css_offline() is * invoked, no new css reference will be given out via - * css_tryget(). We can't simply call percpu_ref_kill() and + * css_tryget_online(). We can't simply call percpu_ref_kill() and * proceed to offlining css's because percpu_ref_kill() doesn't * guarantee that the ref is seen as killed on all CPUs on return. * @@ -4478,9 +4478,9 @@ static void kill_css(struct cgroup_subsys_state *css) * * css's make use of percpu refcnts whose killing latency shouldn't be * exposed to userland and are RCU protected. Also, cgroup core needs to - * guarantee that css_tryget() won't succeed by the time ->css_offline() is - * invoked. To satisfy all the requirements, destruction is implemented in - * the following two steps. + * guarantee that css_tryget_online() won't succeed by the time + * ->css_offline() is invoked. To satisfy all the requirements, + * destruction is implemented in the following two steps. * * s1. Verify @cgrp can be destroyed and mark it dying. Remove all * userland visible parts and start killing the percpu refcnts of @@ -4574,9 +4574,9 @@ static int cgroup_destroy_locked(struct cgroup *cgrp) /* * There are two control paths which try to determine cgroup from * dentry without going through kernfs - cgroupstats_build() and - * css_tryget_from_dir(). Those are supported by RCU protecting - * clearing of cgrp->kn->priv backpointer, which should happen - * after all files under it have been removed. + * css_tryget_online_from_dir(). Those are supported by RCU + * protecting clearing of cgrp->kn->priv backpointer, which should + * happen after all files under it have been removed. */ kernfs_remove(cgrp->kn); /* @cgrp has an extra ref on its kn */ RCU_INIT_POINTER(*(void __rcu __force **)&cgrp->kn->priv, NULL); @@ -5173,7 +5173,7 @@ static int __init cgroup_disable(char *str) __setup("cgroup_disable=", cgroup_disable); /** - * css_tryget_from_dir - get corresponding css from the dentry of a cgroup dir + * css_tryget_online_from_dir - get corresponding css from a cgroup dentry * @dentry: directory dentry of interest * @ss: subsystem of interest * @@ -5181,8 +5181,8 @@ __setup("cgroup_disable=", cgroup_disable); * to get the corresponding css and return it. If such css doesn't exist * or can't be pinned, an ERR_PTR value is returned. */ -struct cgroup_subsys_state *css_tryget_from_dir(struct dentry *dentry, - struct cgroup_subsys *ss) +struct cgroup_subsys_state *css_tryget_online_from_dir(struct dentry *dentry, + struct cgroup_subsys *ss) { struct kernfs_node *kn = kernfs_node_from_dentry(dentry); struct cgroup_subsys_state *css = NULL; @@ -5204,7 +5204,7 @@ struct cgroup_subsys_state *css_tryget_from_dir(struct dentry *dentry, if (cgrp) css = cgroup_css(cgrp, ss); - if (!css || !css_tryget(css)) + if (!css || !css_tryget_online(css)) css = ERR_PTR(-ENOENT); rcu_read_unlock(); diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c index 12ead0b..e728f66 100644 --- a/kernel/cgroup_freezer.c +++ b/kernel/cgroup_freezer.c @@ -302,7 +302,7 @@ static int freezer_read(struct seq_file *m, void *v) /* update states bottom-up */ css_for_each_descendant_post(pos, css) { - if (!css_tryget(pos)) + if (!css_tryget_online(pos)) continue; rcu_read_unlock(); @@ -402,7 +402,7 @@ static void freezer_change_state(struct freezer *freezer, bool freeze) struct freezer *pos_f = css_freezer(pos); struct freezer *parent = parent_freezer(pos_f); - if (!css_tryget(pos)) + if (!css_tryget_online(pos)) continue; rcu_read_unlock(); diff --git a/kernel/cpuset.c b/kernel/cpuset.c index 7c0e8da..37ca0a5 100644 --- a/kernel/cpuset.c +++ b/kernel/cpuset.c @@ -872,7 +872,7 @@ static void update_tasks_cpumask_hier(struct cpuset *root_cs, bool update_root) continue; } } - if (!css_tryget(&cp->css)) + if (!css_tryget_online(&cp->css)) continue; rcu_read_unlock(); @@ -1108,7 +1108,7 @@ static void update_tasks_nodemask_hier(struct cpuset *root_cs, bool update_root) continue; } } - if (!css_tryget(&cp->css)) + if (!css_tryget_online(&cp->css)) continue; rcu_read_unlock(); @@ -2153,7 +2153,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work) rcu_read_lock(); cpuset_for_each_descendant_pre(cs, pos_css, &top_cpuset) { - if (cs == &top_cpuset || !css_tryget(&cs->css)) + if (cs == &top_cpuset || !css_tryget_online(&cs->css)) continue; rcu_read_unlock(); diff --git a/kernel/events/core.c b/kernel/events/core.c index f83a71a..077968d 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -607,7 +607,8 @@ static inline int perf_cgroup_connect(int fd, struct perf_event *event, if (!f.file) return -EBADF; - css = css_tryget_from_dir(f.file->f_dentry, &perf_event_cgrp_subsys); + css = css_tryget_online_from_dir(f.file->f_dentry, + &perf_event_cgrp_subsys); if (IS_ERR(css)) { ret = PTR_ERR(css); goto out; diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c index 595d7fd..372f1ad 100644 --- a/mm/hugetlb_cgroup.c +++ b/mm/hugetlb_cgroup.c @@ -181,7 +181,7 @@ int hugetlb_cgroup_charge_cgroup(int idx, unsigned long nr_pages, again: rcu_read_lock(); h_cg = hugetlb_cgroup_from_task(current); - if (!css_tryget(&h_cg->css)) { + if (!css_tryget_online(&h_cg->css)) { rcu_read_unlock(); goto again; } diff --git a/mm/memcontrol.c b/mm/memcontrol.c index c3f82f6..5cf3246 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -567,7 +567,8 @@ void sock_update_memcg(struct sock *sk) memcg = mem_cgroup_from_task(current); cg_proto = sk->sk_prot->proto_cgroup(memcg); if (!mem_cgroup_is_root(memcg) && - memcg_proto_active(cg_proto) && css_tryget(&memcg->css)) { + memcg_proto_active(cg_proto) && + css_tryget_online(&memcg->css)) { sk->sk_cgrp = cg_proto; } rcu_read_unlock(); @@ -834,7 +835,7 @@ retry: */ __mem_cgroup_remove_exceeded(mz->memcg, mz, mctz); if (!res_counter_soft_limit_excess(&mz->memcg->res) || - !css_tryget(&mz->memcg->css)) + !css_tryget_online(&mz->memcg->css)) goto retry; done: return mz; @@ -1076,7 +1077,7 @@ static struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm) memcg = mem_cgroup_from_task(rcu_dereference(mm->owner)); if (unlikely(!memcg)) memcg = root_mem_cgroup; - } while (!css_tryget(&memcg->css)); + } while (!css_tryget_online(&memcg->css)); rcu_read_unlock(); return memcg; } @@ -1113,7 +1114,8 @@ skip_node: */ if (next_css) { if ((next_css == &root->css) || - ((next_css->flags & CSS_ONLINE) && css_tryget(next_css))) + ((next_css->flags & CSS_ONLINE) && + css_tryget_online(next_css))) return mem_cgroup_from_css(next_css); prev_css = next_css; @@ -1159,7 +1161,7 @@ mem_cgroup_iter_load(struct mem_cgroup_reclaim_iter *iter, * would be returned all the time. */ if (position && position != root && - !css_tryget(&position->css)) + !css_tryget_online(&position->css)) position = NULL; } return position; @@ -2785,9 +2787,9 @@ static void __mem_cgroup_cancel_local_charge(struct mem_cgroup *memcg, /* * A helper function to get mem_cgroup from ID. must be called under - * rcu_read_lock(). The caller is responsible for calling css_tryget if - * the mem_cgroup is used for charging. (dropping refcnt from swap can be - * called against removed memcg.) + * rcu_read_lock(). The caller is responsible for calling + * css_tryget_online() if the mem_cgroup is used for charging. (dropping + * refcnt from swap can be called against removed memcg.) */ static struct mem_cgroup *mem_cgroup_lookup(unsigned short id) { @@ -2810,14 +2812,14 @@ struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page) lock_page_cgroup(pc); if (PageCgroupUsed(pc)) { memcg = pc->mem_cgroup; - if (memcg && !css_tryget(&memcg->css)) + if (memcg && !css_tryget_online(&memcg->css)) memcg = NULL; } else if (PageSwapCache(page)) { ent.val = page_private(page); id = lookup_swap_cgroup_id(ent); rcu_read_lock(); memcg = mem_cgroup_lookup(id); - if (memcg && !css_tryget(&memcg->css)) + if (memcg && !css_tryget_online(&memcg->css)) memcg = NULL; rcu_read_unlock(); } @@ -3473,7 +3475,7 @@ struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep, } /* The corresponding put will be done in the workqueue. */ - if (!css_tryget(&memcg->css)) + if (!css_tryget_online(&memcg->css)) goto out; rcu_read_unlock(); @@ -4246,8 +4248,8 @@ void mem_cgroup_uncharge_swap(swp_entry_t ent) memcg = mem_cgroup_lookup(id); if (memcg) { /* - * We uncharge this because swap is freed. - * This memcg can be obsolete one. We avoid calling css_tryget + * We uncharge this because swap is freed. This memcg can + * be obsolete one. We avoid calling css_tryget_online(). */ if (!mem_cgroup_is_root(memcg)) res_counter_uncharge(&memcg->memsw, PAGE_SIZE); @@ -5840,10 +5842,10 @@ static void kmem_cgroup_css_offline(struct mem_cgroup *memcg) * which is then paired with css_put during uncharge resp. here. * * Although this might sound strange as this path is called from - * css_offline() when the referencemight have dropped down to 0 - * and shouldn't be incremented anymore (css_tryget would fail) - * we do not have other options because of the kmem allocations - * lifetime. + * css_offline() when the referencemight have dropped down to 0 and + * shouldn't be incremented anymore (css_tryget_online() would + * fail) we do not have other options because of the kmem + * allocations lifetime. */ css_get(&memcg->css); @@ -6051,8 +6053,8 @@ static int memcg_write_event_control(struct cgroup_subsys_state *css, * automatically removed on cgroup destruction but the removal is * asynchronous, so take an extra ref on @css. */ - cfile_css = css_tryget_from_dir(cfile.file->f_dentry->d_parent, - &memory_cgrp_subsys); + cfile_css = css_tryget_online_from_dir(cfile.file->f_dentry->d_parent, + &memory_cgrp_subsys); ret = -EINVAL; if (IS_ERR(cfile_css)) goto out_put_cfile; @@ -6496,7 +6498,7 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css) /* * XXX: css_offline() would be where we should reparent all * memory to prepare the cgroup for destruction. However, - * memcg does not do css_tryget() and res_counter charging + * memcg does not do css_tryget_online() and res_counter charging * under the same RCU lock region, which means that charging * could race with offlining. Offlining only happens to * cgroups with no tasks in them but charges can show up @@ -6510,9 +6512,9 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css) * lookup_swap_cgroup_id() * rcu_read_lock() * mem_cgroup_lookup() - * css_tryget() + * css_tryget_online() * rcu_read_unlock() - * disable css_tryget() + * disable css_tryget_online() * call_rcu() * offline_css() * reparent_charges() -- 1.9.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCHSET v2 cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates 2014-05-09 19:32 [PATCHSET v2 cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates Tejun Heo ` (6 preceding siblings ...) 2014-05-09 19:32 ` [PATCH 7/7] cgroup: rename css_tryget*() to css_tryget_online*() Tejun Heo @ 2014-05-13 6:35 ` Li Zefan 2014-05-13 16:12 ` Tejun Heo 8 siblings, 0 replies; 15+ messages in thread From: Li Zefan @ 2014-05-13 6:35 UTC (permalink / raw) To: Tejun Heo; +Cc: cgroups, linux-kernel On 2014/5/10 3:32, Tejun Heo wrote: > Hello, > > Changes from the last take[L] are, > > * 0002, 0003 and 0007 added. > > * Other patches are refreshed without content change. > > This patchset contains the following seven patches. > > 0001-cgroup-fix-offlining-child-waiting-in-cgroup_subtree.patch > 0002-cgroup-cgroup_idr_lock-should-be-bh.patch > 0003-cgroup-css_release-shouldn-t-clear-cgroup-subsys.patch > 0004-cgroup-update-and-fix-parsing-of-cgroup.subtree_cont.patch > 0005-cgroup-use-restart_syscall-for-retries-after-offline.patch > 0006-cgroup-use-release_agent_path_lock-in-cgroup_release.patch > 0007-cgroup-rename-css_tryget-to-css_tryget_online.patch > > 0001 fixes two bugs in cgroup_subtree_control_write(). > > 0004 fixes and makes subtree_control parsing stricter. > > 0005 simplifies cgroup_substree_control_write() retry path by using > restart_syscall(). > > 0006 makes cgroup_release_agent_show() use release_path_lock. The > original conversion missed this one. > > 0007 renames css_tryget() to css_tryget_online(). This patch was > posted separately before - [1] - and acked by Michal and Johannes. > > This patchset is on top of cgroup/for-3.16 6e1a046e9458 ("Merge branch > 'for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu > into for-3.16") and available in the following git branch. > > git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-post-unified-updates-v2 > > diffstat follows. > > block/blk-cgroup.c | 2 - > fs/bio.c | 2 - > include/linux/cgroup.h | 14 +++---- > kernel/cgroup.c | 88 +++++++++++++++++++++++++------------------------ > kernel/cpuset.c | 6 +-- > kernel/events/core.c | 3 + > mm/hugetlb_cgroup.c | 2 - > mm/memcontrol.c | 46 +++++++++++++------------ > 8 files changed, 84 insertions(+), 79 deletions(-) > Acked-by: Li Zefan <lizefan@huawei.com> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCHSET v2 cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates 2014-05-09 19:32 [PATCHSET v2 cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates Tejun Heo ` (7 preceding siblings ...) 2014-05-13 6:35 ` [PATCHSET v2 cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates Li Zefan @ 2014-05-13 16:12 ` Tejun Heo 8 siblings, 0 replies; 15+ messages in thread From: Tejun Heo @ 2014-05-13 16:12 UTC (permalink / raw) To: lizefan; +Cc: cgroups, linux-kernel On Fri, May 09, 2014 at 03:32:48PM -0400, Tejun Heo wrote: > Hello, > > Changes from the last take[L] are, > > * 0002, 0003 and 0007 added. > > * Other patches are refreshed without content change. > > This patchset contains the following seven patches. > > 0001-cgroup-fix-offlining-child-waiting-in-cgroup_subtree.patch > 0002-cgroup-cgroup_idr_lock-should-be-bh.patch > 0003-cgroup-css_release-shouldn-t-clear-cgroup-subsys.patch > 0004-cgroup-update-and-fix-parsing-of-cgroup.subtree_cont.patch > 0005-cgroup-use-restart_syscall-for-retries-after-offline.patch > 0006-cgroup-use-release_agent_path_lock-in-cgroup_release.patch > 0007-cgroup-rename-css_tryget-to-css_tryget_online.patch Applied to cgroup/for-3.16. Thanks. -- tejun ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2014-05-13 16:12 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-05-09 19:32 [PATCHSET v2 cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates Tejun Heo 2014-05-09 19:32 ` [PATCH 1/7] cgroup: fix offlining child waiting in cgroup_subtree_control_write() Tejun Heo 2014-05-09 19:32 ` [PATCH 2/7] cgroup: cgroup_idr_lock should be bh Tejun Heo 2014-05-09 19:32 ` [PATCH 3/7] cgroup: css_release() shouldn't clear cgroup->subsys[] Tejun Heo 2014-05-09 19:32 ` [PATCH 4/7] cgroup: update and fix parsing of "cgroup.subtree_control" Tejun Heo 2014-05-13 3:51 ` Li Zefan 2014-05-13 15:18 ` Tejun Heo 2014-05-09 19:32 ` [PATCH 5/7] cgroup: use restart_syscall() for retries after offline waits in cgroup_subtree_control_write() Tejun Heo 2014-05-13 5:00 ` Li Zefan 2014-05-13 15:19 ` Tejun Heo 2014-05-09 19:32 ` [PATCH 6/7] cgroup: use release_agent_path_lock in cgroup_release_agent_show() Tejun Heo 2014-05-09 19:32 ` [PATCH 7/7] cgroup: rename css_tryget*() to css_tryget_online*() Tejun Heo 2014-05-09 19:47 ` [PATCH v2 " Tejun Heo 2014-05-13 6:35 ` [PATCHSET v2 cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates Li Zefan 2014-05-13 16:12 ` Tejun Heo
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).