* 2 bug fixes for the 3.4 cgroup code (v2)
@ 2013-01-09 15:03 Hans de Goede
[not found] ` <1357743822-21707-1-git-send-email-hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 7+ messages in thread
From: Hans de Goede @ 2013-01-09 15:03 UTC (permalink / raw)
To: Tejun Heo, Li Zefan; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA
So I messed things up a bit with v1 of this patchset, the problem is that
to simplify debugging I had disabled all CGROUP Kconfig options, except
for CGROUPS itself. Which turned cgroup_clear_css_refs() into a nop.
Upon further testing after posting v1, with the CGROUP Kconfig options
enabled again, I hit a kernel-panic. The problem was I did the cgrp->count != 0
test after calling/testing cgroup_clear_css_refs(), so now
cgroup_clear_css_refs() would get called twice despite succeeding the first
time in case the cgrp->count != 0 test fails. Causing the BUG() in there to
avoid the refs being decremented twice to trigger.
The fix in v2 of the patch is easy, simply test cgrp->count != 0 before
calling/testing cgroup_clear_css_refs().
And for people just tuning in, here is the "cover letter" for v1 from
v1 of the patch-set:
As a spare-time project I'm working on Linux on Allwinner A10 socs (ARM
mach-sun4i). While the linux-sunxi project is slowly working on cleaning
up the code and getting support for these socs upstream (first bits have
landed in 3.8), to be able to truely use these systems we're stuck on using
3.4 for now, since that is the latest to which all the ugly Allwinner code
has been forward-ported. See: http://linux-sunxi.org
I'm using Fedora-18 as userland, and since that uses systemd it exercises
the cgroup subsystem quite a bit. Thanks to building the kernel with
CONFIG_DEBUG_LIST this has exposed a use after free bug in the 3.4 cgroup
code. While debugging this I've also noticed some missing locking (or so
I believe).
The current cgroup code seems to be unaffected by both issues. Since the
current code is significantly changed, I've written a simple fix for 3.4,
which I would like to propose to be added to the 3.4.xx bugfix releases.
Allthough I know everyone dislikes looking back at old code once it has
been rewritten in a better fashion, I hope you are still willing to make
some time to review these 2 simple patches, as a first step in getting
them into 3.4.xx.
Some details on the differences between the current cgroup code, which
does not have these issues and the 3.4 code, in the current code:
1) The code-segment missing the locking has been removed
2) Unlike the 3.4 code cgroup_rmdir() properly checks cgrp->count
Note that the cgrp->count check in the current cgroup_rmdir() code has the
check before waiting on the cgroup_rmdir_waitq, and simply returns EBUSY when
cgrp->count != 0, where as my patch adds the check inside the waiting block
and waits for the waitq to be woken up.
The reason for this difference is that the current cgroup code handles
the clearing of a css_set / the link->cgrp_link_list elements from
cgrp->css_sets directly (and does not call cgroup_wakeup_rmdir_waiter()
when doing so), where as the 3.4 code handles the clearing of this list
from a workqueue, and *does* call cgroup_wakeup_rmdir_waiter() when clearing
the list.
Regards,
Hans
^ permalink raw reply [flat|nested] 7+ messages in thread[parent not found: <1357743822-21707-1-git-send-email-hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* [PATCH 3.4 1/2] cgroup: Take css_set_lock when calling cgroup_css_sets_empty() [not found] ` <1357743822-21707-1-git-send-email-hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2013-01-09 15:03 ` Hans de Goede [not found] ` <1357743822-21707-2-git-send-email-hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2013-01-09 15:03 ` [PATCH 3.4 2/2] cgroup: Fix use after free of cgrp (cgrp->css_sets) (v2) Hans de Goede 1 sibling, 1 reply; 7+ messages in thread From: Hans de Goede @ 2013-01-09 15:03 UTC (permalink / raw) To: Tejun Heo, Li Zefan; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, Hans de Goede As indicated in the comment above cgroup_css_sets_empty the css_set_lock must be hold when calling it. Signed-off-by: Hans de Goede <hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> --- kernel/cgroup.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/kernel/cgroup.c b/kernel/cgroup.c index c908354..59e711a 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -3980,7 +3980,10 @@ static int cgroup_rmdir(struct inode *unused_dir, struct dentry *dentry) /* the vfs holds both inode->i_mutex already */ again: mutex_lock(&cgroup_mutex); - if (!cgroup_css_sets_empty(cgrp)) { + read_lock(&css_set_lock); + ret = cgroup_css_sets_empty(cgrp); + read_unlock(&css_set_lock); + if (!ret) { mutex_unlock(&cgroup_mutex); return -EBUSY; } -- 1.8.0.2 ^ permalink raw reply related [flat|nested] 7+ messages in thread
[parent not found: <1357743822-21707-2-git-send-email-hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 3.4 1/2] cgroup: Take css_set_lock when calling cgroup_css_sets_empty() [not found] ` <1357743822-21707-2-git-send-email-hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2013-01-10 2:43 ` Li Zefan [not found] ` <50EE2ABD.5070404-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 7+ messages in thread From: Li Zefan @ 2013-01-10 2:43 UTC (permalink / raw) To: Hans de Goede; +Cc: Tejun Heo, cgroups-u79uwXL29TY76Z2rM5mHXA On 2013/1/9 23:03, Hans de Goede wrote: > As indicated in the comment above cgroup_css_sets_empty the > css_set_lock must be hold when calling it. > > Signed-off-by: Hans de Goede <hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> > --- > kernel/cgroup.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/kernel/cgroup.c b/kernel/cgroup.c > index c908354..59e711a 100644 > --- a/kernel/cgroup.c > +++ b/kernel/cgroup.c > @@ -3980,7 +3980,10 @@ static int cgroup_rmdir(struct inode *unused_dir, struct dentry *dentry) > /* the vfs holds both inode->i_mutex already */ > again: > mutex_lock(&cgroup_mutex); > - if (!cgroup_css_sets_empty(cgrp)) { > + read_lock(&css_set_lock); > + ret = cgroup_css_sets_empty(cgrp); This function doesn't exist in 3.4.xx kernel! It was introduced by an out-of-tree patch, and that patch introduced these two bugs (maybe more). > + read_unlock(&css_set_lock); > + if (!ret) { > mutex_unlock(&cgroup_mutex); > return -EBUSY; > } > ^ permalink raw reply [flat|nested] 7+ messages in thread
[parent not found: <50EE2ABD.5070404-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 3.4 1/2] cgroup: Take css_set_lock when calling cgroup_css_sets_empty() [not found] ` <50EE2ABD.5070404-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> @ 2013-01-10 8:49 ` Hans de Goede 0 siblings, 0 replies; 7+ messages in thread From: Hans de Goede @ 2013-01-10 8:49 UTC (permalink / raw) To: Li Zefan; +Cc: Tejun Heo, cgroups-u79uwXL29TY76Z2rM5mHXA Hi, On 01/10/2013 03:43 AM, Li Zefan wrote: > On 2013/1/9 23:03, Hans de Goede wrote: >> As indicated in the comment above cgroup_css_sets_empty the >> css_set_lock must be hold when calling it. >> >> Signed-off-by: Hans de Goede <hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> >> --- >> kernel/cgroup.c | 5 ++++- >> 1 file changed, 4 insertions(+), 1 deletion(-) >> >> diff --git a/kernel/cgroup.c b/kernel/cgroup.c >> index c908354..59e711a 100644 >> --- a/kernel/cgroup.c >> +++ b/kernel/cgroup.c >> @@ -3980,7 +3980,10 @@ static int cgroup_rmdir(struct inode *unused_dir, struct dentry *dentry) >> /* the vfs holds both inode->i_mutex already */ >> again: >> mutex_lock(&cgroup_mutex); >> - if (!cgroup_css_sets_empty(cgrp)) { >> + read_lock(&css_set_lock); >> + ret = cgroup_css_sets_empty(cgrp); > > This function doesn't exist in 3.4.xx kernel! It was introduced by an > out-of-tree patch, and that patch introduced these two bugs (maybe more). Ah, I didn't expect the sunxi support patches in linux-sunxi to touch the cgroup code, but since they also bring in a ton of android stuff they do, and you're right, this android specific patch is the culprit: https://github.com/linux-sunxi/linux-sunxi/commit/dbc38c633f4b7abe97c53036df55fbb2040188bc Sorry about the noise. Regards, Hans ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 3.4 2/2] cgroup: Fix use after free of cgrp (cgrp->css_sets) (v2) [not found] ` <1357743822-21707-1-git-send-email-hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2013-01-09 15:03 ` [PATCH 3.4 1/2] cgroup: Take css_set_lock when calling cgroup_css_sets_empty() Hans de Goede @ 2013-01-09 15:03 ` Hans de Goede 1 sibling, 0 replies; 7+ messages in thread From: Hans de Goede @ 2013-01-09 15:03 UTC (permalink / raw) To: Tejun Heo, Li Zefan; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, Hans de Goede Running a 3.4 kernel + Fedora-18 (systemd) userland on my Allwinner A10 (arm cortex a8), I'm seeing repeated, reproducable list_del list corruption errors when build with CONFIG_DEBUG_LIST, and the backtrace always shows free_css_set_work as the function making the problematic list_del call. I've tracked this doen to a use after free of the cgrp struct, specifically of the cgrp->css_sets list_head, which gets cleared by free_css_set_work. Since free_css_set_work runs form a workqueue, it is possible for it to not be done with clearing the list when the cgrp gets free-ed. To avoid this the code adding the links increases cgrp->count, and the freeing code running from the workqueue decreases cgrp->count *after* doing list_del, and then if the count goes to 0 calls cgroup_wakeup_rmdir_waiter(). However cgroup_rmdir() is missing a check for cgrp->count != 0, causing it to still continue with the rmdir (which leads to the free-ing of the cgrp), before free_css_set_work is done. Sometimes the free-ed memory is re-used before free_css_set_work gets around to unlinking link->cgrp_link_list, triggering the list_del list corruption messages. This patch fixes this by properly checking for cgrp->count != 0 and waiting for the cgroup_rmdir_waitq in that case. Changes in v2: -The cgrp->count check must be done before the cgroup_clear_css_refs() call/check, since on success cgroup_clear_css_refs has decremented all the css refs, and that *must* be done only once Signed-off-by: Hans de Goede <hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> --- kernel/cgroup.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/kernel/cgroup.c b/kernel/cgroup.c index 59e711a..9b0e1b4 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -4022,7 +4022,12 @@ again: return -EBUSY; } prepare_to_wait(&cgroup_rmdir_waitq, &wait, TASK_INTERRUPTIBLE); - if (!cgroup_clear_css_refs(cgrp)) { + /* + * Note the cgrp->count check *must* be done before the + * cgroup_clear_css_refs() call, because on success that call has + * decremented all the css refs, and that *must* be done only once! + */ + if (atomic_read(&cgrp->count) != 0 || !cgroup_clear_css_refs(cgrp)) { mutex_unlock(&cgroup_mutex); /* * Because someone may call cgroup_wakeup_rmdir_waiter() before -- 1.8.0.2 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 3.4 0/2] 2 bug fixes for the 3.4 cgroup code
@ 2013-01-09 12:11 Hans de Goede
[not found] ` <1357733486-1742-1-git-send-email-hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 7+ messages in thread
From: Hans de Goede @ 2013-01-09 12:11 UTC (permalink / raw)
To: Tejun Heo, Li Zefan; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA
As a spare-time project I'm working on Linux on Allwinner A10 socs (ARM
mach-sun4i). While the linux-sunxi project is slowly working on cleaning
up the code and getting support for these socs upstream (first bits have
landed in 3.8), to be able to truely use these systems we're stuck on using
3.4 for now, since that is the latest to which all the ugly Allwinner code
has been forward-ported. See: http://linux-sunxi.org
I'm using Fedora-18 as userland, and since that uses systemd it exercises
the cgroup subsystem quite a bit. Thanks to building the kernel with
CONFIG_DEBUG_LIST this has exposed a use after free bug in the 3.4 cgroup
code. While debugging this I've also noticed some missing locking (or so
I believe).
The current cgroup code seems to be unaffected by both issues. Since the
current code is significantly changed, I've written a simple fix for 3.4,
which I would like to propose to be added to the 3.4.xx bugfix releases.
Allthough I know everyone dislikes looking back at old code once it has
been rewritten in a better fashion, I hope you are still willing to make
some time to review these 2 simple patches, as a first step in getting
them into 3.4.xx.
Some details on the differences between the current cgroup code, which
does not have these issues and the 3.4 code, in the current code:
1) The code-segment missing the locking has been removed
2) Unlike the 3.4 code cgroup_rmdir() properly checks cgrp->count
Note that the cgrp->count check in the current cgroup_rmdir() code has the
check before waiting on the cgroup_rmdir_waitq, and simply returns EBUSY when
cgrp->count != 0, where as my patch adds the check inside the waiting block
and waits for the waitq to be woken up.
The reason for this difference is that the current cgroup code handles
the clearing of a css_set / the link->cgrp_link_list elements from
cgrp->css_sets directly (and does not call cgroup_wakeup_rmdir_waiter()
when doing so), where as the 3.4 code handles the clearing of this list
from a workqueue, and *does* call cgroup_wakeup_rmdir_waiter() when clearing
the list.
Regards,
Hans
^ permalink raw reply [flat|nested] 7+ messages in thread[parent not found: <1357733486-1742-1-git-send-email-hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* [PATCH 3.4 1/2] cgroup: Take css_set_lock when calling cgroup_css_sets_empty() [not found] ` <1357733486-1742-1-git-send-email-hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2013-01-09 12:11 ` Hans de Goede [not found] ` <1357733486-1742-2-git-send-email-hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 7+ messages in thread From: Hans de Goede @ 2013-01-09 12:11 UTC (permalink / raw) To: Tejun Heo, Li Zefan; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, Hans de Goede As indicated in the comment above cgroup_css_sets_empty the css_set_lock must be hold when calling it. Signed-off-by: Hans de Goede <hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> --- kernel/cgroup.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/kernel/cgroup.c b/kernel/cgroup.c index c908354..59e711a 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -3980,7 +3980,10 @@ static int cgroup_rmdir(struct inode *unused_dir, struct dentry *dentry) /* the vfs holds both inode->i_mutex already */ again: mutex_lock(&cgroup_mutex); - if (!cgroup_css_sets_empty(cgrp)) { + read_lock(&css_set_lock); + ret = cgroup_css_sets_empty(cgrp); + read_unlock(&css_set_lock); + if (!ret) { mutex_unlock(&cgroup_mutex); return -EBUSY; } -- 1.8.0.2 ^ permalink raw reply related [flat|nested] 7+ messages in thread
[parent not found: <1357733486-1742-2-git-send-email-hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 3.4 1/2] cgroup: Take css_set_lock when calling cgroup_css_sets_empty() [not found] ` <1357733486-1742-2-git-send-email-hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2013-01-09 12:37 ` Hans de Goede 0 siblings, 0 replies; 7+ messages in thread From: Hans de Goede @ 2013-01-09 12:37 UTC (permalink / raw) To: Hans de Goede; +Cc: Tejun Heo, Li Zefan, cgroups-u79uwXL29TY76Z2rM5mHXA Hi, Self-nack, looks like I posted this series too soon. With a config which actually has CGROUP controllers enabled this causes a panic, so looks like I need to do some more work on this. Feedback on this issue is still appreciated. Regards, Hans On 01/09/2013 01:11 PM, Hans de Goede wrote: > As indicated in the comment above cgroup_css_sets_empty the > css_set_lock must be hold when calling it. > > Signed-off-by: Hans de Goede <hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> > --- > kernel/cgroup.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/kernel/cgroup.c b/kernel/cgroup.c > index c908354..59e711a 100644 > --- a/kernel/cgroup.c > +++ b/kernel/cgroup.c > @@ -3980,7 +3980,10 @@ static int cgroup_rmdir(struct inode *unused_dir, struct dentry *dentry) > /* the vfs holds both inode->i_mutex already */ > again: > mutex_lock(&cgroup_mutex); > - if (!cgroup_css_sets_empty(cgrp)) { > + read_lock(&css_set_lock); > + ret = cgroup_css_sets_empty(cgrp); > + read_unlock(&css_set_lock); > + if (!ret) { > mutex_unlock(&cgroup_mutex); > return -EBUSY; > } > ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2013-01-10 8:49 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-01-09 15:03 2 bug fixes for the 3.4 cgroup code (v2) Hans de Goede
[not found] ` <1357743822-21707-1-git-send-email-hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-01-09 15:03 ` [PATCH 3.4 1/2] cgroup: Take css_set_lock when calling cgroup_css_sets_empty() Hans de Goede
[not found] ` <1357743822-21707-2-git-send-email-hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-01-10 2:43 ` Li Zefan
[not found] ` <50EE2ABD.5070404-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2013-01-10 8:49 ` Hans de Goede
2013-01-09 15:03 ` [PATCH 3.4 2/2] cgroup: Fix use after free of cgrp (cgrp->css_sets) (v2) Hans de Goede
-- strict thread matches above, loose matches on Subject: below --
2013-01-09 12:11 [PATCH 3.4 0/2] 2 bug fixes for the 3.4 cgroup code Hans de Goede
[not found] ` <1357733486-1742-1-git-send-email-hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-01-09 12:11 ` [PATCH 3.4 1/2] cgroup: Take css_set_lock when calling cgroup_css_sets_empty() Hans de Goede
[not found] ` <1357733486-1742-2-git-send-email-hdegoede-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-01-09 12:37 ` Hans de Goede
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).