* [PATCH] cgroups: fix 2.6.32 regression causing BUG_ON() in cgroup_diput()
@ 2009-12-23 18:48 Dave Anderson
2009-12-24 5:47 ` Li Zefan
2009-12-24 8:38 ` Ben Blum
0 siblings, 2 replies; 3+ messages in thread
From: Dave Anderson @ 2009-12-23 18:48 UTC (permalink / raw)
To: menage; +Cc: linux-kernel, bblum, lizf
[-- Attachment #1: Type: text/plain, Size: 1591 bytes --]
The LTP cgroup test suite generates a "kernel BUG at kernel/cgroup.c:790!"
here in cgroup_diput():
	/*
	 * if we're getting rid of the cgroup, refcount should ensure
	 * that there are no pidlists left.
	 */
	BUG_ON(!list_empty(&cgrp->pidlists));
The cgroup pidlist rework in 2.6.32 generates the BUG_ON, which is caused
when pidlist_array_load() calls cgroup_pidlist_find():
(1) if a matching cgroup_pidlist is found, it down_write's the mutex of the
pre-existing cgroup_pidlist, and increments its use_count.
(2) if no matching cgroup_pidlist is found, then a new one is allocated, it
down_write's its mutex, and the use_count is set to 0.
(3) the matching, or new, cgroup_pidlist gets returned back to pidlist_array_load(),
which increments its use_count -- regardless of whether new or pre-existing --
and up_write's the mutex.
So if a matching list is ever encountered by cgroup_pidlist_find() during
the life of a cgroup directory, it results in an inflated use_count value,
preventing it from ever getting released by cgroup_release_pid_array().
Then if the directory is subsequently removed, cgroup_diput() hits the
BUG_ON() when it finds that the directory's cgroup is still populated
with a pidlist.
The patch simply removes the use_count increment when a matching
pidlist is found by cgroup_pidlist_find(), because it gets bumped by
the calling pidlist_array_load() function while still protected by the
list's mutex.
Signed-off-by: Dave Anderson <anderson@redhat.com>
---
[-- Attachment #2: cgroup.patch --]
[-- Type: text/x-patch, Size: 301 bytes --]
--- linux-2.6-git/kernel/cgroup.c.orig
+++ linux-2.6-git/kernel/cgroup.c
@@ -2468,7 +2468,6 @@ static struct cgroup_pidlist *cgroup_pid
 			/* make sure l doesn't vanish out from under us */
 			down_write(&l->mutex);
 			mutex_unlock(&cgrp->pidlist_mutex);
-			l->use_count++;
 			return l;
 		}
 	}
* Re: [PATCH] cgroups: fix 2.6.32 regression causing BUG_ON() in cgroup_diput()
2009-12-23 18:48 [PATCH] cgroups: fix 2.6.32 regression causing BUG_ON() in cgroup_diput() Dave Anderson
@ 2009-12-24 5:47 ` Li Zefan
2009-12-24 8:38 ` Ben Blum
1 sibling, 0 replies; 3+ messages in thread
From: Li Zefan @ 2009-12-24 5:47 UTC (permalink / raw)
To: Dave Anderson
Cc: menage, linux-kernel, Ben Blum, Andrew Morton,
containers@lists.osdl.org
CC: Andrew
CC: Container list
Dave Anderson wrote:
>
> The LTP cgroup test suite generates a "kernel BUG at kernel/cgroup.c:790!"
> here in cgroup_diput():
>
>	/*
>	 * if we're getting rid of the cgroup, refcount should ensure
>	 * that there are no pidlists left.
>	 */
> BUG_ON(!list_empty(&cgrp->pidlists));
>
Good catch. Thanks!
This BUG can be easily triggered if 2 threads are reading the same cgroup's
tasks file at the same time, and then the cgroup gets removed.
And this patch needs to be added to 2.6.32.x.
> The cgroup pidlist rework in 2.6.32 generates the BUG_ON, which is caused
> when pidlist_array_load() calls cgroup_pidlist_find():
>
> (1) if a matching cgroup_pidlist is found, it down_write's the mutex of the
> pre-existing cgroup_pidlist, and increments its use_count.
> (2) if no matching cgroup_pidlist is found, then a new one is allocated, it
> down_write's its mutex, and the use_count is set to 0.
> (3) the matching, or new, cgroup_pidlist gets returned back to pidlist_array_load(),
> which increments its use_count -- regardless of whether new or pre-existing --
> and up_write's the mutex.
>
> So if a matching list is ever encountered by cgroup_pidlist_find() during
> the life of a cgroup directory, it results in an inflated use_count value,
> preventing it from ever getting released by cgroup_release_pid_array().
> Then if the directory is subsequently removed, cgroup_diput() hits the
> BUG_ON() when it finds that the directory's cgroup is still populated
> with a pidlist.
>
> The patch simply removes the use_count increment when a matching
> pidlist is found by cgroup_pidlist_find(), because it gets bumped by
> the calling pidlist_array_load() function while still protected by the
> list's mutex.
>
> Signed-off-by: Dave Anderson <anderson@redhat.com>
>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
* Re: [PATCH] cgroups: fix 2.6.32 regression causing BUG_ON() in cgroup_diput()
2009-12-23 18:48 [PATCH] cgroups: fix 2.6.32 regression causing BUG_ON() in cgroup_diput() Dave Anderson
2009-12-24 5:47 ` Li Zefan
@ 2009-12-24 8:38 ` Ben Blum
1 sibling, 0 replies; 3+ messages in thread
From: Ben Blum @ 2009-12-24 8:38 UTC (permalink / raw)
To: Dave Anderson; +Cc: menage, linux-kernel, lizf
On Wed, Dec 23, 2009 at 01:48:42PM -0500, Dave Anderson wrote:
>
> The LTP cgroup test suite generates a "kernel BUG at kernel/cgroup.c:790!"
> here in cgroup_diput():
>
>	/*
>	 * if we're getting rid of the cgroup, refcount should ensure
>	 * that there are no pidlists left.
>	 */
> BUG_ON(!list_empty(&cgrp->pidlists));
>
> The cgroup pidlist rework in 2.6.32 generates the BUG_ON, which is caused
> when pidlist_array_load() calls cgroup_pidlist_find():
>
> (1) if a matching cgroup_pidlist is found, it down_write's the mutex of the
> pre-existing cgroup_pidlist, and increments its use_count.
> (2) if no matching cgroup_pidlist is found, then a new one is allocated, it
> down_write's its mutex, and the use_count is set to 0.
> (3) the matching, or new, cgroup_pidlist gets returned back to pidlist_array_load(),
> which increments its use_count -- regardless of whether new or pre-existing --
> and up_write's the mutex.
>
> So if a matching list is ever encountered by cgroup_pidlist_find() during
> the life of a cgroup directory, it results in an inflated use_count value,
> preventing it from ever getting released by cgroup_release_pid_array().
> Then if the directory is subsequently removed, cgroup_diput() hits the
> BUG_ON() when it finds that the directory's cgroup is still populated
> with a pidlist.
>
> The patch simply removes the use_count increment when a matching
> pidlist is found by cgroup_pidlist_find(), because it gets bumped by
> the calling pidlist_array_load() function while still protected by the
> list's mutex.
>
> Signed-off-by: Dave Anderson <anderson@redhat.com>
>
Ack! That was probably my fault. Good catch.