From: Li Zefan <lizf@cn.fujitsu.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>,
Linux Containers <containers@lists.linux-foundation.org>,
Paul Menage <menage@google.com>,
Balbir Singh <balbir@linux.vnet.ibm.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Paul Jackson <pj@sgi.com>
Subject: [PATCH] cgroup: fix a race condition in manipulating tsk->cg_list
Date: Thu, 17 Apr 2008 11:37:15 +0800
Message-ID: <4806C5EB.3040102@cn.fujitsu.com>
When I ran a test program that forks a large number of processes while
concurrently running 'cat /cgroup/tasks', I got the following oops:
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:72!
invalid opcode: 0000 [#1] SMP
Pid: 4178, comm: a.out Not tainted (2.6.25-rc9 #72)
...
Call Trace:
[<c044a5f9>] ? cgroup_exit+0x55/0x94
[<c0427acf>] ? do_exit+0x217/0x5ba
[<c0427ed7>] ? do_group_exit+0x65/0x7c
[<c0427efd>] ? sys_exit_group+0xf/0x11
[<c0404842>] ? syscall_call+0x7/0xb
[<c05e0000>] ? init_cyrix+0x2fa/0x479
...
EIP: [<c04df671>] list_del+0x35/0x53 SS:ESP 0068:ebc7df4
---[ end trace caffb7332252612b ]---
Fixing recursive fault but reboot is needed!
After digging into the code and debugging, I finally found a race
condition:
do_exit()
  ->cgroup_exit()
    ->if (!list_empty(&tsk->cg_list))
        list_del(&tsk->cg_list);

                                cgroup_iter_start()
                                  ->cgroup_enable_task_cg_lists()
                                    ->list_add(&tsk->cg_list, ..);
In this case the list entry won't be deleted even though the process has
exited.
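The ordering above can be replayed deterministically in user space. The
following is only a minimal sketch, not kernel code: toy_task, toy_exit(),
toy_enable() and replay_stale_entry() are hypothetical names, the list
helpers merely imitate list_head, and the single-threaded replay leaves out
task_lock() entirely. It assumes the task is not yet linked when it exits,
so the exit path has nothing to delete, and the enable path then links it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled loosely on list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *n) { n->next = n->prev = n; }
static bool list_is_empty(const struct list_node *n) { return n->next == n; }

static void list_add_node(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void list_del_node(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Hypothetical stand-in for task_struct: only what the race touches. */
struct toy_task {
	bool exiting;              /* plays the role of PF_EXITING */
	struct list_node cg_list;  /* plays the role of tsk->cg_list */
};

/* Exit path: mark the task as exiting, unlink it only if it is linked. */
static void toy_exit(struct toy_task *t)
{
	t->exiting = true;
	if (!list_is_empty(&t->cg_list))
		list_del_node(&t->cg_list);
}

/* Enable path: link the task; optionally skip tasks that are exiting. */
static void toy_enable(struct toy_task *t, struct list_node *tasks,
		       bool check_exiting)
{
	if (check_exiting && t->exiting)
		return;
	if (list_is_empty(&t->cg_list))
		list_add_node(&t->cg_list, tasks);
}

/*
 * Replay the bad ordering: exit first, enable second.
 * Returns true if the exited task is still linked afterwards.
 */
static bool replay_stale_entry(bool check_exiting)
{
	struct list_node tasks;
	struct toy_task t = { .exiting = false };

	list_init(&tasks);
	list_init(&t.cg_list);
	assert(list_is_empty(&t.cg_list));  /* task starts out unlinked */

	toy_exit(&t);                           /* finds nothing to delete */
	toy_enable(&t, &tasks, check_exiting);  /* may link a dead task */

	return !list_is_empty(&tasks);
}
```

Calling replay_stale_entry(false) leaves the exited task on the list, while
replay_stale_entry(true) skips it, which is the behaviour the exiting check
is meant to provide.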
We got two bug reports in the past, which seem to be the same bug as
this one:
http://lkml.org/lkml/2008/3/5/332
http://lkml.org/lkml/2007/10/17/224
Sometimes I got the oops in list_del, and sometimes in list_add. With a
small change to my test program I can also trigger other oopses.
The patch has been tested both on x86_32 and x86_64.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
---
kernel/cgroup.c | 7 ++++++-
1 files changed, 6 insertions(+), 1 deletions(-)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 2727f92..6d8de05 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1722,7 +1722,12 @@ void cgroup_enable_task_cg_lists(void)
 	use_task_css_set_links = 1;
 	do_each_thread(g, p) {
 		task_lock(p);
-		if (list_empty(&p->cg_list))
+		/*
+		 * We should check if the process is exiting, otherwise
+		 * it will race with cgroup_exit() in that the list
+		 * entry won't be deleted though the process has exited.
+		 */
+		if (!(p->flags & PF_EXITING) && list_empty(&p->cg_list))
 			list_add(&p->cg_list, &p->cgroups->tasks);
 		task_unlock(p);
 	} while_each_thread(g, p);
-- 
1.5.4.rc3