[PATCH] memcg: fix oops in oom handling

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Li Zefan <lizf@cn.fujitsu.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Pavel Emelianov <xemul@openvz.org>,
	Paul Menage <menage@google.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: [PATCH] memcg: fix oops in oom handling
Date: Mon, 14 Apr 2008 14:52:00 +0800	[thread overview]
Message-ID: <4802FF10.6030905@cn.fujitsu.com> (raw)

When I used a test program to fork mass processes and immediately
move them to a cgroup where the memory limit is low enough to
trigger oom kill, I got oops:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000808
IP: [<ffffffff8045c47f>] _spin_lock_irqsave+0x8/0x18
PGD 4c95f067 PUD 4406c067 PMD 0
Oops: 0002 [1] SMP
CPU 2
Modules linked in:

Pid: 11973, comm: a.out Not tainted 2.6.25-rc7 #5
RIP: 0010:[<ffffffff8045c47f>]  [<ffffffff8045c47f>] _spin_lock_irqsave+0x8/0x18
RSP: 0018:ffff8100448c7c30  EFLAGS: 00010002
RAX: 0000000000000202 RBX: 0000000000000009 RCX: 000000000001c9f3
RDX: 0000000000000100 RSI: 0000000000000001 RDI: 0000000000000808
RBP: ffff81007e444080 R08: 0000000000000000 R09: ffff8100448c7900
R10: ffff81000105f480 R11: 00000100ffffffff R12: ffff810067c84140
R13: 0000000000000001 R14: ffff8100441d0018 R15: ffff81007da56200
FS:  00007f70eb1856f0(0000) GS:ffff81007fbad3c0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000808 CR3: 000000004498a000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process a.out (pid: 11973, threadinfo ffff8100448c6000, task ffff81007da533e0)
Stack:  ffffffff8023ef5a 00000000000000d0 ffffffff80548dc0 00000000000000d0
 ffff810067c84140 ffff81007e444080 ffffffff8026cef9 00000000000000d0
 ffff8100441d0000 00000000000000d0 ffff8100441d0000 ffff8100505445c0
Call Trace:
 [<ffffffff8023ef5a>] ? force_sig_info+0x25/0xb9
 [<ffffffff8026cef9>] ? oom_kill_task+0x77/0xe2
 [<ffffffff8026d696>] ? mem_cgroup_out_of_memory+0x55/0x67
 [<ffffffff802910ad>] ? mem_cgroup_charge_common+0xec/0x202
 [<ffffffff8027997b>] ? handle_mm_fault+0x24e/0x77f
 [<ffffffff8022c4af>] ? default_wake_function+0x0/0xe
 [<ffffffff8027a17a>] ? get_user_pages+0x2ce/0x3af
 [<ffffffff80290fee>] ? mem_cgroup_charge_common+0x2d/0x202
 [<ffffffff8027a441>] ? make_pages_present+0x8e/0xa4
 [<ffffffff8027d1ab>] ? mmap_region+0x373/0x429
 [<ffffffff8027d7eb>] ? do_mmap_pgoff+0x2ff/0x364
 [<ffffffff80210471>] ? sys_mmap+0xe5/0x111
 [<ffffffff8020bfc9>] ? tracesys+0xdc/0xe1

Code: 00 00 01 48 8b 3c 24 e9 46 d4 dd ff f0 ff 07 48 8b 3c 24 e9 3a d4 dd ff fe 07 48 8b 3c 24 e9 2f d4 dd ff 9c 58 fa ba 00 01 00 00 <f0> 66 0f c1 17 38 f2 74 06 f3 90 8a 17 eb f6 c3 fa b8 00 01 00
RIP  [<ffffffff8045c47f>] _spin_lock_irqsave+0x8/0x18
 RSP <ffff8100448c7c30>
CR2: 0000000000000808
---[ end trace c3702fa668021ea4 ]---

It's reproducable in a x86_64 box, but doesn't happen in x86_32.

This is because tsk->sighand is not guarded by RCU, so we have to
hold tasklist_lock, just as what out_of_memory() does.

Signed-off-by: Li Zefan <lizf@cn.fujitsu>
---
 mm/oom_kill.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index f255eda..beb592f 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -423,7 +423,7 @@ void mem_cgroup_out_of_memory(struct mem_cgroup *mem, gfp_t gfp_mask)
 	struct task_struct *p;
 
 	cgroup_lock();
-	rcu_read_lock();
+	read_lock(&tasklist_lock);
 retry:
 	p = select_bad_process(&points, mem);
 	if (PTR_ERR(p) == -1UL)
@@ -436,7 +436,7 @@ retry:
 				"Memory cgroup out of memory"))
 		goto retry;
 out:
-	rcu_read_unlock();
+	read_unlock(&tasklist_lock);
 	cgroup_unlock();
 }
 #endif
-- 1.5.4.rc3

WARNING: multiple messages have this Message-ID (diff)

From: Li Zefan <lizf@cn.fujitsu.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Pavel Emelianov <xemul@openvz.org>,
	Paul Menage <menage@google.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: [PATCH] memcg: fix oops in oom handling
Date: Mon, 14 Apr 2008 14:52:00 +0800	[thread overview]
Message-ID: <4802FF10.6030905@cn.fujitsu.com> (raw)

When I used a test program to fork mass processes and immediately
move them to a cgroup where the memory limit is low enough to
trigger oom kill, I got oops:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000808
IP: [<ffffffff8045c47f>] _spin_lock_irqsave+0x8/0x18
PGD 4c95f067 PUD 4406c067 PMD 0
Oops: 0002 [1] SMP
CPU 2
Modules linked in:

Pid: 11973, comm: a.out Not tainted 2.6.25-rc7 #5
RIP: 0010:[<ffffffff8045c47f>]  [<ffffffff8045c47f>] _spin_lock_irqsave+0x8/0x18
RSP: 0018:ffff8100448c7c30  EFLAGS: 00010002
RAX: 0000000000000202 RBX: 0000000000000009 RCX: 000000000001c9f3
RDX: 0000000000000100 RSI: 0000000000000001 RDI: 0000000000000808
RBP: ffff81007e444080 R08: 0000000000000000 R09: ffff8100448c7900
R10: ffff81000105f480 R11: 00000100ffffffff R12: ffff810067c84140
R13: 0000000000000001 R14: ffff8100441d0018 R15: ffff81007da56200
FS:  00007f70eb1856f0(0000) GS:ffff81007fbad3c0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000808 CR3: 000000004498a000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process a.out (pid: 11973, threadinfo ffff8100448c6000, task ffff81007da533e0)
Stack:  ffffffff8023ef5a 00000000000000d0 ffffffff80548dc0 00000000000000d0
 ffff810067c84140 ffff81007e444080 ffffffff8026cef9 00000000000000d0
 ffff8100441d0000 00000000000000d0 ffff8100441d0000 ffff8100505445c0
Call Trace:
 [<ffffffff8023ef5a>] ? force_sig_info+0x25/0xb9
 [<ffffffff8026cef9>] ? oom_kill_task+0x77/0xe2
 [<ffffffff8026d696>] ? mem_cgroup_out_of_memory+0x55/0x67
 [<ffffffff802910ad>] ? mem_cgroup_charge_common+0xec/0x202
 [<ffffffff8027997b>] ? handle_mm_fault+0x24e/0x77f
 [<ffffffff8022c4af>] ? default_wake_function+0x0/0xe
 [<ffffffff8027a17a>] ? get_user_pages+0x2ce/0x3af
 [<ffffffff80290fee>] ? mem_cgroup_charge_common+0x2d/0x202
 [<ffffffff8027a441>] ? make_pages_present+0x8e/0xa4
 [<ffffffff8027d1ab>] ? mmap_region+0x373/0x429
 [<ffffffff8027d7eb>] ? do_mmap_pgoff+0x2ff/0x364
 [<ffffffff80210471>] ? sys_mmap+0xe5/0x111
 [<ffffffff8020bfc9>] ? tracesys+0xdc/0xe1

Code: 00 00 01 48 8b 3c 24 e9 46 d4 dd ff f0 ff 07 48 8b 3c 24 e9 3a d4 dd ff fe 07 48 8b 3c 24 e9 2f d4 dd ff 9c 58 fa ba 00 01 00 00 <f0> 66 0f c1 17 38 f2 74 06 f3 90 8a 17 eb f6 c3 fa b8 00 01 00
RIP  [<ffffffff8045c47f>] _spin_lock_irqsave+0x8/0x18
 RSP <ffff8100448c7c30>
CR2: 0000000000000808
---[ end trace c3702fa668021ea4 ]---

It's reproducable in a x86_64 box, but doesn't happen in x86_32.

This is because tsk->sighand is not guarded by RCU, so we have to
hold tasklist_lock, just as what out_of_memory() does.

Signed-off-by: Li Zefan <lizf@cn.fujitsu>
---
 mm/oom_kill.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index f255eda..beb592f 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -423,7 +423,7 @@ void mem_cgroup_out_of_memory(struct mem_cgroup *mem, gfp_t gfp_mask)
 	struct task_struct *p;
 
 	cgroup_lock();
-	rcu_read_lock();
+	read_lock(&tasklist_lock);
 retry:
 	p = select_bad_process(&points, mem);
 	if (PTR_ERR(p) == -1UL)
@@ -436,7 +436,7 @@ retry:
 				"Memory cgroup out of memory"))
 		goto retry;
 out:
-	rcu_read_unlock();
+	read_unlock(&tasklist_lock);
 	cgroup_unlock();
 }
 #endif
-- 1.5.4.rc3 



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next             reply	other threads:[~2008-04-14  6:54 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-04-14  6:52 Li Zefan [this message]
2008-04-14  6:52 ` [PATCH] memcg: fix oops in oom handling Li Zefan
2008-04-14  7:14 ` KAMEZAWA Hiroyuki
2008-04-14  7:14   ` KAMEZAWA Hiroyuki
2008-04-14  7:24   ` Li Zefan
2008-04-14  7:24     ` Li Zefan
2008-04-14  7:53   ` Paul Menage
2008-04-14  7:53     ` Paul Menage
2008-04-14  8:07     ` Li Zefan
2008-04-14  8:07       ` Li Zefan
2008-04-14  9:20       ` Li Zefan
2008-04-14  9:20         ` Li Zefan
2008-04-14  7:24 ` KAMEZAWA Hiroyuki
2008-04-14  7:24   ` KAMEZAWA Hiroyuki
2008-04-14  7:48   ` Andrew Morton
2008-04-14  7:48     ` Andrew Morton
2008-04-14  7:55 ` Balbir Singh
2008-04-14  7:55   ` Balbir Singh

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:f255eda dfblob:beb592f dfblob:f255eda dfblob:beb592f )
 OR (
bs:"[PATCH] memcg: fix oops in oom handling" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4802FF10.6030905@cn.fujitsu.com \
    --to=lizf@cn.fujitsu.com \
    --cc=akpm@linux-foundation.org \
    --cc=balbir@linux.vnet.ibm.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=menage@google.com \
    --cc=xemul@openvz.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.