Re: [PATCH] memcg: fix oops in oom handling

From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: Li Zefan <lizf@cn.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Pavel Emelianov <xemul@openvz.org>,
	Paul Menage <menage@google.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [PATCH] memcg: fix oops in oom handling
Date: Mon, 14 Apr 2008 13:25:43 +0530
Message-ID: <48030DFF.9070407@linux.vnet.ibm.com>
In-Reply-To: <4802FF10.6030905@cn.fujitsu.com>

Li Zefan wrote:
> When I used a test program to fork mass processes and immediately
> move them to a cgroup where the memory limit is low enough to
> trigger oom kill, I got oops:
> 
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000808
> IP: [<ffffffff8045c47f>] _spin_lock_irqsave+0x8/0x18
> PGD 4c95f067 PUD 4406c067 PMD 0
> Oops: 0002 [1] SMP
> CPU 2
> Modules linked in:
> 
> Pid: 11973, comm: a.out Not tainted 2.6.25-rc7 #5
> RIP: 0010:[<ffffffff8045c47f>]  [<ffffffff8045c47f>] _spin_lock_irqsave+0x8/0x18
> RSP: 0018:ffff8100448c7c30  EFLAGS: 00010002
> RAX: 0000000000000202 RBX: 0000000000000009 RCX: 000000000001c9f3
> RDX: 0000000000000100 RSI: 0000000000000001 RDI: 0000000000000808
> RBP: ffff81007e444080 R08: 0000000000000000 R09: ffff8100448c7900
> R10: ffff81000105f480 R11: 00000100ffffffff R12: ffff810067c84140
> R13: 0000000000000001 R14: ffff8100441d0018 R15: ffff81007da56200
> FS:  00007f70eb1856f0(0000) GS:ffff81007fbad3c0(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000000808 CR3: 000000004498a000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process a.out (pid: 11973, threadinfo ffff8100448c6000, task ffff81007da533e0)
> Stack:  ffffffff8023ef5a 00000000000000d0 ffffffff80548dc0 00000000000000d0
>  ffff810067c84140 ffff81007e444080 ffffffff8026cef9 00000000000000d0
>  ffff8100441d0000 00000000000000d0 ffff8100441d0000 ffff8100505445c0
> Call Trace:
>  [<ffffffff8023ef5a>] ? force_sig_info+0x25/0xb9
>  [<ffffffff8026cef9>] ? oom_kill_task+0x77/0xe2
>  [<ffffffff8026d696>] ? mem_cgroup_out_of_memory+0x55/0x67
>  [<ffffffff802910ad>] ? mem_cgroup_charge_common+0xec/0x202
>  [<ffffffff8027997b>] ? handle_mm_fault+0x24e/0x77f
>  [<ffffffff8022c4af>] ? default_wake_function+0x0/0xe
>  [<ffffffff8027a17a>] ? get_user_pages+0x2ce/0x3af
>  [<ffffffff80290fee>] ? mem_cgroup_charge_common+0x2d/0x202
>  [<ffffffff8027a441>] ? make_pages_present+0x8e/0xa4
>  [<ffffffff8027d1ab>] ? mmap_region+0x373/0x429
>  [<ffffffff8027d7eb>] ? do_mmap_pgoff+0x2ff/0x364
>  [<ffffffff80210471>] ? sys_mmap+0xe5/0x111
>  [<ffffffff8020bfc9>] ? tracesys+0xdc/0xe1
> 
> Code: 00 00 01 48 8b 3c 24 e9 46 d4 dd ff f0 ff 07 48 8b 3c 24 e9 3a d4 dd ff fe 07 48 8b 3c 24 e9 2f d4 dd ff 9c 58 fa ba 00 01 00 00 <f0> 66 0f c1 17 38 f2 74 06 f3 90 8a 17 eb f6 c3 fa b8 00 01 00
> RIP  [<ffffffff8045c47f>] _spin_lock_irqsave+0x8/0x18
>  RSP <ffff8100448c7c30>
> CR2: 0000000000000808
> ---[ end trace c3702fa668021ea4 ]---
> 
> It's reproducible on an x86_64 box, but doesn't happen on x86_32.
> 
> This is because tsk->sighand is not guarded by RCU, so we have to
> hold tasklist_lock, just as out_of_memory() does.
> 
> Signed-off-by: Li Zefan <lizf@cn.fujitsu>
> ---
>  mm/oom_kill.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index f255eda..beb592f 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -423,7 +423,7 @@ void mem_cgroup_out_of_memory(struct mem_cgroup *mem, gfp_t gfp_mask)
>  	struct task_struct *p;
> 
>  	cgroup_lock();
> -	rcu_read_lock();
> +	read_lock(&tasklist_lock);
>  retry:
>  	p = select_bad_process(&points, mem);
>  	if (PTR_ERR(p) == -1UL)
> @@ -436,7 +436,7 @@ retry:
>  				"Memory cgroup out of memory"))
>  		goto retry;
>  out:
> -	rcu_read_unlock();
> +	read_unlock(&tasklist_lock);
>  	cgroup_unlock();
>  }
>  #endif
> -- 1.5.4.rc3 

This looks sane to me.

Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL
