Re: [PATCH] memcg: handle panic_on_oom=always case

From: Nick Piggin <npiggin@suse.de>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"balbir@linux.vnet.ibm.com" <balbir@linux.vnet.ibm.com>,
	"nishimura@mxp.nes.nec.co.jp" <nishimura@mxp.nes.nec.co.jp>,
	rientjes@google.com,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>
Subject: Re: [PATCH] memcg: handle panic_on_oom=always case
Date: Wed, 17 Feb 2010 19:45:26 +1100
Message-ID: <20100217084526.GP5723@laptop>
In-Reply-To: <20100217150445.1a40201d.kamezawa.hiroyu@jp.fujitsu.com>

On Wed, Feb 17, 2010 at 03:04:45PM +0900, KAMEZAWA Hiroyuki wrote:
> tested on mmotm-Feb11.
> 
> Balbir-san, Nishimura-san, I want review from both of you.
> 
> ==
> 
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> 
> Now, if panic_on_oom=2, the whole system panics even if the oom happened
> in some special situation (such as cpuset, mempolicy, ...).
> That is, panic_on_oom=2 means panic_on_oom_always.
> 
> Now, memcg doesn't check the panic_on_oom flag. This patch adds a check.
> 
> Maybe someone doubts how useful it is. kdump+panic_on_oom=2 is the
> last tool to investigate what happened in an oom-ed system. If a task is
> killed, the system recovers and the used memory is freed, there will be few
> hints as to what happened. In a mission-critical system, oom should never
> happen, so investigation after an OOM is very important.
> panic_on_oom=2+kdump is then useful to avoid the next OOM by providing
> precise information via a snapshot.

No, I don't doubt it is useful, and I think this is probably the simplest
and most useful semantic. So thanks for doing this.

I hate to pick nits in a trivial patch but I will anyway:


> TODO:
>  - Since memcg is for isolating the system's memory usage, an oom-notifier
>    and freeze_at_oom (or rest_at_oom) should be implemented. Then a
>    management daemon can do similar jobs (as kdump) in a safer way, or take
>    a snapshot per cgroup.
> 
> CC: Balbir Singh <balbir@linux.vnet.ibm.com>
> CC: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> CC: David Rientjes <rientjes@google.com>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> ---
>  Documentation/cgroups/memory.txt |    2 ++
>  Documentation/sysctl/vm.txt      |    5 ++++-
>  mm/oom_kill.c                    |    2 ++
>  3 files changed, 8 insertions(+), 1 deletion(-)
> 
> Index: mmotm-2.6.33-Feb11/Documentation/cgroups/memory.txt
> ===================================================================
> --- mmotm-2.6.33-Feb11.orig/Documentation/cgroups/memory.txt
> +++ mmotm-2.6.33-Feb11/Documentation/cgroups/memory.txt
> @@ -182,6 +182,8 @@ list.
>  NOTE: Reclaim does not work for the root cgroup, since we cannot set any
>  limits on the root cgroup.
>  
> +Note2: When panic_on_oom is set to "2", the whole system will panic.
> +

Maybe:

NOTE2: When panic_on_oom is set to "2", the whole system will panic in
case of an oom event in any cgroup.

>  2. Locking
>  
>  The memory controller uses the following hierarchy
> Index: mmotm-2.6.33-Feb11/Documentation/sysctl/vm.txt
> ===================================================================
> --- mmotm-2.6.33-Feb11.orig/Documentation/sysctl/vm.txt
> +++ mmotm-2.6.33-Feb11/Documentation/sysctl/vm.txt
> @@ -573,11 +573,14 @@ Because other nodes' memory may be free.
>  may be not fatal yet.
>  
>  If this is set to 2, the kernel panics compulsorily even on the
> -above-mentioned.
> +above-mentioned. Even oom happens under memoyr cgroup, the whole
> +system panics.
                                           memory

>  
>  The default value is 0.
>  1 and 2 are for failover of clustering. Please select either
>  according to your policy of failover.
> +2 seems too strong but panic_on_oom=2+kdump gives you very strong
> +tool to investigate a system which should never cause OOM.

I don't think you need to say 2 seems too strong because, as you rightly
say, it has real uses. The hint about using it to investigate OOM
conditions is good though.

>  
>  =============================================================
>  
> Index: mmotm-2.6.33-Feb11/mm/oom_kill.c
> ===================================================================
> --- mmotm-2.6.33-Feb11.orig/mm/oom_kill.c
> +++ mmotm-2.6.33-Feb11/mm/oom_kill.c
> @@ -471,6 +471,8 @@ void mem_cgroup_out_of_memory(struct mem
>  	unsigned long points = 0;
>  	struct task_struct *p;
>  
> +	if (sysctl_panic_on_oom == 2)
> +		panic("out of memory(memcg). panic_on_oom is selected.\n");
>  	read_lock(&tasklist_lock);
>  retry:
>  	p = select_bad_process(&points, mem);
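
For anyone reading along, the semantics discussed above can be sketched as
a small user-space C model. This is only an illustration under stated
assumptions, not the kernel code: the enum oom_scope and the helper
should_panic_on_oom() are hypothetical names made up for this sketch. The
values 0, 1 and 2 follow the vm.txt text quoted above, and the memcg case
for 2 follows the check this patch adds to mem_cgroup_out_of_memory().

```c
#include <stdbool.h>

/*
 * Hypothetical labels for where the OOM was detected.
 * These are illustration-only names, not kernel identifiers.
 */
enum oom_scope {
	OOM_SCOPE_SYSTEM,	/* the whole machine is out of memory */
	OOM_SCOPE_CPUSET,	/* allocation constrained by a cpuset */
	OOM_SCOPE_MEMPOLICY,	/* allocation constrained by a mempolicy */
	OOM_SCOPE_MEMCG,	/* a memory cgroup hit its limit */
};

/*
 * Model of the policy: returns true if the system would panic
 * instead of selecting a task to kill.
 *
 *   0: never panic, always kill a task
 *   1: panic only for a system-wide OOM; constrained cases
 *      (cpuset, mempolicy, memcg) still kill a task
 *   2: always panic, including the memcg case added by this patch
 */
static bool should_panic_on_oom(int panic_on_oom, enum oom_scope scope)
{
	if (panic_on_oom == 2)
		return true;
	if (panic_on_oom == 1)
		return scope == OOM_SCOPE_SYSTEM;
	return false;
}
```

With this model, should_panic_on_oom(2, OOM_SCOPE_MEMCG) is true while
should_panic_on_oom(1, OOM_SCOPE_MEMCG) is false, which is the
"panic_on_oom=always" behaviour the subject line refers to.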
