Re: [patch] mm, coredump: fail allocations when coredumping instead of oom killing

linux-mm.kvack.org archive mirror

From: Andrew Morton <akpm@linux-foundation.org>
To: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mgorman@suse.de>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Minchan Kim <minchan.kim@gmail.com>,
	Oleg Nesterov <oleg@redhat.com>,
	linux-mm@kvack.org
Subject: Re: [patch] mm, coredump: fail allocations when coredumping instead of oom killing
Date: Thu, 22 Mar 2012 16:07:03 -0700
Message-ID: <20120322160703.dbcf52a8.akpm@linux-foundation.org>
In-Reply-To: <alpine.DEB.2.00.1203191723470.3609@chino.kir.corp.google.com>

On Mon, 19 Mar 2012 17:46:47 -0700 (PDT)
David Rientjes <rientjes@google.com> wrote:

> On Mon, 19 Mar 2012, Andrew Morton wrote:
> 
> > > Yup, this is the one.  We only currently see this when a memcg is at its 
> > > limit and there are other threads that are trying to exit that are blocked 
> > > on a coredumper that can no longer get memory.  dump_write() calling 
> > > ->write() (ext4 in this case) causes a livelock when 
> > > add_to_page_cache_locked() tries to charge the soon-to-be-added pagecache 
> > > to the coredumper's memcg that is oom and calls 
> > > mem_cgroup_charge_common().  That allows the oom, but the oom killer will 
> > > find the other threads that are exiting and choose to be a no-op to avoid 
> > > needlessly killing threads.  The coredumper only has PF_DUMPCORE and not 
> > > PF_EXITING so it doesn't get immediately killed.
> > 
> > I don't understand the description of the livelock.  Does
> > add_to_page_cache_locked() succeed, or fail?  What does "allows the
> > oom" mean?
> > 
> 
> Sorry if it wasn't clear.  The coredumper calling into 
> add_to_page_cache_locked() calls the oom killer because the memcg is oom 
> (and would call the global oom killer if the entire system were oom).  The 
> oom killer, both memcg and global, doesn't do anything because it sees 
> eligible threads with PF_EXITING set.  This logic has existed for several 
> years to avoid needlessly oom killing additional threads when others are 
> already in the process of exiting and freeing their memory.  Those 
> PF_EXITING threads, however, are blocked on the coredumper to exit in 
> exit_mm(), so they'll never actually exit.  Thus, the coredumper must make 
> forward progress for anything to actually exit and the oom killer is 
> useless.
> 
> In this condition, there are a few options:
> 
>  - give the coredumper access to memory reserves and allow it to allocate,
>    essentially oom killing it,
> 
>  - fail coredumper memory allocations because of the oom condition and 
>    allow the threads blocked on it to exit, or
> 
>  - implement an oom killer timeout that would kill additional threads if 
>    we repeatedly call into it without making forward progress over a small 
>    period of time.
> 
> The first and last, in my opinion, are non-starters because they allow a 
> complete depletion of memory reserves if the coredumper is chosen and then 
> nothing is guaranteed to be able to ever exit.

Why does option 1 lead to reserve exhaustion?  If we have a zillion
simultaneous core dumps?

>  This patch implements the 
> middle option where we do our best effort to allow the coredump to be 
> successful (we even try direct reclaim before failing) but choose to fail 
> before calling into the oom killer and causing a livelock.
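
The livelock described above, and the effect of the middle option, can be
sketched with a toy model. This is a minimal illustration in Python, not
kernel code: the Task class, the flag values, select_victim(), charge_page()
and the retry bound are all hypothetical names and numbers chosen for the
example. The only behaviour taken from the thread is that the oom killer is a
no-op while any eligible thread has PF_EXITING set, and that the patch fails
the coredumper's allocation before the oom killer is called.

```python
from dataclasses import dataclass

# Toy flag values for illustration only; these are not the kernel's constants.
PF_EXITING = 0x1
PF_DUMPCORE = 0x2


@dataclass
class Task:
    name: str
    flags: int = 0


def select_victim(tasks):
    """Toy oom killer: do nothing if any thread is already exiting."""
    if any(t.flags & PF_EXITING for t in tasks):
        return None  # assume the exiting threads will free memory soon
    return tasks[0] if tasks else None


def charge_page(current, tasks, fail_when_dumping, max_retries=5):
    """Model one pagecache charge against a memcg that is at its limit.

    max_retries is an assumption that lets the model terminate; the real
    condition described in the thread retries without making progress.
    """
    for _ in range(max_retries):
        # Middle option: fail the allocation instead of calling the oom killer.
        if fail_when_dumping and current.flags & PF_DUMPCORE:
            return "enomem"
        victim = select_victim(tasks)
        if victim is not None:
            return "killed:" + victim.name
        # The oom killer was a no-op, so nothing was freed; try again.
    return "livelock"


# The exiting threads are blocked on the coredumper, so they never exit
# and the oom killer keeps declining to act.
dumper = Task("dumper", PF_DUMPCORE)
tasks = [dumper, Task("t1", PF_EXITING), Task("t2", PF_EXITING)]

print(charge_page(dumper, tasks, fail_when_dumping=False))  # livelock
print(charge_page(dumper, tasks, fail_when_dumping=True))   # enomem
```

In the second call the write fails, so the coredump may be truncated, but the
coredumper can finish and the threads blocked on it are then free to exit.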


Thread overview: 6+ messages
2012-03-15  2:15 [patch] mm, coredump: fail allocations when coredumping instead of oom killing David Rientjes
2012-03-15 10:20 ` Mel Gorman
2012-03-15 21:47   ` David Rientjes
2012-03-19 21:52     ` Andrew Morton
2012-03-20  0:46       ` David Rientjes
2012-03-22 23:07         ` Andrew Morton [this message]
