From: Mel Gorman <mel@csn.ul.ie>
To: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Pekka Enberg <penberg@cs.helsinki.fi>,
arjan@infradead.org, linux-kernel@vger.kernel.org,
Christoph Lameter <cl@linux-foundation.org>,
Nick Piggin <npiggin@suse.de>
Subject: Re: upcoming kerneloops.org item: get_page_from_freelist
Date: Tue, 30 Jun 2009 21:32:49 +0100 [thread overview]
Message-ID: <20090630203249.GC6689@csn.ul.ie> (raw)
In-Reply-To: <alpine.DEB.2.00.0906301216180.16312@chino.kir.corp.google.com>
On Tue, Jun 30, 2009 at 12:35:59PM -0700, David Rientjes wrote:
> On Tue, 30 Jun 2009, Mel Gorman wrote:
>
> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index 175a67a..5f4656e 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -230,14 +230,21 @@ static struct task_struct *select_bad_process(unsigned long *ppoints,
> > /*
> > * This task already has access to memory reserves and is
> > * being killed. Don't allow any other task access to the
> > - * memory reserve.
> > + * memory reserve unless the current process is the one
> > + * selected for OOM-killing. If the current process has
> > + * been OOM-killed and we are OOM again, another process
> > + * needs to be considered for OOM-kill
> > *
> > * Note: this may have a chance of deadlock if it gets
> > * blocked waiting for another task which itself is waiting
> > * for memory. Is there a better alternative?
> > */
> > - if (test_tsk_thread_flag(p, TIF_MEMDIE))
> > - return ERR_PTR(-1UL);
> > + if (test_tsk_thread_flag(p, TIF_MEMDIE)) {
> > + if (p == current)
> > + continue;
> > + else
> > + return ERR_PTR(-1UL);
> > + }
> >
> > /*
> > * This is in the process of releasing memory so wait for it
>
> This will panic the machine if current is the only user thread running or
> eligible for oom kill (all others could have mm's with OOM_DISABLE set).
> Currently, such tasks can exit or kthreads can free memory so that the oom
> is recoverable.
>
Good point, would the following be ok instead?
+ if (test_tsk_thread_flag(p, TIF_MEMDIE)) {
+ if (p == current) {
+ chosen = ERR_PTR(-1UL);
+ continue;
+ } else
+ return ERR_PTR(-1UL);
> The problem with this approach is that it increases the liklihood that
> memory reserves will be totally depleted when several threads are
> competing for them.
>
How so?
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 5d714f8..5896469 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -1539,6 +1539,14 @@ should_alloc_retry(gfp_t gfp_mask, unsigned int order,
> > if (gfp_mask & __GFP_NORETRY)
> > return 0;
> >
> > + /* Do not loop if OOM-killed unless __GFP_NOFAIL is specified */
> > + if (test_thread_flag(TIF_MEMDIE)) {
> > + if (gfp_mask & __GFP_NOFAIL)
> > + WARN(1, "Potential infinite loop with __GFP_NOFAIL");
> > + else
> > + return 0;
> > + }
> > +
>
> There's a second bug in the refactored page allocator: when the oom killer
> is invoked and it happens to kill current, alloc_flags is never updated
> because it loops back to `restart', which is past gfp_to_alloc_flags().
>
> When that is fixed with the patch below, gfp_mask & __GFP_NOFAIL will
> never be true here in your scenario, where the oom killer kills current,
> because __alloc_pages_high_priority() will infinitely loop.
>
> > /*
> > * In this implementation, order <= PAGE_ALLOC_COSTLY_ORDER
> > * means __GFP_NOFAIL, but that may not be true in other
> >
>
> This is needed for 2.6.31-rc2.
>
>
> mm: update alloc_flags after oom killer has been called
>
> It is possible for the oom killer to select current as the task to kill.
> When this happens, alloc_flags needs to be updated accordingly to set
> ALLOC_NO_WATERMARKS so the subsequent allocation attempt may use memory
> reserves as the result of its thread having TIF_MEMDIE set if the
> allocation is not __GFP_NOMEMALLOC.
>
> Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
> ---
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1756,6 +1756,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>
> wake_all_kswapd(order, zonelist, high_zoneidx);
>
> +restart:
> /*
> * OK, we're below the kswapd watermark and have kicked background
> * reclaim. Now things get more complex, so set up alloc_flags according
> @@ -1763,7 +1764,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> */
> alloc_flags = gfp_to_alloc_flags(gfp_mask);
>
> -restart:
> /* This is the last chance, in general, before the goto nopage. */
> page = get_page_from_freelist(gfp_mask, nodemask, order, zonelist,
> high_zoneidx, alloc_flags & ~ALLOC_NO_WATERMARKS,
>
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
next prev parent reply other threads:[~2009-06-30 20:32 UTC|newest]
Thread overview: 78+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-06-24 15:07 upcoming kerneloops.org item: get_page_from_freelist Arjan van de Ven
2009-06-24 16:46 ` Andrew Morton
2009-06-24 16:52 ` Linus Torvalds
2009-06-24 16:55 ` Pekka Enberg
2009-06-24 16:56 ` Pekka Enberg
2009-06-24 17:00 ` Pekka Enberg
2009-06-24 17:55 ` Andrew Morton
2009-06-24 17:53 ` Pekka Enberg
2009-06-24 18:30 ` Andrew Morton
2009-06-24 18:42 ` Linus Torvalds
2009-06-24 18:44 ` Pekka Enberg
2009-06-24 18:50 ` Linus Torvalds
2009-06-24 19:12 ` Pekka J Enberg
2009-06-24 19:21 ` Linus Torvalds
2009-06-24 19:06 ` Andrew Morton
2009-06-24 19:16 ` Linus Torvalds
2009-06-24 19:36 ` Andrew Morton
2009-06-24 19:46 ` Linus Torvalds
2009-06-24 19:47 ` Linus Torvalds
2009-06-24 20:01 ` Andrew Morton
2009-06-24 20:13 ` Linus Torvalds
2009-06-24 20:40 ` Linus Torvalds
2009-06-24 22:07 ` Andrew Morton
2009-06-25 4:05 ` Nick Piggin
2009-06-25 13:25 ` Theodore Tso
2009-06-25 18:51 ` David Rientjes
2009-06-25 19:38 ` Theodore Tso
2009-06-25 19:44 ` Theodore Tso
2009-06-25 19:55 ` Andrew Morton
2009-06-25 20:11 ` Linus Torvalds
2009-06-25 20:22 ` Linus Torvalds
2009-06-25 20:36 ` David Rientjes
2009-06-25 20:51 ` Linus Torvalds
2009-06-25 22:25 ` David Rientjes
2009-06-26 8:51 ` Nick Piggin
2009-06-25 20:18 ` David Rientjes
2009-06-25 20:37 ` Theodore Tso
2009-06-25 21:05 ` Joel Becker
2009-06-25 21:26 ` Andreas Dilger
2009-06-25 22:05 ` Theodore Tso
2009-06-25 22:11 ` Eric Sandeen
2009-06-26 1:11 ` Theodore Tso
2009-06-26 5:16 ` Pekka J Enberg
2009-06-26 8:56 ` Nick Piggin
2009-06-26 8:58 ` Pekka Enberg
2009-06-26 9:07 ` Nick Piggin
2009-06-29 21:06 ` Christoph Lameter
2009-06-30 7:59 ` Nick Piggin
2009-06-26 14:41 ` Eric Sandeen
2009-06-29 21:15 ` Christoph Lameter
2009-06-29 21:20 ` Eric Sandeen
2009-06-29 22:35 ` Christoph Lameter
2009-06-25 19:55 ` Jens Axboe
2009-06-25 20:08 ` Jens Axboe
2009-06-24 21:56 ` Andrew Morton
2009-06-25 4:14 ` Nick Piggin
2009-06-25 8:21 ` David Rientjes
2009-06-29 15:30 ` Mel Gorman
2009-06-29 19:20 ` Andrew Morton
2009-06-30 11:00 ` Mel Gorman
2009-06-30 19:35 ` David Rientjes
2009-06-30 20:32 ` Mel Gorman [this message]
2009-06-30 20:51 ` David Rientjes
2009-07-01 10:22 ` Mel Gorman
2009-06-29 23:35 ` David Rientjes
2009-06-30 7:47 ` Nick Piggin
2009-06-30 8:13 ` David Rientjes
2009-06-30 8:24 ` Nick Piggin
2009-06-30 8:41 ` David Rientjes
2009-06-30 9:09 ` Nick Piggin
2009-06-30 19:47 ` David Rientjes
2009-06-30 6:27 ` Pavel Machek
2009-06-28 10:16 ` Pavel Machek
2009-06-28 18:01 ` Linus Torvalds
2009-06-28 18:27 ` Arjan van de Ven
2009-06-28 18:36 ` Linus Torvalds
2009-06-30 7:35 ` Pavel Machek
2009-06-24 18:43 ` Pekka Enberg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090630203249.GC6689@csn.ul.ie \
--to=mel@csn.ul.ie \
--cc=akpm@linux-foundation.org \
--cc=arjan@infradead.org \
--cc=cl@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=npiggin@suse.de \
--cc=penberg@cs.helsinki.fi \
--cc=rientjes@google.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox