From: Johannes Weiner <hannes@cmpxchg.org>
To: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mel@csn.ul.ie>, Minchan Kim <minchan.kim@gmail.com>,
	Wu Fengguang <fengguang.wu@intel.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Rik van Riel <riel@redhat.com>, Jens Axboe <axboe@kernel.dk>,
	linux-mm@kvack.org
Subject: Re: [patch] mm: clear pages_scanned only if draining a pcp adds pages to the buddy allocator
Date: Tue, 25 Jan 2011 09:42:35 +0100
Message-ID: <20110125084235.GC2217@cmpxchg.org>
In-Reply-To: <alpine.DEB.2.00.1101231457130.966@chino.kir.corp.google.com>

On Sun, Jan 23, 2011 at 02:58:39PM -0800, David Rientjes wrote:
> 0e093d99763e (writeback: do not sleep on the congestion queue if there
> are no congested BDIs or if significant congestion is not being
> encountered in the current zone) uncovered a livelock in the page
> allocator that resulted in tasks infinitely looping trying to find memory
> and kswapd running at 100% cpu.
> 
> The issue occurs because drain_all_pages() is called immediately
> following direct reclaim when no memory is freed and try_to_free_pages()
> returns non-zero because not every zone in the zonelist has its
> all_unreclaimable flag set.
> 
> When draining the per-cpu pagesets back to the buddy allocator for each
> zone, the zone->pages_scanned counter is cleared to avoid erroneously
> setting zone->all_unreclaimable later.  The problem is that the counter
> is cleared even when no pages are actually drained and, thus, the
> unreclaimable logic never fails direct reclaim, which is what would
> allow the oom killer to be invoked.
> 
> This apparently only manifested after wait_iff_congested() was introduced
> and the zone was full of anonymous memory that would not congest the
> backing store.  The page allocator would infinitely loop if there were no
> other tasks waiting to be scheduled and clear zone->pages_scanned because
> of drain_all_pages() as the result of this change before kswapd could
> scan enough pages to trigger the reclaim logic.  Additionally, with every
> loop of the page allocator and in the reclaim path, kswapd would be
> kicked and would end up running at 100% cpu.  In this scenario, current
> and kswapd are both running continuously with kswapd incrementing
> zone->pages_scanned and current clearing it.
> 
> The problem is even more pronounced when current swaps some of its memory
> to swap cache and the reclaimable logic then counts all active
> anonymous memory in the all_unreclaimable logic, which requires a much
> higher zone->pages_scanned value for try_to_free_pages() to return zero,
> a value that is never attainable in this scenario.
> 
> Before wait_iff_congested(), the page allocator would incur an
> unconditional timeout and allow kswapd to elevate zone->pages_scanned to
> a level at which the oom killer would be called the next time it loops.
> 
> The fix is to only attempt to drain pcp pages if there is actually a
> quantity to be drained.  The unconditional clearing of
> zone->pages_scanned in free_pcppages_bulk() need not be changed since
> other callers already ensure that draining will occur.  This patch
> ensures that free_pcppages_bulk() will actually free memory before
> calling into it from drain_all_pages() so zone->pages_scanned is only
> cleared if appropriate.
> 
> Signed-off-by: David Rientjes <rientjes@google.com>

Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
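
For readers following the thread, the interaction described in the
changelog can be sketched as a small userspace model.  This is not the
kernel code or the patch itself: every model_* name is hypothetical, the
scan batch of 32 and the zone size of 100 pages are arbitrary, and the
"scanned < 6 * reclaimable" threshold is an assumption about how the
all_unreclaimable check behaves.  It only illustrates why clearing
pages_scanned on an empty drain keeps the counter from ever reaching the
threshold, and why skipping the empty drain lets it get there.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for struct zone. */
struct model_zone {
	unsigned long pages_scanned;     /* bumped by the kswapd model */
	unsigned long reclaimable_pages; /* pages the scanner could look at */
	int pcp_count;                   /* pages parked on the per-cpu list */
};

/* As described in the changelog: freeing pcp pages resets pages_scanned. */
static void model_free_pcppages_bulk(struct model_zone *z, int count)
{
	assert(count >= 0 && count <= z->pcp_count);
	z->pages_scanned = 0;
	z->pcp_count -= count;
}

/* guarded == false: drain unconditionally; guarded == true: skip empty lists. */
static void model_drain_pages(struct model_zone *z, bool guarded)
{
	if (guarded && z->pcp_count == 0)
		return;
	model_free_pcppages_bulk(z, z->pcp_count);
}

/* Assumed threshold: reclaimable while scanned < 6 * reclaimable pages. */
static bool model_zone_reclaimable(const struct model_zone *z)
{
	return z->pages_scanned < z->reclaimable_pages * 6;
}

/*
 * Alternate one kswapd scan batch with one allocator retry (which drains).
 * Returns the loop at which the zone stops looking reclaimable, or -1 if
 * that never happens within max_loops.
 */
static int model_loops_until_unreclaimable(bool guarded, int max_loops)
{
	struct model_zone z = {
		.pages_scanned = 0,
		.reclaimable_pages = 100,
		.pcp_count = 0,	/* nothing to drain, as in the report */
	};

	for (int i = 1; i <= max_loops; i++) {
		z.pages_scanned += 32;          /* kswapd scans a batch */
		model_drain_pages(&z, guarded); /* allocator retry path */
		if (!model_zone_reclaimable(&z))
			return i;
	}
	return -1;
}
```

Calling model_loops_until_unreclaimable(false, 1000) never crosses the
threshold in this model, because every retry resets the counter; with
guarded set to true the empty drain is skipped and the counter
accumulates until the zone stops looking reclaimable.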

