From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Michal Hocko <mhocko@suse.cz>,
	David Rientjes <rientjes@google.com>,
	Nishanth Aravamudan <nacc@linux.vnet.ibm.com>,
	Mel Gorman <mgorman@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 3.14 16/37] mm: exclude memoryless nodes from zone_reclaim
Date: Tue,  7 Oct 2014 16:19:33 -0700
Message-ID: <20141007231827.565961673@linuxfoundation.org>
In-Reply-To: <20141007231827.043235686@linuxfoundation.org>

3.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michal Hocko <mhocko@suse.cz>

commit 70ef57e6c22c3323dce179b7d0d433c479266612 upstream.

We had a report about strange OOM killer strikes on a PPC machine
although there was a lot of free swap and tons of anonymous memory
which could be swapped out.  In the end it turned out that the OOM was a
side effect of zone reclaim, which wasn't unmapping and swapping out, and
so the system was pushed to the OOM.  Although this sounds like a bug
somewhere in the kswapd vs. zone reclaim vs. direct reclaim
interaction, numactl on the said hardware suggests that zone reclaim
should not have been enabled in the first place:

  node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
  node 0 size: 0 MB
  node 0 free: 0 MB
  node 2 cpus:
  node 2 size: 7168 MB
  node 2 free: 6019 MB
  node distances:
  node   0   2
  0:  10  40
  2:  40  10

So all the CPUs are associated with Node0, which doesn't have any memory,
while Node2 contains all the available memory.  The node distances cause
zone_reclaim_mode to be enabled automatically.
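
As a rough illustration of why this topology trips the automatic
enabling, the distance table from the numactl output above can be
checked against a threshold.  This is only a sketch: the value 30 for
RECLAIM_DISTANCE is an assumption (the generic default; architectures
may define their own), and has_remote_node() is a hypothetical helper,
not a kernel function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: generic default threshold; an architecture may override it. */
#define RECLAIM_DISTANCE 30

/* Distance table copied from the numactl output (index 0 = node 0, index 1 = node 2). */
static const int node_distances[2][2] = {
	{ 10, 40 },
	{ 40, 10 },
};

/*
 * Hypothetical helper: returns true if any node is farther than
 * RECLAIM_DISTANCE from the node at index `from`.
 */
static bool has_remote_node(size_t from)
{
	for (size_t to = 0; to < 2; to++)
		if (node_distances[from][to] > RECLAIM_DISTANCE)
			return true;
	return false;
}
```

With the table above, both rows contain a distance of 40, so under the
assumed threshold either node sees the other as remote.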

Zone reclaim is intended to keep allocations local, but this doesn't
make any sense on memoryless nodes.  So let's exclude such nodes from
init_zone_allows_reclaim, which evaluates zone reclaim behavior and
suitable reclaim_nodes.
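
The effect of the exclusion on the reported topology can be modelled in
user space.  The following is a simplified sketch, not the kernel code:
model_zone_reclaim_mode() and the arrays are hypothetical, the
RECLAIM_DISTANCE value of 30 is an assumption (generic default,
architectures may differ), and the reclaim_nodes mask is left out.

```c
#include <stdbool.h>

#define RECLAIM_DISTANCE 30	/* assumption: generic default, arch may override */
#define NR_NODES 2		/* index 0 = Node0, index 1 = Node2 */

/* Topology from the report: Node0 has CPUs but no memory, Node2 has all memory. */
static const bool has_memory[NR_NODES] = { false, true };
static const int distance[NR_NODES][NR_NODES] = {
	{ 10, 40 },
	{ 40, 10 },
};

/*
 * Hypothetical model of the boot-time pass over all nodes.  Returns the
 * resulting zone_reclaim_mode (0 or 1).  skip_memoryless == false mimics
 * the behaviour before this patch, true the behaviour after it.
 */
static int model_zone_reclaim_mode(bool skip_memoryless)
{
	int zone_reclaim_mode = 0;

	for (int nid = 0; nid < NR_NODES; nid++) {
		/* Patched: do not evaluate reclaim for a memoryless node. */
		if (skip_memoryless && !has_memory[nid])
			continue;

		for (int i = 0; i < NR_NODES; i++) {
			/* Patched: only nodes with memory are considered. */
			if (skip_memoryless && !has_memory[i])
				continue;
			if (distance[nid][i] > RECLAIM_DISTANCE)
				zone_reclaim_mode = 1;
		}
	}
	return zone_reclaim_mode;
}
```

In this model, model_zone_reclaim_mode(false) yields 1 because Node0
sees Node2 at distance 40, while model_zone_reclaim_mode(true) yields 0
because Node2 is then compared only against itself.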

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Tested-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/page_alloc.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1869,7 +1869,7 @@ static void __paginginit init_zone_allow
 {
 	int i;
 
-	for_each_online_node(i)
+	for_each_node_state(i, N_MEMORY)
 		if (node_distance(nid, i) <= RECLAIM_DISTANCE)
 			node_set(i, NODE_DATA(nid)->reclaim_nodes);
 		else
@@ -4933,7 +4933,8 @@ void __paginginit free_area_init_node(in
 
 	pgdat->node_id = nid;
 	pgdat->node_start_pfn = node_start_pfn;
-	init_zone_allows_reclaim(nid);
+	if (node_state(nid, N_MEMORY))
+		init_zone_allows_reclaim(nid);
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 	get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
 #endif


