All of lore.kernel.org
 help / color / mirror / Atom feed
diff for duplicates of <545A419C.3090900@suse.cz>

diff --git a/a/1.txt b/N1/1.txt
index a9646bf..5f9f96a 100644
--- a/a/1.txt
+++ b/N1/1.txt
@@ -42,3 +42,73 @@ But I noticed a possible issue that could lead to your problem.
 Can you please try the following patch?
 
 --------8<-------
+>From fe9c963cc665cdab50abb41f3babb5b19d08fab1 Mon Sep 17 00:00:00 2001
+From: Vlastimil Babka <vbabka@suse.cz>
+Date: Wed, 5 Nov 2014 14:19:18 +0100
+Subject: [PATCH] mm, compaction: do not reset deferred compaction
+ optimistically
+
+In try_to_compact_pages() we reset deferred compaction for a zone where we
+think compaction has succeeded. Although this action does not reset the
+counters affecting deferred compaction period, just bumping the deferred order
+means that another compaction attempt will be able to pass the check in
+compaction_deferred() and proceed with compaction.
+
+This is a problem when try_to_compact_pages() thinks compaction was successful
+just because the watermark check is missing proper classzone_idx parameter,
+but then the allocation attempt itself will fail due to its watermark check
+having the proper value. Although __alloc_pages_direct_compact() will re-defer
+compaction in such case, this happens only in the case of sync compaction.
+Async compaction will leave the zone open for another compaction attempt which
+may reset the deferred order again. This could possibly explain what
+P. Christeas reported - a system where many processes include the following
+backtrace:
+
+        [<ffffffff813b1025>] preempt_schedule_irq+0x3c/0x59
+        [<ffffffff813b4810>] retint_kernel+0x20/0x30
+        [<ffffffff810d7481>] ? __zone_watermark_ok+0x77/0x85
+        [<ffffffff810d8256>] zone_watermark_ok+0x1a/0x1c
+        [<ffffffff810eee56>] compact_zone+0x215/0x4b2
+        [<ffffffff810ef13f>] compact_zone_order+0x4c/0x5f
+        [<ffffffff810ef2fe>] try_to_compact_pages+0xc4/0x1e8
+        [<ffffffff813ad7f8>] __alloc_pages_direct_compact+0x61/0x1bf
+        [<ffffffff810da299>] __alloc_pages_nodemask+0x409/0x799
+        [<ffffffff8110d3fd>] new_slab+0x5f/0x21c
+
+The issue has been made visible by commit 53853e2d2bfb ("mm, compaction: defer
+each zone individually instead of preferred zone"), since before the commit,
+deferred compaction for fallback zones (where classzone_idx matters) was not
+considered separately.
+
+Although work is underway to fix the underlying zone watermark check mismatch,
+this patch fixes the immediate problem by removing the optimistic defer reset
+completely. Its usefulness is questionable anyway, since if the allocation
+really succeeds, a full defer reset (including the period counters) follows.
+
+Fixes: 53853e2d2bfb748a8b5aa2fd1de15699266865e0
+Reported-by: P. Christeas <xrg@linux.gr>
+Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
+---
+ mm/compaction.c | 7 -------
+ 1 file changed, 7 deletions(-)
+
+diff --git a/mm/compaction.c b/mm/compaction.c
+index ec74cf0..f0335f9 100644
+--- a/mm/compaction.c
++++ b/mm/compaction.c
+@@ -1325,13 +1325,6 @@ unsigned long try_to_compact_pages(struct zonelist *zonelist,
+ 				      alloc_flags)) {
+ 			*candidate_zone = zone;
+ 			/*
+-			 * We think the allocation will succeed in this zone,
+-			 * but it is not certain, hence the false. The caller
+-			 * will repeat this with true if allocation indeed
+-			 * succeeds in this zone.
+-			 */
+-			compaction_defer_reset(zone, order, false);
+-			/*
+ 			 * It is possible that async compaction aborted due to
+ 			 * need_resched() and the watermarks were ok thanks to
+ 			 * somebody else freeing memory. The allocation can
+-- 
+2.1.2
diff --git a/a/content_digest b/N1/content_digest
index 006db74..0232154 100644
--- a/a/content_digest
+++ b/N1/content_digest
@@ -53,6 +53,76 @@
  "But I noticed a possible issue that could lead to your problem.\n"
  "Can you please try the following patch?\n"
  "\n"
- --------8<-------
+ "--------8<-------\n"
+ ">From fe9c963cc665cdab50abb41f3babb5b19d08fab1 Mon Sep 17 00:00:00 2001\n"
+ "From: Vlastimil Babka <vbabka@suse.cz>\n"
+ "Date: Wed, 5 Nov 2014 14:19:18 +0100\n"
+ "Subject: [PATCH] mm, compaction: do not reset deferred compaction\n"
+ " optimistically\n"
+ "\n"
+ "In try_to_compact_pages() we reset deferred compaction for a zone where we\n"
+ "think compaction has succeeded. Although this action does not reset the\n"
+ "counters affecting deferred compaction period, just bumping the deferred order\n"
+ "means that another compaction attempt will be able to pass the check in\n"
+ "compaction_deferred() and proceed with compaction.\n"
+ "\n"
+ "This is a problem when try_to_compact_pages() thinks compaction was successful\n"
+ "just because the watermark check is missing proper classzone_idx parameter,\n"
+ "but then the allocation attempt itself will fail due to its watermark check\n"
+ "having the proper value. Although __alloc_pages_direct_compact() will re-defer\n"
+ "compaction in such case, this happens only in the case of sync compaction.\n"
+ "Async compaction will leave the zone open for another compaction attempt which\n"
+ "may reset the deferred order again. This could possibly explain what\n"
+ "P. Christeas reported - a system where many processes include the following\n"
+ "backtrace:\n"
+ "\n"
+ "        [<ffffffff813b1025>] preempt_schedule_irq+0x3c/0x59\n"
+ "        [<ffffffff813b4810>] retint_kernel+0x20/0x30\n"
+ "        [<ffffffff810d7481>] ? __zone_watermark_ok+0x77/0x85\n"
+ "        [<ffffffff810d8256>] zone_watermark_ok+0x1a/0x1c\n"
+ "        [<ffffffff810eee56>] compact_zone+0x215/0x4b2\n"
+ "        [<ffffffff810ef13f>] compact_zone_order+0x4c/0x5f\n"
+ "        [<ffffffff810ef2fe>] try_to_compact_pages+0xc4/0x1e8\n"
+ "        [<ffffffff813ad7f8>] __alloc_pages_direct_compact+0x61/0x1bf\n"
+ "        [<ffffffff810da299>] __alloc_pages_nodemask+0x409/0x799\n"
+ "        [<ffffffff8110d3fd>] new_slab+0x5f/0x21c\n"
+ "\n"
+ "The issue has been made visible by commit 53853e2d2bfb (\"mm, compaction: defer\n"
+ "each zone individually instead of preferred zone\"), since before the commit,\n"
+ "deferred compaction for fallback zones (where classzone_idx matters) was not\n"
+ "considered separately.\n"
+ "\n"
+ "Although work is underway to fix the underlying zone watermark check mismatch,\n"
+ "this patch fixes the immediate problem by removing the optimistic defer reset\n"
+ "completely. Its usefulness is questionable anyway, since if the allocation\n"
+ "really succeeds, a full defer reset (including the period counters) follows.\n"
+ "\n"
+ "Fixes: 53853e2d2bfb748a8b5aa2fd1de15699266865e0\n"
+ "Reported-by: P. Christeas <xrg@linux.gr>\n"
+ "Signed-off-by: Vlastimil Babka <vbabka@suse.cz>\n"
+ "---\n"
+ " mm/compaction.c | 7 -------\n"
+ " 1 file changed, 7 deletions(-)\n"
+ "\n"
+ "diff --git a/mm/compaction.c b/mm/compaction.c\n"
+ "index ec74cf0..f0335f9 100644\n"
+ "--- a/mm/compaction.c\n"
+ "+++ b/mm/compaction.c\n"
+ "@@ -1325,13 +1325,6 @@ unsigned long try_to_compact_pages(struct zonelist *zonelist,\n"
+ " \t\t\t\t      alloc_flags)) {\n"
+ " \t\t\t*candidate_zone = zone;\n"
+ " \t\t\t/*\n"
+ "-\t\t\t * We think the allocation will succeed in this zone,\n"
+ "-\t\t\t * but it is not certain, hence the false. The caller\n"
+ "-\t\t\t * will repeat this with true if allocation indeed\n"
+ "-\t\t\t * succeeds in this zone.\n"
+ "-\t\t\t */\n"
+ "-\t\t\tcompaction_defer_reset(zone, order, false);\n"
+ "-\t\t\t/*\n"
+ " \t\t\t * It is possible that async compaction aborted due to\n"
+ " \t\t\t * need_resched() and the watermarks were ok thanks to\n"
+ " \t\t\t * somebody else freeing memory. The allocation can\n"
+ "-- \n"
+ 2.1.2
 
-90518147a7aec8383d7e6adaecb4394d1566682525c81b567a84b5c87578fafb
+29d95f8f19a2df9cb74da5157dbc5ca2e6649a05ba939dde3ef48f498f2972f3

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.