* [PATCH 1/2] mm: provide zone vmstat percpu drift bounds
@ 2011-12-07 16:16 Konstantin Khlebnikov
  2011-12-07 16:16 ` [PATCH 2/2] mm: fix endless looping around false-positive too_many_isolated() Konstantin Khlebnikov
  0 siblings, 1 reply; 2+ messages in thread
From: Konstantin Khlebnikov @ 2011-12-07 16:16 UTC
  To: linux-mm, Andrew Morton; +Cc: linux-kernel, Mel Gorman
vmstat uses per-cpu counters for accounting, so the atomic part in struct zone can drift from the true value.
The free-pages watermark logic already has some protection against this inaccuracy.
The too-many-isolated checks have the same problem. This patch provides drift bounds for them.
This patch also resets zone->percpu_drift_mark when drift protection is no longer required,
which can happen after memory hotplug.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
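Illustration only, not part of the patch: a minimal userspace sketch of why
the drift is bounded by num_online_cpus() * threshold. All names here
(model_zone, model_mod_state, model_snapshot, model_max_drift, MODEL_NR_CPUS)
are hypothetical; the model only assumes that each cpu keeps a private delta
and folds it into the shared counter once it exceeds the threshold.

```c
#include <assert.h>

#define MODEL_NR_CPUS 4	/* hypothetical number of online cpus */

struct model_zone {
	long global;			/* shared ("atomic") part of the counter */
	long pcp_diff[MODEL_NR_CPUS];	/* per-cpu deltas not yet folded in */
	long threshold;			/* per-cpu fold threshold */
};

/* Account delta on one cpu; fold into global once |diff| exceeds threshold. */
static void model_mod_state(struct model_zone *z, int cpu, long delta)
{
	long x;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	x = z->pcp_diff[cpu] + delta;
	if (x > z->threshold || x < -z->threshold) {
		z->global += x;
		x = 0;
	}
	z->pcp_diff[cpu] = x;
}

/* Worst case: every cpu holds a delta of exactly threshold. */
static long model_max_drift(const struct model_zone *z)
{
	return MODEL_NR_CPUS * z->threshold;
}

/* Exact value: shared part plus all per-cpu deltas, clamped at zero. */
static long model_snapshot(const struct model_zone *z)
{
	long x = z->global;
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		x += z->pcp_diff[cpu];
	return x < 0 ? 0 : x;
}
```

In this model the shared part can lag the exact value by up to
model_max_drift(), which is the quantity the patch stores in
zone->percpu_drift.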
 include/linux/mmzone.h |    3 +++
 mm/vmstat.c            |    6 +++++-
 2 files changed, 8 insertions(+), 1 deletions(-)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 188cb2f..401438d 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -307,6 +307,9 @@ struct zone {
 	 */
 	unsigned long percpu_drift_mark;
 
+	/* Maximum vm_stat per-cpu counters drift */
+	unsigned long percpu_drift;
+
 	/*
 	 * We don't know if the memory that we're going to allocate will be freeable
 	 * or/and it will be released eventually, so to avoid totally wasting several
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 8fd603b..94540e1 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -172,16 +172,20 @@ void refresh_zone_stat_thresholds(void)
 			per_cpu_ptr(zone->pageset, cpu)->stat_threshold
 							= threshold;
 
+		max_drift = num_online_cpus() * threshold;
+		zone->percpu_drift = max_drift;
+
 		/*
 		 * Only set percpu_drift_mark if there is a danger that
 		 * NR_FREE_PAGES reports the low watermark is ok when in fact
 		 * the min watermark could be breached by an allocation
 		 */
 		tolerate_drift = low_wmark_pages(zone) - min_wmark_pages(zone);
-		max_drift = num_online_cpus() * threshold;
 		if (max_drift > tolerate_drift)
 			zone->percpu_drift_mark = high_wmark_pages(zone) +
 					max_drift;
+		else
+			zone->percpu_drift_mark = 0;
 	}
 }
 
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* [PATCH 2/2] mm: fix endless looping around false-positive too_many_isolated()
  2011-12-07 16:16 [PATCH 1/2] mm: provide zone vmstat percpu drift bounds Konstantin Khlebnikov
@ 2011-12-07 16:16 ` Konstantin Khlebnikov
  0 siblings, 0 replies; 2+ messages in thread
From: Konstantin Khlebnikov @ 2011-12-07 16:16 UTC
  To: linux-mm, Andrew Morton; +Cc: linux-kernel, Mel Gorman
Due to percpu drift of the vmstat counters, the result of the too_many_isolated()
check can be a false positive. Unfortunately, it can be a stable false positive:
for example, if a zone at some moment has no active/inactive pages at all
(for small zones like "DMA" this is very likely) but the atomic part of its
isolated-pages counter is non-zero. In this situation shrink_inactive_list()
and isolate_migratepages() will loop forever around too_many_isolated().
After this patch, too_many_isolated() sums the percpu fractions of the
isolated-pages counter if the atomic part is above the watermark, but not higher
than the watermark plus the possible percpu drift.
We can ignore drift for the active/inactive page counters, because sooner or later
the isolated-pages counter drops to zero.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
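Illustration only, not part of the patch: a userspace sketch of the decision
made by the compaction variant of too_many_isolated() after this change.
The function name and its parameters are hypothetical; isolated_cheap stands
for the value read from the atomic part and isolated_exact for the value that
summing the percpu fractions would return. The factor of two presumably
covers the two summed counters (file and anon); that reading is an assumption.

```c
#include <stdbool.h>

/*
 * Model of the check: use the cheap reading unless it sits in the band
 * (watermark, watermark + 2 * percpu_drift], where drift alone could
 * explain the excess; only then pay for the exact reading.
 */
static bool model_too_many_isolated(unsigned long inactive,
				    unsigned long active,
				    unsigned long isolated_cheap,
				    unsigned long isolated_exact,
				    unsigned long percpu_drift)
{
	unsigned long watermark = (inactive + active) / 2;
	unsigned long isolated = isolated_cheap;

	if (isolated > watermark &&
	    isolated - watermark <= percpu_drift * 2)
		isolated = isolated_exact;

	return isolated > watermark;
}
```

With inactive = active = 0, a stale cheap reading of 3 and an exact reading
of 0, the model returns false, so the caller no longer loops. A cheap reading
far above the band is trusted as-is and the exact sum is never computed.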
 mm/compaction.c |   11 +++++++++--
 mm/vmscan.c     |    5 +++++
 2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index 899d956..2d6fced 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -231,7 +231,7 @@ static void acct_isolated(struct zone *zone, struct compact_control *cc)
 /* Similar to reclaim, but different enough that they don't share logic */
 static bool too_many_isolated(struct zone *zone)
 {
-	unsigned long active, inactive, isolated;
+	unsigned long active, inactive, isolated, watermark;
 
 	inactive = zone_page_state(zone, NR_INACTIVE_FILE) +
 					zone_page_state(zone, NR_INACTIVE_ANON);
@@ -240,7 +240,14 @@ static bool too_many_isolated(struct zone *zone)
 	isolated = zone_page_state(zone, NR_ISOLATED_FILE) +
 					zone_page_state(zone, NR_ISOLATED_ANON);
 
-	return isolated > (inactive + active) / 2;
+	watermark = (inactive + active) / 2;
+
+	if (isolated > watermark &&
+	    isolated - watermark <= zone->percpu_drift * 2)
+		isolated = zone_page_state_snapshot(zone, NR_ISOLATED_FILE) +
+			   zone_page_state_snapshot(zone, NR_ISOLATED_ANON);
+
+	return isolated > watermark;
 }
 
 /* possible outcome of isolate_migratepages */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 393ebce..3918c5f 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1320,6 +1320,11 @@ static int too_many_isolated(struct zone *zone, int file,
 		isolated = zone_page_state(zone, NR_ISOLATED_ANON);
 	}
 
+	if (isolated > inactive &&
+	    isolated - inactive <= zone->percpu_drift)
+		isolated = zone_page_state_snapshot(zone,
+				file ? NR_ISOLATED_FILE : NR_ISOLATED_ANON);
+
 	return isolated > inactive;
 }
 