From: Rik van Riel <riel@redhat.com>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Mel Gorman <mel@csn.ul.ie>,
Andrew Morton <akpm@linux-foundation.org>,
jaschut@sandia.gov, minchan@kernel.org,
kamezawa.hiroyu@jp.fujitsu.com
Subject: [PATCH -mm v2] mm: have order > 0 compaction start off where it left
Date: Thu, 28 Jun 2012 13:55:20 -0400
Message-ID: <20120628135520.0c48b066@annuminas.surriel.com>
Order > 0 compaction stops when enough free pages of the correct
page order have been coalesced. When doing subsequent higher order
allocations, it is possible for compaction to be invoked many times.
However, the compaction code always starts out looking for things to
compact at the start of the zone, and for free pages to compact things
to at the end of the zone.
This can cause quadratic behaviour, with isolate_freepages starting
at the end of the zone each time, even though previous invocations
of the compaction code already filled up all free memory on that end
of the zone.
As a result, isolate_freepages can take enormous amounts of CPU
with certain workloads on larger memory systems.
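To make the quadratic cost concrete, here is a toy user-space model, not
kernel code. It assumes a zone of nblocks pageblocks, that every invocation
needs exactly one pageblock with free pages, and that the block it finds is
then completely filled. The function name toy_scan_cost and the "cached"
offset are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model: count how many pageblocks the free scanner visits over
 * nblocks invocations. Offsets are counted from the end of the zone,
 * so offset 0 is the last pageblock. With use_cache == 0 every
 * invocation starts at the zone end; with use_cache != 0 it resumes
 * just past the block the previous invocation used up.
 */
unsigned long toy_scan_cost(size_t nblocks, int use_cache)
{
	unsigned long visited = 0;
	size_t consumed = 0;	/* blocks at the zone end already filled */
	size_t cached = 0;	/* hypothetical resume offset */
	size_t run, off;

	for (run = 0; run < nblocks; run++) {
		for (off = use_cache ? cached : 0; off < nblocks; off++) {
			visited++;
			if (off >= consumed) {
				/* First block that still has free pages. */
				consumed = off + 1;
				cached = off + 1;
				break;
			}
		}
	}

	/* Every invocation visits at least one pageblock. */
	assert(visited >= nblocks);
	return visited;
}
```

In this model, 8 invocations on an 8-pageblock zone visit 36 pageblocks
when each one starts from the zone end, and 8 when each one resumes where
the previous one stopped: n(n+1)/2 versus n.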
The obvious solution is to have isolate_freepages remember where
it left off last time, and continue at that point the next time
it gets invoked for an order > 0 compaction. This could cause
compaction to fail if cc->free_pfn and cc->migrate_pfn are close
together initially; in that case we restart from the end of the
zone and try once more.
Forced full (order == -1) compactions are left alone.
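The resume-and-restart behaviour described above can be sketched in user
space roughly as follows. This is a simplified model, not the kernel
implementation: the sketch_* names and the PAGEBLOCK_NR_PAGES value of 512
are assumptions made for the example, and the field names only mirror the
ones used in the patch. Resetting cc->free_pfn on restart is also an
assumption of the sketch; the hunk as posted only updates
zone->compact_cached_free_pfn.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGEBLOCK_NR_PAGES 512UL	/* assumed pageblock size */

enum sketch_result { SKETCH_CONTINUE, SKETCH_COMPLETE };

struct sketch_zone {
	unsigned long zone_start_pfn;
	unsigned long spanned_pages;
	unsigned long compact_cached_free_pfn;
};

struct sketch_control {
	unsigned long free_pfn;
	unsigned long start_free_pfn;
	unsigned long migrate_pfn;
	bool wrapped;
	int order;
};

/* End of the zone, rounded down to a pageblock boundary. */
unsigned long sketch_zone_end_pfn(const struct sketch_zone *z)
{
	unsigned long pfn = z->zone_start_pfn + z->spanned_pages;

	return pfn & ~(PAGEBLOCK_NR_PAGES - 1);
}

/* Pick the scanner start points for one compaction run. */
void sketch_start(const struct sketch_zone *z, struct sketch_control *cc)
{
	cc->migrate_pfn = z->zone_start_pfn;
	cc->wrapped = false;

	if (cc->order > 0) {
		/* Incremental: resume where the last run stopped. */
		cc->free_pfn = z->compact_cached_free_pfn;
		cc->start_free_pfn = cc->free_pfn;
	} else {
		/* Full compaction starts at the end of the zone. */
		cc->free_pfn = sketch_zone_end_pfn(z);
	}
}

/* Decide whether the run is done, restarting once for order > 0. */
enum sketch_result sketch_finished(struct sketch_zone *z,
				   struct sketch_control *cc)
{
	if (cc->free_pfn <= cc->migrate_pfn) {
		if (cc->order > 0 && !cc->wrapped) {
			/* Started partway through; restart at the end. */
			z->compact_cached_free_pfn = sketch_zone_end_pfn(z);
			cc->free_pfn = z->compact_cached_free_pfn;
			cc->wrapped = true;
			return SKETCH_CONTINUE;
		}
		return SKETCH_COMPLETE;
	}

	/* Wrapped around and came back to where this run started. */
	if (cc->wrapped && cc->free_pfn <= cc->start_free_pfn)
		return SKETCH_COMPLETE;

	return SKETCH_CONTINUE;
}
```

A caller would invoke sketch_start() once per run and then sketch_finished()
after each scanning step, stopping on SKETCH_COMPLETE. The wrapped flag is
what guarantees an order > 0 run restarts at most once.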
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Reported-by: Jim Schutt <jaschut@sandia.gov>
Signed-off-by: Rik van Riel <riel@redhat.com>
---
v2: implement Mel's suggestions, handling wrap-around etc
include/linux/mmzone.h | 4 ++++
mm/compaction.c | 48 ++++++++++++++++++++++++++++++++++++++++++++----
mm/internal.h | 2 ++
mm/page_alloc.c | 5 +++++
4 files changed, 55 insertions(+), 4 deletions(-)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 2427706..e629594 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -369,6 +369,10 @@ struct zone {
*/
spinlock_t lock;
int all_unreclaimable; /* All pages pinned */
+#if defined CONFIG_COMPACTION || defined CONFIG_CMA
+ /* pfn where the last order > 0 compaction isolated free pages */
+ unsigned long compact_cached_free_pfn;
+#endif
#ifdef CONFIG_MEMORY_HOTPLUG
/* see spanned/present_pages for more description */
seqlock_t span_seqlock;
diff --git a/mm/compaction.c b/mm/compaction.c
index 7ea259d..2668b77 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -422,6 +422,17 @@ static void isolate_freepages(struct zone *zone,
pfn -= pageblock_nr_pages) {
unsigned long isolated;
+ /*
+ * Skip ahead if another thread is compacting in the area
+ * simultaneously. If we wrapped around, we can only skip
+ * ahead if zone->compact_cached_free_pfn also wrapped to
+ * above our starting point.
+ */
+ if (cc->order > 0 && (!cc->wrapped ||
+ zone->compact_cached_free_pfn >
+ cc->start_free_pfn))
+ pfn = min(pfn, zone->compact_cached_free_pfn);
+
if (!pfn_valid(pfn))
continue;
@@ -463,6 +474,8 @@ static void isolate_freepages(struct zone *zone,
*/
if (isolated)
high_pfn = max(high_pfn, pfn);
+ if (cc->order > 0)
+ zone->compact_cached_free_pfn = high_pfn;
}
/* split_free_page does not map the pages */
@@ -565,8 +578,27 @@ static int compact_finished(struct zone *zone,
if (fatal_signal_pending(current))
return COMPACT_PARTIAL;
- /* Compaction run completes if the migrate and free scanner meet */
- if (cc->free_pfn <= cc->migrate_pfn)
+ /*
+ * A full (order == -1) compaction run starts at the beginning and
+ * end of a zone; it completes when the migrate and free scanner meet.
+ * A partial (order > 0) compaction can start with the free scanner
+ * at a random point in the zone, and may have to restart.
+ */
+ if (cc->free_pfn <= cc->migrate_pfn) {
+ if (cc->order > 0 && !cc->wrapped) {
+ /* We started partway through; restart at the end. */
+ unsigned long free_pfn;
+ free_pfn = zone->zone_start_pfn + zone->spanned_pages;
+ free_pfn &= ~(pageblock_nr_pages-1);
+ zone->compact_cached_free_pfn = free_pfn;
+ cc->wrapped = 1;
+ return COMPACT_CONTINUE;
+ }
+ return COMPACT_COMPLETE;
+ }
+
+ /* We wrapped around and ended up where we started. */
+ if (cc->wrapped && cc->free_pfn <= cc->start_free_pfn)
return COMPACT_COMPLETE;
/*
@@ -664,8 +696,16 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
/* Setup to move all movable pages to the end of the zone */
cc->migrate_pfn = zone->zone_start_pfn;
- cc->free_pfn = cc->migrate_pfn + zone->spanned_pages;
- cc->free_pfn &= ~(pageblock_nr_pages-1);
+
+ if (cc->order > 0) {
+ /* Incremental compaction. Start where the last one stopped. */
+ cc->free_pfn = zone->compact_cached_free_pfn;
+ cc->start_free_pfn = cc->free_pfn;
+ } else {
+ /* Order == -1 starts at the end of the zone. */
+ cc->free_pfn = cc->migrate_pfn + zone->spanned_pages;
+ cc->free_pfn &= ~(pageblock_nr_pages-1);
+ }
migrate_prep_local();
diff --git a/mm/internal.h b/mm/internal.h
index 2ba87fb..0b72461 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -118,8 +118,10 @@ struct compact_control {
unsigned long nr_freepages; /* Number of isolated free pages */
unsigned long nr_migratepages; /* Number of pages to migrate */
unsigned long free_pfn; /* isolate_freepages search base */
+ unsigned long start_free_pfn; /* where we started the search */
unsigned long migrate_pfn; /* isolate_migratepages search base */
bool sync; /* Synchronous migration */
+ bool wrapped; /* Last round for order>0 compaction */
int order; /* order a direct compactor needs */
int migratetype; /* MOVABLE, RECLAIMABLE etc */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4403009..c353a61 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4394,6 +4394,11 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat,
zone->spanned_pages = size;
zone->present_pages = realsize;
+#if defined CONFIG_COMPACTION || defined CONFIG_CMA
+ zone->compact_cached_free_pfn = zone->zone_start_pfn +
+ zone->spanned_pages;
+ zone->compact_cached_free_pfn &= ~(pageblock_nr_pages-1);
+#endif
#ifdef CONFIG_NUMA
zone->node = nid;
zone->min_unmapped_pages = (realsize*sysctl_min_unmapped_ratio)