From: Vlastimil Babka <vbabka@suse.cz>
To: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Minchan Kim <minchan@kernel.org>, Mel Gorman <mgorman@suse.de>,
Michal Nazarewicz <mina86@mina86.com>,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
Christoph Lameter <cl@linux.com>, Rik van Riel <riel@redhat.com>,
David Rientjes <rientjes@google.com>
Subject: Re: [PATCH 5/5] mm, compaction: more focused lru and pcplists draining
Date: Wed, 19 Nov 2014 23:53:37 +0100 [thread overview]
Message-ID: <546D1F71.6030304@suse.cz> (raw)
In-Reply-To: <20141114070501.GA24817@js1304-P5Q-DELUXE>
On 11/14/2014 08:05 AM, Joonsoo Kim wrote:
>> What about this scenario, with pageblock order:
>>
>> - record cc->migrate_pfn pointing to pageblock X
>> - isolate_migratepages() skips the pageblock due to e.g. skip bit,
>> or the pageblock being a THP already...
>> - loop to pageblock X+1, last_migrated_pfn is still set to pfn of
>> pageblock X (more precisely the pfn is (X << pageblock_order) - 1
>> per your code, but doesn't matter)
>> - isolate_migratepages isolates something, but ends up somewhere in
>> the middle of pageblock due to COMPACT_CLUSTER_MAX
>> - cc->migrate_pfn points to pageblock X+1 (plus some pages it scanned)
>> - so it will decide that it has fully migrated pageblock X and it's
>> time to drain. But the drain is most likely useless - we didn't
>> migrate anything in pageblock X, we skipped it. And in X+1 we didn't
>> migrate everything yet, so we should drain only after finishing the
>> other part of the pageblock.
>
> Yes, but, it can be easily fixed.
>
> while (compact_finished()) {
> unsigned long prev_migrate_pfn = cc->migrate_pfn;
>
> isolate_migratepages()
> switch case {
> NONE:
> goto check_drain;
> SUCCESS:
> if (!last_migrated_pfn)
> last_migrated_pfn = prev_migrate_pfn;
> }
>
> ...
>
> check_drain: (at the end of loop)
> ...
> }
Good suggestion, also gets rid of the awkward subtraction of 1 in the
current patch. Thanks.
>> In short, "last_migrated_pfn" is not "last position of migrate
>> scanner" but "last block where we *actually* migrated".
>
> Okay. Now I get it.
> Nevertheless, I'd like to change logic like above.
>
> One problem of your approach is that it can't detect some cases.
>
> Let's think about following case.
> '|' denotes aligned block boundary.
> '^' denotes migrate_pfn at certain time.
>
> Assume that last_migrated_pfn = 0;
>
> |--------------|-------------|--------------|
> ^ ^
> before isolate after isolate
>
> In this case, your code just records position of second '^' to
> last_migrated_pfn and skip to flush. But, flush is needed if we
> migrate some pages because we move away from previous aligned block.
>
> Thanks.
>
Right, so the patch below implements your suggestion, and the last_migrated_pfn
initialization fix. I named the variable "isolate_start_pfn" instead of
prev_migrate_pfn, as it's where the migrate scanner isolation starts, and having
both prev_migrate_pfn and last_migrated_pfn would be more confusing I think.
------8<------
From: Vlastimil Babka <vbabka@suse.cz>
Date: Mon, 3 Nov 2014 15:28:01 +0100
Subject: [PATCH] mm-compaction-more-focused-lru-and-pcplists-draining-fix
As Joonsoo Kim pointed out, last_migrate_pfn was reset to 0 by mistake at each
iteration in compact_zone(). This mistake could result in fail to recognize
immediately draining points for orders smaller than pageblock.
Joonsoo has also suggested an improvement to detecting cc->order aligned
block where migration might have occured - before this fix, some of the drain
opportunities might have been missed.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: David Rientjes <rientjes@google.com>
---
mm/compaction.c | 24 +++++++++++++-----------
1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index fe43e60..100e6e8 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1144,6 +1144,7 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
unsigned long end_pfn = zone_end_pfn(zone);
const int migratetype = gfpflags_to_migratetype(cc->gfp_mask);
const bool sync = cc->mode != MIGRATE_ASYNC;
+ unsigned long last_migrated_pfn = 0;
ret = compaction_suitable(zone, cc->order, cc->alloc_flags,
cc->classzone_idx);
@@ -1189,7 +1190,7 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
while ((ret = compact_finished(zone, cc, migratetype)) ==
COMPACT_CONTINUE) {
int err;
- unsigned long last_migrated_pfn = 0;
+ unsigned long isolate_start_pfn = cc->migrate_pfn;
switch (isolate_migratepages(zone, cc)) {
case ISOLATE_ABORT:
@@ -1230,21 +1231,22 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
}
/*
- * Record where we have freed pages by migration and not yet
- * flushed them to buddy allocator. Subtract 1, because often
- * we finish a pageblock and migrate_pfn points to the first
- * page* of the next one. In that case we want the drain below
- * to happen immediately.
+ * Record where we could have freed pages by migration and not
+ * yet flushed them to buddy allocator. We use the pfn that
+ * isolate_migratepages() started from in this loop iteration
+ * - this is the lowest page that could have been isolated and
+ * then freed by migration.
*/
if (!last_migrated_pfn)
- last_migrated_pfn = cc->migrate_pfn - 1;
+ last_migrated_pfn = isolate_start_pfn;
check_drain:
/*
- * Have we moved away from the previous cc->order aligned block
- * where we migrated from? If yes, flush the pages that were
- * freed, so that they can merge and compact_finished() can
- * detect immediately if allocation should succeed.
+ * Has the migration scanner moved away from the previous
+ * cc->order aligned block where we migrated from? If yes,
+ * flush the pages that were freed, so that they can merge and
+ * compact_finished() can detect immediately if allocation
+ * would succeed.
*/
if (cc->order > 0 && last_migrated_pfn) {
int cpu;
--
2.1.2
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: Vlastimil Babka <vbabka@suse.cz>
To: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Minchan Kim <minchan@kernel.org>, Mel Gorman <mgorman@suse.de>,
Michal Nazarewicz <mina86@mina86.com>,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
Christoph Lameter <cl@linux.com>, Rik van Riel <riel@redhat.com>,
David Rientjes <rientjes@google.com>
Subject: Re: [PATCH 5/5] mm, compaction: more focused lru and pcplists draining
Date: Wed, 19 Nov 2014 23:53:37 +0100 [thread overview]
Message-ID: <546D1F71.6030304@suse.cz> (raw)
In-Reply-To: <20141114070501.GA24817@js1304-P5Q-DELUXE>
On 11/14/2014 08:05 AM, Joonsoo Kim wrote:
>> What about this scenario, with pageblock order:
>>
>> - record cc->migrate_pfn pointing to pageblock X
>> - isolate_migratepages() skips the pageblock due to e.g. skip bit,
>> or the pageblock being a THP already...
>> - loop to pageblock X+1, last_migrated_pfn is still set to pfn of
>> pageblock X (more precisely the pfn is (X << pageblock_order) - 1
>> per your code, but doesn't matter)
>> - isolate_migratepages isolates something, but ends up somewhere in
>> the middle of pageblock due to COMPACT_CLUSTER_MAX
>> - cc->migrate_pfn points to pageblock X+1 (plus some pages it scanned)
>> - so it will decide that it has fully migrated pageblock X and it's
>> time to drain. But the drain is most likely useless - we didn't
>> migrate anything in pageblock X, we skipped it. And in X+1 we didn't
>> migrate everything yet, so we should drain only after finishing the
>> other part of the pageblock.
>
> Yes, but, it can be easily fixed.
>
> while (compact_finished()) {
> unsigned long prev_migrate_pfn = cc->migrate_pfn;
>
> isolate_migratepages()
> switch case {
> NONE:
> goto check_drain;
> SUCCESS:
> if (!last_migrated_pfn)
> last_migrated_pfn = prev_migrate_pfn;
> }
>
> ...
>
> check_drain: (at the end of loop)
> ...
> }
Good suggestion, also gets rid of the awkward subtraction of 1 in the
current patch. Thanks.
>> In short, "last_migrated_pfn" is not "last position of migrate
>> scanner" but "last block where we *actually* migrated".
>
> Okay. Now I get it.
> Nevertheless, I'd like to change logic like above.
>
> One problem of your approach is that it can't detect some cases.
>
> Let's think about following case.
> '|' denotes aligned block boundary.
> '^' denotes migrate_pfn at certain time.
>
> Assume that last_migrated_pfn = 0;
>
> |--------------|-------------|--------------|
> ^ ^
> before isolate after isolate
>
> In this case, your code just records position of second '^' to
> last_migrated_pfn and skip to flush. But, flush is needed if we
> migrate some pages because we move away from previous aligned block.
>
> Thanks.
>
Right, so the patch below implements your suggestion, and the last_migrated_pfn
initialization fix. I named the variable "isolate_start_pfn" instead of
prev_migrate_pfn, as it's where the migrate scanner isolation starts, and having
both prev_migrate_pfn and last_migrated_pfn would be more confusing I think.
------8<------
From: Vlastimil Babka <vbabka@suse.cz>
Date: Mon, 3 Nov 2014 15:28:01 +0100
Subject: [PATCH] mm-compaction-more-focused-lru-and-pcplists-draining-fix
As Joonsoo Kim pointed out, last_migrate_pfn was reset to 0 by mistake at each
iteration in compact_zone(). This mistake could result in fail to recognize
immediately draining points for orders smaller than pageblock.
Joonsoo has also suggested an improvement to detecting cc->order aligned
block where migration might have occured - before this fix, some of the drain
opportunities might have been missed.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: David Rientjes <rientjes@google.com>
---
mm/compaction.c | 24 +++++++++++++-----------
1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index fe43e60..100e6e8 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1144,6 +1144,7 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
unsigned long end_pfn = zone_end_pfn(zone);
const int migratetype = gfpflags_to_migratetype(cc->gfp_mask);
const bool sync = cc->mode != MIGRATE_ASYNC;
+ unsigned long last_migrated_pfn = 0;
ret = compaction_suitable(zone, cc->order, cc->alloc_flags,
cc->classzone_idx);
@@ -1189,7 +1190,7 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
while ((ret = compact_finished(zone, cc, migratetype)) ==
COMPACT_CONTINUE) {
int err;
- unsigned long last_migrated_pfn = 0;
+ unsigned long isolate_start_pfn = cc->migrate_pfn;
switch (isolate_migratepages(zone, cc)) {
case ISOLATE_ABORT:
@@ -1230,21 +1231,22 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
}
/*
- * Record where we have freed pages by migration and not yet
- * flushed them to buddy allocator. Subtract 1, because often
- * we finish a pageblock and migrate_pfn points to the first
- * page* of the next one. In that case we want the drain below
- * to happen immediately.
+ * Record where we could have freed pages by migration and not
+ * yet flushed them to buddy allocator. We use the pfn that
+ * isolate_migratepages() started from in this loop iteration
+ * - this is the lowest page that could have been isolated and
+ * then freed by migration.
*/
if (!last_migrated_pfn)
- last_migrated_pfn = cc->migrate_pfn - 1;
+ last_migrated_pfn = isolate_start_pfn;
check_drain:
/*
- * Have we moved away from the previous cc->order aligned block
- * where we migrated from? If yes, flush the pages that were
- * freed, so that they can merge and compact_finished() can
- * detect immediately if allocation should succeed.
+ * Has the migration scanner moved away from the previous
+ * cc->order aligned block where we migrated from? If yes,
+ * flush the pages that were freed, so that they can merge and
+ * compact_finished() can detect immediately if allocation
+ * would succeed.
*/
if (cc->order > 0 && last_migrated_pfn) {
int cpu;
--
2.1.2
next prev parent reply other threads:[~2014-11-19 22:53 UTC|newest]
Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-10-07 15:33 [PATCH 0/5] Further compaction tuning Vlastimil Babka
2014-10-07 15:33 ` Vlastimil Babka
2014-10-07 15:33 ` [PATCH 1/5] mm, compaction: pass classzone_idx and alloc_flags to watermark checking Vlastimil Babka
2014-10-07 15:33 ` Vlastimil Babka
2014-10-20 15:45 ` Rik van Riel
2014-10-20 15:45 ` Rik van Riel
2014-10-27 6:46 ` Joonsoo Kim
2014-10-27 6:46 ` Joonsoo Kim
2014-10-27 9:11 ` Vlastimil Babka
2014-10-27 9:11 ` Vlastimil Babka
2014-10-28 7:16 ` Joonsoo Kim
2014-10-28 7:16 ` Joonsoo Kim
2014-10-29 13:51 ` Vlastimil Babka
2014-10-29 13:51 ` Vlastimil Babka
2014-10-31 7:49 ` Joonsoo Kim
2014-10-31 7:49 ` Joonsoo Kim
2014-11-14 8:52 ` Vlastimil Babka
2014-11-14 8:52 ` Vlastimil Babka
2014-10-07 15:33 ` [PATCH 2/5] mm, compaction: simplify deferred compaction Vlastimil Babka
2014-10-07 15:33 ` Vlastimil Babka
2014-10-15 22:32 ` Andrew Morton
2014-10-15 22:32 ` Andrew Morton
2014-10-16 15:11 ` Vlastimil Babka
2014-10-16 15:11 ` Vlastimil Babka
2014-10-07 15:33 ` [PATCH 3/5] mm, compaction: defer only on COMPACT_COMPLETE Vlastimil Babka
2014-10-07 15:33 ` Vlastimil Babka
2014-10-20 15:18 ` Rik van Riel
2014-10-20 15:18 ` Rik van Riel
2014-10-07 15:33 ` [PATCH 4/5] mm, compaction: always update cached scanner positions Vlastimil Babka
2014-10-07 15:33 ` Vlastimil Babka
2014-10-20 15:26 ` Rik van Riel
2014-10-20 15:26 ` Rik van Riel
2014-10-27 7:35 ` Joonsoo Kim
2014-10-27 7:35 ` Joonsoo Kim
2014-10-27 9:39 ` Vlastimil Babka
2014-10-27 9:39 ` Vlastimil Babka
2014-10-28 7:08 ` Joonsoo Kim
2014-10-28 7:08 ` Joonsoo Kim
2014-10-31 15:53 ` Vlastimil Babka
2014-10-31 15:53 ` Vlastimil Babka
2014-11-04 0:28 ` Joonsoo Kim
2014-11-04 0:28 ` Joonsoo Kim
2014-11-14 8:57 ` Vlastimil Babka
2014-11-14 8:57 ` Vlastimil Babka
2014-10-07 15:33 ` [PATCH 5/5] mm, compaction: more focused lru and pcplists draining Vlastimil Babka
2014-10-07 15:33 ` Vlastimil Babka
2014-10-20 15:44 ` Rik van Riel
2014-10-20 15:44 ` Rik van Riel
2014-10-27 7:41 ` Joonsoo Kim
2014-10-27 7:41 ` Joonsoo Kim
2014-11-03 8:12 ` Vlastimil Babka
2014-11-03 8:12 ` Vlastimil Babka
2014-11-04 0:37 ` Joonsoo Kim
2014-11-04 0:37 ` Joonsoo Kim
2014-11-13 12:47 ` Vlastimil Babka
2014-11-13 12:47 ` Vlastimil Babka
2014-11-14 7:05 ` Joonsoo Kim
2014-11-14 7:05 ` Joonsoo Kim
2014-11-19 22:53 ` Vlastimil Babka [this message]
2014-11-19 22:53 ` Vlastimil Babka
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=546D1F71.6030304@suse.cz \
--to=vbabka@suse.cz \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=iamjoonsoo.kim@lge.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mina86@mina86.com \
--cc=minchan@kernel.org \
--cc=n-horiguchi@ah.jp.nec.com \
--cc=riel@redhat.com \
--cc=rientjes@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.