From: Vlastimil Babka <vbabka@suse.cz>
To: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Minchan Kim <minchan@kernel.org>, Mel Gorman <mgorman@suse.de>,
	Michal Nazarewicz <mina86@mina86.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Christoph Lameter <cl@linux.com>, Rik van Riel <riel@redhat.com>,
	David Rientjes <rientjes@google.com>
Subject: Re: [PATCH 5/5] mm, compaction: more focused lru and pcplists draining
Date: Mon, 03 Nov 2014 09:12:33 +0100
Message-ID: <545738F1.4010307@suse.cz>
In-Reply-To: <20141027074112.GC23379@js1304-P5Q-DELUXE>

On 10/27/2014 08:41 AM, Joonsoo Kim wrote:
> On Tue, Oct 07, 2014 at 05:33:39PM +0200, Vlastimil Babka wrote:
>> The goal of memory compaction is to create high-order freepages through page
>> migration. Page migration however puts pages on the per-cpu lru_add cache,
>> which is later flushed to per-cpu pcplists, and only after pcplists are
>> drained the pages can actually merge. This can happen due to the per-cpu
>> caches becoming full through further freeing, or explicitly.
>>
>> During direct compaction, it is useful to do the draining explicitly so that
>> pages merge as soon as possible and compaction can detect success immediately
>> and keep the latency impact at minimum. However the current implementation is
>> far from ideal. Draining is done only in  __alloc_pages_direct_compact(),
>> after all zones were already compacted, and the decisions to continue or stop
>> compaction in individual zones was done without the last batch of migrations
>> being merged. It is also missing the draining of lru_add cache before the
>> pcplists.
>>
>> This patch moves the draining for direct compaction into compact_zone(). It
>> adds the missing lru_cache draining and uses the newly introduced single zone
>> pcplists draining to reduce overhead and avoid impact on unrelated zones.
>> Draining is only performed when it can actually lead to merging of a page of
>> desired order (passed by cc->order). This means it is only done when migration
>> occurred in the previously scanned cc->order aligned block(s) and the
>> migration scanner is now pointing to the next cc->order aligned block.
>>
>> The patch has been tested with stress-highalloc benchmark from mmtests.
>> Although overall allocation success rates of the benchmark were not affected,
>> the number of detected compaction successes has doubled. This suggests that
>> allocations were previously successful due to implicit merging caused by
>> background activity, making a later allocation attempt succeed immediately,
>> but not attributing the success to compaction. Since stress-highalloc always
>> tries to allocate almost the whole memory, it cannot show the improvement in
>> its reported success rate metric. However after this patch, compaction should
>> detect success and terminate earlier, reducing the direct compaction latencies
>> in a real scenario.
>>
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
>> Cc: Minchan Kim <minchan@kernel.org>
>> Cc: Mel Gorman <mgorman@suse.de>
>> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>> Cc: Michal Nazarewicz <mina86@mina86.com>
>> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
>> Cc: Christoph Lameter <cl@linux.com>
>> Cc: Rik van Riel <riel@redhat.com>
>> Cc: David Rientjes <rientjes@google.com>
>> ---
>>   mm/compaction.c | 41 ++++++++++++++++++++++++++++++++++++++++-
>>   mm/page_alloc.c |  4 ----
>>   2 files changed, 40 insertions(+), 5 deletions(-)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index 8fa888d..41b49d7 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -1179,6 +1179,7 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
>>   	while ((ret = compact_finished(zone, cc, migratetype)) ==
>>   						COMPACT_CONTINUE) {
>>   		int err;
>> +		unsigned long last_migrated_pfn = 0;
>
> I think that this definition looks odd.
> In every iteration, last_migrated_pfn is re-defined as 0.
> Maybe, it is on outside of the loop.

Oops, you're right, that's a mistake: it makes the code miss some of
the drain points (a minority, I think, but still).

>>
>>   		switch (isolate_migratepages(zone, cc)) {
>>   		case ISOLATE_ABORT:
>> @@ -1187,7 +1188,12 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
>>   			cc->nr_migratepages = 0;
>>   			goto out;
>>   		case ISOLATE_NONE:
>> -			continue;
>> +			/*
>> +			 * We haven't isolated and migrated anything, but
>> +			 * there might still be unflushed migrations from
>> +			 * previous cc->order aligned block.
>> +			 */
>> +			goto check_drain;
>>   		case ISOLATE_SUCCESS:
>>   			;
>>   		}
>> @@ -1212,6 +1218,39 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
>>   				goto out;
>>   			}
>>   		}
>> +
>> +		/*
>> +		 * Record where we have freed pages by migration and not yet
>> +		 * flushed them to buddy allocator. Subtract 1, because often
>> +		 * we finish a pageblock and migrate_pfn points to the first
>> +		 * page* of the next one. In that case we want the drain below
>> +		 * to happen immediately.
>> +		 */
>> +		if (!last_migrated_pfn)
>> +			last_migrated_pfn = cc->migrate_pfn - 1;
>
> And, I wonder why last_migrated_pfn is set after isolate_migratepages().

Not sure I understand your question. With the mistake above, it cannot 
currently be set at the point isolate_migratepages() is called, so you 
might question the goto check_drain in the ISOLATE_NONE case, if that's 
what you are wondering about.

Once that is corrected, it can already be set when isolate_migratepages()
returns ISOLATE_NONE. For example, COMPACT_CLUSTER_MAX pages are isolated
and migrated in the middle of a pageblock, and the rest of the pageblock
contains no pages that could be isolated, so the last
isolate_migratepages() attempt in the pageblock returns ISOLATE_NONE.
There were still some migrations that produced free pages, and those
should be drained at that point.

>
> Thanks.
>

