From: Vlastimil Babka <vbabka@suse.cz>
To: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
David Rientjes <rientjes@google.com>,
Minchan Kim <minchan@kernel.org>, Mel Gorman <mgorman@suse.de>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Michal Nazarewicz <mina86@mina86.com>,
Christoph Lameter <cl@linux.com>, Rik van Riel <riel@redhat.com>,
Zhang Yanfei <zhangyanfei@cn.fujitsu.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 12/13] mm, compaction: try to capture the just-created high-order freepage
Date: Wed, 25 Jun 2014 10:57:36 +0200
Message-ID: <53AA8F00.4000902@suse.cz>
In-Reply-To: <20140625015733.GC12855@nhori.redhat.com>
On 06/25/2014 03:57 AM, Naoya Horiguchi wrote:
> On Fri, Jun 20, 2014 at 05:49:42PM +0200, Vlastimil Babka wrote:
>> Compaction uses watermark checking to determine if it succeeded in creating
>> a high-order free page. My testing has shown that this is quite racy and it
>> can happen that watermark checking in compaction succeeds, and moments later
>> the watermark checking in page allocation fails, even though the number of
>> free pages has increased meanwhile.
>>
>> It should be more reliable if direct compaction captured the high-order free
>> page as soon as it detects it, and passed it back to allocation. This would
>> also reduce the window for somebody else to allocate the free page.
>>
>> Capture has been implemented before by 1fb3f8ca0e92 ("mm: compaction: capture
>> a suitable high-order page immediately when it is made available"), but later
>> reverted by 8fb74b9f ("mm: compaction: partially revert capture of suitable
>> high-order page") due to a bug.
>>
>> This patch differs from the previous attempt in two aspects:
>>
>> 1) The previous patch scanned free lists to capture the page. In this patch,
>> only the cc->order aligned block that the migration scanner just finished
>> is considered, but only if pages were actually isolated for migration in
>> that block. Tracking cc->order aligned blocks also has benefits for the
>> following patch that skips blocks where non-migratable pages were found.
>>
>> 2) The operations done in buffered_rmqueue() and get_page_from_freelist() are
>> closely followed so that page capture mimics normal page allocation as much
>> as possible. This includes operations such as prep_new_page() and
>> page->pfmemalloc setting (that was missing in the previous attempt), zone
>> statistics are updated etc. Due to subtleties with IRQ disabling and
>> enabling this cannot be simply factored out from the normal allocation
>> functions without affecting the fastpath.
>>
>> This patch has tripled compaction success rates (as recorded in vmstat) in
>> stress-highalloc mmtests benchmark, although allocation success rates increased
>> only by a few percent. Closer inspection shows that due to the racy watermark
>> checking and lack of lru_add_drain(), the allocations that resulted in direct
>> compactions were often failing, but later allocations succeeded in the fast
>> path. So the benefit of the patch to allocation success rates may be limited,
>> but it improves the fairness in the sense that whoever spent the time
>> compacting has a higher chance of benefiting from it, and also can stop
>> compacting sooner, as page availability is detected immediately. With better
>> success detection, the contribution of compaction to high-order allocation
>> success rates is also no longer understated by the vmstats.
>>
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
>> Cc: Minchan Kim <minchan@kernel.org>
>> Cc: Mel Gorman <mgorman@suse.de>
>> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>> Cc: Michal Nazarewicz <mina86@mina86.com>
>> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
>> Cc: Christoph Lameter <cl@linux.com>
>> Cc: Rik van Riel <riel@redhat.com>
>> Cc: David Rientjes <rientjes@google.com>
>> ---
> ...
>> @@ -669,6 +708,7 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
>> continue;
>> if (PageTransHuge(page)) {
>> low_pfn += (1 << compound_order(page)) - 1;
>> + next_capture_pfn = low_pfn + 1;
>
> Don't we need if (next_capture_pfn) here?
Good catch, thanks! It should also use ALIGN properly, as the non-locked
test above does.
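
For illustration only, here is a rough userspace sketch of what the guarded,
aligned update could look like. It is not the actual follow-up patch. The
helper name update_capture_pfn() is hypothetical, ALIGN() is a simplified
stand-in for the kernel macro, and the sketch assumes that a zero
next_capture_pfn means capture is not being tracked.

```c
/*
 * Simplified userspace stand-in for the kernel's ALIGN() macro.
 * 'a' must be a power of two.
 */
#define ALIGN(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

/*
 * Hypothetical helper, not taken from the patch: compute the new
 * next_capture_pfn after the migration scanner has skipped a compound
 * page whose last pfn is low_pfn.
 *
 * Assumption: next_capture_pfn == 0 means capture is not being tracked,
 * so the value is left untouched in that case (the suggested guard).
 * Otherwise it is rounded up to the next block boundary of the given
 * order, instead of being set to the unaligned low_pfn + 1.
 */
unsigned long update_capture_pfn(unsigned long next_capture_pfn,
                                 unsigned long low_pfn,
                                 unsigned int order)
{
    if (next_capture_pfn)
        next_capture_pfn = ALIGN(low_pfn + 1, 1UL << order);

    return next_capture_pfn;
}
```

For example, with an order-9 compound page starting at pfn 1024, low_pfn
becomes 1535 after the skip; for order 10 the helper then returns 2048,
whereas the unguarded, unaligned assignment would have produced 1536.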
> Thanks,
> Naoya Horiguchi
>
>> continue;
>> }
>> }