From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f44.google.com (mail-pa0-f44.google.com [209.85.220.44]) by kanga.kvack.org (Postfix) with ESMTP id 4F7C26B0038 for ; Fri, 30 Jan 2015 07:34:31 -0500 (EST) Received: by mail-pa0-f44.google.com with SMTP id rd3so51917702pab.3 for ; Fri, 30 Jan 2015 04:34:31 -0800 (PST) Received: from mail-pa0-x233.google.com (mail-pa0-x233.google.com. [2607:f8b0:400e:c03::233]) by mx.google.com with ESMTPS id b1si13624482pat.116.2015.01.30.04.34.30 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 30 Jan 2015 04:34:30 -0800 (PST) Received: by mail-pa0-f51.google.com with SMTP id fb1so52073109pad.10 for ; Fri, 30 Jan 2015 04:34:30 -0800 (PST) From: Joonsoo Kim Subject: [PATCH v2 0/4] enhance compaction success rate Date: Fri, 30 Jan 2015 21:34:08 +0900 Message-Id: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton Cc: Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim This patchset aims at increase of compaction success rate. Changes are related to compaction finish condition and freepage isolation condition. >>From these changes, I did stress highalloc test in mmtests with nonmovable order 7 allocation configuration, and compaction success rate (%) are Base Patch-1 Patch-2 Patch-3 Patch-4 18.47 27.13 31.82 -- 42.20 Note: Base version is tested in v1 and the others are tested freshly. Test is perform based on next-20150103 and Vlastimil's stealing logic patches due to current next's unstablility. Patch-3 isn't tested since there is no functional change. Joonsoo (3): mm/compaction: stop the isolation when we isolate enough freepage mm/page_alloc: separate steal decision from steal behaviour part mm/compaction: enhance compaction finish condition Joonsoo Kim (1): mm/compaction: fix wrong order check in compact_finished() include/linux/mmzone.h | 3 +++ mm/compaction.c | 47 ++++++++++++++++++++++++++++++++++++++--------- mm/internal.h | 1 + mm/page_alloc.c | 50 ++++++++++++++++++++++++++++++++------------------ 4 files changed, 74 insertions(+), 27 deletions(-) -- 1.9.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f48.google.com (mail-pa0-f48.google.com [209.85.220.48]) by kanga.kvack.org (Postfix) with ESMTP id DECAA6B006C for ; Fri, 30 Jan 2015 07:34:34 -0500 (EST) Received: by mail-pa0-f48.google.com with SMTP id ey11so52020889pad.7 for ; Fri, 30 Jan 2015 04:34:34 -0800 (PST) Received: from mail-pa0-x231.google.com (mail-pa0-x231.google.com. [2607:f8b0:400e:c03::231]) by mx.google.com with ESMTPS id hq3si1177630pac.11.2015.01.30.04.34.34 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 30 Jan 2015 04:34:34 -0800 (PST) Received: by mail-pa0-f49.google.com with SMTP id fa1so52017471pad.8 for ; Fri, 30 Jan 2015 04:34:34 -0800 (PST) From: Joonsoo Kim Subject: [PATCH v2 1/4] mm/compaction: fix wrong order check in compact_finished() Date: Fri, 30 Jan 2015 21:34:09 +0900 Message-Id: <1422621252-29859-2-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton Cc: Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim , stable@vger.kernel.org What we want to check here is whether there is highorder freepage in buddy list of other migratetype in order to steal it without fragmentation. But, current code just checks cc->order which means allocation request order. So, this is wrong. Without this fix, non-movable synchronous compaction below pageblock order would not stopped until compaction is complete, because migratetype of most pageblocks are movable and high order freepage made by compaction is usually on movable type buddy list. There is some report related to this bug. See below link. http://www.spinics.net/lists/linux-mm/msg81666.html Although the issued system still has load spike comes from compaction, this makes that system completely stable and responsive according to his report. stress-highalloc test in mmtests with non movable order 7 allocation doesn't show any notable difference in allocation success rate, but, it shows more compaction success rate. Compaction success rate (Compaction success * 100 / Compaction stalls, %) 18.47 : 28.94 Cc: Acked-by: Vlastimil Babka Signed-off-by: Joonsoo Kim --- mm/compaction.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/compaction.c b/mm/compaction.c index b68736c..4954e19 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -1173,7 +1173,7 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc, return COMPACT_PARTIAL; /* Job done if allocation would set block type */ - if (cc->order >= pageblock_order && area->nr_free) + if (order >= pageblock_order && area->nr_free) return COMPACT_PARTIAL; } -- 1.9.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f51.google.com (mail-pa0-f51.google.com [209.85.220.51]) by kanga.kvack.org (Postfix) with ESMTP id 3F1856B006E for ; Fri, 30 Jan 2015 07:34:38 -0500 (EST) Received: by mail-pa0-f51.google.com with SMTP id fb1so52074155pad.10 for ; Fri, 30 Jan 2015 04:34:38 -0800 (PST) Received: from mail-pa0-x22e.google.com (mail-pa0-x22e.google.com. [2607:f8b0:400e:c03::22e]) by mx.google.com with ESMTPS id fv6si13635104pdb.108.2015.01.30.04.34.37 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 30 Jan 2015 04:34:37 -0800 (PST) Received: by mail-pa0-f46.google.com with SMTP id lj1so51905333pab.5 for ; Fri, 30 Jan 2015 04:34:37 -0800 (PST) From: Joonsoo Kim Subject: [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage Date: Fri, 30 Jan 2015 21:34:10 +0900 Message-Id: <1422621252-29859-3-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton Cc: Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim From: Joonsoo Currently, freepage isolation in one pageblock doesn't consider how many freepages we isolate. When I traced flow of compaction, compaction sometimes isolates more than 256 freepages to migrate just 32 pages. In this patch, freepage isolation is stopped at the point that we have more isolated freepage than isolated page for migration. This results in slowing down free page scanner and make compaction success rate higher. stress-highalloc test in mmtests with non movable order 7 allocation shows increase of compaction success rate. Compaction success rate (Compaction success * 100 / Compaction stalls, %) 27.13 : 31.82 pfn where both scanners meets on compaction complete (separate test due to enormous tracepoint buffer) (zone_start=4096, zone_end=1048576) 586034 : 654378 In fact, I didn't fully understand why this patch results in such good result. There was a guess that not used freepages are released to pcp list and on next compaction trial we won't isolate them again so compaction success rate would decrease. To prevent this effect, I tested with adding pcp drain code on release_freepages(), but, it has no good effect. Anyway, this patch reduces waste time to isolate unneeded freepages so seems reasonable. Signed-off-by: Joonsoo Kim --- mm/compaction.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index 4954e19..782772d 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -490,6 +490,13 @@ static unsigned long isolate_freepages_block(struct compact_control *cc, /* If a page was split, advance to the end of it */ if (isolated) { + cc->nr_freepages += isolated; + if (!strict && + cc->nr_migratepages <= cc->nr_freepages) { + blockpfn += isolated; + break; + } + blockpfn += isolated - 1; cursor += isolated - 1; continue; @@ -899,7 +906,6 @@ static void isolate_freepages(struct compact_control *cc) unsigned long isolate_start_pfn; /* exact pfn we start at */ unsigned long block_end_pfn; /* end of current pageblock */ unsigned long low_pfn; /* lowest pfn scanner is able to scan */ - int nr_freepages = cc->nr_freepages; struct list_head *freelist = &cc->freepages; /* @@ -924,11 +930,11 @@ static void isolate_freepages(struct compact_control *cc) * pages on cc->migratepages. We stop searching if the migrate * and free page scanners meet or enough free pages are isolated. */ - for (; block_start_pfn >= low_pfn && cc->nr_migratepages > nr_freepages; + for (; block_start_pfn >= low_pfn && + cc->nr_migratepages > cc->nr_freepages; block_end_pfn = block_start_pfn, block_start_pfn -= pageblock_nr_pages, isolate_start_pfn = block_start_pfn) { - unsigned long isolated; /* * This can iterate a massively long zone without finding any @@ -953,9 +959,8 @@ static void isolate_freepages(struct compact_control *cc) continue; /* Found a block suitable for isolating free pages from. */ - isolated = isolate_freepages_block(cc, &isolate_start_pfn, + isolate_freepages_block(cc, &isolate_start_pfn, block_end_pfn, freelist, false); - nr_freepages += isolated; /* * Remember where the free scanner should restart next time, @@ -987,8 +992,6 @@ static void isolate_freepages(struct compact_control *cc) */ if (block_start_pfn < low_pfn) cc->free_pfn = cc->migrate_pfn; - - cc->nr_freepages = nr_freepages; } /* -- 1.9.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f41.google.com (mail-pa0-f41.google.com [209.85.220.41]) by kanga.kvack.org (Postfix) with ESMTP id 319456B0070 for ; Fri, 30 Jan 2015 07:34:42 -0500 (EST) Received: by mail-pa0-f41.google.com with SMTP id kq14so52029018pab.0 for ; Fri, 30 Jan 2015 04:34:41 -0800 (PST) Received: from mail-pa0-x230.google.com (mail-pa0-x230.google.com. [2607:f8b0:400e:c03::230]) by mx.google.com with ESMTPS id qp2si13616495pdb.123.2015.01.30.04.34.41 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 30 Jan 2015 04:34:41 -0800 (PST) Received: by mail-pa0-f48.google.com with SMTP id ey11so52021775pad.7 for ; Fri, 30 Jan 2015 04:34:41 -0800 (PST) From: Joonsoo Kim Subject: [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part Date: Fri, 30 Jan 2015 21:34:11 +0900 Message-Id: <1422621252-29859-4-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton Cc: Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim From: Joonsoo This is preparation step to use page allocator's anti fragmentation logic in compaction. This patch just separates steal decision part from actual steal behaviour part so there is no functional change. Signed-off-by: Joonsoo Kim --- mm/page_alloc.c | 49 ++++++++++++++++++++++++++++++++----------------- 1 file changed, 32 insertions(+), 17 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 8d52ab1..ef74750 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1122,6 +1122,24 @@ static void change_pageblock_range(struct page *pageblock_page, } } +static bool can_steal_freepages(unsigned int order, + int start_mt, int fallback_mt) +{ + if (is_migrate_cma(fallback_mt)) + return false; + + if (order >= pageblock_order) + return true; + + if (order >= pageblock_order / 2 || + start_mt == MIGRATE_RECLAIMABLE || + start_mt == MIGRATE_UNMOVABLE || + page_group_by_mobility_disabled) + return true; + + return false; +} + /* * When we are falling back to another migratetype during allocation, try to * steal extra free pages from the same pageblocks to satisfy further @@ -1138,9 +1156,10 @@ static void change_pageblock_range(struct page *pageblock_page, * as well. */ static void try_to_steal_freepages(struct zone *zone, struct page *page, - int start_type, int fallback_type) + int start_type) { int current_order = page_order(page); + int pages; /* Take ownership for orders >= pageblock_order */ if (current_order >= pageblock_order) { @@ -1148,19 +1167,12 @@ static void try_to_steal_freepages(struct zone *zone, struct page *page, return; } - if (current_order >= pageblock_order / 2 || - start_type == MIGRATE_RECLAIMABLE || - start_type == MIGRATE_UNMOVABLE || - page_group_by_mobility_disabled) { - int pages; + pages = move_freepages_block(zone, page, start_type); - pages = move_freepages_block(zone, page, start_type); - - /* Claim the whole block if over half of it is free */ - if (pages >= (1 << (pageblock_order-1)) || - page_group_by_mobility_disabled) - set_pageblock_migratetype(page, start_type); - } + /* Claim the whole block if over half of it is free */ + if (pages >= (1 << (pageblock_order-1)) || + page_group_by_mobility_disabled) + set_pageblock_migratetype(page, start_type); } /* Remove an element from the buddy allocator from the fallback list */ @@ -1170,6 +1182,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) struct free_area *area; unsigned int current_order; struct page *page; + bool can_steal; /* Find the largest possible block of pages in the other list */ for (current_order = MAX_ORDER-1; @@ -1192,10 +1205,11 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) struct page, lru); area->nr_free--; - if (!is_migrate_cma(migratetype)) { + can_steal = can_steal_freepages(current_order, + start_migratetype, migratetype); + if (can_steal) { try_to_steal_freepages(zone, page, - start_migratetype, - migratetype); + start_migratetype); } else { /* * When borrowing from MIGRATE_CMA, we need to @@ -1203,7 +1217,8 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) * itself, and we do not try to steal extra * free pages. */ - buddy_type = migratetype; + if (is_migrate_cma(migratetype)) + buddy_type = migratetype; } /* Remove the page from the freelists */ -- 1.9.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f44.google.com (mail-pa0-f44.google.com [209.85.220.44]) by kanga.kvack.org (Postfix) with ESMTP id 71F596B0071 for ; Fri, 30 Jan 2015 07:34:45 -0500 (EST) Received: by mail-pa0-f44.google.com with SMTP id rd3so51919521pab.3 for ; Fri, 30 Jan 2015 04:34:45 -0800 (PST) Received: from mail-pa0-x22b.google.com (mail-pa0-x22b.google.com. [2607:f8b0:400e:c03::22b]) by mx.google.com with ESMTPS id we6si13608811pac.129.2015.01.30.04.34.44 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 30 Jan 2015 04:34:44 -0800 (PST) Received: by mail-pa0-f43.google.com with SMTP id eu11so51996272pac.2 for ; Fri, 30 Jan 2015 04:34:44 -0800 (PST) From: Joonsoo Kim Subject: [PATCH v2 4/4] mm/compaction: enhance compaction finish condition Date: Fri, 30 Jan 2015 21:34:12 +0900 Message-Id: <1422621252-29859-5-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton Cc: Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim From: Joonsoo Compaction has anti fragmentation algorithm. It is that freepage should be more than pageblock order to finish the compaction if we don't find any freepage in requested migratetype buddy list. This is for mitigating fragmentation, but, there is a lack of migratetype consideration and it is too excessive compared to page allocator's anti fragmentation algorithm. Not considering migratetype would cause premature finish of compaction. For example, if allocation request is for unmovable migratetype, freepage with CMA migratetype doesn't help that allocation and compaction should not be stopped. But, current logic regards this situation as compaction is no longer needed, so finish the compaction. Secondly, condition is too excessive compared to page allocator's logic. We can steal freepage from other migratetype and change pageblock migratetype on more relaxed conditions in page allocator. This is designed to prevent fragmentation and we can use it here. Imposing hard constraint only to the compaction doesn't help much in this case since page allocator would cause fragmentation again. To solve these problems, this patch borrows anti fragmentation logic from page allocator. It will reduce premature compaction finish in some cases and reduce excessive compaction work. stress-highalloc test in mmtests with non movable order 7 allocation shows considerable increase of compaction success rate. Compaction success rate (Compaction success * 100 / Compaction stalls, %) 31.82 : 42.20 Signed-off-by: Joonsoo Kim --- include/linux/mmzone.h | 3 +++ mm/compaction.c | 30 ++++++++++++++++++++++++++++-- mm/internal.h | 1 + mm/page_alloc.c | 5 ++--- 4 files changed, 34 insertions(+), 5 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index f279d9c..a2906bc 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -63,6 +63,9 @@ enum { MIGRATE_TYPES }; +#define FALLBACK_MIGRATETYPES (4) +extern int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES]; + #ifdef CONFIG_CMA # define is_migrate_cma(migratetype) unlikely((migratetype) == MIGRATE_CMA) #else diff --git a/mm/compaction.c b/mm/compaction.c index 782772d..0460e4b 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -1125,6 +1125,29 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone, return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE; } +static bool can_steal_fallbacks(struct free_area *area, + unsigned int order, int migratetype) +{ + int i; + int fallback_mt; + + if (area->nr_free == 0) + return false; + + for (i = 0; i < FALLBACK_MIGRATETYPES; i++) { + fallback_mt = fallbacks[migratetype][i]; + if (fallback_mt == MIGRATE_RESERVE) + break; + + if (list_empty(&area->free_list[fallback_mt])) + continue; + + if (can_steal_freepages(order, migratetype, fallback_mt)) + return true; + } + return false; +} + static int __compact_finished(struct zone *zone, struct compact_control *cc, const int migratetype) { @@ -1175,8 +1198,11 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc, if (!list_empty(&area->free_list[migratetype])) return COMPACT_PARTIAL; - /* Job done if allocation would set block type */ - if (order >= pageblock_order && area->nr_free) + /* + * Job done if allocation would steal freepages from + * other migratetype buddy lists. + */ + if (can_steal_fallbacks(area, order, migratetype)) return COMPACT_PARTIAL; } diff --git a/mm/internal.h b/mm/internal.h index c4d6c9b..0a89a14 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -201,6 +201,7 @@ unsigned long isolate_migratepages_range(struct compact_control *cc, unsigned long low_pfn, unsigned long end_pfn); +bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt); #endif /* diff --git a/mm/page_alloc.c b/mm/page_alloc.c index ef74750..4c3538b 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1026,7 +1026,7 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order, * This array describes the order lists are fallen back to when * the free lists for the desirable migrate type are depleted */ -static int fallbacks[MIGRATE_TYPES][4] = { +int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES] = { [MIGRATE_UNMOVABLE] = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE }, [MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE }, #ifdef CONFIG_CMA @@ -1122,8 +1122,7 @@ static void change_pageblock_range(struct page *pageblock_page, } } -static bool can_steal_freepages(unsigned int order, - int start_mt, int fallback_mt) +bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt) { if (is_migrate_cma(fallback_mt)) return false; -- 1.9.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-f180.google.com (mail-wi0-f180.google.com [209.85.212.180]) by kanga.kvack.org (Postfix) with ESMTP id A8AE46B0038 for ; Fri, 30 Jan 2015 08:27:21 -0500 (EST) Received: by mail-wi0-f180.google.com with SMTP id h11so2663742wiw.1 for ; Fri, 30 Jan 2015 05:27:21 -0800 (PST) Received: from mx2.suse.de (cantor2.suse.de. [195.135.220.15]) by mx.google.com with ESMTPS id v3si3089300wjf.128.2015.01.30.05.27.17 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 30 Jan 2015 05:27:18 -0800 (PST) Message-ID: <54CB86B3.1060500@suse.cz> Date: Fri, 30 Jan 2015 14:27:15 +0100 From: Vlastimil Babka MIME-Version: 1.0 Subject: Re: [PATCH v2 1/4] mm/compaction: fix wrong order check in compact_finished() References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-2-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-2-git-send-email-iamjoonsoo.kim@lge.com> Content-Type: text/plain; charset=iso-8859-2 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Joonsoo Kim , Andrew Morton Cc: Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim , stable@vger.kernel.org On 01/30/2015 01:34 PM, Joonsoo Kim wrote: > What we want to check here is whether there is highorder freepage > in buddy list of other migratetype in order to steal it without > fragmentation. But, current code just checks cc->order which means > allocation request order. So, this is wrong. The bug has been introduced by 1fb3f8ca0e92 ("mm: compaction: capture a suitable high-order page immediately when it is made available") and survived the later partial revert 8fb74b9fb2b1. > Without this fix, non-movable synchronous compaction below pageblock order > would not stopped until compaction is complete, because migratetype of most > pageblocks are movable and high order freepage made by compaction is usually > on movable type buddy list. > > There is some report related to this bug. See below link. > > http://www.spinics.net/lists/linux-mm/msg81666.html > > Although the issued system still has load spike comes from compaction, > this makes that system completely stable and responsive according to > his report. > > stress-highalloc test in mmtests with non movable order 7 allocation doesn't > show any notable difference in allocation success rate, but, it shows more > compaction success rate. > > Compaction success rate (Compaction success * 100 / Compaction stalls, %) > 18.47 : 28.94 > > Cc: # v3.7+ Fixes: 1fb3f8ca0e9222535a39b884cb67a34628411b9f > Acked-by: Vlastimil Babka > Signed-off-by: Joonsoo Kim Thanks. > --- > mm/compaction.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/compaction.c b/mm/compaction.c > index b68736c..4954e19 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -1173,7 +1173,7 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc, > return COMPACT_PARTIAL; > > /* Job done if allocation would set block type */ > - if (cc->order >= pageblock_order && area->nr_free) > + if (order >= pageblock_order && area->nr_free) > return COMPACT_PARTIAL; > } > > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wg0-f52.google.com (mail-wg0-f52.google.com [74.125.82.52]) by kanga.kvack.org (Postfix) with ESMTP id 3D7B26B0038 for ; Fri, 30 Jan 2015 08:47:31 -0500 (EST) Received: by mail-wg0-f52.google.com with SMTP id y19so26940040wgg.11 for ; Fri, 30 Jan 2015 05:47:30 -0800 (PST) Received: from mx2.suse.de (cantor2.suse.de. [195.135.220.15]) by mx.google.com with ESMTPS id as2si20712655wjc.91.2015.01.30.05.47.28 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 30 Jan 2015 05:47:28 -0800 (PST) Message-ID: <54CB8B6B.2080503@suse.cz> Date: Fri, 30 Jan 2015 14:47:23 +0100 From: Vlastimil Babka MIME-Version: 1.0 Subject: Re: [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-3-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-3-git-send-email-iamjoonsoo.kim@lge.com> Content-Type: text/plain; charset=iso-8859-2 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Joonsoo Kim , Andrew Morton Cc: Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim On 01/30/2015 01:34 PM, Joonsoo Kim wrote: > From: Joonsoo > > Currently, freepage isolation in one pageblock doesn't consider how many > freepages we isolate. When I traced flow of compaction, compaction > sometimes isolates more than 256 freepages to migrate just 32 pages. > > In this patch, freepage isolation is stopped at the point that we > have more isolated freepage than isolated page for migration. This > results in slowing down free page scanner and make compaction success > rate higher. > > stress-highalloc test in mmtests with non movable order 7 allocation shows > increase of compaction success rate. > > Compaction success rate (Compaction success * 100 / Compaction stalls, %) > 27.13 : 31.82 > > pfn where both scanners meets on compaction complete > (separate test due to enormous tracepoint buffer) > (zone_start=4096, zone_end=1048576) > 586034 : 654378 Now I that I know that scanners meeting further in zone is better for success rate, the better success rate makes sense. Still not sure why they meet further :) > In fact, I didn't fully understand why this patch results in such good > result. There was a guess that not used freepages are released to pcp list > and on next compaction trial we won't isolate them again so compaction > success rate would decrease. To prevent this effect, I tested with adding > pcp drain code on release_freepages(), but, it has no good effect. > > Anyway, this patch reduces waste time to isolate unneeded freepages so > seems reasonable. I briefly tried it on top of the pivot-changing series and with order-9 allocations it reduced free page scanned counter by almost 10%. No effect on success rates (maybe because pivot changing already took care of the scanners meeting problem) but the scanning reduction is good on its own. It also explains why e14c720efdd7 ("mm, compaction: remember position within pageblock in free pages scanner") had less than expected improvements. It would only actually stop within pageblock in case of async compaction detecting contention. I guess that's also why the infinite loop problem fixed by 1d5bfe1ffb5b affected so relatively few people. > Signed-off-by: Joonsoo Kim Acked-by: Vlastimil Babka Thanks! > --- > mm/compaction.c | 17 ++++++++++------- > 1 file changed, 10 insertions(+), 7 deletions(-) > > diff --git a/mm/compaction.c b/mm/compaction.c > index 4954e19..782772d 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -490,6 +490,13 @@ static unsigned long isolate_freepages_block(struct compact_control *cc, > > /* If a page was split, advance to the end of it */ > if (isolated) { > + cc->nr_freepages += isolated; > + if (!strict && > + cc->nr_migratepages <= cc->nr_freepages) { > + blockpfn += isolated; > + break; > + } > + > blockpfn += isolated - 1; > cursor += isolated - 1; > continue; > @@ -899,7 +906,6 @@ static void isolate_freepages(struct compact_control *cc) > unsigned long isolate_start_pfn; /* exact pfn we start at */ > unsigned long block_end_pfn; /* end of current pageblock */ > unsigned long low_pfn; /* lowest pfn scanner is able to scan */ > - int nr_freepages = cc->nr_freepages; > struct list_head *freelist = &cc->freepages; > > /* > @@ -924,11 +930,11 @@ static void isolate_freepages(struct compact_control *cc) > * pages on cc->migratepages. We stop searching if the migrate > * and free page scanners meet or enough free pages are isolated. > */ > - for (; block_start_pfn >= low_pfn && cc->nr_migratepages > nr_freepages; > + for (; block_start_pfn >= low_pfn && > + cc->nr_migratepages > cc->nr_freepages; > block_end_pfn = block_start_pfn, > block_start_pfn -= pageblock_nr_pages, > isolate_start_pfn = block_start_pfn) { > - unsigned long isolated; > > /* > * This can iterate a massively long zone without finding any > @@ -953,9 +959,8 @@ static void isolate_freepages(struct compact_control *cc) > continue; > > /* Found a block suitable for isolating free pages from. */ > - isolated = isolate_freepages_block(cc, &isolate_start_pfn, > + isolate_freepages_block(cc, &isolate_start_pfn, > block_end_pfn, freelist, false); > - nr_freepages += isolated; > > /* > * Remember where the free scanner should restart next time, > @@ -987,8 +992,6 @@ static void isolate_freepages(struct compact_control *cc) > */ > if (block_start_pfn < low_pfn) > cc->free_pfn = cc->migrate_pfn; > - > - cc->nr_freepages = nr_freepages; > } > > /* > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wg0-f45.google.com (mail-wg0-f45.google.com [74.125.82.45]) by kanga.kvack.org (Postfix) with ESMTP id 358816B0038 for ; Fri, 30 Jan 2015 09:27:58 -0500 (EST) Received: by mail-wg0-f45.google.com with SMTP id x12so27176738wgg.4 for ; Fri, 30 Jan 2015 06:27:57 -0800 (PST) Received: from mx2.suse.de (cantor2.suse.de. [195.135.220.15]) by mx.google.com with ESMTPS id hj2si20903791wjb.93.2015.01.30.06.27.55 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 30 Jan 2015 06:27:56 -0800 (PST) Message-ID: <54CB94E6.7010805@suse.cz> Date: Fri, 30 Jan 2015 15:27:50 +0100 From: Vlastimil Babka MIME-Version: 1.0 Subject: Re: [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-4-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-4-git-send-email-iamjoonsoo.kim@lge.com> Content-Type: text/plain; charset=iso-8859-2 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Joonsoo Kim , Andrew Morton Cc: Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim On 01/30/2015 01:34 PM, Joonsoo Kim wrote: > From: Joonsoo > > This is preparation step to use page allocator's anti fragmentation logic > in compaction. This patch just separates steal decision part from actual > steal behaviour part so there is no functional change. > > Signed-off-by: Joonsoo Kim > --- > mm/page_alloc.c | 49 ++++++++++++++++++++++++++++++++----------------- > 1 file changed, 32 insertions(+), 17 deletions(-) > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index 8d52ab1..ef74750 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -1122,6 +1122,24 @@ static void change_pageblock_range(struct page *pageblock_page, > } > } > > +static bool can_steal_freepages(unsigned int order, > + int start_mt, int fallback_mt) > +{ > + if (is_migrate_cma(fallback_mt)) > + return false; > + > + if (order >= pageblock_order) > + return true; > + > + if (order >= pageblock_order / 2 || > + start_mt == MIGRATE_RECLAIMABLE || > + start_mt == MIGRATE_UNMOVABLE || > + page_group_by_mobility_disabled) > + return true; > + > + return false; > +} > + > /* > * When we are falling back to another migratetype during allocation, try to > * steal extra free pages from the same pageblocks to satisfy further > @@ -1138,9 +1156,10 @@ static void change_pageblock_range(struct page *pageblock_page, > * as well. > */ > static void try_to_steal_freepages(struct zone *zone, struct page *page, > - int start_type, int fallback_type) > + int start_type) It's actually not 'try_to_' anymore, is it? But could be, see below. > { > int current_order = page_order(page); > + int pages; > > /* Take ownership for orders >= pageblock_order */ > if (current_order >= pageblock_order) { > @@ -1148,19 +1167,12 @@ static void try_to_steal_freepages(struct zone *zone, struct page *page, > return; > } > > - if (current_order >= pageblock_order / 2 || > - start_type == MIGRATE_RECLAIMABLE || > - start_type == MIGRATE_UNMOVABLE || > - page_group_by_mobility_disabled) { > - int pages; > + pages = move_freepages_block(zone, page, start_type); > > - pages = move_freepages_block(zone, page, start_type); > - > - /* Claim the whole block if over half of it is free */ > - if (pages >= (1 << (pageblock_order-1)) || > - page_group_by_mobility_disabled) > - set_pageblock_migratetype(page, start_type); > - } > + /* Claim the whole block if over half of it is free */ > + if (pages >= (1 << (pageblock_order-1)) || > + page_group_by_mobility_disabled) > + set_pageblock_migratetype(page, start_type); > } > > /* Remove an element from the buddy allocator from the fallback list */ > @@ -1170,6 +1182,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) > struct free_area *area; > unsigned int current_order; > struct page *page; > + bool can_steal; > > /* Find the largest possible block of pages in the other list */ > for (current_order = MAX_ORDER-1; > @@ -1192,10 +1205,11 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) > struct page, lru); > area->nr_free--; > > - if (!is_migrate_cma(migratetype)) { > + can_steal = can_steal_freepages(current_order, > + start_migratetype, migratetype); > + if (can_steal) { can_steal is only used once, why not do if (can_steal_freepages()) directly? Or, call can_steal_freepages() from try_to_steal_freepages() and make try_to_steal_freepages() return its result. Then here it simplifies to: if (!try_to_steal_freepages(...) && is_migrate_cma(...)) buddy_type = migratetype; > try_to_steal_freepages(zone, page, > - start_migratetype, > - migratetype); > + start_migratetype); > } else { > /* > * When borrowing from MIGRATE_CMA, we need to > @@ -1203,7 +1217,8 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) > * itself, and we do not try to steal extra > * free pages. > */ > - buddy_type = migratetype; > + if (is_migrate_cma(migratetype)) > + buddy_type = migratetype; > } > > /* Remove the page from the freelists */ > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lb0-f174.google.com (mail-lb0-f174.google.com [209.85.217.174]) by kanga.kvack.org (Postfix) with ESMTP id 5E4F06B006C for ; Fri, 30 Jan 2015 09:43:31 -0500 (EST) Received: by mail-lb0-f174.google.com with SMTP id f15so36670796lbj.5 for ; Fri, 30 Jan 2015 06:43:30 -0800 (PST) Received: from mx2.suse.de (cantor2.suse.de. [195.135.220.15]) by mx.google.com with ESMTPS id e8si544248wiz.55.2015.01.30.06.43.28 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 30 Jan 2015 06:43:29 -0800 (PST) Message-ID: <54CB988F.4080109@suse.cz> Date: Fri, 30 Jan 2015 15:43:27 +0100 From: Vlastimil Babka MIME-Version: 1.0 Subject: Re: [PATCH v2 4/4] mm/compaction: enhance compaction finish condition References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-5-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-5-git-send-email-iamjoonsoo.kim@lge.com> Content-Type: text/plain; charset=iso-8859-2 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Joonsoo Kim , Andrew Morton Cc: Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim On 01/30/2015 01:34 PM, Joonsoo Kim wrote: > From: Joonsoo > > Compaction has anti fragmentation algorithm. It is that freepage > should be more than pageblock order to finish the compaction if we don't > find any freepage in requested migratetype buddy list. This is for > mitigating fragmentation, but, there is a lack of migratetype > consideration and it is too excessive compared to page allocator's anti > fragmentation algorithm. > > Not considering migratetype would cause premature finish of compaction. > For example, if allocation request is for unmovable migratetype, > freepage with CMA migratetype doesn't help that allocation and > compaction should not be stopped. But, current logic regards this > situation as compaction is no longer needed, so finish the compaction. > > Secondly, condition is too excessive compared to page allocator's logic. > We can steal freepage from other migratetype and change pageblock > migratetype on more relaxed conditions in page allocator. This is designed > to prevent fragmentation and we can use it here. Imposing hard constraint > only to the compaction doesn't help much in this case since page allocator > would cause fragmentation again. > > To solve these problems, this patch borrows anti fragmentation logic from > page allocator. It will reduce premature compaction finish in some cases > and reduce excessive compaction work. > > stress-highalloc test in mmtests with non movable order 7 allocation shows > considerable increase of compaction success rate. > > Compaction success rate (Compaction success * 100 / Compaction stalls, %) > 31.82 : 42.20 > > Signed-off-by: Joonsoo Kim I have some worries about longer-term fragmentation, some testing of stress-highalloc several times without restarts could be helpful. > --- > include/linux/mmzone.h | 3 +++ > mm/compaction.c | 30 ++++++++++++++++++++++++++++-- > mm/internal.h | 1 + > mm/page_alloc.c | 5 ++--- > 4 files changed, 34 insertions(+), 5 deletions(-) > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h > index f279d9c..a2906bc 100644 > --- a/include/linux/mmzone.h > +++ b/include/linux/mmzone.h > @@ -63,6 +63,9 @@ enum { > MIGRATE_TYPES > }; > > +#define FALLBACK_MIGRATETYPES (4) > +extern int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES]; > + > #ifdef CONFIG_CMA > # define is_migrate_cma(migratetype) unlikely((migratetype) == MIGRATE_CMA) > #else > diff --git a/mm/compaction.c b/mm/compaction.c > index 782772d..0460e4b 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -1125,6 +1125,29 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone, > return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE; > } > > +static bool can_steal_fallbacks(struct free_area *area, > + unsigned int order, int migratetype) Could you move this to page_alloc.c and then you don't have to export the fallbacks arrays? > +{ > + int i; > + int fallback_mt; > + > + if (area->nr_free == 0) > + return false; > + > + for (i = 0; i < FALLBACK_MIGRATETYPES; i++) { > + fallback_mt = fallbacks[migratetype][i]; > + if (fallback_mt == MIGRATE_RESERVE) > + break; > + > + if (list_empty(&area->free_list[fallback_mt])) > + continue; > + > + if (can_steal_freepages(order, migratetype, fallback_mt)) > + return true; > + } > + return false; > +} > + > static int __compact_finished(struct zone *zone, struct compact_control *cc, > const int migratetype) > { > @@ -1175,8 +1198,11 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc, > if (!list_empty(&area->free_list[migratetype])) > return COMPACT_PARTIAL; > > - /* Job done if allocation would set block type */ > - if (order >= pageblock_order && area->nr_free) > + /* > + * Job done if allocation would steal freepages from > + * other migratetype buddy lists. > + */ > + if (can_steal_fallbacks(area, order, migratetype)) > return COMPACT_PARTIAL; Seems somewhat wasteful in scenario where we want to satisfy a movable allocation and it's an async compaction. Then we don't compact in unmovable/reclaimable pageblock, and yet we will keep checking them for fallbacks. A price to pay for having generic code? Maybe can_steal_fallbacks could know the sync/async mode and use migrate_async_suitable appropriately. But then I wouldn't move it to page_alloc.c. More efficient could be a separate fallbacks array for async compaction with the unsuitable types removed. But maybe I'm just overengineering now. > } > > diff --git a/mm/internal.h b/mm/internal.h > index c4d6c9b..0a89a14 100644 > --- a/mm/internal.h > +++ b/mm/internal.h > @@ -201,6 +201,7 @@ unsigned long > isolate_migratepages_range(struct compact_control *cc, > unsigned long low_pfn, unsigned long end_pfn); > > +bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt); > #endif > > /* > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index ef74750..4c3538b 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -1026,7 +1026,7 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order, > * This array describes the order lists are fallen back to when > * the free lists for the desirable migrate type are depleted > */ > -static int fallbacks[MIGRATE_TYPES][4] = { > +int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES] = { > [MIGRATE_UNMOVABLE] = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE }, > [MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE }, > #ifdef CONFIG_CMA > @@ -1122,8 +1122,7 @@ static void change_pageblock_range(struct page *pageblock_page, > } > } > > -static bool can_steal_freepages(unsigned int order, > - int start_mt, int fallback_mt) > +bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt) > { > if (is_migrate_cma(fallback_mt)) > return false; > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qc0-f174.google.com (mail-qc0-f174.google.com [209.85.216.174]) by kanga.kvack.org (Postfix) with ESMTP id 5C25B6B0032 for ; Sat, 31 Jan 2015 02:39:07 -0500 (EST) Received: by mail-qc0-f174.google.com with SMTP id s11so23988952qcv.5 for ; Fri, 30 Jan 2015 23:39:07 -0800 (PST) Received: from BLU004-OMC2S4.hotmail.com (blu004-omc2s4.hotmail.com. [65.55.111.79]) by mx.google.com with ESMTPS id b50si17168360qga.56.2015.01.30.23.39.05 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Fri, 30 Jan 2015 23:39:06 -0800 (PST) Message-ID: Date: Sat, 31 Jan 2015 15:38:07 +0800 From: Zhang Yanfei MIME-Version: 1.0 Subject: Re: [PATCH v2 1/4] mm/compaction: fix wrong order check in compact_finished() References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-2-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-2-git-send-email-iamjoonsoo.kim@lge.com> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Joonsoo Kim , Andrew Morton Cc: Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim , stable@vger.kernel.org Hello, At 2015/1/30 20:34, Joonsoo Kim wrote: > What we want to check here is whether there is highorder freepage > in buddy list of other migratetype in order to steal it without > fragmentation. But, current code just checks cc->order which means > allocation request order. So, this is wrong. > > Without this fix, non-movable synchronous compaction below pageblock order > would not stopped until compaction is complete, because migratetype of most > pageblocks are movable and high order freepage made by compaction is usually > on movable type buddy list. > > There is some report related to this bug. See below link. > > http://www.spinics.net/lists/linux-mm/msg81666.html > > Although the issued system still has load spike comes from compaction, > this makes that system completely stable and responsive according to > his report. > > stress-highalloc test in mmtests with non movable order 7 allocation doesn't > show any notable difference in allocation success rate, but, it shows more > compaction success rate. > > Compaction success rate (Compaction success * 100 / Compaction stalls, %) > 18.47 : 28.94 > > Cc: > Acked-by: Vlastimil Babka > Signed-off-by: Joonsoo Kim Reviewed-by: Zhang Yanfei > --- > mm/compaction.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/compaction.c b/mm/compaction.c > index b68736c..4954e19 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -1173,7 +1173,7 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc, > return COMPACT_PARTIAL; > > /* Job done if allocation would set block type */ > - if (cc->order >= pageblock_order && area->nr_free) > + if (order >= pageblock_order && area->nr_free) > return COMPACT_PARTIAL; > } > > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qc0-f180.google.com (mail-qc0-f180.google.com [209.85.216.180]) by kanga.kvack.org (Postfix) with ESMTP id C473F6B0032 for ; Sat, 31 Jan 2015 02:50:00 -0500 (EST) Received: by mail-qc0-f180.google.com with SMTP id r5so23841506qcx.11 for ; Fri, 30 Jan 2015 23:50:00 -0800 (PST) Received: from BLU004-OMC2S20.hotmail.com (blu004-omc2s20.hotmail.com. [65.55.111.95]) by mx.google.com with ESMTPS id v109si17154232qge.78.2015.01.30.23.49.59 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Fri, 30 Jan 2015 23:49:59 -0800 (PST) Message-ID: Date: Sat, 31 Jan 2015 15:49:38 +0800 From: Zhang Yanfei MIME-Version: 1.0 Subject: Re: [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-3-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-3-git-send-email-iamjoonsoo.kim@lge.com> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Joonsoo Kim , Andrew Morton Cc: Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim Hello, At 2015/1/30 20:34, Joonsoo Kim wrote: > From: Joonsoo > > Currently, freepage isolation in one pageblock doesn't consider how many > freepages we isolate. When I traced flow of compaction, compaction > sometimes isolates more than 256 freepages to migrate just 32 pages. > > In this patch, freepage isolation is stopped at the point that we > have more isolated freepage than isolated page for migration. This > results in slowing down free page scanner and make compaction success > rate higher. > > stress-highalloc test in mmtests with non movable order 7 allocation shows > increase of compaction success rate. > > Compaction success rate (Compaction success * 100 / Compaction stalls, %) > 27.13 : 31.82 > > pfn where both scanners meets on compaction complete > (separate test due to enormous tracepoint buffer) > (zone_start=4096, zone_end=1048576) > 586034 : 654378 > > In fact, I didn't fully understand why this patch results in such good > result. There was a guess that not used freepages are released to pcp list > and on next compaction trial we won't isolate them again so compaction > success rate would decrease. To prevent this effect, I tested with adding > pcp drain code on release_freepages(), but, it has no good effect. > > Anyway, this patch reduces waste time to isolate unneeded freepages so > seems reasonable. Reviewed-by: Zhang Yanfei IMHO, the patch making the free scanner move slower makes both scanners meet further. Before this patch, if we isolate too many free pages and even after we release the unneeded free pages later the free scanner still already be there and will be moved forward again next time -- the free scanner just cannot be moved back to grab the free pages we released before no matter where the free pages in, pcp or buddy. > > Signed-off-by: Joonsoo Kim > --- > mm/compaction.c | 17 ++++++++++------- > 1 file changed, 10 insertions(+), 7 deletions(-) > > diff --git a/mm/compaction.c b/mm/compaction.c > index 4954e19..782772d 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -490,6 +490,13 @@ static unsigned long isolate_freepages_block(struct compact_control *cc, > > /* If a page was split, advance to the end of it */ > if (isolated) { > + cc->nr_freepages += isolated; > + if (!strict && > + cc->nr_migratepages <= cc->nr_freepages) { > + blockpfn += isolated; > + break; > + } > + > blockpfn += isolated - 1; > cursor += isolated - 1; > continue; > @@ -899,7 +906,6 @@ static void isolate_freepages(struct compact_control *cc) > unsigned long isolate_start_pfn; /* exact pfn we start at */ > unsigned long block_end_pfn; /* end of current pageblock */ > unsigned long low_pfn; /* lowest pfn scanner is able to scan */ > - int nr_freepages = cc->nr_freepages; > struct list_head *freelist = &cc->freepages; > > /* > @@ -924,11 +930,11 @@ static void isolate_freepages(struct compact_control *cc) > * pages on cc->migratepages. We stop searching if the migrate > * and free page scanners meet or enough free pages are isolated. > */ > - for (; block_start_pfn >= low_pfn && cc->nr_migratepages > nr_freepages; > + for (; block_start_pfn >= low_pfn && > + cc->nr_migratepages > cc->nr_freepages; > block_end_pfn = block_start_pfn, > block_start_pfn -= pageblock_nr_pages, > isolate_start_pfn = block_start_pfn) { > - unsigned long isolated; > > /* > * This can iterate a massively long zone without finding any > @@ -953,9 +959,8 @@ static void isolate_freepages(struct compact_control *cc) > continue; > > /* Found a block suitable for isolating free pages from. */ > - isolated = isolate_freepages_block(cc, &isolate_start_pfn, > + isolate_freepages_block(cc, &isolate_start_pfn, > block_end_pfn, freelist, false); > - nr_freepages += isolated; > > /* > * Remember where the free scanner should restart next time, > @@ -987,8 +992,6 @@ static void isolate_freepages(struct compact_control *cc) > */ > if (block_start_pfn < low_pfn) > cc->free_pfn = cc->migrate_pfn; > - > - cc->nr_freepages = nr_freepages; > } > > /* > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-we0-f171.google.com (mail-we0-f171.google.com [74.125.82.171]) by kanga.kvack.org (Postfix) with ESMTP id B063C6B0032 for ; Sat, 31 Jan 2015 03:32:02 -0500 (EST) Received: by mail-we0-f171.google.com with SMTP id k11so28471905wes.2 for ; Sat, 31 Jan 2015 00:32:02 -0800 (PST) Received: from mx2.suse.de (cantor2.suse.de. [195.135.220.15]) by mx.google.com with ESMTPS id j4si11484047wix.40.2015.01.31.00.32.00 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Sat, 31 Jan 2015 00:32:01 -0800 (PST) Message-ID: <54CC92FD.5000601@suse.cz> Date: Sat, 31 Jan 2015 09:31:57 +0100 From: Vlastimil Babka MIME-Version: 1.0 Subject: Re: [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-3-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Zhang Yanfei , Joonsoo Kim , Andrew Morton Cc: Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim On 01/31/2015 08:49 AM, Zhang Yanfei wrote: > Hello, > > At 2015/1/30 20:34, Joonsoo Kim wrote: > > Reviewed-by: Zhang Yanfei > > IMHO, the patch making the free scanner move slower makes both scanners > meet further. Before this patch, if we isolate too many free pages and even > after we release the unneeded free pages later the free scanner still already > be there and will be moved forward again next time -- the free scanner just > cannot be moved back to grab the free pages we released before no matter where > the free pages in, pcp or buddy. It can be actually moved back. If we are releasing free pages, it means the current compaction is terminating, and it will set zone->compact_cached_free_pfn back to the position of the released free page that was furthest back. The next compaction will start from the cached free pfn. It is however possible that another compaction runs in parallel and has progressed further and overwrites the cached free pfn. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qc0-f181.google.com (mail-qc0-f181.google.com [209.85.216.181]) by kanga.kvack.org (Postfix) with ESMTP id 9FF6D6B0032 for ; Sat, 31 Jan 2015 05:18:11 -0500 (EST) Received: by mail-qc0-f181.google.com with SMTP id l6so24113053qcy.12 for ; Sat, 31 Jan 2015 02:18:11 -0800 (PST) Received: from BLU004-OMC2S33.hotmail.com (blu004-omc2s33.hotmail.com. [65.55.111.108]) by mx.google.com with ESMTPS id r2si16735940qcq.9.2015.01.31.02.18.10 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Sat, 31 Jan 2015 02:18:10 -0800 (PST) Message-ID: Date: Sat, 31 Jan 2015 18:17:55 +0800 From: Zhang Yanfei MIME-Version: 1.0 Subject: Re: [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-3-git-send-email-iamjoonsoo.kim@lge.com> <54CC92FD.5000601@suse.cz> In-Reply-To: <54CC92FD.5000601@suse.cz> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Vlastimil Babka , Joonsoo Kim , Andrew Morton Cc: Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim At 2015/1/31 16:31, Vlastimil Babka wrote: > On 01/31/2015 08:49 AM, Zhang Yanfei wrote: >> Hello, >> >> At 2015/1/30 20:34, Joonsoo Kim wrote: >> >> Reviewed-by: Zhang Yanfei >> >> IMHO, the patch making the free scanner move slower makes both scanners >> meet further. Before this patch, if we isolate too many free pages and even >> after we release the unneeded free pages later the free scanner still already >> be there and will be moved forward again next time -- the free scanner just >> cannot be moved back to grab the free pages we released before no matter where >> the free pages in, pcp or buddy. > > It can be actually moved back. If we are releasing free pages, it means the > current compaction is terminating, and it will set zone->compact_cached_free_pfn > back to the position of the released free page that was furthest back. The next > compaction will start from the cached free pfn. Yeah, you are right. I missed the release_freepages(). Thanks! > > It is however possible that another compaction runs in parallel and has > progressed further and overwrites the cached free pfn. > Hmm, maybe. Thanks. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f51.google.com (mail-pa0-f51.google.com [209.85.220.51]) by kanga.kvack.org (Postfix) with ESMTP id 271C06B0032 for ; Sat, 31 Jan 2015 07:38:43 -0500 (EST) Received: by mail-pa0-f51.google.com with SMTP id fb1so63267671pad.10 for ; Sat, 31 Jan 2015 04:38:42 -0800 (PST) Received: from BLU004-OMC2S24.hotmail.com (blu004-omc2s24.hotmail.com. [65.55.111.99]) by mx.google.com with ESMTPS id in4si17171244pbc.32.2015.01.31.04.38.41 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Sat, 31 Jan 2015 04:38:42 -0800 (PST) Message-ID: Date: Sat, 31 Jan 2015 20:38:10 +0800 From: Zhang Yanfei MIME-Version: 1.0 Subject: Re: [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-4-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-4-git-send-email-iamjoonsoo.kim@lge.com> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Joonsoo Kim , Andrew Morton Cc: Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim At 2015/1/30 20:34, Joonsoo Kim wrote: > From: Joonsoo > > This is preparation step to use page allocator's anti fragmentation logic > in compaction. This patch just separates steal decision part from actual > steal behaviour part so there is no functional change. > > Signed-off-by: Joonsoo Kim > --- > mm/page_alloc.c | 49 ++++++++++++++++++++++++++++++++----------------- > 1 file changed, 32 insertions(+), 17 deletions(-) > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index 8d52ab1..ef74750 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -1122,6 +1122,24 @@ static void change_pageblock_range(struct page *pageblock_page, > } > } > > +static bool can_steal_freepages(unsigned int order, > + int start_mt, int fallback_mt) > +{ > + if (is_migrate_cma(fallback_mt)) > + return false; > + > + if (order >= pageblock_order) > + return true; > + > + if (order >= pageblock_order / 2 || > + start_mt == MIGRATE_RECLAIMABLE || > + start_mt == MIGRATE_UNMOVABLE || > + page_group_by_mobility_disabled) > + return true; > + > + return false; > +} So some comments which can tell the cases can or cannot steal freepages from other migratetype is necessary IMHO. Actually we can just move some comments in try_to_steal_pages to here. Thanks. > + > /* > * When we are falling back to another migratetype during allocation, try to > * steal extra free pages from the same pageblocks to satisfy further > @@ -1138,9 +1156,10 @@ static void change_pageblock_range(struct page *pageblock_page, > * as well. > */ > static void try_to_steal_freepages(struct zone *zone, struct page *page, > - int start_type, int fallback_type) > + int start_type) > { > int current_order = page_order(page); > + int pages; > > /* Take ownership for orders >= pageblock_order */ > if (current_order >= pageblock_order) { > @@ -1148,19 +1167,12 @@ static void try_to_steal_freepages(struct zone *zone, struct page *page, > return; > } > > - if (current_order >= pageblock_order / 2 || > - start_type == MIGRATE_RECLAIMABLE || > - start_type == MIGRATE_UNMOVABLE || > - page_group_by_mobility_disabled) { > - int pages; > + pages = move_freepages_block(zone, page, start_type); > > - pages = move_freepages_block(zone, page, start_type); > - > - /* Claim the whole block if over half of it is free */ > - if (pages >= (1 << (pageblock_order-1)) || > - page_group_by_mobility_disabled) > - set_pageblock_migratetype(page, start_type); > - } > + /* Claim the whole block if over half of it is free */ > + if (pages >= (1 << (pageblock_order-1)) || > + page_group_by_mobility_disabled) > + set_pageblock_migratetype(page, start_type); > } > > /* Remove an element from the buddy allocator from the fallback list */ > @@ -1170,6 +1182,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) > struct free_area *area; > unsigned int current_order; > struct page *page; > + bool can_steal; > > /* Find the largest possible block of pages in the other list */ > for (current_order = MAX_ORDER-1; > @@ -1192,10 +1205,11 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) > struct page, lru); > area->nr_free--; > > - if (!is_migrate_cma(migratetype)) { > + can_steal = can_steal_freepages(current_order, > + start_migratetype, migratetype); > + if (can_steal) { > try_to_steal_freepages(zone, page, > - start_migratetype, > - migratetype); > + start_migratetype); > } else { > /* > * When borrowing from MIGRATE_CMA, we need to > @@ -1203,7 +1217,8 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) > * itself, and we do not try to steal extra > * free pages. > */ > - buddy_type = migratetype; > + if (is_migrate_cma(migratetype)) > + buddy_type = migratetype; > } > > /* Remove the page from the freelists */ > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qa0-f53.google.com (mail-qa0-f53.google.com [209.85.216.53]) by kanga.kvack.org (Postfix) with ESMTP id 86A3D6B0032 for ; Sat, 31 Jan 2015 10:58:18 -0500 (EST) Received: by mail-qa0-f53.google.com with SMTP id n4so23986367qaq.12 for ; Sat, 31 Jan 2015 07:58:18 -0800 (PST) Received: from BLU004-OMC2S18.hotmail.com (blu004-omc2s18.hotmail.com. [65.55.111.93]) by mx.google.com with ESMTPS id w5si18331095qad.21.2015.01.31.07.58.17 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Sat, 31 Jan 2015 07:58:17 -0800 (PST) Message-ID: Date: Sat, 31 Jan 2015 23:58:03 +0800 From: Zhang Yanfei MIME-Version: 1.0 Subject: Re: [PATCH v2 4/4] mm/compaction: enhance compaction finish condition References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-5-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-5-git-send-email-iamjoonsoo.kim@lge.com> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Joonsoo Kim , Andrew Morton Cc: Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim At 2015/1/30 20:34, Joonsoo Kim wrote: > From: Joonsoo > > Compaction has anti fragmentation algorithm. It is that freepage > should be more than pageblock order to finish the compaction if we don't > find any freepage in requested migratetype buddy list. This is for > mitigating fragmentation, but, there is a lack of migratetype > consideration and it is too excessive compared to page allocator's anti > fragmentation algorithm. > > Not considering migratetype would cause premature finish of compaction. > For example, if allocation request is for unmovable migratetype, > freepage with CMA migratetype doesn't help that allocation and > compaction should not be stopped. But, current logic regards this > situation as compaction is no longer needed, so finish the compaction. > > Secondly, condition is too excessive compared to page allocator's logic. > We can steal freepage from other migratetype and change pageblock > migratetype on more relaxed conditions in page allocator. This is designed > to prevent fragmentation and we can use it here. Imposing hard constraint > only to the compaction doesn't help much in this case since page allocator > would cause fragmentation again. Changing both two behaviours in compaction may change the high order allocation behaviours in the buddy allocator slowpath, so just as Vlastimil suggested, some data from allocator should be necessary and helpful, IMHO. Thanks. > > To solve these problems, this patch borrows anti fragmentation logic from > page allocator. It will reduce premature compaction finish in some cases > and reduce excessive compaction work. > > stress-highalloc test in mmtests with non movable order 7 allocation shows > considerable increase of compaction success rate. > > Compaction success rate (Compaction success * 100 / Compaction stalls, %) > 31.82 : 42.20 > > Signed-off-by: Joonsoo Kim > --- > include/linux/mmzone.h | 3 +++ > mm/compaction.c | 30 ++++++++++++++++++++++++++++-- > mm/internal.h | 1 + > mm/page_alloc.c | 5 ++--- > 4 files changed, 34 insertions(+), 5 deletions(-) > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h > index f279d9c..a2906bc 100644 > --- a/include/linux/mmzone.h > +++ b/include/linux/mmzone.h > @@ -63,6 +63,9 @@ enum { > MIGRATE_TYPES > }; > > +#define FALLBACK_MIGRATETYPES (4) > +extern int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES]; > + > #ifdef CONFIG_CMA > # define is_migrate_cma(migratetype) unlikely((migratetype) == MIGRATE_CMA) > #else > diff --git a/mm/compaction.c b/mm/compaction.c > index 782772d..0460e4b 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -1125,6 +1125,29 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone, > return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE; > } > > +static bool can_steal_fallbacks(struct free_area *area, > + unsigned int order, int migratetype) > +{ > + int i; > + int fallback_mt; > + > + if (area->nr_free == 0) > + return false; > + > + for (i = 0; i < FALLBACK_MIGRATETYPES; i++) { > + fallback_mt = fallbacks[migratetype][i]; > + if (fallback_mt == MIGRATE_RESERVE) > + break; > + > + if (list_empty(&area->free_list[fallback_mt])) > + continue; > + > + if (can_steal_freepages(order, migratetype, fallback_mt)) > + return true; > + } > + return false; > +} > + > static int __compact_finished(struct zone *zone, struct compact_control *cc, > const int migratetype) > { > @@ -1175,8 +1198,11 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc, > if (!list_empty(&area->free_list[migratetype])) > return COMPACT_PARTIAL; > > - /* Job done if allocation would set block type */ > - if (order >= pageblock_order && area->nr_free) > + /* > + * Job done if allocation would steal freepages from > + * other migratetype buddy lists. > + */ > + if (can_steal_fallbacks(area, order, migratetype)) > return COMPACT_PARTIAL; > } > > diff --git a/mm/internal.h b/mm/internal.h > index c4d6c9b..0a89a14 100644 > --- a/mm/internal.h > +++ b/mm/internal.h > @@ -201,6 +201,7 @@ unsigned long > isolate_migratepages_range(struct compact_control *cc, > unsigned long low_pfn, unsigned long end_pfn); > > +bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt); > #endif > > /* > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index ef74750..4c3538b 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -1026,7 +1026,7 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order, > * This array describes the order lists are fallen back to when > * the free lists for the desirable migrate type are depleted > */ > -static int fallbacks[MIGRATE_TYPES][4] = { > +int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES] = { > [MIGRATE_UNMOVABLE] = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE }, > [MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE }, > #ifdef CONFIG_CMA > @@ -1122,8 +1122,7 @@ static void change_pageblock_range(struct page *pageblock_page, > } > } > > -static bool can_steal_freepages(unsigned int order, > - int start_mt, int fallback_mt) > +bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt) > { > if (is_migrate_cma(fallback_mt)) > return false; > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f50.google.com (mail-pa0-f50.google.com [209.85.220.50]) by kanga.kvack.org (Postfix) with ESMTP id 221E26B0038 for ; Mon, 2 Feb 2015 02:01:14 -0500 (EST) Received: by mail-pa0-f50.google.com with SMTP id rd3so78624480pab.9 for ; Sun, 01 Feb 2015 23:01:13 -0800 (PST) Received: from lgeamrelo01.lge.com (lgeamrelo01.lge.com. [156.147.1.125]) by mx.google.com with ESMTP id bk8si15527171pdb.44.2015.02.01.23.01.12 for ; Sun, 01 Feb 2015 23:01:13 -0800 (PST) Date: Mon, 2 Feb 2015 16:02:49 +0900 From: Joonsoo Kim Subject: Re: [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part Message-ID: <20150202070248.GA6488@js1304-P5Q-DELUXE> References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-4-git-send-email-iamjoonsoo.kim@lge.com> <54CB94E6.7010805@suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <54CB94E6.7010805@suse.cz> Sender: owner-linux-mm@kvack.org List-ID: To: Vlastimil Babka Cc: Andrew Morton , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org On Fri, Jan 30, 2015 at 03:27:50PM +0100, Vlastimil Babka wrote: > On 01/30/2015 01:34 PM, Joonsoo Kim wrote: > > From: Joonsoo > > > > This is preparation step to use page allocator's anti fragmentation logic > > in compaction. This patch just separates steal decision part from actual > > steal behaviour part so there is no functional change. > > > > Signed-off-by: Joonsoo Kim > > --- > > mm/page_alloc.c | 49 ++++++++++++++++++++++++++++++++----------------- > > 1 file changed, 32 insertions(+), 17 deletions(-) > > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > > index 8d52ab1..ef74750 100644 > > --- a/mm/page_alloc.c > > +++ b/mm/page_alloc.c > > @@ -1122,6 +1122,24 @@ static void change_pageblock_range(struct page *pageblock_page, > > } > > } > > > > +static bool can_steal_freepages(unsigned int order, > > + int start_mt, int fallback_mt) > > +{ > > + if (is_migrate_cma(fallback_mt)) > > + return false; > > + > > + if (order >= pageblock_order) > > + return true; > > + > > + if (order >= pageblock_order / 2 || > > + start_mt == MIGRATE_RECLAIMABLE || > > + start_mt == MIGRATE_UNMOVABLE || > > + page_group_by_mobility_disabled) > > + return true; > > + > > + return false; > > +} > > + > > /* > > * When we are falling back to another migratetype during allocation, try to > > * steal extra free pages from the same pageblocks to satisfy further > > @@ -1138,9 +1156,10 @@ static void change_pageblock_range(struct page *pageblock_page, > > * as well. > > */ > > static void try_to_steal_freepages(struct zone *zone, struct page *page, > > - int start_type, int fallback_type) > > + int start_type) > > It's actually not 'try_to_' anymore, is it? But could be, see below. > > > { > > int current_order = page_order(page); > > + int pages; > > > > /* Take ownership for orders >= pageblock_order */ > > if (current_order >= pageblock_order) { > > @@ -1148,19 +1167,12 @@ static void try_to_steal_freepages(struct zone *zone, struct page *page, > > return; > > } > > > > - if (current_order >= pageblock_order / 2 || > > - start_type == MIGRATE_RECLAIMABLE || > > - start_type == MIGRATE_UNMOVABLE || > > - page_group_by_mobility_disabled) { > > - int pages; > > + pages = move_freepages_block(zone, page, start_type); > > > > - pages = move_freepages_block(zone, page, start_type); > > - > > - /* Claim the whole block if over half of it is free */ > > - if (pages >= (1 << (pageblock_order-1)) || > > - page_group_by_mobility_disabled) > > - set_pageblock_migratetype(page, start_type); > > - } > > + /* Claim the whole block if over half of it is free */ > > + if (pages >= (1 << (pageblock_order-1)) || > > + page_group_by_mobility_disabled) > > + set_pageblock_migratetype(page, start_type); > > } > > > > /* Remove an element from the buddy allocator from the fallback list */ > > @@ -1170,6 +1182,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) > > struct free_area *area; > > unsigned int current_order; > > struct page *page; > > + bool can_steal; > > > > /* Find the largest possible block of pages in the other list */ > > for (current_order = MAX_ORDER-1; > > @@ -1192,10 +1205,11 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) > > struct page, lru); > > area->nr_free--; > > > > - if (!is_migrate_cma(migratetype)) { > > + can_steal = can_steal_freepages(current_order, > > + start_migratetype, migratetype); > > + if (can_steal) { > > can_steal is only used once, why not do if (can_steal_freepages()) directly? > > Or, call can_steal_freepages() from try_to_steal_freepages() and make > try_to_steal_freepages() return its result. Then here it simplifies to: > > if (!try_to_steal_freepages(...) && is_migrate_cma(...)) > buddy_type = migratetype; You're right. Your commented code loosk better. Your comment on 3/4 and 4/4 makes me reconsider this code factorization and I found better solution. I will send it soon. Thanks. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f49.google.com (mail-pa0-f49.google.com [209.85.220.49]) by kanga.kvack.org (Postfix) with ESMTP id 05B3B6B006C for ; Mon, 2 Feb 2015 02:01:45 -0500 (EST) Received: by mail-pa0-f49.google.com with SMTP id fa1so78740988pad.8 for ; Sun, 01 Feb 2015 23:01:44 -0800 (PST) Received: from lgeamrelo02.lge.com (lgeamrelo02.lge.com. [156.147.1.126]) by mx.google.com with ESMTP id r2si22690783pde.8.2015.02.01.23.01.42 for ; Sun, 01 Feb 2015 23:01:44 -0800 (PST) Date: Mon, 2 Feb 2015 16:03:21 +0900 From: Joonsoo Kim Subject: Re: [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part Message-ID: <20150202070321.GB6488@js1304-P5Q-DELUXE> References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-4-git-send-email-iamjoonsoo.kim@lge.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: owner-linux-mm@kvack.org List-ID: To: Zhang Yanfei Cc: Andrew Morton , Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org On Sat, Jan 31, 2015 at 08:38:10PM +0800, Zhang Yanfei wrote: > At 2015/1/30 20:34, Joonsoo Kim wrote: > > From: Joonsoo > > > > This is preparation step to use page allocator's anti fragmentation logic > > in compaction. This patch just separates steal decision part from actual > > steal behaviour part so there is no functional change. > > > > Signed-off-by: Joonsoo Kim > > --- > > mm/page_alloc.c | 49 ++++++++++++++++++++++++++++++++----------------- > > 1 file changed, 32 insertions(+), 17 deletions(-) > > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > > index 8d52ab1..ef74750 100644 > > --- a/mm/page_alloc.c > > +++ b/mm/page_alloc.c > > @@ -1122,6 +1122,24 @@ static void change_pageblock_range(struct page *pageblock_page, > > } > > } > > > > +static bool can_steal_freepages(unsigned int order, > > + int start_mt, int fallback_mt) > > +{ > > + if (is_migrate_cma(fallback_mt)) > > + return false; > > + > > + if (order >= pageblock_order) > > + return true; > > + > > + if (order >= pageblock_order / 2 || > > + start_mt == MIGRATE_RECLAIMABLE || > > + start_mt == MIGRATE_UNMOVABLE || > > + page_group_by_mobility_disabled) > > + return true; > > + > > + return false; > > +} > > So some comments which can tell the cases can or cannot steal freepages > from other migratetype is necessary IMHO. Actually we can just > move some comments in try_to_steal_pages to here. Yes, move some comments looks sufficient to me. I will fix it. Thanks. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f49.google.com (mail-pa0-f49.google.com [209.85.220.49]) by kanga.kvack.org (Postfix) with ESMTP id D3CCA6B0038 for ; Mon, 2 Feb 2015 02:09:32 -0500 (EST) Received: by mail-pa0-f49.google.com with SMTP id fa1so78822420pad.8 for ; Sun, 01 Feb 2015 23:09:32 -0800 (PST) Received: from lgeamrelo02.lge.com (lgeamrelo02.lge.com. [156.147.1.126]) by mx.google.com with ESMTP id uv4si22512174pbc.110.2015.02.01.23.09.30 for ; Sun, 01 Feb 2015 23:09:31 -0800 (PST) Date: Mon, 2 Feb 2015 16:11:09 +0900 From: Joonsoo Kim Subject: Re: [PATCH v2 4/4] mm/compaction: enhance compaction finish condition Message-ID: <20150202071109.GC6488@js1304-P5Q-DELUXE> References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-5-git-send-email-iamjoonsoo.kim@lge.com> <54CB988F.4080109@suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <54CB988F.4080109@suse.cz> Sender: owner-linux-mm@kvack.org List-ID: To: Vlastimil Babka Cc: Andrew Morton , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org On Fri, Jan 30, 2015 at 03:43:27PM +0100, Vlastimil Babka wrote: > On 01/30/2015 01:34 PM, Joonsoo Kim wrote: > > From: Joonsoo > > > > Compaction has anti fragmentation algorithm. It is that freepage > > should be more than pageblock order to finish the compaction if we don't > > find any freepage in requested migratetype buddy list. This is for > > mitigating fragmentation, but, there is a lack of migratetype > > consideration and it is too excessive compared to page allocator's anti > > fragmentation algorithm. > > > > Not considering migratetype would cause premature finish of compaction. > > For example, if allocation request is for unmovable migratetype, > > freepage with CMA migratetype doesn't help that allocation and > > compaction should not be stopped. But, current logic regards this > > situation as compaction is no longer needed, so finish the compaction. > > > > Secondly, condition is too excessive compared to page allocator's logic. > > We can steal freepage from other migratetype and change pageblock > > migratetype on more relaxed conditions in page allocator. This is designed > > to prevent fragmentation and we can use it here. Imposing hard constraint > > only to the compaction doesn't help much in this case since page allocator > > would cause fragmentation again. > > > > To solve these problems, this patch borrows anti fragmentation logic from > > page allocator. It will reduce premature compaction finish in some cases > > and reduce excessive compaction work. > > > > stress-highalloc test in mmtests with non movable order 7 allocation shows > > considerable increase of compaction success rate. > > > > Compaction success rate (Compaction success * 100 / Compaction stalls, %) > > 31.82 : 42.20 > > > > Signed-off-by: Joonsoo Kim > > I have some worries about longer-term fragmentation, some testing of > stress-highalloc several times without restarts could be helpful. Okay. I will do it. > > > --- > > include/linux/mmzone.h | 3 +++ > > mm/compaction.c | 30 ++++++++++++++++++++++++++++-- > > mm/internal.h | 1 + > > mm/page_alloc.c | 5 ++--- > > 4 files changed, 34 insertions(+), 5 deletions(-) > > > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h > > index f279d9c..a2906bc 100644 > > --- a/include/linux/mmzone.h > > +++ b/include/linux/mmzone.h > > @@ -63,6 +63,9 @@ enum { > > MIGRATE_TYPES > > }; > > > > +#define FALLBACK_MIGRATETYPES (4) > > +extern int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES]; > > + > > #ifdef CONFIG_CMA > > # define is_migrate_cma(migratetype) unlikely((migratetype) == MIGRATE_CMA) > > #else > > diff --git a/mm/compaction.c b/mm/compaction.c > > index 782772d..0460e4b 100644 > > --- a/mm/compaction.c > > +++ b/mm/compaction.c > > @@ -1125,6 +1125,29 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone, > > return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE; > > } > > > > +static bool can_steal_fallbacks(struct free_area *area, > > + unsigned int order, int migratetype) > > Could you move this to page_alloc.c and then you don't have to export the > fallbacks arrays? Okay. > > > +{ > > + int i; > > + int fallback_mt; > > + > > + if (area->nr_free == 0) > > + return false; > > + > > + for (i = 0; i < FALLBACK_MIGRATETYPES; i++) { > > + fallback_mt = fallbacks[migratetype][i]; > > + if (fallback_mt == MIGRATE_RESERVE) > > + break; > > + > > + if (list_empty(&area->free_list[fallback_mt])) > > + continue; > > + > > + if (can_steal_freepages(order, migratetype, fallback_mt)) > > + return true; > > + } > > + return false; > > +} > > + > > static int __compact_finished(struct zone *zone, struct compact_control *cc, > > const int migratetype) > > { > > @@ -1175,8 +1198,11 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc, > > if (!list_empty(&area->free_list[migratetype])) > > return COMPACT_PARTIAL; > > > > - /* Job done if allocation would set block type */ > > - if (order >= pageblock_order && area->nr_free) > > + /* > > + * Job done if allocation would steal freepages from > > + * other migratetype buddy lists. > > + */ > > + if (can_steal_fallbacks(area, order, migratetype)) > > return COMPACT_PARTIAL; > > Seems somewhat wasteful in scenario where we want to satisfy a movable > allocation and it's an async compaction. Then we don't compact in > unmovable/reclaimable pageblock, and yet we will keep checking them for > fallbacks. A price to pay for having generic code? I think that there would be lucky case that high order freepage on unmovable/reclaimable pageblock is made by concurrent freeing. In this case, finishing compaction would be good thing. And, this logic would cause marginal overhead so generic code seems justificable to me. Thanks. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f51.google.com (mail-pa0-f51.google.com [209.85.220.51]) by kanga.kvack.org (Postfix) with ESMTP id 443D86B0038 for ; Mon, 2 Feb 2015 02:10:47 -0500 (EST) Received: by mail-pa0-f51.google.com with SMTP id fb1so78948486pad.10 for ; Sun, 01 Feb 2015 23:10:47 -0800 (PST) Received: from lgeamrelo04.lge.com (lgeamrelo04.lge.com. [156.147.1.127]) by mx.google.com with ESMTP id ck3si1765683pad.80.2015.02.01.23.10.45 for ; Sun, 01 Feb 2015 23:10:46 -0800 (PST) Date: Mon, 2 Feb 2015 16:12:23 +0900 From: Joonsoo Kim Subject: Re: [PATCH v2 4/4] mm/compaction: enhance compaction finish condition Message-ID: <20150202071223.GD6488@js1304-P5Q-DELUXE> References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-5-git-send-email-iamjoonsoo.kim@lge.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: owner-linux-mm@kvack.org List-ID: To: Zhang Yanfei Cc: Andrew Morton , Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org On Sat, Jan 31, 2015 at 11:58:03PM +0800, Zhang Yanfei wrote: > At 2015/1/30 20:34, Joonsoo Kim wrote: > > From: Joonsoo > > > > Compaction has anti fragmentation algorithm. It is that freepage > > should be more than pageblock order to finish the compaction if we don't > > find any freepage in requested migratetype buddy list. This is for > > mitigating fragmentation, but, there is a lack of migratetype > > consideration and it is too excessive compared to page allocator's anti > > fragmentation algorithm. > > > > Not considering migratetype would cause premature finish of compaction. > > For example, if allocation request is for unmovable migratetype, > > freepage with CMA migratetype doesn't help that allocation and > > compaction should not be stopped. But, current logic regards this > > situation as compaction is no longer needed, so finish the compaction. > > > > Secondly, condition is too excessive compared to page allocator's logic. > > We can steal freepage from other migratetype and change pageblock > > migratetype on more relaxed conditions in page allocator. This is designed > > to prevent fragmentation and we can use it here. Imposing hard constraint > > only to the compaction doesn't help much in this case since page allocator > > would cause fragmentation again. > > Changing both two behaviours in compaction may change the high order allocation > behaviours in the buddy allocator slowpath, so just as Vlastimil suggested, > some data from allocator should be necessary and helpful, IMHO. As Vlastimil said, fragmentation effect should be checked. I will do it and report the result on next version. Thanks. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1761876AbbA3Mec (ORCPT ); Fri, 30 Jan 2015 07:34:32 -0500 Received: from mail-pa0-f41.google.com ([209.85.220.41]:64421 "EHLO mail-pa0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752129AbbA3Mea (ORCPT ); Fri, 30 Jan 2015 07:34:30 -0500 From: Joonsoo Kim X-Google-Original-From: Joonsoo Kim To: Andrew Morton Cc: Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim Subject: [PATCH v2 0/4] enhance compaction success rate Date: Fri, 30 Jan 2015 21:34:08 +0900 Message-Id: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> X-Mailer: git-send-email 1.9.1 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This patchset aims at increase of compaction success rate. Changes are related to compaction finish condition and freepage isolation condition. >>From these changes, I did stress highalloc test in mmtests with nonmovable order 7 allocation configuration, and compaction success rate (%) are Base Patch-1 Patch-2 Patch-3 Patch-4 18.47 27.13 31.82 -- 42.20 Note: Base version is tested in v1 and the others are tested freshly. Test is perform based on next-20150103 and Vlastimil's stealing logic patches due to current next's unstablility. Patch-3 isn't tested since there is no functional change. Joonsoo (3): mm/compaction: stop the isolation when we isolate enough freepage mm/page_alloc: separate steal decision from steal behaviour part mm/compaction: enhance compaction finish condition Joonsoo Kim (1): mm/compaction: fix wrong order check in compact_finished() include/linux/mmzone.h | 3 +++ mm/compaction.c | 47 ++++++++++++++++++++++++++++++++++++++--------- mm/internal.h | 1 + mm/page_alloc.c | 50 ++++++++++++++++++++++++++++++++------------------ 4 files changed, 74 insertions(+), 27 deletions(-) -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1762052AbbA3Meh (ORCPT ); Fri, 30 Jan 2015 07:34:37 -0500 Received: from mail-pa0-f43.google.com ([209.85.220.43]:45552 "EHLO mail-pa0-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1761899AbbA3Mee (ORCPT ); Fri, 30 Jan 2015 07:34:34 -0500 From: Joonsoo Kim X-Google-Original-From: Joonsoo Kim To: Andrew Morton Cc: Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim , Subject: [PATCH v2 1/4] mm/compaction: fix wrong order check in compact_finished() Date: Fri, 30 Jan 2015 21:34:09 +0900 Message-Id: <1422621252-29859-2-git-send-email-iamjoonsoo.kim@lge.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org What we want to check here is whether there is highorder freepage in buddy list of other migratetype in order to steal it without fragmentation. But, current code just checks cc->order which means allocation request order. So, this is wrong. Without this fix, non-movable synchronous compaction below pageblock order would not stopped until compaction is complete, because migratetype of most pageblocks are movable and high order freepage made by compaction is usually on movable type buddy list. There is some report related to this bug. See below link. http://www.spinics.net/lists/linux-mm/msg81666.html Although the issued system still has load spike comes from compaction, this makes that system completely stable and responsive according to his report. stress-highalloc test in mmtests with non movable order 7 allocation doesn't show any notable difference in allocation success rate, but, it shows more compaction success rate. Compaction success rate (Compaction success * 100 / Compaction stalls, %) 18.47 : 28.94 Cc: Acked-by: Vlastimil Babka Signed-off-by: Joonsoo Kim --- mm/compaction.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/compaction.c b/mm/compaction.c index b68736c..4954e19 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -1173,7 +1173,7 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc, return COMPACT_PARTIAL; /* Job done if allocation would set block type */ - if (cc->order >= pageblock_order && area->nr_free) + if (order >= pageblock_order && area->nr_free) return COMPACT_PARTIAL; } -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1762164AbbA3Mek (ORCPT ); Fri, 30 Jan 2015 07:34:40 -0500 Received: from mail-pa0-f48.google.com ([209.85.220.48]:47834 "EHLO mail-pa0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1762135AbbA3Meh (ORCPT ); Fri, 30 Jan 2015 07:34:37 -0500 From: Joonsoo Kim X-Google-Original-From: Joonsoo Kim To: Andrew Morton Cc: Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim Subject: [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage Date: Fri, 30 Jan 2015 21:34:10 +0900 Message-Id: <1422621252-29859-3-git-send-email-iamjoonsoo.kim@lge.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Joonsoo Currently, freepage isolation in one pageblock doesn't consider how many freepages we isolate. When I traced flow of compaction, compaction sometimes isolates more than 256 freepages to migrate just 32 pages. In this patch, freepage isolation is stopped at the point that we have more isolated freepage than isolated page for migration. This results in slowing down free page scanner and make compaction success rate higher. stress-highalloc test in mmtests with non movable order 7 allocation shows increase of compaction success rate. Compaction success rate (Compaction success * 100 / Compaction stalls, %) 27.13 : 31.82 pfn where both scanners meets on compaction complete (separate test due to enormous tracepoint buffer) (zone_start=4096, zone_end=1048576) 586034 : 654378 In fact, I didn't fully understand why this patch results in such good result. There was a guess that not used freepages are released to pcp list and on next compaction trial we won't isolate them again so compaction success rate would decrease. To prevent this effect, I tested with adding pcp drain code on release_freepages(), but, it has no good effect. Anyway, this patch reduces waste time to isolate unneeded freepages so seems reasonable. Signed-off-by: Joonsoo Kim --- mm/compaction.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index 4954e19..782772d 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -490,6 +490,13 @@ static unsigned long isolate_freepages_block(struct compact_control *cc, /* If a page was split, advance to the end of it */ if (isolated) { + cc->nr_freepages += isolated; + if (!strict && + cc->nr_migratepages <= cc->nr_freepages) { + blockpfn += isolated; + break; + } + blockpfn += isolated - 1; cursor += isolated - 1; continue; @@ -899,7 +906,6 @@ static void isolate_freepages(struct compact_control *cc) unsigned long isolate_start_pfn; /* exact pfn we start at */ unsigned long block_end_pfn; /* end of current pageblock */ unsigned long low_pfn; /* lowest pfn scanner is able to scan */ - int nr_freepages = cc->nr_freepages; struct list_head *freelist = &cc->freepages; /* @@ -924,11 +930,11 @@ static void isolate_freepages(struct compact_control *cc) * pages on cc->migratepages. We stop searching if the migrate * and free page scanners meet or enough free pages are isolated. */ - for (; block_start_pfn >= low_pfn && cc->nr_migratepages > nr_freepages; + for (; block_start_pfn >= low_pfn && + cc->nr_migratepages > cc->nr_freepages; block_end_pfn = block_start_pfn, block_start_pfn -= pageblock_nr_pages, isolate_start_pfn = block_start_pfn) { - unsigned long isolated; /* * This can iterate a massively long zone without finding any @@ -953,9 +959,8 @@ static void isolate_freepages(struct compact_control *cc) continue; /* Found a block suitable for isolating free pages from. */ - isolated = isolate_freepages_block(cc, &isolate_start_pfn, + isolate_freepages_block(cc, &isolate_start_pfn, block_end_pfn, freelist, false); - nr_freepages += isolated; /* * Remember where the free scanner should restart next time, @@ -987,8 +992,6 @@ static void isolate_freepages(struct compact_control *cc) */ if (block_start_pfn < low_pfn) cc->free_pfn = cc->migrate_pfn; - - cc->nr_freepages = nr_freepages; } /* -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1762186AbbA3Meo (ORCPT ); Fri, 30 Jan 2015 07:34:44 -0500 Received: from mail-pa0-f50.google.com ([209.85.220.50]:33681 "EHLO mail-pa0-f50.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1762135AbbA3Mel (ORCPT ); Fri, 30 Jan 2015 07:34:41 -0500 From: Joonsoo Kim X-Google-Original-From: Joonsoo Kim To: Andrew Morton Cc: Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim Subject: [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part Date: Fri, 30 Jan 2015 21:34:11 +0900 Message-Id: <1422621252-29859-4-git-send-email-iamjoonsoo.kim@lge.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Joonsoo This is preparation step to use page allocator's anti fragmentation logic in compaction. This patch just separates steal decision part from actual steal behaviour part so there is no functional change. Signed-off-by: Joonsoo Kim --- mm/page_alloc.c | 49 ++++++++++++++++++++++++++++++++----------------- 1 file changed, 32 insertions(+), 17 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 8d52ab1..ef74750 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1122,6 +1122,24 @@ static void change_pageblock_range(struct page *pageblock_page, } } +static bool can_steal_freepages(unsigned int order, + int start_mt, int fallback_mt) +{ + if (is_migrate_cma(fallback_mt)) + return false; + + if (order >= pageblock_order) + return true; + + if (order >= pageblock_order / 2 || + start_mt == MIGRATE_RECLAIMABLE || + start_mt == MIGRATE_UNMOVABLE || + page_group_by_mobility_disabled) + return true; + + return false; +} + /* * When we are falling back to another migratetype during allocation, try to * steal extra free pages from the same pageblocks to satisfy further @@ -1138,9 +1156,10 @@ static void change_pageblock_range(struct page *pageblock_page, * as well. */ static void try_to_steal_freepages(struct zone *zone, struct page *page, - int start_type, int fallback_type) + int start_type) { int current_order = page_order(page); + int pages; /* Take ownership for orders >= pageblock_order */ if (current_order >= pageblock_order) { @@ -1148,19 +1167,12 @@ static void try_to_steal_freepages(struct zone *zone, struct page *page, return; } - if (current_order >= pageblock_order / 2 || - start_type == MIGRATE_RECLAIMABLE || - start_type == MIGRATE_UNMOVABLE || - page_group_by_mobility_disabled) { - int pages; + pages = move_freepages_block(zone, page, start_type); - pages = move_freepages_block(zone, page, start_type); - - /* Claim the whole block if over half of it is free */ - if (pages >= (1 << (pageblock_order-1)) || - page_group_by_mobility_disabled) - set_pageblock_migratetype(page, start_type); - } + /* Claim the whole block if over half of it is free */ + if (pages >= (1 << (pageblock_order-1)) || + page_group_by_mobility_disabled) + set_pageblock_migratetype(page, start_type); } /* Remove an element from the buddy allocator from the fallback list */ @@ -1170,6 +1182,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) struct free_area *area; unsigned int current_order; struct page *page; + bool can_steal; /* Find the largest possible block of pages in the other list */ for (current_order = MAX_ORDER-1; @@ -1192,10 +1205,11 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) struct page, lru); area->nr_free--; - if (!is_migrate_cma(migratetype)) { + can_steal = can_steal_freepages(current_order, + start_migratetype, migratetype); + if (can_steal) { try_to_steal_freepages(zone, page, - start_migratetype, - migratetype); + start_migratetype); } else { /* * When borrowing from MIGRATE_CMA, we need to @@ -1203,7 +1217,8 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) * itself, and we do not try to steal extra * free pages. */ - buddy_type = migratetype; + if (is_migrate_cma(migratetype)) + buddy_type = migratetype; } /* Remove the page from the freelists */ -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1762211AbbA3Mes (ORCPT ); Fri, 30 Jan 2015 07:34:48 -0500 Received: from mail-pa0-f44.google.com ([209.85.220.44]:58692 "EHLO mail-pa0-f44.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1762182AbbA3Mep (ORCPT ); Fri, 30 Jan 2015 07:34:45 -0500 From: Joonsoo Kim X-Google-Original-From: Joonsoo Kim To: Andrew Morton Cc: Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim Subject: [PATCH v2 4/4] mm/compaction: enhance compaction finish condition Date: Fri, 30 Jan 2015 21:34:12 +0900 Message-Id: <1422621252-29859-5-git-send-email-iamjoonsoo.kim@lge.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Joonsoo Compaction has anti fragmentation algorithm. It is that freepage should be more than pageblock order to finish the compaction if we don't find any freepage in requested migratetype buddy list. This is for mitigating fragmentation, but, there is a lack of migratetype consideration and it is too excessive compared to page allocator's anti fragmentation algorithm. Not considering migratetype would cause premature finish of compaction. For example, if allocation request is for unmovable migratetype, freepage with CMA migratetype doesn't help that allocation and compaction should not be stopped. But, current logic regards this situation as compaction is no longer needed, so finish the compaction. Secondly, condition is too excessive compared to page allocator's logic. We can steal freepage from other migratetype and change pageblock migratetype on more relaxed conditions in page allocator. This is designed to prevent fragmentation and we can use it here. Imposing hard constraint only to the compaction doesn't help much in this case since page allocator would cause fragmentation again. To solve these problems, this patch borrows anti fragmentation logic from page allocator. It will reduce premature compaction finish in some cases and reduce excessive compaction work. stress-highalloc test in mmtests with non movable order 7 allocation shows considerable increase of compaction success rate. Compaction success rate (Compaction success * 100 / Compaction stalls, %) 31.82 : 42.20 Signed-off-by: Joonsoo Kim --- include/linux/mmzone.h | 3 +++ mm/compaction.c | 30 ++++++++++++++++++++++++++++-- mm/internal.h | 1 + mm/page_alloc.c | 5 ++--- 4 files changed, 34 insertions(+), 5 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index f279d9c..a2906bc 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -63,6 +63,9 @@ enum { MIGRATE_TYPES }; +#define FALLBACK_MIGRATETYPES (4) +extern int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES]; + #ifdef CONFIG_CMA # define is_migrate_cma(migratetype) unlikely((migratetype) == MIGRATE_CMA) #else diff --git a/mm/compaction.c b/mm/compaction.c index 782772d..0460e4b 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -1125,6 +1125,29 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone, return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE; } +static bool can_steal_fallbacks(struct free_area *area, + unsigned int order, int migratetype) +{ + int i; + int fallback_mt; + + if (area->nr_free == 0) + return false; + + for (i = 0; i < FALLBACK_MIGRATETYPES; i++) { + fallback_mt = fallbacks[migratetype][i]; + if (fallback_mt == MIGRATE_RESERVE) + break; + + if (list_empty(&area->free_list[fallback_mt])) + continue; + + if (can_steal_freepages(order, migratetype, fallback_mt)) + return true; + } + return false; +} + static int __compact_finished(struct zone *zone, struct compact_control *cc, const int migratetype) { @@ -1175,8 +1198,11 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc, if (!list_empty(&area->free_list[migratetype])) return COMPACT_PARTIAL; - /* Job done if allocation would set block type */ - if (order >= pageblock_order && area->nr_free) + /* + * Job done if allocation would steal freepages from + * other migratetype buddy lists. + */ + if (can_steal_fallbacks(area, order, migratetype)) return COMPACT_PARTIAL; } diff --git a/mm/internal.h b/mm/internal.h index c4d6c9b..0a89a14 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -201,6 +201,7 @@ unsigned long isolate_migratepages_range(struct compact_control *cc, unsigned long low_pfn, unsigned long end_pfn); +bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt); #endif /* diff --git a/mm/page_alloc.c b/mm/page_alloc.c index ef74750..4c3538b 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1026,7 +1026,7 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order, * This array describes the order lists are fallen back to when * the free lists for the desirable migrate type are depleted */ -static int fallbacks[MIGRATE_TYPES][4] = { +int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES] = { [MIGRATE_UNMOVABLE] = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE }, [MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE }, #ifdef CONFIG_CMA @@ -1122,8 +1122,7 @@ static void change_pageblock_range(struct page *pageblock_page, } } -static bool can_steal_freepages(unsigned int order, - int start_mt, int fallback_mt) +bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt) { if (is_migrate_cma(fallback_mt)) return false; -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1762026AbbA3N1V (ORCPT ); Fri, 30 Jan 2015 08:27:21 -0500 Received: from cantor2.suse.de ([195.135.220.15]:36041 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757649AbbA3N1T (ORCPT ); Fri, 30 Jan 2015 08:27:19 -0500 Message-ID: <54CB86B3.1060500@suse.cz> Date: Fri, 30 Jan 2015 14:27:15 +0100 From: Vlastimil Babka User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Joonsoo Kim , Andrew Morton CC: Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim , stable@vger.kernel.org Subject: Re: [PATCH v2 1/4] mm/compaction: fix wrong order check in compact_finished() References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-2-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-2-git-send-email-iamjoonsoo.kim@lge.com> Content-Type: text/plain; charset=iso-8859-2 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 01/30/2015 01:34 PM, Joonsoo Kim wrote: > What we want to check here is whether there is highorder freepage > in buddy list of other migratetype in order to steal it without > fragmentation. But, current code just checks cc->order which means > allocation request order. So, this is wrong. The bug has been introduced by 1fb3f8ca0e92 ("mm: compaction: capture a suitable high-order page immediately when it is made available") and survived the later partial revert 8fb74b9fb2b1. > Without this fix, non-movable synchronous compaction below pageblock order > would not stopped until compaction is complete, because migratetype of most > pageblocks are movable and high order freepage made by compaction is usually > on movable type buddy list. > > There is some report related to this bug. See below link. > > http://www.spinics.net/lists/linux-mm/msg81666.html > > Although the issued system still has load spike comes from compaction, > this makes that system completely stable and responsive according to > his report. > > stress-highalloc test in mmtests with non movable order 7 allocation doesn't > show any notable difference in allocation success rate, but, it shows more > compaction success rate. > > Compaction success rate (Compaction success * 100 / Compaction stalls, %) > 18.47 : 28.94 > > Cc: # v3.7+ Fixes: 1fb3f8ca0e9222535a39b884cb67a34628411b9f > Acked-by: Vlastimil Babka > Signed-off-by: Joonsoo Kim Thanks. > --- > mm/compaction.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/compaction.c b/mm/compaction.c > index b68736c..4954e19 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -1173,7 +1173,7 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc, > return COMPACT_PARTIAL; > > /* Job done if allocation would set block type */ > - if (cc->order >= pageblock_order && area->nr_free) > + if (order >= pageblock_order && area->nr_free) > return COMPACT_PARTIAL; > } > > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1761614AbbA3Nxf (ORCPT ); Fri, 30 Jan 2015 08:53:35 -0500 Received: from cantor2.suse.de ([195.135.220.15]:37294 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1762254AbbA3Nr3 (ORCPT ); Fri, 30 Jan 2015 08:47:29 -0500 Message-ID: <54CB8B6B.2080503@suse.cz> Date: Fri, 30 Jan 2015 14:47:23 +0100 From: Vlastimil Babka User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Joonsoo Kim , Andrew Morton CC: Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim Subject: Re: [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-3-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-3-git-send-email-iamjoonsoo.kim@lge.com> Content-Type: text/plain; charset=iso-8859-2 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 01/30/2015 01:34 PM, Joonsoo Kim wrote: > From: Joonsoo > > Currently, freepage isolation in one pageblock doesn't consider how many > freepages we isolate. When I traced flow of compaction, compaction > sometimes isolates more than 256 freepages to migrate just 32 pages. > > In this patch, freepage isolation is stopped at the point that we > have more isolated freepage than isolated page for migration. This > results in slowing down free page scanner and make compaction success > rate higher. > > stress-highalloc test in mmtests with non movable order 7 allocation shows > increase of compaction success rate. > > Compaction success rate (Compaction success * 100 / Compaction stalls, %) > 27.13 : 31.82 > > pfn where both scanners meets on compaction complete > (separate test due to enormous tracepoint buffer) > (zone_start=4096, zone_end=1048576) > 586034 : 654378 Now I that I know that scanners meeting further in zone is better for success rate, the better success rate makes sense. Still not sure why they meet further :) > In fact, I didn't fully understand why this patch results in such good > result. There was a guess that not used freepages are released to pcp list > and on next compaction trial we won't isolate them again so compaction > success rate would decrease. To prevent this effect, I tested with adding > pcp drain code on release_freepages(), but, it has no good effect. > > Anyway, this patch reduces waste time to isolate unneeded freepages so > seems reasonable. I briefly tried it on top of the pivot-changing series and with order-9 allocations it reduced free page scanned counter by almost 10%. No effect on success rates (maybe because pivot changing already took care of the scanners meeting problem) but the scanning reduction is good on its own. It also explains why e14c720efdd7 ("mm, compaction: remember position within pageblock in free pages scanner") had less than expected improvements. It would only actually stop within pageblock in case of async compaction detecting contention. I guess that's also why the infinite loop problem fixed by 1d5bfe1ffb5b affected so relatively few people. > Signed-off-by: Joonsoo Kim Acked-by: Vlastimil Babka Thanks! > --- > mm/compaction.c | 17 ++++++++++------- > 1 file changed, 10 insertions(+), 7 deletions(-) > > diff --git a/mm/compaction.c b/mm/compaction.c > index 4954e19..782772d 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -490,6 +490,13 @@ static unsigned long isolate_freepages_block(struct compact_control *cc, > > /* If a page was split, advance to the end of it */ > if (isolated) { > + cc->nr_freepages += isolated; > + if (!strict && > + cc->nr_migratepages <= cc->nr_freepages) { > + blockpfn += isolated; > + break; > + } > + > blockpfn += isolated - 1; > cursor += isolated - 1; > continue; > @@ -899,7 +906,6 @@ static void isolate_freepages(struct compact_control *cc) > unsigned long isolate_start_pfn; /* exact pfn we start at */ > unsigned long block_end_pfn; /* end of current pageblock */ > unsigned long low_pfn; /* lowest pfn scanner is able to scan */ > - int nr_freepages = cc->nr_freepages; > struct list_head *freelist = &cc->freepages; > > /* > @@ -924,11 +930,11 @@ static void isolate_freepages(struct compact_control *cc) > * pages on cc->migratepages. We stop searching if the migrate > * and free page scanners meet or enough free pages are isolated. > */ > - for (; block_start_pfn >= low_pfn && cc->nr_migratepages > nr_freepages; > + for (; block_start_pfn >= low_pfn && > + cc->nr_migratepages > cc->nr_freepages; > block_end_pfn = block_start_pfn, > block_start_pfn -= pageblock_nr_pages, > isolate_start_pfn = block_start_pfn) { > - unsigned long isolated; > > /* > * This can iterate a massively long zone without finding any > @@ -953,9 +959,8 @@ static void isolate_freepages(struct compact_control *cc) > continue; > > /* Found a block suitable for isolating free pages from. */ > - isolated = isolate_freepages_block(cc, &isolate_start_pfn, > + isolate_freepages_block(cc, &isolate_start_pfn, > block_end_pfn, freelist, false); > - nr_freepages += isolated; > > /* > * Remember where the free scanner should restart next time, > @@ -987,8 +992,6 @@ static void isolate_freepages(struct compact_control *cc) > */ > if (block_start_pfn < low_pfn) > cc->free_pfn = cc->migrate_pfn; > - > - cc->nr_freepages = nr_freepages; > } > > /* > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755232AbbA3O15 (ORCPT ); Fri, 30 Jan 2015 09:27:57 -0500 Received: from cantor2.suse.de ([195.135.220.15]:41260 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750852AbbA3O14 (ORCPT ); Fri, 30 Jan 2015 09:27:56 -0500 Message-ID: <54CB94E6.7010805@suse.cz> Date: Fri, 30 Jan 2015 15:27:50 +0100 From: Vlastimil Babka User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Joonsoo Kim , Andrew Morton CC: Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim Subject: Re: [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-4-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-4-git-send-email-iamjoonsoo.kim@lge.com> Content-Type: text/plain; charset=iso-8859-2 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 01/30/2015 01:34 PM, Joonsoo Kim wrote: > From: Joonsoo > > This is preparation step to use page allocator's anti fragmentation logic > in compaction. This patch just separates steal decision part from actual > steal behaviour part so there is no functional change. > > Signed-off-by: Joonsoo Kim > --- > mm/page_alloc.c | 49 ++++++++++++++++++++++++++++++++----------------- > 1 file changed, 32 insertions(+), 17 deletions(-) > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index 8d52ab1..ef74750 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -1122,6 +1122,24 @@ static void change_pageblock_range(struct page *pageblock_page, > } > } > > +static bool can_steal_freepages(unsigned int order, > + int start_mt, int fallback_mt) > +{ > + if (is_migrate_cma(fallback_mt)) > + return false; > + > + if (order >= pageblock_order) > + return true; > + > + if (order >= pageblock_order / 2 || > + start_mt == MIGRATE_RECLAIMABLE || > + start_mt == MIGRATE_UNMOVABLE || > + page_group_by_mobility_disabled) > + return true; > + > + return false; > +} > + > /* > * When we are falling back to another migratetype during allocation, try to > * steal extra free pages from the same pageblocks to satisfy further > @@ -1138,9 +1156,10 @@ static void change_pageblock_range(struct page *pageblock_page, > * as well. > */ > static void try_to_steal_freepages(struct zone *zone, struct page *page, > - int start_type, int fallback_type) > + int start_type) It's actually not 'try_to_' anymore, is it? But could be, see below. > { > int current_order = page_order(page); > + int pages; > > /* Take ownership for orders >= pageblock_order */ > if (current_order >= pageblock_order) { > @@ -1148,19 +1167,12 @@ static void try_to_steal_freepages(struct zone *zone, struct page *page, > return; > } > > - if (current_order >= pageblock_order / 2 || > - start_type == MIGRATE_RECLAIMABLE || > - start_type == MIGRATE_UNMOVABLE || > - page_group_by_mobility_disabled) { > - int pages; > + pages = move_freepages_block(zone, page, start_type); > > - pages = move_freepages_block(zone, page, start_type); > - > - /* Claim the whole block if over half of it is free */ > - if (pages >= (1 << (pageblock_order-1)) || > - page_group_by_mobility_disabled) > - set_pageblock_migratetype(page, start_type); > - } > + /* Claim the whole block if over half of it is free */ > + if (pages >= (1 << (pageblock_order-1)) || > + page_group_by_mobility_disabled) > + set_pageblock_migratetype(page, start_type); > } > > /* Remove an element from the buddy allocator from the fallback list */ > @@ -1170,6 +1182,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) > struct free_area *area; > unsigned int current_order; > struct page *page; > + bool can_steal; > > /* Find the largest possible block of pages in the other list */ > for (current_order = MAX_ORDER-1; > @@ -1192,10 +1205,11 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) > struct page, lru); > area->nr_free--; > > - if (!is_migrate_cma(migratetype)) { > + can_steal = can_steal_freepages(current_order, > + start_migratetype, migratetype); > + if (can_steal) { can_steal is only used once, why not do if (can_steal_freepages()) directly? Or, call can_steal_freepages() from try_to_steal_freepages() and make try_to_steal_freepages() return its result. Then here it simplifies to: if (!try_to_steal_freepages(...) && is_migrate_cma(...)) buddy_type = migratetype; > try_to_steal_freepages(zone, page, > - start_migratetype, > - migratetype); > + start_migratetype); > } else { > /* > * When borrowing from MIGRATE_CMA, we need to > @@ -1203,7 +1217,8 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) > * itself, and we do not try to steal extra > * free pages. > */ > - buddy_type = migratetype; > + if (is_migrate_cma(migratetype)) > + buddy_type = migratetype; > } > > /* Remove the page from the freelists */ > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1762420AbbA3Onb (ORCPT ); Fri, 30 Jan 2015 09:43:31 -0500 Received: from cantor2.suse.de ([195.135.220.15]:42610 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752222AbbA3Ona (ORCPT ); Fri, 30 Jan 2015 09:43:30 -0500 Message-ID: <54CB988F.4080109@suse.cz> Date: Fri, 30 Jan 2015 15:43:27 +0100 From: Vlastimil Babka User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Joonsoo Kim , Andrew Morton CC: Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim Subject: Re: [PATCH v2 4/4] mm/compaction: enhance compaction finish condition References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-5-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-5-git-send-email-iamjoonsoo.kim@lge.com> Content-Type: text/plain; charset=iso-8859-2 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 01/30/2015 01:34 PM, Joonsoo Kim wrote: > From: Joonsoo > > Compaction has anti fragmentation algorithm. It is that freepage > should be more than pageblock order to finish the compaction if we don't > find any freepage in requested migratetype buddy list. This is for > mitigating fragmentation, but, there is a lack of migratetype > consideration and it is too excessive compared to page allocator's anti > fragmentation algorithm. > > Not considering migratetype would cause premature finish of compaction. > For example, if allocation request is for unmovable migratetype, > freepage with CMA migratetype doesn't help that allocation and > compaction should not be stopped. But, current logic regards this > situation as compaction is no longer needed, so finish the compaction. > > Secondly, condition is too excessive compared to page allocator's logic. > We can steal freepage from other migratetype and change pageblock > migratetype on more relaxed conditions in page allocator. This is designed > to prevent fragmentation and we can use it here. Imposing hard constraint > only to the compaction doesn't help much in this case since page allocator > would cause fragmentation again. > > To solve these problems, this patch borrows anti fragmentation logic from > page allocator. It will reduce premature compaction finish in some cases > and reduce excessive compaction work. > > stress-highalloc test in mmtests with non movable order 7 allocation shows > considerable increase of compaction success rate. > > Compaction success rate (Compaction success * 100 / Compaction stalls, %) > 31.82 : 42.20 > > Signed-off-by: Joonsoo Kim I have some worries about longer-term fragmentation, some testing of stress-highalloc several times without restarts could be helpful. > --- > include/linux/mmzone.h | 3 +++ > mm/compaction.c | 30 ++++++++++++++++++++++++++++-- > mm/internal.h | 1 + > mm/page_alloc.c | 5 ++--- > 4 files changed, 34 insertions(+), 5 deletions(-) > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h > index f279d9c..a2906bc 100644 > --- a/include/linux/mmzone.h > +++ b/include/linux/mmzone.h > @@ -63,6 +63,9 @@ enum { > MIGRATE_TYPES > }; > > +#define FALLBACK_MIGRATETYPES (4) > +extern int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES]; > + > #ifdef CONFIG_CMA > # define is_migrate_cma(migratetype) unlikely((migratetype) == MIGRATE_CMA) > #else > diff --git a/mm/compaction.c b/mm/compaction.c > index 782772d..0460e4b 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -1125,6 +1125,29 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone, > return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE; > } > > +static bool can_steal_fallbacks(struct free_area *area, > + unsigned int order, int migratetype) Could you move this to page_alloc.c and then you don't have to export the fallbacks arrays? > +{ > + int i; > + int fallback_mt; > + > + if (area->nr_free == 0) > + return false; > + > + for (i = 0; i < FALLBACK_MIGRATETYPES; i++) { > + fallback_mt = fallbacks[migratetype][i]; > + if (fallback_mt == MIGRATE_RESERVE) > + break; > + > + if (list_empty(&area->free_list[fallback_mt])) > + continue; > + > + if (can_steal_freepages(order, migratetype, fallback_mt)) > + return true; > + } > + return false; > +} > + > static int __compact_finished(struct zone *zone, struct compact_control *cc, > const int migratetype) > { > @@ -1175,8 +1198,11 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc, > if (!list_empty(&area->free_list[migratetype])) > return COMPACT_PARTIAL; > > - /* Job done if allocation would set block type */ > - if (order >= pageblock_order && area->nr_free) > + /* > + * Job done if allocation would steal freepages from > + * other migratetype buddy lists. > + */ > + if (can_steal_fallbacks(area, order, migratetype)) > return COMPACT_PARTIAL; Seems somewhat wasteful in scenario where we want to satisfy a movable allocation and it's an async compaction. Then we don't compact in unmovable/reclaimable pageblock, and yet we will keep checking them for fallbacks. A price to pay for having generic code? Maybe can_steal_fallbacks could know the sync/async mode and use migrate_async_suitable appropriately. But then I wouldn't move it to page_alloc.c. More efficient could be a separate fallbacks array for async compaction with the unsuitable types removed. But maybe I'm just overengineering now. > } > > diff --git a/mm/internal.h b/mm/internal.h > index c4d6c9b..0a89a14 100644 > --- a/mm/internal.h > +++ b/mm/internal.h > @@ -201,6 +201,7 @@ unsigned long > isolate_migratepages_range(struct compact_control *cc, > unsigned long low_pfn, unsigned long end_pfn); > > +bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt); > #endif > > /* > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index ef74750..4c3538b 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -1026,7 +1026,7 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order, > * This array describes the order lists are fallen back to when > * the free lists for the desirable migrate type are depleted > */ > -static int fallbacks[MIGRATE_TYPES][4] = { > +int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES] = { > [MIGRATE_UNMOVABLE] = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE }, > [MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE }, > #ifdef CONFIG_CMA > @@ -1122,8 +1122,7 @@ static void change_pageblock_range(struct page *pageblock_page, > } > } > > -static bool can_steal_freepages(unsigned int order, > - int start_mt, int fallback_mt) > +bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt) > { > if (is_migrate_cma(fallback_mt)) > return false; > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751323AbbAaHjL (ORCPT ); Sat, 31 Jan 2015 02:39:11 -0500 Received: from blu004-omc2s4.hotmail.com ([65.55.111.79]:49472 "EHLO BLU004-OMC2S4.hotmail.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751055AbbAaHjH (ORCPT ); Sat, 31 Jan 2015 02:39:07 -0500 X-TMN: [+RMb2vS8JwS58PTfkyRfLzGTMViW992C] X-Originating-Email: [zhangyanfei.ok@hotmail.com] Message-ID: Date: Sat, 31 Jan 2015 15:38:07 +0800 From: Zhang Yanfei User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Joonsoo Kim , Andrew Morton CC: Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim , stable@vger.kernel.org Subject: Re: [PATCH v2 1/4] mm/compaction: fix wrong order check in compact_finished() References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-2-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-2-git-send-email-iamjoonsoo.kim@lge.com> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 31 Jan 2015 07:39:04.0851 (UTC) FILETIME=[FD20B630:01D03D28] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hello, At 2015/1/30 20:34, Joonsoo Kim wrote: > What we want to check here is whether there is highorder freepage > in buddy list of other migratetype in order to steal it without > fragmentation. But, current code just checks cc->order which means > allocation request order. So, this is wrong. > > Without this fix, non-movable synchronous compaction below pageblock order > would not stopped until compaction is complete, because migratetype of most > pageblocks are movable and high order freepage made by compaction is usually > on movable type buddy list. > > There is some report related to this bug. See below link. > > http://www.spinics.net/lists/linux-mm/msg81666.html > > Although the issued system still has load spike comes from compaction, > this makes that system completely stable and responsive according to > his report. > > stress-highalloc test in mmtests with non movable order 7 allocation doesn't > show any notable difference in allocation success rate, but, it shows more > compaction success rate. > > Compaction success rate (Compaction success * 100 / Compaction stalls, %) > 18.47 : 28.94 > > Cc: > Acked-by: Vlastimil Babka > Signed-off-by: Joonsoo Kim Reviewed-by: Zhang Yanfei > --- > mm/compaction.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/compaction.c b/mm/compaction.c > index b68736c..4954e19 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -1173,7 +1173,7 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc, > return COMPACT_PARTIAL; > > /* Job done if allocation would set block type */ > - if (cc->order >= pageblock_order && area->nr_free) > + if (order >= pageblock_order && area->nr_free) > return COMPACT_PARTIAL; > } > > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751391AbbAaHuB (ORCPT ); Sat, 31 Jan 2015 02:50:01 -0500 Received: from blu004-omc2s20.hotmail.com ([65.55.111.95]:53250 "EHLO BLU004-OMC2S20.hotmail.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751055AbbAaHt7 (ORCPT ); Sat, 31 Jan 2015 02:49:59 -0500 X-TMN: [f7YxflQZat6M0iu6LKN/vdmLGaW2/2B5] X-Originating-Email: [zhangyanfei.ok@hotmail.com] Message-ID: Date: Sat, 31 Jan 2015 15:49:38 +0800 From: Zhang Yanfei User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Joonsoo Kim , Andrew Morton CC: Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim Subject: Re: [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-3-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-3-git-send-email-iamjoonsoo.kim@lge.com> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 31 Jan 2015 07:49:58.0163 (UTC) FILETIME=[82883630:01D03D2A] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hello, At 2015/1/30 20:34, Joonsoo Kim wrote: > From: Joonsoo > > Currently, freepage isolation in one pageblock doesn't consider how many > freepages we isolate. When I traced flow of compaction, compaction > sometimes isolates more than 256 freepages to migrate just 32 pages. > > In this patch, freepage isolation is stopped at the point that we > have more isolated freepage than isolated page for migration. This > results in slowing down free page scanner and make compaction success > rate higher. > > stress-highalloc test in mmtests with non movable order 7 allocation shows > increase of compaction success rate. > > Compaction success rate (Compaction success * 100 / Compaction stalls, %) > 27.13 : 31.82 > > pfn where both scanners meets on compaction complete > (separate test due to enormous tracepoint buffer) > (zone_start=4096, zone_end=1048576) > 586034 : 654378 > > In fact, I didn't fully understand why this patch results in such good > result. There was a guess that not used freepages are released to pcp list > and on next compaction trial we won't isolate them again so compaction > success rate would decrease. To prevent this effect, I tested with adding > pcp drain code on release_freepages(), but, it has no good effect. > > Anyway, this patch reduces waste time to isolate unneeded freepages so > seems reasonable. Reviewed-by: Zhang Yanfei IMHO, the patch making the free scanner move slower makes both scanners meet further. Before this patch, if we isolate too many free pages and even after we release the unneeded free pages later the free scanner still already be there and will be moved forward again next time -- the free scanner just cannot be moved back to grab the free pages we released before no matter where the free pages in, pcp or buddy. > > Signed-off-by: Joonsoo Kim > --- > mm/compaction.c | 17 ++++++++++------- > 1 file changed, 10 insertions(+), 7 deletions(-) > > diff --git a/mm/compaction.c b/mm/compaction.c > index 4954e19..782772d 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -490,6 +490,13 @@ static unsigned long isolate_freepages_block(struct compact_control *cc, > > /* If a page was split, advance to the end of it */ > if (isolated) { > + cc->nr_freepages += isolated; > + if (!strict && > + cc->nr_migratepages <= cc->nr_freepages) { > + blockpfn += isolated; > + break; > + } > + > blockpfn += isolated - 1; > cursor += isolated - 1; > continue; > @@ -899,7 +906,6 @@ static void isolate_freepages(struct compact_control *cc) > unsigned long isolate_start_pfn; /* exact pfn we start at */ > unsigned long block_end_pfn; /* end of current pageblock */ > unsigned long low_pfn; /* lowest pfn scanner is able to scan */ > - int nr_freepages = cc->nr_freepages; > struct list_head *freelist = &cc->freepages; > > /* > @@ -924,11 +930,11 @@ static void isolate_freepages(struct compact_control *cc) > * pages on cc->migratepages. We stop searching if the migrate > * and free page scanners meet or enough free pages are isolated. > */ > - for (; block_start_pfn >= low_pfn && cc->nr_migratepages > nr_freepages; > + for (; block_start_pfn >= low_pfn && > + cc->nr_migratepages > cc->nr_freepages; > block_end_pfn = block_start_pfn, > block_start_pfn -= pageblock_nr_pages, > isolate_start_pfn = block_start_pfn) { > - unsigned long isolated; > > /* > * This can iterate a massively long zone without finding any > @@ -953,9 +959,8 @@ static void isolate_freepages(struct compact_control *cc) > continue; > > /* Found a block suitable for isolating free pages from. */ > - isolated = isolate_freepages_block(cc, &isolate_start_pfn, > + isolate_freepages_block(cc, &isolate_start_pfn, > block_end_pfn, freelist, false); > - nr_freepages += isolated; > > /* > * Remember where the free scanner should restart next time, > @@ -987,8 +992,6 @@ static void isolate_freepages(struct compact_control *cc) > */ > if (block_start_pfn < low_pfn) > cc->free_pfn = cc->migrate_pfn; > - > - cc->nr_freepages = nr_freepages; > } > > /* > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752549AbbAaIcF (ORCPT ); Sat, 31 Jan 2015 03:32:05 -0500 Received: from cantor2.suse.de ([195.135.220.15]:51363 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751213AbbAaIcB (ORCPT ); Sat, 31 Jan 2015 03:32:01 -0500 Message-ID: <54CC92FD.5000601@suse.cz> Date: Sat, 31 Jan 2015 09:31:57 +0100 From: Vlastimil Babka User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Zhang Yanfei , Joonsoo Kim , Andrew Morton CC: Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim Subject: Re: [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-3-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 01/31/2015 08:49 AM, Zhang Yanfei wrote: > Hello, > > At 2015/1/30 20:34, Joonsoo Kim wrote: > > Reviewed-by: Zhang Yanfei > > IMHO, the patch making the free scanner move slower makes both scanners > meet further. Before this patch, if we isolate too many free pages and even > after we release the unneeded free pages later the free scanner still already > be there and will be moved forward again next time -- the free scanner just > cannot be moved back to grab the free pages we released before no matter where > the free pages in, pcp or buddy. It can be actually moved back. If we are releasing free pages, it means the current compaction is terminating, and it will set zone->compact_cached_free_pfn back to the position of the released free page that was furthest back. The next compaction will start from the cached free pfn. It is however possible that another compaction runs in parallel and has progressed further and overwrites the cached free pfn. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752951AbbAaKSM (ORCPT ); Sat, 31 Jan 2015 05:18:12 -0500 Received: from blu004-omc2s33.hotmail.com ([65.55.111.108]:57838 "EHLO BLU004-OMC2S33.hotmail.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751715AbbAaKSK (ORCPT ); Sat, 31 Jan 2015 05:18:10 -0500 X-TMN: [+xKfS1/HI6tdUqDN4NZoZHwQavkJLz2W] X-Originating-Email: [zhangyanfei.ok@hotmail.com] Message-ID: Date: Sat, 31 Jan 2015 18:17:55 +0800 From: Zhang Yanfei User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Vlastimil Babka , Joonsoo Kim , Andrew Morton CC: Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim Subject: Re: [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-3-git-send-email-iamjoonsoo.kim@lge.com> <54CC92FD.5000601@suse.cz> In-Reply-To: <54CC92FD.5000601@suse.cz> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 31 Jan 2015 10:18:09.0081 (UTC) FILETIME=[35EEA290:01D03D3F] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org At 2015/1/31 16:31, Vlastimil Babka wrote: > On 01/31/2015 08:49 AM, Zhang Yanfei wrote: >> Hello, >> >> At 2015/1/30 20:34, Joonsoo Kim wrote: >> >> Reviewed-by: Zhang Yanfei >> >> IMHO, the patch making the free scanner move slower makes both scanners >> meet further. Before this patch, if we isolate too many free pages and even >> after we release the unneeded free pages later the free scanner still already >> be there and will be moved forward again next time -- the free scanner just >> cannot be moved back to grab the free pages we released before no matter where >> the free pages in, pcp or buddy. > > It can be actually moved back. If we are releasing free pages, it means the > current compaction is terminating, and it will set zone->compact_cached_free_pfn > back to the position of the released free page that was furthest back. The next > compaction will start from the cached free pfn. Yeah, you are right. I missed the release_freepages(). Thanks! > > It is however possible that another compaction runs in parallel and has > progressed further and overwrites the cached free pfn. > Hmm, maybe. Thanks. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752708AbbAaMip (ORCPT ); Sat, 31 Jan 2015 07:38:45 -0500 Received: from blu004-omc2s24.hotmail.com ([65.55.111.99]:52140 "EHLO BLU004-OMC2S24.hotmail.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751002AbbAaMil (ORCPT ); Sat, 31 Jan 2015 07:38:41 -0500 X-TMN: [6VjtxLS5wJ8gqfNTenMkPmNmPbioW+Fl] X-Originating-Email: [zhangyanfei.ok@hotmail.com] Message-ID: Date: Sat, 31 Jan 2015 20:38:10 +0800 From: Zhang Yanfei User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Joonsoo Kim , Andrew Morton CC: Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim Subject: Re: [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-4-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-4-git-send-email-iamjoonsoo.kim@lge.com> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 31 Jan 2015 12:38:39.0953 (UTC) FILETIME=[D71F9C10:01D03D52] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org At 2015/1/30 20:34, Joonsoo Kim wrote: > From: Joonsoo > > This is preparation step to use page allocator's anti fragmentation logic > in compaction. This patch just separates steal decision part from actual > steal behaviour part so there is no functional change. > > Signed-off-by: Joonsoo Kim > --- > mm/page_alloc.c | 49 ++++++++++++++++++++++++++++++++----------------- > 1 file changed, 32 insertions(+), 17 deletions(-) > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index 8d52ab1..ef74750 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -1122,6 +1122,24 @@ static void change_pageblock_range(struct page *pageblock_page, > } > } > > +static bool can_steal_freepages(unsigned int order, > + int start_mt, int fallback_mt) > +{ > + if (is_migrate_cma(fallback_mt)) > + return false; > + > + if (order >= pageblock_order) > + return true; > + > + if (order >= pageblock_order / 2 || > + start_mt == MIGRATE_RECLAIMABLE || > + start_mt == MIGRATE_UNMOVABLE || > + page_group_by_mobility_disabled) > + return true; > + > + return false; > +} So some comments which can tell the cases can or cannot steal freepages from other migratetype is necessary IMHO. Actually we can just move some comments in try_to_steal_pages to here. Thanks. > + > /* > * When we are falling back to another migratetype during allocation, try to > * steal extra free pages from the same pageblocks to satisfy further > @@ -1138,9 +1156,10 @@ static void change_pageblock_range(struct page *pageblock_page, > * as well. > */ > static void try_to_steal_freepages(struct zone *zone, struct page *page, > - int start_type, int fallback_type) > + int start_type) > { > int current_order = page_order(page); > + int pages; > > /* Take ownership for orders >= pageblock_order */ > if (current_order >= pageblock_order) { > @@ -1148,19 +1167,12 @@ static void try_to_steal_freepages(struct zone *zone, struct page *page, > return; > } > > - if (current_order >= pageblock_order / 2 || > - start_type == MIGRATE_RECLAIMABLE || > - start_type == MIGRATE_UNMOVABLE || > - page_group_by_mobility_disabled) { > - int pages; > + pages = move_freepages_block(zone, page, start_type); > > - pages = move_freepages_block(zone, page, start_type); > - > - /* Claim the whole block if over half of it is free */ > - if (pages >= (1 << (pageblock_order-1)) || > - page_group_by_mobility_disabled) > - set_pageblock_migratetype(page, start_type); > - } > + /* Claim the whole block if over half of it is free */ > + if (pages >= (1 << (pageblock_order-1)) || > + page_group_by_mobility_disabled) > + set_pageblock_migratetype(page, start_type); > } > > /* Remove an element from the buddy allocator from the fallback list */ > @@ -1170,6 +1182,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) > struct free_area *area; > unsigned int current_order; > struct page *page; > + bool can_steal; > > /* Find the largest possible block of pages in the other list */ > for (current_order = MAX_ORDER-1; > @@ -1192,10 +1205,11 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) > struct page, lru); > area->nr_free--; > > - if (!is_migrate_cma(migratetype)) { > + can_steal = can_steal_freepages(current_order, > + start_migratetype, migratetype); > + if (can_steal) { > try_to_steal_freepages(zone, page, > - start_migratetype, > - migratetype); > + start_migratetype); > } else { > /* > * When borrowing from MIGRATE_CMA, we need to > @@ -1203,7 +1217,8 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) > * itself, and we do not try to steal extra > * free pages. > */ > - buddy_type = migratetype; > + if (is_migrate_cma(migratetype)) > + buddy_type = migratetype; > } > > /* Remove the page from the freelists */ > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753225AbbAaP6U (ORCPT ); Sat, 31 Jan 2015 10:58:20 -0500 Received: from blu004-omc2s18.hotmail.com ([65.55.111.93]:57512 "EHLO BLU004-OMC2S18.hotmail.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751244AbbAaP6T (ORCPT ); Sat, 31 Jan 2015 10:58:19 -0500 X-TMN: [nhyl1gI7iQoMEOYCxGkWRHTcwc6Kiex2] X-Originating-Email: [zhangyanfei.ok@hotmail.com] Message-ID: Date: Sat, 31 Jan 2015 23:58:03 +0800 From: Zhang Yanfei User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Joonsoo Kim , Andrew Morton CC: Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim Subject: Re: [PATCH v2 4/4] mm/compaction: enhance compaction finish condition References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-5-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-5-git-send-email-iamjoonsoo.kim@lge.com> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 31 Jan 2015 15:58:15.0919 (UTC) FILETIME=[B95AFFF0:01D03D6E] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org At 2015/1/30 20:34, Joonsoo Kim wrote: > From: Joonsoo > > Compaction has anti fragmentation algorithm. It is that freepage > should be more than pageblock order to finish the compaction if we don't > find any freepage in requested migratetype buddy list. This is for > mitigating fragmentation, but, there is a lack of migratetype > consideration and it is too excessive compared to page allocator's anti > fragmentation algorithm. > > Not considering migratetype would cause premature finish of compaction. > For example, if allocation request is for unmovable migratetype, > freepage with CMA migratetype doesn't help that allocation and > compaction should not be stopped. But, current logic regards this > situation as compaction is no longer needed, so finish the compaction. > > Secondly, condition is too excessive compared to page allocator's logic. > We can steal freepage from other migratetype and change pageblock > migratetype on more relaxed conditions in page allocator. This is designed > to prevent fragmentation and we can use it here. Imposing hard constraint > only to the compaction doesn't help much in this case since page allocator > would cause fragmentation again. Changing both two behaviours in compaction may change the high order allocation behaviours in the buddy allocator slowpath, so just as Vlastimil suggested, some data from allocator should be necessary and helpful, IMHO. Thanks. > > To solve these problems, this patch borrows anti fragmentation logic from > page allocator. It will reduce premature compaction finish in some cases > and reduce excessive compaction work. > > stress-highalloc test in mmtests with non movable order 7 allocation shows > considerable increase of compaction success rate. > > Compaction success rate (Compaction success * 100 / Compaction stalls, %) > 31.82 : 42.20 > > Signed-off-by: Joonsoo Kim > --- > include/linux/mmzone.h | 3 +++ > mm/compaction.c | 30 ++++++++++++++++++++++++++++-- > mm/internal.h | 1 + > mm/page_alloc.c | 5 ++--- > 4 files changed, 34 insertions(+), 5 deletions(-) > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h > index f279d9c..a2906bc 100644 > --- a/include/linux/mmzone.h > +++ b/include/linux/mmzone.h > @@ -63,6 +63,9 @@ enum { > MIGRATE_TYPES > }; > > +#define FALLBACK_MIGRATETYPES (4) > +extern int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES]; > + > #ifdef CONFIG_CMA > # define is_migrate_cma(migratetype) unlikely((migratetype) == MIGRATE_CMA) > #else > diff --git a/mm/compaction.c b/mm/compaction.c > index 782772d..0460e4b 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -1125,6 +1125,29 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone, > return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE; > } > > +static bool can_steal_fallbacks(struct free_area *area, > + unsigned int order, int migratetype) > +{ > + int i; > + int fallback_mt; > + > + if (area->nr_free == 0) > + return false; > + > + for (i = 0; i < FALLBACK_MIGRATETYPES; i++) { > + fallback_mt = fallbacks[migratetype][i]; > + if (fallback_mt == MIGRATE_RESERVE) > + break; > + > + if (list_empty(&area->free_list[fallback_mt])) > + continue; > + > + if (can_steal_freepages(order, migratetype, fallback_mt)) > + return true; > + } > + return false; > +} > + > static int __compact_finished(struct zone *zone, struct compact_control *cc, > const int migratetype) > { > @@ -1175,8 +1198,11 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc, > if (!list_empty(&area->free_list[migratetype])) > return COMPACT_PARTIAL; > > - /* Job done if allocation would set block type */ > - if (order >= pageblock_order && area->nr_free) > + /* > + * Job done if allocation would steal freepages from > + * other migratetype buddy lists. > + */ > + if (can_steal_fallbacks(area, order, migratetype)) > return COMPACT_PARTIAL; > } > > diff --git a/mm/internal.h b/mm/internal.h > index c4d6c9b..0a89a14 100644 > --- a/mm/internal.h > +++ b/mm/internal.h > @@ -201,6 +201,7 @@ unsigned long > isolate_migratepages_range(struct compact_control *cc, > unsigned long low_pfn, unsigned long end_pfn); > > +bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt); > #endif > > /* > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index ef74750..4c3538b 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -1026,7 +1026,7 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order, > * This array describes the order lists are fallen back to when > * the free lists for the desirable migrate type are depleted > */ > -static int fallbacks[MIGRATE_TYPES][4] = { > +int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES] = { > [MIGRATE_UNMOVABLE] = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE }, > [MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE }, > #ifdef CONFIG_CMA > @@ -1122,8 +1122,7 @@ static void change_pageblock_range(struct page *pageblock_page, > } > } > > -static bool can_steal_freepages(unsigned int order, > - int start_mt, int fallback_mt) > +bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt) > { > if (is_migrate_cma(fallback_mt)) > return false; > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932584AbbBBHBT (ORCPT ); Mon, 2 Feb 2015 02:01:19 -0500 Received: from lgeamrelo01.lge.com ([156.147.1.125]:40560 "EHLO lgeamrelo01.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752846AbbBBHBN (ORCPT ); Mon, 2 Feb 2015 02:01:13 -0500 X-Original-SENDERIP: 10.177.222.153 X-Original-MAILFROM: iamjoonsoo.kim@lge.com Date: Mon, 2 Feb 2015 16:02:49 +0900 From: Joonsoo Kim To: Vlastimil Babka Cc: Andrew Morton , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part Message-ID: <20150202070248.GA6488@js1304-P5Q-DELUXE> References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-4-git-send-email-iamjoonsoo.kim@lge.com> <54CB94E6.7010805@suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <54CB94E6.7010805@suse.cz> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Jan 30, 2015 at 03:27:50PM +0100, Vlastimil Babka wrote: > On 01/30/2015 01:34 PM, Joonsoo Kim wrote: > > From: Joonsoo > > > > This is preparation step to use page allocator's anti fragmentation logic > > in compaction. This patch just separates steal decision part from actual > > steal behaviour part so there is no functional change. > > > > Signed-off-by: Joonsoo Kim > > --- > > mm/page_alloc.c | 49 ++++++++++++++++++++++++++++++++----------------- > > 1 file changed, 32 insertions(+), 17 deletions(-) > > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > > index 8d52ab1..ef74750 100644 > > --- a/mm/page_alloc.c > > +++ b/mm/page_alloc.c > > @@ -1122,6 +1122,24 @@ static void change_pageblock_range(struct page *pageblock_page, > > } > > } > > > > +static bool can_steal_freepages(unsigned int order, > > + int start_mt, int fallback_mt) > > +{ > > + if (is_migrate_cma(fallback_mt)) > > + return false; > > + > > + if (order >= pageblock_order) > > + return true; > > + > > + if (order >= pageblock_order / 2 || > > + start_mt == MIGRATE_RECLAIMABLE || > > + start_mt == MIGRATE_UNMOVABLE || > > + page_group_by_mobility_disabled) > > + return true; > > + > > + return false; > > +} > > + > > /* > > * When we are falling back to another migratetype during allocation, try to > > * steal extra free pages from the same pageblocks to satisfy further > > @@ -1138,9 +1156,10 @@ static void change_pageblock_range(struct page *pageblock_page, > > * as well. > > */ > > static void try_to_steal_freepages(struct zone *zone, struct page *page, > > - int start_type, int fallback_type) > > + int start_type) > > It's actually not 'try_to_' anymore, is it? But could be, see below. > > > { > > int current_order = page_order(page); > > + int pages; > > > > /* Take ownership for orders >= pageblock_order */ > > if (current_order >= pageblock_order) { > > @@ -1148,19 +1167,12 @@ static void try_to_steal_freepages(struct zone *zone, struct page *page, > > return; > > } > > > > - if (current_order >= pageblock_order / 2 || > > - start_type == MIGRATE_RECLAIMABLE || > > - start_type == MIGRATE_UNMOVABLE || > > - page_group_by_mobility_disabled) { > > - int pages; > > + pages = move_freepages_block(zone, page, start_type); > > > > - pages = move_freepages_block(zone, page, start_type); > > - > > - /* Claim the whole block if over half of it is free */ > > - if (pages >= (1 << (pageblock_order-1)) || > > - page_group_by_mobility_disabled) > > - set_pageblock_migratetype(page, start_type); > > - } > > + /* Claim the whole block if over half of it is free */ > > + if (pages >= (1 << (pageblock_order-1)) || > > + page_group_by_mobility_disabled) > > + set_pageblock_migratetype(page, start_type); > > } > > > > /* Remove an element from the buddy allocator from the fallback list */ > > @@ -1170,6 +1182,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) > > struct free_area *area; > > unsigned int current_order; > > struct page *page; > > + bool can_steal; > > > > /* Find the largest possible block of pages in the other list */ > > for (current_order = MAX_ORDER-1; > > @@ -1192,10 +1205,11 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) > > struct page, lru); > > area->nr_free--; > > > > - if (!is_migrate_cma(migratetype)) { > > + can_steal = can_steal_freepages(current_order, > > + start_migratetype, migratetype); > > + if (can_steal) { > > can_steal is only used once, why not do if (can_steal_freepages()) directly? > > Or, call can_steal_freepages() from try_to_steal_freepages() and make > try_to_steal_freepages() return its result. Then here it simplifies to: > > if (!try_to_steal_freepages(...) && is_migrate_cma(...)) > buddy_type = migratetype; You're right. Your commented code loosk better. Your comment on 3/4 and 4/4 makes me reconsider this code factorization and I found better solution. I will send it soon. Thanks. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932730AbbBBHBq (ORCPT ); Mon, 2 Feb 2015 02:01:46 -0500 Received: from lgeamrelo02.lge.com ([156.147.1.126]:53883 "EHLO lgeamrelo02.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932199AbbBBHBn (ORCPT ); Mon, 2 Feb 2015 02:01:43 -0500 X-Original-SENDERIP: 10.177.222.153 X-Original-MAILFROM: iamjoonsoo.kim@lge.com Date: Mon, 2 Feb 2015 16:03:21 +0900 From: Joonsoo Kim To: Zhang Yanfei Cc: Andrew Morton , Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part Message-ID: <20150202070321.GB6488@js1304-P5Q-DELUXE> References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-4-git-send-email-iamjoonsoo.kim@lge.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, Jan 31, 2015 at 08:38:10PM +0800, Zhang Yanfei wrote: > At 2015/1/30 20:34, Joonsoo Kim wrote: > > From: Joonsoo > > > > This is preparation step to use page allocator's anti fragmentation logic > > in compaction. This patch just separates steal decision part from actual > > steal behaviour part so there is no functional change. > > > > Signed-off-by: Joonsoo Kim > > --- > > mm/page_alloc.c | 49 ++++++++++++++++++++++++++++++++----------------- > > 1 file changed, 32 insertions(+), 17 deletions(-) > > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > > index 8d52ab1..ef74750 100644 > > --- a/mm/page_alloc.c > > +++ b/mm/page_alloc.c > > @@ -1122,6 +1122,24 @@ static void change_pageblock_range(struct page *pageblock_page, > > } > > } > > > > +static bool can_steal_freepages(unsigned int order, > > + int start_mt, int fallback_mt) > > +{ > > + if (is_migrate_cma(fallback_mt)) > > + return false; > > + > > + if (order >= pageblock_order) > > + return true; > > + > > + if (order >= pageblock_order / 2 || > > + start_mt == MIGRATE_RECLAIMABLE || > > + start_mt == MIGRATE_UNMOVABLE || > > + page_group_by_mobility_disabled) > > + return true; > > + > > + return false; > > +} > > So some comments which can tell the cases can or cannot steal freepages > from other migratetype is necessary IMHO. Actually we can just > move some comments in try_to_steal_pages to here. Yes, move some comments looks sufficient to me. I will fix it. Thanks. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932787AbbBBHJe (ORCPT ); Mon, 2 Feb 2015 02:09:34 -0500 Received: from lgeamrelo02.lge.com ([156.147.1.126]:59274 "EHLO lgeamrelo02.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932612AbbBBHJb (ORCPT ); Mon, 2 Feb 2015 02:09:31 -0500 X-Original-SENDERIP: 10.177.222.153 X-Original-MAILFROM: iamjoonsoo.kim@lge.com Date: Mon, 2 Feb 2015 16:11:09 +0900 From: Joonsoo Kim To: Vlastimil Babka Cc: Andrew Morton , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v2 4/4] mm/compaction: enhance compaction finish condition Message-ID: <20150202071109.GC6488@js1304-P5Q-DELUXE> References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-5-git-send-email-iamjoonsoo.kim@lge.com> <54CB988F.4080109@suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <54CB988F.4080109@suse.cz> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Jan 30, 2015 at 03:43:27PM +0100, Vlastimil Babka wrote: > On 01/30/2015 01:34 PM, Joonsoo Kim wrote: > > From: Joonsoo > > > > Compaction has anti fragmentation algorithm. It is that freepage > > should be more than pageblock order to finish the compaction if we don't > > find any freepage in requested migratetype buddy list. This is for > > mitigating fragmentation, but, there is a lack of migratetype > > consideration and it is too excessive compared to page allocator's anti > > fragmentation algorithm. > > > > Not considering migratetype would cause premature finish of compaction. > > For example, if allocation request is for unmovable migratetype, > > freepage with CMA migratetype doesn't help that allocation and > > compaction should not be stopped. But, current logic regards this > > situation as compaction is no longer needed, so finish the compaction. > > > > Secondly, condition is too excessive compared to page allocator's logic. > > We can steal freepage from other migratetype and change pageblock > > migratetype on more relaxed conditions in page allocator. This is designed > > to prevent fragmentation and we can use it here. Imposing hard constraint > > only to the compaction doesn't help much in this case since page allocator > > would cause fragmentation again. > > > > To solve these problems, this patch borrows anti fragmentation logic from > > page allocator. It will reduce premature compaction finish in some cases > > and reduce excessive compaction work. > > > > stress-highalloc test in mmtests with non movable order 7 allocation shows > > considerable increase of compaction success rate. > > > > Compaction success rate (Compaction success * 100 / Compaction stalls, %) > > 31.82 : 42.20 > > > > Signed-off-by: Joonsoo Kim > > I have some worries about longer-term fragmentation, some testing of > stress-highalloc several times without restarts could be helpful. Okay. I will do it. > > > --- > > include/linux/mmzone.h | 3 +++ > > mm/compaction.c | 30 ++++++++++++++++++++++++++++-- > > mm/internal.h | 1 + > > mm/page_alloc.c | 5 ++--- > > 4 files changed, 34 insertions(+), 5 deletions(-) > > > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h > > index f279d9c..a2906bc 100644 > > --- a/include/linux/mmzone.h > > +++ b/include/linux/mmzone.h > > @@ -63,6 +63,9 @@ enum { > > MIGRATE_TYPES > > }; > > > > +#define FALLBACK_MIGRATETYPES (4) > > +extern int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES]; > > + > > #ifdef CONFIG_CMA > > # define is_migrate_cma(migratetype) unlikely((migratetype) == MIGRATE_CMA) > > #else > > diff --git a/mm/compaction.c b/mm/compaction.c > > index 782772d..0460e4b 100644 > > --- a/mm/compaction.c > > +++ b/mm/compaction.c > > @@ -1125,6 +1125,29 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone, > > return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE; > > } > > > > +static bool can_steal_fallbacks(struct free_area *area, > > + unsigned int order, int migratetype) > > Could you move this to page_alloc.c and then you don't have to export the > fallbacks arrays? Okay. > > > +{ > > + int i; > > + int fallback_mt; > > + > > + if (area->nr_free == 0) > > + return false; > > + > > + for (i = 0; i < FALLBACK_MIGRATETYPES; i++) { > > + fallback_mt = fallbacks[migratetype][i]; > > + if (fallback_mt == MIGRATE_RESERVE) > > + break; > > + > > + if (list_empty(&area->free_list[fallback_mt])) > > + continue; > > + > > + if (can_steal_freepages(order, migratetype, fallback_mt)) > > + return true; > > + } > > + return false; > > +} > > + > > static int __compact_finished(struct zone *zone, struct compact_control *cc, > > const int migratetype) > > { > > @@ -1175,8 +1198,11 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc, > > if (!list_empty(&area->free_list[migratetype])) > > return COMPACT_PARTIAL; > > > > - /* Job done if allocation would set block type */ > > - if (order >= pageblock_order && area->nr_free) > > + /* > > + * Job done if allocation would steal freepages from > > + * other migratetype buddy lists. > > + */ > > + if (can_steal_fallbacks(area, order, migratetype)) > > return COMPACT_PARTIAL; > > Seems somewhat wasteful in scenario where we want to satisfy a movable > allocation and it's an async compaction. Then we don't compact in > unmovable/reclaimable pageblock, and yet we will keep checking them for > fallbacks. A price to pay for having generic code? I think that there would be lucky case that high order freepage on unmovable/reclaimable pageblock is made by concurrent freeing. In this case, finishing compaction would be good thing. And, this logic would cause marginal overhead so generic code seems justificable to me. Thanks. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932797AbbBBHKq (ORCPT ); Mon, 2 Feb 2015 02:10:46 -0500 Received: from lgeamrelo04.lge.com ([156.147.1.127]:53470 "EHLO lgeamrelo04.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932458AbbBBHKp (ORCPT ); Mon, 2 Feb 2015 02:10:45 -0500 X-Original-SENDERIP: 10.177.222.153 X-Original-MAILFROM: iamjoonsoo.kim@lge.com Date: Mon, 2 Feb 2015 16:12:23 +0900 From: Joonsoo Kim To: Zhang Yanfei Cc: Andrew Morton , Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v2 4/4] mm/compaction: enhance compaction finish condition Message-ID: <20150202071223.GD6488@js1304-P5Q-DELUXE> References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> <1422621252-29859-5-git-send-email-iamjoonsoo.kim@lge.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, Jan 31, 2015 at 11:58:03PM +0800, Zhang Yanfei wrote: > At 2015/1/30 20:34, Joonsoo Kim wrote: > > From: Joonsoo > > > > Compaction has anti fragmentation algorithm. It is that freepage > > should be more than pageblock order to finish the compaction if we don't > > find any freepage in requested migratetype buddy list. This is for > > mitigating fragmentation, but, there is a lack of migratetype > > consideration and it is too excessive compared to page allocator's anti > > fragmentation algorithm. > > > > Not considering migratetype would cause premature finish of compaction. > > For example, if allocation request is for unmovable migratetype, > > freepage with CMA migratetype doesn't help that allocation and > > compaction should not be stopped. But, current logic regards this > > situation as compaction is no longer needed, so finish the compaction. > > > > Secondly, condition is too excessive compared to page allocator's logic. > > We can steal freepage from other migratetype and change pageblock > > migratetype on more relaxed conditions in page allocator. This is designed > > to prevent fragmentation and we can use it here. Imposing hard constraint > > only to the compaction doesn't help much in this case since page allocator > > would cause fragmentation again. > > Changing both two behaviours in compaction may change the high order allocation > behaviours in the buddy allocator slowpath, so just as Vlastimil suggested, > some data from allocator should be necessary and helpful, IMHO. As Vlastimil said, fragmentation effect should be checked. I will do it and report the result on next version. Thanks. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: From: Joonsoo Kim To: Andrew Morton Cc: Vlastimil Babka , Mel Gorman , David Rientjes , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim , Subject: [PATCH v2 1/4] mm/compaction: fix wrong order check in compact_finished() Date: Fri, 30 Jan 2015 21:34:09 +0900 Message-Id: <1422621252-29859-2-git-send-email-iamjoonsoo.kim@lge.com> In-Reply-To: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> References: <1422621252-29859-1-git-send-email-iamjoonsoo.kim@lge.com> Sender: owner-linux-mm@kvack.org List-ID: What we want to check here is whether there is highorder freepage in buddy list of other migratetype in order to steal it without fragmentation. But, current code just checks cc->order which means allocation request order. So, this is wrong. Without this fix, non-movable synchronous compaction below pageblock order would not stopped until compaction is complete, because migratetype of most pageblocks are movable and high order freepage made by compaction is usually on movable type buddy list. There is some report related to this bug. See below link. http://www.spinics.net/lists/linux-mm/msg81666.html Although the issued system still has load spike comes from compaction, this makes that system completely stable and responsive according to his report. stress-highalloc test in mmtests with non movable order 7 allocation doesn't show any notable difference in allocation success rate, but, it shows more compaction success rate. Compaction success rate (Compaction success * 100 / Compaction stalls, %) 18.47 : 28.94 Cc: Acked-by: Vlastimil Babka Signed-off-by: Joonsoo Kim --- mm/compaction.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/compaction.c b/mm/compaction.c index b68736c..4954e19 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -1173,7 +1173,7 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc, return COMPACT_PARTIAL; /* Job done if allocation would set block type */ - if (cc->order >= pageblock_order && area->nr_free) + if (order >= pageblock_order && area->nr_free) return COMPACT_PARTIAL; } -- 1.9.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org