* [PATCH v2 0/7] Fixes and cleanups to compaction
@ 2023-08-26 15:36 Kemeng Shi
  2023-08-26 15:36 ` [PATCH v2 1/7] mm/compaction: use correct list in move_freelist_{head}/{tail} Kemeng Shi
                   ` (6 more replies)
  0 siblings, 7 replies; 19+ messages in thread
From: Kemeng Shi @ 2023-08-26 15:36 UTC (permalink / raw)
  To: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david, willy
  Cc: shikemeng

Hi all, this is another series of fixes and cleanups to compaction.
Patches 1-2 fix and clean up the freepage list operations.
Patches 3-4 fix and clean up the isolation of freepages.
Patch 7 factors out the code that checks whether compaction is needed
for the allocation order.
More details can be found in the respective patches. Thanks!

v1->v2:
-Collect Reviewed-by tags from Baolin.
-Keep pfn inside the pageblock in patch 3.
-Only improve the comment of is_via_compact_memory in patch 6.
-Squash patch 8 and patch 9 into patch 7 and use ALLOC_WMARK_MIN
instead of the magic number 0.

Kemeng Shi (7):
  mm/compaction: use correct list in move_freelist_{head}/{tail}
  mm/compaction: call list_is_{first}/{last} more intuitively in
    move_freelist_{head}/{tail}
  mm/compaction: correctly return failure with bogus compound_order in
    strict mode
  mm/compaction: simplify pfn iteration in isolate_freepages_range
  mm/compaction: remove repeat compact_blockskip_flush check in
    reset_isolation_suitable
  mm/compaction: improve comment of is_via_compact_memory
  mm/compaction: factor out code to test if we should run compaction for
    target order

 mm/compaction.c | 106 +++++++++++++++++++++++++-----------------------
 1 file changed, 56 insertions(+), 50 deletions(-)

-- 
2.30.0




* [PATCH v2 1/7] mm/compaction: use correct list in move_freelist_{head}/{tail}
  2023-08-26 15:36 [PATCH v2 0/7] Fixes and cleanups to compaction Kemeng Shi
@ 2023-08-26 15:36 ` Kemeng Shi
  2023-08-29  9:03   ` Mel Gorman
  2023-08-26 15:36 ` [PATCH v2 2/7] mm/compaction: call list_is_{first}/{last} more intuitively " Kemeng Shi
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Kemeng Shi @ 2023-08-26 15:36 UTC (permalink / raw)
  To: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david, willy
  Cc: shikemeng

The freepage is chained via buddy_list on the freelist head. Use
buddy_list instead of lru to correct the list operations.
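
A minimal sketch of why the old code happened to work anyway: in
struct page (see include/linux/mm_types.h), lru, buddy_list and
pcp_list live in the same union and so alias the same storage. The
struct below is an illustration only, not the kernel's real definition:

	struct list_head { struct list_head *next, *prev; };

	struct page_like {
		union {
			struct list_head lru;		/* page on an LRU list */
			struct list_head buddy_list;	/* free page on a buddy freelist */
			struct list_head pcp_list;	/* free page on a per-cpu list */
		};
	};

buddy_list is nevertheless the semantically correct member for a page
sitting on a freelist.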

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/compaction.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 38c8d216c6a3..e3ee1bc1c0ad 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1395,8 +1395,8 @@ move_freelist_head(struct list_head *freelist, struct page *freepage)
 {
 	LIST_HEAD(sublist);
 
-	if (!list_is_last(freelist, &freepage->lru)) {
-		list_cut_before(&sublist, freelist, &freepage->lru);
+	if (!list_is_last(freelist, &freepage->buddy_list)) {
+		list_cut_before(&sublist, freelist, &freepage->buddy_list);
 		list_splice_tail(&sublist, freelist);
 	}
 }
@@ -1412,8 +1412,8 @@ move_freelist_tail(struct list_head *freelist, struct page *freepage)
 {
 	LIST_HEAD(sublist);
 
-	if (!list_is_first(freelist, &freepage->lru)) {
-		list_cut_position(&sublist, freelist, &freepage->lru);
+	if (!list_is_first(freelist, &freepage->buddy_list)) {
+		list_cut_position(&sublist, freelist, &freepage->buddy_list);
 		list_splice_tail(&sublist, freelist);
 	}
 }
-- 
2.30.0




* [PATCH v2 2/7] mm/compaction: call list_is_{first}/{last} more intuitively in move_freelist_{head}/{tail}
  2023-08-26 15:36 [PATCH v2 0/7] Fixes and cleanups to compaction Kemeng Shi
  2023-08-26 15:36 ` [PATCH v2 1/7] mm/compaction: use correct list in move_freelist_{head}/{tail} Kemeng Shi
@ 2023-08-26 15:36 ` Kemeng Shi
  2023-08-29  9:18   ` Mel Gorman
  2023-08-26 15:36 ` [PATCH v2 7/7] mm/compaction: factor out code to test if we should run compaction for target order Kemeng Shi
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Kemeng Shi @ 2023-08-26 15:36 UTC (permalink / raw)
  To: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david, willy
  Cc: shikemeng

We use move_freelist_head after list_for_each_entry_reverse to skip
recently scanned pages. There is no need to do an actual move if all
freepages were searched in list_for_each_entry_reverse, i.e. the
freepage points to the first page in the freelist. It is more intuitive
to call list_is_first with the list entry as the first argument and the
list head as the second argument to check whether the entry is the
first one in the list, instead of calling list_is_last with the entry
and the head passed in reverse.

Similarly, calling list_is_last in move_freelist_tail is more
intuitive.
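
For reference, the list.h helpers being swapped take the entry first
and the head second; their definitions read roughly as follows:

	static inline int list_is_first(const struct list_head *list,
					const struct list_head *head)
	{
		return list->prev == head;
	}

	static inline int list_is_last(const struct list_head *list,
					const struct list_head *head)
	{
		return list->next == head;
	}

With the entry in the first slot, the new calls read in the same order
as the question being asked.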

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/compaction.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index e3ee1bc1c0ad..a40550a33aee 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1395,7 +1395,7 @@ move_freelist_head(struct list_head *freelist, struct page *freepage)
 {
 	LIST_HEAD(sublist);
 
-	if (!list_is_last(freelist, &freepage->buddy_list)) {
+	if (!list_is_first(&freepage->buddy_list, freelist)) {
 		list_cut_before(&sublist, freelist, &freepage->buddy_list);
 		list_splice_tail(&sublist, freelist);
 	}
@@ -1412,7 +1412,7 @@ move_freelist_tail(struct list_head *freelist, struct page *freepage)
 {
 	LIST_HEAD(sublist);
 
-	if (!list_is_first(freelist, &freepage->buddy_list)) {
+	if (!list_is_last(&freepage->buddy_list, freelist)) {
 		list_cut_position(&sublist, freelist, &freepage->buddy_list);
 		list_splice_tail(&sublist, freelist);
 	}
-- 
2.30.0




* [PATCH v2 7/7] mm/compaction: factor out code to test if we should run compaction for target order
  2023-08-26 15:36 [PATCH v2 0/7] Fixes and cleanups to compaction Kemeng Shi
  2023-08-26 15:36 ` [PATCH v2 1/7] mm/compaction: use correct list in move_freelist_{head}/{tail} Kemeng Shi
  2023-08-26 15:36 ` [PATCH v2 2/7] mm/compaction: call list_is_{first}/{last} more intuitively " Kemeng Shi
@ 2023-08-26 15:36 ` Kemeng Shi
  2023-08-29  3:48   ` Baolin Wang
  2023-08-29  3:54   ` Matthew Wilcox
       [not found] ` <20230826153617.4019189-4-shikemeng@huaweicloud.com>
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 19+ messages in thread
From: Kemeng Shi @ 2023-08-26 15:36 UTC (permalink / raw)
  To: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david, willy
  Cc: shikemeng

We always do the zone_watermark_ok check and the compaction_suitable
check together to test whether compaction for the target order should
be run. Factor this code out to remove the repetition.
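
(Background on the ALLOC_WMARK_MIN change noted in the cover letter:
the watermark index travels in the low bits of alloc_flags. The
defines below paraphrase mm/internal.h and may differ between kernel
versions:

	#define ALLOC_WMARK_MIN		WMARK_MIN	/* WMARK_MIN == 0 */
	#define ALLOC_WMARK_LOW		WMARK_LOW
	#define ALLOC_WMARK_HIGH	WMARK_HIGH
	#define ALLOC_NO_WATERMARKS	0x04	/* don't check watermarks at all */
	#define ALLOC_WMARK_MASK	(ALLOC_NO_WATERMARKS-1)

so passing ALLOC_WMARK_MIN here makes wmark_pages(zone, alloc_flags &
ALLOC_WMARK_MASK) select the same watermark as the min_wmark_pages()
calls it replaces.)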

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
 mm/compaction.c | 63 ++++++++++++++++++++++++++++---------------------
 1 file changed, 36 insertions(+), 27 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 00b7bba6c72e..6f2b87b026b8 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -2374,6 +2374,30 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
 	return false;
 }
 
+/*
+ * Should we do compaction for target allocation order.
+ * Return COMPACT_SUCCESS if allocation for target order can already
+ * be satisfied
+ * Return COMPACT_SKIPPED if compaction for target order is likely to fail
+ * Return COMPACT_CONTINUE if compaction for target order should be run
+ */
+static inline enum compact_result
+compaction_suit_allocation_order(struct zone *zone, unsigned int order,
+				 int highest_zoneidx, unsigned int alloc_flags)
+{
+	unsigned long watermark;
+
+	watermark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
+	if (zone_watermark_ok(zone, order, watermark, highest_zoneidx,
+			      alloc_flags))
+		return COMPACT_SUCCESS;
+
+	if (!compaction_suitable(zone, order, highest_zoneidx))
+		return COMPACT_SKIPPED;
+
+	return COMPACT_CONTINUE;
+}
+
 static enum compact_result
 compact_zone(struct compact_control *cc, struct capture_control *capc)
 {
@@ -2399,19 +2423,11 @@ compact_zone(struct compact_control *cc, struct capture_control *capc)
 	cc->migratetype = gfp_migratetype(cc->gfp_mask);
 
 	if (!is_via_compact_memory(cc->order)) {
-		unsigned long watermark;
-
-		/* Allocation can already succeed, nothing to do */
-		watermark = wmark_pages(cc->zone,
-					cc->alloc_flags & ALLOC_WMARK_MASK);
-		if (zone_watermark_ok(cc->zone, cc->order, watermark,
-				      cc->highest_zoneidx, cc->alloc_flags))
-			return COMPACT_SUCCESS;
-
-		/* Compaction is likely to fail */
-		if (!compaction_suitable(cc->zone, cc->order,
-					 cc->highest_zoneidx))
-			return COMPACT_SKIPPED;
+		ret = compaction_suit_allocation_order(cc->zone, cc->order,
+						       cc->highest_zoneidx,
+						       cc->alloc_flags);
+		if (ret != COMPACT_CONTINUE)
+			return ret;
 	}
 
 	/*
@@ -2917,14 +2933,10 @@ static bool kcompactd_node_suitable(pg_data_t *pgdat)
 		if (!populated_zone(zone))
 			continue;
 
-		/* Allocation can already succeed, check other zones */
-		if (zone_watermark_ok(zone, pgdat->kcompactd_max_order,
-				      min_wmark_pages(zone),
-				      highest_zoneidx, 0))
-			continue;
-
-		if (compaction_suitable(zone, pgdat->kcompactd_max_order,
-					highest_zoneidx))
+		if (compaction_suit_allocation_order(zone,
+				pgdat->kcompactd_max_order,
+				highest_zoneidx, ALLOC_WMARK_MIN) ==
+				COMPACT_CONTINUE)
 			return true;
 	}
 
@@ -2961,12 +2973,9 @@ static void kcompactd_do_work(pg_data_t *pgdat)
 		if (compaction_deferred(zone, cc.order))
 			continue;
 
-		/* Allocation can already succeed, nothing to do */
-		if (zone_watermark_ok(zone, cc.order,
-				      min_wmark_pages(zone), zoneid, 0))
-			continue;
-
-		if (!compaction_suitable(zone, cc.order, zoneid))
+		if (compaction_suit_allocation_order(zone,
+				cc.order, zoneid, ALLOC_WMARK_MIN) !=
+				COMPACT_CONTINUE)
 			continue;
 
 		if (kthread_should_stop())
-- 
2.30.0




* Re: [PATCH v2 3/7] mm/compaction: correctly return failure with bogus compound_order in strict mode
       [not found] ` <20230826153617.4019189-4-shikemeng@huaweicloud.com>
@ 2023-08-29  3:27   ` Baolin Wang
  2023-08-29  9:39   ` Mel Gorman
  1 sibling, 0 replies; 19+ messages in thread
From: Baolin Wang @ 2023-08-29  3:27 UTC (permalink / raw)
  To: Kemeng Shi, linux-mm, linux-kernel, akpm, mgorman, david, willy



On 8/26/2023 11:36 PM, Kemeng Shi wrote:
> In strict mode, we should return 0 if there is any hole in the
> pageblock. If we successfully isolate pages at the beginning of the
> pageblock and then hit a bogus compound_order outside the pageblock on
> the next page, we will abort the search loop with blockpfn > end_pfn.
> Although we limit blockpfn to end_pfn, we will treat it as a successful
> isolation in strict mode, as blockpfn is not < end_pfn, and return the
> partially isolated pages. Then isolate_freepages_range may succeed
> unexpectedly with a hole in the isolated range.
> 
> Fixes: 9fcd6d2e052ee ("mm, compaction: skip compound pages by order in free scanner")
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>

LGTM.
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

> ---
>   mm/compaction.c | 11 +++++++++--
>   1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index a40550a33aee..b4d03c9ffe7c 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -631,6 +631,14 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>   				page += (1UL << order) - 1;
>   				nr_scanned += (1UL << order) - 1;
>   			}
> +			/*
> +			 * There is a tiny chance that we have read bogus
> +			 * compound_order(), so be careful to not go outside
> +			 * of the pageblock.
> +			 */
> +			if (unlikely(blockpfn >= end_pfn))
> +				blockpfn = end_pfn - 1;
> +
>   			goto isolate_fail;
>   		}
>   
> @@ -678,8 +686,7 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>   		spin_unlock_irqrestore(&cc->zone->lock, flags);
>   
>   	/*
> -	 * There is a tiny chance that we have read bogus compound_order(),
> -	 * so be careful to not go outside of the pageblock.
> +	 * Be careful to not go outside of the pageblock.
>   	 */
>   	if (unlikely(blockpfn > end_pfn))
>   		blockpfn = end_pfn;



* Re: [PATCH v2 6/7] mm/compaction: improve comment of is_via_compact_memory
       [not found] ` <20230826153617.4019189-7-shikemeng@huaweicloud.com>
@ 2023-08-29  3:28   ` Baolin Wang
  2023-08-29 15:06   ` Mel Gorman
  1 sibling, 0 replies; 19+ messages in thread
From: Baolin Wang @ 2023-08-29  3:28 UTC (permalink / raw)
  To: Kemeng Shi, linux-mm, linux-kernel, akpm, mgorman, david, willy



On 8/26/2023 11:36 PM, Kemeng Shi wrote:
> We do proactive compaction with order == -1 via
> 1. /proc/sys/vm/compact_memory
> 2. /sys/devices/system/node/nodex/compact
> 3. /proc/sys/vm/compaction_proactiveness
> Add the missed situations in which order == -1.
> 
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>

Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

> ---
>   mm/compaction.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 89a1b627bc89..00b7bba6c72e 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -2061,8 +2061,10 @@ static isolate_migrate_t isolate_migratepages(struct compact_control *cc)
>   }
>   
>   /*
> - * order == -1 is expected when compacting via
> - * /proc/sys/vm/compact_memory
> + * order == -1 is expected when compacting proactively via
> + * 1. /proc/sys/vm/compact_memory
> + * 2. /sys/devices/system/node/nodex/compact
> + * 3. /proc/sys/vm/compaction_proactiveness
>    */
>   static inline bool is_via_compact_memory(int order)
>   {



* Re: [PATCH v2 7/7] mm/compaction: factor out code to test if we should run compaction for target order
  2023-08-26 15:36 ` [PATCH v2 7/7] mm/compaction: factor out code to test if we should run compaction for target order Kemeng Shi
@ 2023-08-29  3:48   ` Baolin Wang
  2023-08-30  6:28     ` Kemeng Shi
  2023-08-29  3:54   ` Matthew Wilcox
  1 sibling, 1 reply; 19+ messages in thread
From: Baolin Wang @ 2023-08-29  3:48 UTC (permalink / raw)
  To: Kemeng Shi, linux-mm, linux-kernel, akpm, mgorman, david, willy



On 8/26/2023 11:36 PM, Kemeng Shi wrote:
> We always do the zone_watermark_ok check and the compaction_suitable
> check together to test whether compaction for the target order should
> be run. Factor this code out to remove the repetition.
> 
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> ---
>   mm/compaction.c | 63 ++++++++++++++++++++++++++++---------------------
>   1 file changed, 36 insertions(+), 27 deletions(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 00b7bba6c72e..6f2b87b026b8 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -2374,6 +2374,30 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
>   	return false;
>   }
>   
> +/*
> + * Should we do compaction for target allocation order.
> + * Return COMPACT_SUCCESS if allocation for target order can already
> + * be satisfied
> + * Return COMPACT_SKIPPED if compaction for target order is likely to fail
> + * Return COMPACT_CONTINUE if compaction for target order should be run
> + */
> +static inline enum compact_result

I think you should drop the 'inline' to let the compiler make the decision.

> +compaction_suit_allocation_order(struct zone *zone, unsigned int order,
> +				 int highest_zoneidx, unsigned int alloc_flags)

The changes look good to me. So please feel free to add:
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

> +{
> +	unsigned long watermark;
> +
> +	watermark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
> +	if (zone_watermark_ok(zone, order, watermark, highest_zoneidx,
> +			      alloc_flags))
> +		return COMPACT_SUCCESS;
> +
> +	if (!compaction_suitable(zone, order, highest_zoneidx))
> +		return COMPACT_SKIPPED;
> +
> +	return COMPACT_CONTINUE;
> +}
> +
>   static enum compact_result
>   compact_zone(struct compact_control *cc, struct capture_control *capc)
>   {
> @@ -2399,19 +2423,11 @@ compact_zone(struct compact_control *cc, struct capture_control *capc)
>   	cc->migratetype = gfp_migratetype(cc->gfp_mask);
>   
>   	if (!is_via_compact_memory(cc->order)) {
> -		unsigned long watermark;
> -
> -		/* Allocation can already succeed, nothing to do */
> -		watermark = wmark_pages(cc->zone,
> -					cc->alloc_flags & ALLOC_WMARK_MASK);
> -		if (zone_watermark_ok(cc->zone, cc->order, watermark,
> -				      cc->highest_zoneidx, cc->alloc_flags))
> -			return COMPACT_SUCCESS;
> -
> -		/* Compaction is likely to fail */
> -		if (!compaction_suitable(cc->zone, cc->order,
> -					 cc->highest_zoneidx))
> -			return COMPACT_SKIPPED;
> +		ret = compaction_suit_allocation_order(cc->zone, cc->order,
> +						       cc->highest_zoneidx,
> +						       cc->alloc_flags);
> +		if (ret != COMPACT_CONTINUE)
> +			return ret;
>   	}
>   
>   	/*
> @@ -2917,14 +2933,10 @@ static bool kcompactd_node_suitable(pg_data_t *pgdat)
>   		if (!populated_zone(zone))
>   			continue;
>   
> -		/* Allocation can already succeed, check other zones */
> -		if (zone_watermark_ok(zone, pgdat->kcompactd_max_order,
> -				      min_wmark_pages(zone),
> -				      highest_zoneidx, 0))
> -			continue;
> -
> -		if (compaction_suitable(zone, pgdat->kcompactd_max_order,
> -					highest_zoneidx))
> +		if (compaction_suit_allocation_order(zone,
> +				pgdat->kcompactd_max_order,
> +				highest_zoneidx, ALLOC_WMARK_MIN) ==
> +				COMPACT_CONTINUE)
>   			return true;
>   	}
>   
> @@ -2961,12 +2973,9 @@ static void kcompactd_do_work(pg_data_t *pgdat)
>   		if (compaction_deferred(zone, cc.order))
>   			continue;
>   
> -		/* Allocation can already succeed, nothing to do */
> -		if (zone_watermark_ok(zone, cc.order,
> -				      min_wmark_pages(zone), zoneid, 0))
> -			continue;
> -
> -		if (!compaction_suitable(zone, cc.order, zoneid))
> +		if (compaction_suit_allocation_order(zone,
> +				cc.order, zoneid, ALLOC_WMARK_MIN) !=
> +				COMPACT_CONTINUE)
>   			continue;
>   
>   		if (kthread_should_stop())



* Re: [PATCH v2 7/7] mm/compaction: factor out code to test if we should run compaction for target order
  2023-08-26 15:36 ` [PATCH v2 7/7] mm/compaction: factor out code to test if we should run compaction for target order Kemeng Shi
  2023-08-29  3:48   ` Baolin Wang
@ 2023-08-29  3:54   ` Matthew Wilcox
  2023-08-30  6:45     ` Kemeng Shi
  1 sibling, 1 reply; 19+ messages in thread
From: Matthew Wilcox @ 2023-08-29  3:54 UTC (permalink / raw)
  To: Kemeng Shi; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david

On Sat, Aug 26, 2023 at 11:36:17PM +0800, Kemeng Shi wrote:
> +		if (compaction_suit_allocation_order(zone,
> +				pgdat->kcompactd_max_order,
> +				highest_zoneidx, ALLOC_WMARK_MIN) ==
> +				COMPACT_CONTINUE)

The indentation is confusing here.  It looks like COMPACT_CONTINUE is
an argument of compaction_suit_allocation_order().  How about:

		ret = compaction_suit_allocation_order(zone,
				pgdat->kcompactd_max_order,
				highest_zoneidx, ALLOC_WMARK_MIN);
		if (ret == COMPACT_CONTINUE)

(assuming there's a handy variable called ret)

You could also distinguish it by indenting COMPACT_CONTINUE by an
extra tab, but I think it's worth an extra variable just because this is
such a long line.

> +		if (compaction_suit_allocation_order(zone,
> +				cc.order, zoneid, ALLOC_WMARK_MIN) !=
> +				COMPACT_CONTINUE)
>  			continue;

Same here.
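
For the second site, a sketch under the same assumption of a local
enum compact_result ret:

		ret = compaction_suit_allocation_order(zone, cc.order,
						       zoneid, ALLOC_WMARK_MIN);
		if (ret != COMPACT_CONTINUE)
			continue;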



* Re: [PATCH v2 1/7] mm/compaction: use correct list in move_freelist_{head}/{tail}
  2023-08-26 15:36 ` [PATCH v2 1/7] mm/compaction: use correct list in move_freelist_{head}/{tail} Kemeng Shi
@ 2023-08-29  9:03   ` Mel Gorman
  0 siblings, 0 replies; 19+ messages in thread
From: Mel Gorman @ 2023-08-29  9:03 UTC (permalink / raw)
  To: Kemeng Shi; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, david, willy

On Sat, Aug 26, 2023 at 11:36:11PM +0800, Kemeng Shi wrote:
> The freepage is chained via buddy_list on the freelist head. Use
> buddy_list instead of lru to correct the list operations.
> 
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

Acked-by: Mel Gorman <mgorman@techsingularity.net>

-- 
Mel Gorman
SUSE Labs



* Re: [PATCH v2 2/7] mm/compaction: call list_is_{first}/{last} more intuitively in move_freelist_{head}/{tail}
  2023-08-26 15:36 ` [PATCH v2 2/7] mm/compaction: call list_is_{first}/{last} more intuitively " Kemeng Shi
@ 2023-08-29  9:18   ` Mel Gorman
  0 siblings, 0 replies; 19+ messages in thread
From: Mel Gorman @ 2023-08-29  9:18 UTC (permalink / raw)
  To: Kemeng Shi; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, david, willy

On Sat, Aug 26, 2023 at 11:36:12PM +0800, Kemeng Shi wrote:
> We use move_freelist_head after list_for_each_entry_reverse to skip
> recently scanned pages. There is no need to do an actual move if all
> freepages were searched in list_for_each_entry_reverse, i.e. the
> freepage points to the first page in the freelist. It is more intuitive
> to call list_is_first with the list entry as the first argument and the
> list head as the second argument to check whether the entry is the
> first one in the list, instead of calling list_is_last with the entry
> and the head passed in reverse.
> 
> Similarly, calling list_is_last in move_freelist_tail is more
> intuitive.
> 
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

Acked-by: Mel Gorman <mgorman@techsingularity.net>

-- 
Mel Gorman
SUSE Labs



* Re: [PATCH v2 3/7] mm/compaction: correctly return failure with bogus compound_order in strict mode
       [not found] ` <20230826153617.4019189-4-shikemeng@huaweicloud.com>
  2023-08-29  3:27   ` [PATCH v2 3/7] mm/compaction: correctly return failure with bogus compound_order in strict mode Baolin Wang
@ 2023-08-29  9:39   ` Mel Gorman
  2023-08-30  6:50     ` Kemeng Shi
  1 sibling, 1 reply; 19+ messages in thread
From: Mel Gorman @ 2023-08-29  9:39 UTC (permalink / raw)
  To: Kemeng Shi; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, david, willy

On Sat, Aug 26, 2023 at 11:36:13PM +0800, Kemeng Shi wrote:
> In strict mode, we should return 0 if there is any hole in the
> pageblock. If we successfully isolate pages at the beginning of the
> pageblock and then hit a bogus compound_order outside the pageblock on
> the next page, we will abort the search loop with blockpfn > end_pfn.
> Although we limit blockpfn to end_pfn, we will treat it as a successful
> isolation in strict mode, as blockpfn is not < end_pfn, and return the
> partially isolated pages. Then isolate_freepages_range may succeed
> unexpectedly with a hole in the isolated range.
> 
> Fixes: 9fcd6d2e052ee ("mm, compaction: skip compound pages by order in free scanner")
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>

Modify the (likely(order <= MAX_ORDER)) block to avoid ever updating
blockpfn past the end of the pageblock. Then remove the second redundant
check.
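
One possible shape for that (an untested sketch of the suggestion, not
necessarily what the next version will do):

		if (PageCompound(page)) {
			const unsigned int order = compound_order(page);

			/*
			 * Only skip ahead when the whole compound page sits
			 * inside this pageblock; a bogus order that would
			 * step past end_pfn leaves blockpfn untouched.
			 */
			if (blockpfn + (1UL << order) <= end_pfn) {
				blockpfn += (1UL << order) - 1;
				page += (1UL << order) - 1;
				nr_scanned += (1UL << order) - 1;
			}

			goto isolate_fail;
		}

with the then-redundant blockpfn clamp after the loop removed.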

-- 
Mel Gorman
SUSE Labs



* Re: [PATCH v2 4/7] mm/compaction: simplify pfn iteration in isolate_freepages_range
       [not found] ` <20230826153617.4019189-5-shikemeng@huaweicloud.com>
@ 2023-08-29 15:01   ` Mel Gorman
  2023-08-30  7:02     ` Kemeng Shi
  0 siblings, 1 reply; 19+ messages in thread
From: Mel Gorman @ 2023-08-29 15:01 UTC (permalink / raw)
  To: Kemeng Shi; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, david, willy

On Sat, Aug 26, 2023 at 11:36:14PM +0800, Kemeng Shi wrote:
> We call isolate_freepages_block in strict mode; continuous pages in
> the pageblock will be isolated if isolate_freepages_block succeeds.
> Then pfn + isolated will point to the start of the next pageblock to
> scan, no matter how many pageblocks were isolated in
> isolate_freepages_block. Use pfn + isolated as the start of the next
> pageblock to scan to simplify the iteration.
> 
> pfn + isolated always points to the start of the next pageblock:
> In case the isolated buddy page has order higher than the pageblock:
> 1. a page in a buddy page is aligned to its order
> 2. the order of the page is higher than the pageblock order
> Then the page is aligned to the pageblock order, so the pfn of the
> page and the isolated page count are both aligned to the pageblock
> order, and pfn + isolated is pageblock order aligned.
> 
> In case the isolated buddy page has order lower than the pageblock:
> A buddy page with order N contains two order N - 1 pages as follows:
> |        order N        |
> |order N - 1|order N - 1|
> So buddy pages with order N - 1 will never cross the boundary of
> order N. Similarly, buddy pages with order N - 2 will never cross the
> boundary of order N - 1, and so on. Then any pages with order less
> than the pageblock order will never cross a pageblock boundary.
> 
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

While I don't think the patch is wrong, I also don't think it
meaningfully simplifies the code or optimises enough to be justified.
Even though a branch is eliminated, the whole path is not cheap.

-- 
Mel Gorman
SUSE Labs



* Re: [PATCH v2 5/7] mm/compaction: remove repeat compact_blockskip_flush check in reset_isolation_suitable
       [not found] ` <20230826153617.4019189-6-shikemeng@huaweicloud.com>
@ 2023-08-29 15:05   ` Mel Gorman
  2023-08-30  7:07     ` Kemeng Shi
  0 siblings, 1 reply; 19+ messages in thread
From: Mel Gorman @ 2023-08-29 15:05 UTC (permalink / raw)
  To: Kemeng Shi; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, david, willy

On Sat, Aug 26, 2023 at 11:36:15PM +0800, Kemeng Shi wrote:
> The compact_blockskip_flush check is already done in
> __reset_isolation_suitable, so remove the repeated check before
> calling __reset_isolation_suitable in reset_isolation_suitable.
> 
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

The comment should move to __reset_isolation_suitable but otherwise

Acked-by: Mel Gorman <mgorman@techsingularity.net>

As a complete aside, the reset_isolation_suitable and
__reset_isolation_suitable were badly named because locking is not
involved but it's meaningless churn to fix it.

-- 
Mel Gorman
SUSE Labs



* Re: [PATCH v2 6/7] mm/compaction: improve comment of is_via_compact_memory
       [not found] ` <20230826153617.4019189-7-shikemeng@huaweicloud.com>
  2023-08-29  3:28   ` [PATCH v2 6/7] mm/compaction: improve comment of is_via_compact_memory Baolin Wang
@ 2023-08-29 15:06   ` Mel Gorman
  1 sibling, 0 replies; 19+ messages in thread
From: Mel Gorman @ 2023-08-29 15:06 UTC (permalink / raw)
  To: Kemeng Shi; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, david, willy

On Sat, Aug 26, 2023 at 11:36:16PM +0800, Kemeng Shi wrote:
> We do proactive compaction with order == -1 via
> 1. /proc/sys/vm/compact_memory
> 2. /sys/devices/system/node/nodex/compact
> 3. /proc/sys/vm/compaction_proactiveness
> Add the missed situations in which order == -1.
> 
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>

Acked-by: Mel Gorman <mgorman@techsingularity.net>

-- 
Mel Gorman
SUSE Labs



* Re: [PATCH v2 7/7] mm/compaction: factor out code to test if we should run compaction for target order
  2023-08-29  3:48   ` Baolin Wang
@ 2023-08-30  6:28     ` Kemeng Shi
  0 siblings, 0 replies; 19+ messages in thread
From: Kemeng Shi @ 2023-08-30  6:28 UTC (permalink / raw)
  To: Baolin Wang, linux-mm, linux-kernel, akpm, mgorman, david, willy



on 8/29/2023 11:48 AM, Baolin Wang wrote:
> 
> 
> On 8/26/2023 11:36 PM, Kemeng Shi wrote:
>> We always do the zone_watermark_ok check and the compaction_suitable
>> check together to test whether compaction for the target order should
>> be run. Factor this code out to remove the repetition.
>>
>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
>> ---
>>   mm/compaction.c | 63 ++++++++++++++++++++++++++++---------------------
>>   1 file changed, 36 insertions(+), 27 deletions(-)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index 00b7bba6c72e..6f2b87b026b8 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -2374,6 +2374,30 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
>>       return false;
>>   }
>>   +/*
>> + * Should we do compaction for target allocation order.
>> + * Return COMPACT_SUCCESS if allocation for target order can already
>> + * be satisfied
>> + * Return COMPACT_SKIPPED if compaction for target order is likely to fail
>> + * Return COMPACT_CONTINUE if compaction for target order should be run
>> + */
>> +static inline enum compact_result
> 
> I think you should drop the 'inline' to let the compiler make the decision.
> 
Sure, I will drop this in the next version. Thanks for the feedback.
>> +compaction_suit_allocation_order(struct zone *zone, unsigned int order,
>> +                 int highest_zoneidx, unsigned int alloc_flags)
> 
> The changes look good to me. So please feel free to add:
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> 
>> +{
>> +    unsigned long watermark;
>> +
>> +    watermark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
>> +    if (zone_watermark_ok(zone, order, watermark, highest_zoneidx,
>> +                  alloc_flags))
>> +        return COMPACT_SUCCESS;
>> +
>> +    if (!compaction_suitable(zone, order, highest_zoneidx))
>> +        return COMPACT_SKIPPED;
>> +
>> +    return COMPACT_CONTINUE;
>> +}
>> +
>>   static enum compact_result
>>   compact_zone(struct compact_control *cc, struct capture_control *capc)
>>   {
>> @@ -2399,19 +2423,11 @@ compact_zone(struct compact_control *cc, struct capture_control *capc)
>>       cc->migratetype = gfp_migratetype(cc->gfp_mask);
>>         if (!is_via_compact_memory(cc->order)) {
>> -        unsigned long watermark;
>> -
>> -        /* Allocation can already succeed, nothing to do */
>> -        watermark = wmark_pages(cc->zone,
>> -                    cc->alloc_flags & ALLOC_WMARK_MASK);
>> -        if (zone_watermark_ok(cc->zone, cc->order, watermark,
>> -                      cc->highest_zoneidx, cc->alloc_flags))
>> -            return COMPACT_SUCCESS;
>> -
>> -        /* Compaction is likely to fail */
>> -        if (!compaction_suitable(cc->zone, cc->order,
>> -                     cc->highest_zoneidx))
>> -            return COMPACT_SKIPPED;
>> +        ret = compaction_suit_allocation_order(cc->zone, cc->order,
>> +                               cc->highest_zoneidx,
>> +                               cc->alloc_flags);
>> +        if (ret != COMPACT_CONTINUE)
>> +            return ret;
>>       }
>>         /*
>> @@ -2917,14 +2933,10 @@ static bool kcompactd_node_suitable(pg_data_t *pgdat)
>>           if (!populated_zone(zone))
>>               continue;
>>   -        /* Allocation can already succeed, check other zones */
>> -        if (zone_watermark_ok(zone, pgdat->kcompactd_max_order,
>> -                      min_wmark_pages(zone),
>> -                      highest_zoneidx, 0))
>> -            continue;
>> -
>> -        if (compaction_suitable(zone, pgdat->kcompactd_max_order,
>> -                    highest_zoneidx))
>> +        if (compaction_suit_allocation_order(zone,
>> +                pgdat->kcompactd_max_order,
>> +                highest_zoneidx, ALLOC_WMARK_MIN) ==
>> +                COMPACT_CONTINUE)
>>               return true;
>>       }
>>   @@ -2961,12 +2973,9 @@ static void kcompactd_do_work(pg_data_t *pgdat)
>>           if (compaction_deferred(zone, cc.order))
>>               continue;
>>   -        /* Allocation can already succeed, nothing to do */
>> -        if (zone_watermark_ok(zone, cc.order,
>> -                      min_wmark_pages(zone), zoneid, 0))
>> -            continue;
>> -
>> -        if (!compaction_suitable(zone, cc.order, zoneid))
>> +        if (compaction_suit_allocation_order(zone,
>> +                cc.order, zoneid, ALLOC_WMARK_MIN) !=
>> +                COMPACT_CONTINUE)
>>               continue;
>>             if (kthread_should_stop())
> 




* Re: [PATCH v2 7/7] mm/compaction: factor out code to test if we should run compaction for target order
  2023-08-29  3:54   ` Matthew Wilcox
@ 2023-08-30  6:45     ` Kemeng Shi
  0 siblings, 0 replies; 19+ messages in thread
From: Kemeng Shi @ 2023-08-30  6:45 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david



on 8/29/2023 11:54 AM, Matthew Wilcox wrote:
> On Sat, Aug 26, 2023 at 11:36:17PM +0800, Kemeng Shi wrote:
>> +		if (compaction_suit_allocation_order(zone,
>> +				pgdat->kcompactd_max_order,
>> +				highest_zoneidx, ALLOC_WMARK_MIN) ==
>> +				COMPACT_CONTINUE)
> 
> The indentation is confusing here.  It looks like COMPACT_CONTINUE is
> an argument of compaction_suit_allocation_order().  How about:
> 
> 		ret = compaction_suit_allocation_order(zone,
> 				pgdat->kcompactd_max_order,
> 				highest_zoneidx, ALLOC_WMARK_MIN);
> 		if (ret == COMPACT_CONTINUE)
> 
Thanks for the information, I will fix it this way in the next version.
> (assuming there's a handy variable called ret)
> 
> You could also distinguish it by indenting COMPACT_CONTINUE by an
> extra tab, but I think it's worth an extra variable just because this is
> such a long line.
>> +		if (compaction_suit_allocation_order(zone,
>> +				cc.order, zoneid, ALLOC_WMARK_MIN) !=
>> +				COMPACT_CONTINUE)
>>  			continue;
> 
> Same here.
> 




* Re: [PATCH v2 3/7] mm/compaction: correctly return failure with bogus compound_order in strict mode
  2023-08-29  9:39   ` Mel Gorman
@ 2023-08-30  6:50     ` Kemeng Shi
  0 siblings, 0 replies; 19+ messages in thread
From: Kemeng Shi @ 2023-08-30  6:50 UTC (permalink / raw)
  To: Mel Gorman; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, david, willy



on 8/29/2023 5:39 PM, Mel Gorman wrote:
> On Sat, Aug 26, 2023 at 11:36:13PM +0800, Kemeng Shi wrote:
>> In strict mode, we should return 0 if there is any hole in the
>> pageblock. If we successfully isolate pages at the beginning of the
>> pageblock and then hit a bogus compound_order outside the pageblock on
>> the next page, we will abort the search loop with blockpfn > end_pfn.
>> Although we limit blockpfn to end_pfn, we will treat it as a successful
>> isolation in strict mode, as blockpfn is not < end_pfn, and return the
>> partially isolated pages. Then isolate_freepages_range may succeed
>> unexpectedly with a hole in the isolated range.
>>
>> Fixes: 9fcd6d2e052ee ("mm, compaction: skip compound pages by order in free scanner")
>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> 
> Modify the (likely(order <= MAX_ORDER)) block to avoid ever updating
> blockpfn past the end of the pageblock. Then remove the second redundant
> check.
> 
Sure, I will improve this in the next version.




* Re: [PATCH v2 4/7] mm/compaction: simplify pfn iteration in isolate_freepages_range
  2023-08-29 15:01   ` [PATCH v2 4/7] mm/compaction: simplify pfn iteration in isolate_freepages_range Mel Gorman
@ 2023-08-30  7:02     ` Kemeng Shi
  0 siblings, 0 replies; 19+ messages in thread
From: Kemeng Shi @ 2023-08-30  7:02 UTC (permalink / raw)
  To: Mel Gorman; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, david, willy



on 8/29/2023 11:01 PM, Mel Gorman wrote:
> On Sat, Aug 26, 2023 at 11:36:14PM +0800, Kemeng Shi wrote:
>> We call isolate_freepages_block in strict mode; continuous pages in
>> the pageblock will be isolated if isolate_freepages_block succeeds.
>> Then pfn + isolated will point to the start of the next pageblock to
>> scan, no matter how many pageblocks were isolated in
>> isolate_freepages_block. Use pfn + isolated as the start of the next
>> pageblock to scan to simplify the iteration.
>>
>> pfn + isolated always points to the start of the next pageblock:
>> In case the isolated buddy page has order higher than the pageblock:
>> 1. a page in a buddy page is aligned to its order
>> 2. the order of the page is higher than the pageblock order
>> Then the page is aligned to the pageblock order, so the pfn of the
>> page and the isolated page count are both aligned to the pageblock
>> order, and pfn + isolated is pageblock order aligned.
>>
>> In case the isolated buddy page has order lower than the pageblock:
>> A buddy page with order N contains two order N - 1 pages as follows:
>> |        order N        |
>> |order N - 1|order N - 1|
>> So buddy pages with order N - 1 will never cross the boundary of
>> order N. Similarly, buddy pages with order N - 2 will never cross the
>> boundary of order N - 1, and so on. Then any pages with order less
>> than the pageblock order will never cross a pageblock boundary.
>>
>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
>> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> 
> While I don't think the patch is wrong, I also don't think it
> meaningfully simplifies the code or optimises enough to be justified.
> Even though a branch is eliminated, the whole path is not cheap.
> 
OK, I will drop this in the next version if you insist.




* Re: [PATCH v2 5/7] mm/compaction: remove repeat compact_blockskip_flush check in reset_isolation_suitable
  2023-08-29 15:05   ` [PATCH v2 5/7] mm/compaction: remove repeat compact_blockskip_flush check in reset_isolation_suitable Mel Gorman
@ 2023-08-30  7:07     ` Kemeng Shi
  0 siblings, 0 replies; 19+ messages in thread
From: Kemeng Shi @ 2023-08-30  7:07 UTC (permalink / raw)
  To: Mel Gorman; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, david, willy



on 8/29/2023 11:05 PM, Mel Gorman wrote:
> On Sat, Aug 26, 2023 at 11:36:15PM +0800, Kemeng Shi wrote:
>> The compact_blockskip_flush check is already done in
>> __reset_isolation_suitable, so remove the repeated check before
>> calling __reset_isolation_suitable in reset_isolation_suitable.
>>
>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
>> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> 
> The comment should move to __reset_isolation_suitable but otherwise
> 
Thanks for the review, I will move the comment in the next version.
> Acked-by: Mel Gorman <mgorman@techsingularity.net>
> 
> As a complete aside, the reset_isolation_suitable and
> __reset_isolation_suitable were badly named because locking is not
> involved but it's meaningless churn to fix it.
> 



