* [PATCH v2 0/7] Fixes and cleanups to compaction
@ 2023-08-26 15:36 Kemeng Shi
2023-08-26 15:36 ` [PATCH v2 1/7] mm/compaction: use correct list in move_freelist_{head}/{tail} Kemeng Shi
` (6 more replies)
0 siblings, 7 replies; 19+ messages in thread
From: Kemeng Shi @ 2023-08-26 15:36 UTC (permalink / raw)
To: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david, willy
Cc: shikemeng
Hi all, this is another series of fixes and cleanups for compaction.
Patches 1-2 fix and clean up the freepage list operations.
Patches 3-4 fix and clean up the isolation of freepages.
Patches 5-6 remove a repeated check and improve a comment.
Patch 7 factors out the code that checks whether compaction should run
for the target allocation order.
More details can be found in the respective patches. Thanks!
v1->v2:
-Collected Reviewed-by tags from Baolin.
-Keep pfn inside of pageblock in patch 3.
-Only improve comment of is_via_compact_memory in patch 6.
-Squash patch 8 and patch 9 into patch 7 and use ALLOC_WMARK_MIN
instead of magic number 0.
Kemeng Shi (7):
mm/compaction: use correct list in move_freelist_{head}/{tail}
mm/compaction: call list_is_{first}/{last} more intuitively in
move_freelist_{head}/{tail}
mm/compaction: correctly return failure with bogus compound_order in
strict mode
mm/compaction: simplify pfn iteration in isolate_freepages_range
mm/compaction: remove repeat compact_blockskip_flush check in
reset_isolation_suitable
mm/compaction: improve comment of is_via_compact_memory
mm/compaction: factor out code to test if we should run compaction for
target order
mm/compaction.c | 106 +++++++++++++++++++++++++-----------------------
1 file changed, 56 insertions(+), 50 deletions(-)
--
2.30.0
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 1/7] mm/compaction: use correct list in move_freelist_{head}/{tail}
2023-08-26 15:36 [PATCH v2 0/7] Fixes and cleanups to compaction Kemeng Shi
@ 2023-08-26 15:36 ` Kemeng Shi
2023-08-29 9:03 ` Mel Gorman
2023-08-26 15:36 ` [PATCH v2 2/7] mm/compaction: call list_is_{first}/{last} more intuitively " Kemeng Shi
` (5 subsequent siblings)
6 siblings, 1 reply; 19+ messages in thread
From: Kemeng Shi @ 2023-08-26 15:36 UTC (permalink / raw)
To: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david, willy
Cc: shikemeng
Freepages are chained via buddy_list on the freelist. Use buddy_list
instead of lru so the list operations touch the correct list.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
mm/compaction.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index 38c8d216c6a3..e3ee1bc1c0ad 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1395,8 +1395,8 @@ move_freelist_head(struct list_head *freelist, struct page *freepage)
{
LIST_HEAD(sublist);
- if (!list_is_last(freelist, &freepage->lru)) {
- list_cut_before(&sublist, freelist, &freepage->lru);
+ if (!list_is_last(freelist, &freepage->buddy_list)) {
+ list_cut_before(&sublist, freelist, &freepage->buddy_list);
list_splice_tail(&sublist, freelist);
}
}
@@ -1412,8 +1412,8 @@ move_freelist_tail(struct list_head *freelist, struct page *freepage)
{
LIST_HEAD(sublist);
- if (!list_is_first(freelist, &freepage->lru)) {
- list_cut_position(&sublist, freelist, &freepage->lru);
+ if (!list_is_first(freelist, &freepage->buddy_list)) {
+ list_cut_position(&sublist, freelist, &freepage->buddy_list);
list_splice_tail(&sublist, freelist);
}
}
--
2.30.0
* [PATCH v2 2/7] mm/compaction: call list_is_{first}/{last} more intuitively in move_freelist_{head}/{tail}
2023-08-26 15:36 [PATCH v2 0/7] Fixes and cleanups to compaction Kemeng Shi
2023-08-26 15:36 ` [PATCH v2 1/7] mm/compaction: use correct list in move_freelist_{head}/{tail} Kemeng Shi
@ 2023-08-26 15:36 ` Kemeng Shi
2023-08-29 9:18 ` Mel Gorman
2023-08-26 15:36 ` [PATCH v2 7/7] mm/compaction: factor out code to test if we should run compaction for target order Kemeng Shi
` (4 subsequent siblings)
6 siblings, 1 reply; 19+ messages in thread
From: Kemeng Shi @ 2023-08-26 15:36 UTC (permalink / raw)
To: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david, willy
Cc: shikemeng
We call move_freelist_head() after list_for_each_entry_reverse() to
skip recent pages, and there is no need to do an actual move if all
freepages were searched in list_for_each_entry_reverse(), i.e. when
freepage points to the first page in the freelist. It is more
intuitive to call list_is_first() with the list entry as the first
argument and the list head as the second to check whether the entry
is the first one, rather than calling list_is_last() with the entry
and head passed in reverse order.
Similarly, calling list_is_last() in move_freelist_tail() is more
intuitive.
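[Editor's note] The calling convention above can be illustrated with a
minimal userspace sketch of the list helpers. This is illustrative
only; the real definitions live in include/linux/list.h.

```c
#include <assert.h>

/* Minimal userspace sketch of the circular doubly linked list used by
 * the kernel.  Semantics mirror include/linux/list.h. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_head_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* list_is_first(list, head): is @list the first entry after @head? */
static int list_is_first(const struct list_head *list,
			 const struct list_head *head)
{
	return list->prev == head;
}

/* list_is_last(list, head): is @list the last entry before @head? */
static int list_is_last(const struct list_head *list,
			const struct list_head *head)
{
	return list->next == head;
}
```

On a list with entries A and B added in order, list_is_first(&A, &head)
holds and is equivalent to the pre-patch list_is_last(&head, &A): in a
circular list the head is "last" relative to the first entry. The
patched form simply states the intent directly.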
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
mm/compaction.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index e3ee1bc1c0ad..a40550a33aee 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1395,7 +1395,7 @@ move_freelist_head(struct list_head *freelist, struct page *freepage)
{
LIST_HEAD(sublist);
- if (!list_is_last(freelist, &freepage->buddy_list)) {
+ if (!list_is_first(&freepage->buddy_list, freelist)) {
list_cut_before(&sublist, freelist, &freepage->buddy_list);
list_splice_tail(&sublist, freelist);
}
@@ -1412,7 +1412,7 @@ move_freelist_tail(struct list_head *freelist, struct page *freepage)
{
LIST_HEAD(sublist);
- if (!list_is_first(freelist, &freepage->buddy_list)) {
+ if (!list_is_last(&freepage->buddy_list, freelist)) {
list_cut_position(&sublist, freelist, &freepage->buddy_list);
list_splice_tail(&sublist, freelist);
}
--
2.30.0
* [PATCH v2 7/7] mm/compaction: factor out code to test if we should run compaction for target order
2023-08-26 15:36 [PATCH v2 0/7] Fixes and cleanups to compaction Kemeng Shi
2023-08-26 15:36 ` [PATCH v2 1/7] mm/compaction: use correct list in move_freelist_{head}/{tail} Kemeng Shi
2023-08-26 15:36 ` [PATCH v2 2/7] mm/compaction: call list_is_{first}/{last} more intuitively " Kemeng Shi
@ 2023-08-26 15:36 ` Kemeng Shi
2023-08-29 3:48 ` Baolin Wang
2023-08-29 3:54 ` Matthew Wilcox
[not found] ` <20230826153617.4019189-4-shikemeng@huaweicloud.com>
` (3 subsequent siblings)
6 siblings, 2 replies; 19+ messages in thread
From: Kemeng Shi @ 2023-08-26 15:36 UTC (permalink / raw)
To: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david, willy
Cc: shikemeng
We always do the zone_watermark_ok() check and the
compaction_suitable() check together to test whether compaction for
the target order should be run. Factor this code out to remove the
duplication.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
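[Editor's note] The helper's shape -- a cheap success check, then a
suitability check, then the continue default -- can be sketched in
userspace with the predicates stubbed out. The enum values and
decision order mirror the patch; the predicate stubs are hypothetical
stand-ins for zone_watermark_ok() and compaction_suitable().

```c
#include <assert.h>
#include <stddef.h>

enum compact_result { COMPACT_SKIPPED, COMPACT_CONTINUE, COMPACT_SUCCESS };

/* Stub predicate type standing in for zone_watermark_ok() and
 * compaction_suitable(); a real zone pointer would be passed instead. */
typedef int (*zone_pred)(const void *zone);

static enum compact_result
suit_allocation_order(const void *zone, zone_pred watermark_ok,
		      zone_pred suitable)
{
	if (watermark_ok(zone))
		return COMPACT_SUCCESS;	/* allocation can already succeed */
	if (!suitable(zone))
		return COMPACT_SKIPPED;	/* compaction is likely to fail */
	return COMPACT_CONTINUE;	/* compaction is worth running */
}
```

The ordering matters: the watermark check is the cheap early exit, so
compaction_suitable() only runs when the allocation cannot already
succeed.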
mm/compaction.c | 63 ++++++++++++++++++++++++++++---------------------
1 file changed, 36 insertions(+), 27 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index 00b7bba6c72e..6f2b87b026b8 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -2374,6 +2374,30 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
return false;
}
+/*
+ * Should we do compaction for target allocation order.
+ * Return COMPACT_SUCCESS if allocation for target order can be already
+ * satisfied
+ * Return COMPACT_SKIPPED if compaction for target order is likely to fail
+ * Return COMPACT_CONTINUE if compaction for target order should be runned
+ */
+static inline enum compact_result
+compaction_suit_allocation_order(struct zone *zone, unsigned int order,
+ int highest_zoneidx, unsigned int alloc_flags)
+{
+ unsigned long watermark;
+
+ watermark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
+ if (zone_watermark_ok(zone, order, watermark, highest_zoneidx,
+ alloc_flags))
+ return COMPACT_SUCCESS;
+
+ if (!compaction_suitable(zone, order, highest_zoneidx))
+ return COMPACT_SKIPPED;
+
+ return COMPACT_CONTINUE;
+}
+
static enum compact_result
compact_zone(struct compact_control *cc, struct capture_control *capc)
{
@@ -2399,19 +2423,11 @@ compact_zone(struct compact_control *cc, struct capture_control *capc)
cc->migratetype = gfp_migratetype(cc->gfp_mask);
if (!is_via_compact_memory(cc->order)) {
- unsigned long watermark;
-
- /* Allocation can already succeed, nothing to do */
- watermark = wmark_pages(cc->zone,
- cc->alloc_flags & ALLOC_WMARK_MASK);
- if (zone_watermark_ok(cc->zone, cc->order, watermark,
- cc->highest_zoneidx, cc->alloc_flags))
- return COMPACT_SUCCESS;
-
- /* Compaction is likely to fail */
- if (!compaction_suitable(cc->zone, cc->order,
- cc->highest_zoneidx))
- return COMPACT_SKIPPED;
+ ret = compaction_suit_allocation_order(cc->zone, cc->order,
+ cc->highest_zoneidx,
+ cc->alloc_flags);
+ if (ret != COMPACT_CONTINUE)
+ return ret;
}
/*
@@ -2917,14 +2933,10 @@ static bool kcompactd_node_suitable(pg_data_t *pgdat)
if (!populated_zone(zone))
continue;
- /* Allocation can already succeed, check other zones */
- if (zone_watermark_ok(zone, pgdat->kcompactd_max_order,
- min_wmark_pages(zone),
- highest_zoneidx, 0))
- continue;
-
- if (compaction_suitable(zone, pgdat->kcompactd_max_order,
- highest_zoneidx))
+ if (compaction_suit_allocation_order(zone,
+ pgdat->kcompactd_max_order,
+ highest_zoneidx, ALLOC_WMARK_MIN) ==
+ COMPACT_CONTINUE)
return true;
}
@@ -2961,12 +2973,9 @@ static void kcompactd_do_work(pg_data_t *pgdat)
if (compaction_deferred(zone, cc.order))
continue;
- /* Allocation can already succeed, nothing to do */
- if (zone_watermark_ok(zone, cc.order,
- min_wmark_pages(zone), zoneid, 0))
- continue;
-
- if (!compaction_suitable(zone, cc.order, zoneid))
+ if (compaction_suit_allocation_order(zone,
+ cc.order, zoneid, ALLOC_WMARK_MIN) !=
+ COMPACT_CONTINUE)
continue;
if (kthread_should_stop())
--
2.30.0
* Re: [PATCH v2 3/7] mm/compaction: correctly return failure with bogus compound_order in strict mode
[not found] ` <20230826153617.4019189-4-shikemeng@huaweicloud.com>
@ 2023-08-29 3:27 ` Baolin Wang
2023-08-29 9:39 ` Mel Gorman
1 sibling, 0 replies; 19+ messages in thread
From: Baolin Wang @ 2023-08-29 3:27 UTC (permalink / raw)
To: Kemeng Shi, linux-mm, linux-kernel, akpm, mgorman, david, willy
On 8/26/2023 11:36 PM, Kemeng Shi wrote:
> In strict mode, we should return 0 if there is any hole in the
> pageblock. If we successfully isolate pages at the beginning of a
> pageblock and then hit a bogus compound_order on the next page,
> outside the pageblock, we abort the search loop with
> blockpfn > end_pfn. Although blockpfn is then limited to end_pfn,
> strict mode treats this as a successful isolation because blockpfn
> is no longer < end_pfn, and partially isolated pages are returned.
> isolate_freepages_range() may then succeed unexpectedly despite a
> hole in the isolated range.
>
> Fixes: 9fcd6d2e052ee ("mm, compaction: skip compound pages by order in free scanner")
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
LGTM.
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
> mm/compaction.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index a40550a33aee..b4d03c9ffe7c 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -631,6 +631,14 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
> page += (1UL << order) - 1;
> nr_scanned += (1UL << order) - 1;
> }
> + /*
> + * There is a tiny chance that we have read bogus
> + * compound_order(), so be careful to not go outside
> + * of the pageblock.
> + */
> + if (unlikely(blockpfn >= end_pfn))
> + blockpfn = end_pfn - 1;
> +
> goto isolate_fail;
> }
>
> @@ -678,8 +686,7 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
> spin_unlock_irqrestore(&cc->zone->lock, flags);
>
> /*
> - * There is a tiny chance that we have read bogus compound_order(),
> - * so be careful to not go outside of the pageblock.
> + * Be careful to not go outside of the pageblock.
> */
> if (unlikely(blockpfn > end_pfn))
> blockpfn = end_pfn;
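[Editor's note] The off-by-one fixed above can be reproduced in
isolation. A userspace sketch with hypothetical pfn values: only the
clamp destinations and the strict-mode exit test mirror the kernel
code, and clamp_after models only the bogus-compound_order early-exit
path added by the patch (a legitimately completed scan still goes
through the unchanged final clamp).

```c
#include <assert.h>

/* Sketch of the strict-mode exit check in isolate_freepages_block().
 * Before the patch, a blockpfn that overshot the pageblock was clamped
 * to end_pfn; the patch clamps the bogus-compound_order path to
 * end_pfn - 1, so strict mode sees the scan as incomplete. */
static unsigned long clamp_before(unsigned long blockpfn,
				  unsigned long end_pfn)
{
	return blockpfn > end_pfn ? end_pfn : blockpfn;
}

static unsigned long clamp_after(unsigned long blockpfn,
				 unsigned long end_pfn)
{
	return blockpfn >= end_pfn ? end_pfn - 1 : blockpfn;
}

/* Strict mode treats the scan as complete only when the whole block
 * was walked, i.e. when blockpfn is not below end_pfn. */
static int strict_scan_complete(unsigned long blockpfn,
				unsigned long end_pfn)
{
	return !(blockpfn < end_pfn);
}
```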
* Re: [PATCH v2 6/7] mm/compaction: improve comment of is_via_compact_memory
[not found] ` <20230826153617.4019189-7-shikemeng@huaweicloud.com>
@ 2023-08-29 3:28 ` Baolin Wang
2023-08-29 15:06 ` Mel Gorman
1 sibling, 0 replies; 19+ messages in thread
From: Baolin Wang @ 2023-08-29 3:28 UTC (permalink / raw)
To: Kemeng Shi, linux-mm, linux-kernel, akpm, mgorman, david, willy
On 8/26/2023 11:36 PM, Kemeng Shi wrote:
> We do proactive compaction with order == -1 via
> 1. /proc/sys/vm/compact_memory
> 2. /sys/devices/system/node/nodex/compact
> 3. /proc/sys/vm/compaction_proactiveness
> Add the missing situations in which order == -1.
>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
> mm/compaction.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 89a1b627bc89..00b7bba6c72e 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -2061,8 +2061,10 @@ static isolate_migrate_t isolate_migratepages(struct compact_control *cc)
> }
>
> /*
> - * order == -1 is expected when compacting via
> - * /proc/sys/vm/compact_memory
> + * order == -1 is expected when compacting proactively via
> + * 1. /proc/sys/vm/compact_memory
> + * 2. /sys/devices/system/node/nodex/compact
> + * 3. /proc/sys/vm/compaction_proactiveness
> */
> static inline bool is_via_compact_memory(int order)
> {
* Re: [PATCH v2 7/7] mm/compaction: factor out code to test if we should run compaction for target order
2023-08-26 15:36 ` [PATCH v2 7/7] mm/compaction: factor out code to test if we should run compaction for target order Kemeng Shi
@ 2023-08-29 3:48 ` Baolin Wang
2023-08-30 6:28 ` Kemeng Shi
2023-08-29 3:54 ` Matthew Wilcox
1 sibling, 1 reply; 19+ messages in thread
From: Baolin Wang @ 2023-08-29 3:48 UTC (permalink / raw)
To: Kemeng Shi, linux-mm, linux-kernel, akpm, mgorman, david, willy
On 8/26/2023 11:36 PM, Kemeng Shi wrote:
> We always do the zone_watermark_ok() check and the
> compaction_suitable() check together to test whether compaction for
> the target order should be run. Factor this code out to remove the
> duplication.
>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> ---
> mm/compaction.c | 63 ++++++++++++++++++++++++++++---------------------
> 1 file changed, 36 insertions(+), 27 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 00b7bba6c72e..6f2b87b026b8 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -2374,6 +2374,30 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
> return false;
> }
>
> +/*
> + * Should we do compaction for target allocation order.
> + * Return COMPACT_SUCCESS if allocation for target order can be already
> + * satisfied
> + * Return COMPACT_SKIPPED if compaction for target order is likely to fail
> + * Return COMPACT_CONTINUE if compaction for target order should be runned
> + */
> +static inline enum compact_result
I think you should drop the 'inline' to let the compiler make the decision.
> +compaction_suit_allocation_order(struct zone *zone, unsigned int order,
> + int highest_zoneidx, unsigned int alloc_flags)
The changes look good to me. So please feel free to add:
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> +{
> + unsigned long watermark;
> +
> + watermark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
> + if (zone_watermark_ok(zone, order, watermark, highest_zoneidx,
> + alloc_flags))
> + return COMPACT_SUCCESS;
> +
> + if (!compaction_suitable(zone, order, highest_zoneidx))
> + return COMPACT_SKIPPED;
> +
> + return COMPACT_CONTINUE;
> +}
> +
> static enum compact_result
> compact_zone(struct compact_control *cc, struct capture_control *capc)
> {
> @@ -2399,19 +2423,11 @@ compact_zone(struct compact_control *cc, struct capture_control *capc)
> cc->migratetype = gfp_migratetype(cc->gfp_mask);
>
> if (!is_via_compact_memory(cc->order)) {
> - unsigned long watermark;
> -
> - /* Allocation can already succeed, nothing to do */
> - watermark = wmark_pages(cc->zone,
> - cc->alloc_flags & ALLOC_WMARK_MASK);
> - if (zone_watermark_ok(cc->zone, cc->order, watermark,
> - cc->highest_zoneidx, cc->alloc_flags))
> - return COMPACT_SUCCESS;
> -
> - /* Compaction is likely to fail */
> - if (!compaction_suitable(cc->zone, cc->order,
> - cc->highest_zoneidx))
> - return COMPACT_SKIPPED;
> + ret = compaction_suit_allocation_order(cc->zone, cc->order,
> + cc->highest_zoneidx,
> + cc->alloc_flags);
> + if (ret != COMPACT_CONTINUE)
> + return ret;
> }
>
> /*
> @@ -2917,14 +2933,10 @@ static bool kcompactd_node_suitable(pg_data_t *pgdat)
> if (!populated_zone(zone))
> continue;
>
> - /* Allocation can already succeed, check other zones */
> - if (zone_watermark_ok(zone, pgdat->kcompactd_max_order,
> - min_wmark_pages(zone),
> - highest_zoneidx, 0))
> - continue;
> -
> - if (compaction_suitable(zone, pgdat->kcompactd_max_order,
> - highest_zoneidx))
> + if (compaction_suit_allocation_order(zone,
> + pgdat->kcompactd_max_order,
> + highest_zoneidx, ALLOC_WMARK_MIN) ==
> + COMPACT_CONTINUE)
> return true;
> }
>
> @@ -2961,12 +2973,9 @@ static void kcompactd_do_work(pg_data_t *pgdat)
> if (compaction_deferred(zone, cc.order))
> continue;
>
> - /* Allocation can already succeed, nothing to do */
> - if (zone_watermark_ok(zone, cc.order,
> - min_wmark_pages(zone), zoneid, 0))
> - continue;
> -
> - if (!compaction_suitable(zone, cc.order, zoneid))
> + if (compaction_suit_allocation_order(zone,
> + cc.order, zoneid, ALLOC_WMARK_MIN) !=
> + COMPACT_CONTINUE)
> continue;
>
> if (kthread_should_stop())
* Re: [PATCH v2 7/7] mm/compaction: factor out code to test if we should run compaction for target order
2023-08-26 15:36 ` [PATCH v2 7/7] mm/compaction: factor out code to test if we should run compaction for target order Kemeng Shi
2023-08-29 3:48 ` Baolin Wang
@ 2023-08-29 3:54 ` Matthew Wilcox
2023-08-30 6:45 ` Kemeng Shi
1 sibling, 1 reply; 19+ messages in thread
From: Matthew Wilcox @ 2023-08-29 3:54 UTC (permalink / raw)
To: Kemeng Shi; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david
On Sat, Aug 26, 2023 at 11:36:17PM +0800, Kemeng Shi wrote:
> + if (compaction_suit_allocation_order(zone,
> + pgdat->kcompactd_max_order,
> + highest_zoneidx, ALLOC_WMARK_MIN) ==
> + COMPACT_CONTINUE)
The indentation is confusing here. It looks like COMPACT_CONTINUE is
an argument of compaction_suit_allocation_order(). How about:
ret = compaction_suit_allocation_order(zone,
pgdat->kcompactd_max_order,
highest_zoneidx, ALLOC_WMARK_MIN);
if (ret == COMPACT_CONTINUE)
(assuming there's a handy variable called ret)
You could also distinguish it by indenting COMPACT_CONTINUE by an
extra tab, but I think it's worth an extra variable just because this is
such a long line.
> + if (compaction_suit_allocation_order(zone,
> + cc.order, zoneid, ALLOC_WMARK_MIN) !=
> + COMPACT_CONTINUE)
> continue;
Same here.
* Re: [PATCH v2 1/7] mm/compaction: use correct list in move_freelist_{head}/{tail}
2023-08-26 15:36 ` [PATCH v2 1/7] mm/compaction: use correct list in move_freelist_{head}/{tail} Kemeng Shi
@ 2023-08-29 9:03 ` Mel Gorman
0 siblings, 0 replies; 19+ messages in thread
From: Mel Gorman @ 2023-08-29 9:03 UTC (permalink / raw)
To: Kemeng Shi; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, david, willy
On Sat, Aug 26, 2023 at 11:36:11PM +0800, Kemeng Shi wrote:
> Freepages are chained via buddy_list on the freelist. Use buddy_list
> instead of lru so the list operations touch the correct list.
>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
--
Mel Gorman
SUSE Labs
* Re: [PATCH v2 2/7] mm/compaction: call list_is_{first}/{last} more intuitively in move_freelist_{head}/{tail}
2023-08-26 15:36 ` [PATCH v2 2/7] mm/compaction: call list_is_{first}/{last} more intuitively " Kemeng Shi
@ 2023-08-29 9:18 ` Mel Gorman
0 siblings, 0 replies; 19+ messages in thread
From: Mel Gorman @ 2023-08-29 9:18 UTC (permalink / raw)
To: Kemeng Shi; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, david, willy
On Sat, Aug 26, 2023 at 11:36:12PM +0800, Kemeng Shi wrote:
> We call move_freelist_head() after list_for_each_entry_reverse() to
> skip recent pages, and there is no need to do an actual move if all
> freepages were searched in list_for_each_entry_reverse(), i.e. when
> freepage points to the first page in the freelist. It is more
> intuitive to call list_is_first() with the list entry as the first
> argument and the list head as the second to check whether the entry
> is the first one, rather than calling list_is_last() with the entry
> and head passed in reverse order.
>
> Similarly, calling list_is_last() in move_freelist_tail() is more
> intuitive.
>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
--
Mel Gorman
SUSE Labs
* Re: [PATCH v2 3/7] mm/compaction: correctly return failure with bogus compound_order in strict mode
[not found] ` <20230826153617.4019189-4-shikemeng@huaweicloud.com>
2023-08-29 3:27 ` [PATCH v2 3/7] mm/compaction: correctly return failure with bogus compound_order in strict mode Baolin Wang
@ 2023-08-29 9:39 ` Mel Gorman
2023-08-30 6:50 ` Kemeng Shi
1 sibling, 1 reply; 19+ messages in thread
From: Mel Gorman @ 2023-08-29 9:39 UTC (permalink / raw)
To: Kemeng Shi; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, david, willy
On Sat, Aug 26, 2023 at 11:36:13PM +0800, Kemeng Shi wrote:
> In strict mode, we should return 0 if there is any hole in the
> pageblock. If we successfully isolate pages at the beginning of a
> pageblock and then hit a bogus compound_order on the next page,
> outside the pageblock, we abort the search loop with
> blockpfn > end_pfn. Although blockpfn is then limited to end_pfn,
> strict mode treats this as a successful isolation because blockpfn
> is no longer < end_pfn, and partially isolated pages are returned.
> isolate_freepages_range() may then succeed unexpectedly despite a
> hole in the isolated range.
>
> Fixes: 9fcd6d2e052ee ("mm, compaction: skip compound pages by order in free scanner")
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Modify the (likely(order <= MAX_ORDER)) block to avoid ever updating
blockpfn past the end of the pageblock. Then remove the second redundant
check.
--
Mel Gorman
SUSE Labs
* Re: [PATCH v2 4/7] mm/compaction: simplify pfn iteration in isolate_freepages_range
[not found] ` <20230826153617.4019189-5-shikemeng@huaweicloud.com>
@ 2023-08-29 15:01 ` Mel Gorman
2023-08-30 7:02 ` Kemeng Shi
0 siblings, 1 reply; 19+ messages in thread
From: Mel Gorman @ 2023-08-29 15:01 UTC (permalink / raw)
To: Kemeng Shi; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, david, willy
On Sat, Aug 26, 2023 at 11:36:14PM +0800, Kemeng Shi wrote:
> We call isolate_freepages_block() in strict mode; contiguous pages
> in a pageblock are isolated if isolate_freepages_block() succeeds.
> Then pfn + isolated points to the start of the next pageblock to
> scan, no matter how many pageblocks were isolated in
> isolate_freepages_block(). Use pfn + isolated as the start of the
> next pageblock to scan to simplify the iteration.
>
> pfn + isolated always points to the start of the next pageblock
> because:
> In case the isolated buddy page has an order higher than the
> pageblock order:
> 1. pages in a buddy page are aligned to its order
> 2. the order of the page is higher than the pageblock order
> Then the page is aligned to the pageblock order, so the pfn of the
> page and the isolated page count are both aligned to the pageblock
> order, and pfn + isolated is pageblock-order aligned.
>
> In case the isolated buddy page has an order lower than the
> pageblock order:
> A buddy page of order N contains two order N - 1 pages as follows:
> | order N |
> |order N - 1|order N - 1|
> So buddy pages of order N - 1 never cross the boundary of order N.
> Similarly, buddy pages of order N - 2 never cross the boundary of
> order N - 1, and so on. Thus any page with an order lower than the
> pageblock order never crosses a pageblock boundary.
>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
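[Editor's note] The alignment argument above can be checked
mechanically with a userspace sketch. The pageblock order of 9 is an
assumption, matching common x86-64 configurations.

```c
#include <assert.h>

#define SKETCH_PAGEBLOCK_ORDER 9UL	/* assumption: typical x86-64 value */

/* A buddy of @order starting at @pfn is aligned to its own order.
 * Check that its last page still lies in the same pageblock (for
 * orders up to the pageblock order), i.e. the buddy never crosses a
 * pageblock boundary. */
static int buddy_stays_in_pageblock(unsigned long pfn, unsigned long order)
{
	unsigned long last = pfn + (1UL << order) - 1;

	if (pfn & ((1UL << order) - 1))
		return 0;	/* not a valid buddy start pfn */
	return (last >> SKETCH_PAGEBLOCK_ORDER) ==
	       (pfn >> SKETCH_PAGEBLOCK_ORDER);
}
```

For orders above the pageblock order, the same alignment instead makes
pfn + isolated land exactly on a pageblock boundary, which is the
other half of the argument.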
While I don't think the patch is wrong, I also don't think it
meaningfully simplifies the code or optimises enough to be justified.
Even though a branch is eliminated, the whole path is not cheap.
--
Mel Gorman
SUSE Labs
* Re: [PATCH v2 5/7] mm/compaction: remove repeat compact_blockskip_flush check in reset_isolation_suitable
[not found] ` <20230826153617.4019189-6-shikemeng@huaweicloud.com>
@ 2023-08-29 15:05 ` Mel Gorman
2023-08-30 7:07 ` Kemeng Shi
0 siblings, 1 reply; 19+ messages in thread
From: Mel Gorman @ 2023-08-29 15:05 UTC (permalink / raw)
To: Kemeng Shi; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, david, willy
On Sat, Aug 26, 2023 at 11:36:15PM +0800, Kemeng Shi wrote:
> We already check compact_blockskip_flush in
> __reset_isolation_suitable(), so remove the repeated check done
> before calling it in reset_isolation_suitable().
>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
The comment should move to __reset_isolation_suitable but otherwise
Acked-by: Mel Gorman <mgorman@techsingularity.net>
As a complete aside, reset_isolation_suitable() and
__reset_isolation_suitable() were badly named because no locking is
involved, but it's meaningless churn to fix it.
--
Mel Gorman
SUSE Labs
* Re: [PATCH v2 6/7] mm/compaction: improve comment of is_via_compact_memory
[not found] ` <20230826153617.4019189-7-shikemeng@huaweicloud.com>
2023-08-29 3:28 ` [PATCH v2 6/7] mm/compaction: improve comment of is_via_compact_memory Baolin Wang
@ 2023-08-29 15:06 ` Mel Gorman
1 sibling, 0 replies; 19+ messages in thread
From: Mel Gorman @ 2023-08-29 15:06 UTC (permalink / raw)
To: Kemeng Shi; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, david, willy
On Sat, Aug 26, 2023 at 11:36:16PM +0800, Kemeng Shi wrote:
> We do proactive compaction with order == -1 via
> 1. /proc/sys/vm/compact_memory
> 2. /sys/devices/system/node/nodex/compact
> 3. /proc/sys/vm/compaction_proactiveness
> Add the missing situations in which order == -1.
>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
--
Mel Gorman
SUSE Labs
* Re: [PATCH v2 7/7] mm/compaction: factor out code to test if we should run compaction for target order
2023-08-29 3:48 ` Baolin Wang
@ 2023-08-30 6:28 ` Kemeng Shi
0 siblings, 0 replies; 19+ messages in thread
From: Kemeng Shi @ 2023-08-30 6:28 UTC (permalink / raw)
To: Baolin Wang, linux-mm, linux-kernel, akpm, mgorman, david, willy
on 8/29/2023 11:48 AM, Baolin Wang wrote:
>
>
> On 8/26/2023 11:36 PM, Kemeng Shi wrote:
>> We always do the zone_watermark_ok() check and the
>> compaction_suitable() check together to test whether compaction for
>> the target order should be run. Factor this code out to remove the
>> duplication.
>>
>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
>> ---
>> mm/compaction.c | 63 ++++++++++++++++++++++++++++---------------------
>> 1 file changed, 36 insertions(+), 27 deletions(-)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index 00b7bba6c72e..6f2b87b026b8 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -2374,6 +2374,30 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
>> return false;
>> }
>> +/*
>> + * Should we do compaction for target allocation order.
>> + * Return COMPACT_SUCCESS if allocation for target order can be already
>> + * satisfied
>> + * Return COMPACT_SKIPPED if compaction for target order is likely to fail
>> + * Return COMPACT_CONTINUE if compaction for target order should be runned
>> + */
>> +static inline enum compact_result
>
> I think you should drop the 'inline' to let the compiler make the decision.
>
Sure, I will drop this in the next version. Thanks for the feedback.
>> +compaction_suit_allocation_order(struct zone *zone, unsigned int order,
>> + int highest_zoneidx, unsigned int alloc_flags)
>
> The changes look good to me. So please feel free to add:
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>
>> +{
>> + unsigned long watermark;
>> +
>> + watermark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
>> + if (zone_watermark_ok(zone, order, watermark, highest_zoneidx,
>> + alloc_flags))
>> + return COMPACT_SUCCESS;
>> +
>> + if (!compaction_suitable(zone, order, highest_zoneidx))
>> + return COMPACT_SKIPPED;
>> +
>> + return COMPACT_CONTINUE;
>> +}
>> +
>> static enum compact_result
>> compact_zone(struct compact_control *cc, struct capture_control *capc)
>> {
>> @@ -2399,19 +2423,11 @@ compact_zone(struct compact_control *cc, struct capture_control *capc)
>> cc->migratetype = gfp_migratetype(cc->gfp_mask);
>> if (!is_via_compact_memory(cc->order)) {
>> - unsigned long watermark;
>> -
>> - /* Allocation can already succeed, nothing to do */
>> - watermark = wmark_pages(cc->zone,
>> - cc->alloc_flags & ALLOC_WMARK_MASK);
>> - if (zone_watermark_ok(cc->zone, cc->order, watermark,
>> - cc->highest_zoneidx, cc->alloc_flags))
>> - return COMPACT_SUCCESS;
>> -
>> - /* Compaction is likely to fail */
>> - if (!compaction_suitable(cc->zone, cc->order,
>> - cc->highest_zoneidx))
>> - return COMPACT_SKIPPED;
>> + ret = compaction_suit_allocation_order(cc->zone, cc->order,
>> + cc->highest_zoneidx,
>> + cc->alloc_flags);
>> + if (ret != COMPACT_CONTINUE)
>> + return ret;
>> }
>> /*
>> @@ -2917,14 +2933,10 @@ static bool kcompactd_node_suitable(pg_data_t *pgdat)
>> if (!populated_zone(zone))
>> continue;
>> - /* Allocation can already succeed, check other zones */
>> - if (zone_watermark_ok(zone, pgdat->kcompactd_max_order,
>> - min_wmark_pages(zone),
>> - highest_zoneidx, 0))
>> - continue;
>> -
>> - if (compaction_suitable(zone, pgdat->kcompactd_max_order,
>> - highest_zoneidx))
>> + if (compaction_suit_allocation_order(zone,
>> + pgdat->kcompactd_max_order,
>> + highest_zoneidx, ALLOC_WMARK_MIN) ==
>> + COMPACT_CONTINUE)
>> return true;
>> }
>> @@ -2961,12 +2973,9 @@ static void kcompactd_do_work(pg_data_t *pgdat)
>> if (compaction_deferred(zone, cc.order))
>> continue;
>> - /* Allocation can already succeed, nothing to do */
>> - if (zone_watermark_ok(zone, cc.order,
>> - min_wmark_pages(zone), zoneid, 0))
>> - continue;
>> -
>> - if (!compaction_suitable(zone, cc.order, zoneid))
>> + if (compaction_suit_allocation_order(zone,
>> + cc.order, zoneid, ALLOC_WMARK_MIN) !=
>> + COMPACT_CONTINUE)
>> continue;
>> if (kthread_should_stop())
>
* Re: [PATCH v2 7/7] mm/compaction: factor out code to test if we should run compaction for target order
2023-08-29 3:54 ` Matthew Wilcox
@ 2023-08-30 6:45 ` Kemeng Shi
0 siblings, 0 replies; 19+ messages in thread
From: Kemeng Shi @ 2023-08-30 6:45 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david
on 8/29/2023 11:54 AM, Matthew Wilcox wrote:
> On Sat, Aug 26, 2023 at 11:36:17PM +0800, Kemeng Shi wrote:
>> + if (compaction_suit_allocation_order(zone,
>> + pgdat->kcompactd_max_order,
>> + highest_zoneidx, ALLOC_WMARK_MIN) ==
>> + COMPACT_CONTINUE)
>
> The indentation is confusing here. It looks like COMPACT_CONTINUE is
> an argument of compaction_suit_allocation_order(). How about:
>
> ret = compaction_suit_allocation_order(zone,
> pgdat->kcompactd_max_order,
> highest_zoneidx, ALLOC_WMARK_MIN);
> if (ret == COMPACT_CONTINUE)
>
Thanks for the information; I will fix it this way in the next version.
> (assuming there's a handy variable called ret)
>
> You could also distinguish it by indenting COMPACT_CONTINUE by an
> extra tab, but I think it's worth an extra variable just because this is
> such a long line.
>> + if (compaction_suit_allocation_order(zone,
>> + cc.order, zoneid, ALLOC_WMARK_MIN) !=
>> + COMPACT_CONTINUE)
>> continue;
>
> Same here.
>
* Re: [PATCH v2 3/7] mm/compaction: correctly return failure with bogus compound_order in strict mode
2023-08-29 9:39 ` Mel Gorman
@ 2023-08-30 6:50 ` Kemeng Shi
0 siblings, 0 replies; 19+ messages in thread
From: Kemeng Shi @ 2023-08-30 6:50 UTC (permalink / raw)
To: Mel Gorman; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, david, willy
on 8/29/2023 5:39 PM, Mel Gorman wrote:
> On Sat, Aug 26, 2023 at 11:36:13PM +0800, Kemeng Shi wrote:
>> In strict mode, we should return 0 if there is any hole in the pageblock.
>> If we successfully isolate pages at the beginning of the pageblock and
>> then hit a bogus compound_order outside the pageblock on the next page,
>> we will abort the search loop with blockpfn > end_pfn. Although we limit
>> blockpfn to end_pfn, we still treat it as a successful isolation in
>> strict mode, since blockpfn is no longer < end_pfn, and return the
>> partially isolated pages. Then isolate_freepages_range may succeed
>> unexpectedly despite a hole in the isolated range.
>>
>> Fixes: 9fcd6d2e052ee ("mm, compaction: skip compound pages by order in free scanner")
>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
>
> Modify the (likely(order <= MAX_ORDER)) block to avoid ever updating
> blockpfn past the end of the pageblock. Then remove the second redundant
> check.
>
Sure, I will improve this in the next version.
* Re: [PATCH v2 4/7] mm/compaction: simplify pfn iteration in isolate_freepages_range
2023-08-29 15:01 ` [PATCH v2 4/7] mm/compaction: simplify pfn iteration in isolate_freepages_range Mel Gorman
@ 2023-08-30 7:02 ` Kemeng Shi
0 siblings, 0 replies; 19+ messages in thread
From: Kemeng Shi @ 2023-08-30 7:02 UTC (permalink / raw)
To: Mel Gorman; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, david, willy
on 8/29/2023 11:01 PM, Mel Gorman wrote:
> On Sat, Aug 26, 2023 at 11:36:14PM +0800, Kemeng Shi wrote:
>> We call isolate_freepages_block in strict mode; contiguous pages in the
>> pageblock will be isolated if isolate_freepages_block succeeds. Then
>> pfn + isolated will point to the start of the next pageblock to scan,
>> no matter how many pageblocks were isolated in isolate_freepages_block.
>> Use pfn + isolated as the start of the next pageblock to scan to
>> simplify the iteration.
>>
>> The pfn + isolated always points to the start of the next pageblock:
>> In case the isolated buddy page has order higher than the pageblock:
>> 1. a page in a buddy page is aligned to its order
>> 2. the order of the page is higher than the pageblock order
>> Then the page is aligned to the pageblock order, so the pfn of the page
>> and the isolated page count are both aligned to the pageblock order,
>> and pfn + isolated is pageblock order aligned.
>>
>> In case the isolated buddy page has order lower than the pageblock:
>> A buddy page with order N contains two order N - 1 pages as follows:
>> | order N |
>> |order N - 1|order N - 1|
>> So buddy pages with order N - 1 will never cross the boundary of order
>> N. Similarly, buddy pages with order N - 2 will never cross the
>> boundary of order N - 1, and so on. Thus any page with order less than
>> the pageblock order will never cross a pageblock boundary.
>>
>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
>> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>
> While I don't think the patch is wrong, I also don't think it
> meaningfully simplifies the code or optimises enough to be justified.
> Even though a branch is eliminated, the whole path is not cheap.
>
OK, I will drop this in the next version if you insist.
* Re: [PATCH v2 5/7] mm/compaction: remove repeat compact_blockskip_flush check in reset_isolation_suitable
2023-08-29 15:05 ` [PATCH v2 5/7] mm/compaction: remove repeat compact_blockskip_flush check in reset_isolation_suitable Mel Gorman
@ 2023-08-30 7:07 ` Kemeng Shi
0 siblings, 0 replies; 19+ messages in thread
From: Kemeng Shi @ 2023-08-30 7:07 UTC (permalink / raw)
To: Mel Gorman; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, david, willy
on 8/29/2023 11:05 PM, Mel Gorman wrote:
> On Sat, Aug 26, 2023 at 11:36:15PM +0800, Kemeng Shi wrote:
>> We already check compact_blockskip_flush in __reset_isolation_suitable,
>> so just remove the repeated compact_blockskip_flush check done before
>> calling __reset_isolation_suitable.
>>
>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
>> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>
> The comment should move to __reset_isolation_suitable but otherwise
>
Thanks for the review; I will move the comment in the next version.
> Acked-by: Mel Gorman <mgorman@techsingularity.net>
>
> As a complete aside, the reset_isolation_suitable and
> __reset_isolation_suitable were badly named because locking is not
> involved but it's meaningless churn to fix it.
>
end of thread, other threads:[~2023-08-30 7:07 UTC | newest]
Thread overview: 19+ messages
2023-08-26 15:36 [PATCH v2 0/7] Fixes and cleanups to compaction Kemeng Shi
2023-08-26 15:36 ` [PATCH v2 1/7] mm/compaction: use correct list in move_freelist_{head}/{tail} Kemeng Shi
2023-08-29 9:03 ` Mel Gorman
2023-08-26 15:36 ` [PATCH v2 2/7] mm/compaction: call list_is_{first}/{last} more intuitively " Kemeng Shi
2023-08-29 9:18 ` Mel Gorman
2023-08-26 15:36 ` [PATCH v2 7/7] mm/compaction: factor out code to test if we should run compaction for target order Kemeng Shi
2023-08-29 3:48 ` Baolin Wang
2023-08-30 6:28 ` Kemeng Shi
2023-08-29 3:54 ` Matthew Wilcox
2023-08-30 6:45 ` Kemeng Shi
[not found] ` <20230826153617.4019189-4-shikemeng@huaweicloud.com>
2023-08-29 3:27 ` [PATCH v2 3/7] mm/compaction: correctly return failure with bogus compound_order in strict mode Baolin Wang
2023-08-29 9:39 ` Mel Gorman
2023-08-30 6:50 ` Kemeng Shi
[not found] ` <20230826153617.4019189-5-shikemeng@huaweicloud.com>
2023-08-29 15:01 ` [PATCH v2 4/7] mm/compaction: simplify pfn iteration in isolate_freepages_range Mel Gorman
2023-08-30 7:02 ` Kemeng Shi
[not found] ` <20230826153617.4019189-6-shikemeng@huaweicloud.com>
2023-08-29 15:05 ` [PATCH v2 5/7] mm/compaction: remove repeat compact_blockskip_flush check in reset_isolation_suitable Mel Gorman
2023-08-30 7:07 ` Kemeng Shi
[not found] ` <20230826153617.4019189-7-shikemeng@huaweicloud.com>
2023-08-29 3:28 ` [PATCH v2 6/7] mm/compaction: improve comment of is_via_compact_memory Baolin Wang
2023-08-29 15:06 ` Mel Gorman