* [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
@ 2025-09-10 9:22 Wei Yang
2025-09-11 1:25 ` Wei Yang
0 siblings, 1 reply; 22+ messages in thread
From: Wei Yang @ 2025-09-10 9:22 UTC (permalink / raw)
To: akpm, vbabka, surenb, mhocko, jackmanb, hannes, ziy,
wangkefeng.wang
Cc: linux-mm, Wei Yang, Oscar Salvador
Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
isolate_migratepages_block()") converts the API from page to folio. But the
low_pfn advance for a hugetlb page is wrong when low_pfn doesn't point
to the head page.
Originally, if page is a hugetlb tail page, compound_nr() returns 1,
which means low_pfn advances by only one on the next iteration. After the
change, low_pfn can advance past the end of the hugetlb range, since
folio_nr_pages() always returns the total number of pages in the large
folio. This results in skipping part of the range that should be isolated
and then migrated.
The worst case for alloc_contig is that it does all the isolation and
migration, but finally finds that some range is still not isolated, and
then undoes all the work and tries a new range.
Advance low_pfn to the end of the hugetlb folio.
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Oscar Salvador <osalvador@suse.de>
---
mm/compaction.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index bf021b31c7ec..1e8f8eca318c 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
* Hugepage was successfully isolated and placed
* on the cc->migratepages list.
*/
- low_pfn += folio_nr_pages(folio) - 1;
+ low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
goto isolate_success_no_list;
}
--
2.34.1
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-10 9:22 [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb Wei Yang
@ 2025-09-11 1:25 ` Wei Yang
2025-09-11 1:35 ` Zi Yan
From: Wei Yang @ 2025-09-11 1:25 UTC (permalink / raw)
To: Wei Yang
Cc: akpm, vbabka, surenb, mhocko, jackmanb, hannes, ziy,
wangkefeng.wang, linux-mm, Oscar Salvador
On Wed, Sep 10, 2025 at 09:22:40AM +0000, Wei Yang wrote:
>Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
>isolate_migratepages_block()") converts api from page to folio. But the
>low_pfn advance for hugetlb page seems wrong when low_pfn doesn't point
>to head page.
>
>Originally, if page is a hugetlb tail page, compound_nr() return 1,
>which means low_pfn only advance one in next iteration. After the
>change, low_pfn would advance more than the hugetlb range, since
>folio_nr_pages() always return total number of the large page. This
>results in skipping some range to isolate and then to migrate.
>
>The worst case for alloc_contig is it does all the isolation and
>migration, but finally find some range is still not isolated. And then
>undo all the work and try a new range.
>
>Advance low_pfn to the end of hugetlb.
>
>Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
>Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
>Cc: Oscar Salvador <osalvador@suse.de>
Forgot to cc stable.
Cc: <stable@vger.kernel.org>
>---
> mm/compaction.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/mm/compaction.c b/mm/compaction.c
>index bf021b31c7ec..1e8f8eca318c 100644
>--- a/mm/compaction.c
>+++ b/mm/compaction.c
>@@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> * Hugepage was successfully isolated and placed
> * on the cc->migratepages list.
> */
>- low_pfn += folio_nr_pages(folio) - 1;
>+ low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
One question is why we advanced by compound_nr() in the original version.
Yes, there are several places advancing by compound_nr(), but for a tail
page that seems to iterate over the same large page, do the same thing,
and advance by 1 again.
Not sure which part of the story I missed.
> goto isolate_success_no_list;
> }
>
>--
>2.34.1
--
Wei Yang
Help you, Help me
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-11 1:25 ` Wei Yang
@ 2025-09-11 1:35 ` Zi Yan
2025-09-11 1:38 ` Zi Yan
2025-09-11 3:27 ` Wei Yang
From: Zi Yan @ 2025-09-11 1:35 UTC (permalink / raw)
To: Wei Yang
Cc: akpm, vbabka, surenb, mhocko, jackmanb, hannes, wangkefeng.wang,
linux-mm, Oscar Salvador
On 10 Sep 2025, at 21:25, Wei Yang wrote:
> On Wed, Sep 10, 2025 at 09:22:40AM +0000, Wei Yang wrote:
>> Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
>> isolate_migratepages_block()") converts api from page to folio. But the
>> low_pfn advance for hugetlb page seems wrong when low_pfn doesn't point
>> to head page.
>>
>> Originally, if page is a hugetlb tail page, compound_nr() return 1,
>> which means low_pfn only advance one in next iteration. After the
>> change, low_pfn would advance more than the hugetlb range, since
>> folio_nr_pages() always return total number of the large page. This
>> results in skipping some range to isolate and then to migrate.
>>
>> The worst case for alloc_contig is it does all the isolation and
>> migration, but finally find some range is still not isolated. And then
>> undo all the work and try a new range.
>>
>> Advance low_pfn to the end of hugetlb.
>>
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
>> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
>> Cc: Oscar Salvador <osalvador@suse.de>
>
> Forgot to cc stable.
>
> Cc: <stable@vger.kernel.org>
Is there any bug report to justify the backport? It seems more likely
to be a performance issue than a correctness issue.
>
>> ---
>> mm/compaction.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index bf021b31c7ec..1e8f8eca318c 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>> * Hugepage was successfully isolated and placed
>> * on the cc->migratepages list.
>> */
>> - low_pfn += folio_nr_pages(folio) - 1;
>> + low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
>
> One question is why we advance compound_nr() in original version.
>
> Yes, there are several places advancing compound_nr(), but it seems to iterate
> on the same large page and do the same thing and advance 1 again.
>
> Not sure which part story I missed.
isolate_migratepages_block() starts from the beginning of a pageblock.
How likely is it that the code hits the middle of a hugetlb folio?
>
>> goto isolate_success_no_list;
>> }
>>
>> --
>> 2.34.1
>
> --
> Wei Yang
> Help you, Help me
Best Regards,
Yan, Zi
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-11 1:35 ` Zi Yan
@ 2025-09-11 1:38 ` Zi Yan
2025-09-11 1:50 ` Zi Yan
2025-09-11 3:27 ` Wei Yang
From: Zi Yan @ 2025-09-11 1:38 UTC (permalink / raw)
To: Wei Yang
Cc: akpm, vbabka, surenb, mhocko, jackmanb, hannes, wangkefeng.wang,
linux-mm, Oscar Salvador
On 10 Sep 2025, at 21:35, Zi Yan wrote:
> On 10 Sep 2025, at 21:25, Wei Yang wrote:
>
>> On Wed, Sep 10, 2025 at 09:22:40AM +0000, Wei Yang wrote:
>>> Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
>>> isolate_migratepages_block()") converts api from page to folio. But the
>>> low_pfn advance for hugetlb page seems wrong when low_pfn doesn't point
>>> to head page.
>>>
>>> Originally, if page is a hugetlb tail page, compound_nr() return 1,
>>> which means low_pfn only advance one in next iteration. After the
>>> change, low_pfn would advance more than the hugetlb range, since
>>> folio_nr_pages() always return total number of the large page. This
>>> results in skipping some range to isolate and then to migrate.
>>>
>>> The worst case for alloc_contig is it does all the isolation and
>>> migration, but finally find some range is still not isolated. And then
>>> undo all the work and try a new range.
>>>
>>> Advance low_pfn to the end of hugetlb.
>>>
>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>> Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
>>> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
>>> Cc: Oscar Salvador <osalvador@suse.de>
>>
>> Forgot to cc stable.
>>
>> Cc: <stable@vger.kernel.org>
>
> Is there any bug report to justify the backport? Since it is more likely
> to be a performance issue instead of a correctness issue.
>
>>
>>> ---
>>> mm/compaction.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>> index bf021b31c7ec..1e8f8eca318c 100644
>>> --- a/mm/compaction.c
>>> +++ b/mm/compaction.c
>>> @@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>> * Hugepage was successfully isolated and placed
>>> * on the cc->migratepages list.
>>> */
>>> - low_pfn += folio_nr_pages(folio) - 1;
>>> + low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
>>
>> One question is why we advance compound_nr() in original version.
>>
>> Yes, there are several places advancing compound_nr(), but it seems to iterate
>> on the same large page and do the same thing and advance 1 again.
>>
>> Not sure which part story I missed.
>
> isolate_migratepages_block() starts from the beginning of a pageblock.
> How likely the code hit in the middle of a hugetlb?
>
In addition, there are two other "low_pfn += (1UL << order) - 1" lines
in the if (PageHuge(page)) branch. Why not change them too, if you think
page can point into the middle of a hugetlb folio?
Best Regards,
Yan, Zi
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-11 1:38 ` Zi Yan
@ 2025-09-11 1:50 ` Zi Yan
2025-09-11 3:34 ` Wei Yang
From: Zi Yan @ 2025-09-11 1:50 UTC (permalink / raw)
To: Wei Yang
Cc: akpm, vbabka, surenb, mhocko, jackmanb, hannes, wangkefeng.wang,
linux-mm, Oscar Salvador
On 10 Sep 2025, at 21:38, Zi Yan wrote:
> On 10 Sep 2025, at 21:35, Zi Yan wrote:
>
>> On 10 Sep 2025, at 21:25, Wei Yang wrote:
>>
>>> On Wed, Sep 10, 2025 at 09:22:40AM +0000, Wei Yang wrote:
>>>> Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
>>>> isolate_migratepages_block()") converts api from page to folio. But the
>>>> low_pfn advance for hugetlb page seems wrong when low_pfn doesn't point
>>>> to head page.
>>>>
>>>> Originally, if page is a hugetlb tail page, compound_nr() return 1,
>>>> which means low_pfn only advance one in next iteration. After the
>>>> change, low_pfn would advance more than the hugetlb range, since
>>>> folio_nr_pages() always return total number of the large page. This
>>>> results in skipping some range to isolate and then to migrate.
>>>>
>>>> The worst case for alloc_contig is it does all the isolation and
>>>> migration, but finally find some range is still not isolated. And then
>>>> undo all the work and try a new range.
>>>>
>>>> Advance low_pfn to the end of hugetlb.
>>>>
>>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>>> Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
This behavior seems to be introduced by commit 369fa227c219 ("mm: make
alloc_contig_range handle free hugetlb pages"). The related change is:
+ if (PageHuge(page) && cc->alloc_contig) {
+ ret = isolate_or_dissolve_huge_page(page);
+
+ /*
+ * Fail isolation in case isolate_or_dissolve_huge_page()
+ * reports an error. In case of -ENOMEM, abort right away.
+ */
+ if (ret < 0) {
+ /* Do not report -EBUSY down the chain */
+ if (ret == -EBUSY)
+ ret = 0;
+ low_pfn += (1UL << compound_order(page)) - 1;
+ goto isolate_fail;
+ }
+
+ /*
+ * Ok, the hugepage was dissolved. Now these pages are
+ * Buddy and cannot be re-allocated because they are
+ * isolated. Fall-through as the check below handles
+ * Buddy pages.
+ */
+ }
+
>>>> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
>>>> Cc: Oscar Salvador <osalvador@suse.de>
>>>
>>> Forgot to cc stable.
>>>
>>> Cc: <stable@vger.kernel.org>
>>
>> Is there any bug report to justify the backport? Since it is more likely
>> to be a performance issue instead of a correctness issue.
>>
>>>
>>>> ---
>>>> mm/compaction.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>>> index bf021b31c7ec..1e8f8eca318c 100644
>>>> --- a/mm/compaction.c
>>>> +++ b/mm/compaction.c
>>>> @@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>>> * Hugepage was successfully isolated and placed
>>>> * on the cc->migratepages list.
>>>> */
>>>> - low_pfn += folio_nr_pages(folio) - 1;
>>>> + low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
>>>
>>> One question is why we advance compound_nr() in original version.
>>>
>>> Yes, there are several places advancing compound_nr(), but it seems to iterate
>>> on the same large page and do the same thing and advance 1 again.
>>>
>>> Not sure which part story I missed.
>>
>> isolate_migratepages_block() starts from the beginning of a pageblock.
>> How likely the code hit in the middle of a hugetlb?
>>
>
> In addition, there are two other “low_pfn += (1UL << order) - 1”
> in the if (PageHuge(page)), why not change them too if you think
> page can point to the middle of a hugetlb?
>
> Best Regards,
> Yan, Zi
Best Regards,
Yan, Zi
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-11 1:35 ` Zi Yan
2025-09-11 1:38 ` Zi Yan
@ 2025-09-11 3:27 ` Wei Yang
2025-09-11 16:19 ` Zi Yan
From: Wei Yang @ 2025-09-11 3:27 UTC (permalink / raw)
To: Zi Yan
Cc: Wei Yang, akpm, vbabka, surenb, mhocko, jackmanb, hannes,
wangkefeng.wang, linux-mm, Oscar Salvador
On Wed, Sep 10, 2025 at 09:35:53PM -0400, Zi Yan wrote:
>On 10 Sep 2025, at 21:25, Wei Yang wrote:
>
>> On Wed, Sep 10, 2025 at 09:22:40AM +0000, Wei Yang wrote:
>>> Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
>>> isolate_migratepages_block()") converts api from page to folio. But the
>>> low_pfn advance for hugetlb page seems wrong when low_pfn doesn't point
>>> to head page.
>>>
>>> Originally, if page is a hugetlb tail page, compound_nr() return 1,
>>> which means low_pfn only advance one in next iteration. After the
>>> change, low_pfn would advance more than the hugetlb range, since
>>> folio_nr_pages() always return total number of the large page. This
>>> results in skipping some range to isolate and then to migrate.
>>>
>>> The worst case for alloc_contig is it does all the isolation and
>>> migration, but finally find some range is still not isolated. And then
>>> undo all the work and try a new range.
>>>
>>> Advance low_pfn to the end of hugetlb.
>>>
>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>> Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
>>> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
>>> Cc: Oscar Salvador <osalvador@suse.de>
>>
>> Forgot to cc stable.
>>
>> Cc: <stable@vger.kernel.org>
>
>Is there any bug report to justify the backport? Since it is more likely
>to be a performance issue instead of a correctness issue.
>
OK, I thought Cc: stable was always paired with a Fixes tag.
If not, please drop it.
>>
>>> ---
>>> mm/compaction.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>> index bf021b31c7ec..1e8f8eca318c 100644
>>> --- a/mm/compaction.c
>>> +++ b/mm/compaction.c
>>> @@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>> * Hugepage was successfully isolated and placed
>>> * on the cc->migratepages list.
>>> */
>>> - low_pfn += folio_nr_pages(folio) - 1;
>>> + low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
>>
>> One question is why we advance compound_nr() in original version.
>>
>> Yes, there are several places advancing compound_nr(), but it seems to iterate
>> on the same large page and do the same thing and advance 1 again.
>>
>> Not sure which part story I missed.
>
>isolate_migratepages_block() starts from the beginning of a pageblock.
>How likely the code hit in the middle of a hugetlb?
>
OK, so this is a kind of optimization based on the assumption that the
page is not likely to be a tail page?
--
Wei Yang
Help you, Help me
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-11 1:50 ` Zi Yan
@ 2025-09-11 3:34 ` Wei Yang
2025-09-11 6:30 ` Wei Yang
From: Wei Yang @ 2025-09-11 3:34 UTC (permalink / raw)
To: Zi Yan
Cc: Wei Yang, akpm, vbabka, surenb, mhocko, jackmanb, hannes,
wangkefeng.wang, linux-mm, Oscar Salvador
On Wed, Sep 10, 2025 at 09:50:09PM -0400, Zi Yan wrote:
>On 10 Sep 2025, at 21:38, Zi Yan wrote:
>
>> On 10 Sep 2025, at 21:35, Zi Yan wrote:
>>
>>> On 10 Sep 2025, at 21:25, Wei Yang wrote:
>>>
>>>> On Wed, Sep 10, 2025 at 09:22:40AM +0000, Wei Yang wrote:
>>>>> Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
>>>>> isolate_migratepages_block()") converts api from page to folio. But the
>>>>> low_pfn advance for hugetlb page seems wrong when low_pfn doesn't point
>>>>> to head page.
>>>>>
>>>>> Originally, if page is a hugetlb tail page, compound_nr() return 1,
>>>>> which means low_pfn only advance one in next iteration. After the
>>>>> change, low_pfn would advance more than the hugetlb range, since
>>>>> folio_nr_pages() always return total number of the large page. This
>>>>> results in skipping some range to isolate and then to migrate.
>>>>>
>>>>> The worst case for alloc_contig is it does all the isolation and
>>>>> migration, but finally find some range is still not isolated. And then
>>>>> undo all the work and try a new range.
>>>>>
>>>>> Advance low_pfn to the end of hugetlb.
>>>>>
>>>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>>>> Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
>
>This behavior seems to be introduced by commit 369fa227c219 ("mm: make
>alloc_contig_range handle free hugetlb pages”). The related change is:
>
>+ if (PageHuge(page) && cc->alloc_contig) {
>+ ret = isolate_or_dissolve_huge_page(page);
>+
>+ /*
>+ * Fail isolation in case isolate_or_dissolve_huge_page()
>+ * reports an error. In case of -ENOMEM, abort right away.
>+ */
>+ if (ret < 0) {
>+ /* Do not report -EBUSY down the chain */
>+ if (ret == -EBUSY)
>+ ret = 0;
>+ low_pfn += (1UL << compound_order(page)) - 1;
compound_order(page) returns 1 for a tail page.
See below.
>+ goto isolate_fail;
>+ }
>+
>+ /*
>+ * Ok, the hugepage was dissolved. Now these pages are
>+ * Buddy and cannot be re-allocated because they are
>+ * isolated. Fall-through as the check below handles
>+ * Buddy pages.
>+ */
>+ }
>+
>
>>>>> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
>>>>> Cc: Oscar Salvador <osalvador@suse.de>
>>>>
>>>> Forgot to cc stable.
>>>>
>>>> Cc: <stable@vger.kernel.org>
>>>
>>> Is there any bug report to justify the backport? Since it is more likely
>>> to be a performance issue instead of a correctness issue.
>>>
>>>>
>>>>> ---
>>>>> mm/compaction.c | 2 +-
>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>>>> index bf021b31c7ec..1e8f8eca318c 100644
>>>>> --- a/mm/compaction.c
>>>>> +++ b/mm/compaction.c
>>>>> @@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>>>> * Hugepage was successfully isolated and placed
>>>>> * on the cc->migratepages list.
>>>>> */
>>>>> - low_pfn += folio_nr_pages(folio) - 1;
>>>>> + low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
>>>>
>>>> One question is why we advance compound_nr() in original version.
>>>>
>>>> Yes, there are several places advancing compound_nr(), but it seems to iterate
>>>> on the same large page and do the same thing and advance 1 again.
>>>>
>>>> Not sure which part story I missed.
>>>
>>> isolate_migratepages_block() starts from the beginning of a pageblock.
>>> How likely the code hit in the middle of a hugetlb?
>>>
>>
>> In addition, there are two other “low_pfn += (1UL << order) - 1”
>> in the if (PageHuge(page)), why not change them too if you think
>> page can point to the middle of a hugetlb?
>>
The order here is obtained from compound_order(page), which is 1 for a
tail page. So it looks right. Maybe I misunderstood it?
>> Best Regards,
>> Yan, Zi
>
>
>Best Regards,
>Yan, Zi
--
Wei Yang
Help you, Help me
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-11 3:34 ` Wei Yang
@ 2025-09-11 6:30 ` Wei Yang
2025-09-11 16:28 ` Zi Yan
From: Wei Yang @ 2025-09-11 6:30 UTC (permalink / raw)
To: Wei Yang
Cc: Zi Yan, akpm, vbabka, surenb, mhocko, jackmanb, hannes,
wangkefeng.wang, linux-mm, Oscar Salvador
On Thu, Sep 11, 2025 at 03:34:13AM +0000, Wei Yang wrote:
>On Wed, Sep 10, 2025 at 09:50:09PM -0400, Zi Yan wrote:
>>On 10 Sep 2025, at 21:38, Zi Yan wrote:
>>
>>> On 10 Sep 2025, at 21:35, Zi Yan wrote:
>>>
>>>> On 10 Sep 2025, at 21:25, Wei Yang wrote:
>>>>
>>>>> On Wed, Sep 10, 2025 at 09:22:40AM +0000, Wei Yang wrote:
>>>>>> Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
>>>>>> isolate_migratepages_block()") converts api from page to folio. But the
>>>>>> low_pfn advance for hugetlb page seems wrong when low_pfn doesn't point
>>>>>> to head page.
>>>>>>
>>>>>> Originally, if page is a hugetlb tail page, compound_nr() return 1,
>>>>>> which means low_pfn only advance one in next iteration. After the
>>>>>> change, low_pfn would advance more than the hugetlb range, since
>>>>>> folio_nr_pages() always return total number of the large page. This
>>>>>> results in skipping some range to isolate and then to migrate.
>>>>>>
>>>>>> The worst case for alloc_contig is it does all the isolation and
>>>>>> migration, but finally find some range is still not isolated. And then
>>>>>> undo all the work and try a new range.
>>>>>>
>>>>>> Advance low_pfn to the end of hugetlb.
>>>>>>
>>>>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>>>>> Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
>>
>>This behavior seems to be introduced by commit 369fa227c219 ("mm: make
>>alloc_contig_range handle free hugetlb pages”). The related change is:
>>
>>+ if (PageHuge(page) && cc->alloc_contig) {
>>+ ret = isolate_or_dissolve_huge_page(page);
>>+
>>+ /*
>>+ * Fail isolation in case isolate_or_dissolve_huge_page()
>>+ * reports an error. In case of -ENOMEM, abort right away.
>>+ */
>>+ if (ret < 0) {
>>+ /* Do not report -EBUSY down the chain */
>>+ if (ret == -EBUSY)
>>+ ret = 0;
>>+ low_pfn += (1UL << compound_order(page)) - 1;
>
>The compound_order(page) return 1 for a tail page.
>
>See below.
>
>>+ goto isolate_fail;
>>+ }
>>+
>>+ /*
>>+ * Ok, the hugepage was dissolved. Now these pages are
>>+ * Buddy and cannot be re-allocated because they are
>>+ * isolated. Fall-through as the check below handles
>>+ * Buddy pages.
>>+ */
>>+ }
>>+
>>
>>>>>> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
>>>>>> Cc: Oscar Salvador <osalvador@suse.de>
>>>>>
>>>>> Forgot to cc stable.
>>>>>
>>>>> Cc: <stable@vger.kernel.org>
>>>>
>>>> Is there any bug report to justify the backport? Since it is more likely
>>>> to be a performance issue instead of a correctness issue.
>>>>
>>>>>
>>>>>> ---
>>>>>> mm/compaction.c | 2 +-
>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>>>>> index bf021b31c7ec..1e8f8eca318c 100644
>>>>>> --- a/mm/compaction.c
>>>>>> +++ b/mm/compaction.c
>>>>>> @@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>>>>> * Hugepage was successfully isolated and placed
>>>>>> * on the cc->migratepages list.
>>>>>> */
>>>>>> - low_pfn += folio_nr_pages(folio) - 1;
>>>>>> + low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
>>>>>
>>>>> One question is why we advance compound_nr() in original version.
>>>>>
>>>>> Yes, there are several places advancing compound_nr(), but it seems to iterate
>>>>> on the same large page and do the same thing and advance 1 again.
>>>>>
>>>>> Not sure which part story I missed.
>>>>
>>>> isolate_migratepages_block() starts from the beginning of a pageblock.
>>>> How likely the code hit in the middle of a hugetlb?
>>>>
>>>
>>> In addition, there are two other “low_pfn += (1UL << order) - 1”
>>> in the if (PageHuge(page)), why not change them too if you think
>>> page can point to the middle of a hugetlb?
>>>
>
>The order here is get from compound_order(page), which is 1 for a tail page.
>
>So it looks right. Maybe I misunderstand it?
Oops, compound_order(page) returns 0 for a tail page.
What I wanted to say is that low_pfn advances by 1 for a tail page. Sorry
for the misleading statement.
--
Wei Yang
Help you, Help me
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-11 3:27 ` Wei Yang
@ 2025-09-11 16:19 ` Zi Yan
2025-09-11 17:27 ` Vishal Moola (Oracle)
From: Zi Yan @ 2025-09-11 16:19 UTC (permalink / raw)
To: Wei Yang
Cc: akpm, vbabka, surenb, mhocko, jackmanb, hannes, wangkefeng.wang,
linux-mm, Oscar Salvador
On 10 Sep 2025, at 23:27, Wei Yang wrote:
> On Wed, Sep 10, 2025 at 09:35:53PM -0400, Zi Yan wrote:
>> On 10 Sep 2025, at 21:25, Wei Yang wrote:
>>
>>> On Wed, Sep 10, 2025 at 09:22:40AM +0000, Wei Yang wrote:
>>>> Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
>>>> isolate_migratepages_block()") converts api from page to folio. But the
>>>> low_pfn advance for hugetlb page seems wrong when low_pfn doesn't point
>>>> to head page.
>>>>
>>>> Originally, if page is a hugetlb tail page, compound_nr() return 1,
>>>> which means low_pfn only advance one in next iteration. After the
>>>> change, low_pfn would advance more than the hugetlb range, since
>>>> folio_nr_pages() always return total number of the large page. This
>>>> results in skipping some range to isolate and then to migrate.
>>>>
>>>> The worst case for alloc_contig is it does all the isolation and
>>>> migration, but finally find some range is still not isolated. And then
>>>> undo all the work and try a new range.
>>>>
>>>> Advance low_pfn to the end of hugetlb.
>>>>
>>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>>> Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
>>>> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
>>>> Cc: Oscar Salvador <osalvador@suse.de>
>>>
>>> Forgot to cc stable.
>>>
>>> Cc: <stable@vger.kernel.org>
>>
>> Is there any bug report to justify the backport? Since it is more likely
>> to be a performance issue instead of a correctness issue.
>>
>
> OK, I thought cc-stable is paired with fixes tag.
>
> If not, please drop it.
>
>>>
>>>> ---
>>>> mm/compaction.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>>> index bf021b31c7ec..1e8f8eca318c 100644
>>>> --- a/mm/compaction.c
>>>> +++ b/mm/compaction.c
>>>> @@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>>> * Hugepage was successfully isolated and placed
>>>> * on the cc->migratepages list.
>>>> */
>>>> - low_pfn += folio_nr_pages(folio) - 1;
>>>> + low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
>>>
>>> One question is why we advance compound_nr() in original version.
>>>
>>> Yes, there are several places advancing compound_nr(), but it seems to iterate
>>> on the same large page and do the same thing and advance 1 again.
>>>
>>> Not sure which part story I missed.
>>
>> isolate_migratepages_block() starts from the beginning of a pageblock.
>> How likely the code hit in the middle of a hugetlb?
>>
>
> OK, this is a kind of optimization based on the knowledge it is not likely to
> be a tail page?
No, it might be that most of the time the page is the head page, or people
assume so.
--
Best Regards,
Yan, Zi
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-11 6:30 ` Wei Yang
@ 2025-09-11 16:28 ` Zi Yan
2025-09-12 0:28 ` Wei Yang
From: Zi Yan @ 2025-09-11 16:28 UTC (permalink / raw)
To: Wei Yang
Cc: akpm, vbabka, surenb, mhocko, jackmanb, hannes, wangkefeng.wang,
linux-mm, Oscar Salvador
On 11 Sep 2025, at 2:30, Wei Yang wrote:
> On Thu, Sep 11, 2025 at 03:34:13AM +0000, Wei Yang wrote:
>> On Wed, Sep 10, 2025 at 09:50:09PM -0400, Zi Yan wrote:
>>> On 10 Sep 2025, at 21:38, Zi Yan wrote:
>>>
>>>> On 10 Sep 2025, at 21:35, Zi Yan wrote:
>>>>
>>>>> On 10 Sep 2025, at 21:25, Wei Yang wrote:
>>>>>
>>>>>> On Wed, Sep 10, 2025 at 09:22:40AM +0000, Wei Yang wrote:
>>>>>>> Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
>>>>>>> isolate_migratepages_block()") converts api from page to folio. But the
>>>>>>> low_pfn advance for hugetlb page seems wrong when low_pfn doesn't point
>>>>>>> to head page.
>>>>>>>
>>>>>>> Originally, if page is a hugetlb tail page, compound_nr() return 1,
>>>>>>> which means low_pfn only advance one in next iteration. After the
>>>>>>> change, low_pfn would advance more than the hugetlb range, since
>>>>>>> folio_nr_pages() always return total number of the large page. This
>>>>>>> results in skipping some range to isolate and then to migrate.
>>>>>>>
>>>>>>> The worst case for alloc_contig is it does all the isolation and
>>>>>>> migration, but finally find some range is still not isolated. And then
>>>>>>> undo all the work and try a new range.
>>>>>>>
>>>>>>> Advance low_pfn to the end of hugetlb.
>>>>>>>
>>>>>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>>>>>> Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
>>>
>>> This behavior seems to be introduced by commit 369fa227c219 ("mm: make
>>> alloc_contig_range handle free hugetlb pages”). The related change is:
>>>
>>> + if (PageHuge(page) && cc->alloc_contig) {
>>> + ret = isolate_or_dissolve_huge_page(page);
>>> +
>>> + /*
>>> + * Fail isolation in case isolate_or_dissolve_huge_page()
>>> + * reports an error. In case of -ENOMEM, abort right away.
>>> + */
>>> + if (ret < 0) {
>>> + /* Do not report -EBUSY down the chain */
>>> + if (ret == -EBUSY)
>>> + ret = 0;
>>> + low_pfn += (1UL << compound_order(page)) - 1;
>>
>> The compound_order(page) return 1 for a tail page.
>>
>> See below.
>>
>>> + goto isolate_fail;
>>> + }
>>> +
>>> + /*
>>> + * Ok, the hugepage was dissolved. Now these pages are
>>> + * Buddy and cannot be re-allocated because they are
>>> + * isolated. Fall-through as the check below handles
>>> + * Buddy pages.
>>> + */
>>> + }
>>> +
>>>
>>>>>>> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
>>>>>>> Cc: Oscar Salvador <osalvador@suse.de>
>>>>>>
>>>>>> Forgot to cc stable.
>>>>>>
>>>>>> Cc: <stable@vger.kernel.org>
>>>>>
>>>>> Is there any bug report to justify the backport? Since it is more likely
>>>>> to be a performance issue instead of a correctness issue.
>>>>>
>>>>>>
>>>>>>> ---
>>>>>>> mm/compaction.c | 2 +-
>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>>>>>> index bf021b31c7ec..1e8f8eca318c 100644
>>>>>>> --- a/mm/compaction.c
>>>>>>> +++ b/mm/compaction.c
>>>>>>> @@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>>>>>> * Hugepage was successfully isolated and placed
>>>>>>> * on the cc->migratepages list.
>>>>>>> */
>>>>>>> - low_pfn += folio_nr_pages(folio) - 1;
>>>>>>> + low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
>>>>>>
>>>>>> One question is why we advance compound_nr() in original version.
>>>>>>
>>>>>> Yes, there are several places advancing compound_nr(), but it seems to iterate
>>>>>> on the same large page and do the same thing and advance 1 again.
>>>>>>
>>>>>> Not sure which part story I missed.
>>>>>
>>>>> isolate_migratepages_block() starts from the beginning of a pageblock.
>>>>> How likely the code hit in the middle of a hugetlb?
>>>>>
>>>>
>>>> In addition, there are two other “low_pfn += (1UL << order) - 1”
>>>> in the if (PageHuge(page)), why not change them too if you think
>>>> page can point to the middle of a hugetlb?
>>>>
>>
>> The order here is get from compound_order(page), which is 1 for a tail page.
>>
>> So it looks right. Maybe I misunderstand it?
>
> Oops, compound_order(page) returns 0 for tail page.
>
> What I want to say is low_pfn advance by 1 for tail page. Sorry for the
> misleading.
OK, that sounds inefficient and inconsistent with your fix.
While at it, can you also change the two "low_pfn += (1UL << order) - 1" sites
to skip the rest of the hugetlb?
--
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-11 16:19 ` Zi Yan
@ 2025-09-11 17:27 ` Vishal Moola (Oracle)
2025-09-12 1:07 ` Wei Yang
0 siblings, 1 reply; 22+ messages in thread
From: Vishal Moola (Oracle) @ 2025-09-11 17:27 UTC (permalink / raw)
To: Zi Yan
Cc: Wei Yang, akpm, vbabka, surenb, mhocko, jackmanb, hannes,
wangkefeng.wang, linux-mm, Oscar Salvador
On Thu, Sep 11, 2025 at 12:19:34PM -0400, Zi Yan wrote:
> On 10 Sep 2025, at 23:27, Wei Yang wrote:
>
> > On Wed, Sep 10, 2025 at 09:35:53PM -0400, Zi Yan wrote:
> >> On 10 Sep 2025, at 21:25, Wei Yang wrote:
> >>
> >>> On Wed, Sep 10, 2025 at 09:22:40AM +0000, Wei Yang wrote:
> >>>> Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
> >>>> isolate_migratepages_block()") converts api from page to folio. But the
> >>>> low_pfn advance for hugetlb page seems wrong when low_pfn doesn't point
> >>>> to head page.
> >>>>
> >>>> Originally, if page is a hugetlb tail page, compound_nr() return 1,
> >>>> which means low_pfn only advance one in next iteration. After the
> >>>> change, low_pfn would advance more than the hugetlb range, since
> >>>> folio_nr_pages() always return total number of the large page. This
> >>>> results in skipping some range to isolate and then to migrate.
> >>>>
> >>>> The worst case for alloc_contig is it does all the isolation and
> >>>> migration, but finally find some range is still not isolated. And then
> >>>> undo all the work and try a new range.
> >>>>
> >>>> Advance low_pfn to the end of hugetlb.
> >>>>
> >>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> >>>> Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
> >>>> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
> >>>> Cc: Oscar Salvador <osalvador@suse.de>
> >>>
> >>> Forgot to cc stable.
> >>>
> >>> Cc: <stable@vger.kernel.org>
> >>
> >> Is there any bug report to justify the backport? Since it is more likely
> >> to be a performance issue instead of a correctness issue.
> >>
> >
> > OK, I thought cc-stable is paired with fixes tag.
> >
> > If not, please drop it.
> >
> >>>
> >>>> ---
> >>>> mm/compaction.c | 2 +-
> >>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/mm/compaction.c b/mm/compaction.c
> >>>> index bf021b31c7ec..1e8f8eca318c 100644
> >>>> --- a/mm/compaction.c
> >>>> +++ b/mm/compaction.c
> >>>> @@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> >>>> * Hugepage was successfully isolated and placed
> >>>> * on the cc->migratepages list.
> >>>> */
> >>>> - low_pfn += folio_nr_pages(folio) - 1;
> >>>> + low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
> >>>
> >>> One question is why we advance compound_nr() in original version.
> >>>
> >>> Yes, there are several places advancing compound_nr(), but it seems to iterate
> >>> on the same large page and do the same thing and advance 1 again.
> >>>
> >>> Not sure which part story I missed.
> >>
> >> isolate_migratepages_block() starts from the beginning of a pageblock.
> >> How likely the code hit in the middle of a hugetlb?
> >>
> >
> > OK, this is a kind of optimization based on the knowledge it is not likely to
> > be a tail page?
>
> No, it might be that most of the time page is the head, or people assume so.
For compound pages, we will always have tail pfn < head pfn, so we should
always find the head page first.
If you did find a case where we somehow encounter a tail page here, I'd
love to see it. And then you'd also want to make sure the other compaction
trackers are appropriately accounted for.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-11 16:28 ` Zi Yan
@ 2025-09-12 0:28 ` Wei Yang
2025-09-12 1:13 ` Zi Yan
0 siblings, 1 reply; 22+ messages in thread
From: Wei Yang @ 2025-09-12 0:28 UTC (permalink / raw)
To: Zi Yan
Cc: Wei Yang, akpm, vbabka, surenb, mhocko, jackmanb, hannes,
wangkefeng.wang, linux-mm, Oscar Salvador
On Thu, Sep 11, 2025 at 12:28:24PM -0400, Zi Yan wrote:
>On 11 Sep 2025, at 2:30, Wei Yang wrote:
>
>> On Thu, Sep 11, 2025 at 03:34:13AM +0000, Wei Yang wrote:
>>> On Wed, Sep 10, 2025 at 09:50:09PM -0400, Zi Yan wrote:
>>>> On 10 Sep 2025, at 21:38, Zi Yan wrote:
>>>>
>>>>> On 10 Sep 2025, at 21:35, Zi Yan wrote:
>>>>>
>>>>>> On 10 Sep 2025, at 21:25, Wei Yang wrote:
>>>>>>
>>>>>>> On Wed, Sep 10, 2025 at 09:22:40AM +0000, Wei Yang wrote:
>>>>>>>> Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
>>>>>>>> isolate_migratepages_block()") converts api from page to folio. But the
>>>>>>>> low_pfn advance for hugetlb page seems wrong when low_pfn doesn't point
>>>>>>>> to head page.
>>>>>>>>
>>>>>>>> Originally, if page is a hugetlb tail page, compound_nr() return 1,
>>>>>>>> which means low_pfn only advance one in next iteration. After the
>>>>>>>> change, low_pfn would advance more than the hugetlb range, since
>>>>>>>> folio_nr_pages() always return total number of the large page. This
>>>>>>>> results in skipping some range to isolate and then to migrate.
>>>>>>>>
>>>>>>>> The worst case for alloc_contig is it does all the isolation and
>>>>>>>> migration, but finally find some range is still not isolated. And then
>>>>>>>> undo all the work and try a new range.
>>>>>>>>
>>>>>>>> Advance low_pfn to the end of hugetlb.
>>>>>>>>
>>>>>>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>>>>>>> Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
>>>>
>>>> This behavior seems to be introduced by commit 369fa227c219 ("mm: make
>>>> alloc_contig_range handle free hugetlb pages”). The related change is:
>>>>
>>>> + if (PageHuge(page) && cc->alloc_contig) {
>>>> + ret = isolate_or_dissolve_huge_page(page);
>>>> +
>>>> + /*
>>>> + * Fail isolation in case isolate_or_dissolve_huge_page()
>>>> + * reports an error. In case of -ENOMEM, abort right away.
>>>> + */
>>>> + if (ret < 0) {
>>>> + /* Do not report -EBUSY down the chain */
>>>> + if (ret == -EBUSY)
>>>> + ret = 0;
>>>> + low_pfn += (1UL << compound_order(page)) - 1;
>>>
>>> The compound_order(page) return 1 for a tail page.
>>>
>>> See below.
>>>
>>>> + goto isolate_fail;
>>>> + }
>>>> +
>>>> + /*
>>>> + * Ok, the hugepage was dissolved. Now these pages are
>>>> + * Buddy and cannot be re-allocated because they are
>>>> + * isolated. Fall-through as the check below handles
>>>> + * Buddy pages.
>>>> + */
>>>> + }
>>>> +
>>>>
>>>>>>>> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
>>>>>>>> Cc: Oscar Salvador <osalvador@suse.de>
>>>>>>>
>>>>>>> Forgot to cc stable.
>>>>>>>
>>>>>>> Cc: <stable@vger.kernel.org>
>>>>>>
>>>>>> Is there any bug report to justify the backport? Since it is more likely
>>>>>> to be a performance issue instead of a correctness issue.
>>>>>>
>>>>>>>
>>>>>>>> ---
>>>>>>>> mm/compaction.c | 2 +-
>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>>>>>>> index bf021b31c7ec..1e8f8eca318c 100644
>>>>>>>> --- a/mm/compaction.c
>>>>>>>> +++ b/mm/compaction.c
>>>>>>>> @@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>>>>>>> * Hugepage was successfully isolated and placed
>>>>>>>> * on the cc->migratepages list.
>>>>>>>> */
>>>>>>>> - low_pfn += folio_nr_pages(folio) - 1;
>>>>>>>> + low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
>>>>>>>
>>>>>>> One question is why we advance compound_nr() in original version.
>>>>>>>
>>>>>>> Yes, there are several places advancing compound_nr(), but it seems to iterate
>>>>>>> on the same large page and do the same thing and advance 1 again.
>>>>>>>
>>>>>>> Not sure which part story I missed.
>>>>>>
>>>>>> isolate_migratepages_block() starts from the beginning of a pageblock.
>>>>>> How likely the code hit in the middle of a hugetlb?
>>>>>>
>>>>>
>>>>> In addition, there are two other “low_pfn += (1UL << order) - 1”
>>>>> in the if (PageHuge(page)), why not change them too if you think
>>>>> page can point to the middle of a hugetlb?
>>>>>
>>>
>>> The order here is get from compound_order(page), which is 1 for a tail page.
>>>
>>> So it looks right. Maybe I misunderstand it?
>>
>> Oops, compound_order(page) returns 0 for tail page.
>>
>> What I want to say is low_pfn advance by 1 for tail page. Sorry for the
>> misleading.
>
>OK, that sounds inefficient and inconsistent with your fix.
>
>While at it, can you also change two “low_pfn += (1UL << order) - 1” to skip
>the rest of hugetlb?
>
Sure, glad to.
Do you prefer the fix in one patch, or as a separate one?
>--
>Best Regards,
>Yan, Zi
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-11 17:27 ` Vishal Moola (Oracle)
@ 2025-09-12 1:07 ` Wei Yang
2025-09-12 1:29 ` Zi Yan
0 siblings, 1 reply; 22+ messages in thread
From: Wei Yang @ 2025-09-12 1:07 UTC (permalink / raw)
To: Vishal Moola (Oracle)
Cc: Zi Yan, Wei Yang, akpm, vbabka, surenb, mhocko, jackmanb, hannes,
wangkefeng.wang, linux-mm, Oscar Salvador
On Thu, Sep 11, 2025 at 10:27:08AM -0700, Vishal Moola (Oracle) wrote:
>On Thu, Sep 11, 2025 at 12:19:34PM -0400, Zi Yan wrote:
>> On 10 Sep 2025, at 23:27, Wei Yang wrote:
>>
>> > On Wed, Sep 10, 2025 at 09:35:53PM -0400, Zi Yan wrote:
>> >> On 10 Sep 2025, at 21:25, Wei Yang wrote:
>> >>
>> >>> On Wed, Sep 10, 2025 at 09:22:40AM +0000, Wei Yang wrote:
>> >>>> Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
>> >>>> isolate_migratepages_block()") converts api from page to folio. But the
>> >>>> low_pfn advance for hugetlb page seems wrong when low_pfn doesn't point
>> >>>> to head page.
>> >>>>
>> >>>> Originally, if page is a hugetlb tail page, compound_nr() return 1,
>> >>>> which means low_pfn only advance one in next iteration. After the
>> >>>> change, low_pfn would advance more than the hugetlb range, since
>> >>>> folio_nr_pages() always return total number of the large page. This
>> >>>> results in skipping some range to isolate and then to migrate.
>> >>>>
>> >>>> The worst case for alloc_contig is it does all the isolation and
>> >>>> migration, but finally find some range is still not isolated. And then
>> >>>> undo all the work and try a new range.
>> >>>>
>> >>>> Advance low_pfn to the end of hugetlb.
>> >>>>
>> >>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> >>>> Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
>> >>>> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
>> >>>> Cc: Oscar Salvador <osalvador@suse.de>
>> >>>
>> >>> Forgot to cc stable.
>> >>>
>> >>> Cc: <stable@vger.kernel.org>
>> >>
>> >> Is there any bug report to justify the backport? Since it is more likely
>> >> to be a performance issue instead of a correctness issue.
>> >>
>> >
>> > OK, I thought cc-stable is paired with fixes tag.
>> >
>> > If not, please drop it.
>> >
>> >>>
>> >>>> ---
>> >>>> mm/compaction.c | 2 +-
>> >>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> >>>>
>> >>>> diff --git a/mm/compaction.c b/mm/compaction.c
>> >>>> index bf021b31c7ec..1e8f8eca318c 100644
>> >>>> --- a/mm/compaction.c
>> >>>> +++ b/mm/compaction.c
>> >>>> @@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>> >>>> * Hugepage was successfully isolated and placed
>> >>>> * on the cc->migratepages list.
>> >>>> */
>> >>>> - low_pfn += folio_nr_pages(folio) - 1;
>> >>>> + low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
>> >>>
>> >>> One question is why we advance compound_nr() in original version.
>> >>>
>> >>> Yes, there are several places advancing compound_nr(), but it seems to iterate
>> >>> on the same large page and do the same thing and advance 1 again.
>> >>>
>> >>> Not sure which part story I missed.
>> >>
>> >> isolate_migratepages_block() starts from the beginning of a pageblock.
>> >> How likely the code hit in the middle of a hugetlb?
>> >>
>> >
>> > OK, this is a kind of optimization based on the knowledge it is not likely to
>> > be a tail page?
>>
>> No, it might be that most of the time page is the head, or people assume so.
>
>For compound pages, we will always have tail pfn < head pfn, so we should
>always find the head page first.
>
I think you want to say tail pfn > head pfn?
>If you did find a case where we somehow encounter a tail page here, I'd
>love to see it. And then you'd also want to make sure the other compaction
>trackers are appropriately accounted for.
I may not be following you here; below is the call flow for
isolate_migratepages_block() when invoked during __alloc_contig_pages().
__alloc_contig_pages(nr_pages, ..);
start = ALIGN(zone->zone_start_pfn, nr_pages);
alloc_contig_range_noprof(start, ..);
__alloc_contig_migrate_range(.., start, ..);
pfn = start;
isolate_migratepages_range(.., pfn, ..);
isolate_migratepages_block(.., pfn, ..);
page = pfn_to_page(pfn);
start += nr_pages;
In the loop of __alloc_contig_pages(), it iterates over each nr_pages range. And
nr_pages could seemingly be any positive number, so the first pfn checked by
isolate_migratepages_block() may not be aligned to the page order, or its
alignment may be smaller than MAX_PAGE_ORDER. This means it could be a tail
page, per my understanding.
Maybe I missed some point here?
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-12 0:28 ` Wei Yang
@ 2025-09-12 1:13 ` Zi Yan
2025-09-12 6:00 ` Baolin Wang
0 siblings, 1 reply; 22+ messages in thread
From: Zi Yan @ 2025-09-12 1:13 UTC (permalink / raw)
To: Wei Yang
Cc: akpm, vbabka, surenb, mhocko, jackmanb, hannes, wangkefeng.wang,
linux-mm, Oscar Salvador
On 11 Sep 2025, at 20:28, Wei Yang wrote:
> On Thu, Sep 11, 2025 at 12:28:24PM -0400, Zi Yan wrote:
>> On 11 Sep 2025, at 2:30, Wei Yang wrote:
>>
>>> On Thu, Sep 11, 2025 at 03:34:13AM +0000, Wei Yang wrote:
>>>> On Wed, Sep 10, 2025 at 09:50:09PM -0400, Zi Yan wrote:
>>>>> On 10 Sep 2025, at 21:38, Zi Yan wrote:
>>>>>
>>>>>> On 10 Sep 2025, at 21:35, Zi Yan wrote:
>>>>>>
>>>>>>> On 10 Sep 2025, at 21:25, Wei Yang wrote:
>>>>>>>
>>>>>>>> On Wed, Sep 10, 2025 at 09:22:40AM +0000, Wei Yang wrote:
>>>>>>>>> Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
>>>>>>>>> isolate_migratepages_block()") converts api from page to folio. But the
>>>>>>>>> low_pfn advance for hugetlb page seems wrong when low_pfn doesn't point
>>>>>>>>> to head page.
>>>>>>>>>
>>>>>>>>> Originally, if page is a hugetlb tail page, compound_nr() return 1,
>>>>>>>>> which means low_pfn only advance one in next iteration. After the
>>>>>>>>> change, low_pfn would advance more than the hugetlb range, since
>>>>>>>>> folio_nr_pages() always return total number of the large page. This
>>>>>>>>> results in skipping some range to isolate and then to migrate.
>>>>>>>>>
>>>>>>>>> The worst case for alloc_contig is it does all the isolation and
>>>>>>>>> migration, but finally find some range is still not isolated. And then
>>>>>>>>> undo all the work and try a new range.
>>>>>>>>>
>>>>>>>>> Advance low_pfn to the end of hugetlb.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>>>>>>>> Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
>>>>>
>>>>> This behavior seems to be introduced by commit 369fa227c219 ("mm: make
>>>>> alloc_contig_range handle free hugetlb pages”). The related change is:
>>>>>
>>>>> + if (PageHuge(page) && cc->alloc_contig) {
>>>>> + ret = isolate_or_dissolve_huge_page(page);
>>>>> +
>>>>> + /*
>>>>> + * Fail isolation in case isolate_or_dissolve_huge_page()
>>>>> + * reports an error. In case of -ENOMEM, abort right away.
>>>>> + */
>>>>> + if (ret < 0) {
>>>>> + /* Do not report -EBUSY down the chain */
>>>>> + if (ret == -EBUSY)
>>>>> + ret = 0;
>>>>> + low_pfn += (1UL << compound_order(page)) - 1;
>>>>
>>>> The compound_order(page) return 1 for a tail page.
>>>>
>>>> See below.
>>>>
>>>>> + goto isolate_fail;
>>>>> + }
>>>>> +
>>>>> + /*
>>>>> + * Ok, the hugepage was dissolved. Now these pages are
>>>>> + * Buddy and cannot be re-allocated because they are
>>>>> + * isolated. Fall-through as the check below handles
>>>>> + * Buddy pages.
>>>>> + */
>>>>> + }
>>>>> +
>>>>>
>>>>>>>>> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
>>>>>>>>> Cc: Oscar Salvador <osalvador@suse.de>
>>>>>>>>
>>>>>>>> Forgot to cc stable.
>>>>>>>>
>>>>>>>> Cc: <stable@vger.kernel.org>
>>>>>>>
>>>>>>> Is there any bug report to justify the backport? Since it is more likely
>>>>>>> to be a performance issue instead of a correctness issue.
>>>>>>>
>>>>>>>>
>>>>>>>>> ---
>>>>>>>>> mm/compaction.c | 2 +-
>>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>>>>>>>> index bf021b31c7ec..1e8f8eca318c 100644
>>>>>>>>> --- a/mm/compaction.c
>>>>>>>>> +++ b/mm/compaction.c
>>>>>>>>> @@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>>>>>>>> * Hugepage was successfully isolated and placed
>>>>>>>>> * on the cc->migratepages list.
>>>>>>>>> */
>>>>>>>>> - low_pfn += folio_nr_pages(folio) - 1;
>>>>>>>>> + low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
>>>>>>>>
>>>>>>>> One question is why we advance compound_nr() in original version.
>>>>>>>>
>>>>>>>> Yes, there are several places advancing compound_nr(), but it seems to iterate
>>>>>>>> on the same large page and do the same thing and advance 1 again.
>>>>>>>>
>>>>>>>> Not sure which part story I missed.
>>>>>>>
>>>>>>> isolate_migratepages_block() starts from the beginning of a pageblock.
>>>>>>> How likely the code hit in the middle of a hugetlb?
>>>>>>>
>>>>>>
>>>>>> In addition, there are two other “low_pfn += (1UL << order) - 1”
>>>>>> in the if (PageHuge(page)), why not change them too if you think
>>>>>> page can point to the middle of a hugetlb?
>>>>>>
>>>>
>>>> The order here is get from compound_order(page), which is 1 for a tail page.
>>>>
>>>> So it looks right. Maybe I misunderstand it?
>>>
>>> Oops, compound_order(page) returns 0 for tail page.
>>>
>>> What I want to say is low_pfn advance by 1 for tail page. Sorry for the
>>> misleading.
>>
>> OK, that sounds inefficient and inconsistent with your fix.
>>
>> While at it, can you also change two “low_pfn += (1UL << order) - 1” to skip
>> the rest of hugetlb?
>>
>
> Sure, glad to.
>
> You prefer do the fix in one patch or have a separate one?
A separate one is better, since these are optimizations, whereas your current
patch is a fix.
--
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-12 1:07 ` Wei Yang
@ 2025-09-12 1:29 ` Zi Yan
2025-09-12 17:22 ` Vishal Moola (Oracle)
2025-09-13 0:10 ` Wei Yang
0 siblings, 2 replies; 22+ messages in thread
From: Zi Yan @ 2025-09-12 1:29 UTC (permalink / raw)
To: Wei Yang
Cc: Vishal Moola (Oracle), akpm, vbabka, surenb, mhocko, jackmanb,
hannes, wangkefeng.wang, linux-mm, Oscar Salvador
On 11 Sep 2025, at 21:07, Wei Yang wrote:
> On Thu, Sep 11, 2025 at 10:27:08AM -0700, Vishal Moola (Oracle) wrote:
>> On Thu, Sep 11, 2025 at 12:19:34PM -0400, Zi Yan wrote:
>>> On 10 Sep 2025, at 23:27, Wei Yang wrote:
>>>
>>>> On Wed, Sep 10, 2025 at 09:35:53PM -0400, Zi Yan wrote:
>>>>> On 10 Sep 2025, at 21:25, Wei Yang wrote:
>>>>>
>>>>>> On Wed, Sep 10, 2025 at 09:22:40AM +0000, Wei Yang wrote:
>>>>>>> Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
>>>>>>> isolate_migratepages_block()") converts api from page to folio. But the
>>>>>>> low_pfn advance for hugetlb page seems wrong when low_pfn doesn't point
>>>>>>> to head page.
>>>>>>>
>>>>>>> Originally, if page is a hugetlb tail page, compound_nr() return 1,
>>>>>>> which means low_pfn only advance one in next iteration. After the
>>>>>>> change, low_pfn would advance more than the hugetlb range, since
>>>>>>> folio_nr_pages() always return total number of the large page. This
>>>>>>> results in skipping some range to isolate and then to migrate.
>>>>>>>
>>>>>>> The worst case for alloc_contig is it does all the isolation and
>>>>>>> migration, but finally find some range is still not isolated. And then
>>>>>>> undo all the work and try a new range.
>>>>>>>
>>>>>>> Advance low_pfn to the end of hugetlb.
>>>>>>>
>>>>>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>>>>>> Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
>>>>>>> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
>>>>>>> Cc: Oscar Salvador <osalvador@suse.de>
>>>>>>
>>>>>> Forgot to cc stable.
>>>>>>
>>>>>> Cc: <stable@vger.kernel.org>
>>>>>
>>>>> Is there any bug report to justify the backport? Since it is more likely
>>>>> to be a performance issue instead of a correctness issue.
>>>>>
>>>>
>>>> OK, I thought cc-stable is paired with fixes tag.
>>>>
>>>> If not, please drop it.
>>>>
>>>>>>
>>>>>>> ---
>>>>>>> mm/compaction.c | 2 +-
>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>>>>>> index bf021b31c7ec..1e8f8eca318c 100644
>>>>>>> --- a/mm/compaction.c
>>>>>>> +++ b/mm/compaction.c
>>>>>>> @@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>>>>>> * Hugepage was successfully isolated and placed
>>>>>>> * on the cc->migratepages list.
>>>>>>> */
>>>>>>> - low_pfn += folio_nr_pages(folio) - 1;
>>>>>>> + low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
>>>>>>
>>>>>> One question is why we advance compound_nr() in original version.
>>>>>>
>>>>>> Yes, there are several places advancing compound_nr(), but it seems to iterate
>>>>>> on the same large page and do the same thing and advance 1 again.
>>>>>>
>>>>>> Not sure which part story I missed.
>>>>>
>>>>> isolate_migratepages_block() starts from the beginning of a pageblock.
>>>>> How likely the code hit in the middle of a hugetlb?
>>>>>
>>>>
>>>> OK, this is a kind of optimization based on the knowledge it is not likely to
>>>> be a tail page?
>>>
>>> No, it might be that most of the time page is the head, or people assume so.
>>
>> For compound pages, we will always have tail pfn < head pfn, so we should
>> always find the head page first.
>>
>
> I think you want to say tail pfn > head pfn?
>
>> If you did find a case where we somehow encounter a tail page here, I'd
>> love to see it. And then you'd also want to make sure the other compaction
>> trackers are appropriately accounted for.
>
> I may not follow you here, below is the call flow for
> isolate_migratepages_block() invoked during __alloc_contig_pages().
>
> __alloc_contig_pages(nr_pages, ..);
> start = ALIGN(zone->zone_start_pfn, nr_pages);
> alloc_contig_range_noprof(start, ..);
> __alloc_contig_migrate_range(.., start, ..);
> pfn = start;
> isolate_migratepages_range(.., pfn, ..);
> isolate_migratepages_block(.., pfn, ..);
> page = pfn_to_page(pfn);
> start += nr_pages;
>
> In the loop of __alloc_contig_pages(), it iterate on each nr_pages range. And
> nr_pages seems could be any positive number, so it looks the first pfn checked
> by isolate_migratepages_block() could be not aligned with page order or less
> than MAX_PAGE_ORDER. This mean it could be a tail page per my understanding.
>
> Maybe I missed some point here?
You are right.
But nr_pages cannot be any positive number, since ALIGN() only accepts a
power of 2 as the alignment. So alloc_contig_pages() might need another
fix to handle the case where nr_pages is not a power of 2.
Oh, after checking pfn_range_valid_contig(), I find that your example does not
apply, since it returns false when any page in the range is PageHuge.
This means that with your example, the PageHuge branch will never be executed.
But alloc_contig_range_noprof() is exported and can be used directly; its
@start input can be any pfn, including one in the middle of a hugetlb.
So your fix is still needed for this case.
--
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-12 1:13 ` Zi Yan
@ 2025-09-12 6:00 ` Baolin Wang
2025-09-13 0:22 ` Zi Yan
0 siblings, 1 reply; 22+ messages in thread
From: Baolin Wang @ 2025-09-12 6:00 UTC (permalink / raw)
To: Zi Yan, Wei Yang
Cc: akpm, vbabka, surenb, mhocko, jackmanb, hannes, wangkefeng.wang,
linux-mm, Oscar Salvador
On 2025/9/12 09:13, Zi Yan wrote:
> On 11 Sep 2025, at 20:28, Wei Yang wrote:
>
>> On Thu, Sep 11, 2025 at 12:28:24PM -0400, Zi Yan wrote:
>>> On 11 Sep 2025, at 2:30, Wei Yang wrote:
>>>
>>>> On Thu, Sep 11, 2025 at 03:34:13AM +0000, Wei Yang wrote:
>>>>> On Wed, Sep 10, 2025 at 09:50:09PM -0400, Zi Yan wrote:
>>>>>> On 10 Sep 2025, at 21:38, Zi Yan wrote:
>>>>>>
>>>>>>> On 10 Sep 2025, at 21:35, Zi Yan wrote:
>>>>>>>
>>>>>>>> On 10 Sep 2025, at 21:25, Wei Yang wrote:
>>>>>>>>
>>>>>>>>> On Wed, Sep 10, 2025 at 09:22:40AM +0000, Wei Yang wrote:
>>>>>>>>>> Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
>>>>>>>>>> isolate_migratepages_block()") converts api from page to folio. But the
>>>>>>>>>> low_pfn advance for hugetlb page seems wrong when low_pfn doesn't point
>>>>>>>>>> to head page.
>>>>>>>>>>
>>>>>>>>>> Originally, if page is a hugetlb tail page, compound_nr() return 1,
>>>>>>>>>> which means low_pfn only advance one in next iteration. After the
>>>>>>>>>> change, low_pfn would advance more than the hugetlb range, since
>>>>>>>>>> folio_nr_pages() always return total number of the large page. This
>>>>>>>>>> results in skipping some range to isolate and then to migrate.
>>>>>>>>>>
>>>>>>>>>> The worst case for alloc_contig is it does all the isolation and
>>>>>>>>>> migration, but finally find some range is still not isolated. And then
>>>>>>>>>> undo all the work and try a new range.
>>>>>>>>>>
>>>>>>>>>> Advance low_pfn to the end of hugetlb.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>>>>>>>>> Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
>>>>>>
>>>>>> This behavior seems to be introduced by commit 369fa227c219 ("mm: make
>>>>>> alloc_contig_range handle free hugetlb pages”). The related change is:
>>>>>>
>>>>>> + if (PageHuge(page) && cc->alloc_contig) {
>>>>>> + ret = isolate_or_dissolve_huge_page(page);
>>>>>> +
>>>>>> + /*
>>>>>> + * Fail isolation in case isolate_or_dissolve_huge_page()
>>>>>> + * reports an error. In case of -ENOMEM, abort right away.
>>>>>> + */
>>>>>> + if (ret < 0) {
>>>>>> + /* Do not report -EBUSY down the chain */
>>>>>> + if (ret == -EBUSY)
>>>>>> + ret = 0;
>>>>>> + low_pfn += (1UL << compound_order(page)) - 1;
>>>>>
>>>>> The compound_order(page) return 1 for a tail page.
>>>>>
>>>>> See below.
>>>>>
>>>>>> + goto isolate_fail;
>>>>>> + }
>>>>>> +
>>>>>> + /*
>>>>>> + * Ok, the hugepage was dissolved. Now these pages are
>>>>>> + * Buddy and cannot be re-allocated because they are
>>>>>> + * isolated. Fall-through as the check below handles
>>>>>> + * Buddy pages.
>>>>>> + */
>>>>>> + }
>>>>>> +
>>>>>>
>>>>>>>>>> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
>>>>>>>>>> Cc: Oscar Salvador <osalvador@suse.de>
>>>>>>>>>
>>>>>>>>> Forgot to cc stable.
>>>>>>>>>
>>>>>>>>> Cc: <stable@vger.kernel.org>
>>>>>>>>
>>>>>>>> Is there any bug report to justify the backport? Since it is more likely
>>>>>>>> to be a performance issue instead of a correctness issue.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>> mm/compaction.c | 2 +-
>>>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>>>>>>>>> index bf021b31c7ec..1e8f8eca318c 100644
>>>>>>>>>> --- a/mm/compaction.c
>>>>>>>>>> +++ b/mm/compaction.c
>>>>>>>>>> @@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>>>>>>>>> * Hugepage was successfully isolated and placed
>>>>>>>>>> * on the cc->migratepages list.
>>>>>>>>>> */
>>>>>>>>>> - low_pfn += folio_nr_pages(folio) - 1;
>>>>>>>>>> + low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
>>>>>>>>>
>>>>>>>>> One question is why we advance compound_nr() in original version.
>>>>>>>>>
>>>>>>>>> Yes, there are several places advancing compound_nr(), but it seems to iterate
>>>>>>>>> on the same large page and do the same thing and advance 1 again.
>>>>>>>>>
>>>>>>>>> Not sure which part story I missed.
>>>>>>>>
>>>>>>>> isolate_migratepages_block() starts from the beginning of a pageblock.
>>>>>>>> How likely the code hit in the middle of a hugetlb?
>>>>>>>>
>>>>>>>
>>>>>>> In addition, there are two other “low_pfn += (1UL << order) - 1”
>>>>>>> in the if (PageHuge(page)), why not change them too if you think
>>>>>>> page can point to the middle of a hugetlb?
>>>>>>>
>>>>>
>>>>> The order here is get from compound_order(page), which is 1 for a tail page.
>>>>>
>>>>> So it looks right. Maybe I misunderstand it?
>>>>
>>>> Oops, compound_order(page) returns 0 for tail page.
>>>>
>>>> What I want to say is low_pfn advance by 1 for tail page. Sorry for the
>>>> misleading.
>>>
>>> OK, that sounds inefficient and inconsistent with your fix.
>>>
>>> While at it, can you also change two “low_pfn += (1UL << order) - 1” to skip
>>> the rest of hugetlb?
>>>
>>
>> Sure, glad to.
>>
>> You prefer do the fix in one patch or have a separate one?
>
> A separate one is better, since these are optimizations, whereas your current
> patch is a fix.
I'm afraid we should not do that, because the order obtained from
compound_order(page) is not stable without lock protection for a
hugetlb. This is why we add the sanity check 'if (order <=
MAX_PAGE_ORDER)' before advancing low_pfn.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-12 1:29 ` Zi Yan
@ 2025-09-12 17:22 ` Vishal Moola (Oracle)
2025-09-13 0:11 ` Wei Yang
2025-09-13 0:10 ` Wei Yang
1 sibling, 1 reply; 22+ messages in thread
From: Vishal Moola (Oracle) @ 2025-09-12 17:22 UTC (permalink / raw)
To: Zi Yan
Cc: Wei Yang, akpm, vbabka, surenb, mhocko, jackmanb, hannes,
wangkefeng.wang, linux-mm, Oscar Salvador
On Thu, Sep 11, 2025 at 09:29:39PM -0400, Zi Yan wrote:
> On 11 Sep 2025, at 21:07, Wei Yang wrote:
>
> > On Thu, Sep 11, 2025 at 10:27:08AM -0700, Vishal Moola (Oracle) wrote:
> >> On Thu, Sep 11, 2025 at 12:19:34PM -0400, Zi Yan wrote:
> >>> On 10 Sep 2025, at 23:27, Wei Yang wrote:
> >>>
> >>>> On Wed, Sep 10, 2025 at 09:35:53PM -0400, Zi Yan wrote:
> >>>>> On 10 Sep 2025, at 21:25, Wei Yang wrote:
> >>>>>
> >>>>>> On Wed, Sep 10, 2025 at 09:22:40AM +0000, Wei Yang wrote:
> >>>>>>> Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
> >>>>>>> isolate_migratepages_block()") converts api from page to folio. But the
> >>>>>>> low_pfn advance for hugetlb page seems wrong when low_pfn doesn't point
> >>>>>>> to head page.
> >>>>>>>
> >>>>>>> Originally, if page is a hugetlb tail page, compound_nr() return 1,
> >>>>>>> which means low_pfn only advance one in next iteration. After the
> >>>>>>> change, low_pfn would advance more than the hugetlb range, since
> >>>>>>> folio_nr_pages() always return total number of the large page. This
> >>>>>>> results in skipping some range to isolate and then to migrate.
> >>>>>>>
> >>>>>>> The worst case for alloc_contig is it does all the isolation and
> >>>>>>> migration, but finally find some range is still not isolated. And then
> >>>>>>> undo all the work and try a new range.
> >>>>>>>
> >>>>>>> Advance low_pfn to the end of hugetlb.
> >>>>>>>
> >>>>>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> >>>>>>> Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
> >>>>>>> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
> >>>>>>> Cc: Oscar Salvador <osalvador@suse.de>
> >>>>>>
> >>>>>> Forgot to cc stable.
> >>>>>>
> >>>>>> Cc: <stable@vger.kernel.org>
> >>>>>
> >>>>> Is there any bug report to justify the backport? Since it is more likely
> >>>>> to be a performance issue instead of a correctness issue.
> >>>>>
> >>>>
> >>>> OK, I thought cc-stable is paired with fixes tag.
> >>>>
> >>>> If not, please drop it.
> >>>>
> >>>>>>
> >>>>>>> ---
> >>>>>>> mm/compaction.c | 2 +-
> >>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>>
> >>>>>>> diff --git a/mm/compaction.c b/mm/compaction.c
> >>>>>>> index bf021b31c7ec..1e8f8eca318c 100644
> >>>>>>> --- a/mm/compaction.c
> >>>>>>> +++ b/mm/compaction.c
> >>>>>>> @@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> >>>>>>> * Hugepage was successfully isolated and placed
> >>>>>>> * on the cc->migratepages list.
> >>>>>>> */
> >>>>>>> - low_pfn += folio_nr_pages(folio) - 1;
> >>>>>>> + low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
> >>>>>>
> >>>>>> One question is why we advance compound_nr() in original version.
> >>>>>>
> >>>>>> Yes, there are several places advancing compound_nr(), but it seems to iterate
> >>>>>> on the same large page and do the same thing and advance 1 again.
> >>>>>>
> >>>>>> Not sure which part story I missed.
> >>>>>
> >>>>> isolate_migratepages_block() starts from the beginning of a pageblock.
> >>>>> How likely the code hit in the middle of a hugetlb?
> >>>>>
> >>>>
> >>>> OK, this is a kind of optimization based on the knowledge it is not likely to
> >>>> be a tail page?
> >>>
> >>> No, it might be that most of the time page is the head, or people assume so.
> >>
> >> For compound pages, we will always have tail pfn < head pfn, so we should
> >> always find the head page first.
> >>
> >
> > I think you want to say tail pfn > head pfn?
> >
> >> If you did find a case where we somehow encounter a tail page here, I'd
> >> love to see it. And then you'd also want to make sure the other compaction
> >> trackers are appropriately accounted for.
> >
> > I may not follow you here, below is the call flow for
> > isolate_migratepages_block() invoked during __alloc_contig_pages().
> >
> > __alloc_contig_pages(nr_pages, ..);
> > start = ALIGN(zone->zone_start_pfn, nr_pages);
> > alloc_contig_range_noprof(start, ..);
> > __alloc_contig_migrate_range(.., start, ..);
> > pfn = start;
> > isolate_migratepages_range(.., pfn, ..);
> > isolate_migratepages_block(.., pfn, ..);
> > page = pfn_to_page(pfn);
> > start += nr_pages;
> >
> > In the loop of __alloc_contig_pages(), it iterate on each nr_pages range. And
> > nr_pages seems could be any positive number, so it looks the first pfn checked
> > by isolate_migratepages_block() could be not aligned with page order or less
> > than MAX_PAGE_ORDER. This mean it could be a tail page per my understanding.
> >
> > Maybe I missed some point here?
That looks right, I missed the comment:
/* Scan block by block. First and last block may be incomplete */
> You are right.
>
> But nr_pages cannot be any positive number, since ALIGN only accepts
> power of 2 as the alignment. So alloc_contig_pages() might need another
> fix to handle the case nr_pages is not power of 2.
>
> Oh, after I checked pfn_range_valid_contig(), I find your example does not
> apply, since it returns false when any page in the range is PageHuge.
> This means with your example, PageHuge branch will never be executed.
> But alloc_contig_range_noprof() is exported and can be used directly,
> the @start input can be any pfn, which can be in the middle of PageHuge.
> So your fix is still needed for this case.
Makes sense to me.
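The overshoot under discussion can be sketched with a small model of the pfn arithmetic (Python; the head pfn and folio size below are illustrative assumptions, not values taken from the thread):

```python
# Toy model of the low_pfn advance in isolate_migratepages_block() for a
# successfully isolated hugetlb folio. Illustrative assumptions: a 2MB
# hugetlb folio (512 x 4KB pages) whose head page is pfn 512.

HEAD_PFN = 512
NR_PAGES = 512                        # folio_nr_pages(folio)
FOLIO_END = HEAD_PFN + NR_PAGES - 1   # last pfn of the folio: 1023

def advance_buggy(low_pfn):
    # After commit 56ae0bb349b4: always skips a full folio's worth of
    # pfns, regardless of where inside the folio low_pfn points.
    return low_pfn + NR_PAGES - 1

def advance_fixed(low_pfn):
    # Proposed fix: subtract folio_page_idx(folio, page) so low_pfn lands
    # on the folio's last pfn; the loop's own increment then moves past it.
    idx = low_pfn - HEAD_PFN          # folio_page_idx(folio, page)
    return low_pfn + NR_PAGES - idx - 1

# Entering at the head page: both variants agree.
assert advance_buggy(HEAD_PFN) == FOLIO_END
assert advance_fixed(HEAD_PFN) == FOLIO_END

# Entering at a tail page (pfn 700): the buggy variant overshoots to
# 1211, silently skipping pfns 1024..1211 that belong to other pages.
assert advance_buggy(700) == 1211
assert advance_fixed(700) == FOLIO_END
```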
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-12 1:29 ` Zi Yan
2025-09-12 17:22 ` Vishal Moola (Oracle)
@ 2025-09-13 0:10 ` Wei Yang
2025-09-23 2:35 ` Andrew Morton
1 sibling, 1 reply; 22+ messages in thread
From: Wei Yang @ 2025-09-13 0:10 UTC (permalink / raw)
To: Zi Yan
Cc: Wei Yang, Vishal Moola (Oracle), akpm, vbabka, surenb, mhocko,
jackmanb, hannes, wangkefeng.wang, linux-mm, Oscar Salvador
On Thu, Sep 11, 2025 at 09:29:39PM -0400, Zi Yan wrote:
[...]
>>>
>>> For compound pages, we will always have tail pfn < head pfn, so we should
>>> always find the head page first.
>>>
>>
>> I think you want to say tail pfn > head pfn?
>>
>>> If you did find a case where we somehow encounter a tail page here, I'd
>>> love to see it. And then you'd also want to make sure the other compaction
>>> trackers are appropriately accounted for.
>>
>> I may not follow you here, below is the call flow for
>> isolate_migratepages_block() invoked during __alloc_contig_pages().
>>
>> __alloc_contig_pages(nr_pages, ..);
>> start = ALIGN(zone->zone_start_pfn, nr_pages);
>> alloc_contig_range_noprof(start, ..);
>> __alloc_contig_migrate_range(.., start, ..);
>> pfn = start;
>> isolate_migratepages_range(.., pfn, ..);
>> isolate_migratepages_block(.., pfn, ..);
>> page = pfn_to_page(pfn);
>> start += nr_pages;
>>
>> In the loop of __alloc_contig_pages(), it iterate on each nr_pages range. And
>> nr_pages seems could be any positive number, so it looks the first pfn checked
>> by isolate_migratepages_block() could be not aligned with page order or less
>> than MAX_PAGE_ORDER. This mean it could be a tail page per my understanding.
>>
>> Maybe I missed some point here?
>
>You are right.
>
>But nr_pages cannot be any positive number, since ALIGN only accepts
>power of 2 as the alignment. So alloc_contig_pages() might need another
>fix to handle the case nr_pages is not power of 2.
>
You are right, I missed the ALIGN() requirement.
So here we should use roundup().
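The distinction can be sketched as follows (a Python model of the two kernel macros; kernel_align() and kernel_roundup() are hypothetical names mirroring ALIGN() and roundup()):

```python
# Hypothetical Python mirrors of the kernel's ALIGN() and roundup()
# macros. ALIGN() uses bit masking and is only valid for power-of-two
# alignments; roundup() uses division and works for any positive step.

def kernel_align(x, a):
    # like ALIGN(x, a) -- 'a' must be a power of two
    return (x + a - 1) & ~(a - 1)

def kernel_roundup(x, y):
    # like roundup(x, y) -- any positive 'y'
    return ((x + y - 1) // y) * y

# Power-of-two step: both agree.
assert kernel_align(1000, 512) == kernel_roundup(1000, 512) == 1024

# Non-power-of-two step: only roundup() gives the intended result.
assert kernel_roundup(1000, 3) == 1002
assert kernel_align(1000, 3) == 1000   # masking silently goes wrong
```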
>Oh, after I checked pfn_range_valid_contig(), I find your example does not
>apply, since it returns false when any page in the range is PageHuge.
You are right, there is some hidden magic here...
>This means with your example, PageHuge branch will never be executed.
>But alloc_contig_range_noprof() is exported and can be used directly,
>the @start input can be any pfn, which can be in the middle of PageHuge.
>So your fix is still needed for this case.
>
Then the next question is why we filter out PageHuge() in
pfn_range_valid_contig(). It looks like we could handle PageHuge() by
isolating and migrating it.
>--
>Best Regards,
>Yan, Zi
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-12 17:22 ` Vishal Moola (Oracle)
@ 2025-09-13 0:11 ` Wei Yang
0 siblings, 0 replies; 22+ messages in thread
From: Wei Yang @ 2025-09-13 0:11 UTC (permalink / raw)
To: Vishal Moola (Oracle)
Cc: Zi Yan, Wei Yang, akpm, vbabka, surenb, mhocko, jackmanb, hannes,
wangkefeng.wang, linux-mm, Oscar Salvador
On Fri, Sep 12, 2025 at 10:22:01AM -0700, Vishal Moola (Oracle) wrote:
[...]
>> >>>>>>> ---
>> >>>>>>> mm/compaction.c | 2 +-
>> >>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> >>>>>>>
>> >>>>>>> diff --git a/mm/compaction.c b/mm/compaction.c
>> >>>>>>> index bf021b31c7ec..1e8f8eca318c 100644
>> >>>>>>> --- a/mm/compaction.c
>> >>>>>>> +++ b/mm/compaction.c
>> >>>>>>> @@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>> >>>>>>> * Hugepage was successfully isolated and placed
>> >>>>>>> * on the cc->migratepages list.
>> >>>>>>> */
>> >>>>>>> - low_pfn += folio_nr_pages(folio) - 1;
>> >>>>>>> + low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
>> >>>>>>
>> >>>>>> One question is why we advance compound_nr() in original version.
>> >>>>>>
>> >>>>>> Yes, there are several places advancing compound_nr(), but it seems to iterate
>> >>>>>> on the same large page and do the same thing and advance 1 again.
>> >>>>>>
>> >>>>>> Not sure which part story I missed.
>> >>>>>
>> >>>>> isolate_migratepages_block() starts from the beginning of a pageblock.
>> >>>>> How likely the code hit in the middle of a hugetlb?
>> >>>>>
>> >>>>
>> >>>> OK, this is a kind of optimization based on the knowledge it is not likely to
>> >>>> be a tail page?
>> >>>
>> >>> No, it might be that most of the time page is the head, or people assume so.
>> >>
>> >> For compound pages, we will always have tail pfn < head pfn, so we should
>> >> always find the head page first.
>> >>
>> >
>> > I think you want to say tail pfn > head pfn?
>> >
>> >> If you did find a case where we somehow encounter a tail page here, I'd
>> >> love to see it. And then you'd also want to make sure the other compaction
>> >> trackers are appropriately accounted for.
>> >
>> > I may not follow you here, below is the call flow for
>> > isolate_migratepages_block() invoked during __alloc_contig_pages().
>> >
>> > __alloc_contig_pages(nr_pages, ..);
>> > start = ALIGN(zone->zone_start_pfn, nr_pages);
>> > alloc_contig_range_noprof(start, ..);
>> > __alloc_contig_migrate_range(.., start, ..);
>> > pfn = start;
>> > isolate_migratepages_range(.., pfn, ..);
>> > isolate_migratepages_block(.., pfn, ..);
>> > page = pfn_to_page(pfn);
>> > start += nr_pages;
>> >
>> > In the loop of __alloc_contig_pages(), it iterate on each nr_pages range. And
>> > nr_pages seems could be any positive number, so it looks the first pfn checked
>> > by isolate_migratepages_block() could be not aligned with page order or less
>> > than MAX_PAGE_ORDER. This mean it could be a tail page per my understanding.
>> >
>> > Maybe I missed some point here?
>
>That looks right, I missed the comment:
> /* Scan block by block. First and last block may be incomplete */
>
>> You are right.
>>
>> But nr_pages cannot be any positive number, since ALIGN only accepts
>> power of 2 as the alignment. So alloc_contig_pages() might need another
>> fix to handle the case nr_pages is not power of 2.
>>
>> Oh, after I checked pfn_range_valid_contig(), I find your example does not
>> apply, since it returns false when any page in the range is PageHuge.
>> This means with your example, PageHuge branch will never be executed.
>> But alloc_contig_range_noprof() is exported and can be used directly,
>> the @start input can be any pfn, which can be in the middle of PageHuge.
>> So your fix is still needed for this case.
>
>Makes sense to me.
Thanks :-)
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-12 6:00 ` Baolin Wang
@ 2025-09-13 0:22 ` Zi Yan
0 siblings, 0 replies; 22+ messages in thread
From: Zi Yan @ 2025-09-13 0:22 UTC (permalink / raw)
To: Baolin Wang, Wei Yang
Cc: akpm, vbabka, surenb, mhocko, jackmanb, hannes, wangkefeng.wang,
linux-mm, Oscar Salvador
On 12 Sep 2025, at 2:00, Baolin Wang wrote:
> On 2025/9/12 09:13, Zi Yan wrote:
>> On 11 Sep 2025, at 20:28, Wei Yang wrote:
>>
>>> On Thu, Sep 11, 2025 at 12:28:24PM -0400, Zi Yan wrote:
>>>> On 11 Sep 2025, at 2:30, Wei Yang wrote:
>>>>
>>>>> On Thu, Sep 11, 2025 at 03:34:13AM +0000, Wei Yang wrote:
>>>>>> On Wed, Sep 10, 2025 at 09:50:09PM -0400, Zi Yan wrote:
>>>>>>> On 10 Sep 2025, at 21:38, Zi Yan wrote:
>>>>>>>
>>>>>>>> On 10 Sep 2025, at 21:35, Zi Yan wrote:
>>>>>>>>
>>>>>>>>> On 10 Sep 2025, at 21:25, Wei Yang wrote:
>>>>>>>>>
>>>>>>>>>> On Wed, Sep 10, 2025 at 09:22:40AM +0000, Wei Yang wrote:
>>>>>>>>>>> Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
>>>>>>>>>>> isolate_migratepages_block()") converts api from page to folio. But the
>>>>>>>>>>> low_pfn advance for hugetlb page seems wrong when low_pfn doesn't point
>>>>>>>>>>> to head page.
>>>>>>>>>>>
>>>>>>>>>>> Originally, if page is a hugetlb tail page, compound_nr() return 1,
>>>>>>>>>>> which means low_pfn only advance one in next iteration. After the
>>>>>>>>>>> change, low_pfn would advance more than the hugetlb range, since
>>>>>>>>>>> folio_nr_pages() always return total number of the large page. This
>>>>>>>>>>> results in skipping some range to isolate and then to migrate.
>>>>>>>>>>>
>>>>>>>>>>> The worst case for alloc_contig is it does all the isolation and
>>>>>>>>>>> migration, but finally find some range is still not isolated. And then
>>>>>>>>>>> undo all the work and try a new range.
>>>>>>>>>>>
>>>>>>>>>>> Advance low_pfn to the end of hugetlb.
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>>>>>>>>>> Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
>>>>>>>
>>>>>>> This behavior seems to be introduced by commit 369fa227c219 ("mm: make
>>>>>>> alloc_contig_range handle free hugetlb pages”). The related change is:
>>>>>>>
>>>>>>> + if (PageHuge(page) && cc->alloc_contig) {
>>>>>>> + ret = isolate_or_dissolve_huge_page(page);
>>>>>>> +
>>>>>>> + /*
>>>>>>> + * Fail isolation in case isolate_or_dissolve_huge_page()
>>>>>>> + * reports an error. In case of -ENOMEM, abort right away.
>>>>>>> + */
>>>>>>> + if (ret < 0) {
>>>>>>> + /* Do not report -EBUSY down the chain */
>>>>>>> + if (ret == -EBUSY)
>>>>>>> + ret = 0;
>>>>>>> + low_pfn += (1UL << compound_order(page)) - 1;
>>>>>>
>>>>>> The compound_order(page) return 1 for a tail page.
>>>>>>
>>>>>> See below.
>>>>>>
>>>>>>> + goto isolate_fail;
>>>>>>> + }
>>>>>>> +
>>>>>>> + /*
>>>>>>> + * Ok, the hugepage was dissolved. Now these pages are
>>>>>>> + * Buddy and cannot be re-allocated because they are
>>>>>>> + * isolated. Fall-through as the check below handles
>>>>>>> + * Buddy pages.
>>>>>>> + */
>>>>>>> + }
>>>>>>> +
>>>>>>>
>>>>>>>>>>> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
>>>>>>>>>>> Cc: Oscar Salvador <osalvador@suse.de>
>>>>>>>>>>
>>>>>>>>>> Forgot to cc stable.
>>>>>>>>>>
>>>>>>>>>> Cc: <stable@vger.kernel.org>
>>>>>>>>>
>>>>>>>>> Is there any bug report to justify the backport? Since it is more likely
>>>>>>>>> to be a performance issue instead of a correctness issue.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> ---
>>>>>>>>>>> mm/compaction.c | 2 +-
>>>>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>>>>>>>>>> index bf021b31c7ec..1e8f8eca318c 100644
>>>>>>>>>>> --- a/mm/compaction.c
>>>>>>>>>>> +++ b/mm/compaction.c
>>>>>>>>>>> @@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>>>>>>>>>> * Hugepage was successfully isolated and placed
>>>>>>>>>>> * on the cc->migratepages list.
>>>>>>>>>>> */
>>>>>>>>>>> - low_pfn += folio_nr_pages(folio) - 1;
>>>>>>>>>>> + low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
>>>>>>>>>>
>>>>>>>>>> One question is why we advance compound_nr() in original version.
>>>>>>>>>>
>>>>>>>>>> Yes, there are several places advancing compound_nr(), but it seems to iterate
>>>>>>>>>> on the same large page and do the same thing and advance 1 again.
>>>>>>>>>>
>>>>>>>>>> Not sure which part story I missed.
>>>>>>>>>
>>>>>>>>> isolate_migratepages_block() starts from the beginning of a pageblock.
>>>>>>>>> How likely the code hit in the middle of a hugetlb?
>>>>>>>>>
>>>>>>>>
>>>>>>>> In addition, there are two other “low_pfn += (1UL << order) - 1”
>>>>>>>> in the if (PageHuge(page)), why not change them too if you think
>>>>>>>> page can point to the middle of a hugetlb?
>>>>>>>>
>>>>>>
>>>>>> The order here is get from compound_order(page), which is 1 for a tail page.
>>>>>>
>>>>>> So it looks right. Maybe I misunderstand it?
>>>>>
>>>>> Oops, compound_order(page) returns 0 for tail page.
>>>>>
>>>>> What I want to say is low_pfn advance by 1 for tail page. Sorry for the
>>>>> misleading.
>>>>
>>>> OK, that sounds inefficient and inconsistent with your fix.
>>>>
>>>> While at it, can you also change two “low_pfn += (1UL << order) - 1” to skip
>>>> the rest of hugetlb?
>>>>
>>>
>>> Sure, glad to.
>>>
>>> You prefer do the fix in one patch or have a separate one?
>>
>> A separate one is better, since these are optimizations, whereas your current
>> patch is a fix.
>
> I'm afraid we should not do that. Because the order getting from compound_order(page) is not stable without lock protection for a hugetlb. This is why we add a sanity check 'if (order <= MAX_PAGE_ORDER)' before advancing low_pfn.
But if page is not the head, "low_pfn += (1UL << order) - 1" will skip
past more than the hugetlb. That is OK, though, since these two are
isolate_fail cases and make alloc_contig = true fail anyway.
OK. It seems that we should leave the two "low_pfn += (1UL << order) - 1"
sites unchanged.
--
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-13 0:10 ` Wei Yang
@ 2025-09-23 2:35 ` Andrew Morton
2025-09-23 2:41 ` Zi Yan
0 siblings, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2025-09-23 2:35 UTC (permalink / raw)
To: Wei Yang
Cc: Zi Yan, Vishal Moola (Oracle), vbabka, surenb, mhocko, jackmanb,
hannes, wangkefeng.wang, linux-mm, Oscar Salvador
There has been a lot of discussion here, little clarity, and no recorded
acks. I'm inclined to just drop the patch and ask that the issue be
reopened.
Comments?
Thanks.
From: Wei Yang <richard.weiyang@gmail.com>
Subject: mm/compaction: fix low_pfn advance on isolating hugetlb
Date: Wed, 10 Sep 2025 09:22:40 +0000
Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
isolate_migratepages_block()") converts the API from page to folio. But
the low_pfn advance for a hugetlb page is wrong when low_pfn doesn't
point to the head page.
Originally, if page is a hugetlb tail page, compound_nr() returns 1,
which means low_pfn only advances by one in the next iteration. After
the change, low_pfn can advance past the hugetlb range, since
folio_nr_pages() always returns the total number of pages in the folio.
This results in skipping part of the range that should be isolated and
migrated.
The worst case for alloc_contig is that it does all the isolation and
migration, but finally finds some range is still not isolated, and then
undoes all the work and tries a new range.
Advance low_pfn to the end of the hugetlb.
Link: https://lkml.kernel.org/r/20250910092240.3981-1-richard.weiyang@gmail.com
Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Cc: "Vishal Moola (Oracle)" <vishal.moola@gmail.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/compaction.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/compaction.c~mm-compaction-fix-low_pfn-advance-on-isolating-hugetlb
+++ a/mm/compaction.c
@@ -989,7 +989,7 @@ isolate_migratepages_block(struct compac
* Hugepage was successfully isolated and placed
* on the cc->migratepages list.
*/
- low_pfn += folio_nr_pages(folio) - 1;
+ low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
goto isolate_success_no_list;
}
_
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
2025-09-23 2:35 ` Andrew Morton
@ 2025-09-23 2:41 ` Zi Yan
0 siblings, 0 replies; 22+ messages in thread
From: Zi Yan @ 2025-09-23 2:41 UTC (permalink / raw)
To: Andrew Morton
Cc: Wei Yang, Vishal Moola (Oracle), vbabka, surenb, mhocko, jackmanb,
hannes, wangkefeng.wang, linux-mm, Oscar Salvador
On 22 Sep 2025, at 22:35, Andrew Morton wrote:
> There has been a lot of discussion here, little clarity and no recorded
> acks. I'm inclined to just drop the patch and to ask that issue be
> reopened.
>
> Comments?
>
> Thanks.
>
>
> From: Wei Yang <richard.weiyang@gmail.com>
> Subject: mm/compaction: fix low_pfn advance on isolating hugetlb
> Date: Wed, 10 Sep 2025 09:22:40 +0000
>
> Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
> isolate_migratepages_block()") converts api from page to folio. But the
> low_pfn advance for hugetlb page seems wrong when low_pfn doesn't point to
> head page.
>
> Originally, if page is a hugetlb tail page, compound_nr() return 1, which
> means low_pfn only advance one in next iteration. After the change,
> low_pfn would advance more than the hugetlb range, since folio_nr_pages()
> always return total number of the large page. This results in skipping
> some range to isolate and then to migrate.
>
> The worst case for alloc_contig is it does all the isolation and
> migration, but finally find some range is still not isolated. And then
> undo all the work and try a new range.
>
> Advance low_pfn to the end of hugetlb.
>
> Link: https://lkml.kernel.org/r/20250910092240.3981-1-richard.weiyang@gmail.com
> Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> Cc: "Vishal Moola (Oracle)" <vishal.moola@gmail.com>
> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Brendan Jackman <jackmanb@google.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Wei Yang <richard.weiyang@gmail.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
> mm/compaction.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- a/mm/compaction.c~mm-compaction-fix-low_pfn-advance-on-isolating-hugetlb
> +++ a/mm/compaction.c
> @@ -989,7 +989,7 @@ isolate_migratepages_block(struct compac
> * Hugepage was successfully isolated and placed
> * on the cc->migratepages list.
> */
> - low_pfn += folio_nr_pages(folio) - 1;
> + low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
> goto isolate_success_no_list;
> }
>
> _
Acked-by: Zi Yan <ziy@nvidia.com>
--
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2025-09-23 2:41 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-10 9:22 [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb Wei Yang
2025-09-11 1:25 ` Wei Yang
2025-09-11 1:35 ` Zi Yan
2025-09-11 1:38 ` Zi Yan
2025-09-11 1:50 ` Zi Yan
2025-09-11 3:34 ` Wei Yang
2025-09-11 6:30 ` Wei Yang
2025-09-11 16:28 ` Zi Yan
2025-09-12 0:28 ` Wei Yang
2025-09-12 1:13 ` Zi Yan
2025-09-12 6:00 ` Baolin Wang
2025-09-13 0:22 ` Zi Yan
2025-09-11 3:27 ` Wei Yang
2025-09-11 16:19 ` Zi Yan
2025-09-11 17:27 ` Vishal Moola (Oracle)
2025-09-12 1:07 ` Wei Yang
2025-09-12 1:29 ` Zi Yan
2025-09-12 17:22 ` Vishal Moola (Oracle)
2025-09-13 0:11 ` Wei Yang
2025-09-13 0:10 ` Wei Yang
2025-09-23 2:35 ` Andrew Morton
2025-09-23 2:41 ` Zi Yan