* [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order
@ 2025-08-28 9:16 Wei Yang
2025-08-29 3:02 ` Zi Yan
0 siblings, 1 reply; 8+ messages in thread
From: Wei Yang @ 2025-08-28 9:16 UTC (permalink / raw)
To: akpm
Cc: linux-mm, Wei Yang, Johannes Weiner, Zi Yan, Vlastimil Babka,
David Hildenbrand
We iterate pfn from order 0 to MAX_PAGE_ORDER aligned to find large
buddy. While if the order is less than start_pfn aligned order, we would
get the same pfn and do the same check again.
Iterate from start_pfn aligned order to reduce duplicated work.
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Hildenbrand <david@redhat.com>
---
I build this and run, but not sure how fully test this.
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 27ea4c7acd15..7f2dfd30106f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2033,7 +2033,7 @@ static int move_freepages_block(struct zone *zone, struct page *page,
/* Look for a buddy that straddles start_pfn */
static unsigned long find_large_buddy(unsigned long start_pfn)
{
- int order = 0;
+ int order = start_pfn ? __ffs(start_pfn) : MAX_PAGE_ORDER;
struct page *page;
unsigned long pfn = start_pfn;
--
2.34.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order
2025-08-28 9:16 [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order Wei Yang
@ 2025-08-29 3:02 ` Zi Yan
2025-08-30 1:25 ` Wei Yang
2025-08-30 2:15 ` Wei Yang
0 siblings, 2 replies; 8+ messages in thread
From: Zi Yan @ 2025-08-29 3:02 UTC (permalink / raw)
To: Wei Yang
Cc: akpm, linux-mm, Johannes Weiner, Vlastimil Babka,
David Hildenbrand
On 28 Aug 2025, at 5:16, Wei Yang wrote:
> We iterate pfn from order 0 to MAX_PAGE_ORDER aligned to find large
> buddy. While if the order is less than start_pfn aligned order, we would
> get the same pfn and do the same check again.
>
> Iterate from start_pfn aligned order to reduce duplicated work.
>
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: David Hildenbrand <david@redhat.com>
>
> ---
> I build this and run, but not sure how fully test this.
> ---
> mm/page_alloc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 27ea4c7acd15..7f2dfd30106f 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2033,7 +2033,7 @@ static int move_freepages_block(struct zone *zone, struct page *page,
> /* Look for a buddy that straddles start_pfn */
> static unsigned long find_large_buddy(unsigned long start_pfn)
> {
> - int order = 0;
> + int order = start_pfn ? __ffs(start_pfn) : MAX_PAGE_ORDER;
> struct page *page;
> unsigned long pfn = start_pfn;
I think it is right, but the code is very subtle and hard to understand
after the change. It is better to add comment to explain it.
Paste the code below for more context:
while (!PageBuddy(page = pfn_to_page(pfn))) {
/* Nothing found */
if (++order > MAX_PAGE_ORDER)
return start_pfn;
pfn &= ~0UL << order;
}
The code tries to find a PageBuddy starting from start_pfn starting from
order=0. When entering the while loop, it means PageBuddy cannot be order-0
and ++order increases the order by 1. Your change fast forwards the process
based on start_pfn. If start_pfn is not an order-0 page, based on first
set bit in start_pfn and how buddy page is chosen, the next possible PageBuddy
order can only be __ffs(start_pfn) + 1. Your code starts order at __ffs(start_pfn)
and it works because "if (++order > MAX_PAGE_ORDER)" increases order
to __ffs(start_pfn) + 1.
Can you add a comment on your "int order = ..."? Something like:
If start_pfn is not an order-0 PageBuddy, next PageBuddy containing start_pfn
has minimal order of __ffs(start_pfn) + 1. Fastforward order to __ffs(start_pfn)
to remove unnecessary work in the while below.
Feel free to reword the above.
With the added comment, feel free to add Reviewed-by: Zi Yan <ziy@nvidia.com>
BTW, I also notice that when start_pfn is an order-0 PageBuddy, the
"if (pfn + (1 << buddy_order(page)) > start_pfn)" check below would be true
even if there is no buddy straddles start_pfn, although "return pfn"
gives the same results as "return start_pfn" (no straddle). The original
code before the addition of find_large_buddy() (commit fd919a85cd55 ("mm:
page_isolation: prepare for hygienic freelists")) checks start_pfn == pfn
before the straddle check, so the correct code should check start_pfn == pfn
and return early. But since current code is functionally equivalent.
Maybe adding a comment about it would be sufficient. Something like:
When the found buddy order is 0, the check would give false positive,
but the returned result is still correct, since pfn is the same as start_pfn.
--
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order
2025-08-29 3:02 ` Zi Yan
@ 2025-08-30 1:25 ` Wei Yang
2025-08-30 3:20 ` Vishal Moola (Oracle)
2025-08-31 1:28 ` Zi Yan
2025-08-30 2:15 ` Wei Yang
1 sibling, 2 replies; 8+ messages in thread
From: Wei Yang @ 2025-08-30 1:25 UTC (permalink / raw)
To: Zi Yan
Cc: Wei Yang, akpm, linux-mm, Johannes Weiner, Vlastimil Babka,
David Hildenbrand
On Thu, Aug 28, 2025 at 11:02:33PM -0400, Zi Yan wrote:
>On 28 Aug 2025, at 5:16, Wei Yang wrote:
>
>> We iterate pfn from order 0 to MAX_PAGE_ORDER aligned to find large
>> buddy. While if the order is less than start_pfn aligned order, we would
>> get the same pfn and do the same check again.
>>
>> Iterate from start_pfn aligned order to reduce duplicated work.
>>
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>> Cc: Zi Yan <ziy@nvidia.com>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> Cc: David Hildenbrand <david@redhat.com>
>>
>> ---
>> I build this and run, but not sure how fully test this.
>> ---
>> mm/page_alloc.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 27ea4c7acd15..7f2dfd30106f 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -2033,7 +2033,7 @@ static int move_freepages_block(struct zone *zone, struct page *page,
>> /* Look for a buddy that straddles start_pfn */
>> static unsigned long find_large_buddy(unsigned long start_pfn)
>> {
>> - int order = 0;
>> + int order = start_pfn ? __ffs(start_pfn) : MAX_PAGE_ORDER;
>> struct page *page;
>> unsigned long pfn = start_pfn;
>
Hi, Zi Yan
Thanks for the review.
>I think it is right, but the code is very subtle and hard to understand
>after the change. It is better to add comment to explain it.
One thing I want to point out is in __move_freepages_block_isolate(),
find_large_buddy() is always given a pageblock aligned start_pfn. This means
if start_pfn is not a free page, it would always try 10 times until give up.
>
>Paste the code below for more context:
>
> while (!PageBuddy(page = pfn_to_page(pfn))) {
> /* Nothing found */
> if (++order > MAX_PAGE_ORDER)
> return start_pfn;
> pfn &= ~0UL << order;
> }
>
>
>
>The code tries to find a PageBuddy starting from start_pfn starting from
>order=0. When entering the while loop, it means PageBuddy cannot be order-0
>and ++order increases the order by 1. Your change fast forwards the process
>based on start_pfn. If start_pfn is not an order-0 page, based on first
>set bit in start_pfn and how buddy page is chosen, the next possible PageBuddy
>order can only be __ffs(start_pfn) + 1. Your code starts order at __ffs(start_pfn)
>and it works because "if (++order > MAX_PAGE_ORDER)" increases order
>to __ffs(start_pfn) + 1.
>
>Can you add a comment on your "int order = ..."? Something like:
>
>If start_pfn is not an order-0 PageBuddy, next PageBuddy containing start_pfn
>has minimal order of __ffs(start_pfn) + 1. Fastforward order to __ffs(start_pfn)
>to remove unnecessary work in the while below.
Sure, I would add a comment above the assignment.
But in my mind, we are not farst forward order, but start check from order of
__ffs(start_pfn).
How about: (not good at commento)
/*
* We start find large buddy from start_pfn order, since a
* !PageBuddy() means all lower order page is !PageBuddy().
*/
>
>Feel free to reword the above.
>
>With the added comment, feel free to add Reviewed-by: Zi Yan <ziy@nvidia.com>
>
>
>BTW, I also notice that when start_pfn is an order-0 PageBuddy, the
>"if (pfn + (1 << buddy_order(page)) > start_pfn)" check below would be true
>even if there is no buddy straddles start_pfn, although "return pfn"
>gives the same results as "return start_pfn" (no straddle). The original
>code before the addition of find_large_buddy() (commit fd919a85cd55 ("mm:
>page_isolation: prepare for hygienic freelists")) checks start_pfn == pfn
>before the straddle check, so the correct code should check start_pfn == pfn
>and return early. But since current code is functionally equivalent.
>Maybe adding a comment about it would be sufficient. Something like:
The comment above this function says "a buddy that straddles start_pfn", this
looks good to me. An order-0 page that start from start_pfn also means
straddle start_pfn.
This may differ from original logic, but seems not wrong.
>
>When the found buddy order is 0, the check would give false positive,
>but the returned result is still correct, since pfn is the same as start_pfn.
>
I prefer not adding this.
>
>
>--
>Best Regards,
>Yan, Zi
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order
2025-08-29 3:02 ` Zi Yan
2025-08-30 1:25 ` Wei Yang
@ 2025-08-30 2:15 ` Wei Yang
1 sibling, 0 replies; 8+ messages in thread
From: Wei Yang @ 2025-08-30 2:15 UTC (permalink / raw)
To: Zi Yan
Cc: Wei Yang, akpm, linux-mm, Johannes Weiner, Vlastimil Babka,
David Hildenbrand
On Thu, Aug 28, 2025 at 11:02:33PM -0400, Zi Yan wrote:
[...]
>
>BTW, I also notice that when start_pfn is an order-0 PageBuddy, the
>"if (pfn + (1 << buddy_order(page)) > start_pfn)" check below would be true
>even if there is no buddy straddles start_pfn, although "return pfn"
>gives the same results as "return start_pfn" (no straddle). The original
>code before the addition of find_large_buddy() (commit fd919a85cd55 ("mm:
>page_isolation: prepare for hygienic freelists")) checks start_pfn == pfn
>before the straddle check, so the correct code should check start_pfn == pfn
>and return early. But since current code is functionally equivalent.
>Maybe adding a comment about it would be sufficient. Something like:
>
>When the found buddy order is 0, the check would give false positive,
>but the returned result is still correct, since pfn is the same as start_pfn.
Hm... found another thing.
In __move_freepages_block_isolate(), when checking "We're the starting block
of a large buddy", variable "page" may not be the large buddy page we found
from find_large_buddy(). Since prep_move_freepages_block() set start_pfn to
pageblock_start_pfn(page_to_pfn(page)), which may be different.
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order
2025-08-30 1:25 ` Wei Yang
@ 2025-08-30 3:20 ` Vishal Moola (Oracle)
2025-08-30 7:48 ` Wei Yang
2025-08-31 1:28 ` Zi Yan
1 sibling, 1 reply; 8+ messages in thread
From: Vishal Moola (Oracle) @ 2025-08-30 3:20 UTC (permalink / raw)
To: Wei Yang
Cc: Zi Yan, akpm, linux-mm, Johannes Weiner, Vlastimil Babka,
David Hildenbrand
On Sat, Aug 30, 2025 at 01:25:05AM +0000, Wei Yang wrote:
> On Thu, Aug 28, 2025 at 11:02:33PM -0400, Zi Yan wrote:
> >On 28 Aug 2025, at 5:16, Wei Yang wrote:
> >
> >> We iterate pfn from order 0 to MAX_PAGE_ORDER aligned to find large
> >> buddy. While if the order is less than start_pfn aligned order, we would
> >> get the same pfn and do the same check again.
> >>
> >> Iterate from start_pfn aligned order to reduce duplicated work.
> >>
> >> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> >> Cc: Johannes Weiner <hannes@cmpxchg.org>
> >> Cc: Zi Yan <ziy@nvidia.com>
> >> Cc: Vlastimil Babka <vbabka@suse.cz>
> >> Cc: David Hildenbrand <david@redhat.com>
> >>
> >> ---
> >I think it is right, but the code is very subtle and hard to understand
> >after the change. It is better to add comment to explain it.
>
> One thing I want to point out is in __move_freepages_block_isolate(),
> find_large_buddy() is always given a pageblock aligned start_pfn. This means
> if start_pfn is not a free page, it would always try 10 times until give up.
>
> >
> >Paste the code below for more context:
> >
> > while (!PageBuddy(page = pfn_to_page(pfn))) {
> > /* Nothing found */
> > if (++order > MAX_PAGE_ORDER)
> > return start_pfn;
> > pfn &= ~0UL << order;
> > }
> >
> >
> >
> >The code tries to find a PageBuddy starting from start_pfn starting from
> >order=0. When entering the while loop, it means PageBuddy cannot be order-0
> >and ++order increases the order by 1. Your change fast forwards the process
> >based on start_pfn. If start_pfn is not an order-0 page, based on first
> >set bit in start_pfn and how buddy page is chosen, the next possible PageBuddy
> >order can only be __ffs(start_pfn) + 1. Your code starts order at __ffs(start_pfn)
> >and it works because "if (++order > MAX_PAGE_ORDER)" increases order
> >to __ffs(start_pfn) + 1.
> >
> >Can you add a comment on your "int order = ..."? Something like:
> >
> >If start_pfn is not an order-0 PageBuddy, next PageBuddy containing start_pfn
> >has minimal order of __ffs(start_pfn) + 1. Fastforward order to __ffs(start_pfn)
> >to remove unnecessary work in the while below.
>
> Sure, I would add a comment above the assignment.
>
> But in my mind, we are not farst forward order, but start check from order of
> __ffs(start_pfn).
>
>
> How about: (not good at commento)
>
> /*
> * We start find large buddy from start_pfn order, since a
> * !PageBuddy() means all lower order page is !PageBuddy().
> */
Personally, the way Zi worded it makes more sense to me. Maybe replacing
the second sentence with your comment is fine, but I think Zi's first
sentence provides important context in a clear way.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order
2025-08-30 3:20 ` Vishal Moola (Oracle)
@ 2025-08-30 7:48 ` Wei Yang
0 siblings, 0 replies; 8+ messages in thread
From: Wei Yang @ 2025-08-30 7:48 UTC (permalink / raw)
To: Vishal Moola (Oracle)
Cc: Wei Yang, Zi Yan, akpm, linux-mm, Johannes Weiner,
Vlastimil Babka, David Hildenbrand
On Fri, Aug 29, 2025 at 08:20:04PM -0700, Vishal Moola (Oracle) wrote:
>On Sat, Aug 30, 2025 at 01:25:05AM +0000, Wei Yang wrote:
>> On Thu, Aug 28, 2025 at 11:02:33PM -0400, Zi Yan wrote:
>> >On 28 Aug 2025, at 5:16, Wei Yang wrote:
>> >
>> >> We iterate pfn from order 0 to MAX_PAGE_ORDER aligned to find large
>> >> buddy. While if the order is less than start_pfn aligned order, we would
>> >> get the same pfn and do the same check again.
>> >>
>> >> Iterate from start_pfn aligned order to reduce duplicated work.
>> >>
>> >> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> >> Cc: Johannes Weiner <hannes@cmpxchg.org>
>> >> Cc: Zi Yan <ziy@nvidia.com>
>> >> Cc: Vlastimil Babka <vbabka@suse.cz>
>> >> Cc: David Hildenbrand <david@redhat.com>
>> >>
>> >> ---
>> >I think it is right, but the code is very subtle and hard to understand
>> >after the change. It is better to add comment to explain it.
>>
>> One thing I want to point out is in __move_freepages_block_isolate(),
>> find_large_buddy() is always given a pageblock aligned start_pfn. This means
>> if start_pfn is not a free page, it would always try 10 times until give up.
>>
>> >
>> >Paste the code below for more context:
>> >
>> > while (!PageBuddy(page = pfn_to_page(pfn))) {
>> > /* Nothing found */
>> > if (++order > MAX_PAGE_ORDER)
>> > return start_pfn;
>> > pfn &= ~0UL << order;
>> > }
>> >
>> >
>> >
>> >The code tries to find a PageBuddy starting from start_pfn starting from
>> >order=0. When entering the while loop, it means PageBuddy cannot be order-0
>> >and ++order increases the order by 1. Your change fast forwards the process
>> >based on start_pfn. If start_pfn is not an order-0 page, based on first
>> >set bit in start_pfn and how buddy page is chosen, the next possible PageBuddy
>> >order can only be __ffs(start_pfn) + 1. Your code starts order at __ffs(start_pfn)
>> >and it works because "if (++order > MAX_PAGE_ORDER)" increases order
>> >to __ffs(start_pfn) + 1.
>> >
>> >Can you add a comment on your "int order = ..."? Something like:
>> >
>> >If start_pfn is not an order-0 PageBuddy, next PageBuddy containing start_pfn
>> >has minimal order of __ffs(start_pfn) + 1. Fastforward order to __ffs(start_pfn)
>> >to remove unnecessary work in the while below.
>>
>> Sure, I would add a comment above the assignment.
>>
>> But in my mind, we are not farst forward order, but start check from order of
>> __ffs(start_pfn).
>>
>>
>> How about: (not good at commento)
>>
>> /*
>> * We start find large buddy from start_pfn order, since a
>> * !PageBuddy() means all lower order page is !PageBuddy().
>> */
>
>Personally, the way Zi worded it makes more sense to me. Maybe replacing
>the second sentence with your comment is fine, but I think Zi's first
>sentence provides important context in a clear way.
Hi, Vishal
Thanks for your comment. I am ok with it. Let's see Zi's opnion. And I will
send a new version. :-)
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order
2025-08-30 1:25 ` Wei Yang
2025-08-30 3:20 ` Vishal Moola (Oracle)
@ 2025-08-31 1:28 ` Zi Yan
2025-08-31 3:35 ` Wei Yang
1 sibling, 1 reply; 8+ messages in thread
From: Zi Yan @ 2025-08-31 1:28 UTC (permalink / raw)
To: Wei Yang
Cc: akpm, linux-mm, Johannes Weiner, Vlastimil Babka,
David Hildenbrand
On 29 Aug 2025, at 21:25, Wei Yang wrote:
> On Thu, Aug 28, 2025 at 11:02:33PM -0400, Zi Yan wrote:
>> On 28 Aug 2025, at 5:16, Wei Yang wrote:
>>
>>> We iterate pfn from order 0 to MAX_PAGE_ORDER aligned to find large
>>> buddy. While if the order is less than start_pfn aligned order, we would
>>> get the same pfn and do the same check again.
>>>
>>> Iterate from start_pfn aligned order to reduce duplicated work.
>>>
>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>>> Cc: Zi Yan <ziy@nvidia.com>
>>> Cc: Vlastimil Babka <vbabka@suse.cz>
>>> Cc: David Hildenbrand <david@redhat.com>
>>>
>>> ---
>>> I build this and run, but not sure how fully test this.
>>> ---
>>> mm/page_alloc.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index 27ea4c7acd15..7f2dfd30106f 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -2033,7 +2033,7 @@ static int move_freepages_block(struct zone *zone, struct page *page,
>>> /* Look for a buddy that straddles start_pfn */
>>> static unsigned long find_large_buddy(unsigned long start_pfn)
>>> {
>>> - int order = 0;
>>> + int order = start_pfn ? __ffs(start_pfn) : MAX_PAGE_ORDER;
>>> struct page *page;
>>> unsigned long pfn = start_pfn;
>>
>
> Hi, Zi Yan
>
> Thanks for the review.
>
>> I think it is right, but the code is very subtle and hard to understand
>> after the change. It is better to add comment to explain it.
>
> One thing I want to point out is in __move_freepages_block_isolate(),
> find_large_buddy() is always given a pageblock aligned start_pfn. This means
> if start_pfn is not a free page, it would always try 10 times until give up.
find_large_buddy() is used to deal with free pages, so start_pfn is likely
to be a free page.
>
>>
>> Paste the code below for more context:
>>
>> while (!PageBuddy(page = pfn_to_page(pfn))) {
>> /* Nothing found */
>> if (++order > MAX_PAGE_ORDER)
>> return start_pfn;
>> pfn &= ~0UL << order;
>> }
>>
>>
>>
>> The code tries to find a PageBuddy starting from start_pfn starting from
>> order=0. When entering the while loop, it means PageBuddy cannot be order-0
>> and ++order increases the order by 1. Your change fast forwards the process
>> based on start_pfn. If start_pfn is not an order-0 page, based on first
>> set bit in start_pfn and how buddy page is chosen, the next possible PageBuddy
>> order can only be __ffs(start_pfn) + 1. Your code starts order at __ffs(start_pfn)
>> and it works because "if (++order > MAX_PAGE_ORDER)" increases order
>> to __ffs(start_pfn) + 1.
>>
>> Can you add a comment on your "int order = ..."? Something like:
>>
>> If start_pfn is not an order-0 PageBuddy, next PageBuddy containing start_pfn
>> has minimal order of __ffs(start_pfn) + 1. Fastforward order to __ffs(start_pfn)
>> to remove unnecessary work in the while below.
>
> Sure, I would add a comment above the assignment.
>
> But in my mind, we are not farst forward order, but start check from order of
> __ffs(start_pfn).
Sure.
>
>
> How about: (not good at commento)
>
> /*
> * We start find large buddy from start_pfn order, since a
It is unclear what start_pfn order means.
> * !PageBuddy() means all lower order page is !PageBuddy().
> */
Here you assume start_pfn is not PageBuddy() already, but it
can be an order-0 PageBuddy(). That is why my comment explicitly
excluded the case to begin with.
How about?
If start_pfn is not an order-0 PageBuddy, next PageBuddy containing start_pfn
has minimal order of __ffs(start_pfn) + 1. Start checking the order with
__ffs(start_pfn). If start_pfn is order-0, the starting order does not matter.
>
>>
>> Feel free to reword the above.
>>
>> With the added comment, feel free to add Reviewed-by: Zi Yan <ziy@nvidia.com>
>>
>>
>> BTW, I also notice that when start_pfn is an order-0 PageBuddy, the
>> "if (pfn + (1 << buddy_order(page)) > start_pfn)" check below would be true
>> even if there is no buddy straddles start_pfn, although "return pfn"
>> gives the same results as "return start_pfn" (no straddle). The original
>> code before the addition of find_large_buddy() (commit fd919a85cd55 ("mm:
>> page_isolation: prepare for hygienic freelists")) checks start_pfn == pfn
>> before the straddle check, so the correct code should check start_pfn == pfn
>> and return early. But since current code is functionally equivalent.
>> Maybe adding a comment about it would be sufficient. Something like:
>
> The comment above this function says "a buddy that straddles start_pfn", this
> looks good to me. An order-0 page that start from start_pfn also means
> straddle start_pfn.
>
> This may differ from original logic, but seems not wrong.
Straddle means a buddy page has start_pfn in the middle and the caller
in __move_freepages_block_isolate() needs to split the buddy page.
>
>>
>> When the found buddy order is 0, the check would give false positive,
>> but the returned result is still correct, since pfn is the same as start_pfn.
>>
>
> I prefer not adding this.
No strong opinion about it.
--
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order
2025-08-31 1:28 ` Zi Yan
@ 2025-08-31 3:35 ` Wei Yang
0 siblings, 0 replies; 8+ messages in thread
From: Wei Yang @ 2025-08-31 3:35 UTC (permalink / raw)
To: Zi Yan
Cc: Wei Yang, akpm, linux-mm, Johannes Weiner, Vlastimil Babka,
David Hildenbrand
On Sat, Aug 30, 2025 at 09:28:24PM -0400, Zi Yan wrote:
>On 29 Aug 2025, at 21:25, Wei Yang wrote:
>
>> On Thu, Aug 28, 2025 at 11:02:33PM -0400, Zi Yan wrote:
>>> On 28 Aug 2025, at 5:16, Wei Yang wrote:
>>>
>>>> We iterate pfn from order 0 to MAX_PAGE_ORDER aligned to find large
>>>> buddy. While if the order is less than start_pfn aligned order, we would
>>>> get the same pfn and do the same check again.
>>>>
>>>> Iterate from start_pfn aligned order to reduce duplicated work.
>>>>
>>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>>>> Cc: Zi Yan <ziy@nvidia.com>
>>>> Cc: Vlastimil Babka <vbabka@suse.cz>
>>>> Cc: David Hildenbrand <david@redhat.com>
>>>>
>>>> ---
>>>> I build this and run, but not sure how fully test this.
>>>> ---
>>>> mm/page_alloc.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>> index 27ea4c7acd15..7f2dfd30106f 100644
>>>> --- a/mm/page_alloc.c
>>>> +++ b/mm/page_alloc.c
>>>> @@ -2033,7 +2033,7 @@ static int move_freepages_block(struct zone *zone, struct page *page,
>>>> /* Look for a buddy that straddles start_pfn */
>>>> static unsigned long find_large_buddy(unsigned long start_pfn)
>>>> {
>>>> - int order = 0;
>>>> + int order = start_pfn ? __ffs(start_pfn) : MAX_PAGE_ORDER;
>>>> struct page *page;
>>>> unsigned long pfn = start_pfn;
>>>
>>
>> Hi, Zi Yan
>>
>> Thanks for the review.
>>
>>> I think it is right, but the code is very subtle and hard to understand
>>> after the change. It is better to add comment to explain it.
>>
>> One thing I want to point out is in __move_freepages_block_isolate(),
>> find_large_buddy() is always given a pageblock aligned start_pfn. This means
>> if start_pfn is not a free page, it would always try 10 times until give up.
>
>find_large_buddy() is used to deal with free pages, so start_pfn is likely
>to be a free page.
>
I am not that familiar with the background, my question here.
I see the call flow is:
alloc_contig_pages_noprof()
__alloc_contig_pages(pfn, ) <-- start_pfn
start_isolate_page_range()
isolate_single_pageblock()
set_migratetype_isolate()
pageblock_isolate_and_move_free_pages()
find_large_buddy()
If my understanding is correct, the start_pfn comes from
__alloc_contig_pages(), which is get from zone's pfn iteration.
I don't see it filter non-free page. So the possibility of free/non-free is
equal to me.
Maybe I missed something?
[...]
>> How about: (not good at commento)
>>
>> /*
>> * We start find large buddy from start_pfn order, since a
>
>It is unclear what start_pfn order means.
>
>> * !PageBuddy() means all lower order page is !PageBuddy().
>> */
>
>Here you assume start_pfn is not PageBuddy() already, but it
>can be an order-0 PageBuddy(). That is why my comment explicitly
>excluded the case to begin with.
>
>How about?
>
>If start_pfn is not an order-0 PageBuddy, next PageBuddy containing start_pfn
>has minimal order of __ffs(start_pfn) + 1. Start checking the order with
>__ffs(start_pfn). If start_pfn is order-0, the starting order does not matter.
>
Thanks, will use this one.
>>
>>>
>>> Feel free to reword the above.
>>>
>>> With the added comment, feel free to add Reviewed-by: Zi Yan <ziy@nvidia.com>
>>>
>>>
>>> BTW, I also notice that when start_pfn is an order-0 PageBuddy, the
>>> "if (pfn + (1 << buddy_order(page)) > start_pfn)" check below would be true
>>> even if there is no buddy straddles start_pfn, although "return pfn"
>>> gives the same results as "return start_pfn" (no straddle). The original
>>> code before the addition of find_large_buddy() (commit fd919a85cd55 ("mm:
>>> page_isolation: prepare for hygienic freelists")) checks start_pfn == pfn
>>> before the straddle check, so the correct code should check start_pfn == pfn
>>> and return early. But since current code is functionally equivalent.
>>> Maybe adding a comment about it would be sufficient. Something like:
>>
>> The comment above this function says "a buddy that straddles start_pfn", this
>> looks good to me. An order-0 page that start from start_pfn also means
>> straddle start_pfn.
>>
>> This may differ from original logic, but seems not wrong.
>
>Straddle means a buddy page has start_pfn in the middle and the caller
Ok, maybe the word "straddle" literally means it. If so, it really confuse.
>in __move_freepages_block_isolate() needs to split the buddy page.
Do you mean if start_pfn is the head of a large buddy page, we don't split it
in __move_freepages_block_isolate()?
Per my understanding, if this pageblock is a part of a large buddy, no matter
it is at the beginning or end, we should split it.
Just want to confirm the behavior with you in case I misunderstand.
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2025-08-31 3:35 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-28 9:16 [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order Wei Yang
2025-08-29 3:02 ` Zi Yan
2025-08-30 1:25 ` Wei Yang
2025-08-30 3:20 ` Vishal Moola (Oracle)
2025-08-30 7:48 ` Wei Yang
2025-08-31 1:28 ` Zi Yan
2025-08-31 3:35 ` Wei Yang
2025-08-30 2:15 ` Wei Yang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).