All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Vishal Moola (Oracle)" <vishal.moola@gmail.com>
To: Zi Yan <ziy@nvidia.com>
Cc: Wei Yang <richard.weiyang@gmail.com>,
	akpm@linux-foundation.org, vbabka@suse.cz, surenb@google.com,
	mhocko@suse.com, jackmanb@google.com, hannes@cmpxchg.org,
	wangkefeng.wang@huawei.com, linux-mm@kvack.org,
	Oscar Salvador <osalvador@suse.de>
Subject: Re: [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb
Date: Thu, 11 Sep 2025 10:27:08 -0700	[thread overview]
Message-ID: <aMMGbPX9uUj7TfoG@fedora> (raw)
In-Reply-To: <3DE28F4B-ACB1-468F-89B9-D7750D24BE4E@nvidia.com>

On Thu, Sep 11, 2025 at 12:19:34PM -0400, Zi Yan wrote:
> On 10 Sep 2025, at 23:27, Wei Yang wrote:
> 
> > On Wed, Sep 10, 2025 at 09:35:53PM -0400, Zi Yan wrote:
> >> On 10 Sep 2025, at 21:25, Wei Yang wrote:
> >>
> >>> On Wed, Sep 10, 2025 at 09:22:40AM +0000, Wei Yang wrote:
> >>>> Commit 56ae0bb349b4 ("mm: compaction: convert to use a folio in
> >>>> isolate_migratepages_block()") converts api from page to folio. But the
> >>>> low_pfn advance for hugetlb page seems wrong when low_pfn doesn't point
> >>>> to head page.
> >>>>
> >>>> Originally, if page is a hugetlb tail page, compound_nr() return 1,
> >>>> which means low_pfn only advance one in next iteration. After the
> >>>> change, low_pfn would advance more than the hugetlb range, since
> >>>> folio_nr_pages() always return total number of the large page. This
> >>>> results in skipping some range to isolate and then to migrate.
> >>>>
> >>>> The worst case for alloc_contig is it does all the isolation and
> >>>> migration, but finally find some range is still not isolated. And then
> >>>> undo all the work and try a new range.
> >>>>
> >>>> Advance low_pfn to the end of hugetlb.
> >>>>
> >>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> >>>> Fixes: 56ae0bb349b4 ("mm: compaction: convert to use a folio in isolate_migratepages_block()")
> >>>> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
> >>>> Cc: Oscar Salvador <osalvador@suse.de>
> >>>
> >>> Forgot to cc stable.
> >>>
> >>> Cc: <stable@vger.kernel.org>
> >>
> >> Is there any bug report to justify the backport? Since it is more likely
> >> to be a performance issue instead of a correctness issue.
> >>
> >
> > OK, I thought cc-stable is paired with fixes tag.
> >
> > If not, please drop it.
> >
> >>>
> >>>> ---
> >>>> mm/compaction.c | 2 +-
> >>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/mm/compaction.c b/mm/compaction.c
> >>>> index bf021b31c7ec..1e8f8eca318c 100644
> >>>> --- a/mm/compaction.c
> >>>> +++ b/mm/compaction.c
> >>>> @@ -989,7 +989,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> >>>> 				 * Hugepage was successfully isolated and placed
> >>>> 				 * on the cc->migratepages list.
> >>>> 				 */
> >>>> -				low_pfn += folio_nr_pages(folio) - 1;
> >>>> +				low_pfn += folio_nr_pages(folio) - folio_page_idx(folio, page) - 1;
> >>>
> >>> One question is why we advance compound_nr() in original version.
> >>>
> >>> Yes, there are several places advancing compound_nr(), but it seems to iterate
> >>> on the same large page and do the same thing and advance 1 again.
> >>>
> >>> Not sure which part story I missed.
> >>
> >> isolate_migratepages_block() starts from the beginning of a pageblock.
> >> How likely the code hit in the middle of a hugetlb?
> >>
> >
> > OK, this is a kind of optimization based on the knowledge it is not likely to
> > be a tail page?
> 
> No, it might be that most of the time page is the head, or people assume so.

For compound pages, we will always have tail pfn < head pfn, so we should
always find the head page first.

If you did find a case where we somehow encounter a tail page here, I'd
love to see it. And then you'd also want to make sure the other compaction
trackers are appropriately accounted for.


  reply	other threads:[~2025-09-11 17:27 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-10  9:22 [PATCH] mm/compaction: fix low_pfn advance on isolating hugetlb Wei Yang
2025-09-11  1:25 ` Wei Yang
2025-09-11  1:35   ` Zi Yan
2025-09-11  1:38     ` Zi Yan
2025-09-11  1:50       ` Zi Yan
2025-09-11  3:34         ` Wei Yang
2025-09-11  6:30           ` Wei Yang
2025-09-11 16:28             ` Zi Yan
2025-09-12  0:28               ` Wei Yang
2025-09-12  1:13                 ` Zi Yan
2025-09-12  6:00                   ` Baolin Wang
2025-09-13  0:22                     ` Zi Yan
2025-09-11  3:27     ` Wei Yang
2025-09-11 16:19       ` Zi Yan
2025-09-11 17:27         ` Vishal Moola (Oracle) [this message]
2025-09-12  1:07           ` Wei Yang
2025-09-12  1:29             ` Zi Yan
2025-09-12 17:22               ` Vishal Moola (Oracle)
2025-09-13  0:11                 ` Wei Yang
2025-09-13  0:10               ` Wei Yang
2025-09-23  2:35                 ` Andrew Morton
2025-09-23  2:41                   ` Zi Yan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aMMGbPX9uUj7TfoG@fedora \
    --to=vishal.moola@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=jackmanb@google.com \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=osalvador@suse.de \
    --cc=richard.weiyang@gmail.com \
    --cc=surenb@google.com \
    --cc=vbabka@suse.cz \
    --cc=wangkefeng.wang@huawei.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.