From: David Hildenbrand <david@redhat.com>
To: Kefeng Wang <wangkefeng.wang@huawei.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Oscar Salvador <osalvador@suse.de>,
	Muchun Song <muchun.song@linux.dev>
Cc: sidhartha.kumar@oracle.com, jane.chu@oracle.com,
	Zi Yan <ziy@nvidia.com>, Vlastimil Babka <vbabka@suse.cz>,
	Brendan Jackman <jackmanb@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-mm@kvack.org
Subject: Re: [PATCH v3 3/6] mm: page_alloc: add alloc_contig_{range_frozen,frozen_pages}()
Date: Mon, 20 Oct 2025 15:07:25 +0200	[thread overview]
Message-ID: <9ee230da-3985-4fd7-96a1-6ea5ce55d298@redhat.com> (raw)
In-Reply-To: <56ad383f-80c4-43bf-848e-845311f83907@huawei.com>
>>
>>> +void free_contig_range_frozen(unsigned long pfn, unsigned long nr_pages)
>>> +{
>>> +    struct folio *folio = pfn_folio(pfn);
>>> +
>>> +    if (folio_test_large(folio)) {
>>> +        int expected = folio_nr_pages(folio);
>>> +
>>> +        WARN_ON(folio_ref_count(folio));
>>> +
>>> +        if (nr_pages == expected)
>>> +            free_frozen_pages(&folio->page, folio_order(folio));
>>> +        else
>>> +            WARN(true, "PFN %lu: nr_pages %lu != expected %d\n",
>>> +                 pfn, nr_pages, expected);
>>> +        return;
>>> +    }
>>> +
>>> +    for (; nr_pages--; pfn++) {
>>> +        struct page *page = pfn_to_page(pfn);
>>> +
>>> +        WARN_ON(page_ref_count(page));
>>> +        free_frozen_pages(page, 0);
>>> +    }
>>
>> That's mostly a copy-and-paste of free_contig_range().
>>
>> I wonder if there is some way to avoid duplicating a lot of
>> free_contig_range() here. Hmmm.
>>
>> Also, the folio stuff in there looks a bit weird I'm afraid.
>>
>> Can't we just refuse to free compound pages throught this interface and
>> free_contig_range() ? IIRC only hugetlb uses it and uses folio_put()
>> either way?
>>
>> Then we can just document that compound allocations are to be freed
>> differently.
> 
> 
> There is a case for cma_free_folio, which calls free_contig_range for
> both in cma_release(), but I will try to check whether we could avoid
> the folio stuff in free_contig_range().
Ah, right, there is hugetlb_cma_free_folio()->cma_free_folio().
And we need that, because we have to make sure that CMA stats are 
updated properly.
All compound page handling in the freeing path is just nasty and not 
particularly future-proof regarding memdescs.
I wonder if we could just teach alloc_contig to never hand out compound 
pages and then let the freeing path similarly assert that there are no 
compound pages.
Whoever wants a compound page (currently only hugetlb?) can create that 
from a frozen range. Before returning the frozen range the compound page 
can be dissolved. That way also any memdesc can be allocated/freed by 
the caller later.
The only nasty thing is the handing of splitting/merging of 
set_page_owner/page_table_check_alloc etc. :(
As an alternative, we could only allow compound pages for frozen pages. 
This way, we'd force any caller to handle the allocation/freeing of the 
memdesc in the future manually.
Essentially, only allow GFP_COMPOUND on the frozen interface, which we 
would convert hugetlb to.
That means that we can simplify free_contig_range() [no need to handle 
compound pages]. For free_contig_frozen_range() we would skip refcount 
checks on that level and do something like:
void free_contig_frozen_range(unsigned long pfn, unsigned long nr_pages)
{
	struct page *first_page = pfn_to_page(pfn)
	const unsigned int order = ilog2(nr_pages);
	if (PageHead(first_page)) {
		WARN_ON_ONCE(order != compound_order(first_page));
		free_frozen_pages(first_page, order);
		return;
	}
	for (; nr_pages--; pfn++)
		free_frozen_pages(pfn_to_page(pfn), 0);
}
CCing Willy, I don't know yet what will be better in the future. But the 
folio stuff in there screams for problems.
-- 
Cheers
David / dhildenb
next prev parent reply	other threads:[~2025-10-20 13:07 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-13 13:38 [PATCH v3 0/6] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
2025-10-13 13:38 ` [PATCH v3 1/6] mm: debug_vm_pgtable: add debug_vm_pgtable_free_huge_page() Kefeng Wang
2025-10-13 13:38 ` [PATCH v3 2/6] mm: page_alloc: add __split_page() Kefeng Wang
2025-10-13 19:44   ` David Hildenbrand
2025-10-14  3:45     ` Kefeng Wang
2025-10-13 13:38 ` [PATCH v3 3/6] mm: page_alloc: add alloc_contig_{range_frozen,frozen_pages}() Kefeng Wang
2025-10-16 20:53   ` David Hildenbrand
2025-10-17  7:19     ` Kefeng Wang
2025-10-20 13:07       ` David Hildenbrand [this message]
2025-10-20 15:21         ` Kefeng Wang
2025-10-23 12:06           ` Kefeng Wang
2025-10-13 13:38 ` [PATCH v3 4/6] mm: cma: add __cma_release() Kefeng Wang
2025-10-13 19:48   ` David Hildenbrand
2025-10-14  3:45     ` Kefeng Wang
2025-10-13 13:38 ` [PATCH v3 5/6] mm: cma: add cma_alloc_frozen{_compound}() Kefeng Wang
2025-10-13 13:38 ` [PATCH v3 6/6] mm: hugetlb: allocate frozen pages in alloc_gigantic_folio() Kefeng Wang
2025-10-16  1:20 ` [PATCH v3 0/6] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox
  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):
  git send-email \
    --in-reply-to=9ee230da-3985-4fd7-96a1-6ea5ce55d298@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=jackmanb@google.com \
    --cc=jane.chu@oracle.com \
    --cc=linux-mm@kvack.org \
    --cc=muchun.song@linux.dev \
    --cc=osalvador@suse.de \
    --cc=sidhartha.kumar@oracle.com \
    --cc=vbabka@suse.cz \
    --cc=wangkefeng.wang@huawei.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY
  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
  Be sure your reply has a Subject: header at the top and a blank line
  before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).