linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: David Hildenbrand <david@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Oscar Salvador <osalvador@suse.de>,
	Muchun Song <muchun.song@linux.dev>
Cc: <sidhartha.kumar@oracle.com>, <jane.chu@oracle.com>,
	Zi Yan <ziy@nvidia.com>, Vlastimil Babka <vbabka@suse.cz>,
	Brendan Jackman <jackmanb@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>, <linux-mm@kvack.org>,
	Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH v3 3/6] mm: page_alloc: add alloc_contig_{range_frozen,frozen_pages}()
Date: Mon, 20 Oct 2025 23:21:56 +0800	[thread overview]
Message-ID: <a10446f4-d069-45dc-94c8-7c22fcf7f30f@huawei.com> (raw)
In-Reply-To: <9ee230da-3985-4fd7-96a1-6ea5ce55d298@redhat.com>

+ Matthew

On 2025/10/20 21:07, David Hildenbrand wrote:
> 
>>>
>>>> +void free_contig_range_frozen(unsigned long pfn, unsigned long 
>>>> nr_pages)
>>>> +{
>>>> +    struct folio *folio = pfn_folio(pfn);
>>>> +
>>>> +    if (folio_test_large(folio)) {
>>>> +        int expected = folio_nr_pages(folio);
>>>> +
>>>> +        WARN_ON(folio_ref_count(folio));
>>>> +
>>>> +        if (nr_pages == expected)
>>>> +            free_frozen_pages(&folio->page, folio_order(folio));
>>>> +        else
>>>> +            WARN(true, "PFN %lu: nr_pages %lu != expected %d\n",
>>>> +                 pfn, nr_pages, expected);
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    for (; nr_pages--; pfn++) {
>>>> +        struct page *page = pfn_to_page(pfn);
>>>> +
>>>> +        WARN_ON(page_ref_count(page));
>>>> +        free_frozen_pages(page, 0);
>>>> +    }
>>>
>>> That's mostly a copy-and-paste of free_contig_range().
>>>
>>> I wonder if there is some way to avoid duplicating a lot of
>>> free_contig_range() here. Hmmm.
>>>
>>> Also, the folio stuff in there looks a bit weird I'm afraid.
>>>
>>> Can't we just refuse to free compound pages throught this interface and
>>> free_contig_range() ? IIRC only hugetlb uses it and uses folio_put()
>>> either way?
>>>
>>> Then we can just document that compound allocations are to be freed
>>> differently.
>>
>>
>> There is a case for cma_free_folio, which calls free_contig_range for
>> both in cma_release(), but I will try to check whether we could avoid
>> the folio stuff in free_contig_range().
> 
> 
> Ah, right, there is hugetlb_cma_free_folio()->cma_free_folio().
> 
> And we need that, because we have to make sure that CMA stats are 
> updated properly.
> 
> All compound page handling in the freeing path is just nasty and not 
> particularly future-proof regarding memdescs.
> 
> I wonder if we could just teach alloc_contig to never hand out compound 
> pages and then let the freeing path similarly assert that there are no 
> compound pages.
> 
> Whoever wants a compound page (currently only hugetlb?) can create that 
> from a frozen range. Before returning the frozen range the compound page 
> can be dissolved. That way also any memdesc can be allocated/freed by 
> the caller later.
> 
> The only nasty thing is the handing of splitting/merging of 
> set_page_owner/page_table_check_alloc etc. :(
> 
> 
> 
> As an alternative, we could only allow compound pages for frozen pages. 
> This way, we'd force any caller to handle the allocation/freeing of the 
> memdesc in the future manually.
> 
> Essentially, only allow GFP_COMPOUND on the frozen interface, which we 
> would convert hugetlb to.
> 
> That means that we can simplify free_contig_range() [no need to handle 
> compound pages]. For free_contig_frozen_range() we would skip refcount 
> checks on that level and do something like:
> 
> void free_contig_frozen_range(unsigned long pfn, unsigned long nr_pages)
> {
>      struct page *first_page = pfn_to_page(pfn)
>      const unsigned int order = ilog2(nr_pages);
> 
>      if (PageHead(first_page)) {
>          WARN_ON_ONCE(order != compound_order(first_page));
>          free_frozen_pages(first_page, order);
>          return;
>      }
> 
>      for (; nr_pages--; pfn++)
>          free_frozen_pages(pfn_to_page(pfn), 0);
> }
> 
> CCing Willy, I don't know yet what will be better in the future. But the 
> folio stuff in there screams for problems.
> 

Sorry forget to add cc in v3, the full link[1],

[1] 
https://lore.kernel.org/linux-mm/20251013133854.2466530-1-wangkefeng.wang@huawei.com/ 





  reply	other threads:[~2025-10-20 15:22 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-13 13:38 [PATCH v3 0/6] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
2025-10-13 13:38 ` [PATCH v3 1/6] mm: debug_vm_pgtable: add debug_vm_pgtable_free_huge_page() Kefeng Wang
2025-10-13 13:38 ` [PATCH v3 2/6] mm: page_alloc: add __split_page() Kefeng Wang
2025-10-13 19:44   ` David Hildenbrand
2025-10-14  3:45     ` Kefeng Wang
2025-10-13 13:38 ` [PATCH v3 3/6] mm: page_alloc: add alloc_contig_{range_frozen,frozen_pages}() Kefeng Wang
2025-10-16 20:53   ` David Hildenbrand
2025-10-17  7:19     ` Kefeng Wang
2025-10-20 13:07       ` David Hildenbrand
2025-10-20 15:21         ` Kefeng Wang [this message]
2025-10-23 12:06           ` Kefeng Wang
2025-10-13 13:38 ` [PATCH v3 4/6] mm: cma: add __cma_release() Kefeng Wang
2025-10-13 19:48   ` David Hildenbrand
2025-10-14  3:45     ` Kefeng Wang
2025-10-13 13:38 ` [PATCH v3 5/6] mm: cma: add cma_alloc_frozen{_compound}() Kefeng Wang
2025-10-13 13:38 ` [PATCH v3 6/6] mm: hugetlb: allocate frozen pages in alloc_gigantic_folio() Kefeng Wang
2025-10-16  1:20 ` [PATCH v3 0/6] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a10446f4-d069-45dc-94c8-7c22fcf7f30f@huawei.com \
    --to=wangkefeng.wang@huawei.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=hannes@cmpxchg.org \
    --cc=jackmanb@google.com \
    --cc=jane.chu@oracle.com \
    --cc=linux-mm@kvack.org \
    --cc=muchun.song@linux.dev \
    --cc=osalvador@suse.de \
    --cc=sidhartha.kumar@oracle.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).