From: Oscar Salvador <osalvador@suse.de>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Muchun Song <songmuchun@bytedance.com>,
Joao Martins <joao.m.martins@oracle.com>,
Matthew Wilcox <willy@infradead.org>,
Michal Hocko <mhocko@suse.com>, Peter Xu <peterx@redhat.com>,
Miaohe Lin <linmiaohe@huawei.com>,
Vlastimil Babka <vbabka@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v2] hugetlb: freeze allocated pages before creating hugetlb pages
Date: Tue, 20 Sep 2022 06:02:56 +0200
Message-ID: <Yyk7cN8KhUlNFmM8@localhost.localdomain>
In-Reply-To: <20220916214638.155744-1-mike.kravetz@oracle.com>
On Fri, Sep 16, 2022 at 02:46:38PM -0700, Mike Kravetz wrote:
> When creating hugetlb pages, the hugetlb code must first allocate
> contiguous pages from a low level allocator such as buddy, cma or
> memblock. The pages returned from these low level allocators are
> ref counted. This creates potential issues with other code taking
> speculative references on these pages before they can be transformed to
> a hugetlb page. This issue has been addressed with methods and code
> such as that provided in [1].
>
> Recent discussions about vmemmap freeing [2] have indicated that it
> would be beneficial to freeze all sub pages, including the head page
> of pages returned from low level allocators before converting to a
> hugetlb page. This helps avoid races if we want to replace the page
> containing vmemmap for the head page.
>
> There have been proposals to change at least the buddy allocator to
> return frozen pages as described at [3]. If such a change is made, it
> can be employed by the hugetlb code. However, as mentioned above
> hugetlb uses several low level allocators so each would need to be
> modified to return frozen pages. For now, we can manually freeze the
> returned pages. This is done in two places:
> 1) alloc_buddy_huge_page, only the returned head page is ref counted.
> We freeze the head page, retrying once in the VERY rare case where
> there may be an inflated ref count.
> 2) prep_compound_gigantic_page, for gigantic pages the current code
> freezes all pages except the head page. New code will simply freeze
> the head page as well.
>
> In a few other places, code checks for inflated ref counts on newly
> allocated hugetlb pages. With the modifications to freeze after
> allocating, this code can be removed.
>
> After hugetlb pages are freshly allocated, they are often added to the
> hugetlb free lists. Since these pages were previously ref counted, this
> was done via put_page() which would end up calling the hugetlb
> destructor: free_huge_page. With changes to freeze pages, we simply
> call free_huge_page directly to add the pages to the free list.
>
> In a few other places, freshly allocated hugetlb pages were immediately
> put into use, and the expectation was they were already ref counted. In
> these cases, we must manually ref count the page.
>
> [1] https://lore.kernel.org/linux-mm/20210622021423.154662-3-mike.kravetz@oracle.com/
> [2] https://lore.kernel.org/linux-mm/20220802180309.19340-1-joao.m.martins@oracle.com/
> [3] https://lore.kernel.org/linux-mm/20220809171854.3725722-1-willy@infradead.org/
>
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Hi Mike,
this looks great and simplifies the code quite a bit.
I have a question, though:
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1787,9 +1787,8 @@ static bool __prep_compound_gigantic_page(struct page *page, unsigned int order,
>
> /* we rely on prep_new_huge_page to set the destructor */
> set_compound_order(page, order);
> - __ClearPageReserved(page);
> __SetPageHead(page);
> - for (i = 1; i < nr_pages; i++) {
> + for (i = 0; i < nr_pages; i++) {
> p = nth_page(page, i);
>
> /*
> @@ -1830,17 +1829,19 @@ static bool __prep_compound_gigantic_page(struct page *page, unsigned int order,
> } else {
> VM_BUG_ON_PAGE(page_count(p), p);
> }
> - set_compound_head(p, page);
> + if (i != 0)
> + set_compound_head(p, page);
I am surely missing something here, but why do we only freeze the
refcount here when it is for demote?
We seem to be doing it unconditionally in alloc_buddy_huge_page.
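For anyone following along, the "freeze the head page, retrying once" step
from 1) in the changelog can be modeled in userspace roughly as below. This
is only a sketch using C11 atomics, not the patch itself: struct fake_page,
fake_ref_freeze(), fake_alloc(), pool_reset() and alloc_frozen() are made-up
names, and the compare-and-swap merely stands in for what the kernel's
page_ref_freeze() does (swap the count to 0 only if it has the expected
value).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the reference count matters. */
struct fake_page {
	atomic_int refcount;
};

/*
 * Freeze: atomically replace the count with 0, but only if it currently
 * equals `expected`. Returns false when someone else holds a reference.
 */
static bool fake_ref_freeze(struct fake_page *p, int expected)
{
	int old = expected;

	return atomic_compare_exchange_strong(&p->refcount, &old, 0);
}

/* Two pages for the demo; the caller chooses their initial counts. */
static struct fake_page pool[2];
static size_t next_page;

static void pool_reset(int first_count, int second_count)
{
	atomic_store(&pool[0].refcount, first_count);
	atomic_store(&pool[1].refcount, second_count);
	next_page = 0;
}

/* Hypothetical low-level allocator: hands out pool pages in order. */
static struct fake_page *fake_alloc(void)
{
	return next_page < 2 ? &pool[next_page++] : NULL;
}

/*
 * Allocate a page and freeze it, retrying once if the count is inflated.
 * A page that cannot be frozen is simply skipped in this model; real code
 * would have to hand it back to the allocator.
 */
static struct fake_page *alloc_frozen(void)
{
	for (int attempt = 0; attempt < 2; attempt++) {
		struct fake_page *p = fake_alloc();

		if (!p)
			return NULL;
		if (fake_ref_freeze(p, 1))
			return p;	/* count is now 0: frozen */
	}
	return NULL;
}
```

With pool_reset(2, 1), the first page models a speculative reference taken by
someone else, so alloc_frozen() skips it and returns the second page with a
count of 0. If both pages have an inflated count, the allocation fails after
the single retry.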
--
Oscar Salvador
SUSE Labs