From: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
To: Ye Liu <ye.liu@linux.dev>, Andrew Morton <akpm@linux-foundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>,
Brendan Jackman <jackmanb@google.com>,
Johannes Weiner <hannes@cmpxchg.org>, Zi Yan <ziy@nvidia.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 8/9] mm/page_owner: clamp skip_buddy_pages() PFN advance at MAX_ORDER_NR_PAGES boundary
Date: Wed, 1 Jul 2026 08:34:51 +0200
Message-ID: <ccb58e83-e854-4c09-9d18-36649eade774@kernel.org>
In-Reply-To: <20260701061101.344679-9-ye.liu@linux.dev>
On 7/1/26 08:10, Ye Liu wrote:
> The lockless buddy_order_unsafe() read can return a garbage order
> value if the page is concurrently allocated between the PageBuddy
> check and the private read. If this bogus order is <= MAX_PAGE_ORDER,
> skip_buddy_pages() would arbitrarily advance the PFN, potentially
> jumping past a MAX_ORDER_NR_PAGES boundary whose pfn_valid() check
> would have caught an offline memory section.
>
> In read_page_owner(), which relies solely on boundary-aligned
> pfn_valid() to guard pfn_to_page(), skipping the boundary could
> cause pfn_to_page() to access an unmapped mem_section.
>
> Clamp the advance so it never crosses the next MAX_ORDER_NR_PAGES
> boundary. This is safe for all three callers: the pageblock-iterating
> ones already handle boundary transitions in their outer loops, and
> for read_page_owner() the worst case is one extra PageBuddy check per
> 1024 pages for a huge buddy block straddling the boundary.
I don't see how a huge buddy block can straddle the boundary, as the largest
buddy block order is MAX_PAGE_ORDER and buddy blocks are naturally aligned to
their size?
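
To illustrate the alignment argument, here is a rough userspace sketch (not
kernel code). It assumes MAX_PAGE_ORDER is 10, which is config-dependent, and
the helper names block_straddles() and count_aligned_straddlers() are made up
for this example. A block of order <= MAX_PAGE_ORDER that starts at a multiple
of its own size should never cross a MAX_ORDER_NR_PAGES boundary; only a
misaligned start (i.e. a bogus order read) can.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: a common default; the real value depends on the kernel config. */
#define MAX_PAGE_ORDER		10
#define MAX_ORDER_NR_PAGES	(1UL << MAX_PAGE_ORDER)

/* Hypothetical helper: does [pfn, pfn + 2^order) cross a boundary? */
static bool block_straddles(unsigned long pfn, unsigned int order)
{
	unsigned long last = pfn + (1UL << order) - 1;

	return pfn / MAX_ORDER_NR_PAGES != last / MAX_ORDER_NR_PAGES;
}

/*
 * Hypothetical helper: walk every naturally aligned block start below
 * max_pfn, for every order up to MAX_PAGE_ORDER, and count the blocks
 * that cross a MAX_ORDER_NR_PAGES boundary.
 */
static unsigned long count_aligned_straddlers(unsigned long max_pfn)
{
	unsigned long count = 0;
	unsigned int order;
	unsigned long pfn;

	for (order = 0; order <= MAX_PAGE_ORDER; order++)
		for (pfn = 0; pfn < max_pfn; pfn += 1UL << order)
			if (block_straddles(pfn, order))
				count++;

	return count;
}
```

With these definitions, count_aligned_straddlers() finds no straddling block
for aligned starts, while block_straddles(1000, 9) is true because pfn 1000 is
not aligned to an order-9 block.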
> Signed-off-by: Ye Liu <ye.liu@linux.dev>
Other than that, LGTM
Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
> ---
> mm/page_owner.c | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/mm/page_owner.c b/mm/page_owner.c
> index 46a933f9c229..2e3880053a34 100644
> --- a/mm/page_owner.c
> +++ b/mm/page_owner.c
> @@ -428,6 +428,12 @@ void __folio_copy_owner(struct folio *newfolio, struct folio *old)
> * to skip less than the full buddy block, but that is acceptable for page owner
> * iteration purposes.
> *
> + * The lockless read of buddy_order_unsafe() can also return a garbage order if
> + * the page is concurrently allocated and PageBuddy is cleared between the check
> + * and the read. Clamp the advance at the next MAX_ORDER_NR_PAGES boundary so
> + * that a bogus order cannot carry @pfn into an unvalidated memory section,
> + * which would break callers that rely on boundary-aligned pfn_valid() checks.
> + *
> * Return: true if the page was skipped (caller should continue its loop),
> * false if the page is not a buddy page and should be processed normally.
> */
> @@ -439,8 +445,12 @@ static inline bool skip_buddy_pages(unsigned long *pfn, struct page *page)
>  		return false;
>
>  	order = buddy_order_unsafe(page);
> -	if (order <= MAX_PAGE_ORDER)
> -		*pfn += (1UL << order) - 1;
> +	if (order <= MAX_PAGE_ORDER) {
> +		unsigned long new_pfn = *pfn + (1UL << order);
> +		unsigned long boundary = ALIGN(*pfn + 1, MAX_ORDER_NR_PAGES);
> +
> +		*pfn = min(new_pfn, boundary) - 1;
> +	}
>
>  	return true;
>  }
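
For reference, the clamp arithmetic in the hunk above can be checked in
userspace with a sketch like the following. It is not the kernel code:
MAX_PAGE_ORDER is assumed to be 10, ALIGN_UP() and min_ul() are stand-ins for
the kernel's ALIGN() and min(), and clamped_skip()/unclamped_skip() are
hypothetical names that return the value *pfn would end up with.

```c
#include <assert.h>

/* Assumption: a common default; the real value depends on the kernel config. */
#define MAX_PAGE_ORDER		10
#define MAX_ORDER_NR_PAGES	(1UL << MAX_PAGE_ORDER)

/* Userspace stand-in for the kernel's ALIGN(); 'a' must be a power of two. */
#define ALIGN_UP(x, a)		(((x) + (a) - 1) & ~((a) - 1))

/* Userspace stand-in for the kernel's min(). */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Old behaviour: advance by the full (possibly bogus) order. */
static unsigned long unclamped_skip(unsigned long pfn, unsigned int order)
{
	return pfn + (1UL << order) - 1;
}

/* Patched behaviour: never move past the next MAX_ORDER_NR_PAGES boundary. */
static unsigned long clamped_skip(unsigned long pfn, unsigned int order)
{
	unsigned long new_pfn = pfn + (1UL << order);
	unsigned long boundary = ALIGN_UP(pfn + 1, MAX_ORDER_NR_PAGES);

	return min_ul(new_pfn, boundary) - 1;
}
```

For a correctly aligned block both variants agree, e.g. pfn 0 with order 10
yields 1023 either way. For a bogus order 9 read at pfn 1000, the unclamped
variant lands on 1511, past the boundary at 1024, while the clamped one stops
at 1023 so the caller's next iteration starts exactly on the boundary.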