From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Zi Yan <ziy@nvidia.com>, Kaitao Cheng <kaitao.cheng@linux.dev>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Vlastimil Babka <vbabka@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>,
Brendan Jackman <jackmanb@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Liu Shixin <liushixin2@huawei.com>,
Oscar Salvador <osalvador@suse.de>,
muchun.song@linux.dev, linux-mm@kvack.org,
linux-kernel@vger.kernel.org,
Kaitao Cheng <chengkaitao@kylinos.cn>
Subject: Re: [PATCH v2] mm: page_isolation: avoid unsafe folio reads while scanning compound pages
Date: Tue, 2 Jun 2026 20:56:36 +0200
Message-ID: <b03c0d32-07e6-4148-bebd-50784aa3cbb1@kernel.org>
In-Reply-To: <34F0943E-4ACD-4655-B9B8-41F658FCDB7E@nvidia.com>
On 6/2/26 17:02, Zi Yan wrote:
> On 2 Jun 2026, at 9:07, Kaitao Cheng wrote:
>
>> From: Kaitao Cheng <chengkaitao@kylinos.cn>
>>
>> page_is_unmovable() can inspect compound pages without holding a folio
>> reference or any lock. The folio can therefore be freed, split or reused
>> while the scanner is still looking at it.
>>
>> The existing HugeTLB handling already avoids folio_hstate() for this
>> reason, but it still derives the hstate from folio_size() and later
>> derives the scan step from folio_nr_pages() and folio_page_idx().
>> These helpers rely on the folio still being a valid folio head. If
>> the folio changed concurrently, the scanner can read inconsistent folio
>> metadata and compute a wrong step. In the worst case, folio_nr_pages()
>> can return 1 for what used to be a tail page and the subtraction from
>> folio_page_idx() can underflow.
>>
>> There is a similar issue for non-HugeTLB compound pages: folio_test_lru()
>> expects a valid folio. If the previously observed head page has been
>> reused as a tail page of another compound page, the folio flag checks
>> can trigger VM_BUG_ON_PGFLAGS().
>>
>> Read the compound order once with compound_order(), reject obviously
>> bogus orders, and derive the hstate and scan step from that order
>> instead of querying folio size information again. Also use PageLRU(page),
>> which is safe for the page being scanned, instead of folio_test_lru()
>> on a potentially stale folio pointer.
>>
>> Treat an unknown HugeTLB hstate as unmovable so the scanner does not try
>> to skip over an unstable HugeTLB folio.
>>
>> Fixes: a0a9f2180b90 ("mm: page_isolation: avoid calling folio_hstate() without hugetlb_lock")
>> Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
>> ---
>> Changes in v2:
>> - Avoid unsafe folio metadata reads in the unlocked scanner by deriving
>> the hstate and scan step from compound_order(). (David Hildenbrand,
>> Andrew Morton)
>> - Treat invalid compound orders or unknown HugeTLB hstates as unmovable.
>> - Use PageLRU(page) instead of folio_test_lru(folio) to avoid folio flag
>> checks on a stale folio pointer. ()
>> - Update the commit log (David Hildenbrand)
>>
>> Link to v1:
>> https://lore.kernel.org/all/20260519121646.40833-1-kaitao.cheng@linux.dev/
>>
>> ---
>> mm/page_isolation.c | 19 +++++++++++++------
>> 1 file changed, 13 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>> index 7a9d631945a3..32ce8a7d9df3 100644
>> --- a/mm/page_isolation.c
>> +++ b/mm/page_isolation.c
>> @@ -41,8 +41,14 @@ bool page_is_unmovable(struct zone *zone, struct page *page,
>> * We need not scan over tail pages because we don't
>> * handle each tail page individually in migration.
>> */
>> - if (PageHuge(page) || PageCompound(page)) {
>> + if (PageCompound(page)) {
>> struct folio *folio = page_folio(page);
>> + unsigned long nr_pages, pfn;
>> + unsigned int order;
>> +
>> + order = compound_order(&folio->page);
>> + if (order > MAX_FOLIO_ORDER)
>> + return true;
>>
>> if (folio_test_hugetlb(folio)) {
>> struct hstate *h;
>> @@ -54,15 +60,16 @@ bool page_is_unmovable(struct zone *zone, struct page *page,
>> * The huge page may be freed so can not
>> * use folio_hstate() directly.
>> */
>> - h = size_to_hstate(folio_size(folio));
>> - if (h && !hugepage_migration_supported(h))
>> + h = size_to_hstate(PAGE_SIZE << order);
>> + if (!h || !hugepage_migration_supported(h))
>> return true;
>> -
>> - } else if (!folio_test_lru(folio)) {
>> + } else if (!PageLRU(page)) {
>> return true;
>> }
>>
>> - *step = folio_nr_pages(folio) - folio_page_idx(folio, page);
>> + nr_pages = 1UL << order;
>> + pfn = page_to_pfn(page);
>> + *step = (pfn | (nr_pages - 1)) + 1 - pfn;
>> return false;
>> }
>
> LGTM. Thanks.
>
> Just a comment: order can be dropped in favor of
> nr_pages = compound_nr(&folio->page):
> 1. order > MAX_FOLIO_ORDER -> nr_pages > MAX_FOLIO_NR_PAGES
> 2. PAGE_SIZE << order -> PAGE_SIZE * nr_pages.
> But it is not worth a new version.
If we use compound_nr(), we have to check for a power of 2, like
scan_movable_pages() does, as folio_large_nr_pages() might use the
pre-computed value (which might read garbage).
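
To make the difference concrete, here is a rough userspace sketch of the two
variants, not the kernel code itself. SKETCH_MAX_ORDER, step_from_order() and
nr_pages_is_sane() are hypothetical names for illustration only; the real
limit would be MAX_FOLIO_ORDER, whose value depends on the configuration.
With an order, a single bound check is enough before shifting. With a raw
page count, the value also has to be a non-zero power of 2 before it can be
used as a block size.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cap for this sketch; stands in for MAX_FOLIO_ORDER. */
#define SKETCH_MAX_ORDER 18u

/*
 * Number of pages from pfn up to the end of the naturally aligned
 * 2^order block that contains pfn. Same arithmetic as the patch:
 * (pfn | (nr_pages - 1)) is the last pfn of that block.
 */
static inline unsigned long step_from_order(unsigned long pfn,
					    unsigned int order)
{
	unsigned long nr_pages;

	/* The caller is expected to have rejected bogus orders already. */
	assert(order <= SKETCH_MAX_ORDER);
	nr_pages = 1UL << order;
	return (pfn | (nr_pages - 1)) + 1 - pfn;
}

/*
 * A page count read without a reference might be garbage, so it needs
 * more validation than an order: non-zero, power of 2, and bounded.
 */
static inline bool nr_pages_is_sane(unsigned long nr_pages)
{
	return nr_pages != 0 &&
	       (nr_pages & (nr_pages - 1)) == 0 &&
	       nr_pages <= (1UL << SKETCH_MAX_ORDER);
}
```

For a pfn that sits on the last page of an order-2 block the step is 1, and
for the first page of that block it is 4, so the scanner always lands on the
first pfn after the block no matter where inside it the scan started.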
--
Cheers,
David