Re: [PATCH v2] mm: page_isolation: avoid unsafe folio reads while scanning compound pages

From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Zi Yan <ziy@nvidia.com>, Kaitao Cheng <kaitao.cheng@linux.dev>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	Brendan Jackman <jackmanb@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Liu Shixin <liushixin2@huawei.com>,
	Oscar Salvador <osalvador@suse.de>,
	muchun.song@linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	Kaitao Cheng <chengkaitao@kylinos.cn>
Subject: Re: [PATCH v2] mm: page_isolation: avoid unsafe folio reads while scanning compound pages
Date: Tue, 2 Jun 2026 20:56:36 +0200
Message-ID: <b03c0d32-07e6-4148-bebd-50784aa3cbb1@kernel.org>
In-Reply-To: <34F0943E-4ACD-4655-B9B8-41F658FCDB7E@nvidia.com>

On 6/2/26 17:02, Zi Yan wrote:
> On 2 Jun 2026, at 9:07, Kaitao Cheng wrote:
> 
>> From: Kaitao Cheng <chengkaitao@kylinos.cn>
>>
>> page_is_unmovable() can inspect compound pages without holding a folio
>> reference or any lock. The folio can therefore be freed, split or reused
>> while the scanner is still looking at it.
>>
>> The existing HugeTLB handling already avoids folio_hstate() for this
>> reason, but it still derives the hstate from folio_size() and later
>> derives the scan step from folio_nr_pages() and folio_page_idx().
>> These helpers rely on the folio still being a valid folio head. If
>> the folio changed concurrently, the scanner can read inconsistent folio
>> metadata and compute a wrong step. In the worst case, folio_nr_pages()
>> can return 1 for what used to be a tail page and the subtraction from
>> folio_page_idx() can underflow.
>>
>> There is a similar issue for non-HugeTLB compound pages: folio_test_lru()
>> expects a valid folio. If the previously observed head page has been
>> reused as a tail page of another compound page, the folio flag checks
>> can trigger VM_BUG_ON_PGFLAGS().
>>
>> Read the compound order once with compound_order(), reject obviously
>> bogus orders, and derive the hstate and scan step from that order
>> instead of querying folio size information again. Also use PageLRU(page),
>> which is safe for the page being scanned, instead of folio_test_lru()
>> on a potentially stale folio pointer.
>>
>> Treat an unknown HugeTLB hstate as unmovable so the scanner does not try
>> to skip over an unstable HugeTLB folio.
>>
>> Fixes: a0a9f2180b90 ("mm: page_isolation: avoid calling folio_hstate() without hugetlb_lock")
>> Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
>> ---
>> Changes in v2:
>> - Avoid unsafe folio metadata reads in the unlocked scanner by deriving
>>   the hstate and scan step from compound_order(). (David Hildenbrand,
>>   Andrew Morton)
>> - Treat invalid compound orders or unknown HugeTLB hstates as unmovable.
>> - Use PageLRU(page) instead of folio_test_lru(folio) to avoid folio flag
>>   checks on a stale folio pointer. ()
>> - Update the commit log (David Hildenbrand)
>>
>> Link to v1:
>> https://lore.kernel.org/all/20260519121646.40833-1-kaitao.cheng@linux.dev/
>>
>> ---
>>  mm/page_isolation.c | 19 +++++++++++++------
>>  1 file changed, 13 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>> index 7a9d631945a3..32ce8a7d9df3 100644
>> --- a/mm/page_isolation.c
>> +++ b/mm/page_isolation.c
>> @@ -41,8 +41,14 @@ bool page_is_unmovable(struct zone *zone, struct page *page,
>>  	 * We need not scan over tail pages because we don't
>>  	 * handle each tail page individually in migration.
>>  	 */
>> -	if (PageHuge(page) || PageCompound(page)) {
>> +	if (PageCompound(page)) {
>>  		struct folio *folio = page_folio(page);
>> +		unsigned long nr_pages, pfn;
>> +		unsigned int order;
>> +
>> +		order = compound_order(&folio->page);
>> +		if (order > MAX_FOLIO_ORDER)
>> +			return true;
>>
>>  		if (folio_test_hugetlb(folio)) {
>>  			struct hstate *h;
>> @@ -54,15 +60,16 @@ bool page_is_unmovable(struct zone *zone, struct page *page,
>>  			 * The huge page may be freed so can not
>>  			 * use folio_hstate() directly.
>>  			 */
>> -			h = size_to_hstate(folio_size(folio));
>> -			if (h && !hugepage_migration_supported(h))
>> +			h = size_to_hstate(PAGE_SIZE << order);
>> +			if (!h || !hugepage_migration_supported(h))
>>  				return true;
>> -
>> -		} else if (!folio_test_lru(folio)) {
>> +		} else if (!PageLRU(page)) {
>>  			return true;
>>  		}
>>
>> -		*step = folio_nr_pages(folio) - folio_page_idx(folio, page);
>> +		nr_pages = 1UL << order;
>> +		pfn = page_to_pfn(page);
>> +		*step = (pfn | (nr_pages - 1)) + 1 - pfn;
>>  		return false;
>>  	}
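
To make the new step computation easier to check, here is a minimal
userspace sketch of the arithmetic in the hunk above. It is not kernel
code: scan_step() and scan_step_selfcheck() are hypothetical names, and
the sketch assumes that a compound page of a given order starts at a
naturally aligned pfn.

```c
#include <assert.h>

/*
 * Hypothetical userspace model of the step computation in the patch.
 * Returns the number of pages from pfn up to and including the last
 * page of the naturally aligned 2^order block that contains pfn.
 * Assumption: order is small enough that 1UL << order does not overflow.
 */
static unsigned long scan_step(unsigned long pfn, unsigned int order)
{
	unsigned long nr_pages = 1UL << order;

	/* (pfn | (nr_pages - 1)) is the last pfn of the aligned block. */
	return (pfn | (nr_pages - 1)) + 1 - pfn;
}

/* A few spot checks for an order-9 block (512 pages). */
static void scan_step_selfcheck(void)
{
	assert(scan_step(0x1000UL, 9) == 512);	/* head page: whole block */
	assert(scan_step(0x1003UL, 9) == 509);	/* tail page: remainder */
	assert(scan_step(0x11ffUL, 9) == 1);	/* last page of the block */
	assert(scan_step(42UL, 0) == 1);	/* order 0: single page */
}
```

With a sane order the result always lies in [1, nr_pages], so it cannot
underflow the way folio_nr_pages() - folio_page_idx() can when the folio
changes underneath the scanner.
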
> 
> LGTM. Thanks.
> 
> Just a comment: order can be dropped by using
> nr_pages = compound_nr(&folio->page) instead:
> 1. order > MAX_FOLIO_ORDER -> nr_pages > MAX_FOLIO_NR_PAGES
> 2. PAGE_SIZE << order -> PAGE_SIZE * nr_pages.
> But it is not worth a new version.

If we use compound_nr(), we have to check for a power of 2, like
scan_movable_pages() does, as folio_large_nr_pages() might use the precomputed
value (which might read garbage).
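
As a rough illustration of the kind of check I mean, here is a userspace
sketch, not the actual kernel code. nr_pages_is_sane() and
MODEL_MAX_FOLIO_NR_PAGES are hypothetical names, and the cap value is an
assumption picked for the example only.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed upper bound for illustration only; not the kernel's value. */
#define MODEL_MAX_FOLIO_NR_PAGES (1UL << 18)

/*
 * Hypothetical validation of a page count that was read without holding
 * a reference or lock. Anything that is zero, not a power of two, or
 * above the cap is treated as garbage.
 */
static bool nr_pages_is_sane(unsigned long nr_pages)
{
	/* A power of two has exactly one bit set. */
	if (nr_pages == 0 || (nr_pages & (nr_pages - 1)) != 0)
		return false;

	return nr_pages <= MODEL_MAX_FOLIO_NR_PAGES;
}

/* Spot checks covering valid counts and the three rejection cases. */
static void nr_pages_selfcheck(void)
{
	assert(nr_pages_is_sane(1));
	assert(nr_pages_is_sane(512));
	assert(!nr_pages_is_sane(0));		/* zero */
	assert(!nr_pages_is_sane(513));		/* not a power of two */
	assert(!nr_pages_is_sane(1UL << 19));	/* above the assumed cap */
}
```

The power-of-2 test matters because a mask like nr_pages - 1 only
describes an aligned block when nr_pages is a power of two. A value read
from an order is a power of two by construction, which is why the order
based variant gets away with a single range check.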

-- 
Cheers,

David
