All of lore.kernel.org
 help / color / mirror / Atom feed
From: "David Hildenbrand (Arm)" <david@kernel.org>
To: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
Cc: linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Oscar Salvador <osalvador@suse.de>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@kernel.org>,
	Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	Sidhartha Kumar <sidhartha.kumar@oracle.com>,
	linux-mm@kvack.org, linux-cxl@vger.kernel.org,
	linux-riscv@lists.infradead.org
Subject: Re: [PATCH v2 01/15] mm/memory_hotplug: fix possible race in scan_movable_pages()
Date: Mon, 23 Mar 2026 14:40:16 +0100	[thread overview]
Message-ID: <4caebf3b-8924-406f-9b92-909cf86ef669@kernel.org> (raw)
In-Reply-To: <5b489c40-bd89-4dff-9669-560b1b28ec69@lucifer.local>

On 3/23/26 14:26, Lorenzo Stoakes (Oracle) wrote:
> On Fri, Mar 20, 2026 at 11:13:33PM +0100, David Hildenbrand (Arm) wrote:
>> If a hugetlb folio gets freed while we are in scan_movable_pages(),
>> folio_nr_pages() could return 0, resulting in or'ing "0 - 1 = -1"
>> to the PFN, resulting in PFN = -1. We're not holding any locks or
>> references that would prevent that.
>>
>> for_each_valid_pfn() would then search for the next valid PFN, and could
>> return a PFN that is outside of the range of the original requested
>> range. do_migrate_page() would then try to migrate quite a big range,
>> which is certainly undesirable.
>>
>> To fix it, simply test for valid folio_nr_pages() values. While at it,
>> as PageHuge() really just does a page_folio() internally, we can just
>> use folio_test_hugetlb() on the folio directly.
>>
>> scan_movable_pages() is expected to be fast, and we try to avoid taking
>> locks or grabbing references. We cannot use folio_try_get() as that does
>> not work for free hugetlb folios. We could grab the hugetlb_lock, but
>> that just adds complexity.
>>
>> The race is unlikely to trigger in practice, so we won't be CCing
>> stable.
>>
>> Fixes: 16540dae959d ("mm/hugetlb: mm/memory_hotplug: use a folio in scan_movable_pages()")
>> Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
> 
> Logic looks right to me, though some nits below. With those accounted for:
> 
> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
> 
>> ---
>>  mm/memory_hotplug.c | 11 ++++++++---
>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 86d3faf50453..969cd7ddf68f 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1747,6 +1747,7 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>>  	unsigned long pfn;
>>
>>  	for_each_valid_pfn(pfn, start, end) {
>> +		unsigned long nr_pages;
>>  		struct page *page;
>>  		struct folio *folio;
>>
>> @@ -1763,9 +1764,9 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>>  		if (PageOffline(page) && page_count(page))
>>  			return -EBUSY;
>>
>> -		if (!PageHuge(page))
> 
> Yeah interesting to see this is folio_test_hugetlb(page_folio(page)) :))
> 
> So this is a nice change for sure.
> 
>> -			continue;
>>  		folio = page_folio(page);
>> +		if (!folio_test_hugetlb(folio))
>> +			continue;
>>  		/*
>>  		 * This test is racy as we hold no reference or lock.  The
>>  		 * hugetlb page could have been free'ed and head is no longer
>> @@ -1775,7 +1776,11 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>>  		 */
>>  		if (folio_test_hugetlb_migratable(folio))
>>  			goto found;
>> -		pfn |= folio_nr_pages(folio) - 1;
>> +		nr_pages = folio_nr_pages(folio);
>> +		if (unlikely(nr_pages < 1 || nr_pages > MAX_FOLIO_NR_PAGES ||
> 
> NIT: since nr_pages is an unsigned long, would this be better as !nr_pages || ...?

It's easier on the brain when spotting that only a given range is
allowed, without having to remember the exact type of the variable :)

So I guess it doesn't really make a difference in the end.

> 
>> +			     !is_power_of_2(nr_pages)))
> 
> Could the latter two conditions ever really happen? I guess some weird tearing
> or something maybe?

Yes, or when the fields gets reused for something else.

> 
> It would also be nice to maybe separate this out as is_valid_nr_pages() or
> something, but then again, I suppose given this is a rare case of us
> checking this under circumstances where the value might not be valid, maybe
> not worth it.

I had the same thought. But this code is way too special regarding
raciness that I hope nobody else will really require this ... and if
they do, they might be doing something wrong :)

-- 
Cheers,

David

WARNING: multiple messages have this Message-ID (diff)
From: "David Hildenbrand (Arm)" <david@kernel.org>
To: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
Cc: linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Oscar Salvador <osalvador@suse.de>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@kernel.org>,
	Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	Sidhartha Kumar <sidhartha.kumar@oracle.com>,
	linux-mm@kvack.org, linux-cxl@vger.kernel.org,
	linux-riscv@lists.infradead.org
Subject: Re: [PATCH v2 01/15] mm/memory_hotplug: fix possible race in scan_movable_pages()
Date: Mon, 23 Mar 2026 14:40:16 +0100	[thread overview]
Message-ID: <4caebf3b-8924-406f-9b92-909cf86ef669@kernel.org> (raw)
In-Reply-To: <5b489c40-bd89-4dff-9669-560b1b28ec69@lucifer.local>

On 3/23/26 14:26, Lorenzo Stoakes (Oracle) wrote:
> On Fri, Mar 20, 2026 at 11:13:33PM +0100, David Hildenbrand (Arm) wrote:
>> If a hugetlb folio gets freed while we are in scan_movable_pages(),
>> folio_nr_pages() could return 0, resulting in or'ing "0 - 1 = -1"
>> to the PFN, resulting in PFN = -1. We're not holding any locks or
>> references that would prevent that.
>>
>> for_each_valid_pfn() would then search for the next valid PFN, and could
>> return a PFN that is outside of the range of the original requested
>> range. do_migrate_page() would then try to migrate quite a big range,
>> which is certainly undesirable.
>>
>> To fix it, simply test for valid folio_nr_pages() values. While at it,
>> as PageHuge() really just does a page_folio() internally, we can just
>> use folio_test_hugetlb() on the folio directly.
>>
>> scan_movable_pages() is expected to be fast, and we try to avoid taking
>> locks or grabbing references. We cannot use folio_try_get() as that does
>> not work for free hugetlb folios. We could grab the hugetlb_lock, but
>> that just adds complexity.
>>
>> The race is unlikely to trigger in practice, so we won't be CCing
>> stable.
>>
>> Fixes: 16540dae959d ("mm/hugetlb: mm/memory_hotplug: use a folio in scan_movable_pages()")
>> Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
> 
> Logic looks right to me, though some nits below. With those accounted for:
> 
> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
> 
>> ---
>>  mm/memory_hotplug.c | 11 ++++++++---
>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 86d3faf50453..969cd7ddf68f 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1747,6 +1747,7 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>>  	unsigned long pfn;
>>
>>  	for_each_valid_pfn(pfn, start, end) {
>> +		unsigned long nr_pages;
>>  		struct page *page;
>>  		struct folio *folio;
>>
>> @@ -1763,9 +1764,9 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>>  		if (PageOffline(page) && page_count(page))
>>  			return -EBUSY;
>>
>> -		if (!PageHuge(page))
> 
> Yeah interesting to see this is folio_test_hugetlb(page_folio(page)) :))
> 
> So this is a nice change for sure.
> 
>> -			continue;
>>  		folio = page_folio(page);
>> +		if (!folio_test_hugetlb(folio))
>> +			continue;
>>  		/*
>>  		 * This test is racy as we hold no reference or lock.  The
>>  		 * hugetlb page could have been free'ed and head is no longer
>> @@ -1775,7 +1776,11 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>>  		 */
>>  		if (folio_test_hugetlb_migratable(folio))
>>  			goto found;
>> -		pfn |= folio_nr_pages(folio) - 1;
>> +		nr_pages = folio_nr_pages(folio);
>> +		if (unlikely(nr_pages < 1 || nr_pages > MAX_FOLIO_NR_PAGES ||
> 
> NIT: since nr_pages is an unsigned long, would this be better as !nr_pages || ...?

It's easier on the brain when spotting that only a given range is
allowed, without having to remember the exact type of the variable :)

So I guess it doesn't really make a difference in the end.

> 
>> +			     !is_power_of_2(nr_pages)))
> 
> Could the latter two conditions ever really happen? I guess some weird tearing
> or something maybe?

Yes, or when the fields gets reused for something else.

> 
> It would also be nice to maybe separate this out as is_valid_nr_pages() or
> something, but then again, I suppose given this is a rare case of us
> checking this under circumstances where the value might not be valid, maybe
> not worth it.

I had the same thought. But this code is way too special regarding
raciness that I hope nobody else will really require this ... and if
they do, they might be doing something wrong :)

-- 
Cheers,

David

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

  reply	other threads:[~2026-03-23 13:40 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-20 22:13 [PATCH v2 00/15] mm: memory hot(un)plug and SPARSEMEM cleanups David Hildenbrand (Arm)
2026-03-20 22:13 ` David Hildenbrand (Arm)
2026-03-20 22:13 ` [PATCH v2 01/15] mm/memory_hotplug: fix possible race in scan_movable_pages() David Hildenbrand (Arm)
2026-03-20 22:13   ` David Hildenbrand (Arm)
2026-03-23 13:26   ` Lorenzo Stoakes (Oracle)
2026-03-23 13:26     ` Lorenzo Stoakes (Oracle)
2026-03-23 13:40     ` David Hildenbrand (Arm) [this message]
2026-03-23 13:40       ` David Hildenbrand (Arm)
2026-03-23 14:00       ` Lorenzo Stoakes (Oracle)
2026-03-23 14:00         ` Lorenzo Stoakes (Oracle)
2026-03-20 22:13 ` [PATCH v2 02/15] mm/memory_hotplug: remove for_each_valid_pfn() usage David Hildenbrand (Arm)
2026-03-20 22:13   ` David Hildenbrand (Arm)
2026-03-20 22:13 ` [PATCH v2 03/15] mm/sparse: remove WARN_ONs from (online|offline)_mem_sections() David Hildenbrand (Arm)
2026-03-20 22:13   ` David Hildenbrand (Arm)
2026-03-20 22:13 ` [PATCH v2 04/15] mm/Kconfig: make CONFIG_MEMORY_HOTPLUG depend on CONFIG_SPARSEMEM_VMEMMAP David Hildenbrand (Arm)
2026-03-20 22:13   ` David Hildenbrand (Arm)
2026-03-20 22:13 ` [PATCH v2 05/15] mm/memory_hotplug: simplify check_pfn_span() David Hildenbrand (Arm)
2026-03-20 22:13   ` David Hildenbrand (Arm)
2026-03-20 22:13 ` [PATCH v2 06/15] mm/sparse: remove !CONFIG_SPARSEMEM_VMEMMAP leftovers for CONFIG_MEMORY_HOTPLUG David Hildenbrand (Arm)
2026-03-20 22:13   ` David Hildenbrand (Arm)
2026-03-20 22:13 ` [PATCH v2 07/15] mm/bootmem_info: remove handling for !CONFIG_SPARSEMEM_VMEMMAP David Hildenbrand (Arm)
2026-03-20 22:13   ` David Hildenbrand (Arm)
2026-03-20 22:13 ` [PATCH v2 08/15] mm/bootmem_info: avoid using sparse_decode_mem_map() David Hildenbrand (Arm)
2026-03-20 22:13   ` David Hildenbrand (Arm)
2026-03-20 22:13 ` [PATCH v2 09/15] mm/sparse: remove sparse_decode_mem_map() David Hildenbrand (Arm)
2026-03-20 22:13   ` David Hildenbrand (Arm)
2026-03-20 22:13 ` [PATCH v2 10/15] mm/sparse: remove CONFIG_MEMORY_HOTPLUG-specific usemap allocation handling David Hildenbrand (Arm)
2026-03-20 22:13   ` David Hildenbrand (Arm)
2026-03-20 22:13 ` [PATCH v2 11/15] mm: prepare to move subsection_map_init() to mm/sparse-vmemmap.c David Hildenbrand (Arm)
2026-03-20 22:13   ` David Hildenbrand (Arm)
2026-03-20 22:13 ` [PATCH v2 12/15] mm/sparse: drop set_section_nid() from sparse_add_section() David Hildenbrand (Arm)
2026-03-20 22:13   ` David Hildenbrand (Arm)
2026-03-20 22:13 ` [PATCH v2 13/15] mm/sparse: move sparse_init_one_section() to internal.h David Hildenbrand (Arm)
2026-03-20 22:13   ` David Hildenbrand (Arm)
2026-03-23  8:49   ` David Hildenbrand (Arm)
2026-03-23  8:49     ` David Hildenbrand (Arm)
2026-03-20 22:13 ` [PATCH v2 14/15] mm/sparse: move __section_mark_present() " David Hildenbrand (Arm)
2026-03-20 22:13   ` David Hildenbrand (Arm)
2026-03-20 22:13 ` [PATCH v2 15/15] mm/sparse: move memory hotplug bits to sparse-vmemmap.c David Hildenbrand (Arm)
2026-03-20 22:13   ` David Hildenbrand (Arm)
2026-03-27  2:58 ` [PATCH v2 00/15] mm: memory hot(un)plug and SPARSEMEM cleanups Andrew Morton
2026-03-27  2:58   ` Andrew Morton
2026-03-27 10:00   ` Lorenzo Stoakes (Oracle)
2026-03-27 10:00     ` Lorenzo Stoakes (Oracle)
2026-04-30  3:25 ` patchwork-bot+linux-riscv
2026-04-30  3:25   ` patchwork-bot+linux-riscv

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4caebf3b-8924-406f-9b92-909cf86ef669@kernel.org \
    --to=david@kernel.org \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=axelrasmussen@google.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=ljs@kernel.org \
    --cc=mhocko@suse.com \
    --cc=osalvador@suse.de \
    --cc=rppt@kernel.org \
    --cc=sidhartha.kumar@oracle.com \
    --cc=surenb@google.com \
    --cc=vbabka@kernel.org \
    --cc=weixugc@google.com \
    --cc=yuanchu@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.