Linux-mm Archive on lore.kernel.org
* [PATCH v2] mm: page_ext: add count limit to page_ext_iter_next to prevent invalid PFN access
@ 2026-06-22 14:14 Ketan
  2026-06-22 14:20 ` David Hildenbrand (Arm)
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Ketan @ 2026-06-22 14:14 UTC
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Brendan Jackman, Johannes Weiner, Zi Yan, Luiz Capitulino,
	David Hildenbrand
  Cc: kernel, stable, linux-mm, linux-kernel, Matthew Wilcox,
	Ketan Kishore, Lorenzo Stoakes, Liam R. Howlett, Mike Rapoport,
	David Hildenbrand

The page_ext iteration API does not validate whether the PFN still
belongs to a valid section while advancing the iterator. When
memory is added dynamically in the hotplug path, this can lead to a
NULL pointer dereference in page_ext_lookup() at the boundary of
the last valid section, once the iterator index reaches __pgcount.

The for_each_page_ext() macro uses
"__page_ext = page_ext_iter_next(&__iter)" as its loop increment,
which runs before the "__iter.index < __pgcount" condition is
re-checked. On the final pass, page_ext_iter_next() therefore
increments iter->index to __pgcount and calls
page_ext_lookup(start_pfn + __pgcount). During memory hotplug
(online), the PFN at start_pfn + __pgcount may belong to a section
that has not yet been initialized, causing page_ext_lookup() to
trigger a NULL pointer dereference.

[   14.555124][  T846] Call trace:
[   14.555125][  T846]  lookup_page_ext+0x6c/0x108 (P)
[   14.555127][  T846]  page_ext_lookup+0x30/0x3c
[   14.555129][  T846]  __reset_page_owner+0x11c/0x260
[   14.571201][  T846]  __free_pages_ok+0x5e8/0x8e0
[   14.571204][  T846]  __free_pages_core+0x78/0xf0
[   14.571206][  T846]  generic_online_page+0x14/0x24
[   14.597782][  T846]  online_pages+0x178/0x30c
[   14.597784][  T846]  memory_block_change_state+0x284/0x32c
[   14.597787][  T846]  memory_subsys_online+0x4c/0x64
[   14.597789][  T846]  device_online+0x88/0xb0
[   14.597791][  T846]  online_memory_block+0x30/0x40
[   14.597793][  T846]  walk_memory_blocks+0xac/0xe8
[   14.597794][  T846]  add_memory_resource+0x280/0x298
[   14.656161][  T846]  add_memory+0x60/0x98

Move the iteration boundary enforcement inside the iterator
functions, so callers cannot inadvertently access beyond the
requested range.
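
As a rough userspace illustration of the boundary problem described
above (not kernel code: model_lookup(), struct walk_result and both
walk_*() helpers are hypothetical stand-ins for page_ext_lookup() and
the iterator), the sketch below counts how many lookups land at or
beyond start_pfn + count. It assumes the old loop shape, where the
index check runs only after the increment step has already done its
lookup, and compares it with a shape that checks the bound first.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for one walk over [start_pfn, start_pfn + count). */
struct walk_result {
	unsigned long visited;	/* loop body executions */
	unsigned long past_end;	/* lookups at PFNs >= start_pfn + count */
};

/*
 * Stand-in for page_ext_lookup(): it only records out-of-range lookups.
 * In the kernel, such a lookup may hit an uninitialized section.
 */
static bool model_lookup(unsigned long pfn, unsigned long end_pfn,
			 struct walk_result *res)
{
	if (pfn >= end_pfn)
		res->past_end++;
	return true;
}

/* Old shape: the increment step looks up first, the condition checks later. */
static struct walk_result walk_check_in_caller(unsigned long start_pfn,
					       unsigned long count)
{
	struct walk_result res = { 0, 0 };
	unsigned long index = 0;
	bool ok = model_lookup(start_pfn, start_pfn + count, &res);

	while (ok && index < count) {
		res.visited++;
		index++;
		ok = model_lookup(start_pfn + index, start_pfn + count, &res);
	}
	return res;
}

/* Bounded shape: the iterator refuses to look up anything past the range. */
static struct walk_result walk_check_in_iterator(unsigned long start_pfn,
						 unsigned long count)
{
	struct walk_result res = { 0, 0 };
	unsigned long index = 0;
	bool ok = count && model_lookup(start_pfn, start_pfn + count, &res);

	while (ok) {
		res.visited++;
		if (++index >= count)
			break;
		ok = model_lookup(start_pfn + index, start_pfn + count, &res);
	}
	return res;
}

/* Walk 4 pages starting at PFN 4 with both shapes. */
static void model_check(void)
{
	assert(walk_check_in_caller(4, 4).visited == 4);
	assert(walk_check_in_caller(4, 4).past_end == 1);
	assert(walk_check_in_iterator(4, 4).visited == 4);
	assert(walk_check_in_iterator(4, 4).past_end == 0);
}
```

Both shapes run the loop body four times; only the first one performs
the extra lookup at start_pfn + count, which is the access the crash
trace below the description points at.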

Fixes: 9039b9096ea2 ("mm: page_owner: use new iteration API")
Cc: stable@vger.kernel.org
Suggested-by: David Hildenbrand <david@redhat.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Ketan Kishore <ketan.kishore@oss.qualcomm.com>
---
Changes in v2:
- Incorporated comments from David and Matthew to check for invalid PFN
  in page_ext iterator rather than checking for NULL section in
  page_ext_lookup.
- Minor improvement in commit description to include the issue with
  page_ext_iter_next
- Link to v1: https://patch.msgid.link/20260617-page_ext-v1-1-37ad802b1a38@oss.qualcomm.com

To: Andrew Morton <akpm@linux-foundation.org>
To: David Hildenbrand <david@kernel.org>
To: Lorenzo Stoakes <ljs@kernel.org>
To: "Liam R. Howlett" <liam@infradead.org>
To: Vlastimil Babka <vbabka@kernel.org>
To: Mike Rapoport <rppt@kernel.org>
To: Suren Baghdasaryan <surenb@google.com>
To: Michal Hocko <mhocko@suse.com>
To: Luiz Capitulino <luizcap@redhat.com>
Cc: kernel@oss.qualcomm.com
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 include/linux/page_ext.h | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/include/linux/page_ext.h b/include/linux/page_ext.h
index 61e876e255e8..4f7d7a8709de 100644
--- a/include/linux/page_ext.h
+++ b/include/linux/page_ext.h
@@ -120,14 +120,18 @@ struct page_ext_iter {
  * page_ext_iter_begin() - Prepare for iterating through page extensions.
  * @iter: page extension iterator.
  * @pfn: PFN of the page we're interested in.
+ * @count: maximum number of page extensions to return.
  *
  * Must be called with RCU read lock taken.
  *
  * Return: NULL if no page_ext exists for this page.
  */
 static inline struct page_ext *page_ext_iter_begin(struct page_ext_iter *iter,
-						unsigned long pfn)
+		unsigned long pfn, unsigned long count)
 {
+	if (count == 0)
+		return NULL;
+
 	iter->index = 0;
 	iter->start_pfn = pfn;
 	iter->page_ext = page_ext_lookup(pfn);
@@ -138,19 +142,22 @@ static inline struct page_ext *page_ext_iter_begin(struct page_ext_iter *iter,
 /**
  * page_ext_iter_next() - Get next page extension
  * @iter: page extension iterator.
+ * @count: maximum number of page extensions to return.
  *
  * Must be called with RCU read lock taken.
  *
  * Return: NULL if no next page_ext exists.
  */
-static inline struct page_ext *page_ext_iter_next(struct page_ext_iter *iter)
+static inline struct page_ext *page_ext_iter_next(struct page_ext_iter *iter,
+		unsigned long count)
 {
 	unsigned long pfn;
 
 	if (WARN_ON_ONCE(!iter->page_ext))
 		return NULL;
 
-	iter->index++;
+	if (iter->index++ >= count)
+		return NULL;
 	pfn = iter->start_pfn + iter->index;
 
 	if (page_ext_iter_next_fast_possible(pfn))
@@ -183,9 +190,9 @@ static inline struct page_ext *page_ext_iter_get(const struct page_ext_iter *ite
  * IMPORTANT: must be called with RCU read lock taken.
  */
 #define for_each_page_ext(__page, __pgcount, __page_ext, __iter) \
-	for (__page_ext = page_ext_iter_begin(&__iter, page_to_pfn(__page));\
-		__page_ext && __iter.index < __pgcount;          \
-		__page_ext = page_ext_iter_next(&__iter))
+	for (__page_ext = page_ext_iter_begin(&__iter, page_to_pfn(__page), __pgcount); \
+		__page_ext; \
+		__page_ext = page_ext_iter_next(&__iter, __pgcount))
 
 #else /* !CONFIG_PAGE_EXTENSION */
 struct page_ext;

---
base-commit: c425609d6ac4012c8bbf01ec2e10e801b1923a7b
change-id: 20260616-page_ext-31ef555456fc

Best regards,
--  
Ketan Kishore <ketan.kishore@oss.qualcomm.com>




* Re: [PATCH v2] mm: page_ext: add count limit to page_ext_iter_next to prevent invalid PFN access
  2026-06-22 14:14 [PATCH v2] mm: page_ext: add count limit to page_ext_iter_next to prevent invalid PFN access Ketan
@ 2026-06-22 14:20 ` David Hildenbrand (Arm)
  2026-06-22 16:14 ` Zi Yan
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-22 14:20 UTC
  To: Ketan, Andrew Morton, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner, Zi Yan,
	Luiz Capitulino
  Cc: kernel, stable, linux-mm, linux-kernel, Matthew Wilcox,
	Lorenzo Stoakes, Liam R. Howlett, Mike Rapoport

On 6/22/26 16:14, Ketan wrote:
> The page_ext iteration API does not validate if the PFN still
> belongs to a valid section while advancing the iterator. When
> dynamically adding memory in the hotplug path, it can lead to a
> NULL pointer dereference during page_ext_lookup at the boundary
> of the last valid section when iterator count equals __pgcount.
> 
> The for_each_page_ext() macro calls page_ext_iter_next() as its
> loop increment. for_each_page_ext() does a
> "__page_ext = page_ext_iter_next(&__iter)" at the end. This
> causes page_ext_iter_next() to increment iter->index past
> __pgcount and call page_ext_lookup(start_pfn + __pgcount).
> During memory hotplug (online), the PFN at start_pfn + __pgcount
> may belong to a section that has not yet been initialized,
> causing page_ext_lookup() to trigger a NULL pointer dereference.
> 
> [   14.555124][  T846] Call trace:
> [   14.555125][  T846]  lookup_page_ext+0x6c/0x108 (P)
> [   14.555127][  T846]  page_ext_lookup+0x30/0x3c
> [   14.555129][  T846]  __reset_page_owner+0x11c/0x260
> [   14.571201][  T846]  __free_pages_ok+0x5e8/0x8e0
> [   14.571204][  T846]  __free_pages_core+0x78/0xf0
> [   14.571206][  T846]  generic_online_page+0x14/0x24
> [   14.597782][  T846]  online_pages+0x178/0x30c
> [   14.597784][  T846]  memory_block_change_state+0x284/0x32c
> [   14.597787][  T846]  memory_subsys_online+0x4c/0x64
> [   14.597789][  T846]  device_online+0x88/0xb0
> [   14.597791][  T846]  online_memory_block+0x30/0x40
> [   14.597793][  T846]  walk_memory_blocks+0xac/0xe8
> [   14.597794][  T846]  add_memory_resource+0x280/0x298
> [   14.656161][  T846]  add_memory+0x60/0x98
> 
> Move the iteration boundary enforcement inside the iterator
> functions, so callers cannot inadvertently access beyond the
> requested range.
> 
> Fixes: 9039b9096ea2 ("mm: page_owner: use new iteration API")
> Cc: stable@vger.kernel.org
> Suggested-by: David Hildenbrand <david@redhat.com>
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Signed-off-by: Ketan Kishore <ketan.kishore@oss.qualcomm.com>
> ---
> Changes in v2:
> - Incorporated comments from David and Matthew to check for invalid PFN
>   in page_ext iterator rather than checking for NULL section in
>   page_ext_lookup.
> - Minor improvement in commit description to include the issue with
>   page_ext_iter_next
> - Link to v1: https://patch.msgid.link/20260617-page_ext-v1-1-37ad802b1a38@oss.qualcomm.com
> 
> To: Andrew Morton <akpm@linux-foundation.org>
> To: David Hildenbrand <david@kernel.org>
> To: Lorenzo Stoakes <ljs@kernel.org>
> To: "Liam R. Howlett" <liam@infradead.org>
> To: Vlastimil Babka <vbabka@kernel.org>
> To: Mike Rapoport <rppt@kernel.org>
> To: Suren Baghdasaryan <surenb@google.com>
> To: Michal Hocko <mhocko@suse.com>
> To: Luiz Capitulino <luizcap@redhat.com>
> Cc: kernel@oss.qualcomm.com
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  include/linux/page_ext.h | 19 +++++++++++++------
>  1 file changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/page_ext.h b/include/linux/page_ext.h
> index 61e876e255e8..4f7d7a8709de 100644
> --- a/include/linux/page_ext.h
> +++ b/include/linux/page_ext.h
> @@ -120,14 +120,18 @@ struct page_ext_iter {
>   * page_ext_iter_begin() - Prepare for iterating through page extensions.
>   * @iter: page extension iterator.
>   * @pfn: PFN of the page we're interested in.
> + * @count: maximum number of page extensions to return.
>   *
>   * Must be called with RCU read lock taken.
>   *
>   * Return: NULL if no page_ext exists for this page.
>   */
>  static inline struct page_ext *page_ext_iter_begin(struct page_ext_iter *iter,
> -						unsigned long pfn)
> +		unsigned long pfn, unsigned long count)
>  {
> +	if (count == 0)

Nit: !count, but we can keep it as is. Thanks!

Acked-by: David Hildenbrand (Arm) <david@kernel.org>

-- 
Cheers,

David



* Re: [PATCH v2] mm: page_ext: add count limit to page_ext_iter_next to prevent invalid PFN access
  2026-06-22 14:14 [PATCH v2] mm: page_ext: add count limit to page_ext_iter_next to prevent invalid PFN access Ketan
  2026-06-22 14:20 ` David Hildenbrand (Arm)
@ 2026-06-22 16:14 ` Zi Yan
  2026-06-22 17:01 ` [syzbot ci] " syzbot ci
  2026-06-22 19:44 ` [PATCH v2] " Zi Yan
  3 siblings, 0 replies; 8+ messages in thread
From: Zi Yan @ 2026-06-22 16:14 UTC
  To: Ketan, Andrew Morton, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner, Luiz Capitulino,
	David Hildenbrand
  Cc: kernel, stable, linux-mm, linux-kernel, Matthew Wilcox,
	Lorenzo Stoakes, Liam R. Howlett, Mike Rapoport

On Mon Jun 22, 2026 at 10:14 AM EDT, Ketan wrote:
> The page_ext iteration API does not validate if the PFN still
> belongs to a valid section while advancing the iterator. When
> dynamically adding memory in the hotplug path, it can lead to a
> NULL pointer dereference during page_ext_lookup at the boundary
> of the last valid section when iterator count equals __pgcount.
>
> The for_each_page_ext() macro calls page_ext_iter_next() as its
> loop increment. for_each_page_ext() does a
> "__page_ext = page_ext_iter_next(&__iter)" at the end. This
> causes page_ext_iter_next() to increment iter->index past
> __pgcount and call page_ext_lookup(start_pfn + __pgcount).
> During memory hotplug (online), the PFN at start_pfn + __pgcount
> may belong to a section that has not yet been initialized,
> causing page_ext_lookup() to trigger a NULL pointer dereference.
>
> [   14.555124][  T846] Call trace:
> [   14.555125][  T846]  lookup_page_ext+0x6c/0x108 (P)
> [   14.555127][  T846]  page_ext_lookup+0x30/0x3c
> [   14.555129][  T846]  __reset_page_owner+0x11c/0x260
> [   14.571201][  T846]  __free_pages_ok+0x5e8/0x8e0
> [   14.571204][  T846]  __free_pages_core+0x78/0xf0
> [   14.571206][  T846]  generic_online_page+0x14/0x24
> [   14.597782][  T846]  online_pages+0x178/0x30c
> [   14.597784][  T846]  memory_block_change_state+0x284/0x32c
> [   14.597787][  T846]  memory_subsys_online+0x4c/0x64
> [   14.597789][  T846]  device_online+0x88/0xb0
> [   14.597791][  T846]  online_memory_block+0x30/0x40
> [   14.597793][  T846]  walk_memory_blocks+0xac/0xe8
> [   14.597794][  T846]  add_memory_resource+0x280/0x298
> [   14.656161][  T846]  add_memory+0x60/0x98
>
> Move the iteration boundary enforcement inside the iterator
> functions, so callers cannot inadvertently access beyond the
> requested range.
>
> Fixes: 9039b9096ea2 ("mm: page_owner: use new iteration API")
> Cc: stable@vger.kernel.org
> Suggested-by: David Hildenbrand <david@redhat.com>
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Signed-off-by: Ketan Kishore <ketan.kishore@oss.qualcomm.com>
> ---
> Changes in v2:
> - Incorporated comments from David and Matthew to check for invalid PFN
>   in page_ext iterator rather than checking for NULL section in
>   page_ext_lookup.
> - Minor improvement in commit description to include the issue with
>   page_ext_iter_next
> - Link to v1: https://patch.msgid.link/20260617-page_ext-v1-1-37ad802b1a38@oss.qualcomm.com
>
> To: Andrew Morton <akpm@linux-foundation.org>
> To: David Hildenbrand <david@kernel.org>
> To: Lorenzo Stoakes <ljs@kernel.org>
> To: "Liam R. Howlett" <liam@infradead.org>
> To: Vlastimil Babka <vbabka@kernel.org>
> To: Mike Rapoport <rppt@kernel.org>
> To: Suren Baghdasaryan <surenb@google.com>
> To: Michal Hocko <mhocko@suse.com>
> To: Luiz Capitulino <luizcap@redhat.com>
> Cc: kernel@oss.qualcomm.com
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  include/linux/page_ext.h | 19 +++++++++++++------
>  1 file changed, 13 insertions(+), 6 deletions(-)
>
LGTM. Thanks.
Acked-by: Zi Yan <ziy@nvidia.com>

-- 
Best Regards,
Yan, Zi




* [syzbot ci] Re: mm: page_ext: add count limit to page_ext_iter_next to prevent invalid PFN access
  2026-06-22 14:14 [PATCH v2] mm: page_ext: add count limit to page_ext_iter_next to prevent invalid PFN access Ketan
  2026-06-22 14:20 ` David Hildenbrand (Arm)
  2026-06-22 16:14 ` Zi Yan
@ 2026-06-22 17:01 ` syzbot ci
  2026-06-22 19:36   ` [syzbot ci] " Zi Yan
  2026-06-22 19:44 ` [PATCH v2] " Zi Yan
  3 siblings, 1 reply; 8+ messages in thread
From: syzbot ci @ 2026-06-22 17:01 UTC
  To: akpm, david, hannes, jackmanb, kernel, ketan.kishore, liam,
	linux-kernel, linux-mm, ljs, luizcap, mhocko, rppt, stable,
	surenb, vbabka, willy, ziy
  Cc: syzbot, syzkaller-bugs

syzbot ci has tested the following series

[v2] mm: page_ext: add count limit to page_ext_iter_next to prevent invalid PFN access
https://lore.kernel.org/all/20260622-page_ext-v2-1-135d4cfbc42f@oss.qualcomm.com
* [PATCH v2] mm: page_ext: add count limit to page_ext_iter_next to prevent invalid PFN access

and found the following issue:
WARNING in depot_fetch_stack

Full report is available here:
https://ci.syzbot.org/series/092dd7dc-cb78-46b6-8703-6044fff2631d

***

WARNING in depot_fetch_stack

tree:      mm-new
URL:       https://kernel.googlesource.com/pub/scm/linux/kernel/git/akpm/mm.git
base:      e1201ff76176ef666b13d1a4ec6b6190ddc6abc8
arch:      amd64
compiler:  Debian clang version 22.1.6 (++20260514074242+fc4aad7b5db3-1~exp1~20260514074407.73), Debian LLD 22.1.6
config:    https://ci.syzbot.org/builds/18f461a2-7098-44bc-9d42-634b56ba48d9/config

------------[ cut here ]------------
!refcount_read(&stack->count)
WARNING: lib/stackdepot.c:517 at depot_fetch_stack+0x91/0xa0, CPU#0: kworker/u9:4/1114
Modules linked in:
CPU: 0 UID: 0 PID: 1114 Comm: kworker/u9:4 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Workqueue: events_unbound call_usermodehelper_exec_work
RIP: 0010:depot_fetch_stack+0x91/0xa0
Code: 39 f5 72 d0 48 8d 3d 7e 1b 4d 0b 89 ee 44 89 f2 89 d9 67 48 0f b9 3a 31 c0 5b 41 5e 5d e9 87 67 b8 06 cc 90 0f 0b 90 eb ee 90 <0f> 0b 90 eb e8 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90
RSP: 0000:ffffc900079a6ce0 EFLAGS: 00010246
RAX: ffff888168b94000 RBX: 0000000000000ce0 RCX: 0000000000000067
RDX: 0000000000000000 RSI: ffffffff8e215937 RDI: ffffffff8c28ab20
RBP: 0000000000000067 R08: ffff88810495a407 R09: 1ffff1102092b480
R10: dffffc0000000000 R11: ffffed102092b481 R12: 00000000019c0068
R13: 0000000000000001 R14: 000000000000010f R15: ffff88810afb1dc0
FS:  0000000000000000(0000) GS:ffff88818dcb5000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff88823ffff000 CR3: 000000000e74a000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 __set_page_owner+0x140/0x4c0
 post_alloc_hook+0x1f9/0x250
 get_page_from_freelist+0x21fa/0x2270
 __alloc_frozen_pages_noprof+0x18d/0x380
 alloc_pages_mpol+0x212/0x380
 alloc_pages_noprof+0xac/0x2a0
 get_free_pages_noprof+0xf/0x80
 __kasan_populate_vmalloc+0x38/0x1c0
 alloc_vmap_area+0xd1a/0x1420
 __get_vm_area_node+0x1f2/0x300
 __vmalloc_node_range_noprof+0x358/0x1730
 __vmalloc_node_noprof+0xc2/0x100
 dup_task_struct+0x28e/0x830
 copy_process+0x79d/0x4380
 kernel_clone+0x2d7/0x940
 user_mode_thread+0x110/0x180
 call_usermodehelper_exec_work+0x5c/0x230
 process_scheduled_works+0xa8e/0x14e0
 worker_thread+0xa47/0xfb0
 kthread+0x389/0x470
 ret_from_fork+0x514/0xb70
 ret_from_fork_asm+0x1a/0x30
 </TASK>


***

If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
  Tested-by: syzbot@syzkaller.appspotmail.com

---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.

To test a patch for this bug, please reply with `#syz test`
(should be on a separate line).

The patch should be attached to the email.
Note: arguments like custom git repos and branches are not supported.



* Re: [syzbot ci] mm: page_ext: add count limit to page_ext_iter_next to prevent invalid PFN access
  2026-06-22 17:01 ` [syzbot ci] " syzbot ci
@ 2026-06-22 19:36   ` Zi Yan
  2026-06-22 19:45     ` Ketan Kishore
  2026-06-22 20:24     ` [syzbot ci] " syzbot ci
  0 siblings, 2 replies; 8+ messages in thread
From: Zi Yan @ 2026-06-22 19:36 UTC
  To: syzbot ci
  Cc: akpm, david, hannes, jackmanb, kernel, ketan.kishore, liam,
	linux-kernel, linux-mm, ljs, luizcap, mhocko, rppt, stable,
	surenb, vbabka, willy, syzbot, syzkaller-bugs

[-- Attachment #1: Type: text/plain, Size: 3650 bytes --]

On 22 Jun 2026, at 13:01, syzbot ci wrote:

> syzbot ci has tested the following series
>
> [v2] mm: page_ext: add count limit to page_ext_iter_next to prevent invalid PFN access
> https://lore.kernel.org/all/20260622-page_ext-v2-1-135d4cfbc42f@oss.qualcomm.com
> * [PATCH v2] mm: page_ext: add count limit to page_ext_iter_next to prevent invalid PFN access
>
> and found the following issue:
> WARNING in depot_fetch_stack
>
> Full report is available here:
> https://ci.syzbot.org/series/092dd7dc-cb78-46b6-8703-6044fff2631d
>
> ***
>
> WARNING in depot_fetch_stack
>
> tree:      mm-new
> URL:       https://kernel.googlesource.com/pub/scm/linux/kernel/git/akpm/mm.git
> base:      e1201ff76176ef666b13d1a4ec6b6190ddc6abc8
> arch:      amd64
> compiler:  Debian clang version 22.1.6 (++20260514074242+fc4aad7b5db3-1~exp1~20260514074407.73), Debian LLD 22.1.6
> config:    https://ci.syzbot.org/builds/18f461a2-7098-44bc-9d42-634b56ba48d9/config
>
> ------------[ cut here ]------------
> !refcount_read(&stack->count)
> WARNING: lib/stackdepot.c:517 at depot_fetch_stack+0x91/0xa0, CPU#0: kworker/u9:4/1114
> Modules linked in:
> CPU: 0 UID: 0 PID: 1114 Comm: kworker/u9:4 Not tainted syzkaller #0 PREEMPT(full)
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> Workqueue: events_unbound call_usermodehelper_exec_work
> RIP: 0010:depot_fetch_stack+0x91/0xa0
> Code: 39 f5 72 d0 48 8d 3d 7e 1b 4d 0b 89 ee 44 89 f2 89 d9 67 48 0f b9 3a 31 c0 5b 41 5e 5d e9 87 67 b8 06 cc 90 0f 0b 90 eb ee 90 <0f> 0b 90 eb e8 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90
> RSP: 0000:ffffc900079a6ce0 EFLAGS: 00010246
> RAX: ffff888168b94000 RBX: 0000000000000ce0 RCX: 0000000000000067
> RDX: 0000000000000000 RSI: ffffffff8e215937 RDI: ffffffff8c28ab20
> RBP: 0000000000000067 R08: ffff88810495a407 R09: 1ffff1102092b480
> R10: dffffc0000000000 R11: ffffed102092b481 R12: 00000000019c0068
> R13: 0000000000000001 R14: 000000000000010f R15: ffff88810afb1dc0
> FS:  0000000000000000(0000) GS:ffff88818dcb5000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffff88823ffff000 CR3: 000000000e74a000 CR4: 00000000000006f0
> Call Trace:
>  <TASK>
>  __set_page_owner+0x140/0x4c0
>  post_alloc_hook+0x1f9/0x250
>  get_page_from_freelist+0x21fa/0x2270
>  __alloc_frozen_pages_noprof+0x18d/0x380
>  alloc_pages_mpol+0x212/0x380
>  alloc_pages_noprof+0xac/0x2a0
>  get_free_pages_noprof+0xf/0x80
>  __kasan_populate_vmalloc+0x38/0x1c0
>  alloc_vmap_area+0xd1a/0x1420
>  __get_vm_area_node+0x1f2/0x300
>  __vmalloc_node_range_noprof+0x358/0x1730
>  __vmalloc_node_noprof+0xc2/0x100
>  dup_task_struct+0x28e/0x830
>  copy_process+0x79d/0x4380
>  kernel_clone+0x2d7/0x940
>  user_mode_thread+0x110/0x180
>  call_usermodehelper_exec_work+0x5c/0x230
>  process_scheduled_works+0xa8e/0x14e0
>  worker_thread+0xa47/0xfb0
>  kthread+0x389/0x470
>  ret_from_fork+0x514/0xb70
>  ret_from_fork_asm+0x1a/0x30
>  </TASK>
>
>
> ***
>
> If these findings have caused you to resend the series or submit a
> separate fix, please add the following tag to your commit message:
>   Tested-by: syzbot@syzkaller.appspotmail.com
>
> ---
> This report is generated by a bot. It may contain errors.
> syzbot ci engineers can be reached at syzkaller@googlegroups.com.
>
> To test a patch for this bug, please reply with `#syz test`
> (should be on a separate line).
>
> The patch should be attached to the email.
> Note: arguments like custom git repos and branches are not supported.

#syz test

Best Regards,
Yan, Zi

[-- Attachment #2: page_ext_fixup.patch --]
[-- Type: text/plain, Size: 466 bytes --]

diff --git a/include/linux/page_ext.h b/include/linux/page_ext.h
index 4f7d7a8709de..f797bc898b8b 100644
--- a/include/linux/page_ext.h
+++ b/include/linux/page_ext.h
@@ -156,7 +156,8 @@ static inline struct page_ext *page_ext_iter_next(struct page_ext_iter *iter,
 	if (WARN_ON_ONCE(!iter->page_ext))
 		return NULL;
 
-	if (iter->index++ >= count)
+	iter->index++;
+	if (iter->index >= count)
 		return NULL;
 	pfn = iter->start_pfn + iter->index;
 


* Re: [PATCH v2] mm: page_ext: add count limit to page_ext_iter_next to prevent invalid PFN access
  2026-06-22 14:14 [PATCH v2] mm: page_ext: add count limit to page_ext_iter_next to prevent invalid PFN access Ketan
                   ` (2 preceding siblings ...)
  2026-06-22 17:01 ` [syzbot ci] " syzbot ci
@ 2026-06-22 19:44 ` Zi Yan
  3 siblings, 0 replies; 8+ messages in thread
From: Zi Yan @ 2026-06-22 19:44 UTC
  To: Ketan, Andrew Morton, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner, Luiz Capitulino,
	David Hildenbrand
  Cc: kernel, stable, linux-mm, linux-kernel, Matthew Wilcox,
	Lorenzo Stoakes, Liam R. Howlett, Mike Rapoport

On Mon Jun 22, 2026 at 10:14 AM EDT, Ketan wrote:
> The page_ext iteration API does not validate if the PFN still
> belongs to a valid section while advancing the iterator. When
> dynamically adding memory in the hotplug path, it can lead to a
> NULL pointer dereference during page_ext_lookup at the boundary
> of the last valid section when iterator count equals __pgcount.
>
> The for_each_page_ext() macro calls page_ext_iter_next() as its
> loop increment. for_each_page_ext() does a
> "__page_ext = page_ext_iter_next(&__iter)" at the end. This
> causes page_ext_iter_next() to increment iter->index past
> __pgcount and call page_ext_lookup(start_pfn + __pgcount).
> During memory hotplug (online), the PFN at start_pfn + __pgcount
> may belong to a section that has not yet been initialized,
> causing page_ext_lookup() to trigger a NULL pointer dereference.
>
> [   14.555124][  T846] Call trace:
> [   14.555125][  T846]  lookup_page_ext+0x6c/0x108 (P)
> [   14.555127][  T846]  page_ext_lookup+0x30/0x3c
> [   14.555129][  T846]  __reset_page_owner+0x11c/0x260
> [   14.571201][  T846]  __free_pages_ok+0x5e8/0x8e0
> [   14.571204][  T846]  __free_pages_core+0x78/0xf0
> [   14.571206][  T846]  generic_online_page+0x14/0x24
> [   14.597782][  T846]  online_pages+0x178/0x30c
> [   14.597784][  T846]  memory_block_change_state+0x284/0x32c
> [   14.597787][  T846]  memory_subsys_online+0x4c/0x64
> [   14.597789][  T846]  device_online+0x88/0xb0
> [   14.597791][  T846]  online_memory_block+0x30/0x40
> [   14.597793][  T846]  walk_memory_blocks+0xac/0xe8
> [   14.597794][  T846]  add_memory_resource+0x280/0x298
> [   14.656161][  T846]  add_memory+0x60/0x98
>
> Move the iteration boundary enforcement inside the iterator
> functions, so callers cannot inadvertently access beyond the
> requested range.
>
> Fixes: 9039b9096ea2 ("mm: page_owner: use new iteration API")
> Cc: stable@vger.kernel.org
> Suggested-by: David Hildenbrand <david@redhat.com>
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Signed-off-by: Ketan Kishore <ketan.kishore@oss.qualcomm.com>
> ---
> Changes in v2:
> - Incorporated comments from David and Matthew to check for invalid PFN
>   in page_ext iterator rather than checking for NULL section in
>   page_ext_lookup.
> - Minor improvement in commit description to include the issue with
>   page_ext_iter_next
> - Link to v1: https://patch.msgid.link/20260617-page_ext-v1-1-37ad802b1a38@oss.qualcomm.com
>
> To: Andrew Morton <akpm@linux-foundation.org>
> To: David Hildenbrand <david@kernel.org>
> To: Lorenzo Stoakes <ljs@kernel.org>
> To: "Liam R. Howlett" <liam@infradead.org>
> To: Vlastimil Babka <vbabka@kernel.org>
> To: Mike Rapoport <rppt@kernel.org>
> To: Suren Baghdasaryan <surenb@google.com>
> To: Michal Hocko <mhocko@suse.com>
> To: Luiz Capitulino <luizcap@redhat.com>
> Cc: kernel@oss.qualcomm.com
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  include/linux/page_ext.h | 19 +++++++++++++------
>  1 file changed, 13 insertions(+), 6 deletions(-)
>

<snip>

> @@ -138,19 +142,22 @@ static inline struct page_ext *page_ext_iter_begin(struct page_ext_iter *iter,
>  /**
>   * page_ext_iter_next() - Get next page extension
>   * @iter: page extension iterator.
> + * @count: maximum number of page extensions to return.
>   *
>   * Must be called with RCU read lock taken.
>   *
>   * Return: NULL if no next page_ext exists.
>   */
> -static inline struct page_ext *page_ext_iter_next(struct page_ext_iter *iter)
> +static inline struct page_ext *page_ext_iter_next(struct page_ext_iter *iter,
> +		unsigned long count)
>  {
>  	unsigned long pfn;
>  
>  	if (WARN_ON_ONCE(!iter->page_ext))
>  		return NULL;
>  
> -	iter->index++;
> +	if (iter->index++ >= count)

iter->index++ yields the value from before the increment, so that old
value is what gets compared to count and the check fires one step too
late.

Either

if (++iter->index >= count)

or

iter->index++;
if (iter->index >= count)

works.

I tried the latter locally and it fixed the issue reported by syzbot[1].

[1] https://lore.kernel.org/all/6a396a5a.ac26f6c2.9a9c4.0000.GAE@google.com/
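
To make the off-by-one concrete, here is a small userspace sketch
(hypothetical helper names, not the kernel implementation) of a bounded
"next" step written both ways. It assumes, as in v2, that the loop
itself no longer compares the index against the count, so the iterator
is the only place the bound is enforced.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of struct page_ext_iter: only the index matters. */
struct mini_iter {
	unsigned long index;
};

/* v2 form: compares the index from before the increment. */
static bool next_post_increment(struct mini_iter *iter, unsigned long count)
{
	if (iter->index++ >= count)
		return false;
	return true;
}

/* Suggested form: compares the index after the increment. */
static bool next_pre_increment(struct mini_iter *iter, unsigned long count)
{
	if (++iter->index >= count)
		return false;
	return true;
}

/* Count how many entries a for_each-style loop hands to its body. */
static unsigned long entries_visited(bool (*next)(struct mini_iter *,
						  unsigned long),
				     unsigned long count)
{
	struct mini_iter iter = { .index = 0 };
	unsigned long visited = 0;
	bool valid = count != 0;	/* mirrors the count check in begin */

	while (valid) {
		visited++;
		valid = next(&iter, count);
	}
	return visited;
}

/* A walk over 4 entries: one extra entry with the post-increment form. */
static void mini_check(void)
{
	assert(entries_visited(next_post_increment, 4) == 5);
	assert(entries_visited(next_pre_increment, 4) == 4);
}
```

In this model the post-increment form hands the loop body count + 1
entries, i.e. one entry beyond the requested range, while the
pre-increment form stops after exactly count.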

> +		return NULL;
>  	pfn = iter->start_pfn + iter->index;

-- 
Best Regards,
Yan, Zi




* Re: [syzbot ci] mm: page_ext: add count limit to page_ext_iter_next to prevent invalid PFN access
  2026-06-22 19:36   ` [syzbot ci] " Zi Yan
@ 2026-06-22 19:45     ` Ketan Kishore
  2026-06-22 20:24     ` [syzbot ci] " syzbot ci
  1 sibling, 0 replies; 8+ messages in thread
From: Ketan Kishore @ 2026-06-22 19:45 UTC
  To: Zi Yan
  Cc: syzbot ci, akpm, david, hannes, jackmanb, kernel, liam,
	linux-kernel, linux-mm, ljs, luizcap, mhocko, rppt, stable,
	surenb, vbabka, willy, syzbot, syzkaller-bugs

Thank you, Yan.
I missed the post-increment problem in v2.
I am posting a v3 with pre-increment:

- if (iter->index++ >= count)
+ if (++iter->index >= count)

I think that is in line with the fix you suggested.

Thank you
Ketan




* [syzbot ci] Re: mm: page_ext: add count limit to page_ext_iter_next to prevent invalid PFN access
  2026-06-22 19:36   ` [syzbot ci] " Zi Yan
  2026-06-22 19:45     ` Ketan Kishore
@ 2026-06-22 20:24     ` syzbot ci
  1 sibling, 0 replies; 8+ messages in thread
From: syzbot ci @ 2026-06-22 20:24 UTC
  To: ziy, akpm, david, hannes, jackmanb, kernel, ketan.kishore, liam,
	linux-kernel, linux-mm, ljs, luizcap, mhocko, rppt, stable,
	surenb, syzbot, syzkaller-bugs, vbabka, willy
  Cc: syzbot, syzkaller-bugs

syzbot ci has tested the suggested fix patch on top of the following series:

[v2] mm: page_ext: add count limit to page_ext_iter_next to prevent invalid PFN access
https://lore.kernel.org/all/20260622-page_ext-v2-1-135d4cfbc42f@oss.qualcomm.com

Patch: https://ci.syzbot.org/jobs/c1b2fffb-acae-4bd1-b00c-622d37f59120/patch
Testing results:
* [build 0] Build Patched: passed
* [build 0] Boot test: Patched: passed

Full report is available here:
https://ci.syzbot.org/session/e38844bc-7d59-4cf9-ac06-b987a2745fb3

---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.


