[RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU

Linux-mm Archive on lore.kernel.org

* [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU
@ 2026-06-10 12:05 zhaoyang.huang
  2026-06-10 12:50 ` David Hildenbrand (Arm)
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: zhaoyang.huang @ 2026-06-10 12:05 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Zi Yan, Lorenzo Stoakes,
	Barry Song, Baolin Wang, Lance Yang, Liam R . Howlett, Nico Pache,
	Ryan Roberts, Dev Jain, linux-mm, linux-kernel, Zhaoyang Huang,
	steve.kang

From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>

Kernel panics keep being reported, especially when the f2fs partition is
almost full. Investigation shows that an f2fs page was freed to the buddy
allocator without being deleted from the LRU. The root cause is the race
shown in [2], which was introduced by f2fs commit 9609dd704725 ("f2fs:
remove non-uptodate folio from the page cache in move_data_block").
Reverting that commit resolves the issue.

Three racing paths are involved in this scenario; their main activities
are described below. However, after further investigation of the code, I
think there is a common race window for truncated folios between
split_folio_to_order() and folio_isolate_lru(): the folios have lost
their page-cache references and keep only the transient reference of the
split caller, so a folio can enter the free path and race with the
isolation path. This patch proposes keeping folios beyond EOF off the
LRU.

Truncate:
The changed code in move_data_block() lets the GC path evict the tail-end
folio from the page cache through folio_end_dropbehind().  Once
folio_unmap_invalidate() removes the folio from mapping->i_pages, the
page-cache references for all pages in the folio are dropped.  The folio
is then kept alive only by temporary external references, which allows a
later split to operate on a folio whose subpages are no longer protected
by page-cache references.

Split:
After the page-cache references are gone, split_folio_to_order() can
split the big folio into individual pages and put the resulting subpages
back on the LRU.  For tail pages beyond EOF, split removes them from the
page cache and drops their page-cache references.  A tail page can then
remain on the LRU with PG_lru set while holding only the split caller's
temporary reference.  When free_folio_and_swap_cache() drops that final
reference, the page enters the final folio_put() release path.

Isolate:
In parallel, folio_isolate_lru() can observe the same tail page with a
non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
reference.  If this races with the final folio_put() from the split path,
__folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
The page is then freed back to the allocator while its lru links are
still present in the LRU list.  A later LRU operation on a neighboring
page detects the stale link and reports list corruption.

[1]
[   22.486082] list_del corruption. next->prev should be fffffffec10e0ac8, but was dead000000000122. (next=fffffffec10e0a88)
[   22.486130] ------------[ cut here ]------------
[   22.486134] kernel BUG at lib/list_debug.c:67!
[   22.486141] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
[   22.488502] Tainted: [W]=WARN, [O]=OOT_MODULE
[   22.488506] Hardware name: Spreadtrum UMS9230 1H10 SoC (DT)
[   22.488511] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   22.488517] pc : __list_del_entry_valid_or_report+0x14c/0x154
[   22.488531] lr : __list_del_entry_valid_or_report+0x14c/0x154
[   22.488539] sp : ffffffc08006b830
[   22.488542] x29: ffffffc08006b868 x28: 0000000000003020 x27: 0000000000000000
[   22.488553] x26: 0000000000000000 x25: 0000000000000004 x24: fffffffec10e0ac0
[   22.488564] x23: 00000000000000e8 x22: 0000000000000024 x21: dead000000000122
[   22.488574] x20: fffffffec10e0a88 x19: fffffffec10e0ac8 x18: ffffffc080061060
[   22.488585] x17: 20747562202c3863 x16: 6130653031636566 x15: 0000000000000058
[   22.488595] x14: 0000000000000004 x13: ffffff80f91e0000 x12: 0000000000000003
[   22.488605] x11: 0000000000000003 x10: 0000000000000001 x9 : ffe85721f0e25f00
[   22.488615] x8 : ffe85721f0e25f00 x7 : 0000000000000000 x6 : 6c65645f7473696c
[   22.488625] x5 : ffffffed39b23026 x4 : 0000000000000000 x3 : 0000000000000010
[   22.488636] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000006d
[   22.488647] Call trace:
[   22.488651]  __list_del_entry_valid_or_report+0x14c/0x154 (P)
[   22.488661]  __folio_put+0x2bc/0x434
[   22.488670]  folio_put+0x28/0x58
[   22.488678]  do_garbage_collect+0x1a34/0x2584
[   22.488689]  f2fs_gc+0x230/0x9b4
[   22.488697]  f2fs_fallocate+0xb90/0xdf4
[   22.488706]  vfs_fallocate+0x1b4/0x2bc
[   22.488716]  __arm64_sys_fallocate+0x44/0x78
[   22.488725]  invoke_syscall+0x58/0xe4
[   22.488732]  do_el0_svc+0x48/0xdc
[   22.488739]  el0_svc+0x3c/0x98
[   22.488747]  el0t_64_sync_handler+0x20/0x130
[   22.488754]  el0t_64_sync+0x1c4/0x1c8

[2]
CPU0 (f2fs GC)              CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)

F: pagecache refs = n
F: extra refs = GC + split
F: PG_lru set
move_data_block()
folio = f2fs_grab_cache_folio(F)
...
__folio_set_dropbehind(F)
folio_unlock(F)
folio_end_dropbehind(F)
  folio_unmap_invalidate(F)
    __filemap_remove_folio(F)
    folio_put_refs(F, n)
folio_put(F)
                            split_folio_to_order(F)
                              folio_ref_freeze(F, 1)
                              ...
                              lru_add_split_folio(T)
                                list_add_tail(&T->lru, &F->lru)
                                folio_set_lru(T)
                              __filemap_remove_folio(T)
                              folio_put_refs(T, 1)
                              /* T refcount == 1, PageLRU set */
                            free_folio_and_swap_cache(T)
                              folio_put(T)
                                /* refcount: 1 -> 0 */
                                                                  folio_isolate_lru(T)
                                                                    folio_test_clear_lru(T)
                                __folio_put(T)
                                  __page_cache_release(T)
                                    folio_test_lru(T) == false
                                    /* skip lruvec_del_folio(T) */
                                  free_frozen_pages(T)
                                                                  folio_get(T)
                                                                  lruvec_del_folio(T)
later:
  list_del(adjacent->lru)
    next == &T->lru
    next->prev == LIST_POISON / PCP freelist
    BUG
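
The interleaving drawn in [2] can be replayed in a small userspace model.
This is only a sketch of the sequence as described above, not kernel code:
toy_list, toy_folio and every toy_* helper are hypothetical stand-ins, and
the three CPUs are replayed sequentially in the order the diagram shows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy circular doubly linked list, standing in for the kernel's list_head. */
struct toy_list {
	struct toy_list *prev, *next;
};

static void toy_list_init(struct toy_list *head)
{
	head->prev = head->next = head;
}

static void toy_list_add_tail(struct toy_list *node, struct toy_list *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

static void toy_list_del(struct toy_list *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->prev = node->next = NULL;
}

/* Hypothetical stand-in for the split tail folio T; not the kernel struct. */
struct toy_folio {
	struct toy_list lru;
	int refcount;
	bool pg_lru;
	bool freed;
};

/* CPU1: drop the last reference; returns true when the count hits zero. */
static bool toy_put_ref(struct toy_folio *f)
{
	return --f->refcount == 0;
}

/* CPU2: first step of isolation, test-and-clear the LRU flag. */
static bool toy_test_clear_lru(struct toy_folio *f)
{
	bool was_set = f->pg_lru;

	f->pg_lru = false;
	return was_set;
}

/* CPU1: release path, unlinking only if the LRU flag is still set. */
static void toy_release(struct toy_folio *f)
{
	if (f->pg_lru) {
		toy_list_del(&f->lru);
		f->pg_lru = false;
	}
	f->freed = true;
}

/*
 * Replay the order drawn in [2]. Returns true when T ends up marked freed
 * while the LRU head still points at it.
 */
static bool toy_replay_race(void)
{
	struct toy_list lru_head;
	struct toy_folio t = { .refcount = 1, .pg_lru = true, .freed = false };
	bool last, isolated;

	toy_list_init(&lru_head);
	toy_list_add_tail(&t.lru, &lru_head);

	last = toy_put_ref(&t);            /* CPU1: refcount 1 -> 0 */
	isolated = toy_test_clear_lru(&t); /* CPU2: clears PG_lru */
	if (last)
		toy_release(&t);           /* CPU1: flag clear, unlink skipped */

	return isolated && t.freed && lru_head.next == &t.lru;
}
```

The model stops at the point where T is freed with its list linkage intact;
it does not model what CPU2's later lruvec_del_folio(T) does to a page that
has already gone back to the allocator.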

Assisted-by: Cursor:claude-opus-4-8
Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
---
 mm/huge_memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 970e077019b7..7465525a94a8 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3966,7 +3966,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
 			folio_ref_unfreeze(new_folio,
 					   folio_cache_ref_count(new_folio) + 1);

-			if (do_lru)
+			if (do_lru && !(mapping && new_folio->index >= end))
 				lru_add_split_folio(folio, new_folio, lruvec, list);

 			/*
-- 
2.25.1


* Re: [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU
  2026-06-10 12:05 [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU zhaoyang.huang
@ 2026-06-10 12:50 ` David Hildenbrand (Arm)
  2026-06-10 14:38   ` Zi Yan
  2026-06-10 20:30 ` Andrew Morton
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-10 12:50 UTC (permalink / raw)
  To: zhaoyang.huang, Andrew Morton, Zi Yan, Lorenzo Stoakes,
	Barry Song, Baolin Wang, Lance Yang, Liam R . Howlett, Nico Pache,
	Ryan Roberts, Dev Jain, linux-mm, linux-kernel, Zhaoyang Huang,
	steve.kang

On 6/10/26 14:05, zhaoyang.huang wrote:
> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> 
> Kernel panics keep being reported, especially when the f2fs partition is
> almost full. Investigation shows that an f2fs page was freed to the buddy
> allocator without being deleted from the LRU. The root cause is the race
> shown in [2], which was introduced by f2fs commit 9609dd704725 ("f2fs:
> remove non-uptodate folio from the page cache in move_data_block").
> Reverting that commit resolves the issue.

But I assume, that other FSes can trigger this as well? Any insights?

> 
> Three racing paths are involved in this scenario; their main activities
> are described below. However, after further investigation of the code, I
> think there is a common race window for truncated folios between
> split_folio_to_order() and folio_isolate_lru(): the folios have lost
> their page-cache references and keep only the transient reference of the
> split caller, so a folio can enter the free path and race with the
> isolation path. This patch proposes keeping folios beyond EOF off the
> LRU.
> 
> Truncate:
> The changed code in move_data_block() lets the GC path evict the tail-end
> folio from the page cache through folio_end_dropbehind().  Once
> folio_unmap_invalidate() removes the folio from mapping->i_pages, the
> page-cache references for all pages in the folio are dropped.  The folio
> is then kept alive only by temporary external references, which allows a
> later split to operate on a folio whose subpages are no longer protected
> by page-cache references.
> 
> Split:
> After the page-cache references are gone, split_folio_to_order() can
> split the big folio into individual pages and put the resulting subpages
> back on the LRU.  For tail pages beyond EOF, split removes them from the
> page cache and drops their page-cache references.  A tail page can then
> remain on the LRU with PG_lru set while holding only the split caller's
> temporary reference.  When free_folio_and_swap_cache() drops that final
> reference, the page enters the final folio_put() release path.
> 
> Isolate:
> In parallel, folio_isolate_lru() can observe the same tail page with a
> non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
> reference.  If this races with the final folio_put() from the split path,
> __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
> The page is then freed back to the allocator while its lru links are
> still present in the LRU list.  A later LRU operation on a neighboring
> page detects the stale link and reports list corruption.

Complicated mess :(

So, folio_isolate_lru() really only requires the caller to hold a folio
reference, which can happen given that we did the folio_ref_unfreeze(). It can,
for example, be triggered by memory offlining or page migration.

So we really want to not allow folio_isolate_lru() while we are still processing
the folio.

What your patch does is, simply not add folios that we will drop from the page
cache to the LRU?


You should describe here how you are fixing it: "Let's fix it by..."

> 
> [1]
> [   22.486082] list_del corruption. next->prev should be fffffffec10e0ac8, but was dead000000000122. (next=fffffffec10e0a88)
> [   22.486130] ------------[ cut here ]------------
> [   22.486134] kernel BUG at lib/list_debug.c:67!
> [   22.486141] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
> [   22.488502] Tainted: [W]=WARN, [O]=OOT_MODULE
> [   22.488506] Hardware name: Spreadtrum UMS9230 1H10 SoC (DT)
> [   22.488511] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [   22.488517] pc : __list_del_entry_valid_or_report+0x14c/0x154
> [   22.488531] lr : __list_del_entry_valid_or_report+0x14c/0x154
> [   22.488539] sp : ffffffc08006b830
> [   22.488542] x29: ffffffc08006b868 x28: 0000000000003020 x27: 0000000000000000
> [   22.488553] x26: 0000000000000000 x25: 0000000000000004 x24: fffffffec10e0ac0
> [   22.488564] x23: 00000000000000e8 x22: 0000000000000024 x21: dead000000000122
> [   22.488574] x20: fffffffec10e0a88 x19: fffffffec10e0ac8 x18: ffffffc080061060
> [   22.488585] x17: 20747562202c3863 x16: 6130653031636566 x15: 0000000000000058
> [   22.488595] x14: 0000000000000004 x13: ffffff80f91e0000 x12: 0000000000000003
> [   22.488605] x11: 0000000000000003 x10: 0000000000000001 x9 : ffe85721f0e25f00
> [   22.488615] x8 : ffe85721f0e25f00 x7 : 0000000000000000 x6 : 6c65645f7473696c
> [   22.488625] x5 : ffffffed39b23026 x4 : 0000000000000000 x3 : 0000000000000010
> [   22.488636] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000006d
> [   22.488647] Call trace:
> [   22.488651]  __list_del_entry_valid_or_report+0x14c/0x154 (P)
> [   22.488661]  __folio_put+0x2bc/0x434
> [   22.488670]  folio_put+0x28/0x58
> [   22.488678]  do_garbage_collect+0x1a34/0x2584
> [   22.488689]  f2fs_gc+0x230/0x9b4
> [   22.488697]  f2fs_fallocate+0xb90/0xdf4
> [   22.488706]  vfs_fallocate+0x1b4/0x2bc
> [   22.488716]  __arm64_sys_fallocate+0x44/0x78
> [   22.488725]  invoke_syscall+0x58/0xe4
> [   22.488732]  do_el0_svc+0x48/0xdc
> [   22.488739]  el0_svc+0x3c/0x98
> [   22.488747]  el0t_64_sync_handler+0x20/0x130
> [   22.488754]  el0t_64_sync+0x1c4/0x1c8
> 
> [2]
> CPU0 (f2fs GC)              CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)
> 
> F: pagecache refs = n
> F: extra refs = GC + split
> F: PG_lru set
> move_data_block()
> folio = f2fs_grab_cache_folio(F)
> ...
> __folio_set_dropbehind(F)
> folio_unlock(F)
> folio_end_dropbehind(F)
>   folio_unmap_invalidate(F)
>     __filemap_remove_folio(F)
>     folio_put_refs(F, n)
> folio_put(F)
>                             split_folio_to_order(F)
>                               folio_ref_freeze(F, 1)
>                               ...
>                               lru_add_split_folio(T)
>                                 list_add_tail(&T->lru, &F->lru)
>                                 folio_set_lru(T)
>                               __filemap_remove_folio(T)
>                               folio_put_refs(T, 1)
>                               /* T refcount == 1, PageLRU set */
>                             free_folio_and_swap_cache(T)
>                               folio_put(T)
>                                 /* refcount: 1 -> 0 */
>                                                                   folio_isolate_lru(T)
>                                                                     folio_test_clear_lru(T)
>                                 __folio_put(T)
>                                   __page_cache_release(T)
>                                     folio_test_lru(T) == false
>                                     /* skip lruvec_del_folio(T) */
>                                   free_frozen_pages(T)
>                                                                   folio_get(T)
>                                                                   lruvec_del_folio(T)
> later:
>   list_del(adjacent->lru)
>     next == &T->lru
>     next->prev == LIST_POISON / PCP freelist
>     BUG
> 
> Assisted-by: Cursor:claude-opus-4-8
> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>

I'm wondering if this has been broken the whole time, or if some rework allowed
this to trigger.

I assume the issue can be triggered for other FSes, and we want Fixes: + CC: stable?

Looking into the history, I think we always unconditionally did the
lru_add_split_folio()/lru_add_page_tail().

> ---
>  mm/huge_memory.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 970e077019b7..7465525a94a8 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3966,7 +3966,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
>  			folio_ref_unfreeze(new_folio,
>  					   folio_cache_ref_count(new_folio) + 1);
>  
> -			if (do_lru)
> +			if (do_lru && !(mapping && new_folio->index >= end))

It might be clearer to write this as

	do_lru && (!mapping || new_folio->index < end)

To match the page-cache check further below

	if (!mapping)
		continue

	...
	if (new_folio->index < end)
		...
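
As a quick check that the two spellings of the condition select the same
folios, here is a minimal userspace sketch. The function names and the
has_mapping/index/end parameters are hypothetical stand-ins for the values
used in the hunk above; this is not the kernel code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition as written in the patch. */
static bool add_to_lru_patch(bool do_lru, bool has_mapping,
			     unsigned long index, unsigned long end)
{
	return do_lru && !(has_mapping && index >= end);
}

/* Condition as suggested in this review. */
static bool add_to_lru_review(bool do_lru, bool has_mapping,
			      unsigned long index, unsigned long end)
{
	return do_lru && (!has_mapping || index < end);
}

/* Returns true when both forms agree on every combination tried. */
static bool forms_agree(void)
{
	const unsigned long end = 4;

	for (int do_lru = 0; do_lru <= 1; do_lru++)
		for (int has_mapping = 0; has_mapping <= 1; has_mapping++)
			for (unsigned long index = 0; index < 8; index++)
				if (add_to_lru_patch(do_lru, has_mapping, index, end) !=
				    add_to_lru_review(do_lru, has_mapping, index, end))
					return false;
	return true;
}
```

The two forms are related by De Morgan's law, so the rewrite is purely about
readability and about matching the later page-cache check.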

>  				lru_add_split_folio(folio, new_folio, lruvec, list);
>  
>  			/*

folio_check_splittable() makes sure that we have a mapping for non-anon folios
(i.e., the folio was not truncated). end is then only set for non-anon folios.

@Zi, any thoughts?

-- 
Cheers,

David



* Re: [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU
  2026-06-10 12:50 ` David Hildenbrand (Arm)
@ 2026-06-10 14:38   ` Zi Yan
  2026-06-10 17:25     ` Zi Yan
  2026-06-11  1:39     ` Zhaoyang Huang
  0 siblings, 2 replies; 16+ messages in thread
From: Zi Yan @ 2026-06-10 14:38 UTC (permalink / raw)
  To: David Hildenbrand (Arm), zhaoyang.huang
  Cc: Andrew Morton, Lorenzo Stoakes, Barry Song, Baolin Wang,
	Lance Yang, Liam R . Howlett, Nico Pache, Ryan Roberts, Dev Jain,
	linux-mm, linux-kernel, Zhaoyang Huang, steve.kang

On 10 Jun 2026, at 8:50, David Hildenbrand (Arm) wrote:

> On 6/10/26 14:05, zhaoyang.huang wrote:
>> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
>>
>> Kernel panics keep being reported, especially when the f2fs partition is
>> almost full. Investigation shows that an f2fs page was freed to the buddy
>> allocator without being deleted from the LRU. The root cause is the race
>> shown in [2], which was introduced by f2fs commit 9609dd704725 ("f2fs:
>> remove non-uptodate folio from the page cache in move_data_block").
>> Reverting that commit resolves the issue.
>
> But I assume, that other FSes can trigger this as well? Any insights?
>
>>
>> Three racing paths are involved in this scenario; their main activities
>> are described below. However, after further investigation of the code, I
>> think there is a common race window for truncated folios between
>> split_folio_to_order() and folio_isolate_lru(): the folios have lost
>> their page-cache references and keep only the transient reference of the
>> split caller, so a folio can enter the free path and race with the
>> isolation path. This patch proposes keeping folios beyond EOF off the
>> LRU.
>>
>> Truncate:
>> The changed code in move_data_block() lets the GC path evict the tail-end
>> folio from the page cache through folio_end_dropbehind().  Once
>> folio_unmap_invalidate() removes the folio from mapping->i_pages, the
>> page-cache references for all pages in the folio are dropped.  The folio
>> is then kept alive only by temporary external references, which allows a
>> later split to operate on a folio whose subpages are no longer protected
>> by page-cache references.
>>
>> Split:
>> After the page-cache references are gone, split_folio_to_order() can
>> split the big folio into individual pages and put the resulting subpages
>> back on the LRU.  For tail pages beyond EOF, split removes them from the
>> page cache and drops their page-cache references.  A tail page can then
>> remain on the LRU with PG_lru set while holding only the split caller's
>> temporary reference.  When free_folio_and_swap_cache() drops that final
>> reference, the page enters the final folio_put() release path.
>>
>> Isolate:
>> In parallel, folio_isolate_lru() can observe the same tail page with a
>> non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
>> reference.  If this races with the final folio_put() from the split path,
>> __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
>> The page is then freed back to the allocator while its lru links are
>> still present in the LRU list.  A later LRU operation on a neighboring
>> page detects the stale link and reports list corruption.
>
> Complicated mess :(
>
> So, folio_isolate_lru() really only requires the caller to hold a folio
> reference, which can happen given that we did the folio_ref_unfreeze(). It can,
> for example, be triggered by memory offlining or page migration.
>
> So we really want to not allow folio_isolate_lru() while we are still processing
> the folio.

Or we should defer adding split folios to the LRU until after unfreeze.

>
> What your patch does is, simply not add folios that we will drop from the page
> cache to the LRU?
>
>
> You should describe here how you are fixing it: "Let's fix it by..."
>
>>
>> [1]
>> [   22.486082] list_del corruption. next->prev should be fffffffec10e0ac8, but was dead000000000122. (next=fffffffec10e0a88)
>> [   22.486130] ------------[ cut here ]------------
>> [   22.486134] kernel BUG at lib/list_debug.c:67!
>> [   22.486141] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
>> [   22.488502] Tainted: [W]=WARN, [O]=OOT_MODULE
>> [   22.488506] Hardware name: Spreadtrum UMS9230 1H10 SoC (DT)
>> [   22.488511] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>> [   22.488517] pc : __list_del_entry_valid_or_report+0x14c/0x154
>> [   22.488531] lr : __list_del_entry_valid_or_report+0x14c/0x154
>> [   22.488539] sp : ffffffc08006b830
>> [   22.488542] x29: ffffffc08006b868 x28: 0000000000003020 x27: 0000000000000000
>> [   22.488553] x26: 0000000000000000 x25: 0000000000000004 x24: fffffffec10e0ac0
>> [   22.488564] x23: 00000000000000e8 x22: 0000000000000024 x21: dead000000000122
>> [   22.488574] x20: fffffffec10e0a88 x19: fffffffec10e0ac8 x18: ffffffc080061060
>> [   22.488585] x17: 20747562202c3863 x16: 6130653031636566 x15: 0000000000000058
>> [   22.488595] x14: 0000000000000004 x13: ffffff80f91e0000 x12: 0000000000000003
>> [   22.488605] x11: 0000000000000003 x10: 0000000000000001 x9 : ffe85721f0e25f00
>> [   22.488615] x8 : ffe85721f0e25f00 x7 : 0000000000000000 x6 : 6c65645f7473696c
>> [   22.488625] x5 : ffffffed39b23026 x4 : 0000000000000000 x3 : 0000000000000010
>> [   22.488636] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000006d
>> [   22.488647] Call trace:
>> [   22.488651]  __list_del_entry_valid_or_report+0x14c/0x154 (P)
>> [   22.488661]  __folio_put+0x2bc/0x434
>> [   22.488670]  folio_put+0x28/0x58
>> [   22.488678]  do_garbage_collect+0x1a34/0x2584
>> [   22.488689]  f2fs_gc+0x230/0x9b4
>> [   22.488697]  f2fs_fallocate+0xb90/0xdf4
>> [   22.488706]  vfs_fallocate+0x1b4/0x2bc
>> [   22.488716]  __arm64_sys_fallocate+0x44/0x78
>> [   22.488725]  invoke_syscall+0x58/0xe4
>> [   22.488732]  do_el0_svc+0x48/0xdc
>> [   22.488739]  el0_svc+0x3c/0x98
>> [   22.488747]  el0t_64_sync_handler+0x20/0x130
>> [   22.488754]  el0t_64_sync+0x1c4/0x1c8
>>
>> [2]
>> CPU0 (f2fs GC)              CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)
>>
>> F: pagecache refs = n
>> F: extra refs = GC + split
>> F: PG_lru set
>> move_data_block()
>> folio = f2fs_grab_cache_folio(F)
>> ...
>> __folio_set_dropbehind(F)
>> folio_unlock(F)
>> folio_end_dropbehind(F)
>>   folio_unmap_invalidate(F)
>>     __filemap_remove_folio(F)
>>     folio_put_refs(F, n)
>> folio_put(F)
>>                             split_folio_to_order(F)
>>                               folio_ref_freeze(F, 1)
>>                               ...
>>                               lru_add_split_folio(T)
>>                                 list_add_tail(&T->lru, &F->lru)
>>                                 folio_set_lru(T)
>>                               __filemap_remove_folio(T)
>>                               folio_put_refs(T, 1)
>>                               /* T refcount == 1, PageLRU set */
>>                             free_folio_and_swap_cache(T)
>>                               folio_put(T)
>>                                 /* refcount: 1 -> 0 */
>>                                                                   folio_isolate_lru(T)

If refcount is 0 at this point, VM_BUG_ON_FOLIO(!folio_ref_count(folio), folio) in
folio_isolate_lru() would be triggered. Maybe we could just return false in that case.

>>                                                                     folio_test_clear_lru(T)
>>                                 __folio_put(T)
>>                                   __page_cache_release(T)
>>                                     folio_test_lru(T) == false
>>                                     /* skip lruvec_del_folio(T) */
>>                                   free_frozen_pages(T)
>>                                                                   folio_get(T)
>>                                                                   lruvec_del_folio(T)

But in CPU2 (folio_isolate_lru), lruvec_del_folio(T) should remove T from LRU list.

>> later:
>>   list_del(adjacent->lru)
>>     next == &T->lru
>>     next->prev == LIST_POISON / PCP freelist
>>     BUG
>>

Why does CPU0 still see the stale link from adjacent?

>> Assisted-by: Cursor:claude-opus-4-8
>> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
>
> I'm wondering if this has been broken the whole time, or if some rework allowed
> this to trigger.
>
> I assume the issue can be triggered for other FSes, and we want Fixes: + CC: stable?
>
> Looking into the history, I think we always unconditionally did the
> lru_add_split_folio()/lru_add_page_tail().
>
>> ---
>>  mm/huge_memory.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 970e077019b7..7465525a94a8 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -3966,7 +3966,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
>>  			folio_ref_unfreeze(new_folio,
>>  					   folio_cache_ref_count(new_folio) + 1);
>>
>> -			if (do_lru)
>> +			if (do_lru && !(mapping && new_folio->index >= end))
>
> It might be clearer to write this as
>
> 	do_lru && (!mapping || new_folio->index < end)
>
> To match the page-cache check further below
>
> 	if (!mapping)
> 		continue
>
> 	...
> 	if (new_folio->index < end)
> 		...
>
>>  				lru_add_split_folio(folio, new_folio, lruvec, list);
>>
>>  			/*
>
> folio_check_splittable() makes sure that we have a mapping for non-anon folios
> (i.e., the folio was not truncated). end is then only set for non-anon folios.
>
> @Zi, any thoughts?

The fix works but I feel that it is masking the race between folio_isolate_lru() and
folio_put(). I worry that the same issue might be triggered in other ways or
in new code if we do not fix the race.

To summarize my thoughts above:
1. Adding frozen folios to the LRU might be problematic, since folio_isolate_lru()
has a VM_BUG_ON_FOLIO() for it but still chooses to proceed with the isolation.

2. The race analysis is not clear, since both folio_isolate_lru() and folio_put()
do lruvec_del_folio() if the folio is on the LRU. When list_del(adjacent->lru) sees
the stale link, the folio is already in buddy and page->lru is modified for
PageBuddy use? So even without CPU0, folio_isolate_lru()'s lruvec_del_folio()
can do the wrong thing on pages in buddy?
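
To make the "just return false" idea from point 1 concrete, here is a
sequential userspace sketch. It assumes the interleaving from [2] and uses
hypothetical toy_* names; the bail_on_zero_ref switch is an illustration of
the suggestion, not an existing kernel option, and a sequential replay says
nothing about whether such a check would be race-free in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a folio; on_lru_list models the list linkage. */
struct toy_folio {
	int refcount;
	bool pg_lru;
	bool on_lru_list;
	bool freed;
};

/*
 * Toy isolation entry. With bail_on_zero_ref set, it returns false for a
 * folio whose refcount is already zero instead of clearing the LRU flag.
 */
static bool toy_isolate_begin(struct toy_folio *f, bool bail_on_zero_ref)
{
	if (bail_on_zero_ref && f->refcount == 0)
		return false;
	if (!f->pg_lru)
		return false;
	f->pg_lru = false;
	return true;
}

/* Toy release path: unlink only when the LRU flag is still set. */
static void toy_release(struct toy_folio *f)
{
	if (f->pg_lru) {
		f->on_lru_list = false;
		f->pg_lru = false;
	}
	f->freed = true;
}

/* Returns true when the folio is freed while still linked on the LRU. */
static bool toy_stale_link_after_race(bool bail_on_zero_ref)
{
	struct toy_folio t = {
		.refcount = 1, .pg_lru = true, .on_lru_list = true, .freed = false,
	};

	t.refcount--;                            /* final put: 1 -> 0 */
	toy_isolate_begin(&t, bail_on_zero_ref); /* concurrent isolation */
	toy_release(&t);                         /* release path resumes */

	return t.freed && t.on_lru_list;
}
```

In this toy, the stale link only appears when isolation is allowed to clear
the flag on a zero-ref folio; the open question in point 2 is outside what
the sketch models.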


--
Best Regards,
Yan, Zi



* Re: [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU
  2026-06-10 14:38   ` Zi Yan
@ 2026-06-10 17:25     ` Zi Yan
  2026-06-10 18:44       ` Zi Yan
  2026-06-11  1:39     ` Zhaoyang Huang
  1 sibling, 1 reply; 16+ messages in thread
From: Zi Yan @ 2026-06-10 17:25 UTC (permalink / raw)
  To: David Hildenbrand (Arm), zhaoyang.huang
  Cc: Andrew Morton, Lorenzo Stoakes, Barry Song, Baolin Wang,
	Lance Yang, Liam R . Howlett, Nico Pache, Ryan Roberts, Dev Jain,
	linux-mm, linux-kernel, Zhaoyang Huang, steve.kang

On 10 Jun 2026, at 10:38, Zi Yan wrote:

> On 10 Jun 2026, at 8:50, David Hildenbrand (Arm) wrote:
>
>> On 6/10/26 14:05, zhaoyang.huang wrote:
>>> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
>>>
>>> Kernel panics keep being reported, especially when the f2fs partition is
>>> almost full. Investigation shows that an f2fs page was freed to the buddy
>>> allocator without being deleted from the LRU. The root cause is the race
>>> shown in [2], which was introduced by f2fs commit 9609dd704725 ("f2fs:
>>> remove non-uptodate folio from the page cache in move_data_block").
>>> Reverting that commit resolves the issue.
>>
>> But I assume, that other FSes can trigger this as well? Any insights?
>>
>>>
>>> Three racing paths are involved in this scenario; their main activities
>>> are described below. However, after further investigation of the code, I
>>> think there is a common race window for truncated folios between
>>> split_folio_to_order() and folio_isolate_lru(): the folios have lost
>>> their page-cache references and keep only the transient reference of the
>>> split caller, so a folio can enter the free path and race with the
>>> isolation path. This patch proposes keeping folios beyond EOF off the
>>> LRU.
>>>
>>> Truncate:
>>> The changed code in move_data_block() lets the GC path evict the tail-end
>>> folio from the page cache through folio_end_dropbehind().  Once
>>> folio_unmap_invalidate() removes the folio from mapping->i_pages, the
>>> page-cache references for all pages in the folio are dropped.  The folio
>>> is then kept alive only by temporary external references, which allows a
>>> later split to operate on a folio whose subpages are no longer protected
>>> by page-cache references.
>>>
>>> Split:
>>> After the page-cache references are gone, split_folio_to_order() can
>>> split the big folio into individual pages and put the resulting subpages
>>> back on the LRU.  For tail pages beyond EOF, split removes them from the
>>> page cache and drops their page-cache references.  A tail page can then
>>> remain on the LRU with PG_lru set while holding only the split caller's
>>> temporary reference.  When free_folio_and_swap_cache() drops that final
>>> reference, the page enters the final folio_put() release path.
>>>
>>> Isolate:
>>> In parallel, folio_isolate_lru() can observe the same tail page with a
>>> non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
>>> reference.  If this races with the final folio_put() from the split path,
>>> __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
>>> The page is then freed back to the allocator while its lru links are
>>> still present in the LRU list.  A later LRU operation on a neighboring
>>> page detects the stale link and reports list corruption.
>>
>> Complicated mess :(
>>
>> So, folio_isolate_lru() really only requires the caller to hold a folio
>> reference, which can happen given that we did the folio_ref_unfreeze(). It can,
>> for example, be triggered by memory offlining or page migration.
>>
>> So we really want to not allow folio_isolate_lru() while we are still processing
>> the folio.
>
> Or we should defer adding split folios to the LRU until after unfreeze.
>
>>
>> What your patch does is, simply not add folios that we will drop from the page
>> cache to the LRU?
>>
>>
>> You should describe here how you are fixing it: "Let's fix it by..."
>>
>>>
>>> [1]
>>> [   22.486082] list_del corruption. next->prev should be fffffffec10e0ac8, but was dead000000000122. (next=fffffffec10e0a88)
>>> [   22.486130] ------------[ cut here ]------------
>>> [   22.486134] kernel BUG at lib/list_debug.c:67!
>>> [   22.486141] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
>>> [   22.488502] Tainted: [W]=WARN, [O]=OOT_MODULE
>>> [   22.488506] Hardware name: Spreadtrum UMS9230 1H10 SoC (DT)
>>> [   22.488511] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>> [   22.488517] pc : __list_del_entry_valid_or_report+0x14c/0x154
>>> [   22.488531] lr : __list_del_entry_valid_or_report+0x14c/0x154
>>> [   22.488539] sp : ffffffc08006b830
>>> [   22.488542] x29: ffffffc08006b868 x28: 0000000000003020 x27: 0000000000000000
>>> [   22.488553] x26: 0000000000000000 x25: 0000000000000004 x24: fffffffec10e0ac0
>>> [   22.488564] x23: 00000000000000e8 x22: 0000000000000024 x21: dead000000000122
>>> [   22.488574] x20: fffffffec10e0a88 x19: fffffffec10e0ac8 x18: ffffffc080061060
>>> [   22.488585] x17: 20747562202c3863 x16: 6130653031636566 x15: 0000000000000058
>>> [   22.488595] x14: 0000000000000004 x13: ffffff80f91e0000 x12: 0000000000000003
>>> [   22.488605] x11: 0000000000000003 x10: 0000000000000001 x9 : ffe85721f0e25f00
>>> [   22.488615] x8 : ffe85721f0e25f00 x7 : 0000000000000000 x6 : 6c65645f7473696c
>>> [   22.488625] x5 : ffffffed39b23026 x4 : 0000000000000000 x3 : 0000000000000010
>>> [   22.488636] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000006d
>>> [   22.488647] Call trace:
>>> [   22.488651]  __list_del_entry_valid_or_report+0x14c/0x154 (P)
>>> [   22.488661]  __folio_put+0x2bc/0x434
>>> [   22.488670]  folio_put+0x28/0x58
>>> [   22.488678]  do_garbage_collect+0x1a34/0x2584
>>> [   22.488689]  f2fs_gc+0x230/0x9b4
>>> [   22.488697]  f2fs_fallocate+0xb90/0xdf4
>>> [   22.488706]  vfs_fallocate+0x1b4/0x2bc
>>> [   22.488716]  __arm64_sys_fallocate+0x44/0x78
>>> [   22.488725]  invoke_syscall+0x58/0xe4
>>> [   22.488732]  do_el0_svc+0x48/0xdc
>>> [   22.488739]  el0_svc+0x3c/0x98
>>> [   22.488747]  el0t_64_sync_handler+0x20/0x130
>>> [   22.488754]  el0t_64_sync+0x1c4/0x1c8
>>>
>>> [2]
>>> CPU0 (f2fs GC)              CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)
>>>
>>> F: pagecache refs = n
>>> F: extra refs = GC + split
>>> F: PG_lru set
>>> move_data_block()
>>> folio = f2fs_grab_cache_folio(F)
>>> ...
>>> __folio_set_dropbehind(F)
>>> folio_unlock(F)
>>> folio_end_dropbehind(F)
>>>   folio_unmap_invalidate(F)
>>>     __filemap_remove_folio(F)
>>>     folio_put_refs(F, n)
>>> folio_put(F)
>>>                             split_folio_to_order(F)
>>>                               folio_ref_freeze(F, 1)
>>>                               ...
>>>                               lru_add_split_folio(T)
>>>                                 list_add_tail(&T->lru, &F->lru)
>>>                                 folio_set_lru(T)
>>>                               __filemap_remove_folio(T)
>>>                               folio_put_refs(T, 1)
>>>                               /* T refcount == 1, PageLRU set */
>>>                             free_folio_and_swap_cache(T)
>>>                               folio_put(T)
>>>                                 /* refcount: 1 -> 0 */
>>>                                                                   folio_isolate_lru(T)
>
> If refcount is 0 at this point, VM_BUG_ON_FOLIO(!folio_ref_count(folio), folio) in
> folio_isolate_lru() would be triggered. Maybe we could just return false in that case.
>
>>>                                                                     folio_test_clear_lru(T)
>>>                                 __folio_put(T)
>>>                                   __page_cache_release(T)
>>>                                     folio_test_lru(T) == false
>>>                                     /* skip lruvec_del_folio(T) */
>>>                                   free_frozen_pages(T)
>>>                                                                   folio_get(T)
>>>                                                                   lruvec_del_folio(T)
>
> But in CPU2 (folio_isolate_lru), lruvec_del_folio(T) should remove T from LRU list.
>
>>> later:
>>>   list_del(adjacent->lru)
>>>     next == &T->lru
>>>     next->prev == LIST_POISON / PCP freelist
>>>     BUG
>>>
>
> Why does CPU0 still see the stale link from adjacent?
>
>>> Assisted-by: Cursor:claude-opus-4-8
>>> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
>>
>> I'm wondering if this has been broken the whole time, or if some rework allowed
>> this to trigger.
>>
>> I assume the issue can be triggered for other FSes, and we want Fixes: + CC: stable?
>>
>> Looking into the history, I think we always unconditionally did the
>> lru_add_split_folio()/lru_add_page_tail().
>>
>>> ---
>>>  mm/huge_memory.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>> index 970e077019b7..7465525a94a8 100644
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -3966,7 +3966,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
>>>  			folio_ref_unfreeze(new_folio,
>>>  					   folio_cache_ref_count(new_folio) + 1);
>>>
>>> -			if (do_lru)
>>> +			if (do_lru && !(mapping && new_folio->index >= end))
>>
>> It might be clearer to write this as
>>
>> 	do_lru && (!mapping || new_folio->index < end)
>>
>> To match the page-cache check further below
>>
>> 	if (!mapping)
>> 		continue
>>
>> 	...
>> 	if (new_folio->index < end)
>> 		...
>>
>>>  				lru_add_split_folio(folio, new_folio, lruvec, list);

I talked to Claude and found an accounting issue with this. If the after-split
folios beyond EOF are not put back on the LRU, they never go through
lruvec_del_folio(), which decrements the NR_*_LRU counters when it removes a
folio from the LRU, so the NR_*_LRU accounting ends up wrong. Note that the
original folio stays on the LRU the whole time and the split does not modify
the LRU counters; it only shrinks the original folio, so the after-split
folios need to be added back to the LRU to keep the counters right. We will
need to adjust the LRU accounting for (!mapping || new_folio->index < end) if
we decide not to add them back to the LRU.
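
To make the accounting concern concrete, here is a minimal userspace sketch.
It is not kernel code: toy_lruvec, toy_folio and every toy_* helper are
hypothetical names, and a single counter stands in for an NR_*_LRU statistic.
It assumes, as described above, that the split itself leaves the counter
untouched and that only the delete path subtracts a folio's size.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one counter standing in for an NR_*_LRU statistic. */
struct toy_lruvec {
	long nr_lru_pages;	/* pages the counter claims are on the LRU */
};

struct toy_folio {
	long nr_pages;
	bool on_lru;
};

/* Put a folio on the LRU and account for all of its pages. */
static void toy_lru_add(struct toy_lruvec *lv, struct toy_folio *f)
{
	lv->nr_lru_pages += f->nr_pages;
	f->on_lru = true;
}

/* Stand-in for lruvec_del_folio(): unlink and subtract the folio's size. */
static void toy_lru_del(struct toy_lruvec *lv, struct toy_folio *f)
{
	if (!f->on_lru)
		return;
	lv->nr_lru_pages -= f->nr_pages;
	f->on_lru = false;
}

/*
 * Split 'head' into head + one tail of 'tail_pages' pages.  The counter is
 * deliberately not touched: the pages were all accounted when the original
 * folio was added.  If tail_joins_lru is false, the tail is left off the
 * list although its pages are still included in the counter.
 */
static void toy_split(struct toy_folio *head, struct toy_folio *tail,
		      long tail_pages, bool tail_joins_lru)
{
	assert(tail_pages > 0 && tail_pages < head->nr_pages);
	head->nr_pages -= tail_pages;
	tail->nr_pages = tail_pages;
	tail->on_lru = tail_joins_lru;
}

/* Returns what is left in the counter after both folios are released. */
long toy_leak_after_free(bool tail_joins_lru)
{
	struct toy_lruvec lv = { 0 };
	struct toy_folio head = { .nr_pages = 16, .on_lru = false };
	struct toy_folio tail = { 0 };

	toy_lru_add(&lv, &head);
	toy_split(&head, &tail, 4, tail_joins_lru);
	toy_lru_del(&lv, &tail);	/* no-op if the tail never joined */
	toy_lru_del(&lv, &head);
	return lv.nr_lru_pages;
}
```

In this model toy_leak_after_free(true) leaves the counter at zero, while
toy_leak_after_free(false) leaves the four tail pages permanently counted,
which is the kind of drift described above.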


Best Regards,
Yan, Zi


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU
  2026-06-10 17:25     ` Zi Yan
@ 2026-06-10 18:44       ` Zi Yan
  2026-06-11  1:19         ` Zhaoyang Huang
  0 siblings, 1 reply; 16+ messages in thread
From: Zi Yan @ 2026-06-10 18:44 UTC (permalink / raw)
  To: David Hildenbrand (Arm), zhaoyang.huang
  Cc: Andrew Morton, Lorenzo Stoakes, Barry Song, Baolin Wang,
	Lance Yang, Liam R . Howlett, Nico Pache, Ryan Roberts, Dev Jain,
	linux-mm, linux-kernel, Zhaoyang Huang, steve.kang

On 10 Jun 2026, at 13:25, Zi Yan wrote:

> On 10 Jun 2026, at 10:38, Zi Yan wrote:
>
>> On 10 Jun 2026, at 8:50, David Hildenbrand (Arm) wrote:
>>
>>> On 6/10/26 14:05, zhaoyang.huang wrote:
>>>> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
>>>>
>>>> The kernel panics are keeping to be reported especially when the f2fs
>>>> partition get almost full. By investigation, we find that the reason is
>>>> one f2fs page got freed to buddy without being deleted from LRU and the
>>>> root cause is the race happened in [2] which is enrolled by this commit.
>>>> We solve this issue by reverting a f2fs commit 9609dd704725 ("f2fs: remove
>>>> non-uptodate folio from the page cache in move_data_block").
>>>
>>> But I assume, that other FSes can trigger this as well? Any insights?
>>>
>>>>
>>>> There are 3 race processes in this scenario, please find below for their
>>>> main activities. However, by further investigation over the code, I
>>>> think there is a common race window for the truncated folios between
>>>> split_folio_to_order and folio_isolate_lru, where the folios lost the
>>>> refcount on page cache and remains the transient one of the split
>>>> caller, under which the folio could enter free path and compete with the
>>>> isolation process. This commit would like to suggest to have the folios
>>>> beyond EOF stay out of LRU.
>>>>
>>>> Truncate:
>>>> The changed code in move_data_block() lets the GC path evict the tail-end
>>>> folio from the page cache through folio_end_dropbehind().  Once
>>>> folio_unmap_invalidate() removes the folio from mapping->i_pages, the
>>>> page-cache references for all pages in the folio are dropped.  The folio
>>>> is then kept alive only by temporary external references, which allows a
>>>> later split to operate on a folio whose subpages are no longer protected
>>>> by page-cache references.
>>>>
>>>> Split:
>>>> After the page-cache references are gone, split_folio_to_order() can
>>>> split the big folio into individual pages and put the resulting subpages
>>>> back on the LRU.  For tail pages beyond EOF, split removes them from the
>>>> page cache and drops their page-cache references.  A tail page can then
>>>> remain on the LRU with PG_lru set while holding only the split caller's
>>>> temporary reference.  When free_folio_and_swap_cache() drops that final
>>>> reference, the page enters the final folio_put() release path.
>>>>
>>>> Isolate:
>>>> In parallel, folio_isolate_lru() can observe the same tail page with a
>>>> non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
>>>> reference.  If this races with the final folio_put() from the split path,
>>>> __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
>>>> The page is then freed back to the allocator while its lru links are
>>>> still present in the LRU list.  A later LRU operation on a neighboring
>>>> page detects the stale link and reports list corruption.

Something is wrong here with the caller of folio_isolate_lru(), since
folio_isolate_lru() requires the caller to hold an elevated refcount.
This means that when entering folio_isolate_lru(), the EOF folio should have
a refcount of at least 2: 1 from folio_split() and 1 from the caller of
folio_isolate_lru(). This should prevent the EOF folio from being freed
by the parallel __folio_put().

Hi Zhaoyang, can you elaborate on the folio_isolate_lru() caller?

In addition (with the help of Claude), the race trace [2] below
looks invalid. It says the split happens after folio_end_dropbehind(),
which sets folio->mapping to NULL, but __folio_split() returns -EBUSY
when folio->mapping is NULL in the filemap_release_folio() check.
So the split cannot happen.

Now I am not sure whether the bug report is valid. At least for
folio_split() and folio_isolate_lru(), the race should not exist.
But let me know if I missed anything.
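
The refcount argument can be sketched in userspace as follows. This is only
a model under stated assumptions: toy_rc_folio, toy_rc_get(), toy_rc_put()
and toy_freed_under_isolation() are hypothetical names, and C11 atomics
stand in for the kernel's folio refcount.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy folio: a reference count plus a flag recording that it was freed. */
struct toy_rc_folio {
	atomic_int refcount;
	bool freed;
};

/* Take a reference; the caller is assumed to already own one. */
static void toy_rc_get(struct toy_rc_folio *f)
{
	atomic_fetch_add(&f->refcount, 1);
}

/* Drop a reference; only the final put releases the folio. */
static void toy_rc_put(struct toy_rc_folio *f)
{
	if (atomic_fetch_sub(&f->refcount, 1) == 1)
		f->freed = true;
}

/*
 * Returns true if the split path's put frees the folio while an isolating
 * caller would still be working on it.
 */
bool toy_freed_under_isolation(bool caller_holds_ref)
{
	struct toy_rc_folio f = { .freed = false };

	atomic_init(&f.refcount, 1);	/* split caller's temporary ref */
	if (caller_holds_ref)
		toy_rc_get(&f);		/* isolate caller's own ref */
	assert(atomic_load(&f.refcount) >= 1);

	toy_rc_put(&f);			/* split path drops its ref */
	return f.freed;
}
```

With the caller's reference held, the split-side put goes from 2 to 1 and
nothing is freed; only when that reference is missing does the put reach
zero. That is why the caller of folio_isolate_lru() is the interesting part.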

>>>
>>> Complicated mess :(
>>>
>>> So, folio_isolate_lru() really only requires the caller to hold a folio
>>> reference, which can happen given that we did the folio_ref_unfreeze(). It can,
>>> for example, be triggered by memory offlining or page migration.
>>>
>>> So we really want to not allow folio_isolate_lru() while we are still processing
>>> the folio.
>>
>> Or we should defer adding split folios to LRU after unfreeze.
>>
>>>
>>> What your patch does is, simply not add folios that we will drop from the page
>>> cache to the LRU?
>>>
>>>
>>> You should describe here how you are fixing it: "Let's fix it by..."
>>>
>>>>
>>>> [1]
>>>> [   22.486082] list_del corruption. next->prev should be fffffffec10e0ac8, but was dead000000000122. (next=fffffffec10e0a88)
>>>> [   22.486130] ------------[ cut here ]------------
>>>> [   22.486134] kernel BUG at lib/list_debug.c:67!
>>>> [   22.486141] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
>>>> [   22.488502] Tainted: [W]=WARN, [O]=OOT_MODULE
>>>> [   22.488506] Hardware name: Spreadtrum UMS9230 1H10 SoC (DT)
>>>> [   22.488511] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>>> [   22.488517] pc : __list_del_entry_valid_or_report+0x14c/0x154
>>>> [   22.488531] lr : __list_del_entry_valid_or_report+0x14c/0x154
>>>> [   22.488539] sp : ffffffc08006b830
>>>> [   22.488542] x29: ffffffc08006b868 x28: 0000000000003020 x27: 0000000000000000
>>>> [   22.488553] x26: 0000000000000000 x25: 0000000000000004 x24: fffffffec10e0ac0
>>>> [   22.488564] x23: 00000000000000e8 x22: 0000000000000024 x21: dead000000000122
>>>> [   22.488574] x20: fffffffec10e0a88 x19: fffffffec10e0ac8 x18: ffffffc080061060
>>>> [   22.488585] x17: 20747562202c3863 x16: 6130653031636566 x15: 0000000000000058
>>>> [   22.488595] x14: 0000000000000004 x13: ffffff80f91e0000 x12: 0000000000000003
>>>> [   22.488605] x11: 0000000000000003 x10: 0000000000000001 x9 : ffe85721f0e25f00
>>>> [   22.488615] x8 : ffe85721f0e25f00 x7 : 0000000000000000 x6 : 6c65645f7473696c
>>>> [   22.488625] x5 : ffffffed39b23026 x4 : 0000000000000000 x3 : 0000000000000010
>>>> [   22.488636] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000006d
>>>> [   22.488647] Call trace:
>>>> [   22.488651]  __list_del_entry_valid_or_report+0x14c/0x154 (P)
>>>> [   22.488661]  __folio_put+0x2bc/0x434
>>>> [   22.488670]  folio_put+0x28/0x58
>>>> [   22.488678]  do_garbage_collect+0x1a34/0x2584
>>>> [   22.488689]  f2fs_gc+0x230/0x9b4
>>>> [   22.488697]  f2fs_fallocate+0xb90/0xdf4
>>>> [   22.488706]  vfs_fallocate+0x1b4/0x2bc
>>>> [   22.488716]  __arm64_sys_fallocate+0x44/0x78
>>>> [   22.488725]  invoke_syscall+0x58/0xe4
>>>> [   22.488732]  do_el0_svc+0x48/0xdc
>>>> [   22.488739]  el0_svc+0x3c/0x98
>>>> [   22.488747]  el0t_64_sync_handler+0x20/0x130
>>>> [   22.488754]  el0t_64_sync+0x1c4/0x1c8
>>>>
>>>> [2]
>>>> CPU0 (f2fs GC)              CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)
>>>>
>>>> F: pagecache refs = n
>>>> F: extra refs = GC + split
>>>> F: PG_lru set
>>>> move_data_block()
>>>> folio = f2fs_grab_cache_folio(F)
>>>> ...
>>>> __folio_set_dropbehind(F)
>>>> folio_unlock(F)
>>>> folio_end_dropbehind(F)
>>>>   folio_unmap_invalidate(F)
>>>>     __filemap_remove_folio(F)
>>>>     folio_put_refs(F, n)
>>>> folio_put(F)
>>>>                             split_folio_to_order(F)
>>>>                               folio_ref_freeze(F, 1)
>>>>                               ...
>>>>                               lru_add_split_folio(T)
>>>>                                 list_add_tail(&T->lru, &F->lru)
>>>>                                 folio_set_lru(T)
>>>>                               __filemap_remove_folio(T)
>>>>                               folio_put_refs(T, 1)
>>>>                               /* T refcount == 1, PageLRU set */
>>>>                             free_folio_and_swap_cache(T)
>>>>                               folio_put(T)
>>>>                                 /* refcount: 1 -> 0 */
>>>>                                                                   folio_isolate_lru(T)
>>
>> If refcount is 0 at this point, VM_BUG_ON_FOLIO(!folio_ref_count(folio), folio) in
>> folio_isolate_lru() would be triggered. Maybe we could just return false in that case.
>>
>>>>                                                                     folio_test_clear_lru(T)
>>>>                                 __folio_put(T)
>>>>                                   __page_cache_release(T)
>>>>                                     folio_test_lru(T) == false
>>>>                                     /* skip lruvec_del_folio(T) */
>>>>                                   free_frozen_pages(T)
>>>>                                                                   folio_get(T)
>>>>                                                                   lruvec_del_folio(T)
>>
>> But in CPU2 (folio_isolate_lru), lruvec_del_folio(T) should remove T from LRU list.
>>
>>>> later:
>>>>   list_del(adjacent->lru)
>>>>     next == &T->lru
>>>>     next->prev == LIST_POISON / PCP freelist
>>>>     BUG
>>>>
>>
>> Why does CPU0 still see the stale link from adjacent?
>>
>>>> Assisted-by: Cursor:claude-opus-4-8
>>>> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
>>>
>>> I'm wondering if this has been broken the whole time, or if some rework allowed
>>> this to trigger.
>>>
>>> I assume the issue can be triggered for other FSes, and we want Fixes: + CC: stable?
>>>
>>> Looking into the history, I think we always unconditionally did the
>>> lru_add_split_folio()/lru_add_page_tail().
>>>
>>>> ---
>>>>  mm/huge_memory.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>>> index 970e077019b7..7465525a94a8 100644
>>>> --- a/mm/huge_memory.c
>>>> +++ b/mm/huge_memory.c
>>>> @@ -3966,7 +3966,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
>>>>  			folio_ref_unfreeze(new_folio,
>>>>  					   folio_cache_ref_count(new_folio) + 1);
>>>>
>>>> -			if (do_lru)
>>>> +			if (do_lru && !(mapping && new_folio->index >= end))
>>>
>>> It might be clearer to write this as
>>>
>>> 	do_lru && (!mapping || new_folio->index < end)
>>>
>>> To match the page-cache check further below
>>>
>>> 	if (!mapping)
>>> 		continue
>>>
>>> 	...
>>> 	if (new_folio->index < end)
>>> 		...
>>>
>>>>  				lru_add_split_folio(folio, new_folio, lruvec, list);
>
> Talked to Claude and find an accounting issue with this. Without putting
> EOF after-split folios back to LRU, they are not going through lruvec_del_folio(),
> which decreases NR_*_LRU counter along with removing the folio from LRU
> and it causes NR_*_LRU accounting errors. Note that the original folio
> is on LRU all the time and LRU counters are not modified and after the split
> the original folio size is decreased and the after-split folios need to
> be added back to LRU to keep the LRU counters right. We will need to adjust
> LRU accounting for (!mapping || new_folio->index < end) if we decide to
> not add them back to LRU.
>
>
> Best Regards,
> Yan, Zi


Best Regards,
Yan, Zi


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU
  2026-06-10 18:44       ` Zi Yan
@ 2026-06-11  1:19         ` Zhaoyang Huang
  2026-06-11  1:49           ` Zi Yan
  0 siblings, 1 reply; 16+ messages in thread
From: Zhaoyang Huang @ 2026-06-11  1:19 UTC (permalink / raw)
  To: Zi Yan
  Cc: David Hildenbrand (Arm), zhaoyang.huang, Andrew Morton,
	Lorenzo Stoakes, Barry Song, Baolin Wang, Lance Yang,
	Liam R . Howlett, Nico Pache, Ryan Roberts, Dev Jain, linux-mm,
	linux-kernel, steve.kang, xiuhong.wang@unisoc.com, hao_hao.wang

On Thu, Jun 11, 2026 at 2:44 AM Zi Yan <ziy@nvidia.com> wrote:
>
> On 10 Jun 2026, at 13:25, Zi Yan wrote:
>
> > On 10 Jun 2026, at 10:38, Zi Yan wrote:
> >
> >> On 10 Jun 2026, at 8:50, David Hildenbrand (Arm) wrote:
> >>
> >>> On 6/10/26 14:05, zhaoyang.huang wrote:
> >>>> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> >>>>
> >>>> The kernel panics are keeping to be reported especially when the f2fs
> >>>> partition get almost full. By investigation, we find that the reason is
> >>>> one f2fs page got freed to buddy without being deleted from LRU and the
> >>>> root cause is the race happened in [2] which is enrolled by this commit.
> >>>> We solve this issue by reverting a f2fs commit 9609dd704725 ("f2fs: remove
> >>>> non-uptodate folio from the page cache in move_data_block").
> >>>
> >>> But I assume, that other FSes can trigger this as well? Any insights?
> >>>
> >>>>
> >>>> There are 3 race processes in this scenario, please find below for their
> >>>> main activities. However, by further investigation over the code, I
> >>>> think there is a common race window for the truncated folios between
> >>>> split_folio_to_order and folio_isolate_lru, where the folios lost the
> >>>> refcount on page cache and remains the transient one of the split
> >>>> caller, under which the folio could enter free path and compete with the
> >>>> isolation process. This commit would like to suggest to have the folios
> >>>> beyond EOF stay out of LRU.
> >>>>
> >>>> Truncate:
> >>>> The changed code in move_data_block() lets the GC path evict the tail-end
> >>>> folio from the page cache through folio_end_dropbehind().  Once
> >>>> folio_unmap_invalidate() removes the folio from mapping->i_pages, the
> >>>> page-cache references for all pages in the folio are dropped.  The folio
> >>>> is then kept alive only by temporary external references, which allows a
> >>>> later split to operate on a folio whose subpages are no longer protected
> >>>> by page-cache references.
> >>>>
> >>>> Split:
> >>>> After the page-cache references are gone, split_folio_to_order() can
> >>>> split the big folio into individual pages and put the resulting subpages
> >>>> back on the LRU.  For tail pages beyond EOF, split removes them from the
> >>>> page cache and drops their page-cache references.  A tail page can then
> >>>> remain on the LRU with PG_lru set while holding only the split caller's
> >>>> temporary reference.  When free_folio_and_swap_cache() drops that final
> >>>> reference, the page enters the final folio_put() release path.
> >>>>
> >>>> Isolate:
> >>>> In parallel, folio_isolate_lru() can observe the same tail page with a
> >>>> non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
> >>>> reference.  If this races with the final folio_put() from the split path,
> >>>> __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
> >>>> The page is then freed back to the allocator while its lru links are
> >>>> still present in the LRU list.  A later LRU operation on a neighboring
> >>>> page detects the stale link and reports list corruption.
>
> Something is wrong here with the caller of folio_isolate_lru(), since
> folio_isolate_lru() requires the caller to take an elevated refcount.
> This means when entering folio_isolate_lru(), the EOF folio should have
> at least refcount == 2, 1 from folio_split(), 1 from the caller of
> folio_isolate_lru(). This should prevent the EOF folio being freed
> by the parallel __folio_put().
This is one of the key points of this issue. Could the isolate caller
grab its reference (with folio_get() rather than folio_try_get()) after the
splitter's folio_put() -> folio_put_testzero()? If so, the panic can
happen:

CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)

split_folio_to_order(F)
  folio_ref_freeze(F, 1)
  ...
  lru_add_split_folio(T)
    list_add_tail(&T->lru, &F->lru)
    folio_set_lru(T)
  __filemap_remove_folio(T)
  folio_put_refs(T, 1)
  /* T refcount == 1, PageLRU set */
free_folio_and_swap_cache(T)
  folio_put(T)
    /* refcount: 1 -> 0 */
                                     // caller grab the refcount here?
                                     folio_isolate_lru(T)
                                       folio_test_clear_lru(T)
    __folio_put(T)
      __page_cache_release(T)
        folio_test_lru(T) == false
        /* skip lruvec_del_folio(T) */
      free_frozen_pages(T)
                                     folio_get(T)
                                     lruvec_del_folio(T)
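
The question above hinges on the difference between an unconditional
increment and an increment-unless-zero. A minimal userspace sketch of that
difference, assuming C11 atomics as a stand-in for the folio refcount (the
toy_ref_* helpers and toy_count_after_late_get() are hypothetical names, not
kernel APIs):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Unconditional increment: only safe if the caller already holds a ref. */
static void toy_ref_get(atomic_int *ref)
{
	atomic_fetch_add(ref, 1);
}

/* Increment unless the count is already zero; reports whether it worked. */
static bool toy_ref_try_get(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Decrement and report whether this was the final reference. */
static bool toy_ref_put_testzero(atomic_int *ref)
{
	return atomic_fetch_sub(ref, 1) == 1;
}

/*
 * The final put has already happened (1 -> 0), so the release path is
 * committed.  Returns the count seen after a late attempt to take a ref.
 */
int toy_count_after_late_get(bool use_try_get)
{
	atomic_int ref;
	bool last;

	atomic_init(&ref, 1);
	last = toy_ref_put_testzero(&ref);
	assert(last);

	if (use_try_get)
		(void)toy_ref_try_get(&ref);	/* refuses: count stays 0 */
	else
		toy_ref_get(&ref);		/* revives a dying object */
	return atomic_load(&ref);
}
```

In this model the try-get variant leaves the count at 0, while the
unconditional get raises it back to 1 on an object whose release has
already started. Whether any real caller of folio_isolate_lru() can take
its reference this late is exactly the open question.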
>
> Hi Zhaoyang, can you elaborate on the folio_isolate_lru() caller?
Sorry, no. The split and isolate steps are merely assumptions inferred from the observed symptoms.
>
> In addition (with the help of Claude), the race trace[2] below
> looks invalid. It says split happens after folio_end_dropbehind(),
> which sets folio->mapping to NULL, but __folio_split() returns -EBUSY
> when folio->mapping is NULL in filemap_release_folio() check.
> So the split cannot happen.
Could folio_needs_release() return false in that check?

	if (!folio_needs_release(folio))
		return true;

>
> Now I am not sure if the bug report is valid or not. At least for
> folio_split() and folio_isolate_lru(), the race should not exist.
> But let me know if I miss anything.
>
> >>>
> >>> Complicated mess :(
> >>>
> >>> So, folio_isolate_lru() really only requires the caller to hold a folio
> >>> reference, which can happen given that we did the folio_ref_unfreeze(). It can,
> >>> for example, be triggered by memory offlining or page migration.
> >>>
> >>> So we really want to not allow folio_isolate_lru() while we are still processing
> >>> the folio.
> >>
> >> Or we should defer adding split folios to LRU after unfreeze.
> >>
> >>>
> >>> What your patch does is, simply not add folios that we will drop from the page
> >>> cache to the LRU?
> >>>
> >>>
> >>> You should describe here how you are fixing it: "Let's fix it by..."
> >>>
> >>>>
> >>>> [1]
> >>>> [   22.486082] list_del corruption. next->prev should be fffffffec10e0ac8, but was dead000000000122. (next=fffffffec10e0a88)
> >>>> [   22.486130] ------------[ cut here ]------------
> >>>> [   22.486134] kernel BUG at lib/list_debug.c:67!
> >>>> [   22.486141] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
> >>>> [   22.488502] Tainted: [W]=WARN, [O]=OOT_MODULE
> >>>> [   22.488506] Hardware name: Spreadtrum UMS9230 1H10 SoC (DT)
> >>>> [   22.488511] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> >>>> [   22.488517] pc : __list_del_entry_valid_or_report+0x14c/0x154
> >>>> [   22.488531] lr : __list_del_entry_valid_or_report+0x14c/0x154
> >>>> [   22.488539] sp : ffffffc08006b830
> >>>> [   22.488542] x29: ffffffc08006b868 x28: 0000000000003020 x27: 0000000000000000
> >>>> [   22.488553] x26: 0000000000000000 x25: 0000000000000004 x24: fffffffec10e0ac0
> >>>> [   22.488564] x23: 00000000000000e8 x22: 0000000000000024 x21: dead000000000122
> >>>> [   22.488574] x20: fffffffec10e0a88 x19: fffffffec10e0ac8 x18: ffffffc080061060
> >>>> [   22.488585] x17: 20747562202c3863 x16: 6130653031636566 x15: 0000000000000058
> >>>> [   22.488595] x14: 0000000000000004 x13: ffffff80f91e0000 x12: 0000000000000003
> >>>> [   22.488605] x11: 0000000000000003 x10: 0000000000000001 x9 : ffe85721f0e25f00
> >>>> [   22.488615] x8 : ffe85721f0e25f00 x7 : 0000000000000000 x6 : 6c65645f7473696c
> >>>> [   22.488625] x5 : ffffffed39b23026 x4 : 0000000000000000 x3 : 0000000000000010
> >>>> [   22.488636] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000006d
> >>>> [   22.488647] Call trace:
> >>>> [   22.488651]  __list_del_entry_valid_or_report+0x14c/0x154 (P)
> >>>> [   22.488661]  __folio_put+0x2bc/0x434
> >>>> [   22.488670]  folio_put+0x28/0x58
> >>>> [   22.488678]  do_garbage_collect+0x1a34/0x2584
> >>>> [   22.488689]  f2fs_gc+0x230/0x9b4
> >>>> [   22.488697]  f2fs_fallocate+0xb90/0xdf4
> >>>> [   22.488706]  vfs_fallocate+0x1b4/0x2bc
> >>>> [   22.488716]  __arm64_sys_fallocate+0x44/0x78
> >>>> [   22.488725]  invoke_syscall+0x58/0xe4
> >>>> [   22.488732]  do_el0_svc+0x48/0xdc
> >>>> [   22.488739]  el0_svc+0x3c/0x98
> >>>> [   22.488747]  el0t_64_sync_handler+0x20/0x130
> >>>> [   22.488754]  el0t_64_sync+0x1c4/0x1c8
> >>>>
> >>>> [2]
> >>>> CPU0 (f2fs GC)              CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)
> >>>>
> >>>> F: pagecache refs = n
> >>>> F: extra refs = GC + split
> >>>> F: PG_lru set
> >>>> move_data_block()
> >>>> folio = f2fs_grab_cache_folio(F)
> >>>> ...
> >>>> __folio_set_dropbehind(F)
> >>>> folio_unlock(F)
> >>>> folio_end_dropbehind(F)
> >>>>   folio_unmap_invalidate(F)
> >>>>     __filemap_remove_folio(F)
> >>>>     folio_put_refs(F, n)
> >>>> folio_put(F)
> >>>>                             split_folio_to_order(F)
> >>>>                               folio_ref_freeze(F, 1)
> >>>>                               ...
> >>>>                               lru_add_split_folio(T)
> >>>>                                 list_add_tail(&T->lru, &F->lru)
> >>>>                                 folio_set_lru(T)
> >>>>                               __filemap_remove_folio(T)
> >>>>                               folio_put_refs(T, 1)
> >>>>                               /* T refcount == 1, PageLRU set */
> >>>>                             free_folio_and_swap_cache(T)
> >>>>                               folio_put(T)
> >>>>                                 /* refcount: 1 -> 0 */
> >>>>                                                                   folio_isolate_lru(T)
> >>
> >> If refcount is 0 at this point, VM_BUG_ON_FOLIO(!folio_ref_count(folio), folio) in
> >> folio_isolate_lru() would be triggered. Maybe we could just return false in that case.
> >>
> >>>>                                                                     folio_test_clear_lru(T)
> >>>>                                 __folio_put(T)
> >>>>                                   __page_cache_release(T)
> >>>>                                     folio_test_lru(T) == false
> >>>>                                     /* skip lruvec_del_folio(T) */
> >>>>                                   free_frozen_pages(T)
> >>>>                                                                   folio_get(T)
> >>>>                                                                   lruvec_del_folio(T)
> >>
> >> But in CPU2 (folio_isolate_lru), lruvec_del_folio(T) should remove T from LRU list.
> >>
> >>>> later:
> >>>>   list_del(adjacent->lru)
> >>>>     next == &T->lru
> >>>>     next->prev == LIST_POISON / PCP freelist
> >>>>     BUG
> >>>>
> >>
> >> Why does CPU0 still see the stale link from adjacent?
> >>
> >>>> Assisted-by: Cursor:claude-opus-4-8
> >>>> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> >>>
> >>> I'm wondering if this has been broken the whole time, or if some rework allowed
> >>> this to trigger.
> >>>
> >>> I assume the issue can be triggered for other FSes, and we want Fixes: + CC: stable?
> >>>
> >>> Looking into the history, I think we always unconditionally did the
> >>> lru_add_split_folio()/lru_add_page_tail().
> >>>
> >>>> ---
> >>>>  mm/huge_memory.c | 2 +-
> >>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >>>> index 970e077019b7..7465525a94a8 100644
> >>>> --- a/mm/huge_memory.c
> >>>> +++ b/mm/huge_memory.c
> >>>> @@ -3966,7 +3966,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
> >>>>                    folio_ref_unfreeze(new_folio,
> >>>>                                       folio_cache_ref_count(new_folio) + 1);
> >>>>
> >>>> -                  if (do_lru)
> >>>> +                  if (do_lru && !(mapping && new_folio->index >= end))
> >>>
> >>> It might be clearer to write this as
> >>>
> >>>     do_lru && (!mapping || new_folio->index < end)
> >>>
> >>> To match the page-cache check further below
> >>>
> >>>     if (!mapping)
> >>>             continue
> >>>
> >>>     ...
> >>>     if (new_folio->index < end)
> >>>             ...
> >>>
> >>>>                            lru_add_split_folio(folio, new_folio, lruvec, list);
> >
> > Talked to Claude and find an accounting issue with this. Without putting
> > EOF after-split folios back to LRU, they are not going through lruvec_del_folio(),
> > which decreases NR_*_LRU counter along with removing the folio from LRU
> > and it causes NR_*_LRU accounting errors. Note that the original folio
> > is on LRU all the time and LRU counters are not modified and after the split
> > the original folio size is decreased and the after-split folios need to
> > be added back to LRU to keep the LRU counters right. We will need to adjust
> > LRU accounting for (!mapping || new_folio->index < end) if we decide to
> > not add them back to LRU.
> >
> >
> > Best Regards,
> > Yan, Zi
>
>
> Best Regards,
> Yan, Zi


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU
  2026-06-11  1:19         ` Zhaoyang Huang
@ 2026-06-11  1:49           ` Zi Yan
  0 siblings, 0 replies; 16+ messages in thread
From: Zi Yan @ 2026-06-11  1:49 UTC (permalink / raw)
  To: Zhaoyang Huang
  Cc: David Hildenbrand (Arm), zhaoyang.huang, Andrew Morton,
	Lorenzo Stoakes, Barry Song, Baolin Wang, Lance Yang,
	Liam R . Howlett, Nico Pache, Ryan Roberts, Dev Jain, linux-mm,
	linux-kernel, steve.kang, xiuhong.wang, hao_hao.wang

On 10 Jun 2026, at 21:19, Zhaoyang Huang wrote:

> On Thu, Jun 11, 2026 at 2:44 AM Zi Yan <ziy@nvidia.com> wrote:
>>
>> On 10 Jun 2026, at 13:25, Zi Yan wrote:
>>
>>> On 10 Jun 2026, at 10:38, Zi Yan wrote:
>>>
>>>> On 10 Jun 2026, at 8:50, David Hildenbrand (Arm) wrote:
>>>>
>>>>> On 6/10/26 14:05, zhaoyang.huang wrote:
>>>>>> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
>>>>>>
>>>>>> The kernel panics are keeping to be reported especially when the f2fs
>>>>>> partition get almost full. By investigation, we find that the reason is
>>>>>> one f2fs page got freed to buddy without being deleted from LRU and the
>>>>>> root cause is the race happened in [2] which is enrolled by this commit.
>>>>>> We solve this issue by reverting a f2fs commit 9609dd704725 ("f2fs: remove
>>>>>> non-uptodate folio from the page cache in move_data_block").
>>>>>
>>>>> But I assume, that other FSes can trigger this as well? Any insights?
>>>>>
>>>>>>
>>>>>> There are 3 race processes in this scenario, please find below for their
>>>>>> main activities. However, by further investigation over the code, I
>>>>>> think there is a common race window for the truncated folios between
>>>>>> split_folio_to_order and folio_isolate_lru, where the folios lost the
>>>>>> refcount on page cache and remains the transient one of the split
>>>>>> caller, under which the folio could enter free path and compete with the
>>>>>> isolation process. This commit would like to suggest to have the folios
>>>>>> beyond EOF stay out of LRU.
>>>>>>
>>>>>> Truncate:
>>>>>> The changed code in move_data_block() lets the GC path evict the tail-end
>>>>>> folio from the page cache through folio_end_dropbehind().  Once
>>>>>> folio_unmap_invalidate() removes the folio from mapping->i_pages, the
>>>>>> page-cache references for all pages in the folio are dropped.  The folio
>>>>>> is then kept alive only by temporary external references, which allows a
>>>>>> later split to operate on a folio whose subpages are no longer protected
>>>>>> by page-cache references.
>>>>>>
>>>>>> Split:
>>>>>> After the page-cache references are gone, split_folio_to_order() can
>>>>>> split the big folio into individual pages and put the resulting subpages
>>>>>> back on the LRU.  For tail pages beyond EOF, split removes them from the
>>>>>> page cache and drops their page-cache references.  A tail page can then
>>>>>> remain on the LRU with PG_lru set while holding only the split caller's
>>>>>> temporary reference.  When free_folio_and_swap_cache() drops that final
>>>>>> reference, the page enters the final folio_put() release path.
>>>>>>
>>>>>> Isolate:
>>>>>> In parallel, folio_isolate_lru() can observe the same tail page with a
>>>>>> non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
>>>>>> reference.  If this races with the final folio_put() from the split path,
>>>>>> __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
>>>>>> The page is then freed back to the allocator while its lru links are
>>>>>> still present in the LRU list.  A later LRU operation on a neighboring
>>>>>> page detects the stale link and reports list corruption.
>>
>> Something is wrong here with the caller of folio_isolate_lru(), since
>> folio_isolate_lru() requires the caller to take an elevated refcount.
>> This means when entering folio_isolate_lru(), the EOF folio should have
>> at least refcount == 2, 1 from folio_split(), 1 from the caller of
>> folio_isolate_lru(). This should prevent the EOF folio being freed
>> by the parallel __folio_put().
> This is one of the key points of this issue. Could the isolate caller
> grab the refcount (by folio_get(), not folio_try_get()) after the
> splitter's folio_put->folio_put_testzero? If it can, then the panic
> happens
>
>                     CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)
>
>                             split_folio_to_order(F)
>                               folio_ref_freeze(F, 1)
>                               ...
>                               lru_add_split_folio(T)
>                                 list_add_tail(&T->lru, &F->lru)
>                                 folio_set_lru(T)
>                               __filemap_remove_folio(T)
>                               folio_put_refs(T, 1)
>                               /* T refcount == 1, PageLRU set */
>                             free_folio_and_swap_cache(T)
>                               folio_put(T)
>                                 /* refcount: 1 -> 0 */
>
> //caller grab the refcount here?

Which caller calls folio_get() instead of folio_try_get()?
Claude did not find any caller doing folio_get() + folio_isolate_lru(),
except migrate_device_unmap(), which holds a page table
lock to make sure the folio has a mapping and a non-zero refcount.

Even folio_get() has
VM_BUG_ON_FOLIO(folio_ref_zero_or_close_to_overflow(folio), folio),
which prevents a caller from elevating the refcount of a 0-refcount folio,
unless your runs did not have CONFIG_DEBUG_VM enabled.
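
The difference between the two helpers can be sketched in userspace with
C11 atomics. This is not kernel code: struct model_folio and all model_*
functions are hypothetical stand-ins, and model_get_would_bug() only
imitates the zero-or-close-to-overflow check under the assumption that it
treats a count of zero (or a small negative count) as invalid.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a folio's refcount. */
struct model_folio {
	atomic_int refcount;
};

/* Models folio_try_get(): takes a reference only if the count is non-zero. */
bool model_try_get(struct model_folio *f)
{
	int old = atomic_load(&f->refcount);

	while (old != 0) {
		/* On failure 'old' is refreshed and the zero check repeats. */
		if (atomic_compare_exchange_weak(&f->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Models folio_put_testzero(): true when this put dropped the last ref. */
bool model_put_testzero(struct model_folio *f)
{
	return atomic_fetch_sub(&f->refcount, 1) == 1;
}

/*
 * Imitates the check folio_get() performs under CONFIG_DEBUG_VM
 * (assumption: zero or slightly negative counts are rejected).
 */
bool model_get_would_bug(int count)
{
	return (unsigned int)count + 127u <= 127u;
}

/*
 * Replays the tail of the diagram: the split caller drops the last
 * reference, then a late "isolate caller" tries to take one.
 * Returns whether the late caller obtained a reference.
 */
bool model_late_caller_gets_ref(void)
{
	struct model_folio t;

	atomic_init(&t.refcount, 1);	/* only the split caller's ref */
	model_put_testzero(&t);		/* refcount: 1 -> 0 */
	return model_try_get(&t);	/* a try_get-style caller must fail */
}
```

In this model a try_get-style caller can never revive a folio whose count
already reached zero, and a plain get on such a folio is exactly the case
the debug check is meant to catch.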

>
>                                                                   folio_isolate_lru(T)
>                                                                     folio_test_clear_lru(T)
>                                 __folio_put(T)
>                                   __page_cache_release(T)
>                                     folio_test_lru(T) == false
>                                     /* skip lruvec_del_folio(T) */
>                                   free_frozen_pages(T)
>                                                                   folio_get(T)
>                                                                   lruvec_del_folio(T)
>>
>> Hi Zhaoyang, can you elaborate on the folio_isolate_lru() caller?
> Sorry, no. The split and isolate parts are merely assumptions based on the symptoms.
>>
>> In addition (with the help of Claude), the race trace[2] below
>> looks invalid. It says split happens after folio_end_dropbehind(),
>> which sets folio->mapping to NULL, but __folio_split() returns -EBUSY
>> when folio->mapping is NULL in filemap_release_folio() check.
>> So the split cannot happen.
> Could the folio_needs_release return false?

Wait, if folio->mapping is NULL and folio is not anonymous,
folio_check_splittable() returns false at the beginning of
__folio_split(). So the split cannot happen.

>
> 	if (!folio_needs_release(folio))
> 		return true;
>
>>
>> Now I am not sure if the bug report is valid or not. At least for
>> folio_split() and folio_isolate_lru(), the race should not exist.
>> But let me know if I miss anything.
>>
>>>>>
>>>>> Complicated mess :(
>>>>>
>>>>> So, folio_isolate_lru() really only requires the caller to hold a folio
>>>>> reference, which can happen given that we did the folio_ref_unfreeze(). It can,
>>>>> for example, be triggered by memory offlining or page migration.
>>>>>
>>>>> So we really want to not allow folio_isolate_lru() while we are still processing
>>>>> the folio.
>>>>
>>>> Or we should defer adding split folios to LRU after unfreeze.
>>>>
>>>>>
>>>>> What your patch does is, simply not add folios that we will drop from the page
>>>>> cache to the LRU?
>>>>>
>>>>>
>>>>> You should describe here how you are fixing it: "Let's fix it by..."
>>>>>
>>>>>>
>>>>>> [1]
>>>>>> [   22.486082] list_del corruption. next->prev should be fffffffec10e0ac8, but was dead000000000122. (next=fffffffec10e0a88)
>>>>>> [   22.486130] ------------[ cut here ]------------
>>>>>> [   22.486134] kernel BUG at lib/list_debug.c:67!
>>>>>> [   22.486141] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
>>>>>> [   22.488502] Tainted: [W]=WARN, [O]=OOT_MODULE
>>>>>> [   22.488506] Hardware name: Spreadtrum UMS9230 1H10 SoC (DT)
>>>>>> [   22.488511] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>>>>> [   22.488517] pc : __list_del_entry_valid_or_report+0x14c/0x154
>>>>>> [   22.488531] lr : __list_del_entry_valid_or_report+0x14c/0x154
>>>>>> [   22.488539] sp : ffffffc08006b830
>>>>>> [   22.488542] x29: ffffffc08006b868 x28: 0000000000003020 x27: 0000000000000000
>>>>>> [   22.488553] x26: 0000000000000000 x25: 0000000000000004 x24: fffffffec10e0ac0
>>>>>> [   22.488564] x23: 00000000000000e8 x22: 0000000000000024 x21: dead000000000122
>>>>>> [   22.488574] x20: fffffffec10e0a88 x19: fffffffec10e0ac8 x18: ffffffc080061060
>>>>>> [   22.488585] x17: 20747562202c3863 x16: 6130653031636566 x15: 0000000000000058
>>>>>> [   22.488595] x14: 0000000000000004 x13: ffffff80f91e0000 x12: 0000000000000003
>>>>>> [   22.488605] x11: 0000000000000003 x10: 0000000000000001 x9 : ffe85721f0e25f00
>>>>>> [   22.488615] x8 : ffe85721f0e25f00 x7 : 0000000000000000 x6 : 6c65645f7473696c
>>>>>> [   22.488625] x5 : ffffffed39b23026 x4 : 0000000000000000 x3 : 0000000000000010
>>>>>> [   22.488636] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000006d
>>>>>> [   22.488647] Call trace:
>>>>>> [   22.488651]  __list_del_entry_valid_or_report+0x14c/0x154 (P)
>>>>>> [   22.488661]  __folio_put+0x2bc/0x434
>>>>>> [   22.488670]  folio_put+0x28/0x58
>>>>>> [   22.488678]  do_garbage_collect+0x1a34/0x2584
>>>>>> [   22.488689]  f2fs_gc+0x230/0x9b4
>>>>>> [   22.488697]  f2fs_fallocate+0xb90/0xdf4
>>>>>> [   22.488706]  vfs_fallocate+0x1b4/0x2bc
>>>>>> [   22.488716]  __arm64_sys_fallocate+0x44/0x78
>>>>>> [   22.488725]  invoke_syscall+0x58/0xe4
>>>>>> [   22.488732]  do_el0_svc+0x48/0xdc
>>>>>> [   22.488739]  el0_svc+0x3c/0x98
>>>>>> [   22.488747]  el0t_64_sync_handler+0x20/0x130
>>>>>> [   22.488754]  el0t_64_sync+0x1c4/0x1c8
>>>>>>
>>>>>> [2]
>>>>>> CPU0 (f2fs GC)              CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)
>>>>>>
>>>>>> F: pagecache refs = n
>>>>>> F: extra refs = GC + split
>>>>>> F: PG_lru set
>>>>>> move_data_block()
>>>>>> folio = f2fs_grab_cache_folio(F)
>>>>>> ...
>>>>>> __folio_set_dropbehind(F)
>>>>>> folio_unlock(F)
>>>>>> folio_end_dropbehind(F)
>>>>>>   folio_unmap_invalidate(F)
>>>>>>     __filemap_remove_folio(F)
>>>>>>     folio_put_refs(F, n)
>>>>>> folio_put(F)
>>>>>>                             split_folio_to_order(F)
>>>>>>                               folio_ref_freeze(F, 1)
>>>>>>                               ...
>>>>>>                               lru_add_split_folio(T)
>>>>>>                                 list_add_tail(&T->lru, &F->lru)
>>>>>>                                 folio_set_lru(T)
>>>>>>                               __filemap_remove_folio(T)
>>>>>>                               folio_put_refs(T, 1)
>>>>>>                               /* T refcount == 1, PageLRU set */
>>>>>>                             free_folio_and_swap_cache(T)
>>>>>>                               folio_put(T)
>>>>>>                                 /* refcount: 1 -> 0 */
>>>>>>                                                                   folio_isolate_lru(T)
>>>>
>>>> If refcount is 0 at this point, VM_BUG_ON_FOLIO(!folio_ref_count(folio), folio) in
>>>> folio_isolate_lru() would be triggered. Maybe we could just return false in that case.
>>>>
>>>>>>                                                                     folio_test_clear_lru(T)
>>>>>>                                 __folio_put(T)
>>>>>>                                   __page_cache_release(T)
>>>>>>                                     folio_test_lru(T) == false
>>>>>>                                     /* skip lruvec_del_folio(T) */
>>>>>>                                   free_frozen_pages(T)
>>>>>>                                                                   folio_get(T)
>>>>>>                                                                   lruvec_del_folio(T)
>>>>
>>>> But in CPU2 (folio_isolate_lru), lruvec_del_folio(T) should remove T from LRU list.
>>>>
>>>>>> later:
>>>>>>   list_del(adjacent->lru)
>>>>>>     next == &T->lru
>>>>>>     next->prev == LIST_POISON / PCP freelist
>>>>>>     BUG
>>>>>>
>>>>
>>>> Why does CPU0 still see the stale link from adjacent?
>>>>
>>>>>> Assisted-by: Cursor:claude-opus-4-8
>>>>>> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
>>>>>
>>>>> I'm wondering if this has been broken the whole time, or if some rework allowed
>>>>> this to trigger.
>>>>>
>>>>> I assume the issue can be triggered for other FSes, and we want Fixes: + CC: stable?
>>>>>
>>>>> Looking into the history, I think we always unconditionally did the
>>>>> lru_add_split_folio()/lru_add_page_tail().
>>>>>
>>>>>> ---
>>>>>>  mm/huge_memory.c | 2 +-
>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>>>>> index 970e077019b7..7465525a94a8 100644
>>>>>> --- a/mm/huge_memory.c
>>>>>> +++ b/mm/huge_memory.c
>>>>>> @@ -3966,7 +3966,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
>>>>>>                    folio_ref_unfreeze(new_folio,
>>>>>>                                       folio_cache_ref_count(new_folio) + 1);
>>>>>>
>>>>>> -                  if (do_lru)
>>>>>> +                  if (do_lru && !(mapping && new_folio->index >= end))
>>>>>
>>>>> It might be clearer to write this as
>>>>>
>>>>>     do_lru && (!mapping || new_folio->index < end)
>>>>>
>>>>> To match the page-cache check further below
>>>>>
>>>>>     if (!mapping)
>>>>>             continue
>>>>>
>>>>>     ...
>>>>>     if (new_folio->index < end)
>>>>>             ...
>>>>>
>>>>>>                            lru_add_split_folio(folio, new_folio, lruvec, list);
>>>
>>> Talked to Claude and find an accounting issue with this. Without putting
>>> EOF after-split folios back to LRU, they are not going through lruvec_del_folio(),
>>> which decreases NR_*_LRU counter along with removing the folio from LRU
>>> and it causes NR_*_LRU accounting errors. Note that the original folio
>>> is on LRU all the time and LRU counters are not modified and after the split
>>> the original folio size is decreased and the after-split folios need to
>>> be added back to LRU to keep the LRU counters right. We will need to adjust
>>> LRU accounting for (!mapping || new_folio->index < end) if we decide to
>>> not add them back to LRU.
>>>
>>>
>>> Best Regards,
>>> Yan, Zi
>>
>>
>> Best Regards,
>> Yan, Zi


--
Best Regards,
Yan, Zi



* Re: [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU
  2026-06-10 14:38   ` Zi Yan
  2026-06-10 17:25     ` Zi Yan
@ 2026-06-11  1:39     ` Zhaoyang Huang
  2026-06-11  1:56       ` Zi Yan
  1 sibling, 1 reply; 16+ messages in thread
From: Zhaoyang Huang @ 2026-06-11  1:39 UTC (permalink / raw)
  To: Zi Yan
  Cc: David Hildenbrand (Arm), zhaoyang.huang, Andrew Morton,
	Lorenzo Stoakes, Barry Song, Baolin Wang, Lance Yang,
	Liam R . Howlett, Nico Pache, Ryan Roberts, Dev Jain, linux-mm,
	linux-kernel, steve.kang, xiuhong.wang@unisoc.com, hao_hao.wang

On Wed, Jun 10, 2026 at 10:38 PM Zi Yan <ziy@nvidia.com> wrote:
>
> On 10 Jun 2026, at 8:50, David Hildenbrand (Arm) wrote:
>
> > On 6/10/26 14:05, zhaoyang.huang wrote:
> >> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> >>
> >> The kernel panics are keeping to be reported especially when the f2fs
> >> partition get almost full. By investigation, we find that the reason is
> >> one f2fs page got freed to buddy without being deleted from LRU and the
> >> root cause is the race happened in [2] which is enrolled by this commit.
> >> We solve this issue by reverting a f2fs commit 9609dd704725 ("f2fs: remove
> >> non-uptodate folio from the page cache in move_data_block").
> >
> > But I assume, that other FSes can trigger this as well? Any insights?

Yes, I think all FSes that support large folios could suffer from this defect.

> >
> >>
> >> There are 3 race processes in this scenario, please find below for their
> >> main activities. However, by further investigation over the code, I
> >> think there is a common race window for the truncated folios between
> >> split_folio_to_order and folio_isolate_lru, where the folios lost the
> >> refcount on page cache and remains the transient one of the split
> >> caller, under which the folio could enter free path and compete with the
> >> isolation process. This commit would like to suggest to have the folios
> >> beyond EOF stay out of LRU.
> >>
> >> Truncate:
> >> The changed code in move_data_block() lets the GC path evict the tail-end
> >> folio from the page cache through folio_end_dropbehind().  Once
> >> folio_unmap_invalidate() removes the folio from mapping->i_pages, the
> >> page-cache references for all pages in the folio are dropped.  The folio
> >> is then kept alive only by temporary external references, which allows a
> >> later split to operate on a folio whose subpages are no longer protected
> >> by page-cache references.
> >>
> >> Split:
> >> After the page-cache references are gone, split_folio_to_order() can
> >> split the big folio into individual pages and put the resulting subpages
> >> back on the LRU.  For tail pages beyond EOF, split removes them from the
> >> page cache and drops their page-cache references.  A tail page can then
> >> remain on the LRU with PG_lru set while holding only the split caller's
> >> temporary reference.  When free_folio_and_swap_cache() drops that final
> >> reference, the page enters the final folio_put() release path.
> >>
> >> Isolate:
> >> In parallel, folio_isolate_lru() can observe the same tail page with a
> >> non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
> >> reference.  If this races with the final folio_put() from the split path,
> >> __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
> >> The page is then freed back to the allocator while its lru links are
> >> still present in the LRU list.  A later LRU operation on a neighboring
> >> page detects the stale link and reports list corruption.
> >
> > Complicated mess :(
> >
> > So, folio_isolate_lru() really only requires the caller to hold a folio
> > reference, which can happen given that we did the folio_ref_unfreeze(). It can,
> > for example, be triggered by memory offlining or page migration.
> >
> > So we really want to not allow folio_isolate_lru() while we are still processing
> > the folio.
>
> Or we should defer adding split folios to LRU after unfreeze.
>
> >
> > What your patch does is, simply not add folios that we will drop from the page
> > cache to the LRU?
> >
> >
> > You should describe here how you are fixing it: "Let's fix it by..."
Yes. This commit proposes fixing it by having the folios beyond EOF
skip lru_add_split_folio().
> >
> >>
> >> [1]
> >> [   22.486082] list_del corruption. next->prev should be fffffffec10e0ac8, but was dead000000000122. (next=fffffffec10e0a88)
> >> [   22.486130] ------------[ cut here ]------------
> >> [   22.486134] kernel BUG at lib/list_debug.c:67!
> >> [   22.486141] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
> >> [   22.488502] Tainted: [W]=WARN, [O]=OOT_MODULE
> >> [   22.488506] Hardware name: Spreadtrum UMS9230 1H10 SoC (DT)
> >> [   22.488511] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> >> [   22.488517] pc : __list_del_entry_valid_or_report+0x14c/0x154
> >> [   22.488531] lr : __list_del_entry_valid_or_report+0x14c/0x154
> >> [   22.488539] sp : ffffffc08006b830
> >> [   22.488542] x29: ffffffc08006b868 x28: 0000000000003020 x27: 0000000000000000
> >> [   22.488553] x26: 0000000000000000 x25: 0000000000000004 x24: fffffffec10e0ac0
> >> [   22.488564] x23: 00000000000000e8 x22: 0000000000000024 x21: dead000000000122
> >> [   22.488574] x20: fffffffec10e0a88 x19: fffffffec10e0ac8 x18: ffffffc080061060
> >> [   22.488585] x17: 20747562202c3863 x16: 6130653031636566 x15: 0000000000000058
> >> [   22.488595] x14: 0000000000000004 x13: ffffff80f91e0000 x12: 0000000000000003
> >> [   22.488605] x11: 0000000000000003 x10: 0000000000000001 x9 : ffe85721f0e25f00
> >> [   22.488615] x8 : ffe85721f0e25f00 x7 : 0000000000000000 x6 : 6c65645f7473696c
> >> [   22.488625] x5 : ffffffed39b23026 x4 : 0000000000000000 x3 : 0000000000000010
> >> [   22.488636] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000006d
> >> [   22.488647] Call trace:
> >> [   22.488651]  __list_del_entry_valid_or_report+0x14c/0x154 (P)
> >> [   22.488661]  __folio_put+0x2bc/0x434
> >> [   22.488670]  folio_put+0x28/0x58
> >> [   22.488678]  do_garbage_collect+0x1a34/0x2584
> >> [   22.488689]  f2fs_gc+0x230/0x9b4
> >> [   22.488697]  f2fs_fallocate+0xb90/0xdf4
> >> [   22.488706]  vfs_fallocate+0x1b4/0x2bc
> >> [   22.488716]  __arm64_sys_fallocate+0x44/0x78
> >> [   22.488725]  invoke_syscall+0x58/0xe4
> >> [   22.488732]  do_el0_svc+0x48/0xdc
> >> [   22.488739]  el0_svc+0x3c/0x98
> >> [   22.488747]  el0t_64_sync_handler+0x20/0x130
> >> [   22.488754]  el0t_64_sync+0x1c4/0x1c8
> >>
> >> [2]
> >> CPU0 (f2fs GC)              CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)
> >>
> >> F: pagecache refs = n
> >> F: extra refs = GC + split
> >> F: PG_lru set
> >> move_data_block()
> >> folio = f2fs_grab_cache_folio(F)
> >> ...
> >> __folio_set_dropbehind(F)
> >> folio_unlock(F)
> >> folio_end_dropbehind(F)
> >>   folio_unmap_invalidate(F)
> >>     __filemap_remove_folio(F)
> >>     folio_put_refs(F, n)
> >> folio_put(F)
> >>                             split_folio_to_order(F)
> >>                               folio_ref_freeze(F, 1)
> >>                               ...
> >>                               lru_add_split_folio(T)
> >>                                 list_add_tail(&T->lru, &F->lru)
> >>                                 folio_set_lru(T)
> >>                               __filemap_remove_folio(T)
> >>                               folio_put_refs(T, 1)
> >>                               /* T refcount == 1, PageLRU set */
> >>                             free_folio_and_swap_cache(T)
> >>                               folio_put(T)
> >>                                 /* refcount: 1 -> 0 */
> >>                                                                   folio_isolate_lru(T)
>
> If refcount is 0 at this point, VM_BUG_ON_FOLIO(!folio_ref_count(folio), folio) in
> folio_isolate_lru() would be triggered. Maybe we could just return false in that case.
No, the isolate caller will grab a refcount.
>
> >>                                                                     folio_test_clear_lru(T)
> >>                                 __folio_put(T)
> >>                                   __page_cache_release(T)
> >>                                     folio_test_lru(T) == false
> >>                                     /* skip lruvec_del_folio(T) */
> >>                                   free_frozen_pages(T)
> >>                                                                   folio_get(T)
> >>                                                                   lruvec_del_folio(T)
>
> But in CPU2 (folio_isolate_lru), lruvec_del_folio(T) should remove T from LRU list.
>
> >> later:
> >>   list_del(adjacent->lru)
> >>     next == &T->lru
> >>     next->prev == LIST_POISON / PCP freelist
> >>     BUG
> >>
>
> Why does CPU0 still see the stale link from adjacent?
The stale link should come from the LRU, since the folio was never deleted from the LRU.
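
How a stale link produces the "list_del corruption" report in [1] can be
sketched in userspace. This is a simplified model, not the kernel's
list_debug.c: the model_* names are hypothetical, the poison constants
assume a 64-bit build with the 0xdead000000000000 pointer delta (matching
the dead000000000122 value in the log), and writing the poison values
directly stands in for whatever later overwrote the freed page's lru field.

```c
#include <stdbool.h>
#include <stdint.h>

struct list_node {
	struct list_node *next, *prev;
};

/* Assumed poison values, as seen on 64-bit kernels. */
#define MODEL_POISON1 ((struct list_node *)(uintptr_t)0xdead000000000100ULL)
#define MODEL_POISON2 ((struct list_node *)(uintptr_t)0xdead000000000122ULL)

void model_list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

void model_list_add_tail(struct list_node *entry, struct list_node *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Simplified form of the sanity checks done before unlinking an entry. */
bool model_list_del_valid(const struct list_node *entry)
{
	if (entry->next == MODEL_POISON1 || entry->prev == MODEL_POISON2)
		return false;
	/* Both neighbours must still point back at this entry. */
	return entry->prev->next == entry && entry->next->prev == entry;
}

/* Unlinks and poisons the entry; returns false where the kernel would BUG. */
bool model_list_del(struct list_node *entry)
{
	if (!model_list_del_valid(entry))
		return false;
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
	entry->next = MODEL_POISON1;
	entry->prev = MODEL_POISON2;
	return true;
}

/*
 * T is "freed" while still linked: its pointers get overwritten, but its
 * neighbour A still points at it. Deleting A then trips the check.
 */
bool model_stale_link_detected(void)
{
	struct list_node head, a, t;

	model_list_init(&head);
	model_list_add_tail(&a, &head);
	model_list_add_tail(&t, &head);

	t.next = MODEL_POISON1;	/* freed without being unlinked */
	t.prev = MODEL_POISON2;

	return !model_list_del(&a);	/* a.next->prev is no longer &a */
}
```

The failure is reported on the neighbouring entry, not on the freed one,
which is consistent with the trace showing the BUG in an unrelated
folio_put().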
>
> >> Assisted-by: Cursor:claude-opus-4-8
> >> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> >
> > I'm wondering if this has been broken the whole time, or if some rework allowed
> > this to trigger.
This issue is from AOSP with v6.18, which only recently gained large folio
support in f2fs. It is triggered when the f2fs partition gets almost full
during a test case that fills the partition (this should be what triggers
f2fs GC, which involves the truncate path).
> >
> > I assume the issue can be triggered for other FSes, and we want Fixes: + CC: stable?
> >
> > Looking into the history, I think we always unconditionally did the
> > lru_add_split_folio()/lru_add_page_tail().
> >
> >> ---
> >>  mm/huge_memory.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >> index 970e077019b7..7465525a94a8 100644
> >> --- a/mm/huge_memory.c
> >> +++ b/mm/huge_memory.c
> >> @@ -3966,7 +3966,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
> >>                      folio_ref_unfreeze(new_folio,
> >>                                         folio_cache_ref_count(new_folio) + 1);
> >>
> >> -                    if (do_lru)
> >> +                    if (do_lru && !(mapping && new_folio->index >= end))
> >
> > It might be clearer to write this as
> >
> >       do_lru && (!mapping || new_folio->index < end)
> >
> > To match the page-cache check further below
> >
> >       if (!mapping)
> >               continue
> >
> >       ...
> >       if (new_folio->index < end)
> >               ...
> >
> >>                              lru_add_split_folio(folio, new_folio, lruvec, list);
> >>
> >>                      /*
> >
> > folio_check_splittable() makes sure that we have a mapping for non-anon folios.
> > (no truncation). end is then only set for non-anon folios.
> >
> > @Zi, any thoughts?
>
> The fix works but I feel that it is masking the race between folio_isolate_lru() and
> folio_put(). I worry that the same issue might be triggered in other ways or
> in new code if we do not fix the race.
>
> To summarize my thoughts above:
> 1. adding frozen folios in LRU might be problematic, since folio_isolate_lru()
> has a VM_BUG_ON_FOLIO() for it but still chooses to proceed the isolation.
>
> 2. the race analysis is not clear, since both folio_isolate_lru() and folio_put()
> do lruvec_del_folio() if folio is on LRU. When list_del(adjacent->lru) sees
> the stale link, the folio is already in buddy and page->lru is modified for
> PageBuddy use? So even without CPU0, folio_isolate_lru()'s lruvec_del_folio()
> can do the wrong thing on pages on buddy?
>
>
> --
> Best Regards,
> Yan, Zi



* Re: [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU
  2026-06-11  1:39     ` Zhaoyang Huang
@ 2026-06-11  1:56       ` Zi Yan
  2026-06-11  2:39         ` Zhaoyang Huang
  0 siblings, 1 reply; 16+ messages in thread
From: Zi Yan @ 2026-06-11  1:56 UTC (permalink / raw)
  To: Zhaoyang Huang
  Cc: David Hildenbrand (Arm), zhaoyang.huang, Andrew Morton,
	Lorenzo Stoakes, Barry Song, Baolin Wang, Lance Yang,
	Liam R . Howlett, Nico Pache, Ryan Roberts, Dev Jain, linux-mm,
	linux-kernel, steve.kang, xiuhong.wang, hao_hao.wang

On 10 Jun 2026, at 21:39, Zhaoyang Huang wrote:

> On Wed, Jun 10, 2026 at 10:38 PM Zi Yan <ziy@nvidia.com> wrote:
>>
>> On 10 Jun 2026, at 8:50, David Hildenbrand (Arm) wrote:
>>
>>> On 6/10/26 14:05, zhaoyang.huang wrote:
>>>> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
>>>>
>>>> The kernel panics are keeping to be reported especially when the f2fs
>>>> partition get almost full. By investigation, we find that the reason is
>>>> one f2fs page got freed to buddy without being deleted from LRU and the
>>>> root cause is the race happened in [2] which is enrolled by this commit.
>>>> We solve this issue by reverting a f2fs commit 9609dd704725 ("f2fs: remove
>>>> non-uptodate folio from the page cache in move_data_block").
>>>
>>> But I assume, that other FSes can trigger this as well? Any insights?
>
> Yes, I think all FSes support big folio could suffer from this defect.
>
>>>
>>>>
>>>> There are 3 race processes in this scenario, please find below for their
>>>> main activities. However, by further investigation over the code, I
>>>> think there is a common race window for the truncated folios between
>>>> split_folio_to_order and folio_isolate_lru, where the folios lost the
>>>> refcount on page cache and remains the transient one of the split
>>>> caller, under which the folio could enter free path and compete with the
>>>> isolation process. This commit would like to suggest to have the folios
>>>> beyond EOF stay out of LRU.
>>>>
>>>> Truncate:
>>>> The changed code in move_data_block() lets the GC path evict the tail-end
>>>> folio from the page cache through folio_end_dropbehind().  Once
>>>> folio_unmap_invalidate() removes the folio from mapping->i_pages, the
>>>> page-cache references for all pages in the folio are dropped.  The folio
>>>> is then kept alive only by temporary external references, which allows a
>>>> later split to operate on a folio whose subpages are no longer protected
>>>> by page-cache references.
>>>>
>>>> Split:
>>>> After the page-cache references are gone, split_folio_to_order() can
>>>> split the big folio into individual pages and put the resulting subpages
>>>> back on the LRU.  For tail pages beyond EOF, split removes them from the
>>>> page cache and drops their page-cache references.  A tail page can then
>>>> remain on the LRU with PG_lru set while holding only the split caller's
>>>> temporary reference.  When free_folio_and_swap_cache() drops that final
>>>> reference, the page enters the final folio_put() release path.
>>>>
>>>> Isolate:
>>>> In parallel, folio_isolate_lru() can observe the same tail page with a
>>>> non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
>>>> reference.  If this races with the final folio_put() from the split path,
>>>> __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
>>>> The page is then freed back to the allocator while its lru links are
>>>> still present in the LRU list.  A later LRU operation on a neighboring
>>>> page detects the stale link and reports list corruption.
>>>
>>> Complicated mess :(
>>>
>>> So, folio_isolate_lru() really only requires the caller to hold a folio
>>> reference, which can happen given that we did the folio_ref_unfreeze(). It can,
>>> for example, be triggered by memory offlining or page migration.
>>>
>>> So we really want to not allow folio_isolate_lru() while we are still processing
>>> the folio.
>>
>> Or we should defer adding split folios to LRU after unfreeze.
>>
>>>
>>> What your patch does is, simply not add folios that we will drop from the page
>>> cache to the LRU?
>>>
>>>
>>> You should describe here how you are fixing it: "Let's fix it by..."
> Yes. This commit would like to suggest to fix it by having the folio
> skip the lru_add_split_folio

Skipping it causes more issues, such as an LRU counter mismatch and
bad_page() firing, since PG_active, PG_unevictable, or the MGLRU fields in
->flags.f could still be set at page free time.
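
The counter mismatch can be sketched with a small userspace model. It is
only an illustration under stated assumptions: the large folio is counted
on the LRU once with all its pages, the split itself does not touch the
counter, and only folios that are on the LRU get their pages subtracted
when they are freed. model_lru_leftover() and its parameters are
hypothetical names, not kernel interfaces.

```c
#include <stdbool.h>

/*
 * Returns what is left in a modelled NR_*_LRU counter after a large folio
 * of nr_pages is split into single pages and every page is freed.
 * nr_beyond_eof is the number of trailing pages beyond EOF;
 * skip_lru_for_eof models the proposed patch (those pages never rejoin
 * the LRU).
 */
long model_lru_leftover(unsigned int nr_pages, unsigned int nr_beyond_eof,
			bool skip_lru_for_eof)
{
	long nr_lru = nr_pages;	/* large folio accounted before the split */
	unsigned int i;

	if (nr_pages == 0)
		return 0;
	/* Assumption: the head page (index 0) is never beyond EOF. */
	if (nr_beyond_eof > nr_pages - 1)
		nr_beyond_eof = nr_pages - 1;

	for (i = 0; i < nr_pages; i++) {
		bool beyond_eof = i >= nr_pages - nr_beyond_eof;
		bool on_lru = !(skip_lru_for_eof && beyond_eof);

		/* Only folios on the LRU are subtracted when they go away. */
		if (on_lru)
			nr_lru -= 1;
	}
	return nr_lru;
}
```

With 16 pages of which 4 are beyond EOF, the model ends at 0 when every
after-split folio is put back on the LRU, and at 4 when the EOF folios are
skipped, which is the leak that would need explicit accounting.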

>>>
>>>>
>>>> [1]
>>>> [   22.486082] list_del corruption. next->prev should be fffffffec10e0ac8, but was dead000000000122. (next=fffffffec10e0a88)
>>>> [   22.486130] ------------[ cut here ]------------
>>>> [   22.486134] kernel BUG at lib/list_debug.c:67!
>>>> [   22.486141] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
>>>> [   22.488502] Tainted: [W]=WARN, [O]=OOT_MODULE
>>>> [   22.488506] Hardware name: Spreadtrum UMS9230 1H10 SoC (DT)
>>>> [   22.488511] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>>> [   22.488517] pc : __list_del_entry_valid_or_report+0x14c/0x154
>>>> [   22.488531] lr : __list_del_entry_valid_or_report+0x14c/0x154
>>>> [   22.488539] sp : ffffffc08006b830
>>>> [   22.488542] x29: ffffffc08006b868 x28: 0000000000003020 x27: 0000000000000000
>>>> [   22.488553] x26: 0000000000000000 x25: 0000000000000004 x24: fffffffec10e0ac0
>>>> [   22.488564] x23: 00000000000000e8 x22: 0000000000000024 x21: dead000000000122
>>>> [   22.488574] x20: fffffffec10e0a88 x19: fffffffec10e0ac8 x18: ffffffc080061060
>>>> [   22.488585] x17: 20747562202c3863 x16: 6130653031636566 x15: 0000000000000058
>>>> [   22.488595] x14: 0000000000000004 x13: ffffff80f91e0000 x12: 0000000000000003
>>>> [   22.488605] x11: 0000000000000003 x10: 0000000000000001 x9 : ffe85721f0e25f00
>>>> [   22.488615] x8 : ffe85721f0e25f00 x7 : 0000000000000000 x6 : 6c65645f7473696c
>>>> [   22.488625] x5 : ffffffed39b23026 x4 : 0000000000000000 x3 : 0000000000000010
>>>> [   22.488636] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000006d
>>>> [   22.488647] Call trace:
>>>> [   22.488651]  __list_del_entry_valid_or_report+0x14c/0x154 (P)
>>>> [   22.488661]  __folio_put+0x2bc/0x434
>>>> [   22.488670]  folio_put+0x28/0x58
>>>> [   22.488678]  do_garbage_collect+0x1a34/0x2584
>>>> [   22.488689]  f2fs_gc+0x230/0x9b4
>>>> [   22.488697]  f2fs_fallocate+0xb90/0xdf4
>>>> [   22.488706]  vfs_fallocate+0x1b4/0x2bc
>>>> [   22.488716]  __arm64_sys_fallocate+0x44/0x78
>>>> [   22.488725]  invoke_syscall+0x58/0xe4
>>>> [   22.488732]  do_el0_svc+0x48/0xdc
>>>> [   22.488739]  el0_svc+0x3c/0x98
>>>> [   22.488747]  el0t_64_sync_handler+0x20/0x130
>>>> [   22.488754]  el0t_64_sync+0x1c4/0x1c8
>>>>
>>>> [2]
>>>> CPU0 (f2fs GC)              CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)
>>>>
>>>> F: pagecache refs = n
>>>> F: extra refs = GC + split
>>>> F: PG_lru set
>>>> move_data_block()
>>>> folio = f2fs_grab_cache_folio(F)
>>>> ...
>>>> __folio_set_dropbehind(F)
>>>> folio_unlock(F)
>>>> folio_end_dropbehind(F)
>>>>   folio_unmap_invalidate(F)
>>>>     __filemap_remove_folio(F)
>>>>     folio_put_refs(F, n)
>>>> folio_put(F)
>>>>                             split_folio_to_order(F)
>>>>                               folio_ref_freeze(F, 1)
>>>>                               ...
>>>>                               lru_add_split_folio(T)
>>>>                                 list_add_tail(&T->lru, &F->lru)
>>>>                                 folio_set_lru(T)
>>>>                               __filemap_remove_folio(T)
>>>>                               folio_put_refs(T, 1)
>>>>                               /* T refcount == 1, PageLRU set */
>>>>                             free_folio_and_swap_cache(T)
>>>>                               folio_put(T)
>>>>                                 /* refcount: 1 -> 0 */
>>>>                                                                   folio_isolate_lru(T)
>>
>> If refcount is 0 at this point, VM_BUG_ON_FOLIO(!folio_ref_count(folio), folio) in
>> folio_isolate_lru() would be triggered. Maybe we could just return false in that case.
> No, isolate caller will grab one refcount.

As I said in another email, isolate caller cannot grab a refcount when folio refcount
is 0.

>>
>>>>                                                                     folio_test_clear_lru(T)
>>>>                                 __folio_put(T)
>>>>                                   __page_cache_release(T)
>>>>                                     folio_test_lru(T) == false
>>>>                                     /* skip lruvec_del_folio(T) */
>>>>                                   free_frozen_pages(T)
>>>>                                                                   folio_get(T)
>>>>                                                                   lruvec_del_folio(T)
>>
>> But in CPU2 (folio_isolate_lru), lruvec_del_folio(T) should remove T from LRU list.
>>
>>>> later:
>>>>   list_del(adjacent->lru)
>>>>     next == &T->lru
>>>>     next->prev == LIST_POISON / PCP freelist
>>>>     BUG
>>>>
>>
>> Why does CPU0 still see the stale link from adjacent?
> The staled link should be from LRU since the folio never be deleted from lru.
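
To pin down what [2] is claiming, here is a single-threaded userspace model
of that interleaving. It is only a sketch, not kernel code: struct
folio_model, test_clear_lru(), release() and replay() are made-up names, and
the list helpers are simplified stand-ins. It deliberately stops before
CPU2's own lruvec_del_folio(T), which is exactly the step being questioned
here, so it shows the claimed window rather than proving it can happen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = e->prev = NULL;
}

/* Hypothetical stand-in for a folio: LRU linkage, PG_lru, and a freed marker. */
struct folio_model {
	struct list_head lru;
	bool pg_lru;
	bool freed;
};

/* Isolation side, first half: test-and-clear the LRU flag. */
static bool test_clear_lru(struct folio_model *f)
{
	bool was_set = f->pg_lru;

	f->pg_lru = false;
	return was_set;
}

/* Release side: unlink only if the flag is still set, then free. */
static void release(struct folio_model *f)
{
	if (f->pg_lru) {
		list_del(&f->lru);
		f->pg_lru = false;
	}
	f->freed = true;
}

/*
 * Replays one ordering and reports whether a freed entry is still linked
 * behind its neighbour. isolate_first selects the ordering claimed in [2].
 */
static bool replay(bool isolate_first)
{
	struct list_head lru;
	struct folio_model neighbour = { .pg_lru = true };
	struct folio_model t = { .pg_lru = true };

	list_init(&lru);
	list_add_tail(&neighbour.lru, &lru);
	list_add_tail(&t.lru, &lru);

	if (isolate_first)
		test_clear_lru(&t);	/* CPU2: clears PG_lru */
	release(&t);			/* CPU1: final put */

	return t.freed && neighbour.lru.next == &t.lru;
}
```

In this model the stale link exists only when the flag is cleared before the
final put; with the release running first, the entry is unlinked normally.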
>>
>>>> Assisted-by: Cursor:claude-opus-4-8
>>>> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
>>>
>>> I'm wondering if this has been broken the whole time, or if some rework allowed
>>> this to trigger.
> This issue is from AOSP with v6.18 which just supports big folio in
> f2fs. Besides, it is triggered by the timing of f2fs's partition get
> almost full during the test case of filling f2fs's partition(should be
> the trigger factor of f2fs's gc which enroll truncate thing)

Are you able to reproduce it with other FSes supporting large folio?

>>>
>>> I assume the issue can be triggered for other FSes, and we want Fixes: + CC: stable?
>>>
>>> Looking into the history, I think we always unconditionally did the
>>> lru_add_split_folio()/lru_add_page_tail().
>>>
>>>> ---
>>>>  mm/huge_memory.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>>> index 970e077019b7..7465525a94a8 100644
>>>> --- a/mm/huge_memory.c
>>>> +++ b/mm/huge_memory.c
>>>> @@ -3966,7 +3966,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
>>>>                      folio_ref_unfreeze(new_folio,
>>>>                                         folio_cache_ref_count(new_folio) + 1);
>>>>
>>>> -                    if (do_lru)
>>>> +                    if (do_lru && !(mapping && new_folio->index >= end))
>>>
>>> It might be clearer to write this as
>>>
>>>       do_lru && (!mapping || new_folio->index < end)
>>>
>>> To match the page-cache check further below
>>>
>>>       if (!mapping)
>>>               continue
>>>
>>>       ...
>>>       if (new_folio->index < end)
>>>               ...
>>>
>>>>                              lru_add_split_folio(folio, new_folio, lruvec, list);
>>>>
>>>>                      /*
>>>
>>> folio_check_splittable() makes sure that we have a mapping for non-anon folios.
>>> (no truncation). end is then only set for non-anon folios.
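
For what it's worth, the two spellings of the condition are logically
equivalent (De Morgan), so the choice is purely about readability. A minimal
userspace sketch that checks this exhaustively over a small range follows.
add_to_lru_patch(), add_to_lru_review() and forms_agree() are made-up names
for illustration, and has_mapping stands in for "mapping is non-NULL":

```c
#include <assert.h>
#include <stdbool.h>

/* Condition as written in the RFC patch. */
static bool add_to_lru_patch(bool do_lru, bool has_mapping,
			     unsigned long index, unsigned long end)
{
	return do_lru && !(has_mapping && index >= end);
}

/* Condition as suggested in review. */
static bool add_to_lru_review(bool do_lru, bool has_mapping,
			      unsigned long index, unsigned long end)
{
	return do_lru && (!has_mapping || index < end);
}

/* Returns true if both forms agree for every combination tried. */
static bool forms_agree(void)
{
	const unsigned long end = 4;

	for (int do_lru = 0; do_lru <= 1; do_lru++)
		for (int has_mapping = 0; has_mapping <= 1; has_mapping++)
			for (unsigned long index = 0; index < 8; index++)
				if (add_to_lru_patch(do_lru, has_mapping, index, end) !=
				    add_to_lru_review(do_lru, has_mapping, index, end))
					return false;
	return true;
}
```

Either way, a folio with a mapping and index >= end is the only case that is
kept off the LRU.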
>>>
>>> @Zi, any thoughts?
>>
>> The fix works but I feel that it is masking the race between folio_isolate_lru() and
>> folio_put(). I worry that the same issue might be triggered in other ways or
>> in new code if we do not fix the race.
>>
>> To summarize my thoughts above:
>> 1. adding frozen folios in LRU might be problematic, since folio_isolate_lru()
>> has a VM_BUG_ON_FOLIO() for it but still chooses to proceed the isolation.
>>
>> 2. the race analysis is not clear, since both folio_isolate_lru() and folio_put()
>> do lruvec_del_folio() if folio is on LRU. When list_del(adjacent->lru) sees
>> the stale link, the folio is already in buddy and page->lru is modified for
>> PageBuddy use? So even without CPU0, folio_isolate_lru()'s lruvec_del_folio()
>> can do the wrong thing on pages on buddy?
>>
>>
>> --
>> Best Regards,
>> Yan, Zi


--
Best Regards,
Yan, Zi



* Re: [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU
  2026-06-11  1:56       ` Zi Yan
@ 2026-06-11  2:39         ` Zhaoyang Huang
  2026-06-11  3:06           ` Zi Yan
  0 siblings, 1 reply; 16+ messages in thread
From: Zhaoyang Huang @ 2026-06-11  2:39 UTC (permalink / raw)
  To: Zi Yan
  Cc: David Hildenbrand (Arm), zhaoyang.huang, Andrew Morton,
	Lorenzo Stoakes, Barry Song, Baolin Wang, Lance Yang,
	Liam R . Howlett, Nico Pache, Ryan Roberts, Dev Jain, linux-mm,
	linux-kernel, steve.kang, xiuhong.wang, hao_hao.wang,
	jyescas@google.com

On Thu, Jun 11, 2026 at 9:56 AM Zi Yan <ziy@nvidia.com> wrote:
>
> On 10 Jun 2026, at 21:39, Zhaoyang Huang wrote:
>
> > On Wed, Jun 10, 2026 at 10:38 PM Zi Yan <ziy@nvidia.com> wrote:
> >>
> >> On 10 Jun 2026, at 8:50, David Hildenbrand (Arm) wrote:
> >>
> >>> On 6/10/26 14:05, zhaoyang.huang wrote:
> >>>> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> >>>>
> >>>> The kernel panics are keeping to be reported especially when the f2fs
> >>>> partition get almost full. By investigation, we find that the reason is
> >>>> one f2fs page got freed to buddy without being deleted from LRU and the
> >>>> root cause is the race happened in [2] which is enrolled by this commit.
> >>>> We solve this issue by reverting a f2fs commit 9609dd704725 ("f2fs: remove
> >>>> non-uptodate folio from the page cache in move_data_block").
> >>>
> >>> But I assume, that other FSes can trigger this as well? Any insights?
> >
> > Yes, I think all FSes support big folio could suffer from this defect.
> >
> >>>
> >>>>
> >>>> There are 3 race processes in this scenario, please find below for their
> >>>> main activities. However, by further investigation over the code, I
> >>>> think there is a common race window for the truncated folios between
> >>>> split_folio_to_order and folio_isolate_lru, where the folios lost the
> >>>> refcount on page cache and remains the transient one of the split
> >>>> caller, under which the folio could enter free path and compete with the
> >>>> isolation process. This commit would like to suggest to have the folios
> >>>> beyond EOF stay out of LRU.
> >>>>
> >>>> Truncate:
> >>>> The changed code in move_data_block() lets the GC path evict the tail-end
> >>>> folio from the page cache through folio_end_dropbehind().  Once
> >>>> folio_unmap_invalidate() removes the folio from mapping->i_pages, the
> >>>> page-cache references for all pages in the folio are dropped.  The folio
> >>>> is then kept alive only by temporary external references, which allows a
> >>>> later split to operate on a folio whose subpages are no longer protected
> >>>> by page-cache references.
> >>>>
> >>>> Split:
> >>>> After the page-cache references are gone, split_folio_to_order() can
> >>>> split the big folio into individual pages and put the resulting subpages
> >>>> back on the LRU.  For tail pages beyond EOF, split removes them from the
> >>>> page cache and drops their page-cache references.  A tail page can then
> >>>> remain on the LRU with PG_lru set while holding only the split caller's
> >>>> temporary reference.  When free_folio_and_swap_cache() drops that final
> >>>> reference, the page enters the final folio_put() release path.
> >>>>
> >>>> Isolate:
> >>>> In parallel, folio_isolate_lru() can observe the same tail page with a
> >>>> non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
> >>>> reference.  If this races with the final folio_put() from the split path,
> >>>> __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
> >>>> The page is then freed back to the allocator while its lru links are
> >>>> still present in the LRU list.  A later LRU operation on a neighboring
> >>>> page detects the stale link and reports list corruption.
> >>>
> >>> Complicated mess :(
> >>>
> >>> So, folio_isolate_lru() really only requires the caller to hold a folio
> >>> reference, which can happen given that we did the folio_ref_unfreeze(). It can,
> >>> for example, be triggered by memory offlining or page migration.
> >>>
> >>> So we really want to not allow folio_isolate_lru() while we are still processing
> >>> the folio.
> >>
> >> Or we should defer adding split folios to LRU after unfreeze.
> >>
> >>>
> >>> What your patch does is, simply not add folios that we will drop from the page
> >>> cache to the LRU?
> >>>
> >>>
> >>> You should describe here how you are fixing it: "Let's fix it by..."
> > Yes. This commit would like to suggest to fix it by having the folio
> > skip the lru_add_split_folio
>
> Skipping it causes more issues like LRU counter mismatch, firing up bad_page()
> since PG_active, PG_unevictable, or MGLRU fields in ->flags.f could stay
> uncleared at page free time.

OK, we should solve this issue.
>
> >>>
> >>>>
> >>>> [1]
> >>>> [   22.486082] list_del corruption. next->prev should be fffffffec10e0ac8, but was dead000000000122. (next=fffffffec10e0a88)
> >>>> [   22.486130] ------------[ cut here ]------------
> >>>> [   22.486134] kernel BUG at lib/list_debug.c:67!
> >>>> [   22.486141] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
> >>>> [   22.488502] Tainted: [W]=WARN, [O]=OOT_MODULE
> >>>> [   22.488506] Hardware name: Spreadtrum UMS9230 1H10 SoC (DT)
> >>>> [   22.488511] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> >>>> [   22.488517] pc : __list_del_entry_valid_or_report+0x14c/0x154
> >>>> [   22.488531] lr : __list_del_entry_valid_or_report+0x14c/0x154
> >>>> [   22.488539] sp : ffffffc08006b830
> >>>> [   22.488542] x29: ffffffc08006b868 x28: 0000000000003020 x27: 0000000000000000
> >>>> [   22.488553] x26: 0000000000000000 x25: 0000000000000004 x24: fffffffec10e0ac0
> >>>> [   22.488564] x23: 00000000000000e8 x22: 0000000000000024 x21: dead000000000122
> >>>> [   22.488574] x20: fffffffec10e0a88 x19: fffffffec10e0ac8 x18: ffffffc080061060
> >>>> [   22.488585] x17: 20747562202c3863 x16: 6130653031636566 x15: 0000000000000058
> >>>> [   22.488595] x14: 0000000000000004 x13: ffffff80f91e0000 x12: 0000000000000003
> >>>> [   22.488605] x11: 0000000000000003 x10: 0000000000000001 x9 : ffe85721f0e25f00
> >>>> [   22.488615] x8 : ffe85721f0e25f00 x7 : 0000000000000000 x6 : 6c65645f7473696c
> >>>> [   22.488625] x5 : ffffffed39b23026 x4 : 0000000000000000 x3 : 0000000000000010
> >>>> [   22.488636] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000006d
> >>>> [   22.488647] Call trace:
> >>>> [   22.488651]  __list_del_entry_valid_or_report+0x14c/0x154 (P)
> >>>> [   22.488661]  __folio_put+0x2bc/0x434
> >>>> [   22.488670]  folio_put+0x28/0x58
> >>>> [   22.488678]  do_garbage_collect+0x1a34/0x2584
> >>>> [   22.488689]  f2fs_gc+0x230/0x9b4
> >>>> [   22.488697]  f2fs_fallocate+0xb90/0xdf4
> >>>> [   22.488706]  vfs_fallocate+0x1b4/0x2bc
> >>>> [   22.488716]  __arm64_sys_fallocate+0x44/0x78
> >>>> [   22.488725]  invoke_syscall+0x58/0xe4
> >>>> [   22.488732]  do_el0_svc+0x48/0xdc
> >>>> [   22.488739]  el0_svc+0x3c/0x98
> >>>> [   22.488747]  el0t_64_sync_handler+0x20/0x130
> >>>> [   22.488754]  el0t_64_sync+0x1c4/0x1c8
> >>>>
> >>>> [2]
> >>>> CPU0 (f2fs GC)              CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)
> >>>>
> >>>> F: pagecache refs = n
> >>>> F: extra refs = GC + split
> >>>> F: PG_lru set
> >>>> move_data_block()
> >>>> folio = f2fs_grab_cache_folio(F)
> >>>> ...
> >>>> __folio_set_dropbehind(F)
> >>>> folio_unlock(F)
> >>>> folio_end_dropbehind(F)
> >>>>   folio_unmap_invalidate(F)
> >>>>     __filemap_remove_folio(F)
> >>>>     folio_put_refs(F, n)
> >>>> folio_put(F)
> >>>>                             split_folio_to_order(F)
> >>>>                               folio_ref_freeze(F, 1)
> >>>>                               ...
> >>>>                               lru_add_split_folio(T)
> >>>>                                 list_add_tail(&T->lru, &F->lru)
> >>>>                                 folio_set_lru(T)
> >>>>                               __filemap_remove_folio(T)
> >>>>                               folio_put_refs(T, 1)
> >>>>                               /* T refcount == 1, PageLRU set */
> >>>>                             free_folio_and_swap_cache(T)
> >>>>                               folio_put(T)
> >>>>                                 /* refcount: 1 -> 0 */
> >>>>                                                                   folio_isolate_lru(T)
> >>
> >> If refcount is 0 at this point, VM_BUG_ON_FOLIO(!folio_ref_count(folio), folio) in
> >> folio_isolate_lru() would be triggered. Maybe we could just return false in that case.
> > No, isolate caller will grab one refcount.
>
> As I said in another email, isolate caller cannot grab a refcount when folio refcount
> is 0.

pin_user_pages*(..., FOLL_LONGTERM)
└─ __gup_longterm_locked() [gup.c:2465]
   ├─ follow_page_pte() [gup.c:802]
   │  └─ try_grab_folio() [gup.c:858]
   │       if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
   │           return -ENOMEM;
   │
   │       /* Could __folio_split() -> folio_put() race here? */
   │       if (flags & FOLL_GET)
   │           folio_ref_add(folio, refs);
   └─ check_and_migrate_movable_pages() [gup.c:2490]
      └─ collect_longterm_unpinnable_folios() [gup.c:2391]
         └─ if (!folio_isolate_lru(folio))

Could __folio_split() race in the above scenario? It looks like
try_grab_folio() checks the refcount and then adds to it in two separate
steps, so the pair is not a single atomic operation.

> (from previous mail)
> Wait, if folio->mapping is NULL and folio is not anonymous,
> folio_check_splittable() returns false at the beginning of
> __folio_split(). So the split cannot happen.

As I understand it, the folio checked here is still the large folio, which
is locked and has folio->mapping set, right?
>
> >>
> >>>>                                                                     folio_test_clear_lru(T)
> >>>>                                 __folio_put(T)
> >>>>                                   __page_cache_release(T)
> >>>>                                     folio_test_lru(T) == false
> >>>>                                     /* skip lruvec_del_folio(T) */
> >>>>                                   free_frozen_pages(T)
> >>>>                                                                   folio_get(T)
> >>>>                                                                   lruvec_del_folio(T)
> >>
> >> But in CPU2 (folio_isolate_lru), lruvec_del_folio(T) should remove T from LRU list.
> >>
> >>>> later:
> >>>>   list_del(adjacent->lru)
> >>>>     next == &T->lru
> >>>>     next->prev == LIST_POISON / PCP freelist
> >>>>     BUG
> >>>>
> >>
> >> Why does CPU0 still see the stale link from adjacent?
> > The staled link should be from LRU since the folio never be deleted from lru.
> >>
> >>>> Assisted-by: Cursor:claude-opus-4-8
> >>>> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> >>>
> >>> I'm wondering if this has been broken the whole time, or if some rework allowed
> >>> this to trigger.
> > This issue is from AOSP with v6.18 which just supports big folio in
> > f2fs. Besides, it is triggered by the timing of f2fs's partition get
> > almost full during the test case of filling f2fs's partition(should be
> > the trigger factor of f2fs's gc which enroll truncate thing)
>
> Are you able to reproduce it with other FSes supporting large folio?

Sorry, not so far, since f2fs is the only filesystem that runs GC on our
Android system.
>
> >>>
> >>> I assume the issue can be triggered for other FSes, and we want Fixes: + CC: stable?
> >>>
> >>> Looking into the history, I think we always unconditionally did the
> >>> lru_add_split_folio()/lru_add_page_tail().
> >>>
> >>>> ---
> >>>>  mm/huge_memory.c | 2 +-
> >>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >>>> index 970e077019b7..7465525a94a8 100644
> >>>> --- a/mm/huge_memory.c
> >>>> +++ b/mm/huge_memory.c
> >>>> @@ -3966,7 +3966,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
> >>>>                      folio_ref_unfreeze(new_folio,
> >>>>                                         folio_cache_ref_count(new_folio) + 1);
> >>>>
> >>>> -                    if (do_lru)
> >>>> +                    if (do_lru && !(mapping && new_folio->index >= end))
> >>>
> >>> It might be clearer to write this as
> >>>
> >>>       do_lru && (!mapping || new_folio->index < end)
> >>>
> >>> To match the page-cache check further below
> >>>
> >>>       if (!mapping)
> >>>               continue
> >>>
> >>>       ...
> >>>       if (new_folio->index < end)
> >>>               ...
> >>>
> >>>>                              lru_add_split_folio(folio, new_folio, lruvec, list);
> >>>>
> >>>>                      /*
> >>>
> >>> folio_check_splittable() makes sure that we have a mapping for non-anon folios.
> >>> (no truncation). end is then only set for non-anon folios.
> >>>
> >>> @Zi, any thoughts?
> >>
> >> The fix works but I feel that it is masking the race between folio_isolate_lru() and
> >> folio_put(). I worry that the same issue might be triggered in other ways or
> >> in new code if we do not fix the race.
> >>
> >> To summarize my thoughts above:
> >> 1. adding frozen folios in LRU might be problematic, since folio_isolate_lru()
> >> has a VM_BUG_ON_FOLIO() for it but still chooses to proceed the isolation.
> >>
> >> 2. the race analysis is not clear, since both folio_isolate_lru() and folio_put()
> >> do lruvec_del_folio() if folio is on LRU. When list_del(adjacent->lru) sees
> >> the stale link, the folio is already in buddy and page->lru is modified for
> >> PageBuddy use? So even without CPU0, folio_isolate_lru()'s lruvec_del_folio()
> >> can do the wrong thing on pages on buddy?
> >>
> >>
> >> --
> >> Best Regards,
> >> Yan, Zi
>
>
> --
> Best Regards,
> Yan, Zi



* Re: [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU
  2026-06-11  2:39         ` Zhaoyang Huang
@ 2026-06-11  3:06           ` Zi Yan
  2026-06-11  7:45             ` Zhaoyang Huang
  0 siblings, 1 reply; 16+ messages in thread
From: Zi Yan @ 2026-06-11  3:06 UTC (permalink / raw)
  To: Zhaoyang Huang
  Cc: David Hildenbrand (Arm), zhaoyang.huang, Andrew Morton,
	Lorenzo Stoakes, Barry Song, Baolin Wang, Lance Yang,
	Liam R . Howlett, Nico Pache, Ryan Roberts, Dev Jain, linux-mm,
	linux-kernel, steve.kang, xiuhong.wang, hao_hao.wang, jyescas

On 10 Jun 2026, at 22:39, Zhaoyang Huang wrote:

> On Thu, Jun 11, 2026 at 9:56 AM Zi Yan <ziy@nvidia.com> wrote:
>>
>> On 10 Jun 2026, at 21:39, Zhaoyang Huang wrote:
>>
>>> On Wed, Jun 10, 2026 at 10:38 PM Zi Yan <ziy@nvidia.com> wrote:
>>>>
>>>> On 10 Jun 2026, at 8:50, David Hildenbrand (Arm) wrote:
>>>>
>>>>> On 6/10/26 14:05, zhaoyang.huang wrote:
>>>>>> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
>>>>>>
>>>>>> The kernel panics are keeping to be reported especially when the f2fs
>>>>>> partition get almost full. By investigation, we find that the reason is
>>>>>> one f2fs page got freed to buddy without being deleted from LRU and the
>>>>>> root cause is the race happened in [2] which is enrolled by this commit.
>>>>>> We solve this issue by reverting a f2fs commit 9609dd704725 ("f2fs: remove
>>>>>> non-uptodate folio from the page cache in move_data_block").
>>>>>
>>>>> But I assume, that other FSes can trigger this as well? Any insights?
>>>
>>> Yes, I think all FSes support big folio could suffer from this defect.
>>>
>>>>>
>>>>>>
>>>>>> There are 3 race processes in this scenario, please find below for their
>>>>>> main activities. However, by further investigation over the code, I
>>>>>> think there is a common race window for the truncated folios between
>>>>>> split_folio_to_order and folio_isolate_lru, where the folios lost the
>>>>>> refcount on page cache and remains the transient one of the split
>>>>>> caller, under which the folio could enter free path and compete with the
>>>>>> isolation process. This commit would like to suggest to have the folios
>>>>>> beyond EOF stay out of LRU.
>>>>>>
>>>>>> Truncate:
>>>>>> The changed code in move_data_block() lets the GC path evict the tail-end
>>>>>> folio from the page cache through folio_end_dropbehind().  Once
>>>>>> folio_unmap_invalidate() removes the folio from mapping->i_pages, the
>>>>>> page-cache references for all pages in the folio are dropped.  The folio
>>>>>> is then kept alive only by temporary external references, which allows a
>>>>>> later split to operate on a folio whose subpages are no longer protected
>>>>>> by page-cache references.
>>>>>>
>>>>>> Split:
>>>>>> After the page-cache references are gone, split_folio_to_order() can
>>>>>> split the big folio into individual pages and put the resulting subpages
>>>>>> back on the LRU.  For tail pages beyond EOF, split removes them from the
>>>>>> page cache and drops their page-cache references.  A tail page can then
>>>>>> remain on the LRU with PG_lru set while holding only the split caller's
>>>>>> temporary reference.  When free_folio_and_swap_cache() drops that final
>>>>>> reference, the page enters the final folio_put() release path.
>>>>>>
>>>>>> Isolate:
>>>>>> In parallel, folio_isolate_lru() can observe the same tail page with a
>>>>>> non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
>>>>>> reference.  If this races with the final folio_put() from the split path,
>>>>>> __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
>>>>>> The page is then freed back to the allocator while its lru links are
>>>>>> still present in the LRU list.  A later LRU operation on a neighboring
>>>>>> page detects the stale link and reports list corruption.
>>>>>
>>>>> Complicated mess :(
>>>>>
>>>>> So, folio_isolate_lru() really only requires the caller to hold a folio
>>>>> reference, which can happen given that we did the folio_ref_unfreeze(). It can,
>>>>> for example, be triggered by memory offlining or page migration.
>>>>>
>>>>> So we really want to not allow folio_isolate_lru() while we are still processing
>>>>> the folio.
>>>>
>>>> Or we should defer adding split folios to LRU after unfreeze.
>>>>
>>>>>
>>>>> What your patch does is, simply not add folios that we will drop from the page
>>>>> cache to the LRU?
>>>>>
>>>>>
>>>>> You should describe here how you are fixing it: "Let's fix it by..."
>>> Yes. This commit would like to suggest to fix it by having the folio
>>> skip the lru_add_split_folio
>>
>> Skipping it causes more issues like LRU counter mismatch, firing up bad_page()
>> since PG_active, PG_unevictable, or MGLRU fields in ->flags.f could stay
>> uncleared at page free time.
>
> OK, we should solve this issue.
>>
>>>>>
>>>>>>
>>>>>> [1]
>>>>>> [   22.486082] list_del corruption. next->prev should be fffffffec10e0ac8, but was dead000000000122. (next=fffffffec10e0a88)
>>>>>> [   22.486130] ------------[ cut here ]------------
>>>>>> [   22.486134] kernel BUG at lib/list_debug.c:67!
>>>>>> [   22.486141] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
>>>>>> [   22.488502] Tainted: [W]=WARN, [O]=OOT_MODULE
>>>>>> [   22.488506] Hardware name: Spreadtrum UMS9230 1H10 SoC (DT)
>>>>>> [   22.488511] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>>>>> [   22.488517] pc : __list_del_entry_valid_or_report+0x14c/0x154
>>>>>> [   22.488531] lr : __list_del_entry_valid_or_report+0x14c/0x154
>>>>>> [   22.488539] sp : ffffffc08006b830
>>>>>> [   22.488542] x29: ffffffc08006b868 x28: 0000000000003020 x27: 0000000000000000
>>>>>> [   22.488553] x26: 0000000000000000 x25: 0000000000000004 x24: fffffffec10e0ac0
>>>>>> [   22.488564] x23: 00000000000000e8 x22: 0000000000000024 x21: dead000000000122
>>>>>> [   22.488574] x20: fffffffec10e0a88 x19: fffffffec10e0ac8 x18: ffffffc080061060
>>>>>> [   22.488585] x17: 20747562202c3863 x16: 6130653031636566 x15: 0000000000000058
>>>>>> [   22.488595] x14: 0000000000000004 x13: ffffff80f91e0000 x12: 0000000000000003
>>>>>> [   22.488605] x11: 0000000000000003 x10: 0000000000000001 x9 : ffe85721f0e25f00
>>>>>> [   22.488615] x8 : ffe85721f0e25f00 x7 : 0000000000000000 x6 : 6c65645f7473696c
>>>>>> [   22.488625] x5 : ffffffed39b23026 x4 : 0000000000000000 x3 : 0000000000000010
>>>>>> [   22.488636] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000006d
>>>>>> [   22.488647] Call trace:
>>>>>> [   22.488651]  __list_del_entry_valid_or_report+0x14c/0x154 (P)
>>>>>> [   22.488661]  __folio_put+0x2bc/0x434
>>>>>> [   22.488670]  folio_put+0x28/0x58
>>>>>> [   22.488678]  do_garbage_collect+0x1a34/0x2584
>>>>>> [   22.488689]  f2fs_gc+0x230/0x9b4
>>>>>> [   22.488697]  f2fs_fallocate+0xb90/0xdf4
>>>>>> [   22.488706]  vfs_fallocate+0x1b4/0x2bc
>>>>>> [   22.488716]  __arm64_sys_fallocate+0x44/0x78
>>>>>> [   22.488725]  invoke_syscall+0x58/0xe4
>>>>>> [   22.488732]  do_el0_svc+0x48/0xdc
>>>>>> [   22.488739]  el0_svc+0x3c/0x98
>>>>>> [   22.488747]  el0t_64_sync_handler+0x20/0x130
>>>>>> [   22.488754]  el0t_64_sync+0x1c4/0x1c8
>>>>>>
>>>>>> [2]
>>>>>> CPU0 (f2fs GC)              CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)
>>>>>>
>>>>>> F: pagecache refs = n
>>>>>> F: extra refs = GC + split
>>>>>> F: PG_lru set
>>>>>> move_data_block()
>>>>>> folio = f2fs_grab_cache_folio(F)
>>>>>> ...
>>>>>> __folio_set_dropbehind(F)
>>>>>> folio_unlock(F)
>>>>>> folio_end_dropbehind(F)
>>>>>>   folio_unmap_invalidate(F)
>>>>>>     __filemap_remove_folio(F)
>>>>>>     folio_put_refs(F, n)
>>>>>> folio_put(F)
>>>>>>                             split_folio_to_order(F)
>>>>>>                               folio_ref_freeze(F, 1)
>>>>>>                               ...
>>>>>>                               lru_add_split_folio(T)
>>>>>>                                 list_add_tail(&T->lru, &F->lru)
>>>>>>                                 folio_set_lru(T)
>>>>>>                               __filemap_remove_folio(T)
>>>>>>                               folio_put_refs(T, 1)
>>>>>>                               /* T refcount == 1, PageLRU set */
>>>>>>                             free_folio_and_swap_cache(T)
>>>>>>                               folio_put(T)
>>>>>>                                 /* refcount: 1 -> 0 */
>>>>>>                                                                   folio_isolate_lru(T)
>>>>
>>>> If refcount is 0 at this point, VM_BUG_ON_FOLIO(!folio_ref_count(folio), folio) in
>>>> folio_isolate_lru() would be triggered. Maybe we could just return false in that case.
>>> No, isolate caller will grab one refcount.
>>
>> As I said in another email, isolate caller cannot grab a refcount when folio refcount
>> is 0.
>
> pin_user_pages*(..., FOLL_LONGTERM)
> └─ __gup_longterm_locked() [gup.c:2465]
>    ├─ follow_page_pte() [gup.c:802]
>    │  └─ try_grab_folio() [gup.c:858]
>    │       if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
>    │           return -ENOMEM;
>    │
>    │       /* Could __folio_split() -> folio_put() race here? */
>    │       if (flags & FOLL_GET)
>    │           folio_ref_add(folio, refs);
>    └─ check_and_migrate_movable_pages() [gup.c:2490]
>       └─ collect_longterm_unpinnable_folios() [gup.c:2391]
>          └─ if (!folio_isolate_lru(folio))
>
> Could __folio_split() race in the above scenario? It looks like
> try_grab_folio() checks the refcount and then adds to it in two separate
> steps, so the pair is not a single atomic operation.

folio_ref_add() used by try_grab_folio() is an atomic op.
Which refcount change is not atomic here?
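
For illustration only, a userspace sketch with C11 atomics (not kernel code;
struct ref, grab_checked() and try_get() are hypothetical names).
grab_checked() mirrors the check-then-add shape quoted above: each step is
atomic, but the pair relies on the caller already holding a reference, so
the count cannot reach zero in between. try_get() shows the add-unless-zero
shape, which is what a caller without a reference would need:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object with an atomic reference count. */
struct ref {
	atomic_int count;
};

/*
 * Check, then add. Both steps are atomic operations, but they are two
 * separate steps, so this is only safe if the caller already holds a
 * reference that keeps the count above zero.
 */
static bool grab_checked(struct ref *r, int refs)
{
	if (atomic_load(&r->count) <= 0)
		return false;
	atomic_fetch_add(&r->count, refs);
	return true;
}

/*
 * Add one unless the count is zero, as a single compare-and-swap loop.
 * This never takes the count from zero back to non-zero.
 */
static bool try_get(struct ref *r)
{
	int old = atomic_load(&r->count);

	do {
		if (old == 0)
			return false;
		/* On failure, old is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(&r->count, &old, old + 1));

	return true;
}
```

So the question for the trace is not whether the add is atomic, but whether
the caller in question already holds a reference when it gets there.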

In addition, who is GUPing f2fs folio?

I think you need to find the actual f2fs code path instead of
chasing theoretical code combinations.

>
>> (from previous mail)
>> Wait, if folio->mapping is NULL and folio is not anonymous,
>> folio_check_splittable() returns false at the beginning of
>> __folio_split(). So the split cannot happen.
>
> As I understand it, the folio checked here is still the large folio, which
> is locked and has folio->mapping set, right?

But the provided trace says the folio is split after folio_end_dropbehind(F)
and folio->mapping is NULL.

>>
>>>>
>>>>>>                                                                     folio_test_clear_lru(T)
>>>>>>                                 __folio_put(T)
>>>>>>                                   __page_cache_release(T)
>>>>>>                                     folio_test_lru(T) == false
>>>>>>                                     /* skip lruvec_del_folio(T) */
>>>>>>                                   free_frozen_pages(T)
>>>>>>                                                                   folio_get(T)
>>>>>>                                                                   lruvec_del_folio(T)
>>>>
>>>> But in CPU2 (folio_isolate_lru), lruvec_del_folio(T) should remove T from LRU list.
>>>>
>>>>>> later:
>>>>>>   list_del(adjacent->lru)
>>>>>>     next == &T->lru
>>>>>>     next->prev == LIST_POISON / PCP freelist
>>>>>>     BUG
>>>>>>
>>>>
>>>> Why does CPU0 still see the stale link from adjacent?
>>> The staled link should be from LRU since the folio never be deleted from lru.
>>>>
>>>>>> Assisted-by: Cursor:claude-opus-4-8
>>>>>> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
>>>>>
>>>>> I'm wondering if this has been broken the whole time, or if some rework allowed
>>>>> this to trigger.
>>> This issue is from AOSP with v6.18 which just supports big folio in
>>> f2fs. Besides, it is triggered by the timing of f2fs's partition get
>>> almost full during the test case of filling f2fs's partition(should be
>>> the trigger factor of f2fs's gc which enroll truncate thing)
>>
>> Are you able to reproduce it with other FSes supporting large folio?
>
> Sorry, not so far, since f2fs is the only filesystem that runs GC on our
> Android system.

Have you checked f2fs gc code to make sure it is working correctly?
BTW, what makes you think the issue is related to folio_split()?
Can you elaborate more on your investigation?

Thanks.


--
Best Regards,
Yan, Zi



* Re: [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU
  2026-06-11  3:06           ` Zi Yan
@ 2026-06-11  7:45             ` Zhaoyang Huang
  0 siblings, 0 replies; 16+ messages in thread
From: Zhaoyang Huang @ 2026-06-11  7:45 UTC (permalink / raw)
  To: Zi Yan, jaegeuk, Chao Yu, jyescas@google.com
  Cc: David Hildenbrand (Arm), zhaoyang.huang, Andrew Morton,
	Lorenzo Stoakes, Barry Song, Baolin Wang, Lance Yang,
	Liam R . Howlett, Nico Pache, Ryan Roberts, Dev Jain, linux-mm,
	linux-kernel, steve.kang, xiuhong.wang, hao_hao.wang

+f2fs and Android folks
@jaegeuk, Chao and jyescas: this thread discusses an issue related to
f2fs. On Android's v6.18, the kernel panic reported by this RFC
reproduces with commit 9609dd704725 ("f2fs: remove non-uptodate folio
from the page cache in move_data_block") applied and does not reproduce
with it reverted. Could you please take a look, or consider reverting
the suspicious commit?

On Thu, Jun 11, 2026 at 11:06 AM Zi Yan <ziy@nvidia.com> wrote:
>
> On 10 Jun 2026, at 22:39, Zhaoyang Huang wrote:
>
> > On Thu, Jun 11, 2026 at 9:56 AM Zi Yan <ziy@nvidia.com> wrote:
> >>
> >> On 10 Jun 2026, at 21:39, Zhaoyang Huang wrote:
> >>
> >>> On Wed, Jun 10, 2026 at 10:38 PM Zi Yan <ziy@nvidia.com> wrote:
> >>>>
> >>>> On 10 Jun 2026, at 8:50, David Hildenbrand (Arm) wrote:
> >>>>
> >>>>> On 6/10/26 14:05, zhaoyang.huang wrote:
> >>>>>> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> >>>>>>
> >>>>>> The kernel panics are keeping to be reported especially when the f2fs
> >>>>>> partition get almost full. By investigation, we find that the reason is
> >>>>>> one f2fs page got freed to buddy without being deleted from LRU and the
> >>>>>> root cause is the race happened in [2] which is enrolled by this commit.
> >>>>>> We solve this issue by reverting a f2fs commit 9609dd704725 ("f2fs: remove
> >>>>>> non-uptodate folio from the page cache in move_data_block").
> >>>>>
> >>>>> But I assume, that other FSes can trigger this as well? Any insights?
> >>>
> >>> Yes, I think all FSes support big folio could suffer from this defect.
> >>>
> >>>>>
> >>>>>>
> >>>>>> There are 3 race processes in this scenario, please find below for their
> >>>>>> main activities. However, by further investigation over the code, I
> >>>>>> think there is a common race window for the truncated folios between
> >>>>>> split_folio_to_order and folio_isolate_lru, where the folios lost the
> >>>>>> refcount on page cache and remains the transient one of the split
> >>>>>> caller, under which the folio could enter free path and compete with the
> >>>>>> isolation process. This commit would like to suggest to have the folios
> >>>>>> beyond EOF stay out of LRU.
> >>>>>>
> >>>>>> Truncate:
> >>>>>> The changed code in move_data_block() lets the GC path evict the tail-end
> >>>>>> folio from the page cache through folio_end_dropbehind().  Once
> >>>>>> folio_unmap_invalidate() removes the folio from mapping->i_pages, the
> >>>>>> page-cache references for all pages in the folio are dropped.  The folio
> >>>>>> is then kept alive only by temporary external references, which allows a
> >>>>>> later split to operate on a folio whose subpages are no longer protected
> >>>>>> by page-cache references.
> >>>>>>
> >>>>>> Split:
> >>>>>> After the page-cache references are gone, split_folio_to_order() can
> >>>>>> split the big folio into individual pages and put the resulting subpages
> >>>>>> back on the LRU.  For tail pages beyond EOF, split removes them from the
> >>>>>> page cache and drops their page-cache references.  A tail page can then
> >>>>>> remain on the LRU with PG_lru set while holding only the split caller's
> >>>>>> temporary reference.  When free_folio_and_swap_cache() drops that final
> >>>>>> reference, the page enters the final folio_put() release path.
> >>>>>>
> >>>>>> Isolate:
> >>>>>> In parallel, folio_isolate_lru() can observe the same tail page with a
> >>>>>> non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
> >>>>>> reference.  If this races with the final folio_put() from the split path,
> >>>>>> __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
> >>>>>> The page is then freed back to the allocator while its lru links are
> >>>>>> still present in the LRU list.  A later LRU operation on a neighboring
> >>>>>> page detects the stale link and reports list corruption.
> >>>>>
> >>>>> Complicated mess :(
> >>>>>
> >>>>> So, folio_isolate_lru() really only requires the caller to hold a folio
> >>>>> reference, which can happen given that we did the folio_ref_unfreeze(). It can,
> >>>>> for example, be triggered by memory offlining or page migration.
> >>>>>
> >>>>> So we really want to not allow folio_isolate_lru() while we are still processing
> >>>>> the folio.
> >>>>
> >>>> Or we should defer adding split folios to LRU after unfreeze.
> >>>>
> >>>>>
> >>>>> What your patch does is, simply not add folios that we will drop from the page
> >>>>> cache to the LRU?
> >>>>>
> >>>>>
> >>>>> You should describe here how you are fixing it: "Let's fix it by..."
> >>> Yes. This commit would like to suggest to fix it by having the folio
> >>> skip the lru_add_split_folio
> >>
> >> Skipping it causes more issues like LRU counter mismatch, firing up bad_page()
> >> since PG_active, PG_unevictable, or MGLRU fields in ->flags.f could stay
> >> uncleared at page free time.
> >
> > OK, we should solve this issue.
> >>
> >>>>>
> >>>>>>
> >>>>>> [1]
> >>>>>> [   22.486082] list_del corruption. next->prev should be fffffffec10e0ac8, but was dead000000000122. (next=fffffffec10e0a88)
> >>>>>> [   22.486130] ------------[ cut here ]------------
> >>>>>> [   22.486134] kernel BUG at lib/list_debug.c:67!
> >>>>>> [   22.486141] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
> >>>>>> [   22.488502] Tainted: [W]=WARN, [O]=OOT_MODULE
> >>>>>> [   22.488506] Hardware name: Spreadtrum UMS9230 1H10 SoC (DT)
> >>>>>> [   22.488511] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> >>>>>> [   22.488517] pc : __list_del_entry_valid_or_report+0x14c/0x154
> >>>>>> [   22.488531] lr : __list_del_entry_valid_or_report+0x14c/0x154
> >>>>>> [   22.488539] sp : ffffffc08006b830
> >>>>>> [   22.488542] x29: ffffffc08006b868 x28: 0000000000003020 x27: 0000000000000000
> >>>>>> [   22.488553] x26: 0000000000000000 x25: 0000000000000004 x24: fffffffec10e0ac0
> >>>>>> [   22.488564] x23: 00000000000000e8 x22: 0000000000000024 x21: dead000000000122
> >>>>>> [   22.488574] x20: fffffffec10e0a88 x19: fffffffec10e0ac8 x18: ffffffc080061060
> >>>>>> [   22.488585] x17: 20747562202c3863 x16: 6130653031636566 x15: 0000000000000058
> >>>>>> [   22.488595] x14: 0000000000000004 x13: ffffff80f91e0000 x12: 0000000000000003
> >>>>>> [   22.488605] x11: 0000000000000003 x10: 0000000000000001 x9 : ffe85721f0e25f00
> >>>>>> [   22.488615] x8 : ffe85721f0e25f00 x7 : 0000000000000000 x6 : 6c65645f7473696c
> >>>>>> [   22.488625] x5 : ffffffed39b23026 x4 : 0000000000000000 x3 : 0000000000000010
> >>>>>> [   22.488636] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000006d
> >>>>>> [   22.488647] Call trace:
> >>>>>> [   22.488651]  __list_del_entry_valid_or_report+0x14c/0x154 (P)
> >>>>>> [   22.488661]  __folio_put+0x2bc/0x434
> >>>>>> [   22.488670]  folio_put+0x28/0x58
> >>>>>> [   22.488678]  do_garbage_collect+0x1a34/0x2584
> >>>>>> [   22.488689]  f2fs_gc+0x230/0x9b4
> >>>>>> [   22.488697]  f2fs_fallocate+0xb90/0xdf4
> >>>>>> [   22.488706]  vfs_fallocate+0x1b4/0x2bc
> >>>>>> [   22.488716]  __arm64_sys_fallocate+0x44/0x78
> >>>>>> [   22.488725]  invoke_syscall+0x58/0xe4
> >>>>>> [   22.488732]  do_el0_svc+0x48/0xdc
> >>>>>> [   22.488739]  el0_svc+0x3c/0x98
> >>>>>> [   22.488747]  el0t_64_sync_handler+0x20/0x130
> >>>>>> [   22.488754]  el0t_64_sync+0x1c4/0x1c8
> >>>>>>
> >>>>>> [2]
> >>>>>> CPU0 (f2fs GC)              CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)
> >>>>>>
> >>>>>> F: pagecache refs = n
> >>>>>> F: extra refs = GC + split
> >>>>>> F: PG_lru set
> >>>>>> move_data_block()
> >>>>>> folio = f2fs_grab_cache_folio(F)
> >>>>>> ...
> >>>>>> __folio_set_dropbehind(F)
> >>>>>> folio_unlock(F)
> >>>>>> folio_end_dropbehind(F)
> >>>>>>   folio_unmap_invalidate(F)
> >>>>>>     __filemap_remove_folio(F)
> >>>>>>     folio_put_refs(F, n)
> >>>>>> folio_put(F)
> >>>>>>                             split_folio_to_order(F)
> >>>>>>                               folio_ref_freeze(F, 1)
> >>>>>>                               ...
> >>>>>>                               lru_add_split_folio(T)
> >>>>>>                                 list_add_tail(&T->lru, &F->lru)
> >>>>>>                                 folio_set_lru(T)
> >>>>>>                               __filemap_remove_folio(T)
> >>>>>>                               folio_put_refs(T, 1)
> >>>>>>                               /* T refcount == 1, PageLRU set */
> >>>>>>                             free_folio_and_swap_cache(T)
> >>>>>>                               folio_put(T)
> >>>>>>                                 /* refcount: 1 -> 0 */
> >>>>>>                                                                   folio_isolate_lru(T)
> >>>>
> >>>> If refcount is 0 at this point, VM_BUG_ON_FOLIO(!folio_ref_count(folio), folio) in
> >>>> folio_isolate_lru() would be triggered. Maybe we could just return false in that case.
> >>> No, isolate caller will grab one refcount.
> >>
> >> As I said in another email, isolate caller cannot grab a refcount when folio refcount
> >> is 0.
> >
> > pin_user_pages*(..., FOLL_LONGTERM)
> > └─ __gup_longterm_locked() [gup.c:2465]
> > │ ├─ follow_page_pte() [gup.c:802]
> > │ │ └─ try_grab_folio() [gup.c:858]
> >              if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
> >                  return -ENOMEM;
> >
> >                                // Could __folio_split->folio_put could
> > race here ?
> >              if (flags & FOLL_GET)
> >                  folio_ref_add(folio, refs);
> > └─ check_and_migrate_movable_pages() [gup.c:2490]
> > └─ collect_longterm_unpinnable_folios() [gup.c:2391]
> > └─ └─if (!folio_isolate_lru(folio))
> >
> > Could the __folio_split race in the above scenario? It looks like
> > try_grab_folio set the refcount without using atomic operation.
>
> folio_ref_add() used by try_grab_folio() is an atomic op.
> Which refcount change is not atomic here?
What I mean is that folio_try_get() is implemented with
atomic_add_unless(), whereas try_grab_folio() uses the separate
check-then-add sequence below, which leaves a window for __folio_split
to race with it. Right?

if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
....

if (flags & FOLL_GET)
    folio_ref_add(folio, refs);
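
To make the distinction in question concrete, here is a minimal userspace
sketch. It uses C11 atomics rather than the kernel's atomic_t API, and
struct fake_folio, grab_check_then_add() and grab_add_unless_zero() are
hypothetical names invented for illustration. It only contrasts the two
patterns; it does not show that the kernel can actually reach the window.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a folio's refcount; not the kernel type. */
struct fake_folio {
	atomic_int refcount;
};

/*
 * Check-then-add: the test and the increment are two separate atomic
 * operations, so another thread could drop the last reference between
 * them. Each step is atomic, the pair is not.
 */
bool grab_check_then_add(struct fake_folio *f, int refs)
{
	assert(refs > 0);
	if (atomic_load(&f->refcount) <= 0)
		return false;
	/* window: the count may reach zero here */
	atomic_fetch_add(&f->refcount, refs);
	return true;
}

/*
 * Add-unless-zero: a compare-and-swap loop, so the increment succeeds
 * only if the count is still non-zero at the moment of the update.
 */
bool grab_add_unless_zero(struct fake_folio *f, int refs)
{
	int old = atomic_load(&f->refcount);

	assert(refs > 0);
	while (old != 0) {
		if (atomic_compare_exchange_weak(&f->refcount, &old, old + refs))
			return true;
		/* the failed exchange reloaded 'old'; retry */
	}
	return false;
}
```

The second form closes the window by construction. The first form is only
safe when the caller already holds something that keeps the count from
reaching zero.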


>
> In addition, who is GUPing f2fs folio?
Don't know yet.
>
> I think you need to find the actual f2fs code path instead of
> chasing theoretical code combinations.
The test case passes once the f2fs commit that adds the
folio_end_dropbehind() call is reverted, which leads us to believe this
is the clue.
>
> >
> >> (from previous mail)
> >> Wait, if folio->mapping is NULL and folio is not anonymous,
> >> folio_check_splittable() returns false at the beginning of
> >> __folio_split(). So the split cannot happen.
> >
> > According to my understanding, the folio checked here is still big
> > folio which is locked and with folio->mapping set, right?
>
> But the provided trace says the folio is split after folio_end_dropbehind(F)
> and folio->mapping is NULL.
Please find more information from the coredump below. The BUG_ON
output shows that the folio being passed to list_del is
fffffffec096e440, while its lru.next folio, fffffffec096e480, is the one
that was wrongly freed to PCP without lruvec_del_folio [1]. We can
also see that 'folio(0xfffffffec096e440)->lru.prev =
fffffffec0f639c0', where fffffffec0f639c0 is a folio with an isolated
index in the page cache that looks like the result of the fallocate [3].
So is it possible that the split happens prior to the fallocate, and
that the folio then gets truncated and free_folio_and_swap_cache races
with folio_isolate_lru?

[1]
[   22.339229] list_del corruption. next->prev should be fffffffec096e448, but was ffffff80f9791830. (next=fffffffec096e488)

struct page 0xfffffffec096e440 {
         lru = {
          next = 0xfffffffec096e488,
          prev = 0xfffffffec096e408
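
For readers mapping the report in [1] onto the dump above, the following
is a rough userspace sketch of the kind of neighbour check that produces
a "next->prev should be ..." message. The helper names are hypothetical,
and the 8-byte offset of lru within struct page is an assumption inferred
from the addresses in this log (0x...e448 for the page at 0x...e440).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct list_node {
	struct list_node *next, *prev;
};

/* Two of the conditions a debug list_del validates (hypothetical helper). */
bool list_del_entry_looks_valid(const struct list_node *entry)
{
	return entry->prev->next == entry && entry->next->prev == entry;
}

/*
 * Builds head <-> a <-> b, then lets b be reused by another list without
 * being unlinked first. Returns true if deleting a would be flagged.
 */
bool stale_neighbour_is_detected(void)
{
	struct list_node head, a, b, elsewhere;

	head.next = &a;  a.prev = &head;
	a.next = &b;     b.prev = &a;
	b.next = &head;  head.prev = &b;
	assert(list_del_entry_looks_valid(&a));

	/* b is "freed" while still linked; its links now belong elsewhere */
	elsewhere.next = elsewhere.prev = &b;
	b.next = b.prev = &elsewhere;

	/* a.next still points at b, but b.prev no longer points at a */
	return !list_del_entry_looks_valid(&a);
}

/* Assumed layout: lru sits 8 bytes into struct page on this build. */
uint64_t page_from_lru(uint64_t lru_addr)
{
	return lru_addr - 8;
}
```

Under that assumed offset, next=fffffffec096e488 corresponds to the page
at fffffffec096e480.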

[2]
fffffffec096e440 a5b91000                0       18  0 24 referenced,lru
fffffffec096e480 a5b92000 ffffff801e930481  73009e9  1 41028 uptodate,lru,owner_2,swapbacked
fffffffec096e4c0 a5b93000 ffffff801e930481  730033a  1 41028 uptodate,lru,owner_2,swapbacked

[3]
fffffffec33f9440
  index: 76446  position: root/0/18/42/30
fffffffec00da9c0
  index: 76448  position: root/0/18/42/32
fffffffec3ded040
  index: 76449  position: root/0/18/42/33
fffffffec0f639c0
  index: 6188581  position: root/23/38/56/37
fffffffec0f63a00
  index: 6188853  position: root/23/38/60/53
fffffffec0f63a40
  index: 6188854  position: root/23/38/60/54

[4]
CPU0 (f2fs GC)              CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)

                            split_folio_to_order(F)
                              folio_ref_freeze(F, 1)
                              ...
                              lru_add_split_folio(T)
                                list_add_tail(&T->lru, &F->lru)
                                folio_set_lru(T)
                              __filemap_remove_folio(T)
                              folio_put_refs(T, 1)
                              folio_unlock(new_folio);
move_data_block()
  folio = f2fs_grab_cache_folio(F)
  ...
  __folio_set_dropbehind(F)
  folio_unlock(F)
  folio_end_dropbehind(F)
    folio_unmap_invalidate(F)
      __filemap_remove_folio(F)
      folio_put_refs(F, n)
  folio_put(F)
                              /* T refcount == 1, PageLRU set */
                            free_folio_and_swap_cache(T)
                              folio_put(T)
                                /* refcount: 1 -> 0 */
                                                                  folio_isolate_lru(T)
                                                                    folio_test_clear_lru(T)
                                __folio_put(T)
                                  __page_cache_release(T)
                                    folio_test_lru(T) == false
                                    /* skip lruvec_del_folio(T) */
                                  free_frozen_pages(T)
                                                                  folio_get(T)
                                                                  lruvec_del_folio(T)
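
The CPU1/CPU2 ordering claimed above can be replayed in a single thread
as a toy model. This is only a sketch: struct fake_folio and every helper
below are hypothetical simplifications, not kernel code, and the model
says nothing about whether the kernel can actually reach this ordering,
which is the point under dispute in this thread.

```c
#include <assert.h>
#include <stdbool.h>

struct list_node {
	struct list_node *next, *prev;
};

/* Hypothetical, simplified stand-in for a folio; not the kernel struct. */
struct fake_folio {
	struct list_node lru;
	int refcount;
	bool pg_lru;
};

static void fake_list_add_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void fake_list_del(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

/* Isolation side: test-and-clear PG_lru, as in the CPU2 column. */
static bool isolate_test_clear_lru(struct fake_folio *f)
{
	bool was_set = f->pg_lru;

	f->pg_lru = false;
	return was_set;
}

/* Release side: unlink from the LRU only if PG_lru is still set. */
static bool release_unlinks(struct fake_folio *f)
{
	if (!f->pg_lru)
		return false;	/* the list removal is skipped */
	f->pg_lru = false;
	fake_list_del(&f->lru);
	return true;
}

/*
 * Returns true if T is still linked on the LRU at the point where it
 * would be handed back to the allocator.
 */
bool freed_while_on_lru(bool isolate_runs_first)
{
	struct list_node lru_head = { &lru_head, &lru_head };
	struct fake_folio t = { .refcount = 1, .pg_lru = true };

	fake_list_add_tail(&t.lru, &lru_head);
	t.refcount--;			/* folio_put(T): 1 -> 0 */
	assert(t.refcount == 0);
	if (isolate_runs_first)
		isolate_test_clear_lru(&t);
	release_unlinks(&t);
	return lru_head.next == &t.lru;
}
```

In this model the stale link appears only when the test-and-clear runs
between the final put and the release path's own PG_lru test.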

>
> >>
> >>>>
> >>>>>>                                                                     folio_test_clear_lru(T)
> >>>>>>                                 __folio_put(T)
> >>>>>>                                   __page_cache_release(T)
> >>>>>>                                     folio_test_lru(T) == false
> >>>>>>                                     /* skip lruvec_del_folio(T) */
> >>>>>>                                   free_frozen_pages(T)
> >>>>>>                                                                   folio_get(T)
> >>>>>>                                                                   lruvec_del_folio(T)
> >>>>
> >>>> But in CPU2 (folio_isolate_lru), lruvec_del_folio(T) should remove T from LRU list.
> >>>>
> >>>>>> later:
> >>>>>>   list_del(adjacent->lru)
> >>>>>>     next == &T->lru
> >>>>>>     next->prev == LIST_POISON / PCP freelist
> >>>>>>     BUG
> >>>>>>
> >>>>
> >>>> Why does CPU0 still see the stale link from adjacent?
> >>> The staled link should be from LRU since the folio never be deleted from lru.
> >>>>
> >>>>>> Assisted-by: Cursor:claude-opus-4-8
> >>>>>> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> >>>>>
> >>>>> I'm wondering if this has been broken the whole time, or if some rework allowed
> >>>>> this to trigger.
> >>> This issue is from AOSP with v6.18 which just supports big folio in
> >>> f2fs. Besides, it is triggered by the timing of f2fs's partition get
> >>> almost full during the test case of filling f2fs's partition(should be
> >>> the trigger factor of f2fs's gc which enroll truncate thing)
> >>
> >> Are you able to reproduce it with other FSes supporting large folio?
> >
> > Sorry, I can't so far since only f2fs has gc in the Android system.
>
> Have you checked f2fs gc code to make sure it is working correctly?
> BTW, what makes you think the issue is related to folio_split()?
> Can you elaborate more on your investigation?
>
> Thanks.
>
>
> --
> Best Regards,
> Yan, Zi


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU
  2026-06-10 12:05 [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU zhaoyang.huang
  2026-06-10 12:50 ` David Hildenbrand (Arm)
@ 2026-06-10 20:30 ` Andrew Morton
  2026-06-10 20:36   ` Zi Yan
  2026-06-11  7:33 ` [syzbot ci] " syzbot ci
  2026-06-11  9:30 ` [RFC PATCH] " Lorenzo Stoakes
  3 siblings, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2026-06-10 20:30 UTC (permalink / raw)
  To: zhaoyang.huang
  Cc: David Hildenbrand, Zi Yan, Lorenzo Stoakes, Barry Song,
	Baolin Wang, Lance Yang, Liam R . Howlett, Nico Pache,
	Ryan Roberts, Dev Jain, linux-mm, linux-kernel, Zhaoyang Huang,
	steve.kang

On Wed, 10 Jun 2026 20:05:35 +0800 "zhaoyang.huang" <zhaoyang.huang@unisoc.com> wrote:

> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> 
> The kernel panics are keeping to be reported especially when the f2fs
> partition get almost full. By investigation, we find that the reason is
> one f2fs page got freed to buddy without being deleted from LRU and the
> root cause is the race happened in [2] which is enrolled by this commit.
> We solve this issue by reverting a f2fs commit 9609dd704725 ("f2fs: remove
> non-uptodate folio from the page cache in move_data_block").
> 
> There are 3 race processes in this scenario, please find below for their
> main activities. However, by further investigation over the code, I
> think there is a common race window for the truncated folios between
> split_folio_to_order and folio_isolate_lru, where the folios lost the
> refcount on page cache and remains the transient one of the split
> caller, under which the folio could enter free path and compete with the
> isolation process. This commit would like to suggest to have the folios
> beyond EOF stay out of LRU.
> 
> Truncate:
> The changed code in move_data_block() lets the GC path evict the tail-end
> folio from the page cache through folio_end_dropbehind().  Once
> folio_unmap_invalidate() removes the folio from mapping->i_pages, the
> page-cache references for all pages in the folio are dropped.  The folio
> is then kept alive only by temporary external references, which allows a
> later split to operate on a folio whose subpages are no longer protected
> by page-cache references.
> 
> Split:
> After the page-cache references are gone, split_folio_to_order() can
> split the big folio into individual pages and put the resulting subpages
> back on the LRU.  For tail pages beyond EOF, split removes them from the
> page cache and drops their page-cache references.  A tail page can then
> remain on the LRU with PG_lru set while holding only the split caller's
> temporary reference.  When free_folio_and_swap_cache() drops that final
> reference, the page enters the final folio_put() release path.
> 
> Isolate:
> In parallel, folio_isolate_lru() can observe the same tail page with a
> non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
> reference.  If this races with the final folio_put() from the split path,
> __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
> The page is then freed back to the allocator while its lru links are
> still present in the LRU list.  A later LRU operation on a neighboring
> page detects the stale link and reports list corruption.

Thanks.  Sashiko AI review might have found some problems with folio
flags:

	https://sashiko.dev/#/patchset/20260610120535.2370844-1-zhaoyang.huang@unisoc.com



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU
  2026-06-10 20:30 ` Andrew Morton
@ 2026-06-10 20:36   ` Zi Yan
  0 siblings, 0 replies; 16+ messages in thread
From: Zi Yan @ 2026-06-10 20:36 UTC (permalink / raw)
  To: Andrew Morton
  Cc: zhaoyang.huang, David Hildenbrand, Lorenzo Stoakes, Barry Song,
	Baolin Wang, Lance Yang, Liam R . Howlett, Nico Pache,
	Ryan Roberts, Dev Jain, linux-mm, linux-kernel, Zhaoyang Huang,
	steve.kang

On 10 Jun 2026, at 16:30, Andrew Morton wrote:

> On Wed, 10 Jun 2026 20:05:35 +0800 "zhaoyang.huang" <zhaoyang.huang@unisoc.com> wrote:
>
>> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
>>
>> The kernel panics are keeping to be reported especially when the f2fs
>> partition get almost full. By investigation, we find that the reason is
>> one f2fs page got freed to buddy without being deleted from LRU and the
>> root cause is the race happened in [2] which is enrolled by this commit.
>> We solve this issue by reverting a f2fs commit 9609dd704725 ("f2fs: remove
>> non-uptodate folio from the page cache in move_data_block").
>>
>> There are 3 race processes in this scenario, please find below for their
>> main activities. However, by further investigation over the code, I
>> think there is a common race window for the truncated folios between
>> split_folio_to_order and folio_isolate_lru, where the folios lost the
>> refcount on page cache and remains the transient one of the split
>> caller, under which the folio could enter free path and compete with the
>> isolation process. This commit would like to suggest to have the folios
>> beyond EOF stay out of LRU.
>>
>> Truncate:
>> The changed code in move_data_block() lets the GC path evict the tail-end
>> folio from the page cache through folio_end_dropbehind().  Once
>> folio_unmap_invalidate() removes the folio from mapping->i_pages, the
>> page-cache references for all pages in the folio are dropped.  The folio
>> is then kept alive only by temporary external references, which allows a
>> later split to operate on a folio whose subpages are no longer protected
>> by page-cache references.
>>
>> Split:
>> After the page-cache references are gone, split_folio_to_order() can
>> split the big folio into individual pages and put the resulting subpages
>> back on the LRU.  For tail pages beyond EOF, split removes them from the
>> page cache and drops their page-cache references.  A tail page can then
>> remain on the LRU with PG_lru set while holding only the split caller's
>> temporary reference.  When free_folio_and_swap_cache() drops that final
>> reference, the page enters the final folio_put() release path.
>>
>> Isolate:
>> In parallel, folio_isolate_lru() can observe the same tail page with a
>> non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
>> reference.  If this races with the final folio_put() from the split path,
>> __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
>> The page is then freed back to the allocator while its lru links are
>> still present in the LRU list.  A later LRU operation on a neighboring
>> page detects the stale link and reports list corruption.
>
> Thanks.  Sashiko AI review might have found some problems with folio
> flags:
>
> 	https://sashiko.dev/#/patchset/20260610120535.2370844-1-zhaoyang.huang@unisoc.com

Claude also raised the same concern when I was reasoning about this issue.

At least for now, my conclusion is that the race between folio_split()
and folio_isolate_lru() should not cause the issue and something else
is wrong.

Best Regards,
Yan, Zi


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [syzbot ci] Re: mm/huge_memory: do not add dropped split tail folios to LRU
  2026-06-10 12:05 [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU zhaoyang.huang
  2026-06-10 12:50 ` David Hildenbrand (Arm)
  2026-06-10 20:30 ` Andrew Morton
@ 2026-06-11  7:33 ` syzbot ci
  2026-06-11  9:30 ` [RFC PATCH] " Lorenzo Stoakes
  3 siblings, 0 replies; 16+ messages in thread
From: syzbot ci @ 2026-06-11  7:33 UTC (permalink / raw)
  To: akpm, baohua, baolin.wang, david, dev.jain, huangzhaoyang,
	lance.yang, liam.howlett, linux-kernel, linux-mm, lorenzo.stoakes,
	npache, ryan.roberts, steve.kang, zhaoyang.huang, ziy
  Cc: syzbot, syzkaller-bugs

syzbot ci has tested the following series

[v1] mm/huge_memory: do not add dropped split tail folios to LRU
https://lore.kernel.org/all/20260610120535.2370844-1-zhaoyang.huang@unisoc.com
* [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU

and found the following issues:
* BUG: Bad page state in ext4_write_begin
* BUG: Bad page state in iomap_write_begin
* BUG: Bad page state in shmem_get_folio_gfp

Full report is available here:
https://ci.syzbot.org/series/c3e122ba-1000-4581-ba3f-237f41482af8

***

BUG: Bad page state in ext4_write_begin

tree:      mm-new
URL:       https://kernel.googlesource.com/pub/scm/linux/kernel/git/akpm/mm.git
base:      1ec3cca2d8b6b9ff6584ca626d4c8918bbf48d44
arch:      amd64
compiler:  Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
config:    https://ci.syzbot.org/builds/ffde37a3-aed0-4f49-bba1-ca31cd6a4b04/config
syz repro: https://ci.syzbot.org/findings/07322c5f-4419-4281-bbd5-1b06eebe91f2/syz_repro

ext2 filesystem being mounted at /0/file1 supports timestamps until 2038-01-19 (0x7fffffff)
BUG: Bad page state in process syz.0.17  pfn:11e231
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x11e231
flags: 0x17ff20000000000(node=0|zone=2|lastcpupid=0x7ff)
raw: 017ff20000000000 0000000000000000 00000000ffffffff 0000000000000000
raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Movable, gfp_mask 0x153cca(GFP_HIGHUSER_MOVABLE|__GFP_WRITE|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 5841, tgid 5840 (syz.0.17), ts 75851604747, free_ts 72751451789
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
 prep_new_page mm/page_alloc.c:1861 [inline]
 get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
 __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
 alloc_pages_mpol+0x235/0x490 mm/mempolicy.c:2490
 alloc_frozen_pages_noprof mm/mempolicy.c:2561 [inline]
 alloc_pages_noprof+0xac/0x2a0 mm/mempolicy.c:2581
 folio_alloc_noprof+0x1e/0x30 mm/mempolicy.c:2591
 filemap_alloc_folio_noprof+0x111/0x470 mm/filemap.c:1014
 __filemap_get_folio_mpol+0x3fc/0xb00 mm/filemap.c:2012
 __filemap_get_folio include/linux/pagemap.h:763 [inline]
 write_begin_get_folio include/linux/pagemap.h:789 [inline]
 ext4_write_begin+0x4ad/0x1890 fs/ext4/inode.c:1331
 generic_perform_write+0x2e2/0x8f0 mm/filemap.c:4325
 ext4_buffered_write_iter+0xce/0x3a0 fs/ext4/file.c:316
 ext4_file_write_iter+0x298/0x1bf0 fs/ext4/file.c:-1
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
page last free pid 5718 tgid 5718 stack trace:
 reset_page_owner include/linux/page_owner.h:25 [inline]
 __free_pages_prepare mm/page_alloc.c:1397 [inline]
 free_unref_folios+0xd9f/0x14c0 mm/page_alloc.c:2999
 folios_put_refs+0x9ff/0xb40 mm/swap.c:1008
 free_pages_and_swap_cache+0x41d/0x490 mm/swap_state.c:404
 __tlb_batch_free_encoded_pages mm/mmu_gather.c:138 [inline]
 tlb_batch_pages_flush mm/mmu_gather.c:151 [inline]
 tlb_flush_mmu_free mm/mmu_gather.c:417 [inline]
 tlb_flush_mmu+0x6d3/0xa30 mm/mmu_gather.c:424
 tlb_finish_mmu+0xf9/0x230 mm/mmu_gather.c:549
 exit_mmap+0x498/0x9e0 mm/mmap.c:1313
 __mmput+0x118/0x430 kernel/fork.c:1178
 exit_mm+0x1f6/0x2d0 kernel/exit.c:582
 do_exit+0x6a2/0x22c0 kernel/exit.c:964
 do_group_exit+0x21b/0x2d0 kernel/exit.c:1119
 get_signal+0x1284/0x1330 kernel/signal.c:3037
 arch_do_signal_or_restart+0xbc/0x840 arch/x86/kernel/signal.c:337
 __exit_to_user_mode_loop kernel/entry/common.c:64 [inline]
 exit_to_user_mode_loop+0xa9/0x680 kernel/entry/common.c:98
 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:207 [inline]
 syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:230 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:318 [inline]
 do_syscall_64+0x353/0x580 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Modules linked in:
CPU: 0 UID: 0 PID: 5841 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 bad_page+0x17f/0x1c0 mm/page_alloc.c:632
 free_page_is_bad mm/page_alloc.c:1076 [inline]
 __free_pages_prepare mm/page_alloc.c:1388 [inline]
 __free_frozen_pages+0xcd9/0xd30 mm/page_alloc.c:2938
 __folio_put+0x4a2/0x580 mm/swap.c:112
 __folio_split+0xffe/0x1570 mm/huge_memory.c:4199
 try_folio_split_to_order include/linux/huge_mm.h:411 [inline]
 try_folio_split_or_unmap+0x5b/0x1e0 mm/truncate.c:189
 truncate_inode_partial_folio+0x4ab/0x8e0 mm/truncate.c:255
 truncate_inode_pages_range+0x5f1/0xe30 mm/truncate.c:416
 ext4_truncate_failed_write fs/ext4/truncate.h:21 [inline]
 ext4_write_end+0x784/0xa30 fs/ext4/inode.c:1495
 generic_perform_write+0x620/0x8f0 mm/filemap.c:4346
 ext4_buffered_write_iter+0xce/0x3a0 fs/ext4/file.c:316
 ext4_file_write_iter+0x298/0x1bf0 fs/ext4/file.c:-1
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fb06359ce59
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fb0643ff028 EFLAGS: 00000246 ORIG_RAX: 0000000000000012
RAX: ffffffffffffffda RBX: 00007fb063815fa0 RCX: 00007fb06359ce59
RDX: 000000000000fdef RSI: 0000200000000140 RDI: 0000000000000004
RBP: 00007fb063632d6f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000c00 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fb063816038 R14: 00007fb063815fa0 R15: 00007ffe99cbba98
 </TASK>
BUG: Bad page state in process syz.0.17  pfn:11e232
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x2 pfn:0x11e232
head: order:0 mapcount:0 entire_mapcount:1 nr_pages_mapped:0 pincount:0
flags: 0x17ff20000000040(head|node=0|zone=2|lastcpupid=0x7ff)
raw: 017ff20000000040 0000000000000000 ffffea0004788c90 0000000000000000
raw: 0000000000000002 0000000000000000 00000000ffffffff 0000000000000000
head: 017ff20000000040 0000000000000000 ffffea0004788c90 0000000000000000
head: 0000000000000002 0000000000000000 00000000ffffffff 0000000000000000
head: 017ff00000000000 0000000000000000 00000000ffffffff 0000000000000000
head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
page_owner tracks the page as allocated
page last allocated via order 1, migratetype Movable, gfp_mask 0x153cca(GFP_HIGHUSER_MOVABLE|__GFP_WRITE|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 5841, tgid 5840 (syz.0.17), ts 75851604747, free_ts 72751458324
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
 prep_new_page mm/page_alloc.c:1861 [inline]
 get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
 __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
 alloc_pages_mpol+0x235/0x490 mm/mempolicy.c:2490
 alloc_frozen_pages_noprof mm/mempolicy.c:2561 [inline]
 alloc_pages_noprof+0xac/0x2a0 mm/mempolicy.c:2581
 folio_alloc_noprof+0x1e/0x30 mm/mempolicy.c:2591
 filemap_alloc_folio_noprof+0x111/0x470 mm/filemap.c:1014
 __filemap_get_folio_mpol+0x3fc/0xb00 mm/filemap.c:2012
 __filemap_get_folio include/linux/pagemap.h:763 [inline]
 write_begin_get_folio include/linux/pagemap.h:789 [inline]
 ext4_write_begin+0x4ad/0x1890 fs/ext4/inode.c:1331
 generic_perform_write+0x2e2/0x8f0 mm/filemap.c:4325
 ext4_buffered_write_iter+0xce/0x3a0 fs/ext4/file.c:316
 ext4_file_write_iter+0x298/0x1bf0 fs/ext4/file.c:-1
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
page last free pid 5718 tgid 5718 stack trace:
 reset_page_owner include/linux/page_owner.h:25 [inline]
 __free_pages_prepare mm/page_alloc.c:1397 [inline]
 free_unref_folios+0xd9f/0x14c0 mm/page_alloc.c:2999
 folios_put_refs+0x9ff/0xb40 mm/swap.c:1008
 free_pages_and_swap_cache+0x41d/0x490 mm/swap_state.c:404
 __tlb_batch_free_encoded_pages mm/mmu_gather.c:138 [inline]
 tlb_batch_pages_flush mm/mmu_gather.c:151 [inline]
 tlb_flush_mmu_free mm/mmu_gather.c:417 [inline]
 tlb_flush_mmu+0x6d3/0xa30 mm/mmu_gather.c:424
 tlb_finish_mmu+0xf9/0x230 mm/mmu_gather.c:549
 exit_mmap+0x498/0x9e0 mm/mmap.c:1313
 __mmput+0x118/0x430 kernel/fork.c:1178
 exit_mm+0x1f6/0x2d0 kernel/exit.c:582
 do_exit+0x6a2/0x22c0 kernel/exit.c:964
 do_group_exit+0x21b/0x2d0 kernel/exit.c:1119
 get_signal+0x1284/0x1330 kernel/signal.c:3037
 arch_do_signal_or_restart+0xbc/0x840 arch/x86/kernel/signal.c:337
 __exit_to_user_mode_loop kernel/entry/common.c:64 [inline]
 exit_to_user_mode_loop+0xa9/0x680 kernel/entry/common.c:98
 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:207 [inline]
 syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:230 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:318 [inline]
 do_syscall_64+0x353/0x580 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Modules linked in:

CPU: 1 UID: 0 PID: 5841 Comm: syz.0.17 Tainted: G    B               syzkaller #0 PREEMPT(full) 
Tainted: [B]=BAD_PAGE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 bad_page+0x17f/0x1c0 mm/page_alloc.c:632
 free_page_is_bad mm/page_alloc.c:1076 [inline]
 __free_pages_prepare mm/page_alloc.c:1388 [inline]
 __free_frozen_pages+0xcd9/0xd30 mm/page_alloc.c:2938
 __folio_put+0x4a2/0x580 mm/swap.c:112
 __folio_split+0xffe/0x1570 mm/huge_memory.c:4199
 try_folio_split_to_order include/linux/huge_mm.h:411 [inline]
 try_folio_split_or_unmap+0x5b/0x1e0 mm/truncate.c:189
 truncate_inode_partial_folio+0x4ab/0x8e0 mm/truncate.c:255
 truncate_inode_pages_range+0x5f1/0xe30 mm/truncate.c:416
 ext4_truncate_failed_write fs/ext4/truncate.h:21 [inline]
 ext4_write_end+0x784/0xa30 fs/ext4/inode.c:1495
 generic_perform_write+0x620/0x8f0 mm/filemap.c:4346
 ext4_buffered_write_iter+0xce/0x3a0 fs/ext4/file.c:316
 ext4_file_write_iter+0x298/0x1bf0 fs/ext4/file.c:-1
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fb06359ce59
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fb0643ff028 EFLAGS: 00000246 ORIG_RAX: 0000000000000012
RAX: ffffffffffffffda RBX: 00007fb063815fa0 RCX: 00007fb06359ce59
RDX: 000000000000fdef RSI: 0000200000000140 RDI: 0000000000000004
RBP: 00007fb063632d6f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000c00 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fb063816038 R14: 00007fb063815fa0 R15: 00007ffe99cbba98
 </TASK>
BUG: Bad page state in process syz.0.17  pfn:11e234
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x4 pfn:0x11e234
head: order:0 mapcount:0 entire_mapcount:1 nr_pages_mapped:0 pincount:0
flags: 0x17ff20000000040(head|node=0|zone=2|lastcpupid=0x7ff)
raw: 017ff20000000040 0000000000000000 dead000000000122 0000000000000000
raw: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000
head: 017ff20000000040 0000000000000000 dead000000000122 0000000000000000
head: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000
head: 017ff00000000000 0000000000000000 00000000ffffffff 0000000000000000
head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
page_owner tracks the page as allocated
page last allocated via order 2, migratetype Movable, gfp_mask 0x153cca(GFP_HIGHUSER_MOVABLE|__GFP_WRITE|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 5841, tgid 5840 (syz.0.17), ts 75851604747, free_ts 72751484534
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
 prep_new_page mm/page_alloc.c:1861 [inline]
 get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
 __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
 alloc_pages_mpol+0x235/0x490 mm/mempolicy.c:2490
 alloc_frozen_pages_noprof mm/mempolicy.c:2561 [inline]
 alloc_pages_noprof+0xac/0x2a0 mm/mempolicy.c:2581
 folio_alloc_noprof+0x1e/0x30 mm/mempolicy.c:2591
 filemap_alloc_folio_noprof+0x111/0x470 mm/filemap.c:1014
 __filemap_get_folio_mpol+0x3fc/0xb00 mm/filemap.c:2012
 __filemap_get_folio include/linux/pagemap.h:763 [inline]
 write_begin_get_folio include/linux/pagemap.h:789 [inline]
 ext4_write_begin+0x4ad/0x1890 fs/ext4/inode.c:1331
 generic_perform_write+0x2e2/0x8f0 mm/filemap.c:4325
 ext4_buffered_write_iter+0xce/0x3a0 fs/ext4/file.c:316
 ext4_file_write_iter+0x298/0x1bf0 fs/ext4/file.c:-1
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
page last free pid 5718 tgid 5718 stack trace:
 reset_page_owner include/linux/page_owner.h:25 [inline]
 __free_pages_prepare mm/page_alloc.c:1397 [inline]
 free_unref_folios+0xd9f/0x14c0 mm/page_alloc.c:2999
 folios_put_refs+0x9ff/0xb40 mm/swap.c:1008
 free_pages_and_swap_cache+0x41d/0x490 mm/swap_state.c:404
 __tlb_batch_free_encoded_pages mm/mmu_gather.c:138 [inline]
 tlb_batch_pages_flush mm/mmu_gather.c:151 [inline]
 tlb_flush_mmu_free mm/mmu_gather.c:417 [inline]
 tlb_flush_mmu+0x6d3/0xa30 mm/mmu_gather.c:424
 tlb_finish_mmu+0xf9/0x230 mm/mmu_gather.c:549
 exit_mmap+0x498/0x9e0 mm/mmap.c:1313
 __mmput+0x118/0x430 kernel/fork.c:1178
 exit_mm+0x1f6/0x2d0 kernel/exit.c:582
 do_exit+0x6a2/0x22c0 kernel/exit.c:964
 do_group_exit+0x21b/0x2d0 kernel/exit.c:1119
 get_signal+0x1284/0x1330 kernel/signal.c:3037
 arch_do_signal_or_restart+0xbc/0x840 arch/x86/kernel/signal.c:337
 __exit_to_user_mode_loop kernel/entry/common.c:64 [inline]
 exit_to_user_mode_loop+0xa9/0x680 kernel/entry/common.c:98
 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:207 [inline]
 syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:230 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:318 [inline]
 do_syscall_64+0x353/0x580 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Modules linked in:
CPU: 1 UID: 0 PID: 5841 Comm: syz.0.17 Tainted: G    B               syzkaller #0 PREEMPT(full) 
Tainted: [B]=BAD_PAGE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 bad_page+0x17f/0x1c0 mm/page_alloc.c:632
 free_page_is_bad mm/page_alloc.c:1076 [inline]
 __free_pages_prepare mm/page_alloc.c:1388 [inline]
 __free_frozen_pages+0xcd9/0xd30 mm/page_alloc.c:2938
 __folio_put+0x4a2/0x580 mm/swap.c:112
 __folio_split+0xffe/0x1570 mm/huge_memory.c:4199
 try_folio_split_to_order include/linux/huge_mm.h:411 [inline]
 try_folio_split_or_unmap+0x5b/0x1e0 mm/truncate.c:189
 truncate_inode_partial_folio+0x4ab/0x8e0 mm/truncate.c:255
 truncate_inode_pages_range+0x5f1/0xe30 mm/truncate.c:416
 ext4_truncate_failed_write fs/ext4/truncate.h:21 [inline]
 ext4_write_end+0x784/0xa30 fs/ext4/inode.c:1495
 generic_perform_write+0x620/0x8f0 mm/filemap.c:4346
 ext4_buffered_write_iter+0xce/0x3a0 fs/ext4/file.c:316
 ext4_file_write_iter+0x298/0x1bf0 fs/ext4/file.c:-1
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fb06359ce59
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fb0643ff028 EFLAGS: 00000246 ORIG_RAX: 0000000000000012
RAX: ffffffffffffffda RBX: 00007fb063815fa0 RCX: 00007fb06359ce59
RDX: 000000000000fdef RSI: 0000200000000140 RDI: 0000000000000004
RBP: 00007fb063632d6f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000c00 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fb063816038 R14: 00007fb063815fa0 R15: 00007ffe99cbba98
 </TASK>


***

BUG: Bad page state in iomap_write_begin

tree:      mm-new
URL:       https://kernel.googlesource.com/pub/scm/linux/kernel/git/akpm/mm.git
base:      1ec3cca2d8b6b9ff6584ca626d4c8918bbf48d44
arch:      amd64
compiler:  Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
config:    https://ci.syzbot.org/builds/ffde37a3-aed0-4f49-bba1-ca31cd6a4b04/config
syz repro: https://ci.syzbot.org/findings/8030d7fe-0d2e-4e47-ab50-b1211533d9c1/syz_repro

XFS (loop0): Mounting V5 Filesystem d7dc424e-7990-42cb-9f91-9cb7200a101d
XFS (loop0): Ending clean mount
BUG: Bad page state in process syz.0.17  pfn:1a6481
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x8081 pfn:0x1a6481
flags: 0x57ff20000000000(node=1|zone=2|lastcpupid=0x7ff)
raw: 057ff20000000000 0000000000000000 00000000ffffffff 0000000000000000
raw: 0000000000008081 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Movable, gfp_mask 0x153c4a(GFP_NOFS|__GFP_HIGHMEM|__GFP_MOVABLE|__GFP_WRITE|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_HARDWALL), pid 5877, tgid 5876 (syz.0.17), ts 79178255762, free_ts 72347127723
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
 prep_new_page mm/page_alloc.c:1861 [inline]
 get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
 __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
 alloc_pages_mpol+0x235/0x490 mm/mempolicy.c:2490
 alloc_frozen_pages_noprof mm/mempolicy.c:2561 [inline]
 alloc_pages_noprof+0xac/0x2a0 mm/mempolicy.c:2581
 folio_alloc_noprof+0x1e/0x30 mm/mempolicy.c:2591
 filemap_alloc_folio_noprof+0x111/0x470 mm/filemap.c:1014
 __filemap_get_folio_mpol+0x3fc/0xb00 mm/filemap.c:2012
 __filemap_get_folio include/linux/pagemap.h:763 [inline]
 iomap_get_folio fs/iomap/buffered-io.c:725 [inline]
 __iomap_get_folio fs/iomap/buffered-io.c:896 [inline]
 iomap_write_begin+0x6d9/0x14f0 fs/iomap/buffered-io.c:960
 iomap_write_iter fs/iomap/buffered-io.c:1144 [inline]
 iomap_file_buffered_write+0x47a/0xb30 fs/iomap/buffered-io.c:1225
 xfs_file_buffered_write+0x212/0x8c0 fs/xfs/xfs_file.c:1056
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
page last free pid 5710 tgid 5710 stack trace:
 reset_page_owner include/linux/page_owner.h:25 [inline]
 __free_pages_prepare mm/page_alloc.c:1397 [inline]
 free_unref_folios+0xd9f/0x14c0 mm/page_alloc.c:2999
 folios_put_refs+0x9ff/0xb40 mm/swap.c:1008
 free_pages_and_swap_cache+0x2b9/0x490 mm/swap_state.c:401
 __tlb_batch_free_encoded_pages mm/mmu_gather.c:138 [inline]
 tlb_batch_pages_flush mm/mmu_gather.c:151 [inline]
 tlb_flush_mmu_free mm/mmu_gather.c:417 [inline]
 tlb_flush_mmu+0x6d3/0xa30 mm/mmu_gather.c:424
 tlb_finish_mmu+0xf9/0x230 mm/mmu_gather.c:549
 exit_mmap+0x498/0x9e0 mm/mmap.c:1313
 __mmput+0x118/0x430 kernel/fork.c:1178
 exit_mm+0x1f6/0x2d0 kernel/exit.c:582
 do_exit+0x6a2/0x22c0 kernel/exit.c:964
 do_group_exit+0x21b/0x2d0 kernel/exit.c:1119
 get_signal+0x1284/0x1330 kernel/signal.c:3037
 arch_do_signal_or_restart+0xbc/0x840 arch/x86/kernel/signal.c:337
 __exit_to_user_mode_loop kernel/entry/common.c:64 [inline]
 exit_to_user_mode_loop+0xa9/0x680 kernel/entry/common.c:98
 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:207 [inline]
 syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:230 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:318 [inline]
 do_syscall_64+0x353/0x580 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Modules linked in:
CPU: 1 UID: 0 PID: 5877 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 bad_page+0x17f/0x1c0 mm/page_alloc.c:632
 free_page_is_bad mm/page_alloc.c:1076 [inline]
 __free_pages_prepare mm/page_alloc.c:1388 [inline]
 __free_frozen_pages+0xcd9/0xd30 mm/page_alloc.c:2938
 __folio_put+0x4a2/0x580 mm/swap.c:112
 __folio_split+0xffe/0x1570 mm/huge_memory.c:4199
 try_folio_split_to_order include/linux/huge_mm.h:411 [inline]
 try_folio_split_or_unmap+0x5b/0x1e0 mm/truncate.c:189
 truncate_inode_partial_folio+0x4ab/0x8e0 mm/truncate.c:255
 truncate_inode_pages_range+0x5f1/0xe30 mm/truncate.c:416
 iomap_write_failed fs/iomap/buffered-io.c:785 [inline]
 iomap_write_iter fs/iomap/buffered-io.c:1187 [inline]
 iomap_file_buffered_write+0x788/0xb30 fs/iomap/buffered-io.c:1225
 xfs_file_buffered_write+0x212/0x8c0 fs/xfs/xfs_file.c:1056
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f3f0719ce59
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f3f07ff7028 EFLAGS: 00000246 ORIG_RAX: 0000000000000012
RAX: ffffffffffffffda RBX: 00007f3f07415fa0 RCX: 00007f3f0719ce59
RDX: 00000000ffffffb7 RSI: 0000200000000040 RDI: 0000000000000004
RBP: 00007f3f07232d6f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000008080c61 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f3f07416038 R14: 00007f3f07415fa0 R15: 00007ffe415e9148
 </TASK>
BUG: Bad page state in process syz.0.17  pfn:1a6482
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x8082 pfn:0x1a6482
head: order:0 mapcount:0 entire_mapcount:1 nr_pages_mapped:0 pincount:0
flags: 0x57ff20000000040(head|node=1|zone=2|lastcpupid=0x7ff)
raw: 057ff20000000040 0000000000000000 ffffea0006992090 0000000000000000
raw: 0000000000008082 0000000000000000 00000000ffffffff 0000000000000000
head: 057ff20000000040 0000000000000000 ffffea0006992090 0000000000000000
head: 0000000000008082 0000000000000000 00000000ffffffff 0000000000000000
head: 057ff00000000000 0000000000000000 00000000ffffffff 0000000000000000
head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
page_owner tracks the page as allocated
page last allocated via order 1, migratetype Movable, gfp_mask 0x153c4a(GFP_NOFS|__GFP_HIGHMEM|__GFP_MOVABLE|__GFP_WRITE|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_HARDWALL), pid 5877, tgid 5876 (syz.0.17), ts 79178255762, free_ts 72347116236
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
 prep_new_page mm/page_alloc.c:1861 [inline]
 get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
 __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
 alloc_pages_mpol+0x235/0x490 mm/mempolicy.c:2490
 alloc_frozen_pages_noprof mm/mempolicy.c:2561 [inline]
 alloc_pages_noprof+0xac/0x2a0 mm/mempolicy.c:2581
 folio_alloc_noprof+0x1e/0x30 mm/mempolicy.c:2591
 filemap_alloc_folio_noprof+0x111/0x470 mm/filemap.c:1014
 __filemap_get_folio_mpol+0x3fc/0xb00 mm/filemap.c:2012
 __filemap_get_folio include/linux/pagemap.h:763 [inline]
 iomap_get_folio fs/iomap/buffered-io.c:725 [inline]
 __iomap_get_folio fs/iomap/buffered-io.c:896 [inline]
 iomap_write_begin+0x6d9/0x14f0 fs/iomap/buffered-io.c:960
 iomap_write_iter fs/iomap/buffered-io.c:1144 [inline]
 iomap_file_buffered_write+0x47a/0xb30 fs/iomap/buffered-io.c:1225
 xfs_file_buffered_write+0x212/0x8c0 fs/xfs/xfs_file.c:1056
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
page last free pid 5710 tgid 5710 stack trace:
 reset_page_owner include/linux/page_owner.h:25 [inline]
 __free_pages_prepare mm/page_alloc.c:1397 [inline]
 free_unref_folios+0xd9f/0x14c0 mm/page_alloc.c:2999
 folios_put_refs+0x9ff/0xb40 mm/swap.c:1008
 free_pages_and_swap_cache+0x2b9/0x490 mm/swap_state.c:401
 __tlb_batch_free_encoded_pages mm/mmu_gather.c:138 [inline]
 tlb_batch_pages_flush mm/mmu_gather.c:151 [inline]
 tlb_flush_mmu_free mm/mmu_gather.c:417 [inline]
 tlb_flush_mmu+0x6d3/0xa30 mm/mmu_gather.c:424
 tlb_finish_mmu+0xf9/0x230 mm/mmu_gather.c:549
 exit_mmap+0x498/0x9e0 mm/mmap.c:1313
 __mmput+0x118/0x430 kernel/fork.c:1178
 exit_mm+0x1f6/0x2d0 kernel/exit.c:582
 do_exit+0x6a2/0x22c0 kernel/exit.c:964
 do_group_exit+0x21b/0x2d0 kernel/exit.c:1119
 get_signal+0x1284/0x1330 kernel/signal.c:3037
 arch_do_signal_or_restart+0xbc/0x840 arch/x86/kernel/signal.c:337
 __exit_to_user_mode_loop kernel/entry/common.c:64 [inline]
 exit_to_user_mode_loop+0xa9/0x680 kernel/entry/common.c:98
 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:207 [inline]
 syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:230 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:318 [inline]
 do_syscall_64+0x353/0x580 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Modules linked in:
CPU: 0 UID: 0 PID: 5877 Comm: syz.0.17 Tainted: G    B               syzkaller #0 PREEMPT(full) 
Tainted: [B]=BAD_PAGE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 bad_page+0x17f/0x1c0 mm/page_alloc.c:632
 free_page_is_bad mm/page_alloc.c:1076 [inline]
 __free_pages_prepare mm/page_alloc.c:1388 [inline]
 __free_frozen_pages+0xcd9/0xd30 mm/page_alloc.c:2938
 __folio_put+0x4a2/0x580 mm/swap.c:112
 __folio_split+0xffe/0x1570 mm/huge_memory.c:4199
 try_folio_split_to_order include/linux/huge_mm.h:411 [inline]
 try_folio_split_or_unmap+0x5b/0x1e0 mm/truncate.c:189
 truncate_inode_partial_folio+0x4ab/0x8e0 mm/truncate.c:255
 truncate_inode_pages_range+0x5f1/0xe30 mm/truncate.c:416
 iomap_write_failed fs/iomap/buffered-io.c:785 [inline]
 iomap_write_iter fs/iomap/buffered-io.c:1187 [inline]
 iomap_file_buffered_write+0x788/0xb30 fs/iomap/buffered-io.c:1225
 xfs_file_buffered_write+0x212/0x8c0 fs/xfs/xfs_file.c:1056
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f3f0719ce59
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f3f07ff7028 EFLAGS: 00000246 ORIG_RAX: 0000000000000012
RAX: ffffffffffffffda RBX: 00007f3f07415fa0 RCX: 00007f3f0719ce59
RDX: 00000000ffffffb7 RSI: 0000200000000040 RDI: 0000000000000004
RBP: 00007f3f07232d6f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000008080c61 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f3f07416038 R14: 00007f3f07415fa0 R15: 00007ffe415e9148
 </TASK>
BUG: Bad page state in process syz.0.17  pfn:1a6484
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x8084 pfn:0x1a6484
head: order:0 mapcount:0 entire_mapcount:1 nr_pages_mapped:0 pincount:0
flags: 0x57ff20000000040(head|node=1|zone=2|lastcpupid=0x7ff)
raw: 057ff20000000040 0000000000000000 dead000000000122 0000000000000000
raw: 0000000000008084 0000000000000000 00000000ffffffff 0000000000000000
head: 057ff20000000040 0000000000000000 dead000000000122 0000000000000000
head: 0000000000008084 0000000000000000 00000000ffffffff 0000000000000000
head: 057ff00000000000 0000000000000000 00000000ffffffff 0000000000000000
head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
page_owner tracks the page as allocated
page last allocated via order 2, migratetype Movable, gfp_mask 0x153c4a(GFP_NOFS|__GFP_HIGHMEM|__GFP_MOVABLE|__GFP_WRITE|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_HARDWALL), pid 5877, tgid 5876 (syz.0.17), ts 79178255762, free_ts 72347038008
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
 prep_new_page mm/page_alloc.c:1861 [inline]
 get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
 __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
 alloc_pages_mpol+0x235/0x490 mm/mempolicy.c:2490
 alloc_frozen_pages_noprof mm/mempolicy.c:2561 [inline]
 alloc_pages_noprof+0xac/0x2a0 mm/mempolicy.c:2581
 folio_alloc_noprof+0x1e/0x30 mm/mempolicy.c:2591
 filemap_alloc_folio_noprof+0x111/0x470 mm/filemap.c:1014
 __filemap_get_folio_mpol+0x3fc/0xb00 mm/filemap.c:2012
 __filemap_get_folio include/linux/pagemap.h:763 [inline]
 iomap_get_folio fs/iomap/buffered-io.c:725 [inline]
 __iomap_get_folio fs/iomap/buffered-io.c:896 [inline]
 iomap_write_begin+0x6d9/0x14f0 fs/iomap/buffered-io.c:960
 iomap_write_iter fs/iomap/buffered-io.c:1144 [inline]
 iomap_file_buffered_write+0x47a/0xb30 fs/iomap/buffered-io.c:1225
 xfs_file_buffered_write+0x212/0x8c0 fs/xfs/xfs_file.c:1056
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
page last free pid 5710 tgid 5710 stack trace:
 reset_page_owner include/linux/page_owner.h:25 [inline]
 __free_pages_prepare mm/page_alloc.c:1397 [inline]
 free_unref_folios+0xd9f/0x14c0 mm/page_alloc.c:2999
 folios_put_refs+0x9ff/0xb40 mm/swap.c:1008
 free_pages_and_swap_cache+0x2b9/0x490 mm/swap_state.c:401
 __tlb_batch_free_encoded_pages mm/mmu_gather.c:138 [inline]
 tlb_batch_pages_flush mm/mmu_gather.c:151 [inline]
 tlb_flush_mmu_free mm/mmu_gather.c:417 [inline]
 tlb_flush_mmu+0x6d3/0xa30 mm/mmu_gather.c:424
 tlb_finish_mmu+0xf9/0x230 mm/mmu_gather.c:549
 exit_mmap+0x498/0x9e0 mm/mmap.c:1313
 __mmput+0x118/0x430 kernel/fork.c:1178
 exit_mm+0x1f6/0x2d0 kernel/exit.c:582
 do_exit+0x6a2/0x22c0 kernel/exit.c:964
 do_group_exit+0x21b/0x2d0 kernel/exit.c:1119
 get_signal+0x1284/0x1330 kernel/signal.c:3037
 arch_do_signal_or_restart+0xbc/0x840 arch/x86/kernel/signal.c:337
 __exit_to_user_mode_loop kernel/entry/common.c:64 [inline]
 exit_to_user_mode_loop+0xa9/0x680 kernel/entry/common.c:98
 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:207 [inline]
 syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:230 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:318 [inline]
 do_syscall_64+0x353/0x580 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Modules linked in:
CPU: 0 UID: 0 PID: 5877 Comm: syz.0.17 Tainted: G    B               syzkaller #0 PREEMPT(full) 
Tainted: [B]=BAD_PAGE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 bad_page+0x17f/0x1c0 mm/page_alloc.c:632
 free_page_is_bad mm/page_alloc.c:1076 [inline]
 __free_pages_prepare mm/page_alloc.c:1388 [inline]
 __free_frozen_pages+0xcd9/0xd30 mm/page_alloc.c:2938
 __folio_put+0x4a2/0x580 mm/swap.c:112
 __folio_split+0xffe/0x1570 mm/huge_memory.c:4199
 try_folio_split_to_order include/linux/huge_mm.h:411 [inline]
 try_folio_split_or_unmap+0x5b/0x1e0 mm/truncate.c:189
 truncate_inode_partial_folio+0x4ab/0x8e0 mm/truncate.c:255
 truncate_inode_pages_range+0x5f1/0xe30 mm/truncate.c:416
 iomap_write_failed fs/iomap/buffered-io.c:785 [inline]
 iomap_write_iter fs/iomap/buffered-io.c:1187 [inline]
 iomap_file_buffered_write+0x788/0xb30 fs/iomap/buffered-io.c:1225
 xfs_file_buffered_write+0x212/0x8c0 fs/xfs/xfs_file.c:1056
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f3f0719ce59
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f3f07ff7028 EFLAGS: 00000246 ORIG_RAX: 0000000000000012
RAX: ffffffffffffffda RBX: 00007f3f07415fa0 RCX: 00007f3f0719ce59
RDX: 00000000ffffffb7 RSI: 0000200000000040 RDI: 0000000000000004
RBP: 00007f3f07232d6f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000008080c61 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f3f07416038 R14: 00007f3f07415fa0 R15: 00007ffe415e9148
 </TASK>
BUG: Bad page state in process syz.0.17  pfn:1a6488
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x8088 pfn:0x1a6488
head: order:0 mapcount:0 entire_mapcount:1 nr_pages_mapped:0 pincount:0
flags: 0x57ff20000000040(head|node=1|zone=2|lastcpupid=0x7ff)
raw: 057ff20000000040 0000000000000000 dead000000000122 0000000000000000
raw: 0000000000008088 0000000000000000 00000000ffffffff 0000000000000000
head: 057ff20000000040 0000000000000000 dead000000000122 0000000000000000
head: 0000000000008088 0000000000000000 00000000ffffffff 0000000000000000
head: 057ff00000000000 0000000000000000 00000000ffffffff 0000000000000000
head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
page_owner tracks the page as allocated
page last allocated via order 3, migratetype Movable, gfp_mask 0x153c4a(GFP_NOFS|__GFP_HIGHMEM|__GFP_MOVABLE|__GFP_WRITE|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_HARDWALL), pid 5877, tgid 5876 (syz.0.17), ts 79178255762, free_ts 72346997158
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
 prep_new_page mm/page_alloc.c:1861 [inline]
 get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
 __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
 alloc_pages_mpol+0x235/0x490 mm/mempolicy.c:2490
 alloc_frozen_pages_noprof mm/mempolicy.c:2561 [inline]
 alloc_pages_noprof+0xac/0x2a0 mm/mempolicy.c:2581
 folio_alloc_noprof+0x1e/0x30 mm/mempolicy.c:2591
 filemap_alloc_folio_noprof+0x111/0x470 mm/filemap.c:1014
 __filemap_get_folio_mpol+0x3fc/0xb00 mm/filemap.c:2012
 __filemap_get_folio include/linux/pagemap.h:763 [inline]
 iomap_get_folio fs/iomap/buffered-io.c:725 [inline]
 __iomap_get_folio fs/iomap/buffered-io.c:896 [inline]
 iomap_write_begin+0x6d9/0x14f0 fs/iomap/buffered-io.c:960
 iomap_write_iter fs/iomap/buffered-io.c:1144 [inline]
 iomap_file_buffered_write+0x47a/0xb30 fs/iomap/buffered-io.c:1225
 xfs_file_buffered_write+0x212/0x8c0 fs/xfs/xfs_file.c:1056
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
page last free pid 5710 tgid 5710 stack trace:
 reset_page_owner include/linux/page_owner.h:25 [inline]
 __free_pages_prepare mm/page_alloc.c:1397 [inline]
 free_unref_folios+0xd9f/0x14c0 mm/page_alloc.c:2999
 folios_put_refs+0x9ff/0xb40 mm/swap.c:1008
 free_pages_and_swap_cache+0x2b9/0x490 mm/swap_state.c:401
 __tlb_batch_free_encoded_pages mm/mmu_gather.c:138 [inline]
 tlb_batch_pages_flush mm/mmu_gather.c:151 [inline]
 tlb_flush_mmu_free mm/mmu_gather.c:417 [inline]
 tlb_flush_mmu+0x6d3/0xa30 mm/mmu_gather.c:424
 tlb_finish_mmu+0xf9/0x230 mm/mmu_gather.c:549
 exit_mmap+0x498/0x9e0 mm/mmap.c:1313
 __mmput+0x118/0x430 kernel/fork.c:1178
 exit_mm+0x1f6/0x2d0 kernel/exit.c:582
 do_exit+0x6a2/0x22c0 kernel/exit.c:964
 do_group_exit+0x21b/0x2d0 kernel/exit.c:1119
 get_signal+0x1284/0x1330 kernel/signal.c:3037
 arch_do_signal_or_restart+0xbc/0x840 arch/x86/kernel/signal.c:337
 __exit_to_user_mode_loop kernel/entry/common.c:64 [inline]
 exit_to_user_mode_loop+0xa9/0x680 kernel/entry/common.c:98
 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:207 [inline]
 syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:230 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:318 [inline]
 do_syscall_64+0x353/0x580 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Modules linked in:

CPU: 1 UID: 0 PID: 5877 Comm: syz.0.17 Tainted: G    B               syzkaller #0 PREEMPT(full) 
Tainted: [B]=BAD_PAGE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 bad_page+0x17f/0x1c0 mm/page_alloc.c:632
 free_page_is_bad mm/page_alloc.c:1076 [inline]
 __free_pages_prepare mm/page_alloc.c:1388 [inline]
 __free_frozen_pages+0xcd9/0xd30 mm/page_alloc.c:2938
 __folio_put+0x4a2/0x580 mm/swap.c:112
 __folio_split+0xffe/0x1570 mm/huge_memory.c:4199
 try_folio_split_to_order include/linux/huge_mm.h:411 [inline]
 try_folio_split_or_unmap+0x5b/0x1e0 mm/truncate.c:189
 truncate_inode_partial_folio+0x4ab/0x8e0 mm/truncate.c:255
 truncate_inode_pages_range+0x5f1/0xe30 mm/truncate.c:416
 iomap_write_failed fs/iomap/buffered-io.c:785 [inline]
 iomap_write_iter fs/iomap/buffered-io.c:1187 [inline]
 iomap_file_buffered_write+0x788/0xb30 fs/iomap/buffered-io.c:1225
 xfs_file_buffered_write+0x212/0x8c0 fs/xfs/xfs_file.c:1056
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f3f0719ce59
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f3f07ff7028 EFLAGS: 00000246 ORIG_RAX: 0000000000000012
RAX: ffffffffffffffda RBX: 00007f3f07415fa0 RCX: 00007f3f0719ce59
RDX: 00000000ffffffb7 RSI: 0000200000000040 RDI: 0000000000000004
RBP: 00007f3f07232d6f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000008080c61 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f3f07416038 R14: 00007f3f07415fa0 R15: 00007ffe415e9148
 </TASK>
BUG: Bad page state in process syz.0.17  pfn:1a6490
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x8090 pfn:0x1a6490
head: order:0 mapcount:0 entire_mapcount:1 nr_pages_mapped:0 pincount:0
flags: 0x57ff20000000040(head|node=1|zone=2|lastcpupid=0x7ff)
raw: 057ff20000000040 0000000000000000 dead000000000122 0000000000000000
raw: 0000000000008090 0000000000000000 00000000ffffffff 0000000000000000
head: 057ff20000000040 0000000000000000 dead000000000122 0000000000000000
head: 0000000000008090 0000000000000000 00000000ffffffff 0000000000000000
head: 057ff00000000000 0000000000000000 00000000ffffffff 0000000000000000
head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
page_owner tracks the page as allocated
page last allocated via order 4, migratetype Movable, gfp_mask 0x153c4a(GFP_NOFS|__GFP_HIGHMEM|__GFP_MOVABLE|__GFP_WRITE|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_HARDWALL), pid 5877, tgid 5876 (syz.0.17), ts 79178255762, free_ts 72346919466
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
 prep_new_page mm/page_alloc.c:1861 [inline]
 get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
 __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
 alloc_pages_mpol+0x235/0x490 mm/mempolicy.c:2490
 alloc_frozen_pages_noprof mm/mempolicy.c:2561 [inline]
 alloc_pages_noprof+0xac/0x2a0 mm/mempolicy.c:2581
 folio_alloc_noprof+0x1e/0x30 mm/mempolicy.c:2591
 filemap_alloc_folio_noprof+0x111/0x470 mm/filemap.c:1014
 __filemap_get_folio_mpol+0x3fc/0xb00 mm/filemap.c:2012
 __filemap_get_folio include/linux/pagemap.h:763 [inline]
 iomap_get_folio fs/iomap/buffered-io.c:725 [inline]
 __iomap_get_folio fs/iomap/buffered-io.c:896 [inline]
 iomap_write_begin+0x6d9/0x14f0 fs/iomap/buffered-io.c:960
 iomap_write_iter fs/iomap/buffered-io.c:1144 [inline]
 iomap_file_buffered_write+0x47a/0xb30 fs/iomap/buffered-io.c:1225
 xfs_file_buffered_write+0x212/0x8c0 fs/xfs/xfs_file.c:1056
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
page last free pid 5710 tgid 5710 stack trace:
 reset_page_owner include/linux/page_owner.h:25 [inline]
 __free_pages_prepare mm/page_alloc.c:1397 [inline]
 free_unref_folios+0xd9f/0x14c0 mm/page_alloc.c:2999
 folios_put_refs+0x9ff/0xb40 mm/swap.c:1008
 free_pages_and_swap_cache+0x2b9/0x490 mm/swap_state.c:401
 __tlb_batch_free_encoded_pages mm/mmu_gather.c:138 [inline]
 tlb_batch_pages_flush mm/mmu_gather.c:151 [inline]
 tlb_flush_mmu_free mm/mmu_gather.c:417 [inline]
 tlb_flush_mmu+0x6d3/0xa30 mm/mmu_gather.c:424
 tlb_finish_mmu+0xf9/0x230 mm/mmu_gather.c:549
 exit_mmap+0x498/0x9e0 mm/mmap.c:1313
 __mmput+0x118/0x430 kernel/fork.c:1178
 exit_mm+0x1f6/0x2d0 kernel/exit.c:582
 do_exit+0x6a2/0x22c0 kernel/exit.c:964
 do_group_exit+0x21b/0x2d0 kernel/exit.c:1119
 get_signal+0x1284/0x1330 kernel/signal.c:3037
 arch_do_signal_or_restart+0xbc/0x840 arch/x86/kernel/signal.c:337
 __exit_to_user_mode_loop kernel/entry/common.c:64 [inline]
 exit_to_user_mode_loop+0xa9/0x680 kernel/entry/common.c:98
 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:207 [inline]
 syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:230 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:318 [inline]
 do_syscall_64+0x353/0x580 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Modules linked in:
CPU: 1 UID: 0 PID: 5877 Comm: syz.0.17 Tainted: G    B               syzkaller #0 PREEMPT(full) 
Tainted: [B]=BAD_PAGE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 bad_page+0x17f/0x1c0 mm/page_alloc.c:632
 free_page_is_bad mm/page_alloc.c:1076 [inline]
 __free_pages_prepare mm/page_alloc.c:1388 [inline]
 __free_pages_ok+0xb8c/0xbd0 mm/page_alloc.c:1578
 __folio_put+0x4a2/0x580 mm/swap.c:112
 __folio_split+0xffe/0x1570 mm/huge_memory.c:4199
 try_folio_split_to_order include/linux/huge_mm.h:411 [inline]
 try_folio_split_or_unmap+0x5b/0x1e0 mm/truncate.c:189
 truncate_inode_partial_folio+0x4ab/0x8e0 mm/truncate.c:255
 truncate_inode_pages_range+0x5f1/0xe30 mm/truncate.c:416
 iomap_write_failed fs/iomap/buffered-io.c:785 [inline]
 iomap_write_iter fs/iomap/buffered-io.c:1187 [inline]
 iomap_file_buffered_write+0x788/0xb30 fs/iomap/buffered-io.c:1225
 xfs_file_buffered_write+0x212/0x8c0 fs/xfs/xfs_file.c:1056
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f3f0719ce59
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f3f07ff7028 EFLAGS: 00000246 ORIG_RAX: 0000000000000012
RAX: ffffffffffffffda RBX: 00007f3f07415fa0 RCX: 00007f3f0719ce59
RDX: 00000000ffffffb7 RSI: 0000200000000040 RDI: 0000000000000004
RBP: 00007f3f07232d6f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000008080c61 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f3f07416038 R14: 00007f3f07415fa0 R15: 00007ffe415e9148
 </TASK>
BUG: Bad page state in process syz.0.17  pfn:1a64a0
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x80a0 pfn:0x1a64a0
head: order:0 mapcount:0 entire_mapcount:1 nr_pages_mapped:0 pincount:0
flags: 0x57ff20000000040(head|node=1|zone=2|lastcpupid=0x7ff)
raw: 057ff20000000040 0000000000000000 dead000000000122 0000000000000000
raw: 00000000000080a0 0000000000000000 00000000ffffffff 0000000000000000
head: 057ff20000000040 0000000000000000 dead000000000122 0000000000000000
head: 00000000000080a0 0000000000000000 00000000ffffffff 0000000000000000
head: 057ff00000000000 0000000000000000 00000000ffffffff 0000000000000000
head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
page_owner tracks the page as allocated
page last allocated via order 5, migratetype Movable, gfp_mask 0x153c4a(GFP_NOFS|__GFP_HIGHMEM|__GFP_MOVABLE|__GFP_WRITE|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_HARDWALL), pid 5877, tgid 5876 (syz.0.17), ts 79178255762, free_ts 72346647882
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
 prep_new_page mm/page_alloc.c:1861 [inline]
 get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
 __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
 alloc_pages_mpol+0x235/0x490 mm/mempolicy.c:2490
 alloc_frozen_pages_noprof mm/mempolicy.c:2561 [inline]
 alloc_pages_noprof+0xac/0x2a0 mm/mempolicy.c:2581
 folio_alloc_noprof+0x1e/0x30 mm/mempolicy.c:2591
 filemap_alloc_folio_noprof+0x111/0x470 mm/filemap.c:1014
 __filemap_get_folio_mpol+0x3fc/0xb00 mm/filemap.c:2012
 __filemap_get_folio include/linux/pagemap.h:763 [inline]
 iomap_get_folio fs/iomap/buffered-io.c:725 [inline]
 __iomap_get_folio fs/iomap/buffered-io.c:896 [inline]
 iomap_write_begin+0x6d9/0x14f0 fs/iomap/buffered-io.c:960
 iomap_write_iter fs/iomap/buffered-io.c:1144 [inline]
 iomap_file_buffered_write+0x47a/0xb30 fs/iomap/buffered-io.c:1225
 xfs_file_buffered_write+0x212/0x8c0 fs/xfs/xfs_file.c:1056
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
page last free pid 5710 tgid 5710 stack trace:
 reset_page_owner include/linux/page_owner.h:25 [inline]
 __free_pages_prepare mm/page_alloc.c:1397 [inline]
 free_unref_folios+0xd9f/0x14c0 mm/page_alloc.c:2999
 folios_put_refs+0x9ff/0xb40 mm/swap.c:1008
 free_pages_and_swap_cache+0x2b9/0x490 mm/swap_state.c:401
 __tlb_batch_free_encoded_pages mm/mmu_gather.c:138 [inline]
 tlb_batch_pages_flush mm/mmu_gather.c:151 [inline]
 tlb_flush_mmu_free mm/mmu_gather.c:417 [inline]
 tlb_flush_mmu+0x6d3/0xa30 mm/mmu_gather.c:424
 tlb_finish_mmu+0xf9/0x230 mm/mmu_gather.c:549
 exit_mmap+0x498/0x9e0 mm/mmap.c:1313
 __mmput+0x118/0x430 kernel/fork.c:1178
 exit_mm+0x1f6/0x2d0 kernel/exit.c:582
 do_exit+0x6a2/0x22c0 kernel/exit.c:964
 do_group_exit+0x21b/0x2d0 kernel/exit.c:1119
 get_signal+0x1284/0x1330 kernel/signal.c:3037
 arch_do_signal_or_restart+0xbc/0x840 arch/x86/kernel/signal.c:337
 __exit_to_user_mode_loop kernel/entry/common.c:64 [inline]
 exit_to_user_mode_loop+0xa9/0x680 kernel/entry/common.c:98
 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:207 [inline]
 syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:230 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:318 [inline]
 do_syscall_64+0x353/0x580 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Modules linked in:
CPU: 0 UID: 0 PID: 5877 Comm: syz.0.17 Tainted: G    B               syzkaller #0 PREEMPT(full) 
Tainted: [B]=BAD_PAGE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 bad_page+0x17f/0x1c0 mm/page_alloc.c:632
 free_page_is_bad mm/page_alloc.c:1076 [inline]
 __free_pages_prepare mm/page_alloc.c:1388 [inline]
 __free_pages_ok+0xb8c/0xbd0 mm/page_alloc.c:1578
 __folio_put+0x4a2/0x580 mm/swap.c:112
 __folio_split+0xffe/0x1570 mm/huge_memory.c:4199
 try_folio_split_to_order include/linux/huge_mm.h:411 [inline]
 try_folio_split_or_unmap+0x5b/0x1e0 mm/truncate.c:189
 truncate_inode_partial_folio+0x4ab/0x8e0 mm/truncate.c:255
 truncate_inode_pages_range+0x5f1/0xe30 mm/truncate.c:416
 iomap_write_failed fs/iomap/buffered-io.c:785 [inline]
 iomap_write_iter fs/iomap/buffered-io.c:1187 [inline]
 iomap_file_buffered_write+0x788/0xb30 fs/iomap/buffered-io.c:1225
 xfs_file_buffered_write+0x212/0x8c0 fs/xfs/xfs_file.c:1056
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f3f0719ce59
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f3f07ff7028 EFLAGS: 00000246 ORIG_RAX: 0000000000000012
RAX: ffffffffffffffda RBX: 00007f3f07415fa0 RCX: 00007f3f0719ce59
RDX: 00000000ffffffb7 RSI: 0000200000000040 RDI: 0000000000000004
RBP: 00007f3f07232d6f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000008080c61 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f3f07416038 R14: 00007f3f07415fa0 R15: 00007ffe415e9148
 </TASK>
BUG: Bad page state in process syz.0.17  pfn:1a64c0
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x80c0 pfn:0x1a64c0
head: order:0 mapcount:0 entire_mapcount:1 nr_pages_mapped:0 pincount:0
flags: 0x57ff20000000040(head|node=1|zone=2|lastcpupid=0x7ff)
raw: 057ff20000000040 0000000000000000 dead000000000122 0000000000000000
raw: 00000000000080c0 0000000000000000 00000000ffffffff 0000000000000000
head: 057ff20000000040 0000000000000000 dead000000000122 0000000000000000
head: 00000000000080c0 0000000000000000 00000000ffffffff 0000000000000000
head: 057ff00000000000 0000000000000000 00000000ffffffff 0000000000000000
head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
page_owner tracks the page as allocated
page last allocated via order 6, migratetype Movable, gfp_mask 0x153c4a(GFP_NOFS|__GFP_HIGHMEM|__GFP_MOVABLE|__GFP_WRITE|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_HARDWALL), pid 5877, tgid 5876 (syz.0.17), ts 79178255762, free_ts 72346190117
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
 prep_new_page mm/page_alloc.c:1861 [inline]
 get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
 __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
 alloc_pages_mpol+0x235/0x490 mm/mempolicy.c:2490
 alloc_frozen_pages_noprof mm/mempolicy.c:2561 [inline]
 alloc_pages_noprof+0xac/0x2a0 mm/mempolicy.c:2581
 folio_alloc_noprof+0x1e/0x30 mm/mempolicy.c:2591
 filemap_alloc_folio_noprof+0x111/0x470 mm/filemap.c:1014
 __filemap_get_folio_mpol+0x3fc/0xb00 mm/filemap.c:2012
 __filemap_get_folio include/linux/pagemap.h:763 [inline]
 iomap_get_folio fs/iomap/buffered-io.c:725 [inline]
 __iomap_get_folio fs/iomap/buffered-io.c:896 [inline]
 iomap_write_begin+0x6d9/0x14f0 fs/iomap/buffered-io.c:960
 iomap_write_iter fs/iomap/buffered-io.c:1144 [inline]
 iomap_file_buffered_write+0x47a/0xb30 fs/iomap/buffered-io.c:1225
 xfs_file_buffered_write+0x212/0x8c0 fs/xfs/xfs_file.c:1056
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
page last free pid 5710 tgid 5710 stack trace:
 reset_page_owner include/linux/page_owner.h:25 [inline]
 __free_pages_prepare mm/page_alloc.c:1397 [inline]
 free_unref_folios+0xd9f/0x14c0 mm/page_alloc.c:2999
 folios_put_refs+0x9ff/0xb40 mm/swap.c:1008
 free_pages_and_swap_cache+0x2b9/0x490 mm/swap_state.c:401
 __tlb_batch_free_encoded_pages mm/mmu_gather.c:138 [inline]
 tlb_batch_pages_flush mm/mmu_gather.c:151 [inline]
 tlb_flush_mmu_free mm/mmu_gather.c:417 [inline]
 tlb_flush_mmu+0x6d3/0xa30 mm/mmu_gather.c:424
 tlb_finish_mmu+0xf9/0x230 mm/mmu_gather.c:549
 exit_mmap+0x498/0x9e0 mm/mmap.c:1313
 __mmput+0x118/0x430 kernel/fork.c:1178
 exit_mm+0x1f6/0x2d0 kernel/exit.c:582
 do_exit+0x6a2/0x22c0 kernel/exit.c:964
 do_group_exit+0x21b/0x2d0 kernel/exit.c:1119
 get_signal+0x1284/0x1330 kernel/signal.c:3037
 arch_do_signal_or_restart+0xbc/0x840 arch/x86/kernel/signal.c:337
 __exit_to_user_mode_loop kernel/entry/common.c:64 [inline]
 exit_to_user_mode_loop+0xa9/0x680 kernel/entry/common.c:98
 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:207 [inline]
 syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:230 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:318 [inline]
 do_syscall_64+0x353/0x580 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Modules linked in:
CPU: 0 UID: 0 PID: 5877 Comm: syz.0.17 Tainted: G    B               syzkaller #0 PREEMPT(full) 
Tainted: [B]=BAD_PAGE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 bad_page+0x17f/0x1c0 mm/page_alloc.c:632
 free_page_is_bad mm/page_alloc.c:1076 [inline]
 __free_pages_prepare mm/page_alloc.c:1388 [inline]
 __free_pages_ok+0xb8c/0xbd0 mm/page_alloc.c:1578
 __folio_put+0x4a2/0x580 mm/swap.c:112
 __folio_split+0xffe/0x1570 mm/huge_memory.c:4199
 try_folio_split_to_order include/linux/huge_mm.h:411 [inline]
 try_folio_split_or_unmap+0x5b/0x1e0 mm/truncate.c:189
 truncate_inode_partial_folio+0x4ab/0x8e0 mm/truncate.c:255
 truncate_inode_pages_range+0x5f1/0xe30 mm/truncate.c:416
 iomap_write_failed fs/iomap/buffered-io.c:785 [inline]
 iomap_write_iter fs/iomap/buffered-io.c:1187 [inline]
 iomap_file_buffered_write+0x788/0xb30 fs/iomap/buffered-io.c:1225
 xfs_file_buffered_write+0x212/0x8c0 fs/xfs/xfs_file.c:1056
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_pwrite64 fs/read_write.c:795 [inline]
 __do_sys_pwrite64 fs/read_write.c:803 [inline]
 __se_sys_pwrite64 fs/read_write.c:800 [inline]
 __x64_sys_pwrite64+0x199/0x230 fs/read_write.c:800
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f3f0719ce59
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f3f07ff7028 EFLAGS: 00000246 ORIG_RAX: 0000000000000012
RAX: ffffffffffffffda RBX: 00007f3f07415fa0 RCX: 00007f3f0719ce59
RDX: 00000000ffffffb7 RSI: 0000200000000040 RDI: 0000000000000004
RBP: 00007f3f07232d6f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000008080c61 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f3f07416038 R14: 00007f3f07415fa0 R15: 00007ffe415e9148
 </TASK>


***

BUG: Bad page state in shmem_get_folio_gfp

tree:      mm-new
URL:       https://kernel.googlesource.com/pub/scm/linux/kernel/git/akpm/mm.git
base:      1ec3cca2d8b6b9ff6584ca626d4c8918bbf48d44
arch:      amd64
compiler:  Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
config:    https://ci.syzbot.org/builds/ffde37a3-aed0-4f49-bba1-ca31cd6a4b04/config
syz repro: https://ci.syzbot.org/findings/f40ca5d2-8fd7-4dbe-a861-a7c4a5f442dd/syz_repro

BUG: Bad page state in process syz.0.53  pfn:11ea80
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x680 pfn:0x11ea80
head: order:0 mapcount:0 entire_mapcount:1 nr_pages_mapped:0 pincount:0
flags: 0x17ff7800002025c(referenced|uptodate|dirty|workingset|head|swapbacked|node=0|zone=2|lastcpupid=0x7ff)
raw: 017ff7800002025c 0000000000000000 dead000000000122 0000000000000000
raw: 0000000000000680 0000000000000000 00000000ffffffff 0000000000000000
head: 017ff7800002025c 0000000000000000 dead000000000122 0000000000000000
head: 0000000000000680 0000000000000000 00000000ffffffff 0000000000000000
head: 017ff00000000000 0000000000000000 00000000ffffffff 0000000000000000
head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
page_owner tracks the page as allocated
page last allocated via order 7, migratetype Movable, gfp_mask 0x3d20ca(GFP_TRANSHUGE_LIGHT|__GFP_NORETRY|__GFP_THISNODE), pid 5990, tgid 5988 (syz.0.53), ts 80487329937, free_ts 80461370315
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
 prep_new_page mm/page_alloc.c:1861 [inline]
 get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
 __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
 alloc_pages_mpol+0x1da/0x490 mm/mempolicy.c:2476
 folio_alloc_mpol_noprof+0x39/0x160 mm/mempolicy.c:2509
 shmem_alloc_folio+0xba/0x160 mm/shmem.c:1933
 shmem_alloc_and_add_folio+0x62f/0xf80 mm/shmem.c:1962
 shmem_get_folio_gfp+0x555/0x1670 mm/shmem.c:2552
 shmem_get_folio mm/shmem.c:2670 [inline]
 shmem_write_begin+0x16c/0x330 mm/shmem.c:3303
 generic_perform_write+0x2e2/0x8f0 mm/filemap.c:4325
 shmem_file_write_iter+0xf8/0x120 mm/shmem.c:3478
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_write+0x150/0x270 fs/read_write.c:740
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
page last free pid 5749 tgid 5749 stack trace:
 reset_page_owner include/linux/page_owner.h:25 [inline]
 __free_pages_prepare mm/page_alloc.c:1397 [inline]
 free_unref_folios+0xd9f/0x14c0 mm/page_alloc.c:2999
 folios_put_refs+0x9ff/0xb40 mm/swap.c:1008
 folio_batch_release include/linux/folio_batch.h:101 [inline]
 shmem_undo_range+0x52c/0x1660 mm/shmem.c:1149
 shmem_truncate_range mm/shmem.c:1277 [inline]
 shmem_evict_inode+0x289/0xae0 mm/shmem.c:1407
 evict+0x61e/0xb10 fs/inode.c:841
 __dentry_kill+0x1a2/0x690 fs/dcache.c:718
 shrink_kill+0xa9/0x2c0 fs/dcache.c:1195
 shrink_dentry_list+0x2e0/0x5e0 fs/dcache.c:1222
 shrink_dcache_tree+0xe9/0x5d0 fs/dcache.c:-1
 do_one_tree fs/dcache.c:1721 [inline]
 shrink_dcache_for_umount+0xa8/0x1f0 fs/dcache.c:1738
 generic_shutdown_super+0x6f/0x2d0 fs/super.c:624
 kill_anon_super+0x3b/0x70 fs/super.c:1292
 deactivate_locked_super+0xbc/0x130 fs/super.c:476
 cleanup_mnt+0x437/0x4d0 fs/namespace.c:1312
 task_work_run+0x1d9/0x270 kernel/task_work.c:233
 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
 __exit_to_user_mode_loop kernel/entry/common.c:67 [inline]
 exit_to_user_mode_loop+0x193/0x680 kernel/entry/common.c:98
Modules linked in:
CPU: 0 UID: 0 PID: 5990 Comm: syz.0.53 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 bad_page+0x17f/0x1c0 mm/page_alloc.c:632
 free_page_is_bad mm/page_alloc.c:1076 [inline]
 __free_pages_prepare mm/page_alloc.c:1388 [inline]
 __free_pages_ok+0xb8c/0xbd0 mm/page_alloc.c:1578
 __folio_put+0x4a2/0x580 mm/swap.c:112
 __folio_split+0xffe/0x1570 mm/huge_memory.c:4199
 try_folio_split_to_order include/linux/huge_mm.h:411 [inline]
 try_folio_split_or_unmap+0x5b/0x1e0 mm/truncate.c:189
 truncate_inode_partial_folio+0x4ab/0x8e0 mm/truncate.c:255
 shmem_undo_range+0x9a2/0x1660 mm/shmem.c:1181
 shmem_truncate_range mm/shmem.c:1277 [inline]
 shmem_fallocate+0x51c/0xec0 mm/shmem.c:3703
 vfs_fallocate+0x669/0x7e0 fs/open.c:338
 madvise_remove mm/madvise.c:1039 [inline]
 madvise_vma_behavior+0x2bc8/0x4300 mm/madvise.c:1352
 madvise_walk_vmas+0x573/0xae0 mm/madvise.c:1713
 madvise_do_behavior+0x386/0x540 mm/madvise.c:1929
 do_madvise+0x1fa/0x2e0 mm/madvise.c:2022
 __do_sys_madvise mm/madvise.c:2031 [inline]
 __se_sys_madvise mm/madvise.c:2029 [inline]
 __x64_sys_madvise+0xa6/0xc0 mm/madvise.c:2029
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fc37db9ce59
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fc37ead0028 EFLAGS: 00000246 ORIG_RAX: 000000000000001c
RAX: ffffffffffffffda RBX: 00007fc37de15fa0 RCX: 00007fc37db9ce59
RDX: 0000000000000009 RSI: 0000000000600003 RDI: 0000200000000000
RBP: 00007fc37dc32d6f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fc37de16038 R14: 00007fc37de15fa0 R15: 00007fff07d58848
 </TASK>
BUG: Bad page state in process syz.0.53  pfn:11eb00
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x700 pfn:0x11eb00
head: order:0 mapcount:0 entire_mapcount:1 nr_pages_mapped:0 pincount:0
flags: 0x17ff7800002025c(referenced|uptodate|dirty|workingset|head|swapbacked|node=0|zone=2|lastcpupid=0x7ff)
raw: 017ff7800002025c 0000000000000000 dead000000000122 0000000000000000
raw: 0000000000000700 0000000000000000 00000000ffffffff 0000000000000000
head: 017ff7800002025c 0000000000000000 dead000000000122 0000000000000000
head: 0000000000000700 0000000000000000 00000000ffffffff 0000000000000000
head: 017ff00000000000 0000000000000000 00000000ffffffff 0000000000000000
head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
page_owner tracks the page as allocated
page last allocated via order 8, migratetype Movable, gfp_mask 0x3d20ca(GFP_TRANSHUGE_LIGHT|__GFP_NORETRY|__GFP_THISNODE), pid 5990, tgid 5988 (syz.0.53), ts 80487329937, free_ts 80461370315
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
 prep_new_page mm/page_alloc.c:1861 [inline]
 get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
 __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
 alloc_pages_mpol+0x1da/0x490 mm/mempolicy.c:2476
 folio_alloc_mpol_noprof+0x39/0x160 mm/mempolicy.c:2509
 shmem_alloc_folio+0xba/0x160 mm/shmem.c:1933
 shmem_alloc_and_add_folio+0x62f/0xf80 mm/shmem.c:1962
 shmem_get_folio_gfp+0x555/0x1670 mm/shmem.c:2552
 shmem_get_folio mm/shmem.c:2670 [inline]
 shmem_write_begin+0x16c/0x330 mm/shmem.c:3303
 generic_perform_write+0x2e2/0x8f0 mm/filemap.c:4325
 shmem_file_write_iter+0xf8/0x120 mm/shmem.c:3478
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_write+0x150/0x270 fs/read_write.c:740
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
page last free pid 5749 tgid 5749 stack trace:
 reset_page_owner include/linux/page_owner.h:25 [inline]
 __free_pages_prepare mm/page_alloc.c:1397 [inline]
 free_unref_folios+0xd9f/0x14c0 mm/page_alloc.c:2999
 folios_put_refs+0x9ff/0xb40 mm/swap.c:1008
 folio_batch_release include/linux/folio_batch.h:101 [inline]
 shmem_undo_range+0x52c/0x1660 mm/shmem.c:1149
 shmem_truncate_range mm/shmem.c:1277 [inline]
 shmem_evict_inode+0x289/0xae0 mm/shmem.c:1407
 evict+0x61e/0xb10 fs/inode.c:841
 __dentry_kill+0x1a2/0x690 fs/dcache.c:718
 shrink_kill+0xa9/0x2c0 fs/dcache.c:1195
 shrink_dentry_list+0x2e0/0x5e0 fs/dcache.c:1222
 shrink_dcache_tree+0xe9/0x5d0 fs/dcache.c:-1
 do_one_tree fs/dcache.c:1721 [inline]
 shrink_dcache_for_umount+0xa8/0x1f0 fs/dcache.c:1738
 generic_shutdown_super+0x6f/0x2d0 fs/super.c:624
 kill_anon_super+0x3b/0x70 fs/super.c:1292
 deactivate_locked_super+0xbc/0x130 fs/super.c:476
 cleanup_mnt+0x437/0x4d0 fs/namespace.c:1312
 task_work_run+0x1d9/0x270 kernel/task_work.c:233
 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
 __exit_to_user_mode_loop kernel/entry/common.c:67 [inline]
 exit_to_user_mode_loop+0x193/0x680 kernel/entry/common.c:98
Modules linked in:
CPU: 0 UID: 0 PID: 5990 Comm: syz.0.53 Tainted: G    B               syzkaller #0 PREEMPT(full) 
Tainted: [B]=BAD_PAGE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 bad_page+0x17f/0x1c0 mm/page_alloc.c:632
 free_page_is_bad mm/page_alloc.c:1076 [inline]
 __free_pages_prepare mm/page_alloc.c:1388 [inline]
 __free_pages_ok+0xb8c/0xbd0 mm/page_alloc.c:1578
 __folio_put+0x4a2/0x580 mm/swap.c:112
 __folio_split+0xffe/0x1570 mm/huge_memory.c:4199
 try_folio_split_to_order include/linux/huge_mm.h:411 [inline]
 try_folio_split_or_unmap+0x5b/0x1e0 mm/truncate.c:189
 truncate_inode_partial_folio+0x4ab/0x8e0 mm/truncate.c:255
 shmem_undo_range+0x9a2/0x1660 mm/shmem.c:1181
 shmem_truncate_range mm/shmem.c:1277 [inline]
 shmem_fallocate+0x51c/0xec0 mm/shmem.c:3703
 vfs_fallocate+0x669/0x7e0 fs/open.c:338
 madvise_remove mm/madvise.c:1039 [inline]
 madvise_vma_behavior+0x2bc8/0x4300 mm/madvise.c:1352
 madvise_walk_vmas+0x573/0xae0 mm/madvise.c:1713
 madvise_do_behavior+0x386/0x540 mm/madvise.c:1929
 do_madvise+0x1fa/0x2e0 mm/madvise.c:2022
 __do_sys_madvise mm/madvise.c:2031 [inline]
 __se_sys_madvise mm/madvise.c:2029 [inline]
 __x64_sys_madvise+0xa6/0xc0 mm/madvise.c:2029
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fc37db9ce59
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fc37ead0028 EFLAGS: 00000246 ORIG_RAX: 000000000000001c
RAX: ffffffffffffffda RBX: 00007fc37de15fa0 RCX: 00007fc37db9ce59
RDX: 0000000000000009 RSI: 0000000000600003 RDI: 0000200000000000
RBP: 00007fc37dc32d6f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fc37de16038 R14: 00007fc37de15fa0 R15: 00007fff07d58848
 </TASK>


***

If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
  Tested-by: syzbot@syzkaller.appspotmail.com

---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.

To test a patch for this bug, please reply with `#syz test`
(should be on a separate line).

The patch should be attached to the email.
Note: arguments like custom git repos and branches are not supported.



* Re: [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU
  2026-06-10 12:05 [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU zhaoyang.huang
                   ` (2 preceding siblings ...)
  2026-06-11  7:33 ` [syzbot ci] " syzbot ci
@ 2026-06-11  9:30 ` Lorenzo Stoakes
  3 siblings, 0 replies; 16+ messages in thread
From: Lorenzo Stoakes @ 2026-06-11  9:30 UTC (permalink / raw)
  To: zhaoyang.huang
  Cc: Andrew Morton, David Hildenbrand, Zi Yan, Barry Song,
	Liam R. Howlett, Baolin Wang, Lance Yang, Nico Pache,
	Ryan Roberts, Dev Jain, linux-mm, linux-kernel, Zhaoyang Huang,
	steve.kang

-cc incorrect email addresses
+cc correct ones

$ scripts/get_maintainer.pl --no-git mm/huge_memory.c
Andrew Morton <akpm@linux-foundation.org> (maintainer:MEMORY MANAGEMENT - THP (TRANSPARENT HUGE PAGE))
David Hildenbrand <david@kernel.org> (maintainer:MEMORY MANAGEMENT - THP (TRANSPARENT HUGE PAGE))
Lorenzo Stoakes <ljs@kernel.org> (maintainer:MEMORY MANAGEMENT - THP (TRANSPARENT HUGE PAGE))
                 ^
                 |----- Please use the correct email address.

Zi Yan <ziy@nvidia.com> (reviewer:MEMORY MANAGEMENT - THP (TRANSPARENT HUGE PAGE))
Baolin Wang <baolin.wang@linux.alibaba.com> (reviewer:MEMORY MANAGEMENT - THP (TRANSPARENT HUGE PAGE))
"Liam R. Howlett" <liam@infradead.org> (reviewer:MEMORY MANAGEMENT - THP (TRANSPARENT HUGE PAGE))
                   ^
                   |--- Please use the correct email address.

Nico Pache <npache@redhat.com> (reviewer:MEMORY MANAGEMENT - THP (TRANSPARENT HUGE PAGE))
Ryan Roberts <ryan.roberts@arm.com> (reviewer:MEMORY MANAGEMENT - THP (TRANSPARENT HUGE PAGE))
Dev Jain <dev.jain@arm.com> (reviewer:MEMORY MANAGEMENT - THP (TRANSPARENT HUGE PAGE))
Barry Song <baohua@kernel.org> (reviewer:MEMORY MANAGEMENT - THP (TRANSPARENT HUGE PAGE))
Lance Yang <lance.yang@linux.dev> (reviewer:MEMORY MANAGEMENT - THP (TRANSPARENT HUGE PAGE))
linux-mm@kvack.org (open list:MEMORY MANAGEMENT - THP (TRANSPARENT HUGE PAGE))
linux-kernel@vger.kernel.org (open list)

Thanks, Lorenzo



end of thread, other threads:[~2026-06-11  9:30 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-06-10 12:05 [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU zhaoyang.huang
2026-06-10 12:50 ` David Hildenbrand (Arm)
2026-06-10 14:38   ` Zi Yan
2026-06-10 17:25     ` Zi Yan
2026-06-10 18:44       ` Zi Yan
2026-06-11  1:19         ` Zhaoyang Huang
2026-06-11  1:49           ` Zi Yan
2026-06-11  1:39     ` Zhaoyang Huang
2026-06-11  1:56       ` Zi Yan
2026-06-11  2:39         ` Zhaoyang Huang
2026-06-11  3:06           ` Zi Yan
2026-06-11  7:45             ` Zhaoyang Huang
2026-06-10 20:30 ` Andrew Morton
2026-06-10 20:36   ` Zi Yan
2026-06-11  7:33 ` [syzbot ci] " syzbot ci
2026-06-11  9:30 ` [RFC PATCH] " Lorenzo Stoakes
