* [PATCH] Revert "f2fs: remove non-uptodate folio from the page cache in move_data_block"
From: zhaoyang.huang @ 2026-06-08  9:09 UTC
  To: Jaegeuk Kim, Chao Yu, linux-f2fs-devel, linux-kernel,
	Zhaoyang Huang, steve.kang

From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>

This reverts commit 9609dd704725a40cd63d915f2ab6c44248a44598.

Kernel panics keep being reported, especially when the f2fs partition is
almost full. Investigation shows that an f2fs page is freed to the buddy
allocator without being deleted from the LRU. The root cause is the race
shown in [2], which was introduced by this commit.

Three contexts race in this scenario; their main activities are described
below.

The changed code in move_data_block() lets the GC path evict the tail-end
folio from the page cache through folio_end_dropbehind().  Once
folio_unmap_invalidate() removes the folio from mapping->i_pages, the
page-cache references for all pages in the folio are dropped.  The folio
is then kept alive only by temporary external references, which allows a
later split to operate on a folio whose subpages are no longer protected
by page-cache references.

After the page-cache references are gone, split_folio_to_order() can
split the big folio into individual pages and put the resulting subpages
back on the LRU.  For tail pages beyond EOF, split removes them from the
page cache and drops their page-cache references.  A tail page can then
remain on the LRU with PG_lru set while holding only the split caller's
temporary reference.  When free_folio_and_swap_cache() drops that final
reference, the page enters the final folio_put() release path.

In parallel, folio_isolate_lru() can observe the same tail page with a
non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
reference.  If this races with the final folio_put() from the split path,
__folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
The page is then freed back to the allocator while its lru links are
still present in the LRU list.  A later LRU operation on a neighboring
page detects the stale link and reports list corruption.

[1]
[   22.486082] list_del corruption. next->prev should be fffffffec10e0ac8, but was dead000000000122. (next=fffffffec10e0a88)
[   22.486130] ------------[ cut here ]------------
[   22.486134] kernel BUG at lib/list_debug.c:67!
[   22.486141] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
[   22.488502] Tainted: [W]=WARN, [O]=OOT_MODULE
[   22.488506] Hardware name: Spreadtrum UMS9230 1H10 SoC (DT)
[   22.488511] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   22.488517] pc : __list_del_entry_valid_or_report+0x14c/0x154
[   22.488531] lr : __list_del_entry_valid_or_report+0x14c/0x154
[   22.488539] sp : ffffffc08006b830
[   22.488542] x29: ffffffc08006b868 x28: 0000000000003020 x27: 0000000000000000
[   22.488553] x26: 0000000000000000 x25: 0000000000000004 x24: fffffffec10e0ac0
[   22.488564] x23: 00000000000000e8 x22: 0000000000000024 x21: dead000000000122
[   22.488574] x20: fffffffec10e0a88 x19: fffffffec10e0ac8 x18: ffffffc080061060
[   22.488585] x17: 20747562202c3863 x16: 6130653031636566 x15: 0000000000000058
[   22.488595] x14: 0000000000000004 x13: ffffff80f91e0000 x12: 0000000000000003
[   22.488605] x11: 0000000000000003 x10: 0000000000000001 x9 : ffe85721f0e25f00
[   22.488615] x8 : ffe85721f0e25f00 x7 : 0000000000000000 x6 : 6c65645f7473696c
[   22.488625] x5 : ffffffed39b23026 x4 : 0000000000000000 x3 : 0000000000000010
[   22.488636] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000006d
[   22.488647] Call trace:
[   22.488651]  __list_del_entry_valid_or_report+0x14c/0x154 (P)
[   22.488661]  __folio_put+0x2bc/0x434
[   22.488670]  folio_put+0x28/0x58
[   22.488678]  do_garbage_collect+0x1a34/0x2584
[   22.488689]  f2fs_gc+0x230/0x9b4
[   22.488697]  f2fs_fallocate+0xb90/0xdf4
[   22.488706]  vfs_fallocate+0x1b4/0x2bc
[   22.488716]  __arm64_sys_fallocate+0x44/0x78
[   22.488725]  invoke_syscall+0x58/0xe4
[   22.488732]  do_el0_svc+0x48/0xdc
[   22.488739]  el0_svc+0x3c/0x98
[   22.488747]  el0t_64_sync_handler+0x20/0x130
[   22.488754]  el0t_64_sync+0x1c4/0x1c8

[2]
CPU0 (f2fs GC)              CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)

F: pagecache refs = n
F: extra refs = GC + split
F: PG_lru set
move_data_block()
folio = f2fs_grab_cache_folio(F)
...
__folio_set_dropbehind(F)
folio_unlock(F)
folio_end_dropbehind(F)
  folio_unmap_invalidate(F)
    __filemap_remove_folio(F)
    folio_put_refs(F, n)
folio_put(F)
                            split_folio_to_order(F)
                              folio_ref_freeze(F, 1)
                              ...
                              lru_add_split_folio(T)
                                list_add_tail(&T->lru, &F->lru)
                                folio_set_lru(T)
                              __filemap_remove_folio(T)
                              folio_put_refs(T, 1)
                              /* T refcount == 1, PageLRU set */
                                                                  folio_isolate_lru(T)
                                                                    folio_test_clear_lru(T)
                            free_folio_and_swap_cache(T)
                              folio_put(T)
                                /* refcount: 1 -> 0 */
                                __folio_put(T)
                                  __page_cache_release(T)
                                    folio_test_lru(T) == false
                                    /* skip lruvec_del_folio(T) */
                                  free_frozen_pages(T)
                                                                  folio_get(T)
                                                                  lruvec_del_folio(T)
later:
  list_del(adjacent->lru)
    next == &T->lru
    next->prev == LIST_POISON / PCP freelist
    BUG

Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
---
 fs/f2fs/gc.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index ba93010924c0..3084e05e22f2 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1468,11 +1468,7 @@ static int move_data_block(struct inode *inode, block_t bidx,
 put_out:
 	f2fs_put_dnode(&dn);
 out:
-	if (!folio_test_uptodate(folio))
-		__folio_set_dropbehind(folio);
-	folio_unlock(folio);
-	folio_end_dropbehind(folio);
-	folio_put(folio);
+	f2fs_folio_put(folio, true);
 	return err;
 }
 
-- 
2.25.1



* Re: [f2fs-dev] [PATCH] Revert "f2fs: remove non-uptodate folio from the page cache in move_data_block"
From: Zhaoyang Huang @ 2026-06-09  0:28 UTC
  To: zhaoyang.huang, jaegeuk, Chao Yu
  Cc: steve.kang, linux-kernel, linux-f2fs-devel

+jaegeuk, chao

On Mon, Jun 8, 2026 at 5:10 PM zhaoyang.huang <zhaoyang.huang@unisoc.com> wrote:
>
> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
>
> This reverts commit 9609dd704725a40cd63d915f2ab6c44248a44598.
>
> Kernel panics keep being reported, especially when the f2fs partition is
> almost full. Investigation shows that an f2fs page is freed to the buddy
> allocator without being deleted from the LRU. The root cause is the race
> shown in [2], which was introduced by this commit.
>
> Three contexts race in this scenario; their main activities are described
> below.
>
> The changed code in move_data_block() lets the GC path evict the tail-end
> folio from the page cache through folio_end_dropbehind().  Once
> folio_unmap_invalidate() removes the folio from mapping->i_pages, the
> page-cache references for all pages in the folio are dropped.  The folio
> is then kept alive only by temporary external references, which allows a
> later split to operate on a folio whose subpages are no longer protected
> by page-cache references.
>
> After the page-cache references are gone, split_folio_to_order() can
> split the big folio into individual pages and put the resulting subpages
> back on the LRU.  For tail pages beyond EOF, split removes them from the
> page cache and drops their page-cache references.  A tail page can then
> remain on the LRU with PG_lru set while holding only the split caller's
> temporary reference.  When free_folio_and_swap_cache() drops that final
> reference, the page enters the final folio_put() release path.
>
> In parallel, folio_isolate_lru() can observe the same tail page with a
> non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
> reference.  If this races with the final folio_put() from the split path,
> __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
> The page is then freed back to the allocator while its lru links are
> still present in the LRU list.  A later LRU operation on a neighboring
> page detects the stale link and reports list corruption.
>
> [1]
> [   22.486082] list_del corruption. next->prev should be fffffffec10e0ac8, but was dead000000000122. (next=fffffffec10e0a88)
> [   22.486130] ------------[ cut here ]------------
> [   22.486134] kernel BUG at lib/list_debug.c:67!
> [   22.486141] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
> [   22.488502] Tainted: [W]=WARN, [O]=OOT_MODULE
> [   22.488506] Hardware name: Spreadtrum UMS9230 1H10 SoC (DT)
> [   22.488511] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [   22.488517] pc : __list_del_entry_valid_or_report+0x14c/0x154
> [   22.488531] lr : __list_del_entry_valid_or_report+0x14c/0x154
> [   22.488539] sp : ffffffc08006b830
> [   22.488542] x29: ffffffc08006b868 x28: 0000000000003020 x27: 0000000000000000
> [   22.488553] x26: 0000000000000000 x25: 0000000000000004 x24: fffffffec10e0ac0
> [   22.488564] x23: 00000000000000e8 x22: 0000000000000024 x21: dead000000000122
> [   22.488574] x20: fffffffec10e0a88 x19: fffffffec10e0ac8 x18: ffffffc080061060
> [   22.488585] x17: 20747562202c3863 x16: 6130653031636566 x15: 0000000000000058
> [   22.488595] x14: 0000000000000004 x13: ffffff80f91e0000 x12: 0000000000000003
> [   22.488605] x11: 0000000000000003 x10: 0000000000000001 x9 : ffe85721f0e25f00
> [   22.488615] x8 : ffe85721f0e25f00 x7 : 0000000000000000 x6 : 6c65645f7473696c
> [   22.488625] x5 : ffffffed39b23026 x4 : 0000000000000000 x3 : 0000000000000010
> [   22.488636] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000006d
> [   22.488647] Call trace:
> [   22.488651]  __list_del_entry_valid_or_report+0x14c/0x154 (P)
> [   22.488661]  __folio_put+0x2bc/0x434
> [   22.488670]  folio_put+0x28/0x58
> [   22.488678]  do_garbage_collect+0x1a34/0x2584
> [   22.488689]  f2fs_gc+0x230/0x9b4
> [   22.488697]  f2fs_fallocate+0xb90/0xdf4
> [   22.488706]  vfs_fallocate+0x1b4/0x2bc
> [   22.488716]  __arm64_sys_fallocate+0x44/0x78
> [   22.488725]  invoke_syscall+0x58/0xe4
> [   22.488732]  do_el0_svc+0x48/0xdc
> [   22.488739]  el0_svc+0x3c/0x98
> [   22.488747]  el0t_64_sync_handler+0x20/0x130
> [   22.488754]  el0t_64_sync+0x1c4/0x1c8
>
> [2]
> CPU0 (f2fs GC)              CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)
>
> F: pagecache refs = n
> F: extra refs = GC + split
> F: PG_lru set
> move_data_block()
> folio = f2fs_grab_cache_folio(F)
> ...
> __folio_set_dropbehind(F)
> folio_unlock(F)
> folio_end_dropbehind(F)
>   folio_unmap_invalidate(F)
>     __filemap_remove_folio(F)
>     folio_put_refs(F, n)
> folio_put(F)
>                             split_folio_to_order(F)
>                               folio_ref_freeze(F, 1)
>                               ...
>                               lru_add_split_folio(T)
>                                 list_add_tail(&T->lru, &F->lru)
>                                 folio_set_lru(T)
>                               __filemap_remove_folio(T)
>                               folio_put_refs(T, 1)
>                               /* T refcount == 1, PageLRU set */
>                                                                   folio_isolate_lru(T)
>                                                                     folio_test_clear_lru(T)
>                             free_folio_and_swap_cache(T)
>                               folio_put(T)
>                                 /* refcount: 1 -> 0 */
>                                 __folio_put(T)
>                                   __page_cache_release(T)
>                                     folio_test_lru(T) == false
>                                     /* skip lruvec_del_folio(T) */
>                                   free_frozen_pages(T)
>                                                                   folio_get(T)
>                                                                   lruvec_del_folio(T)
> later:
>   list_del(adjacent->lru)
>     next == &T->lru
>     next->prev == LIST_POISON / PCP freelist
>     BUG
>
> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> ---
>  fs/f2fs/gc.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index ba93010924c0..3084e05e22f2 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -1468,11 +1468,7 @@ static int move_data_block(struct inode *inode, block_t bidx,
>  put_out:
>         f2fs_put_dnode(&dn);
>  out:
> -       if (!folio_test_uptodate(folio))
> -               __folio_set_dropbehind(folio);
> -       folio_unlock(folio);
> -       folio_end_dropbehind(folio);
> -       folio_put(folio);
> +       f2fs_folio_put(folio, true);
>         return err;
>  }
>
> --
> 2.25.1
>


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


* Re: [f2fs-dev] [PATCH] Revert "f2fs: remove non-uptodate folio from the page cache in move_data_block"
From: Chao Yu via Linux-f2fs-devel @ 2026-06-12  3:25 UTC
  To: zhaoyang.huang, Jaegeuk Kim, linux-f2fs-devel, linux-kernel,
	Zhaoyang Huang, steve.kang

On 6/8/26 17:09, zhaoyang.huang wrote:
> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> 
> This reverts commit 9609dd704725a40cd63d915f2ab6c44248a44598.
> 
> Kernel panics keep being reported, especially when the f2fs partition is
> almost full. Investigation shows that an f2fs page is freed to the buddy
> allocator without being deleted from the LRU. The root cause is the race
> shown in [2], which was introduced by this commit.
> 
> Three contexts race in this scenario; their main activities are described
> below.
> 
> The changed code in move_data_block() lets the GC path evict the tail-end
> folio from the page cache through folio_end_dropbehind().  Once
> folio_unmap_invalidate() removes the folio from mapping->i_pages, the
> page-cache references for all pages in the folio are dropped.  The folio
> is then kept alive only by temporary external references, which allows a
> later split to operate on a folio whose subpages are no longer protected
> by page-cache references.
> 
> After the page-cache references are gone, split_folio_to_order() can
> split the big folio into individual pages and put the resulting subpages
> back on the LRU.  For tail pages beyond EOF, split removes them from the
> page cache and drops their page-cache references.  A tail page can then
> remain on the LRU with PG_lru set while holding only the split caller's
> temporary reference.  When free_folio_and_swap_cache() drops that final
> reference, the page enters the final folio_put() release path.
> 
> In parallel, folio_isolate_lru() can observe the same tail page with a
> non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
> reference.  If this races with the final folio_put() from the split path,
> __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
> The page is then freed back to the allocator while its lru links are
> still present in the LRU list.  A later LRU operation on a neighboring
> page detects the stale link and reports list corruption.
> 
> [1]
> [   22.486082] list_del corruption. next->prev should be fffffffec10e0ac8, but was dead000000000122. (next=fffffffec10e0a88)
> [   22.486130] ------------[ cut here ]------------
> [   22.486134] kernel BUG at lib/list_debug.c:67!
> [   22.486141] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
> [   22.488502] Tainted: [W]=WARN, [O]=OOT_MODULE
> [   22.488506] Hardware name: Spreadtrum UMS9230 1H10 SoC (DT)
> [   22.488511] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [   22.488517] pc : __list_del_entry_valid_or_report+0x14c/0x154
> [   22.488531] lr : __list_del_entry_valid_or_report+0x14c/0x154
> [   22.488539] sp : ffffffc08006b830
> [   22.488542] x29: ffffffc08006b868 x28: 0000000000003020 x27: 0000000000000000
> [   22.488553] x26: 0000000000000000 x25: 0000000000000004 x24: fffffffec10e0ac0
> [   22.488564] x23: 00000000000000e8 x22: 0000000000000024 x21: dead000000000122
> [   22.488574] x20: fffffffec10e0a88 x19: fffffffec10e0ac8 x18: ffffffc080061060
> [   22.488585] x17: 20747562202c3863 x16: 6130653031636566 x15: 0000000000000058
> [   22.488595] x14: 0000000000000004 x13: ffffff80f91e0000 x12: 0000000000000003
> [   22.488605] x11: 0000000000000003 x10: 0000000000000001 x9 : ffe85721f0e25f00
> [   22.488615] x8 : ffe85721f0e25f00 x7 : 0000000000000000 x6 : 6c65645f7473696c
> [   22.488625] x5 : ffffffed39b23026 x4 : 0000000000000000 x3 : 0000000000000010
> [   22.488636] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000006d
> [   22.488647] Call trace:
> [   22.488651]  __list_del_entry_valid_or_report+0x14c/0x154 (P)
> [   22.488661]  __folio_put+0x2bc/0x434
> [   22.488670]  folio_put+0x28/0x58
> [   22.488678]  do_garbage_collect+0x1a34/0x2584
> [   22.488689]  f2fs_gc+0x230/0x9b4
> [   22.488697]  f2fs_fallocate+0xb90/0xdf4
> [   22.488706]  vfs_fallocate+0x1b4/0x2bc
> [   22.488716]  __arm64_sys_fallocate+0x44/0x78
> [   22.488725]  invoke_syscall+0x58/0xe4
> [   22.488732]  do_el0_svc+0x48/0xdc
> [   22.488739]  el0_svc+0x3c/0x98
> [   22.488747]  el0t_64_sync_handler+0x20/0x130
> [   22.488754]  el0t_64_sync+0x1c4/0x1c8
> 
> [2]
> CPU0 (f2fs GC)              CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)
> 
> F: pagecache refs = n
> F: extra refs = GC + split
> F: PG_lru set
> move_data_block()
> folio = f2fs_grab_cache_folio(F)
> ...
> __folio_set_dropbehind(F)
> folio_unlock(F)
> folio_end_dropbehind(F)
>    folio_unmap_invalidate(F)
>      __filemap_remove_folio(F)
>      folio_put_refs(F, n)
> folio_put(F)
>                              split_folio_to_order(F)
>                                folio_ref_freeze(F, 1)
>                                ...
>                                lru_add_split_folio(T)
>                                  list_add_tail(&T->lru, &F->lru)
>                                  folio_set_lru(T)
>                                __filemap_remove_folio(T)
>                                folio_put_refs(T, 1)
>                                /* T refcount == 1, PageLRU set */
>                                                                    folio_isolate_lru(T)
>                                                                      folio_test_clear_lru(T)
>                              free_folio_and_swap_cache(T)
>                                folio_put(T)
>                                  /* refcount: 1 -> 0 */
>                                  __folio_put(T)
>                                    __page_cache_release(T)
>                                      folio_test_lru(T) == false
>                                      /* skip lruvec_del_folio(T) */
>                                    free_frozen_pages(T)
>                                                                    folio_get(T)
>                                                                    lruvec_del_folio(T)
> later:
>    list_del(adjacent->lru)
>      next == &T->lru
>      next->prev == LIST_POISON / PCP freelist
>      BUG
> 

Missing Fixes and Cc: stable lines.

> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>

I suspect this is an MM bug. We can revert this first and reapply it after
the issue is fixed in MM.

Thanks,

> ---
>   fs/f2fs/gc.c | 6 +-----
>   1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index ba93010924c0..3084e05e22f2 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -1468,11 +1468,7 @@ static int move_data_block(struct inode *inode, block_t bidx,
>   put_out:
>   	f2fs_put_dnode(&dn);
>   out:
> -	if (!folio_test_uptodate(folio))
> -		__folio_set_dropbehind(folio);
> -	folio_unlock(folio);
> -	folio_end_dropbehind(folio);
> -	folio_put(folio);
> +	f2fs_folio_put(folio, true);
>   	return err;
>   }
>   



_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [f2fs-dev] [PATCH] Revert "f2fs: remove non-uptodate folio from the page cache in move_data_block"
  2026-06-12  3:25   ` Chao Yu
@ 2026-06-12  3:57     ` Zhaoyang Huang
  -1 siblings, 0 replies; 10+ messages in thread
From: Zhaoyang Huang @ 2026-06-12  3:57 UTC (permalink / raw)
  To: Chao Yu
  Cc: Jaegeuk Kim, linux-f2fs-devel, linux-kernel, steve.kang,
	zhaoyang.huang

On Fri, Jun 12, 2026 at 11:26 AM Chao Yu <chao@kernel.org> wrote:
>
> On 6/8/26 17:09, zhaoyang.huang wrote:
> > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> >
> > This reverts commit 9609dd704725a40cd63d915f2ab6c44248a44598.
> >
> > Kernel panics keep being reported, especially when the f2fs partition
> > is almost full. Investigation shows that an f2fs page is freed to the
> > buddy allocator without being deleted from the LRU; the root cause is
> > the race shown in [2], which was introduced by this commit.
> >
> > Three contexts race in this scenario; their main activities are
> > described below.
> >
> > The changed code in move_data_block() lets the GC path evict the tail-end
> > folio from the page cache through folio_end_dropbehind().  Once
> > folio_unmap_invalidate() removes the folio from mapping->i_pages, the
> > page-cache references for all pages in the folio are dropped.  The folio
> > is then kept alive only by temporary external references, which allows a
> > later split to operate on a folio whose subpages are no longer protected
> > by page-cache references.
> >
> > After the page-cache references are gone, split_folio_to_order() can
> > split the big folio into individual pages and put the resulting subpages
> > back on the LRU.  For tail pages beyond EOF, split removes them from the
> > page cache and drops their page-cache references.  A tail page can then
> > remain on the LRU with PG_lru set while holding only the split caller's
> > temporary reference.  When free_folio_and_swap_cache() drops that final
> > reference, the page enters the final folio_put() release path.
> >
> > In parallel, folio_isolate_lru() can observe the same tail page with a
> > non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
> > reference.  If this races with the final folio_put() from the split path,
> > __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
> > The page is then freed back to the allocator while its lru links are
> > still present in the LRU list.  A later LRU operation on a neighboring
> > page detects the stale link and reports list corruption.
> >
> > [1]
> > [   22.486082] list_del corruption. next->prev should be fffffffec10e0ac8, but was dead000000000122. (next=fffffffec10e0a88)
> > [   22.486130] ------------[ cut here ]------------
> > [   22.486134] kernel BUG at lib/list_debug.c:67!
> > [   22.486141] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
> > [   22.488502] Tainted: [W]=WARN, [O]=OOT_MODULE
> > [   22.488506] Hardware name: Spreadtrum UMS9230 1H10 SoC (DT)
> > [   22.488511] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [   22.488517] pc : __list_del_entry_valid_or_report+0x14c/0x154
> > [   22.488531] lr : __list_del_entry_valid_or_report+0x14c/0x154
> > [   22.488539] sp : ffffffc08006b830
> > [   22.488542] x29: ffffffc08006b868 x28: 0000000000003020 x27: 0000000000000000
> > [   22.488553] x26: 0000000000000000 x25: 0000000000000004 x24: fffffffec10e0ac0
> > [   22.488564] x23: 00000000000000e8 x22: 0000000000000024 x21: dead000000000122
> > [   22.488574] x20: fffffffec10e0a88 x19: fffffffec10e0ac8 x18: ffffffc080061060
> > [   22.488585] x17: 20747562202c3863 x16: 6130653031636566 x15: 0000000000000058
> > [   22.488595] x14: 0000000000000004 x13: ffffff80f91e0000 x12: 0000000000000003
> > [   22.488605] x11: 0000000000000003 x10: 0000000000000001 x9 : ffe85721f0e25f00
> > [   22.488615] x8 : ffe85721f0e25f00 x7 : 0000000000000000 x6 : 6c65645f7473696c
> > [   22.488625] x5 : ffffffed39b23026 x4 : 0000000000000000 x3 : 0000000000000010
> > [   22.488636] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000006d
> > [   22.488647] Call trace:
> > [   22.488651]  __list_del_entry_valid_or_report+0x14c/0x154 (P)
> > [   22.488661]  __folio_put+0x2bc/0x434
> > [   22.488670]  folio_put+0x28/0x58
> > [   22.488678]  do_garbage_collect+0x1a34/0x2584
> > [   22.488689]  f2fs_gc+0x230/0x9b4
> > [   22.488697]  f2fs_fallocate+0xb90/0xdf4
> > [   22.488706]  vfs_fallocate+0x1b4/0x2bc
> > [   22.488716]  __arm64_sys_fallocate+0x44/0x78
> > [   22.488725]  invoke_syscall+0x58/0xe4
> > [   22.488732]  do_el0_svc+0x48/0xdc
> > [   22.488739]  el0_svc+0x3c/0x98
> > [   22.488747]  el0t_64_sync_handler+0x20/0x130
> > [   22.488754]  el0t_64_sync+0x1c4/0x1c8
> >
> > [2]
> > CPU0 (f2fs GC)              CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)
> >
> > F: pagecache refs = n
> > F: extra refs = GC + split
> > F: PG_lru set
> > move_data_block()
> > folio = f2fs_grab_cache_folio(F)
> > ...
> > __folio_set_dropbehind(F)
> > folio_unlock(F)
> > folio_end_dropbehind(F)
> >    folio_unmap_invalidate(F)
> >      __filemap_remove_folio(F)
> >      folio_put_refs(F, n)
> > folio_put(F)
> >                              split_folio_to_order(F)
> >                                folio_ref_freeze(F, 1)
> >                                ...
> >                                lru_add_split_folio(T)
> >                                  list_add_tail(&T->lru, &F->lru)
> >                                  folio_set_lru(T)
> >                                __filemap_remove_folio(T)
> >                                folio_put_refs(T, 1)
> >                                /* T refcount == 1, PageLRU set */
> >                                                                    folio_isolate_lru(T)
> >                                                                      folio_test_clear_lru(T)
> >                              free_folio_and_swap_cache(T)
> >                                folio_put(T)
> >                                  /* refcount: 1 -> 0 */
> >                                  __folio_put(T)
> >                                    __page_cache_release(T)
> >                                      folio_test_lru(T) == false
> >                                      /* skip lruvec_del_folio(T) */
> >                                    free_frozen_pages(T)
> >                                                                    folio_get(T)
> >                                                                    lruvec_del_folio(T)
> > later:
> >    list_del(adjacent->lru)
> >      next == &T->lru
> >      next->prev == LIST_POISON / PCP freelist
> >      BUG
> >
>
> Missing Fixes and Cc: stable lines.
>
> > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
>
> I suspect this is an MM bug. We can revert this first and reapply it after
> the issue is fixed in MM.

Yes. There is another thread discussing the MM side of this issue, and you
are on its recipient list. I think there is no need to revert in a hurry if
you are also convinced by my analysis of the defect in split_folio:

https://lore.kernel.org/linux-mm/20260612023456.2424044-1-zhaoyang.huang@unisoc.com/
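
For readers following the diagram in [2], the interleaving can be replayed
deterministically in user space. The sketch below is only an illustration
under simplifying assumptions, not kernel code: struct toy_folio,
toy_test_clear_lru(), toy_put() and replay_race() are hypothetical names,
the checked unlink is loosely modelled on what the list debugging code
reports, and "freeing" an entry is modelled by pointing its links at a
dummy poison node.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins; none of this is kernel code. */
struct node {
	struct node *next, *prev;
};

struct toy_folio {
	struct node lru;	/* list linkage, in the role of folio->lru */
	int refcount;
	bool pg_lru;		/* models the PG_lru flag */
};

static struct node poison;	/* stands in for poisoned/reused lru links */

static void list_init(struct node *head)
{
	head->next = head->prev = head;
}

static void list_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Checked unlink: returns false instead of BUG() when neighbours disagree. */
static bool list_del_checked(struct node *n)
{
	if (n->next->prev != n || n->prev->next != n)
		return false;
	n->next->prev = n->prev;
	n->prev->next = n->next;
	return true;
}

/* CPU2: clear the LRU flag before any reference has been taken. */
static bool toy_test_clear_lru(struct toy_folio *f)
{
	bool was_set = f->pg_lru;

	f->pg_lru = false;
	return was_set;
}

/* CPU1: drop a reference; the final put unlinks only if pg_lru is still set. */
static void toy_put(struct toy_folio *f)
{
	assert(f->refcount > 0);
	if (--f->refcount)
		return;
	if (f->pg_lru) {
		(void)list_del_checked(&f->lru);
		f->pg_lru = false;
	}
	/* "Free" the entry: its links no longer describe the LRU list. */
	f->lru.next = f->lru.prev = &poison;
}

/*
 * Replays the ordering from [2] in a single thread. Returns true if a later
 * unlink of the neighbouring entry detects list corruption.
 */
static bool replay_race(bool isolate_first)
{
	struct node lru_head;
	struct toy_folio a = { .refcount = 1, .pg_lru = true };
	struct toy_folio t = { .refcount = 1, .pg_lru = true };

	list_init(&lru_head);
	list_add_tail(&a.lru, &lru_head);
	list_add_tail(&t.lru, &lru_head);

	if (isolate_first)
		(void)toy_test_clear_lru(&t);	/* flag cleared, no ref yet */
	toy_put(&t);				/* refcount: 1 -> 0 */

	return !list_del_checked(&a.lru);	/* "later: list_del(adjacent)" */
}
```

Under these assumptions, replay_race(true) returns true because the final
put sees the flag already cleared and skips the unlink, while
replay_race(false) returns false because the put unlinks the entry before
it is freed.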


>
> Thanks,
>
> > ---
> >   fs/f2fs/gc.c | 6 +-----
> >   1 file changed, 1 insertion(+), 5 deletions(-)
> >
> > diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> > index ba93010924c0..3084e05e22f2 100644
> > --- a/fs/f2fs/gc.c
> > +++ b/fs/f2fs/gc.c
> > @@ -1468,11 +1468,7 @@ static int move_data_block(struct inode *inode, block_t bidx,
> >   put_out:
> >       f2fs_put_dnode(&dn);
> >   out:
> > -     if (!folio_test_uptodate(folio))
> > -             __folio_set_dropbehind(folio);
> > -     folio_unlock(folio);
> > -     folio_end_dropbehind(folio);
> > -     folio_put(folio);
> > +     f2fs_folio_put(folio, true);
> >       return err;
> >   }
> >
>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Revert "f2fs: remove non-uptodate folio from the page cache in move_data_block"
  2026-06-12  3:25   ` Chao Yu
@ 2026-06-15 15:21     ` Jaegeuk Kim via Linux-f2fs-devel
  -1 siblings, 0 replies; 10+ messages in thread
From: Jaegeuk Kim @ 2026-06-15 15:21 UTC (permalink / raw)
  To: Chao Yu
  Cc: zhaoyang.huang, linux-f2fs-devel, linux-kernel, Zhaoyang Huang,
	steve.kang

On 06/12, Chao Yu wrote:
> On 6/8/26 17:09, zhaoyang.huang wrote:
> > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> > 
> > This reverts commit 9609dd704725a40cd63d915f2ab6c44248a44598.
> > 
> > Kernel panics keep being reported, especially when the f2fs partition
> > is almost full. Investigation shows that an f2fs page is freed to the
> > buddy allocator without being deleted from the LRU; the root cause is
> > the race shown in [2], which was introduced by this commit.
> > 
> > Three contexts race in this scenario; their main activities are
> > described below.
> > 
> > The changed code in move_data_block() lets the GC path evict the tail-end
> > folio from the page cache through folio_end_dropbehind().  Once
> > folio_unmap_invalidate() removes the folio from mapping->i_pages, the
> > page-cache references for all pages in the folio are dropped.  The folio
> > is then kept alive only by temporary external references, which allows a
> > later split to operate on a folio whose subpages are no longer protected
> > by page-cache references.
> > 
> > After the page-cache references are gone, split_folio_to_order() can
> > split the big folio into individual pages and put the resulting subpages
> > back on the LRU.  For tail pages beyond EOF, split removes them from the
> > page cache and drops their page-cache references.  A tail page can then
> > remain on the LRU with PG_lru set while holding only the split caller's
> > temporary reference.  When free_folio_and_swap_cache() drops that final
> > reference, the page enters the final folio_put() release path.
> > 
> > In parallel, folio_isolate_lru() can observe the same tail page with a
> > non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
> > reference.  If this races with the final folio_put() from the split path,
> > __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
> > The page is then freed back to the allocator while its lru links are
> > still present in the LRU list.  A later LRU operation on a neighboring
> > page detects the stale link and reports list corruption.
> > 
> > [1]
> > [   22.486082] list_del corruption. next->prev should be fffffffec10e0ac8, but was dead000000000122. (next=fffffffec10e0a88)
> > [   22.486130] ------------[ cut here ]------------
> > [   22.486134] kernel BUG at lib/list_debug.c:67!
> > [   22.486141] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
> > [   22.488502] Tainted: [W]=WARN, [O]=OOT_MODULE
> > [   22.488506] Hardware name: Spreadtrum UMS9230 1H10 SoC (DT)
> > [   22.488511] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [   22.488517] pc : __list_del_entry_valid_or_report+0x14c/0x154
> > [   22.488531] lr : __list_del_entry_valid_or_report+0x14c/0x154
> > [   22.488539] sp : ffffffc08006b830
> > [   22.488542] x29: ffffffc08006b868 x28: 0000000000003020 x27: 0000000000000000
> > [   22.488553] x26: 0000000000000000 x25: 0000000000000004 x24: fffffffec10e0ac0
> > [   22.488564] x23: 00000000000000e8 x22: 0000000000000024 x21: dead000000000122
> > [   22.488574] x20: fffffffec10e0a88 x19: fffffffec10e0ac8 x18: ffffffc080061060
> > [   22.488585] x17: 20747562202c3863 x16: 6130653031636566 x15: 0000000000000058
> > [   22.488595] x14: 0000000000000004 x13: ffffff80f91e0000 x12: 0000000000000003
> > [   22.488605] x11: 0000000000000003 x10: 0000000000000001 x9 : ffe85721f0e25f00
> > [   22.488615] x8 : ffe85721f0e25f00 x7 : 0000000000000000 x6 : 6c65645f7473696c
> > [   22.488625] x5 : ffffffed39b23026 x4 : 0000000000000000 x3 : 0000000000000010
> > [   22.488636] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000006d
> > [   22.488647] Call trace:
> > [   22.488651]  __list_del_entry_valid_or_report+0x14c/0x154 (P)
> > [   22.488661]  __folio_put+0x2bc/0x434
> > [   22.488670]  folio_put+0x28/0x58
> > [   22.488678]  do_garbage_collect+0x1a34/0x2584
> > [   22.488689]  f2fs_gc+0x230/0x9b4
> > [   22.488697]  f2fs_fallocate+0xb90/0xdf4
> > [   22.488706]  vfs_fallocate+0x1b4/0x2bc
> > [   22.488716]  __arm64_sys_fallocate+0x44/0x78
> > [   22.488725]  invoke_syscall+0x58/0xe4
> > [   22.488732]  do_el0_svc+0x48/0xdc
> > [   22.488739]  el0_svc+0x3c/0x98
> > [   22.488747]  el0t_64_sync_handler+0x20/0x130
> > [   22.488754]  el0t_64_sync+0x1c4/0x1c8
> > 
> > [2]
> > CPU0 (f2fs GC)              CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)
> > 
> > F: pagecache refs = n
> > F: extra refs = GC + split
> > F: PG_lru set
> > move_data_block()
> > folio = f2fs_grab_cache_folio(F)
> > ...
> > __folio_set_dropbehind(F)
> > folio_unlock(F)
> > folio_end_dropbehind(F)
> >    folio_unmap_invalidate(F)
> >      __filemap_remove_folio(F)
> >      folio_put_refs(F, n)
> > folio_put(F)
> >                              split_folio_to_order(F)
> >                                folio_ref_freeze(F, 1)
> >                                ...
> >                                lru_add_split_folio(T)
> >                                  list_add_tail(&T->lru, &F->lru)
> >                                  folio_set_lru(T)
> >                                __filemap_remove_folio(T)
> >                                folio_put_refs(T, 1)
> >                                /* T refcount == 1, PageLRU set */
> >                                                                    folio_isolate_lru(T)
> >                                                                      folio_test_clear_lru(T)
> >                              free_folio_and_swap_cache(T)
> >                                folio_put(T)
> >                                  /* refcount: 1 -> 0 */
> >                                  __folio_put(T)
> >                                    __page_cache_release(T)
> >                                      folio_test_lru(T) == false
> >                                      /* skip lruvec_del_folio(T) */
> >                                    free_frozen_pages(T)
> >                                                                    folio_get(T)
> >                                                                    lruvec_del_folio(T)
> > later:
> >    list_del(adjacent->lru)
> >      next == &T->lru
> >      next->prev == LIST_POISON / PCP freelist
> >      BUG
> > 
> 
> Missing Fixes and Cc: stable lines.

Applied with them.

> 
> > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> 
> I suspect this is an MM bug; we can revert this first and reapply it after we
> fix the issue in MM.
> 
> Thanks,
> 
> > ---
> >   fs/f2fs/gc.c | 6 +-----
> >   1 file changed, 1 insertion(+), 5 deletions(-)
> > 
> > diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> > index ba93010924c0..3084e05e22f2 100644
> > --- a/fs/f2fs/gc.c
> > +++ b/fs/f2fs/gc.c
> > @@ -1468,11 +1468,7 @@ static int move_data_block(struct inode *inode, block_t bidx,
> >   put_out:
> >   	f2fs_put_dnode(&dn);
> >   out:
> > -	if (!folio_test_uptodate(folio))
> > -		__folio_set_dropbehind(folio);
> > -	folio_unlock(folio);
> > -	folio_end_dropbehind(folio);
> > -	folio_put(folio);
> > +	f2fs_folio_put(folio, true);
> >   	return err;
> >   }
> 


* Re: [f2fs-dev] [PATCH] Revert "f2fs: remove non-uptodate folio from the page cache in move_data_block"
@ 2026-06-15 15:21     ` Jaegeuk Kim via Linux-f2fs-devel
  0 siblings, 0 replies; 10+ messages in thread
From: Jaegeuk Kim via Linux-f2fs-devel @ 2026-06-15 15:21 UTC (permalink / raw)
  To: Chao Yu
  Cc: Zhaoyang Huang, linux-f2fs-devel, linux-kernel, steve.kang,
	zhaoyang.huang

On 06/12, Chao Yu wrote:
> On 6/8/26 17:09, zhaoyang.huang wrote:
> > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> > 
> > This reverts commit 9609dd704725a40cd63d915f2ab6c44248a44598.
> > 
> > Kernel panics keep being reported, especially when the f2fs partition is
> > almost full. Investigation shows that an f2fs page was freed to the buddy
> > allocator without being deleted from the LRU; the root cause is the race
> > shown in [2], which was introduced by the reverted commit.
> > 
> > Three racing contexts are involved in this scenario; their main
> > activities are described below.
> > 
> > The changed code in move_data_block() lets the GC path evict the tail-end
> > folio from the page cache through folio_end_dropbehind().  Once
> > folio_unmap_invalidate() removes the folio from mapping->i_pages, the
> > page-cache references for all pages in the folio are dropped.  The folio
> > is then kept alive only by temporary external references, which allows a
> > later split to operate on a folio whose subpages are no longer protected
> > by page-cache references.
> > 
> > After the page-cache references are gone, split_folio_to_order() can
> > split the big folio into individual pages and put the resulting subpages
> > back on the LRU.  For tail pages beyond EOF, split removes them from the
> > page cache and drops their page-cache references.  A tail page can then
> > remain on the LRU with PG_lru set while holding only the split caller's
> > temporary reference.  When free_folio_and_swap_cache() drops that final
> > reference, the page enters the final folio_put() release path.
> > 
> > In parallel, folio_isolate_lru() can observe the same tail page with a
> > non-zero refcount and PG_lru set.  It clears PG_lru before taking its own
> > reference.  If this races with the final folio_put() from the split path,
> > __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
> > The page is then freed back to the allocator while its lru links are
> > still present in the LRU list.  A later LRU operation on a neighboring
> > page detects the stale link and reports list corruption.
> > 
> > [1]
> > [   22.486082] list_del corruption. next->prev should be fffffffec10e0ac8, but was dead000000000122. (next=fffffffec10e0a88)
> > [   22.486130] ------------[ cut here ]------------
> > [   22.486134] kernel BUG at lib/list_debug.c:67!
> > [   22.486141] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
> > [   22.488502] Tainted: [W]=WARN, [O]=OOT_MODULE
> > [   22.488506] Hardware name: Spreadtrum UMS9230 1H10 SoC (DT)
> > [   22.488511] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [   22.488517] pc : __list_del_entry_valid_or_report+0x14c/0x154
> > [   22.488531] lr : __list_del_entry_valid_or_report+0x14c/0x154
> > [   22.488539] sp : ffffffc08006b830
> > [   22.488542] x29: ffffffc08006b868 x28: 0000000000003020 x27: 0000000000000000
> > [   22.488553] x26: 0000000000000000 x25: 0000000000000004 x24: fffffffec10e0ac0
> > [   22.488564] x23: 00000000000000e8 x22: 0000000000000024 x21: dead000000000122
> > [   22.488574] x20: fffffffec10e0a88 x19: fffffffec10e0ac8 x18: ffffffc080061060
> > [   22.488585] x17: 20747562202c3863 x16: 6130653031636566 x15: 0000000000000058
> > [   22.488595] x14: 0000000000000004 x13: ffffff80f91e0000 x12: 0000000000000003
> > [   22.488605] x11: 0000000000000003 x10: 0000000000000001 x9 : ffe85721f0e25f00
> > [   22.488615] x8 : ffe85721f0e25f00 x7 : 0000000000000000 x6 : 6c65645f7473696c
> > [   22.488625] x5 : ffffffed39b23026 x4 : 0000000000000000 x3 : 0000000000000010
> > [   22.488636] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000006d
> > [   22.488647] Call trace:
> > [   22.488651]  __list_del_entry_valid_or_report+0x14c/0x154 (P)
> > [   22.488661]  __folio_put+0x2bc/0x434
> > [   22.488670]  folio_put+0x28/0x58
> > [   22.488678]  do_garbage_collect+0x1a34/0x2584
> > [   22.488689]  f2fs_gc+0x230/0x9b4
> > [   22.488697]  f2fs_fallocate+0xb90/0xdf4
> > [   22.488706]  vfs_fallocate+0x1b4/0x2bc
> > [   22.488716]  __arm64_sys_fallocate+0x44/0x78
> > [   22.488725]  invoke_syscall+0x58/0xe4
> > [   22.488732]  do_el0_svc+0x48/0xdc
> > [   22.488739]  el0_svc+0x3c/0x98
> > [   22.488747]  el0t_64_sync_handler+0x20/0x130
> > [   22.488754]  el0t_64_sync+0x1c4/0x1c8
> > 
> > [2]
> > CPU0 (f2fs GC)              CPU1 (split_folio_to_order)          CPU2 (folio_isolate_lru)
> > 
> > F: pagecache refs = n
> > F: extra refs = GC + split
> > F: PG_lru set
> > move_data_block()
> > folio = f2fs_grab_cache_folio(F)
> > ...
> > __folio_set_dropbehind(F)
> > folio_unlock(F)
> > folio_end_dropbehind(F)
> >    folio_unmap_invalidate(F)
> >      __filemap_remove_folio(F)
> >      folio_put_refs(F, n)
> > folio_put(F)
> >                              split_folio_to_order(F)
> >                                folio_ref_freeze(F, 1)
> >                                ...
> >                                lru_add_split_folio(T)
> >                                  list_add_tail(&T->lru, &F->lru)
> >                                  folio_set_lru(T)
> >                                __filemap_remove_folio(T)
> >                                folio_put_refs(T, 1)
> >                                /* T refcount == 1, PageLRU set */
> >                                                                    folio_isolate_lru(T)
> >                                                                      folio_test_clear_lru(T)
> >                              free_folio_and_swap_cache(T)
> >                                folio_put(T)
> >                                  /* refcount: 1 -> 0 */
> >                                  __folio_put(T)
> >                                    __page_cache_release(T)
> >                                      folio_test_lru(T) == false
> >                                      /* skip lruvec_del_folio(T) */
> >                                    free_frozen_pages(T)
> >                                                                    folio_get(T)
> >                                                                    lruvec_del_folio(T)
> > later:
> >    list_del(adjacent->lru)
> >      next == &T->lru
> >      next->prev == LIST_POISON / PCP freelist
> >      BUG
> > 
> 
> Missing Fixes and Cc: stable lines.

Applied with them.

> 
> > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> 
> I suspect this is an MM bug; we can revert this first and reapply it after we
> fix the issue in MM.
> 
> Thanks,
> 
> > ---
> >   fs/f2fs/gc.c | 6 +-----
> >   1 file changed, 1 insertion(+), 5 deletions(-)
> > 
> > diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> > index ba93010924c0..3084e05e22f2 100644
> > --- a/fs/f2fs/gc.c
> > +++ b/fs/f2fs/gc.c
> > @@ -1468,11 +1468,7 @@ static int move_data_block(struct inode *inode, block_t bidx,
> >   put_out:
> >   	f2fs_put_dnode(&dn);
> >   out:
> > -	if (!folio_test_uptodate(folio))
> > -		__folio_set_dropbehind(folio);
> > -	folio_unlock(folio);
> > -	folio_end_dropbehind(folio);
> > -	folio_put(folio);
> > +	f2fs_folio_put(folio, true);
> >   	return err;
> >   }
> 
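
For readers following the interleaving in [2], the ordering can be replayed
in a small userspace C sketch. This is a toy model, not kernel code:
toy_page, toy_put(), toy_test_clear_lru() and toy_replay() are hypothetical
names that only mimic the refcount, the PG_lru flag and the LRU links
described in the quoted commit message, and the CPUs are replayed
sequentially rather than run concurrently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins (assumptions); not the kernel's list_head or struct folio. */
struct toy_list {
	struct toy_list *prev, *next;
};

struct toy_page {
	int refcount;
	bool lru_flag;		/* models PG_lru */
	bool freed;		/* models "returned to the allocator" */
	struct toy_list lru;	/* models the LRU links */
};

static void toy_list_init(struct toy_list *head)
{
	head->prev = head->next = head;
}

static void toy_list_add_tail(struct toy_list *node, struct toy_list *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

static void toy_list_del(struct toy_list *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->prev = node->next = NULL;
}

static bool toy_on_list(const struct toy_list *head, const struct toy_list *node)
{
	for (const struct toy_list *p = head->next; p != head; p = p->next)
		if (p == node)
			return true;
	return false;
}

/* Models a test-and-clear of the LRU flag: returns the old value. */
static bool toy_test_clear_lru(struct toy_page *page)
{
	bool was_set = page->lru_flag;

	page->lru_flag = false;
	return was_set;
}

/* Models the final put: unlink only if the LRU flag is still set, then free. */
static void toy_put(struct toy_page *page)
{
	assert(page->refcount > 0);
	if (--page->refcount > 0)
		return;
	if (toy_test_clear_lru(page))
		toy_list_del(&page->lru);
	page->freed = true;
}

/* Returns true if the page ends up freed while still linked on the list. */
static bool toy_replay(bool isolate_races)
{
	struct toy_list lruvec;
	struct toy_page t = { .refcount = 1, .lru_flag = true };

	toy_list_init(&lruvec);
	toy_list_add_tail(&t.lru, &lruvec);	/* split put T on the LRU */

	if (isolate_races)
		(void)toy_test_clear_lru(&t);	/* isolator clears the flag first */

	toy_put(&t);				/* split caller drops the last ref */

	return t.freed && toy_on_list(&lruvec, &t.lru);
}
```

In this model toy_replay(true) returns true: the final put finds the flag
already cleared, skips the unlink, and leaves a freed page linked on the
list. toy_replay(false) returns false because the put path unlinks the page
before freeing it.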



end of thread, other threads:[~2026-06-15 15:21 UTC | newest]

Thread overview: 10+ messages
2026-06-08  9:09 [PATCH] Revert "f2fs: remove non-uptodate folio from the page cache in move_data_block" zhaoyang.huang
2026-06-08  9:09 ` [f2fs-dev] " zhaoyang.huang via Linux-f2fs-devel
2026-06-09  0:28 ` Zhaoyang Huang
2026-06-09  0:28   ` Zhaoyang Huang
2026-06-12  3:25 ` [f2fs-dev] " Chao Yu via Linux-f2fs-devel
2026-06-12  3:25   ` Chao Yu
2026-06-12  3:57   ` [f2fs-dev] " Zhaoyang Huang
2026-06-12  3:57     ` Zhaoyang Huang
2026-06-15 15:21   ` Jaegeuk Kim
2026-06-15 15:21     ` [f2fs-dev] " Jaegeuk Kim via Linux-f2fs-devel
