* [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files
@ 2026-01-05 15:30 Nanzhe Zhao
2026-01-05 15:30 ` [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation Nanzhe Zhao
` (5 more replies)
0 siblings, 6 replies; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-05 15:30 UTC (permalink / raw)
To: Kim Jaegeuk; +Cc: Chao Yu, linux-f2fs-devel, linux-kernel, Nanzhe Zhao
When reading immutable, non-compressed files with large folios enabled,
I was able to reproduce readahead hangs while reading sparse files with
holes and heavily fragmented files. The problems were caused by a few
corner cases in the large-folio read loop:
- f2fs_folio_state could be observed with an uninitialized
read_pages_pending field.
- subpage accounting could become inconsistent with BIO completion,
leading to folios being prematurely unlocked/marked uptodate.
- NULL_ADDR/NEW_ADDR blocks can carry F2FS_MAP_MAPPED, causing the
large-folio read path to treat hole blocks as mapped and to account
them in read_pages_pending.
- in readahead, a folio that never had any subpage queued to a BIO
would not be seen by f2fs_finish_read_bio(), leaving it locked.
- the zeroing path did not advance index/offset before continuing.
This patch series fixes the above issues in f2fs_read_data_large_folio()
introduced by commit 05e65c14ea59 ("f2fs: support large folio for
immutable non-compressed case").
Testing
-------
All patches pass scripts/checkpatch.pl.
I tested the basic large-folio immutable read case described in the
original thread (create a large file, set immutable, drop caches to
reload the inode, then read it), and additionally verified:
- sparse file
- heavily fragmented file
In all cases, reads completed without hangs and data was verified against
the expected contents.
Nanzhe Zhao (5):
f2fs: Zero f2fs_folio_state on allocation
f2fs: Accounting large folio subpages before bio submission
f2fs: add f2fs_block_needs_zeroing() to handle hole blocks
f2fs: add 'folio_in_bio' to handle readahead folios with no BIO
submission
f2fs: advance index and offset after zeroing in large folio read
fs/f2fs/data.c | 54 +++++++++++++++++++++++++++++++++-----------------
1 file changed, 36 insertions(+), 18 deletions(-)
base-commit: 48b5439e04ddf4508ecaf588219012dc81d947c0
--
2.34.1
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation
2026-01-05 15:30 [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Nanzhe Zhao
@ 2026-01-05 15:30 ` Nanzhe Zhao
2026-01-06 3:38 ` Barry Song
2026-01-06 9:16 ` Chao Yu
2026-01-05 15:30 ` [PATCH v1 2/5] f2fs: Accounting large folio subpages before bio submission Nanzhe Zhao
` (4 subsequent siblings)
5 siblings, 2 replies; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-05 15:30 UTC (permalink / raw)
To: Kim Jaegeuk; +Cc: Chao Yu, linux-f2fs-devel, linux-kernel, Nanzhe Zhao
f2fs_folio_state is attached to folio->private and is expected to start
with read_pages_pending == 0. However, the structure was allocated from
ffs_entry_slab without being fully initialized, which can leave
read_pages_pending with stale values.
Allocate the object with __GFP_ZERO so all fields are reliably zeroed at
creation time.
Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
---
fs/f2fs/data.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 471e52c6c1e0..ab091b294fa7 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2389,7 +2389,7 @@ static struct f2fs_folio_state *ffs_find_or_alloc(struct folio *folio)
if (ffs)
return ffs;
- ffs = f2fs_kmem_cache_alloc(ffs_entry_slab, GFP_NOIO, true, NULL);
+ ffs = f2fs_kmem_cache_alloc(ffs_entry_slab, GFP_NOIO | __GFP_ZERO, true, NULL);
spin_lock_init(&ffs->state_lock);
folio_attach_private(folio, ffs);
--
2.34.1
* [PATCH v1 2/5] f2fs: Accounting large folio subpages before bio submission
2026-01-05 15:30 [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Nanzhe Zhao
2026-01-05 15:30 ` [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation Nanzhe Zhao
@ 2026-01-05 15:30 ` Nanzhe Zhao
2026-01-06 9:16 ` Chao Yu
2026-01-05 15:30 ` [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks Nanzhe Zhao
` (3 subsequent siblings)
5 siblings, 1 reply; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-05 15:30 UTC (permalink / raw)
To: Kim Jaegeuk; +Cc: Chao Yu, linux-f2fs-devel, linux-kernel, Nanzhe Zhao
In f2fs_read_data_large_folio(), read_pages_pending is incremented only
after the subpage has been added to the BIO. With a heavily fragmented
file, each new subpage can force submission of the previous BIO.
If the BIO completes quickly, f2fs_finish_read_bio() may decrement
read_pages_pending to zero and call folio_end_read() while the read loop
is still processing other subpages of the same large folio.
Fix the ordering by incrementing read_pages_pending before any possible
BIO submission for the current subpage, matching the iomap ordering and
preventing premature folio_end_read().
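The ordering argument can be shown with a toy userspace model (nothing here is kernel API: pending stands in for ffs->read_pages_pending and complete_subpage() simulates f2fs_finish_read_bio() finishing one subpage):

```c
#include <assert.h>

static int pending;		/* models ffs->read_pages_pending */
static int folio_ended;		/* models folio_end_read() having run */

/* One subpage read completes; the last completion ends the folio. */
static void complete_subpage(void)
{
	if (--pending == 0)
		folio_ended = 1;
}

/* Increment-first ordering from the patch: the current subpage is
 * accounted BEFORE the previous bio can complete, so the counter never
 * drops to zero while later subpages of the same folio are still being
 * queued. */
static void read_loop_fixed(int nr_subpages)
{
	assert(nr_subpages > 0);
	for (int i = 0; i < nr_subpages; i++) {
		pending++;		/* before any possible submission */
		if (i > 0)
			complete_subpage(); /* previous bio finishes fast */
		assert(!folio_ended);	/* never ended mid-loop */
	}
	while (pending)			/* drain remaining completions */
		complete_subpage();
}
```

With the old ordering (increment after bio_add_folio()), a fast completion between subpages could drive the counter to zero and end the folio prematurely; the assert inside the loop is exactly the invariant the patch restores.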
Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
---
fs/f2fs/data.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index ab091b294fa7..4bef04560924 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2486,6 +2486,18 @@ static int f2fs_read_data_large_folio(struct inode *inode,
continue;
}
+ /* We must increment read_pages_pending before possible BIOs submitting
+ * to prevent from premature folio_end_read() call on folio
+ */
+ if (folio_test_large(folio)) {
+ ffs = ffs_find_or_alloc(folio);
+
+ /* set the bitmap to wait */
+ spin_lock_irq(&ffs->state_lock);
+ ffs->read_pages_pending++;
+ spin_unlock_irq(&ffs->state_lock);
+ }
+
/*
* This page will go to BIO. Do we need to send this
* BIO off first?
@@ -2513,15 +2525,6 @@ static int f2fs_read_data_large_folio(struct inode *inode,
offset << PAGE_SHIFT))
goto submit_and_realloc;
- if (folio_test_large(folio)) {
- ffs = ffs_find_or_alloc(folio);
-
- /* set the bitmap to wait */
- spin_lock_irq(&ffs->state_lock);
- ffs->read_pages_pending++;
- spin_unlock_irq(&ffs->state_lock);
- }
-
inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
f2fs_update_iostat(F2FS_I_SB(inode), NULL, FS_DATA_READ_IO,
F2FS_BLKSIZE);
--
2.34.1
* [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks
2026-01-05 15:30 [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Nanzhe Zhao
2026-01-05 15:30 ` [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation Nanzhe Zhao
2026-01-05 15:30 ` [PATCH v1 2/5] f2fs: Accounting large folio subpages before bio submission Nanzhe Zhao
@ 2026-01-05 15:30 ` Nanzhe Zhao
2026-01-06 9:19 ` Chao Yu
2026-01-06 9:30 ` kernel test robot
2026-01-05 15:31 ` [PATCH v1 4/5] f2fs: add 'folio_in_bio' to handle readahead folios with no BIO submission Nanzhe Zhao
` (2 subsequent siblings)
5 siblings, 2 replies; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-05 15:30 UTC (permalink / raw)
To: Kim Jaegeuk; +Cc: Chao Yu, linux-f2fs-devel, linux-kernel, Nanzhe Zhao
f2fs_read_data_large_folio() relies on f2fs_map_blocks() to decide whether
a subpage should be zero-filled or queued to a read bio.
However, f2fs_map_blocks() can set F2FS_MAP_MAPPED for NULL_ADDR and
NEW_ADDR in the non-DIO, no-create path. The large folio read code then
treats such hole blocks as mapped blocks, and may account them
in read_pages_pending and attempt to build bios for them, which can
leave tasks stuck in readahead for heavily fragmented files.
Add a helper, f2fs_block_needs_zeroing(), which detects NULL_ADDR and
NEW_ADDR in struct f2fs_map_blocks. Use it to prioritize the zeroing
path by checking f2fs_block_needs_zeroing() before
(map.m_flags & F2FS_MAP_MAPPED) under the got_it: label.
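A self-contained sketch of the helper (the constants and struct below are illustrative stand-ins; the real NULL_ADDR/NEW_ADDR definitions and the full struct f2fs_map_blocks live in fs/f2fs/f2fs.h):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in values: in f2fs a hole is NULL_ADDR and an allocated-but-
 * unwritten block is NEW_ADDR; neither has on-disk data to read. */
#define NULL_ADDR 0u
#define NEW_ADDR  0xFFFFFFFFu	/* placeholder for the real sentinel */

struct f2fs_map_blocks {
	unsigned int m_pblk;	/* physical block from f2fs_map_blocks() */
};

/* The helper from the patch: such blocks must be zero-filled in the
 * folio instead of being accounted in read_pages_pending and queued to
 * a read bio. */
static inline bool f2fs_block_needs_zeroing(const struct f2fs_map_blocks *map)
{
	return map->m_pblk == NULL_ADDR || map->m_pblk == NEW_ADDR;
}
```

Checking this helper before F2FS_MAP_MAPPED means a mapping that carries the mapped flag but points at a hole sentinel still takes the zeroing path.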
Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
---
fs/f2fs/data.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 4bef04560924..66ab7a43a56f 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2413,6 +2413,11 @@ static void ffs_detach_free(struct folio *folio)
kmem_cache_free(ffs_entry_slab, ffs);
}
+static inline bool f2fs_block_needs_zeroing(const struct f2fs_map_blocks *map)
+{
+ return map->m_pblk == NULL_ADDR || map->m_pblk == NEW_ADDR;
+}
+
static int f2fs_read_data_large_folio(struct inode *inode,
struct readahead_control *rac, struct folio *folio)
{
@@ -2468,14 +2473,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
if (ret)
goto err_out;
got_it:
- if ((map.m_flags & F2FS_MAP_MAPPED)) {
- block_nr = map.m_pblk + index - map.m_lblk;
- if (!f2fs_is_valid_blkaddr(F2FS_I_SB(inode), block_nr,
- DATA_GENERIC_ENHANCE_READ)) {
- ret = -EFSCORRUPTED;
- goto err_out;
- }
- } else {
+ if ((f2fs_block_needs_zeroing(&map))) {
folio_zero_range(folio, offset << PAGE_SHIFT, PAGE_SIZE);
if (f2fs_need_verity(inode, index) &&
!fsverity_verify_page(folio_file_page(folio,
@@ -2484,6 +2482,13 @@ static int f2fs_read_data_large_folio(struct inode *inode,
goto err_out;
}
continue;
+ } else if((map.m_flags & F2FS_MAP_MAPPED)) {
+ block_nr = map.m_pblk + index - map.m_lblk;
+ if (!f2fs_is_valid_blkaddr(F2FS_I_SB(inode), block_nr,
+ DATA_GENERIC_ENHANCE_READ)) {
+ ret = -EFSCORRUPTED;
+ goto err_out;
+ }
}
/* We must increment read_pages_pending before possible BIOs submitting
--
2.34.1
* [PATCH v1 4/5] f2fs: add 'folio_in_bio' to handle readahead folios with no BIO submission
2026-01-05 15:30 [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Nanzhe Zhao
` (2 preceding siblings ...)
2026-01-05 15:30 ` [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks Nanzhe Zhao
@ 2026-01-05 15:31 ` Nanzhe Zhao
2026-01-06 9:31 ` Chao Yu
2026-01-05 15:31 ` [PATCH v1 5/5] f2fs: advance index and offset after zeroing in large folio read Nanzhe Zhao
2026-01-07 3:08 ` [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Jaegeuk Kim
5 siblings, 1 reply; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-05 15:31 UTC (permalink / raw)
To: Kim Jaegeuk; +Cc: Chao Yu, linux-f2fs-devel, linux-kernel, Nanzhe Zhao
f2fs_read_data_large_folio() can build a single read BIO across multiple
folios during readahead. If a folio ends up having none of its subpages
added to the BIO (e.g. all subpages are zeroed / treated as holes), it
will never be seen by f2fs_finish_read_bio(), so folio_end_read() is
never called. This leaves the folio locked and not marked uptodate.
Track whether the current folio has been added to a BIO via a local
'folio_in_bio' bool flag, and when iterating readahead folios, explicitly
mark the folio uptodate (on success) and unlock it when nothing was added.
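The fallback can be modeled in userspace (toy folio struct and a hypothetical helper name; only the control flow mirrors the patch, none of this is kernel API):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy folio state for the readahead fallback. */
struct folio_model {
	bool locked;
	bool uptodate;
};

/* Models the end of the per-folio loop: if no subpage of this folio was
 * added to a bio, bio completion will never call folio_end_read() for
 * it, so finish it here instead: mark uptodate on success, and always
 * unlock so the reader does not hang on folio_lock(). */
static void finish_folio_without_bio(struct folio_model *folio,
				     bool folio_in_bio, int ret)
{
	if (!folio_in_bio) {
		if (!ret)
			folio->uptodate = true;
		folio->locked = false;
	}
}
```

When folio_in_bio is true the folio is left alone, because the bio completion path owns its unlock.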
Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
---
fs/f2fs/data.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 66ab7a43a56f..ac569a396914 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2430,6 +2430,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
unsigned nrpages;
struct f2fs_folio_state *ffs;
int ret = 0;
+ bool folio_in_bio = false;
if (!IS_IMMUTABLE(inode))
return -EOPNOTSUPP;
@@ -2445,6 +2446,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
if (!folio)
goto out;
+ folio_in_bio = false
index = folio->index;
offset = 0;
ffs = NULL;
@@ -2530,6 +2532,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
offset << PAGE_SHIFT))
goto submit_and_realloc;
+ folio_in_bio = true;
inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
f2fs_update_iostat(F2FS_I_SB(inode), NULL, FS_DATA_READ_IO,
F2FS_BLKSIZE);
@@ -2539,6 +2542,11 @@ static int f2fs_read_data_large_folio(struct inode *inode,
}
trace_f2fs_read_folio(folio, DATA);
if (rac) {
+ if (!folio_in_bio) {
+ if (!ret)
+ folio_mark_uptodate(folio);
+ folio_unlock(folio);
+ }
folio = readahead_folio(rac);
goto next_folio;
}
--
2.34.1
* [PATCH v1 5/5] f2fs: advance index and offset after zeroing in large folio read
2026-01-05 15:30 [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Nanzhe Zhao
` (3 preceding siblings ...)
2026-01-05 15:31 ` [PATCH v1 4/5] f2fs: add 'folio_in_bio' to handle readahead folios with no BIO submission Nanzhe Zhao
@ 2026-01-05 15:31 ` Nanzhe Zhao
2026-01-06 9:35 ` Chao Yu
2026-01-07 3:08 ` [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Jaegeuk Kim
5 siblings, 1 reply; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-05 15:31 UTC (permalink / raw)
To: Kim Jaegeuk; +Cc: Chao Yu, linux-f2fs-devel, linux-kernel, Nanzhe Zhao
In f2fs_read_data_large_folio(), the block zeroing path calls
folio_zero_range() and then continues the loop. However, it fails to
advance index and offset before continuing.
This can cause the loop to repeatedly process the same subpage of the
folio, leading to stalls/hangs and incorrect progress when reading large
folios with holes/zeroed blocks.
Fix it by incrementing index and offset in the zeroing path before
continuing.
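A toy version of the subpage loop shows why the advance matters (hypothetical helper; without the offset increment before 'continue', the loop keeps re-zeroing the same hole subpage while nrpages counts down and later subpages are never visited):

```c
#include <assert.h>

/* is_hole[offset] marks which subpages are holes. Returns how many
 * distinct subpages were zero-filled. */
static int zero_subpages(const int *is_hole, unsigned nrpages)
{
	unsigned offset = 0;	/* subpage offset within the large folio */
	int zeroed = 0;

	for (; nrpages; nrpages--) {
		if (is_hole[offset]) {
			zeroed++;	/* models folio_zero_range() */
			offset++;	/* the fix: advance before continue */
			continue;
		}
		offset++;		/* mapped path advances at loop tail */
	}
	return zeroed;
}
```

With the increment in place each iteration makes forward progress, matching the mapped path that already advances index/offset at the bottom of the loop.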
Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
---
fs/f2fs/data.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index ac569a396914..07c222bcc5e0 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2446,7 +2446,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
if (!folio)
goto out;
- folio_in_bio = false
+ folio_in_bio = false;
index = folio->index;
offset = 0;
ffs = NULL;
@@ -2483,6 +2483,8 @@ static int f2fs_read_data_large_folio(struct inode *inode,
ret = -EIO;
goto err_out;
}
+ index++;
+ offset++;
continue;
} else if((map.m_flags & F2FS_MAP_MAPPED)) {
block_nr = map.m_pblk + index - map.m_lblk;
--
2.34.1
* Re: [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation
2026-01-05 15:30 ` [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation Nanzhe Zhao
@ 2026-01-06 3:38 ` Barry Song
2026-01-07 3:44 ` Nanzhe Zhao
2026-01-06 9:16 ` Chao Yu
1 sibling, 1 reply; 21+ messages in thread
From: Barry Song @ 2026-01-06 3:38 UTC (permalink / raw)
To: Nanzhe Zhao; +Cc: Kim Jaegeuk, Chao Yu, linux-f2fs-devel, linux-kernel
On Tue, Jan 6, 2026 at 12:12 AM Nanzhe Zhao <nzzhao@126.com> wrote:
>
> f2fs_folio_state is attached to folio->private and is expected to start
> with read_pages_pending == 0. However, the structure was allocated from
> ffs_entry_slab without being fully initialized, which can leave
> read_pages_pending with stale values.
>
> Allocate the object with __GFP_ZERO so all fields are reliably zeroed at
> creation time.
>
> Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
We already have GFP_F2FS_ZERO, but it includes GFP_IO. Should we
introduce another variant, such as GFP_F2FS_NOIO_ZERO (or similar)?
Overall, LGTM.
Reviewed-by: Barry Song <baohua@kernel.org>
> ---
> fs/f2fs/data.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 471e52c6c1e0..ab091b294fa7 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -2389,7 +2389,7 @@ static struct f2fs_folio_state *ffs_find_or_alloc(struct folio *folio)
> if (ffs)
> return ffs;
>
> - ffs = f2fs_kmem_cache_alloc(ffs_entry_slab, GFP_NOIO, true, NULL);
> + ffs = f2fs_kmem_cache_alloc(ffs_entry_slab, GFP_NOIO | __GFP_ZERO, true, NULL);
>
> spin_lock_init(&ffs->state_lock);
> folio_attach_private(folio, ffs);
> --
> 2.34.1
Thanks
Barry
* Re: [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation
2026-01-05 15:30 ` [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation Nanzhe Zhao
2026-01-06 3:38 ` Barry Song
@ 2026-01-06 9:16 ` Chao Yu
1 sibling, 0 replies; 21+ messages in thread
From: Chao Yu @ 2026-01-06 9:16 UTC (permalink / raw)
To: Nanzhe Zhao, Kim Jaegeuk; +Cc: chao, linux-f2fs-devel, linux-kernel
On 1/5/2026 11:30 PM, Nanzhe Zhao wrote:
> f2fs_folio_state is attached to folio->private and is expected to start
> with read_pages_pending == 0. However, the structure was allocated from
> ffs_entry_slab without being fully initialized, which can leave
> read_pages_pending with stale values.
>
> Allocate the object with __GFP_ZERO so all fields are reliably zeroed at
> creation time.
>
> Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Thanks,
* Re: [PATCH v1 2/5] f2fs: Accounting large folio subpages before bio submission
2026-01-05 15:30 ` [PATCH v1 2/5] f2fs: Accounting large folio subpages before bio submission Nanzhe Zhao
@ 2026-01-06 9:16 ` Chao Yu
0 siblings, 0 replies; 21+ messages in thread
From: Chao Yu @ 2026-01-06 9:16 UTC (permalink / raw)
To: Nanzhe Zhao, Kim Jaegeuk; +Cc: chao, linux-f2fs-devel, linux-kernel
On 1/5/2026 11:30 PM, Nanzhe Zhao wrote:
> In f2fs_read_data_large_folio(), read_pages_pending is incremented only
> after the subpage has been added to the BIO. With a heavily fragmented
> file, each new subpage can force submission of the previous BIO.
>
> If the BIO completes quickly, f2fs_finish_read_bio() may decrement
> read_pages_pending to zero and call folio_end_read() while the read loop
> is still processing other subpages of the same large folio.
>
> Fix the ordering by incrementing read_pages_pending before any possible
> BIO submission for the current subpage, matching the iomap ordering and
> preventing premature folio_end_read().
>
> Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Thanks,
* Re: [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks
2026-01-05 15:30 ` [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks Nanzhe Zhao
@ 2026-01-06 9:19 ` Chao Yu
2026-01-06 11:25 ` Nanzhe Zhao
2026-01-06 9:30 ` kernel test robot
1 sibling, 1 reply; 21+ messages in thread
From: Chao Yu @ 2026-01-06 9:19 UTC (permalink / raw)
To: Nanzhe Zhao, Kim Jaegeuk; +Cc: chao, linux-f2fs-devel, linux-kernel
On 1/5/2026 11:30 PM, Nanzhe Zhao wrote:
> f2fs_read_data_large_folio() relies on f2fs_map_blocks() to decide whether
> a subpage should be zero-filled or queued to a read bio.
>
> However, f2fs_map_blocks() can set F2FS_MAP_MAPPED for NULL_ADDR and
> NEW_ADDR in the non-DIO, no-create path. The large folio read code then
Nanzhe,
IIUC, f2fs_map_blocks(inode, &map, F2FS_GET_BLOCK_DEFAULT) won't map hole
space, or am I missing something?
Thanks,
> treats such hole blocks as mapped blocks, and may account them
> in read_pages_pending and attempt to build bios for them, which can
> leave tasks stuck in readahead for heavily fragmented files.
>
> Add a helper, f2fs_block_needs_zeroing(), which detects NULL_ADDR and
> NEW_ADDR from struct f2fs_map_blocks. Use it to prioritize the zeroing
> path by checking f2fs_block_needs_zeroing() before
> (map.m_flags & F2FS_MAP_MAPPED) under got_it: label.
>
> Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
> ---
> fs/f2fs/data.c | 21 +++++++++++++--------
> 1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 4bef04560924..66ab7a43a56f 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -2413,6 +2413,11 @@ static void ffs_detach_free(struct folio *folio)
> kmem_cache_free(ffs_entry_slab, ffs);
> }
>
> +static inline bool f2fs_block_needs_zeroing(const struct f2fs_map_blocks *map)
> +{
> + return map->m_pblk == NULL_ADDR || map->m_pblk == NEW_ADDR;
> +}
> +
> static int f2fs_read_data_large_folio(struct inode *inode,
> struct readahead_control *rac, struct folio *folio)
> {
> @@ -2468,14 +2473,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> if (ret)
> goto err_out;
> got_it:
> - if ((map.m_flags & F2FS_MAP_MAPPED)) {
> - block_nr = map.m_pblk + index - map.m_lblk;
> - if (!f2fs_is_valid_blkaddr(F2FS_I_SB(inode), block_nr,
> - DATA_GENERIC_ENHANCE_READ)) {
> - ret = -EFSCORRUPTED;
> - goto err_out;
> - }
> - } else {
> + if ((f2fs_block_needs_zeroing(&map))) {
> folio_zero_range(folio, offset << PAGE_SHIFT, PAGE_SIZE);
> if (f2fs_need_verity(inode, index) &&
> !fsverity_verify_page(folio_file_page(folio,
> @@ -2484,6 +2482,13 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> goto err_out;
> }
> continue;
> + } else if((map.m_flags & F2FS_MAP_MAPPED)) {
> + block_nr = map.m_pblk + index - map.m_lblk;
> + if (!f2fs_is_valid_blkaddr(F2FS_I_SB(inode), block_nr,
> + DATA_GENERIC_ENHANCE_READ)) {
> + ret = -EFSCORRUPTED;
> + goto err_out;
> + }
> }
>
> /* We must increment read_pages_pending before possible BIOs submitting
> --
> 2.34.1
>
* Re: [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks
2026-01-05 15:30 ` [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks Nanzhe Zhao
2026-01-06 9:19 ` Chao Yu
@ 2026-01-06 9:30 ` kernel test robot
1 sibling, 0 replies; 21+ messages in thread
From: kernel test robot @ 2026-01-06 9:30 UTC (permalink / raw)
To: Nanzhe Zhao, Kim Jaegeuk
Cc: llvm, oe-kbuild-all, Chao Yu, linux-f2fs-devel, linux-kernel,
Nanzhe Zhao
Hi Nanzhe,
kernel test robot noticed the following build warnings:
[auto build test WARNING on 48b5439e04ddf4508ecaf588219012dc81d947c0]
url: https://github.com/intel-lab-lkp/linux/commits/Nanzhe-Zhao/f2fs-Zero-f2fs_folio_state-on-allocation/20260106-005006
base: 48b5439e04ddf4508ecaf588219012dc81d947c0
patch link: https://lore.kernel.org/r/20260105153101.152892-4-nzzhao%40126.com
patch subject: [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks
config: x86_64-kexec (https://download.01.org/0day-ci/archive/20260106/202601061013.MBnRTOrG-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260106/202601061013.MBnRTOrG-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202601061013.MBnRTOrG-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> fs/f2fs/data.c:2485:13: warning: variable 'block_nr' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
2485 | } else if((map.m_flags & F2FS_MAP_MAPPED)) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fs/f2fs/data.c:2527:39: note: uninitialized use occurs here
2527 | f2fs_wait_on_block_writeback(inode, block_nr);
| ^~~~~~~~
fs/f2fs/data.c:2485:10: note: remove the 'if' if its condition is always true
2485 | } else if((map.m_flags & F2FS_MAP_MAPPED)) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fs/f2fs/data.c:2454:20: note: initialize the variable 'block_nr' to silence this warning
2454 | sector_t block_nr;
| ^
| = 0
1 warning generated.
vim +2485 fs/f2fs/data.c
2420
2421 static int f2fs_read_data_large_folio(struct inode *inode,
2422 struct readahead_control *rac, struct folio *folio)
2423 {
2424 struct bio *bio = NULL;
2425 sector_t last_block_in_bio = 0;
2426 struct f2fs_map_blocks map = {0, };
2427 pgoff_t index, offset;
2428 unsigned max_nr_pages = rac ? readahead_count(rac) :
2429 folio_nr_pages(folio);
2430 unsigned nrpages;
2431 struct f2fs_folio_state *ffs;
2432 int ret = 0;
2433
2434 if (!IS_IMMUTABLE(inode))
2435 return -EOPNOTSUPP;
2436
2437 if (f2fs_compressed_file(inode))
2438 return -EOPNOTSUPP;
2439
2440 map.m_seg_type = NO_CHECK_TYPE;
2441
2442 if (rac)
2443 folio = readahead_folio(rac);
2444 next_folio:
2445 if (!folio)
2446 goto out;
2447
2448 index = folio->index;
2449 offset = 0;
2450 ffs = NULL;
2451 nrpages = folio_nr_pages(folio);
2452
2453 for (; nrpages; nrpages--) {
2454 sector_t block_nr;
2455 /*
2456 * Map blocks using the previous result first.
2457 */
2458 if ((map.m_flags & F2FS_MAP_MAPPED) &&
2459 index > map.m_lblk &&
2460 index < (map.m_lblk + map.m_len))
2461 goto got_it;
2462
2463 /*
2464 * Then do more f2fs_map_blocks() calls until we are
2465 * done with this page.
2466 */
2467 memset(&map, 0, sizeof(map));
2468 map.m_seg_type = NO_CHECK_TYPE;
2469 map.m_lblk = index;
2470 map.m_len = max_nr_pages;
2471
2472 ret = f2fs_map_blocks(inode, &map, F2FS_GET_BLOCK_DEFAULT);
2473 if (ret)
2474 goto err_out;
2475 got_it:
2476 if ((f2fs_block_needs_zeroing(&map))) {
2477 folio_zero_range(folio, offset << PAGE_SHIFT, PAGE_SIZE);
2478 if (f2fs_need_verity(inode, index) &&
2479 !fsverity_verify_page(folio_file_page(folio,
2480 index))) {
2481 ret = -EIO;
2482 goto err_out;
2483 }
2484 continue;
> 2485 } else if((map.m_flags & F2FS_MAP_MAPPED)) {
2486 block_nr = map.m_pblk + index - map.m_lblk;
2487 if (!f2fs_is_valid_blkaddr(F2FS_I_SB(inode), block_nr,
2488 DATA_GENERIC_ENHANCE_READ)) {
2489 ret = -EFSCORRUPTED;
2490 goto err_out;
2491 }
2492 }
2493
2494 /* We must increment read_pages_pending before possible BIOs submitting
2495 * to prevent from premature folio_end_read() call on folio
2496 */
2497 if (folio_test_large(folio)) {
2498 ffs = ffs_find_or_alloc(folio);
2499
2500 /* set the bitmap to wait */
2501 spin_lock_irq(&ffs->state_lock);
2502 ffs->read_pages_pending++;
2503 spin_unlock_irq(&ffs->state_lock);
2504 }
2505
2506 /*
2507 * This page will go to BIO. Do we need to send this
2508 * BIO off first?
2509 */
2510 if (bio && (!page_is_mergeable(F2FS_I_SB(inode), bio,
2511 last_block_in_bio, block_nr) ||
2512 !f2fs_crypt_mergeable_bio(bio, inode, index, NULL))) {
2513 submit_and_realloc:
2514 f2fs_submit_read_bio(F2FS_I_SB(inode), bio, DATA);
2515 bio = NULL;
2516 }
2517 if (bio == NULL)
2518 bio = f2fs_grab_read_bio(inode, block_nr,
2519 max_nr_pages,
2520 f2fs_ra_op_flags(rac),
2521 index, false);
2522
2523 /*
2524 * If the page is under writeback, we need to wait for
2525 * its completion to see the correct decrypted data.
2526 */
2527 f2fs_wait_on_block_writeback(inode, block_nr);
2528
2529 if (!bio_add_folio(bio, folio, F2FS_BLKSIZE,
2530 offset << PAGE_SHIFT))
2531 goto submit_and_realloc;
2532
2533 inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
2534 f2fs_update_iostat(F2FS_I_SB(inode), NULL, FS_DATA_READ_IO,
2535 F2FS_BLKSIZE);
2536 last_block_in_bio = block_nr;
2537 index++;
2538 offset++;
2539 }
2540 trace_f2fs_read_folio(folio, DATA);
2541 if (rac) {
2542 folio = readahead_folio(rac);
2543 goto next_folio;
2544 }
2545 err_out:
2546 /* Nothing was submitted. */
2547 if (!bio) {
2548 if (!ret)
2549 folio_mark_uptodate(folio);
2550 folio_unlock(folio);
2551 return ret;
2552 }
2553
2554 if (ret) {
2555 f2fs_submit_read_bio(F2FS_I_SB(inode), bio, DATA);
2556
2557 /* Wait bios and clear uptodate. */
2558 folio_lock(folio);
2559 folio_clear_uptodate(folio);
2560 folio_unlock(folio);
2561 }
2562 out:
2563 f2fs_submit_read_bio(F2FS_I_SB(inode), bio, DATA);
2564 return ret;
2565 }
2566
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH v1 4/5] f2fs: add 'folio_in_bio' to handle readahead folios with no BIO submission
2026-01-05 15:31 ` [PATCH v1 4/5] f2fs: add 'folio_in_bio' to handle readahead folios with no BIO submission Nanzhe Zhao
@ 2026-01-06 9:31 ` Chao Yu
2026-01-07 0:33 ` Nanzhe Zhao
0 siblings, 1 reply; 21+ messages in thread
From: Chao Yu @ 2026-01-06 9:31 UTC (permalink / raw)
To: Nanzhe Zhao, Kim Jaegeuk; +Cc: chao, linux-f2fs-devel, linux-kernel
On 1/5/2026 11:31 PM, Nanzhe Zhao wrote:
> f2fs_read_data_large_folio() can build a single read BIO across multiple
> folios during readahead. If a folio ends up having none of its subpages
> added to the BIO (e.g. all subpages are zeroed / treated as holes), it
> will never be seen by f2fs_finish_read_bio(), so folio_end_read() is
> never called. This leaves the folio locked and not marked uptodate.
>
> Track whether the current folio has been added to a BIO via a local
> 'folio_in_bio' bool flag, and when iterating readahead folios, explicitly
> mark the folio uptodate (on success) and unlock it when nothing was added.
>
> Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
> ---
> fs/f2fs/data.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 66ab7a43a56f..ac569a396914 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -2430,6 +2430,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> unsigned nrpages;
> struct f2fs_folio_state *ffs;
> int ret = 0;
> + bool folio_in_bio = false;
No need to initialize folio_in_bio?
>
> if (!IS_IMMUTABLE(inode))
> return -EOPNOTSUPP;
> @@ -2445,6 +2446,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> if (!folio)
> goto out;
>
> + folio_in_bio = false
folio_in_bio = false;
> index = folio->index;
> offset = 0;
> ffs = NULL;
> @@ -2530,6 +2532,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> offset << PAGE_SHIFT))
> goto submit_and_realloc;
>
> + folio_in_bio = true;
> inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
> f2fs_update_iostat(F2FS_I_SB(inode), NULL, FS_DATA_READ_IO,
> F2FS_BLKSIZE);
> @@ -2539,6 +2542,11 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> }
> trace_f2fs_read_folio(folio, DATA);
> if (rac) {
> + if (!folio_in_bio) {
> + if (!ret)
> + folio_mark_uptodate(folio);
> + folio_unlock(folio);
> + }
err_out:
/* Nothing was submitted. */
if (!bio) {
if (!ret)
folio_mark_uptodate(folio);
folio_unlock(folio);
^^^^^^^^^^^^
If all folios in rac have not been mapped (hole case), will we unlock the folio twice?
Thanks,
return ret;
}
> folio = readahead_folio(rac);
> goto next_folio;
> }
> --
> 2.34.1
>
* Re: [PATCH v1 5/5] f2fs: advance index and offset after zeroing in large folio read
2026-01-05 15:31 ` [PATCH v1 5/5] f2fs: advance index and offset after zeroing in large folio read Nanzhe Zhao
@ 2026-01-06 9:35 ` Chao Yu
0 siblings, 0 replies; 21+ messages in thread
From: Chao Yu @ 2026-01-06 9:35 UTC (permalink / raw)
To: Nanzhe Zhao, Kim Jaegeuk; +Cc: chao, linux-f2fs-devel, linux-kernel
On 1/5/2026 11:31 PM, Nanzhe Zhao wrote:
> In f2fs_read_data_large_folio(), the block zeroing path calls
> folio_zero_range() and then continues the loop. However, it fails to
> advance index and offset before continuing.
>
> This can cause the loop to repeatedly process the same subpage of the
> folio, leading to stalls/hangs and incorrect progress when reading large
> folios with holes/zeroed blocks.
>
> Fix it by incrementing index and offset in the zeroing path before
> continuing.
>
> Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
> ---
> fs/f2fs/data.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index ac569a396914..07c222bcc5e0 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -2446,7 +2446,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> if (!folio)
> goto out;
>
> - folio_in_bio = false
> + folio_in_bio = false;
Should be fixed in 4/5.
> index = folio->index;
> offset = 0;
> ffs = NULL;
> @@ -2483,6 +2483,8 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> ret = -EIO;
> goto err_out;
> }
> + index++;
> + offset++;
What about increasing index & offset in for () statement, in case we missed
to update them anywhere.
Thanks,
> continue;
> } else if((map.m_flags & F2FS_MAP_MAPPED)) {
> block_nr = map.m_pblk + index - map.m_lblk;
> --
> 2.34.1
>
* Re:Re: [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks
2026-01-06 9:19 ` Chao Yu
@ 2026-01-06 11:25 ` Nanzhe Zhao
0 siblings, 0 replies; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-06 11:25 UTC (permalink / raw)
To: Chao Yu; +Cc: Kim Jaegeuk, linux-f2fs-devel, linux-kernel
At 2026-01-06 17:19:14, "Chao Yu" <chao@kernel.org> wrote:
>On 1/5/2026 11:30 PM, Nanzhe Zhao wrote:
>>IIUC, f2fs_map_blocks(inode, &map, F2FS_GET_BLOCK_DEFAULT) won't map hole
>>space, or am I missing something?
>>
>>Thanks,
My fault, I missed the goto sync_out statement in the default case. Thanks for pointing that out.
* Re: [PATCH v1 4/5] f2fs: add 'folio_in_bio' to handle readahead folios with no BIO submission
2026-01-06 9:31 ` Chao Yu
@ 2026-01-07 0:33 ` Nanzhe Zhao
2026-01-07 1:16 ` Chao Yu
0 siblings, 1 reply; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-07 0:33 UTC (permalink / raw)
To: Chao Yu; +Cc: Kim Jaegeuk, linux-f2fs-devel, linux-kernel
Hi Chao Yu,
At 2026-01-06 17:31:20, "Chao Yu" <chao@kernel.org> wrote:
>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>> index 66ab7a43a56f..ac569a396914 100644
>>> --- a/fs/f2fs/data.c
>>> +++ b/fs/f2fs/data.c
>>> @@ -2430,6 +2430,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
>>> unsigned nrpages;
>>> struct f2fs_folio_state *ffs;
>>> int ret = 0;
>>> + bool folio_in_bio = false;
>>
>>No need to initialize folio_in_bio?
Agreed. It's redundant since we reset it to false for each new folio before processing.
>>> @@ -2539,6 +2542,11 @@ static int f2fs_read_data_large_folio(struct inode *inode,
>>> }
>>> trace_f2fs_read_folio(folio, DATA);
>>> if (rac) {
>>> + if (!folio_in_bio) {
>>> + if (!ret)
>>> + folio_mark_uptodate(folio);
>>> + folio_unlock(folio);
>>> + }
>>
>>err_out:
>> /* Nothing was submitted. */
>> if (!bio) {
>> if (!ret)
>> folio_mark_uptodate(folio);
>> folio_unlock(folio);
>>
>> ^^^^^^^^^^^^
>>
>>If all folios in rac have not been mapped (hole case), will we unlock the folio twice?
Are you worried the folio could be unlocked once in the if (rac) { ... } block and then
unlocked again at err_out:? If so, I think that won't happen.
In such a case, every non-NULL folio will be unlocked exactly once by:
if (!folio_in_bio) {
if (!ret)
folio_mark_uptodate(folio);
folio_unlock(folio);
}
Specifically, after the last folio runs through the block above, the next call:
folio = readahead_folio(rac);
will return NULL. Then we go to next_folio:, and will directly hit:
if (!folio)
goto out;
This jumps straight to the out: label, skipping err_out: entirely.
Therefore, when ret is not an error code, the err_out: label will never be reached.
If ret becomes an error code, then the current folio will immediately goto err_out;
and be unlocked there once.
If rac is NULL (meaning we only read the single large folio passed in as the function argument),
we won't enter the if (rac) { ... goto next_folio; } path at all, so we also won't go to next_folio
and then potentially goto out;. In that case, it will naturally be unlocked once at err_out:.
Or am I missing some edge case here?
Thanks,
Nanzhe
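[Editor's illustration] The "unlocked exactly once" invariant argued above can be checked with a toy userspace model. This is not f2fs code; simulate_readahead(), in_bio[], and unlocks[] are hypothetical stand-ins for the real folio state: a folio with at least one subpage in a BIO is unlocked later by BIO completion (f2fs_finish_read_bio()), while a folio with none (all holes) is unlocked inline by the new !folio_in_bio branch in the rac path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * in_bio[i] says whether folio i had any subpage queued to a BIO.
 * On return, unlocks[i] holds how many times folio i was unlocked;
 * the invariant under discussion is that every entry equals 1.
 */
static void simulate_readahead(const bool *in_bio, size_t nr, int *unlocks)
{
	size_t i;

	for (i = 0; i < nr; i++)
		unlocks[i] = 0;

	/* the readahead loop: if (!folio_in_bio) { ... folio_unlock() } */
	for (i = 0; i < nr; i++) {
		if (!in_bio[i])
			unlocks[i]++;
		/* folios with subpages in a BIO stay locked here ... */
	}

	/* ... and are unlocked exactly once by BIO completion */
	for (i = 0; i < nr; i++)
		if (in_bio[i])
			unlocks[i]++;
}
```

The model mirrors the argument in the mail: the inline branch and BIO completion are mutually exclusive per folio, so no folio is unlocked twice and none is left locked.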
* Re: [PATCH v1 4/5] f2fs: add 'folio_in_bio' to handle readahead folios with no BIO submission
2026-01-07 0:33 ` Nanzhe Zhao
@ 2026-01-07 1:16 ` Chao Yu
0 siblings, 0 replies; 21+ messages in thread
From: Chao Yu @ 2026-01-07 1:16 UTC (permalink / raw)
To: Nanzhe Zhao; +Cc: chao, Kim Jaegeuk, linux-f2fs-devel, linux-kernel
On 1/7/2026 8:33 AM, Nanzhe Zhao wrote:
> Hi Chao Yu,
> At 2026-01-06 17:31:20, "Chao Yu" <chao@kernel.org> wrote:
>>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>>> index 66ab7a43a56f..ac569a396914 100644
>>>> --- a/fs/f2fs/data.c
>>>> +++ b/fs/f2fs/data.c
>>>> @@ -2430,6 +2430,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
>>>> unsigned nrpages;
>>>> struct f2fs_folio_state *ffs;
>>>> int ret = 0;
>>>> + bool folio_in_bio = false;
>>>
>>> No need to initialize folio_in_bio?
>
> Agreed. It's redundant since we reset it to false for each new folio before processing.
>
>>>> @@ -2539,6 +2542,11 @@ static int f2fs_read_data_large_folio(struct inode *inode,
>>>> }
>>>> trace_f2fs_read_folio(folio, DATA);
>>>> if (rac) {
>>>> + if (!folio_in_bio) {
>>>> + if (!ret)
>>>> + folio_mark_uptodate(folio);
>>>> + folio_unlock(folio);
>>>> + }
>>>
>>> err_out:
>>> /* Nothing was submitted. */
>>> if (!bio) {
>>> if (!ret)
>>> folio_mark_uptodate(folio);
>>> folio_unlock(folio);
>>>
>>> ^^^^^^^^^^^^
>>>
>>> If all folios in rac have not been mapped (hole case), will we unlock the folio twice?
>
> Are you worried the folio could be unlocked once in the if (rac) { ... } block and then
> unlocked again at err_out:? If so, I think that won't happen.
>
> In such a case, every non-NULL folio will be unlocked exactly once by:
>
> if (!folio_in_bio) {
> if (!ret)
> folio_mark_uptodate(folio);
> folio_unlock(folio);
> }
> Specifically, after the last folio runs through the block above, the next call:
>
> folio = readahead_folio(rac);
> will return NULL. Then we go to next_folio:, and will directly hit:
>
> if (!folio)
> goto out;
> This jumps straight to the out: label, skipping err_out: entirely.
> Therefore, when ret is not an error code, the err_out: label will never be reached.
>
> If ret becomes an error code, then the current folio will immediately goto err_out;
> and be unlocked there once.
>
> If rac is NULL (meaning we only read the single large folio passed in as the function argument),
> we won't enter the if (rac) { ... goto next_folio; } path at all, so we also won't go to next_folio
> and then potentially goto out;. In that case, it will naturally be unlocked once at err_out:.
> Or am I missing some edge case here?
Nanzhe,
Oh, yes, I think so, thanks for the explanation.
Thanks,
>
> Thanks,
> Nanzhe
* Re: [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files
2026-01-05 15:30 [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Nanzhe Zhao
` (4 preceding siblings ...)
2026-01-05 15:31 ` [PATCH v1 5/5] f2fs: advance index and offset after zeroing in large folio read Nanzhe Zhao
@ 2026-01-07 3:08 ` Jaegeuk Kim
2026-01-08 2:17 ` Nanzhe Zhao
5 siblings, 1 reply; 21+ messages in thread
From: Jaegeuk Kim @ 2026-01-07 3:08 UTC (permalink / raw)
To: Nanzhe Zhao; +Cc: Chao Yu, linux-f2fs-devel, linux-kernel
Hi Nanzhe,
fyi - I applied the beginning two patches first.
Thanks,
On 01/05, Nanzhe Zhao wrote:
> When reading immutable, non-compressed files with large folios enabled,
> I was able to reproduce readahead hangs while reading sparse files with
> holes and heavily fragmented files. The problems were caused by a few
> corner cases in the large-folio read loop:
>
> - f2fs_folio_state could be observed with uninitialized field
> read_pages_pending
> - subpage accounting could become inconsistent with BIO completion,
> leading to folios being prematurely unlocked/marked uptodate.
> - NULL_ADDR/NEW_ADDR blocks can carry F2FS_MAP_MAPPED, causing the
> large-folio read path to treat hole blocks as mapped and to account
> them in read_pages_pending.
> - in readahead, a folio that never had any subpage queued to a BIO
> would not be seen by f2fs_finish_read_bio(), leaving it locked.
> - the zeroing path did not advance index/offset before continuing.
>
> This patch series fixes the above issues in f2fs_read_data_large_folio()
> introduced by commit 05e65c14ea59 ("f2fs: support large folio for
> immutable non-compressed case").
>
> Testing
> -------
>
> All patches pass scripts/checkpatch.pl.
>
> I tested the basic large-folio immutable read case described in the
> original thread (create a large file, set immutable, drop caches to
> reload the inode, then read it), and additionally verified:
>
> - sparse file
> - heavily fragmented file
>
> In all cases, reads completed without hangs and data was verified against
> the expected contents.
>
> Nanzhe Zhao (5):
> f2fs: Zero f2fs_folio_state on allocation
> f2fs: Accounting large folio subpages before bio submission
> f2fs: add f2fs_block_needs_zeroing() to handle hole blocks
> f2fs: add 'folio_in_bio' to handle readahead folios with no BIO
> submission
> f2fs: advance index and offset after zeroing in large folio read
>
> fs/f2fs/data.c | 54 +++++++++++++++++++++++++++++++++-----------------
> 1 file changed, 36 insertions(+), 18 deletions(-)
>
>
> base-commit: 48b5439e04ddf4508ecaf588219012dc81d947c0
> --
> 2.34.1
* Re: [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation
2026-01-06 3:38 ` Barry Song
@ 2026-01-07 3:44 ` Nanzhe Zhao
2026-01-08 22:35 ` Barry Song
0 siblings, 1 reply; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-07 3:44 UTC (permalink / raw)
To: Barry Song; +Cc: Jaegeuk Kim, Chao Yu, linux-f2fs-devel, linux-kernel
Hi Barry,
>At 2026-01-06 11:38:49, "Barry Song" <21cnbao@gmail.com> wrote:
>>On Tue, Jan 6, 2026 at 12:12 AM Nanzhe Zhao <nzzhao@126.com> wrote:
>>>
>>> f2fs_folio_state is attached to folio->private and is expected to start
>>> with read_pages_pending == 0. However, the structure was allocated from
>>> ffs_entry_slab without being fully initialized, which can leave
>>> read_pages_pending with stale values.
>>>
>>> Allocate the object with __GFP_ZERO so all fields are reliably zeroed at
>>> creation time.
>>>
>>> Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
>>
>>
>>We already have GFP_F2FS_ZERO, but it includes GFP_IO. Should we
>>introduce another variant, such as GFP_F2FS_NOIO_ZERO (or similar)?
>>Overall, LGTM.
>>
I still don't fully understand the exact semantics of GFP_NOIO vs GFP_NOFS.
I did a bit of digging and, in the current buffered read / readahead context, it seems
there may be no meaningful difference for the purpose of avoiding direct-reclaim
recursion deadlocks.
My current (possibly incomplete) understanding is that in may_enter_fs(), GFP_NOIO
only changes behavior for swapcache folios, rather than file-backed folios that are
currently in the read IO path, and the swap writeback path won't recurse back into
f2fs's own writeback function anyway. (On phones there usually isn't a swap partition;
for zram I guess swap writeback is effectively writing to RAM via the zram block
device? Sorry for not being very familiar with the details there.)
I noticed iomap's ifs_alloc uses GFP_NOFS | __GFP_NOFAIL. So if GFP_NOFS is acceptable
here, we could simply use GFP_F2FS_ZERO and avoid introducing a new GFP_F2FS_NOIO_ZERO
variant?
Just curious. I would vote for GFP_NOIO from a semantic-clarity perspective here.
Thanks,
Nanzhe
* Re: [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files
2026-01-07 3:08 ` [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Jaegeuk Kim
@ 2026-01-08 2:17 ` Nanzhe Zhao
2026-01-08 9:23 ` Chao Yu
0 siblings, 1 reply; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-08 2:17 UTC (permalink / raw)
To: Jaegeuk Kim; +Cc: Chao Yu, Barry Song, linux-f2fs-devel, linux-kernel
Hi Kim,
At 2026-01-07 11:08:50, "Jaegeuk Kim" <jaegeuk@kernel.org> wrote:
>>Hi Nanzhe,
>>
>>fyi - I applied the beginning two patches first.
>>
>>Thanks,
>>
Thanks for applying my small changes.
By the way, I’d like to discuss one more thing about testing for large folios.
It seems the current xfstests coverage may not be sufficient. Would it be
welcome if I contributed some new test cases upstream?
Also, I think large-folio functionality might also need black-box testing such
as fault-injection, where we force certain paths to return errors and verify
behavior under failures. I’d appreciate your thoughts.
Thanks,
Nanzhe
* Re: [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files
2026-01-08 2:17 ` Nanzhe Zhao
@ 2026-01-08 9:23 ` Chao Yu
0 siblings, 0 replies; 21+ messages in thread
From: Chao Yu @ 2026-01-08 9:23 UTC (permalink / raw)
To: Nanzhe Zhao, Jaegeuk Kim; +Cc: chao, Barry Song, linux-f2fs-devel, linux-kernel
On 1/8/2026 10:17 AM, Nanzhe Zhao wrote:
> Hi Kim,
> At 2026-01-07 11:08:50, "Jaegeuk Kim" <jaegeuk@kernel.org> wrote:
>>> Hi Nanzhe,
>>>
>>> fyi - I applied the beginning two patches first.
>>>
>>> Thanks,
>>>
>
> Thanks for applying my small changes.
>
> By the way, I’d like to discuss one more thing about testing for large folios.
> It seems the current xfstests coverage may not be sufficient. Would it be
> welcome for me to contribute some new test cases upstream?
Great, please go ahead; new testcases can be added into the tests/f2fs/ directory.
>
> Also, I think large-folio functionality might also need black-box testing such
> as fault-injection, where we force certain paths to return errors and verify
> behavior under failures. I’d appreciate your thoughts.
It's fine to introduce a new helper, f2fs_fsverity_verify_page(), and inject errors there.
Thanks,
>
> Thanks,
> Nanzhe
* Re: [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation
2026-01-07 3:44 ` Nanzhe Zhao
@ 2026-01-08 22:35 ` Barry Song
0 siblings, 0 replies; 21+ messages in thread
From: Barry Song @ 2026-01-08 22:35 UTC (permalink / raw)
To: Nanzhe Zhao; +Cc: Jaegeuk Kim, Chao Yu, linux-f2fs-devel, linux-kernel
On Wed, Jan 7, 2026 at 4:45 PM Nanzhe Zhao <nzzhao@126.com> wrote:
>
> Hi Barry:
>
> >At 2026-01-06 11:38:49, "Barry Song" <21cnbao@gmail.com> wrote:
> >>On Tue, Jan 6, 2026 at 12:12 AM Nanzhe Zhao <nzzhao@126.com> wrote:
> >>>
> >>> f2fs_folio_state is attached to folio->private and is expected to start
> >>> with read_pages_pending == 0. However, the structure was allocated from
> >>> ffs_entry_slab without being fully initialized, which can leave
> >>> read_pages_pending with stale values.
> >>>
> >>> Allocate the object with __GFP_ZERO so all fields are reliably zeroed at
> >>> creation time.
> >>>
> >>> Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
> >>
> >>
> >>We already have GFP_F2FS_ZERO, but it includes GFP_IO. Should we
> >>introduce another variant, such as GFP_F2FS_NOIO_ZERO (or similar)?
> >>Overall, LGTM.
> >>
>
> I still don't fully understand the exact semantics of GFP_NOIO vs GFP_NOFS.
> I did a bit of digging and, in the current buffered read / readahead context, it seems
> there may be no meaningful difference for the purpose of avoiding direct-reclaim
> recursion deadlocks.
With GFP_NOIO, we will not swap out pages, including anonymous folios.
if (folio_test_anon(folio) && folio_test_swapbacked(folio)) {
if (!folio_test_swapcache(folio)) {
if (!(sc->gfp_mask & __GFP_IO))
goto keep_locked;
When using GFP_NOFS, reclaim can still swap out an anon folio,
provided its swap entry is not filesystem-backed
(see folio_swap_flags(folio)).
static bool may_enter_fs(struct folio *folio, gfp_t gfp_mask)
{
if (gfp_mask & __GFP_FS)
return true;
if (!folio_test_swapcache(folio) || !(gfp_mask & __GFP_IO))
return false;
/*
* We can "enter_fs" for swap-cache with only __GFP_IO
* providing this isn't SWP_FS_OPS.
* ->flags can be updated non-atomicially (scan_swap_map_slots),
* but that will never affect SWP_FS_OPS, so the data_race
* is safe.
*/
return !data_race(folio_swap_flags(folio) & SWP_FS_OPS);
}
Note that swap may be backed either by a filesystem swapfile or
directly by a block device.
In short, GFP_NOIO is stricter than GFP_NOFS: it disallows any I/O,
even if the I/O does not involve a filesystem, whereas GFP_NOFS
still permits I/O that is not filesystem-related.
>
> My current (possibly incomplete) understanding is that in may_enter_fs(), GFP_NOIO
> only changes behavior for swapcache folios, rather than file-backed folios that are
> currently in the read IO path, and the swap writeback path won't recurse back into
> f2fs's own writeback function anyway. (On phones there usually isn't a swap partition;
> for zram I guess swap writeback is effectively writing to RAM via the zram block device?
> Sorry for not being very familiar with the details there.)
This can be the case for a swapfile on F2FS. Note that the check is
performed per folio. On a system with both zRAM and a filesystem-
backed swapfile, some folios may be swapped out while others may
not, depending on where their swap slots are allocated.
>
> I noticed iomap's ifs_alloc uses GFP_NOFS | __GFP_NOFAIL. So if GFP_NOFS is acceptable here,
> we could simply use GFP_F2FS_ZERO and avoid introducing a new GFP_F2FS_NOIO_ZERO variant?
>
> Just curious. I would vote for GFP_NOIO from a semantic-clarity perspective here.
In general, GFP_NOIO is used when handling bios or requests.
Thanks
Barry
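[Editor's illustration] The strictness relation Barry describes can be sketched with stand-in flag bits. The X_GFP_* values below are made up for illustration; the real __GFP_* constants live in include/linux/gfp_types.h. The structural point: GFP_NOFS drops only __GFP_FS, while GFP_NOIO drops both __GFP_IO and __GFP_FS, so GFP_NOIO is the stricter reclaim context.

```c
#include <assert.h>

#define X_GFP_RECLAIM	0x1u	/* stand-in for __GFP_RECLAIM */
#define X_GFP_IO	0x2u	/* stand-in for __GFP_IO */
#define X_GFP_FS	0x4u	/* stand-in for __GFP_FS */

/* Mirrors how the real GFP_* masks compose from the low-level bits. */
#define X_GFP_KERNEL	(X_GFP_RECLAIM | X_GFP_IO | X_GFP_FS)
#define X_GFP_NOFS	(X_GFP_RECLAIM | X_GFP_IO)
#define X_GFP_NOIO	(X_GFP_RECLAIM)

/* The kind of checks quoted above from shrink_folio_list()/may_enter_fs(). */
static int may_do_io(unsigned int gfp)
{
	return (gfp & X_GFP_IO) != 0;
}

static int may_enter_fs_mask(unsigned int gfp)
{
	return (gfp & X_GFP_FS) != 0;
}
```

Under this model, reclaim in a GFP_NOFS context may still issue non-filesystem I/O (e.g. block-device-backed swap), while a GFP_NOIO context forbids both, matching the summary above.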
end of thread, other threads:[~2026-01-08 22:35 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-01-05 15:30 [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Nanzhe Zhao
2026-01-05 15:30 ` [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation Nanzhe Zhao
2026-01-06 3:38 ` Barry Song
2026-01-07 3:44 ` Nanzhe Zhao
2026-01-08 22:35 ` Barry Song
2026-01-06 9:16 ` Chao Yu
2026-01-05 15:30 ` [PATCH v1 2/5] f2fs: Accounting large folio subpages before bio submission Nanzhe Zhao
2026-01-06 9:16 ` Chao Yu
2026-01-05 15:30 ` [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks Nanzhe Zhao
2026-01-06 9:19 ` Chao Yu
2026-01-06 11:25 ` Nanzhe Zhao
2026-01-06 9:30 ` kernel test robot
2026-01-05 15:31 ` [PATCH v1 4/5] f2fs: add 'folio_in_bio' to handle readahead folios with no BIO submission Nanzhe Zhao
2026-01-06 9:31 ` Chao Yu
2026-01-07 0:33 ` Nanzhe Zhao
2026-01-07 1:16 ` Chao Yu
2026-01-05 15:31 ` [PATCH v1 5/5] f2fs: advance index and offset after zeroing in large folio read Nanzhe Zhao
2026-01-06 9:35 ` Chao Yu
2026-01-07 3:08 ` [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Jaegeuk Kim
2026-01-08 2:17 ` Nanzhe Zhao
2026-01-08 9:23 ` Chao Yu