* [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files
@ 2026-01-05 15:30 Nanzhe Zhao
2026-01-05 15:30 ` [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation Nanzhe Zhao
` (5 more replies)
0 siblings, 6 replies; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-05 15:30 UTC (permalink / raw)
To: Kim Jaegeuk; +Cc: Chao Yu, linux-f2fs-devel, linux-kernel, Nanzhe Zhao
When reading immutable, non-compressed files with large folios enabled,
I was able to reproduce readahead hangs while reading sparse files with
holes and heavily fragmented files. The problems were caused by a few
corner cases in the large-folio read loop:
- f2fs_folio_state could be observed with an uninitialized
read_pages_pending field.
- subpage accounting could become inconsistent with BIO completion,
leading to folios being prematurely unlocked/marked uptodate.
- NULL_ADDR/NEW_ADDR blocks can carry F2FS_MAP_MAPPED, causing the
large-folio read path to treat hole blocks as mapped and to account
them in read_pages_pending.
- in readahead, a folio that never had any subpage queued to a BIO
would not be seen by f2fs_finish_read_bio(), leaving it locked.
- the zeroing path did not advance index/offset before continuing.
This patch series fixes the above issues in f2fs_read_data_large_folio()
introduced by commit 05e65c14ea59 ("f2fs: support large folio for
immutable non-compressed case").
Testing
-------
All patches pass scripts/checkpatch.pl.
I tested the basic large-folio immutable read case described in the
original thread (create a large file, set immutable, drop caches to
reload the inode, then read it), and additionally verified:
- sparse file
- heavily fragmented file
In all cases, reads completed without hangs and data was verified against
the expected contents.
Nanzhe Zhao (5):
f2fs: Zero f2fs_folio_state on allocation
f2fs: Accounting large folio subpages before bio submission
f2fs: add f2fs_block_needs_zeroing() to handle hole blocks
f2fs: add 'folio_in_bio' to handle readahead folios with no BIO
submission
f2fs: advance index and offset after zeroing in large folio read
fs/f2fs/data.c | 54 +++++++++++++++++++++++++++++++++-----------------
1 file changed, 36 insertions(+), 18 deletions(-)
base-commit: 48b5439e04ddf4508ecaf588219012dc81d947c0
--
2.34.1
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation
2026-01-05 15:30 [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Nanzhe Zhao
@ 2026-01-05 15:30 ` Nanzhe Zhao
2026-01-06 3:38 ` Barry Song
2026-01-06 9:16 ` Chao Yu
2026-01-05 15:30 ` [PATCH v1 2/5] f2fs: Accounting large folio subpages before bio submission Nanzhe Zhao
` (4 subsequent siblings)
5 siblings, 2 replies; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-05 15:30 UTC (permalink / raw)
To: Kim Jaegeuk; +Cc: Chao Yu, linux-f2fs-devel, linux-kernel, Nanzhe Zhao
f2fs_folio_state is attached to folio->private and is expected to start
with read_pages_pending == 0. However, the structure was allocated from
ffs_entry_slab without being fully initialized, which can leave
read_pages_pending with stale values.
Allocate the object with __GFP_ZERO so all fields are reliably zeroed at
creation time.
Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
---
fs/f2fs/data.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 471e52c6c1e0..ab091b294fa7 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2389,7 +2389,7 @@ static struct f2fs_folio_state *ffs_find_or_alloc(struct folio *folio)
if (ffs)
return ffs;
- ffs = f2fs_kmem_cache_alloc(ffs_entry_slab, GFP_NOIO, true, NULL);
+ ffs = f2fs_kmem_cache_alloc(ffs_entry_slab, GFP_NOIO | __GFP_ZERO, true, NULL);
spin_lock_init(&ffs->state_lock);
folio_attach_private(folio, ffs);
--
2.34.1
* [PATCH v1 2/5] f2fs: Accounting large folio subpages before bio submission
2026-01-05 15:30 [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Nanzhe Zhao
2026-01-05 15:30 ` [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation Nanzhe Zhao
@ 2026-01-05 15:30 ` Nanzhe Zhao
2026-01-06 9:16 ` Chao Yu
2026-01-05 15:30 ` [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks Nanzhe Zhao
` (3 subsequent siblings)
5 siblings, 1 reply; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-05 15:30 UTC (permalink / raw)
To: Kim Jaegeuk; +Cc: Chao Yu, linux-f2fs-devel, linux-kernel, Nanzhe Zhao
In f2fs_read_data_large_folio(), read_pages_pending is incremented only
after the subpage has been added to the BIO. With a heavily fragmented
file, each new subpage can force submission of the previous BIO.
If the BIO completes quickly, f2fs_finish_read_bio() may decrement
read_pages_pending to zero and call folio_end_read() while the read loop
is still processing other subpages of the same large folio.
Fix the ordering by incrementing read_pages_pending before any possible
BIO submission for the current subpage, matching the iomap ordering and
preventing premature folio_end_read().
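The ordering argument can be shown with a toy userspace model (nothing here is kernel API: pending stands in for ffs->read_pages_pending and complete_subpage() simulates f2fs_finish_read_bio() finishing one subpage):

```c
#include <assert.h>

static int pending;		/* models ffs->read_pages_pending */
static int folio_ended;		/* models folio_end_read() having run */

/* One subpage read completes; the last completion ends the folio. */
static void complete_subpage(void)
{
	if (--pending == 0)
		folio_ended = 1;
}

/* Increment-first ordering from the patch: the current subpage is
 * accounted BEFORE the previous bio can complete, so the counter never
 * drops to zero while later subpages of the same folio are still being
 * queued. */
static void read_loop_fixed(int nr_subpages)
{
	assert(nr_subpages > 0);
	for (int i = 0; i < nr_subpages; i++) {
		pending++;		/* before any possible submission */
		if (i > 0)
			complete_subpage(); /* previous bio finishes fast */
		assert(!folio_ended);	/* never ended mid-loop */
	}
	while (pending)			/* drain remaining completions */
		complete_subpage();
}
```

With the old ordering (increment after bio_add_folio()), a fast completion between subpages could drive the counter to zero and end the folio prematurely; the assert inside the loop is exactly the invariant the patch restores.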
Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
---
fs/f2fs/data.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index ab091b294fa7..4bef04560924 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2486,6 +2486,18 @@ static int f2fs_read_data_large_folio(struct inode *inode,
continue;
}
+ /* We must increment read_pages_pending before possible BIOs submitting
+ * to prevent from premature folio_end_read() call on folio
+ */
+ if (folio_test_large(folio)) {
+ ffs = ffs_find_or_alloc(folio);
+
+ /* set the bitmap to wait */
+ spin_lock_irq(&ffs->state_lock);
+ ffs->read_pages_pending++;
+ spin_unlock_irq(&ffs->state_lock);
+ }
+
/*
* This page will go to BIO. Do we need to send this
* BIO off first?
@@ -2513,15 +2525,6 @@ static int f2fs_read_data_large_folio(struct inode *inode,
offset << PAGE_SHIFT))
goto submit_and_realloc;
- if (folio_test_large(folio)) {
- ffs = ffs_find_or_alloc(folio);
-
- /* set the bitmap to wait */
- spin_lock_irq(&ffs->state_lock);
- ffs->read_pages_pending++;
- spin_unlock_irq(&ffs->state_lock);
- }
-
inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
f2fs_update_iostat(F2FS_I_SB(inode), NULL, FS_DATA_READ_IO,
F2FS_BLKSIZE);
--
2.34.1
* [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks
2026-01-05 15:30 [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Nanzhe Zhao
2026-01-05 15:30 ` [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation Nanzhe Zhao
2026-01-05 15:30 ` [PATCH v1 2/5] f2fs: Accounting large folio subpages before bio submission Nanzhe Zhao
@ 2026-01-05 15:30 ` Nanzhe Zhao
2026-01-06 9:19 ` Chao Yu
2026-01-06 9:30 ` kernel test robot
2026-01-05 15:31 ` [PATCH v1 4/5] f2fs: add 'folio_in_bio' to handle readahead folios with no BIO submission Nanzhe Zhao
` (2 subsequent siblings)
5 siblings, 2 replies; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-05 15:30 UTC (permalink / raw)
To: Kim Jaegeuk; +Cc: Chao Yu, linux-f2fs-devel, linux-kernel, Nanzhe Zhao
f2fs_read_data_large_folio() relies on f2fs_map_blocks() to decide whether
a subpage should be zero-filled or queued to a read bio.
However, f2fs_map_blocks() can set F2FS_MAP_MAPPED for NULL_ADDR and
NEW_ADDR in the non-DIO, no-create path. The large folio read code then
treats such hole blocks as mapped blocks, and may account them
in read_pages_pending and attempt to build bios for them, which can
leave tasks stuck in readahead for heavily fragmented files.
Add a helper, f2fs_block_needs_zeroing(), which detects NULL_ADDR and
NEW_ADDR in struct f2fs_map_blocks. Use it to prioritize the zeroing
path by checking f2fs_block_needs_zeroing() before
(map.m_flags & F2FS_MAP_MAPPED) under the got_it: label.
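A self-contained sketch of the helper (the constants and struct below are illustrative stand-ins; the real NULL_ADDR/NEW_ADDR definitions and the full struct f2fs_map_blocks live in fs/f2fs/f2fs.h):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in values: in f2fs a hole is NULL_ADDR and an allocated-but-
 * unwritten block is NEW_ADDR; neither has on-disk data to read. */
#define NULL_ADDR 0u
#define NEW_ADDR  0xFFFFFFFFu	/* placeholder for the real sentinel */

struct f2fs_map_blocks {
	unsigned int m_pblk;	/* physical block from f2fs_map_blocks() */
};

/* The helper from the patch: such blocks must be zero-filled in the
 * folio instead of being accounted in read_pages_pending and queued to
 * a read bio. */
static inline bool f2fs_block_needs_zeroing(const struct f2fs_map_blocks *map)
{
	return map->m_pblk == NULL_ADDR || map->m_pblk == NEW_ADDR;
}
```

Checking this helper before F2FS_MAP_MAPPED means a mapping that carries the mapped flag but points at a hole sentinel still takes the zeroing path.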
Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
---
fs/f2fs/data.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 4bef04560924..66ab7a43a56f 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2413,6 +2413,11 @@ static void ffs_detach_free(struct folio *folio)
kmem_cache_free(ffs_entry_slab, ffs);
}
+static inline bool f2fs_block_needs_zeroing(const struct f2fs_map_blocks *map)
+{
+ return map->m_pblk == NULL_ADDR || map->m_pblk == NEW_ADDR;
+}
+
static int f2fs_read_data_large_folio(struct inode *inode,
struct readahead_control *rac, struct folio *folio)
{
@@ -2468,14 +2473,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
if (ret)
goto err_out;
got_it:
- if ((map.m_flags & F2FS_MAP_MAPPED)) {
- block_nr = map.m_pblk + index - map.m_lblk;
- if (!f2fs_is_valid_blkaddr(F2FS_I_SB(inode), block_nr,
- DATA_GENERIC_ENHANCE_READ)) {
- ret = -EFSCORRUPTED;
- goto err_out;
- }
- } else {
+ if ((f2fs_block_needs_zeroing(&map))) {
folio_zero_range(folio, offset << PAGE_SHIFT, PAGE_SIZE);
if (f2fs_need_verity(inode, index) &&
!fsverity_verify_page(folio_file_page(folio,
@@ -2484,6 +2482,13 @@ static int f2fs_read_data_large_folio(struct inode *inode,
goto err_out;
}
continue;
+ } else if((map.m_flags & F2FS_MAP_MAPPED)) {
+ block_nr = map.m_pblk + index - map.m_lblk;
+ if (!f2fs_is_valid_blkaddr(F2FS_I_SB(inode), block_nr,
+ DATA_GENERIC_ENHANCE_READ)) {
+ ret = -EFSCORRUPTED;
+ goto err_out;
+ }
}
/* We must increment read_pages_pending before possible BIOs submitting
--
2.34.1
* [PATCH v1 4/5] f2fs: add 'folio_in_bio' to handle readahead folios with no BIO submission
2026-01-05 15:30 [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Nanzhe Zhao
` (2 preceding siblings ...)
2026-01-05 15:30 ` [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks Nanzhe Zhao
@ 2026-01-05 15:31 ` Nanzhe Zhao
2026-01-06 9:31 ` Chao Yu
2026-01-05 15:31 ` [PATCH v1 5/5] f2fs: advance index and offset after zeroing in large folio read Nanzhe Zhao
2026-01-07 3:08 ` [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Jaegeuk Kim
5 siblings, 1 reply; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-05 15:31 UTC (permalink / raw)
To: Kim Jaegeuk; +Cc: Chao Yu, linux-f2fs-devel, linux-kernel, Nanzhe Zhao
f2fs_read_data_large_folio() can build a single read BIO across multiple
folios during readahead. If a folio ends up having none of its subpages
added to the BIO (e.g. all subpages are zeroed / treated as holes), it
will never be seen by f2fs_finish_read_bio(), so folio_end_read() is
never called. This leaves the folio locked and not marked uptodate.
Track whether the current folio has been added to a BIO via a local
'folio_in_bio' bool flag, and when iterating readahead folios, explicitly
mark the folio uptodate (on success) and unlock it when nothing was added.
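The fallback can be modeled in userspace (toy folio struct and a hypothetical helper name; only the control flow mirrors the patch, none of this is kernel API):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy folio state for the readahead fallback. */
struct folio_model {
	bool locked;
	bool uptodate;
};

/* Models the end of the per-folio loop: if no subpage of this folio was
 * added to a bio, bio completion will never call folio_end_read() for
 * it, so finish it here instead: mark uptodate on success, and always
 * unlock so the reader does not hang on folio_lock(). */
static void finish_folio_without_bio(struct folio_model *folio,
				     bool folio_in_bio, int ret)
{
	if (!folio_in_bio) {
		if (!ret)
			folio->uptodate = true;
		folio->locked = false;
	}
}
```

When folio_in_bio is true the folio is left alone, because the bio completion path owns its unlock.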
Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
---
fs/f2fs/data.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 66ab7a43a56f..ac569a396914 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2430,6 +2430,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
unsigned nrpages;
struct f2fs_folio_state *ffs;
int ret = 0;
+ bool folio_in_bio = false;
if (!IS_IMMUTABLE(inode))
return -EOPNOTSUPP;
@@ -2445,6 +2446,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
if (!folio)
goto out;
+ folio_in_bio = false
index = folio->index;
offset = 0;
ffs = NULL;
@@ -2530,6 +2532,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
offset << PAGE_SHIFT))
goto submit_and_realloc;
+ folio_in_bio = true;
inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
f2fs_update_iostat(F2FS_I_SB(inode), NULL, FS_DATA_READ_IO,
F2FS_BLKSIZE);
@@ -2539,6 +2542,11 @@ static int f2fs_read_data_large_folio(struct inode *inode,
}
trace_f2fs_read_folio(folio, DATA);
if (rac) {
+ if (!folio_in_bio) {
+ if (!ret)
+ folio_mark_uptodate(folio);
+ folio_unlock(folio);
+ }
folio = readahead_folio(rac);
goto next_folio;
}
--
2.34.1
* [PATCH v1 5/5] f2fs: advance index and offset after zeroing in large folio read
2026-01-05 15:30 [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Nanzhe Zhao
` (3 preceding siblings ...)
2026-01-05 15:31 ` [PATCH v1 4/5] f2fs: add 'folio_in_bio' to handle readahead folios with no BIO submission Nanzhe Zhao
@ 2026-01-05 15:31 ` Nanzhe Zhao
2026-01-06 9:35 ` Chao Yu
2026-01-07 3:08 ` [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Jaegeuk Kim
5 siblings, 1 reply; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-05 15:31 UTC (permalink / raw)
To: Kim Jaegeuk; +Cc: Chao Yu, linux-f2fs-devel, linux-kernel, Nanzhe Zhao
In f2fs_read_data_large_folio(), the block zeroing path calls
folio_zero_range() and then continues the loop. However, it fails to
advance index and offset before continuing.
This can cause the loop to repeatedly process the same subpage of the
folio, leading to stalls/hangs and incorrect progress when reading large
folios with holes/zeroed blocks.
Fix it by incrementing index and offset in the zeroing path before
continuing.
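A toy version of the subpage loop shows why the advance matters (hypothetical helper; without the offset increment before 'continue', the loop keeps re-zeroing the same hole subpage while nrpages counts down and later subpages are never visited):

```c
#include <assert.h>

/* is_hole[offset] marks which subpages are holes. Returns how many
 * distinct subpages were zero-filled. */
static int zero_subpages(const int *is_hole, unsigned nrpages)
{
	unsigned offset = 0;	/* subpage offset within the large folio */
	int zeroed = 0;

	for (; nrpages; nrpages--) {
		if (is_hole[offset]) {
			zeroed++;	/* models folio_zero_range() */
			offset++;	/* the fix: advance before continue */
			continue;
		}
		offset++;		/* mapped path advances at loop tail */
	}
	return zeroed;
}
```

With the increment in place each iteration makes forward progress, matching the mapped path that already advances index/offset at the bottom of the loop.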
Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
---
fs/f2fs/data.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index ac569a396914..07c222bcc5e0 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2446,7 +2446,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
if (!folio)
goto out;
- folio_in_bio = false
+ folio_in_bio = false;
index = folio->index;
offset = 0;
ffs = NULL;
@@ -2483,6 +2483,8 @@ static int f2fs_read_data_large_folio(struct inode *inode,
ret = -EIO;
goto err_out;
}
+ index++;
+ offset++;
continue;
} else if((map.m_flags & F2FS_MAP_MAPPED)) {
block_nr = map.m_pblk + index - map.m_lblk;
--
2.34.1
* Re: [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation
2026-01-05 15:30 ` [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation Nanzhe Zhao
@ 2026-01-06 3:38 ` Barry Song
2026-01-07 3:44 ` Nanzhe Zhao
2026-01-06 9:16 ` Chao Yu
1 sibling, 1 reply; 21+ messages in thread
From: Barry Song @ 2026-01-06 3:38 UTC (permalink / raw)
To: Nanzhe Zhao; +Cc: Kim Jaegeuk, Chao Yu, linux-f2fs-devel, linux-kernel
On Tue, Jan 6, 2026 at 12:12 AM Nanzhe Zhao <nzzhao@126.com> wrote:
>
> f2fs_folio_state is attached to folio->private and is expected to start
> with read_pages_pending == 0. However, the structure was allocated from
> ffs_entry_slab without being fully initialized, which can leave
> read_pages_pending with stale values.
>
> Allocate the object with __GFP_ZERO so all fields are reliably zeroed at
> creation time.
>
> Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
We already have GFP_F2FS_ZERO, but it includes GFP_IO. Should we
introduce another variant, such as GFP_F2FS_NOIO_ZERO (or similar)?
Overall, LGTM.
Reviewed-by: Barry Song <baohua@kernel.org>
> ---
> fs/f2fs/data.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 471e52c6c1e0..ab091b294fa7 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -2389,7 +2389,7 @@ static struct f2fs_folio_state *ffs_find_or_alloc(struct folio *folio)
> if (ffs)
> return ffs;
>
> - ffs = f2fs_kmem_cache_alloc(ffs_entry_slab, GFP_NOIO, true, NULL);
> + ffs = f2fs_kmem_cache_alloc(ffs_entry_slab, GFP_NOIO | __GFP_ZERO, true, NULL);
>
> spin_lock_init(&ffs->state_lock);
> folio_attach_private(folio, ffs);
> --
> 2.34.1
Thanks
Barry
* Re: [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation
2026-01-05 15:30 ` [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation Nanzhe Zhao
2026-01-06 3:38 ` Barry Song
@ 2026-01-06 9:16 ` Chao Yu
1 sibling, 0 replies; 21+ messages in thread
From: Chao Yu @ 2026-01-06 9:16 UTC (permalink / raw)
To: Nanzhe Zhao, Kim Jaegeuk; +Cc: chao, linux-f2fs-devel, linux-kernel
On 1/5/2026 11:30 PM, Nanzhe Zhao wrote:
> f2fs_folio_state is attached to folio->private and is expected to start
> with read_pages_pending == 0. However, the structure was allocated from
> ffs_entry_slab without being fully initialized, which can leave
> read_pages_pending with stale values.
>
> Allocate the object with __GFP_ZERO so all fields are reliably zeroed at
> creation time.
>
> Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Thanks,
* Re: [PATCH v1 2/5] f2fs: Accounting large folio subpages before bio submission
2026-01-05 15:30 ` [PATCH v1 2/5] f2fs: Accounting large folio subpages before bio submission Nanzhe Zhao
@ 2026-01-06 9:16 ` Chao Yu
0 siblings, 0 replies; 21+ messages in thread
From: Chao Yu @ 2026-01-06 9:16 UTC (permalink / raw)
To: Nanzhe Zhao, Kim Jaegeuk; +Cc: chao, linux-f2fs-devel, linux-kernel
On 1/5/2026 11:30 PM, Nanzhe Zhao wrote:
> In f2fs_read_data_large_folio(), read_pages_pending is incremented only
> after the subpage has been added to the BIO. With a heavily fragmented
> file, each new subpage can force submission of the previous BIO.
>
> If the BIO completes quickly, f2fs_finish_read_bio() may decrement
> read_pages_pending to zero and call folio_end_read() while the read loop
> is still processing other subpages of the same large folio.
>
> Fix the ordering by incrementing read_pages_pending before any possible
> BIO submission for the current subpage, matching the iomap ordering and
> preventing premature folio_end_read().
>
> Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Thanks,
* Re: [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks
2026-01-05 15:30 ` [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks Nanzhe Zhao
@ 2026-01-06 9:19 ` Chao Yu
2026-01-06 11:25 ` Nanzhe Zhao
2026-01-06 9:30 ` kernel test robot
1 sibling, 1 reply; 21+ messages in thread
From: Chao Yu @ 2026-01-06 9:19 UTC (permalink / raw)
To: Nanzhe Zhao, Kim Jaegeuk; +Cc: chao, linux-f2fs-devel, linux-kernel
On 1/5/2026 11:30 PM, Nanzhe Zhao wrote:
> f2fs_read_data_large_folio() relies on f2fs_map_blocks() to decide whether
> a subpage should be zero-filled or queued to a read bio.
>
> However, f2fs_map_blocks() can set F2FS_MAP_MAPPED for NULL_ADDR and
> NEW_ADDR in the non-DIO, no-create path. The large folio read code then
Nanzhe,
IIUC, f2fs_map_blocks(inode, &map, F2FS_GET_BLOCK_DEFAULT) won't map hole
space, or am I missing something?
Thanks,
> treats such hole blocks as mapped blocks, and may account them
> in read_pages_pending and attempt to build bios for them, which can
> leave tasks stuck in readahead for heavily fragmented files.
>
> Add a helper, f2fs_block_needs_zeroing(), which detects NULL_ADDR and
> NEW_ADDR from struct f2fs_map_blocks. Use it to prioritize the zeroing
> path by checking f2fs_block_needs_zeroing() before
> (map.m_flags & F2FS_MAP_MAPPED) under got_it: label.
>
> Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
> ---
> fs/f2fs/data.c | 21 +++++++++++++--------
> 1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 4bef04560924..66ab7a43a56f 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -2413,6 +2413,11 @@ static void ffs_detach_free(struct folio *folio)
> kmem_cache_free(ffs_entry_slab, ffs);
> }
>
> +static inline bool f2fs_block_needs_zeroing(const struct f2fs_map_blocks *map)
> +{
> + return map->m_pblk == NULL_ADDR || map->m_pblk == NEW_ADDR;
> +}
> +
> static int f2fs_read_data_large_folio(struct inode *inode,
> struct readahead_control *rac, struct folio *folio)
> {
> @@ -2468,14 +2473,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> if (ret)
> goto err_out;
> got_it:
> - if ((map.m_flags & F2FS_MAP_MAPPED)) {
> - block_nr = map.m_pblk + index - map.m_lblk;
> - if (!f2fs_is_valid_blkaddr(F2FS_I_SB(inode), block_nr,
> - DATA_GENERIC_ENHANCE_READ)) {
> - ret = -EFSCORRUPTED;
> - goto err_out;
> - }
> - } else {
> + if ((f2fs_block_needs_zeroing(&map))) {
> folio_zero_range(folio, offset << PAGE_SHIFT, PAGE_SIZE);
> if (f2fs_need_verity(inode, index) &&
> !fsverity_verify_page(folio_file_page(folio,
> @@ -2484,6 +2482,13 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> goto err_out;
> }
> continue;
> + } else if((map.m_flags & F2FS_MAP_MAPPED)) {
> + block_nr = map.m_pblk + index - map.m_lblk;
> + if (!f2fs_is_valid_blkaddr(F2FS_I_SB(inode), block_nr,
> + DATA_GENERIC_ENHANCE_READ)) {
> + ret = -EFSCORRUPTED;
> + goto err_out;
> + }
> }
>
> /* We must increment read_pages_pending before possible BIOs submitting
> --
> 2.34.1
>
* Re: [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks
2026-01-05 15:30 ` [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks Nanzhe Zhao
2026-01-06 9:19 ` Chao Yu
@ 2026-01-06 9:30 ` kernel test robot
1 sibling, 0 replies; 21+ messages in thread
From: kernel test robot @ 2026-01-06 9:30 UTC (permalink / raw)
To: Nanzhe Zhao, Kim Jaegeuk
Cc: llvm, oe-kbuild-all, Chao Yu, linux-f2fs-devel, linux-kernel,
Nanzhe Zhao
Hi Nanzhe,
kernel test robot noticed the following build warnings:
[auto build test WARNING on 48b5439e04ddf4508ecaf588219012dc81d947c0]
url: https://github.com/intel-lab-lkp/linux/commits/Nanzhe-Zhao/f2fs-Zero-f2fs_folio_state-on-allocation/20260106-005006
base: 48b5439e04ddf4508ecaf588219012dc81d947c0
patch link: https://lore.kernel.org/r/20260105153101.152892-4-nzzhao%40126.com
patch subject: [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks
config: x86_64-kexec (https://download.01.org/0day-ci/archive/20260106/202601061013.MBnRTOrG-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260106/202601061013.MBnRTOrG-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202601061013.MBnRTOrG-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> fs/f2fs/data.c:2485:13: warning: variable 'block_nr' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
2485 | } else if((map.m_flags & F2FS_MAP_MAPPED)) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fs/f2fs/data.c:2527:39: note: uninitialized use occurs here
2527 | f2fs_wait_on_block_writeback(inode, block_nr);
| ^~~~~~~~
fs/f2fs/data.c:2485:10: note: remove the 'if' if its condition is always true
2485 | } else if((map.m_flags & F2FS_MAP_MAPPED)) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fs/f2fs/data.c:2454:20: note: initialize the variable 'block_nr' to silence this warning
2454 | sector_t block_nr;
| ^
| = 0
1 warning generated.
vim +2485 fs/f2fs/data.c
2420
2421 static int f2fs_read_data_large_folio(struct inode *inode,
2422 struct readahead_control *rac, struct folio *folio)
2423 {
2424 struct bio *bio = NULL;
2425 sector_t last_block_in_bio = 0;
2426 struct f2fs_map_blocks map = {0, };
2427 pgoff_t index, offset;
2428 unsigned max_nr_pages = rac ? readahead_count(rac) :
2429 folio_nr_pages(folio);
2430 unsigned nrpages;
2431 struct f2fs_folio_state *ffs;
2432 int ret = 0;
2433
2434 if (!IS_IMMUTABLE(inode))
2435 return -EOPNOTSUPP;
2436
2437 if (f2fs_compressed_file(inode))
2438 return -EOPNOTSUPP;
2439
2440 map.m_seg_type = NO_CHECK_TYPE;
2441
2442 if (rac)
2443 folio = readahead_folio(rac);
2444 next_folio:
2445 if (!folio)
2446 goto out;
2447
2448 index = folio->index;
2449 offset = 0;
2450 ffs = NULL;
2451 nrpages = folio_nr_pages(folio);
2452
2453 for (; nrpages; nrpages--) {
2454 sector_t block_nr;
2455 /*
2456 * Map blocks using the previous result first.
2457 */
2458 if ((map.m_flags & F2FS_MAP_MAPPED) &&
2459 index > map.m_lblk &&
2460 index < (map.m_lblk + map.m_len))
2461 goto got_it;
2462
2463 /*
2464 * Then do more f2fs_map_blocks() calls until we are
2465 * done with this page.
2466 */
2467 memset(&map, 0, sizeof(map));
2468 map.m_seg_type = NO_CHECK_TYPE;
2469 map.m_lblk = index;
2470 map.m_len = max_nr_pages;
2471
2472 ret = f2fs_map_blocks(inode, &map, F2FS_GET_BLOCK_DEFAULT);
2473 if (ret)
2474 goto err_out;
2475 got_it:
2476 if ((f2fs_block_needs_zeroing(&map))) {
2477 folio_zero_range(folio, offset << PAGE_SHIFT, PAGE_SIZE);
2478 if (f2fs_need_verity(inode, index) &&
2479 !fsverity_verify_page(folio_file_page(folio,
2480 index))) {
2481 ret = -EIO;
2482 goto err_out;
2483 }
2484 continue;
> 2485 } else if((map.m_flags & F2FS_MAP_MAPPED)) {
2486 block_nr = map.m_pblk + index - map.m_lblk;
2487 if (!f2fs_is_valid_blkaddr(F2FS_I_SB(inode), block_nr,
2488 DATA_GENERIC_ENHANCE_READ)) {
2489 ret = -EFSCORRUPTED;
2490 goto err_out;
2491 }
2492 }
2493
2494 /* We must increment read_pages_pending before possible BIOs submitting
2495 * to prevent from premature folio_end_read() call on folio
2496 */
2497 if (folio_test_large(folio)) {
2498 ffs = ffs_find_or_alloc(folio);
2499
2500 /* set the bitmap to wait */
2501 spin_lock_irq(&ffs->state_lock);
2502 ffs->read_pages_pending++;
2503 spin_unlock_irq(&ffs->state_lock);
2504 }
2505
2506 /*
2507 * This page will go to BIO. Do we need to send this
2508 * BIO off first?
2509 */
2510 if (bio && (!page_is_mergeable(F2FS_I_SB(inode), bio,
2511 last_block_in_bio, block_nr) ||
2512 !f2fs_crypt_mergeable_bio(bio, inode, index, NULL))) {
2513 submit_and_realloc:
2514 f2fs_submit_read_bio(F2FS_I_SB(inode), bio, DATA);
2515 bio = NULL;
2516 }
2517 if (bio == NULL)
2518 bio = f2fs_grab_read_bio(inode, block_nr,
2519 max_nr_pages,
2520 f2fs_ra_op_flags(rac),
2521 index, false);
2522
2523 /*
2524 * If the page is under writeback, we need to wait for
2525 * its completion to see the correct decrypted data.
2526 */
2527 f2fs_wait_on_block_writeback(inode, block_nr);
2528
2529 if (!bio_add_folio(bio, folio, F2FS_BLKSIZE,
2530 offset << PAGE_SHIFT))
2531 goto submit_and_realloc;
2532
2533 inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
2534 f2fs_update_iostat(F2FS_I_SB(inode), NULL, FS_DATA_READ_IO,
2535 F2FS_BLKSIZE);
2536 last_block_in_bio = block_nr;
2537 index++;
2538 offset++;
2539 }
2540 trace_f2fs_read_folio(folio, DATA);
2541 if (rac) {
2542 folio = readahead_folio(rac);
2543 goto next_folio;
2544 }
2545 err_out:
2546 /* Nothing was submitted. */
2547 if (!bio) {
2548 if (!ret)
2549 folio_mark_uptodate(folio);
2550 folio_unlock(folio);
2551 return ret;
2552 }
2553
2554 if (ret) {
2555 f2fs_submit_read_bio(F2FS_I_SB(inode), bio, DATA);
2556
2557 /* Wait bios and clear uptodate. */
2558 folio_lock(folio);
2559 folio_clear_uptodate(folio);
2560 folio_unlock(folio);
2561 }
2562 out:
2563 f2fs_submit_read_bio(F2FS_I_SB(inode), bio, DATA);
2564 return ret;
2565 }
2566
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH v1 4/5] f2fs: add 'folio_in_bio' to handle readahead folios with no BIO submission
2026-01-05 15:31 ` [PATCH v1 4/5] f2fs: add 'folio_in_bio' to handle readahead folios with no BIO submission Nanzhe Zhao
@ 2026-01-06 9:31 ` Chao Yu
2026-01-07 0:33 ` Nanzhe Zhao
0 siblings, 1 reply; 21+ messages in thread
From: Chao Yu @ 2026-01-06 9:31 UTC (permalink / raw)
To: Nanzhe Zhao, Kim Jaegeuk; +Cc: chao, linux-f2fs-devel, linux-kernel
On 1/5/2026 11:31 PM, Nanzhe Zhao wrote:
> f2fs_read_data_large_folio() can build a single read BIO across multiple
> folios during readahead. If a folio ends up having none of its subpages
> added to the BIO (e.g. all subpages are zeroed / treated as holes), it
> will never be seen by f2fs_finish_read_bio(), so folio_end_read() is
> never called. This leaves the folio locked and not marked uptodate.
>
> Track whether the current folio has been added to a BIO via a local
> 'folio_in_bio' bool flag, and when iterating readahead folios, explicitly
> mark the folio uptodate (on success) and unlock it when nothing was added.
>
> Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
> ---
> fs/f2fs/data.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 66ab7a43a56f..ac569a396914 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -2430,6 +2430,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> unsigned nrpages;
> struct f2fs_folio_state *ffs;
> int ret = 0;
> + bool folio_in_bio = false;
No need to initialize folio_in_bio?
>
> if (!IS_IMMUTABLE(inode))
> return -EOPNOTSUPP;
> @@ -2445,6 +2446,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> if (!folio)
> goto out;
>
> + folio_in_bio = false
folio_in_bio = false;
> index = folio->index;
> offset = 0;
> ffs = NULL;
> @@ -2530,6 +2532,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> offset << PAGE_SHIFT))
> goto submit_and_realloc;
>
> + folio_in_bio = true;
> inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
> f2fs_update_iostat(F2FS_I_SB(inode), NULL, FS_DATA_READ_IO,
> F2FS_BLKSIZE);
> @@ -2539,6 +2542,11 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> }
> trace_f2fs_read_folio(folio, DATA);
> if (rac) {
> + if (!folio_in_bio) {
> + if (!ret)
> + folio_mark_uptodate(folio);
> + folio_unlock(folio);
> + }
err_out:
/* Nothing was submitted. */
if (!bio) {
if (!ret)
folio_mark_uptodate(folio);
folio_unlock(folio);
^^^^^^^^^^^^
If all folios in rac have not been mapped (hole case), will we unlock the folio twice?
Thanks,
return ret;
}
> folio = readahead_folio(rac);
> goto next_folio;
> }
> --
> 2.34.1
>
* Re: [PATCH v1 5/5] f2fs: advance index and offset after zeroing in large folio read
2026-01-05 15:31 ` [PATCH v1 5/5] f2fs: advance index and offset after zeroing in large folio read Nanzhe Zhao
@ 2026-01-06 9:35 ` Chao Yu
0 siblings, 0 replies; 21+ messages in thread
From: Chao Yu @ 2026-01-06 9:35 UTC (permalink / raw)
To: Nanzhe Zhao, Kim Jaegeuk; +Cc: chao, linux-f2fs-devel, linux-kernel
On 1/5/2026 11:31 PM, Nanzhe Zhao wrote:
> In f2fs_read_data_large_folio(), the block zeroing path calls
> folio_zero_range() and then continues the loop. However, it fails to
> advance index and offset before continuing.
>
> This can cause the loop to repeatedly process the same subpage of the
> folio, leading to stalls/hangs and incorrect progress when reading large
> folios with holes/zeroed blocks.
>
> Fix it by incrementing index and offset in the zeroing path before
> continuing.
>
> Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
> ---
> fs/f2fs/data.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index ac569a396914..07c222bcc5e0 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -2446,7 +2446,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> if (!folio)
> goto out;
>
> - folio_in_bio = false
> + folio_in_bio = false;
Should be fixed in 4/5.
> index = folio->index;
> offset = 0;
> ffs = NULL;
> @@ -2483,6 +2483,8 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> ret = -EIO;
> goto err_out;
> }
> + index++;
> + offset++;
What about increasing index & offset in for () statement, in case we missed
to update them anywhere.
Thanks,
> continue;
> } else if((map.m_flags & F2FS_MAP_MAPPED)) {
> block_nr = map.m_pblk + index - map.m_lblk;
> --
> 2.34.1
>
* Re:Re: [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks
2026-01-06 9:19 ` Chao Yu
@ 2026-01-06 11:25 ` Nanzhe Zhao
0 siblings, 0 replies; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-06 11:25 UTC (permalink / raw)
To: Chao Yu; +Cc: Kim Jaegeuk, linux-f2fs-devel, linux-kernel
At 2026-01-06 17:19:14, "Chao Yu" <chao@kernel.org> wrote:
>On 1/5/2026 11:30 PM, Nanzhe Zhao wrote:
>>IIUC, f2fs_map_blocks(inode, &map, F2FS_GET_BLOCK_DEFAULT) won't map hole
>>space, or am I missing something?
>>
>>Thanks,
My fault, I missed the goto sync_out statement in the default case. Thanks for pointing that out.
* Re: [PATCH v1 4/5] f2fs: add 'folio_in_bio' to handle readahead folios with no BIO submission
2026-01-06 9:31 ` Chao Yu
@ 2026-01-07 0:33 ` Nanzhe Zhao
2026-01-07 1:16 ` Chao Yu
0 siblings, 1 reply; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-07 0:33 UTC (permalink / raw)
To: Chao Yu; +Cc: Kim Jaegeuk, linux-f2fs-devel, linux-kernel
Hi Chao Yu,
At 2026-01-06 17:31:20, "Chao Yu" <chao@kernel.org> wrote:
>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>> index 66ab7a43a56f..ac569a396914 100644
>>> --- a/fs/f2fs/data.c
>>> +++ b/fs/f2fs/data.c
>>> @@ -2430,6 +2430,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
>>> unsigned nrpages;
>>> struct f2fs_folio_state *ffs;
>>> int ret = 0;
>>> + bool folio_in_bio = false;
>>
>>No need to initialize folio_in_bio?
Agreed. It's redundant since we reset it to false for each new folio before processing.
>>> @@ -2539,6 +2542,11 @@ static int f2fs_read_data_large_folio(struct inode *inode,
>>> }
>>> trace_f2fs_read_folio(folio, DATA);
>>> if (rac) {
>>> + if (!folio_in_bio) {
>>> + if (!ret)
>>> + folio_mark_uptodate(folio);
>>> + folio_unlock(folio);
>>> + }
>>
>>err_out:
>> /* Nothing was submitted. */
>> if (!bio) {
>> if (!ret)
>> folio_mark_uptodate(folio);
>> folio_unlock(folio);
>>
>> ^^^^^^^^^^^^
>>
>>If all folios in rac have not been mapped (hole case), will we unlock the folio twice?
Are you worried the folio could be unlocked once in the if (rac) { ... } block and then
unlocked again at err_out:? If so, I think that won't happen.
In such a case, every non-NULL folio will be unlocked exactly once by:
if (!folio_in_bio) {
if (!ret)
folio_mark_uptodate(folio);
folio_unlock(folio);
}
Specifically, after the last folio runs through the block above, the next call:
folio = readahead_folio(rac);
will return NULL. Then we go to next_folio:, and will directly hit:
if (!folio)
goto out;
This jumps straight to the out: label, skipping err_out: entirely.
Therefore, when ret is not an error code, the err_out: label will never be reached.
If ret becomes an error code, then the current folio will immediately goto err_out;
and be unlocked there once.
If rac is NULL (meaning we only read the single large folio passed in as the function argument),
we won't enter the if (rac) { ... goto next_folio; } path at all, so we also won't go to next_folio
and then potentially goto out;. In that case, it will naturally be unlocked once at err_out:.
Or am I missing some edge case here?
Thanks,
Nanzhe
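[Editor's illustration] The "unlocked exactly once" invariant argued above can be checked with a toy userspace model. This is not f2fs code; simulate_readahead(), in_bio[], and unlocks[] are hypothetical stand-ins for the real folio state: a folio with at least one subpage in a BIO is unlocked later by BIO completion (f2fs_finish_read_bio()), while a folio with none (all holes) is unlocked inline by the new !folio_in_bio branch in the rac path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * in_bio[i] says whether folio i had any subpage queued to a BIO.
 * On return, unlocks[i] holds how many times folio i was unlocked;
 * the invariant under discussion is that every entry equals 1.
 */
static void simulate_readahead(const bool *in_bio, size_t nr, int *unlocks)
{
	size_t i;

	for (i = 0; i < nr; i++)
		unlocks[i] = 0;

	/* the readahead loop: if (!folio_in_bio) { ... folio_unlock() } */
	for (i = 0; i < nr; i++) {
		if (!in_bio[i])
			unlocks[i]++;
		/* folios with subpages in a BIO stay locked here ... */
	}

	/* ... and are unlocked exactly once by BIO completion */
	for (i = 0; i < nr; i++)
		if (in_bio[i])
			unlocks[i]++;
}
```

The model mirrors the argument in the mail: the inline branch and BIO completion are mutually exclusive per folio, so no folio is unlocked twice and none is left locked.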
* Re: [PATCH v1 4/5] f2fs: add 'folio_in_bio' to handle readahead folios with no BIO submission
2026-01-07 0:33 ` Nanzhe Zhao
@ 2026-01-07 1:16 ` Chao Yu
0 siblings, 0 replies; 21+ messages in thread
From: Chao Yu @ 2026-01-07 1:16 UTC (permalink / raw)
To: Nanzhe Zhao; +Cc: chao, Kim Jaegeuk, linux-f2fs-devel, linux-kernel
On 1/7/2026 8:33 AM, Nanzhe Zhao wrote:
> Hi Chao Yu,
> At 2026-01-06 17:31:20, "Chao Yu" <chao@kernel.org> wrote:
>>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>>> index 66ab7a43a56f..ac569a396914 100644
>>>> --- a/fs/f2fs/data.c
>>>> +++ b/fs/f2fs/data.c
>>>> @@ -2430,6 +2430,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
>>>> unsigned nrpages;
>>>> struct f2fs_folio_state *ffs;
>>>> int ret = 0;
>>>> + bool folio_in_bio = false;
>>>
>>> No need to initialize folio_in_bio?
>
> Agreed. It's redundant since we reset it to false for each new folio before processing.
>
>>>> @@ -2539,6 +2542,11 @@ static int f2fs_read_data_large_folio(struct inode *inode,
>>>> }
>>>> trace_f2fs_read_folio(folio, DATA);
>>>> if (rac) {
>>>> + if (!folio_in_bio) {
>>>> + if (!ret)
>>>> + folio_mark_uptodate(folio);
>>>> + folio_unlock(folio);
>>>> + }
>>>
>>> err_out:
>>> /* Nothing was submitted. */
>>> if (!bio) {
>>> if (!ret)
>>> folio_mark_uptodate(folio);
>>> folio_unlock(folio);
>>>
>>> ^^^^^^^^^^^^
>>>
>>> If all folios in rac have not been mapped (hole case), will we unlock the folio twice?
>
> Are you worried the folio could be unlocked once in the if (rac) { ... } block and then
> unlocked again at err_out:? If so, I think that won't happen.
>
> In such a case, every non-NULL folio will be unlocked exactly once by:
>
> if (!folio_in_bio) {
> if (!ret)
> folio_mark_uptodate(folio);
> folio_unlock(folio);
> }
> Specifically, after the last folio runs through the block above, the next call:
>
> folio = readahead_folio(rac);
> will return NULL. Then we go to next_folio:, and will directly hit:
>
> if (!folio)
> goto out;
> This jumps straight to the out: label, skipping err_out: entirely.
> Therefore, when ret is not an error code, the err_out: label will never be reached.
>
> If ret becomes an error code, then the current folio will immediately goto err_out;
> and be unlocked there once.
>
> If rac is NULL (meaning we only read the single large folio passed in as the function argument),
> we won't enter the if (rac) { ... goto next_folio; } path at all, so we also won't go to next_folio
> and then potentially goto out;. In that case, it will naturally be unlocked once at err_out:.
> Or am I missing some edge case here?
Nanzhe,
Oh, yes, I think so, thanks for the explanation.
Thanks,
>
> Thanks,
> Nanzhe
* Re: [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files
2026-01-05 15:30 [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Nanzhe Zhao
` (4 preceding siblings ...)
2026-01-05 15:31 ` [PATCH v1 5/5] f2fs: advance index and offset after zeroing in large folio read Nanzhe Zhao
@ 2026-01-07 3:08 ` Jaegeuk Kim
2026-01-08 2:17 ` Nanzhe Zhao
5 siblings, 1 reply; 21+ messages in thread
From: Jaegeuk Kim @ 2026-01-07 3:08 UTC (permalink / raw)
To: Nanzhe Zhao; +Cc: Chao Yu, linux-f2fs-devel, linux-kernel
Hi Nanzhe,
fyi - I applied the beginning two patches first.
Thanks,
On 01/05, Nanzhe Zhao wrote:
> When reading immutable, non-compressed files with large folios enabled,
> I was able to reproduce readahead hangs while reading sparse files with
> holes and heavily fragmented files. The problems were caused by a few
> corner cases in the large-folio read loop:
>
> - f2fs_folio_state could be observed with uninitialized field
> read_pages_pending
> - subpage accounting could become inconsistent with BIO completion,
> leading to folios being prematurely unlocked/marked uptodate.
> - NULL_ADDR/NEW_ADDR blocks can carry F2FS_MAP_MAPPED, causing the
> large-folio read path to treat hole blocks as mapped and to account
> them in read_pages_pending.
> - in readahead, a folio that never had any subpage queued to a BIO
> would not be seen by f2fs_finish_read_bio(), leaving it locked.
> - the zeroing path did not advance index/offset before continuing.
>
> This patch series fixes the above issues in f2fs_read_data_large_folio()
> introduced by commit 05e65c14ea59 ("f2fs: support large folio for
> immutable non-compressed case").
>
> Testing
> -------
>
> All patches pass scripts/checkpatch.pl.
>
> I tested the basic large-folio immutable read case described in the
> original thread (create a large file, set immutable, drop caches to
> reload the inode, then read it), and additionally verified:
>
> - sparse file
> - heavily fragmented file
>
> In all cases, reads completed without hangs and data was verified against
> the expected contents.
>
> Nanzhe Zhao (5):
> f2fs: Zero f2fs_folio_state on allocation
> f2fs: Accounting large folio subpages before bio submission
> f2fs: add f2fs_block_needs_zeroing() to handle hole blocks
> f2fs: add 'folio_in_bio' to handle readahead folios with no BIO
> submission
> f2fs: advance index and offset after zeroing in large folio read
>
> fs/f2fs/data.c | 54 +++++++++++++++++++++++++++++++++-----------------
> 1 file changed, 36 insertions(+), 18 deletions(-)
>
>
> base-commit: 48b5439e04ddf4508ecaf588219012dc81d947c0
> --
> 2.34.1
* Re: [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation
2026-01-06 3:38 ` Barry Song
@ 2026-01-07 3:44 ` Nanzhe Zhao
2026-01-08 22:35 ` Barry Song
0 siblings, 1 reply; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-07 3:44 UTC (permalink / raw)
To: Barry Song; +Cc: Jaegeuk Kim, Chao Yu, linux-f2fs-devel, linux-kernel
Hi Barry,
>At 2026-01-06 11:38:49, "Barry Song" <21cnbao@gmail.com> wrote:
>>On Tue, Jan 6, 2026 at 12:12 AM Nanzhe Zhao <nzzhao@126.com> wrote:
>>>
>>> f2fs_folio_state is attached to folio->private and is expected to start
>>> with read_pages_pending == 0. However, the structure was allocated from
>>> ffs_entry_slab without being fully initialized, which can leave
>>> read_pages_pending with stale values.
>>>
>>> Allocate the object with __GFP_ZERO so all fields are reliably zeroed at
>>> creation time.
>>>
>>> Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
>>
>>
>>We already have GFP_F2FS_ZERO, but it includes GFP_IO. Should we
>>introduce another variant, such as GFP_F2FS_NOIO_ZERO (or similar)?
>>Overall, LGTM.
>>
I still don't fully understand the exact semantics of GFP_NOIO vs GFP_NOFS.
I did a bit of digging and, in the current buffered read / readahead context, it seems
there may be no meaningful difference for the purpose of avoiding direct-reclaim
recursion deadlocks.
My current (possibly incomplete) understanding is that in may_enter_fs(), GFP_NOIO
only changes behavior for swapcache folios, rather than file-backed folios that are
currently in the read IO path, and the swap writeback path won't recurse back into
f2fs's own writeback function anyway. (On phones there usually isn't a swap partition;
for zram I guess swap writeback is effectively writing to RAM via the zram block
device? Sorry for not being very familiar with the details there.)
I noticed iomap's ifs_alloc uses GFP_NOFS | __GFP_NOFAIL. So if GFP_NOFS is acceptable
here, we could simply use GFP_F2FS_ZERO and avoid introducing a new GFP_F2FS_NOIO_ZERO
variant?
Just curious. I would vote for GFP_NOIO from a semantic-clarity perspective here.
Thanks,
Nanzhe
* Re: [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files
2026-01-07 3:08 ` [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Jaegeuk Kim
@ 2026-01-08 2:17 ` Nanzhe Zhao
2026-01-08 9:23 ` Chao Yu
0 siblings, 1 reply; 21+ messages in thread
From: Nanzhe Zhao @ 2026-01-08 2:17 UTC (permalink / raw)
To: Jaegeuk Kim; +Cc: Chao Yu, Barry Song, linux-f2fs-devel, linux-kernel
Hi Kim,
At 2026-01-07 11:08:50, "Jaegeuk Kim" <jaegeuk@kernel.org> wrote:
>>Hi Nanzhe,
>>
>>fyi - I applied the beginning two patches first.
>>
>>Thanks,
>>
Thanks for applying my small changes.
By the way, I’d like to discuss one more thing about testing for large folios.
It seems the current xfstests coverage may not be sufficient. Would it be
welcome if I contributed some new test cases upstream?
Also, I think large-folio functionality might also need black-box testing such
as fault-injection, where we force certain paths to return errors and verify
behavior under failures. I’d appreciate your thoughts.
Thanks,
Nanzhe
* Re: [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files
2026-01-08 2:17 ` Nanzhe Zhao
@ 2026-01-08 9:23 ` Chao Yu
0 siblings, 0 replies; 21+ messages in thread
From: Chao Yu @ 2026-01-08 9:23 UTC (permalink / raw)
To: Nanzhe Zhao, Jaegeuk Kim; +Cc: chao, Barry Song, linux-f2fs-devel, linux-kernel
On 1/8/2026 10:17 AM, Nanzhe Zhao wrote:
> Hi Kim,
> At 2026-01-07 11:08:50, "Jaegeuk Kim" <jaegeuk@kernel.org> wrote:
>>> Hi Nanzhe,
>>>
>>> fyi - I applied the beginning two patches first.
>>>
>>> Thanks,
>>>
>
> Thanks for applying my small changes.
>
> By the way, I’d like to discuss one more thing about testing for large folios.
> It seems the current xfstests coverage may not be sufficient. Would it be
> welcome for me to contribute some new test cases upstream?
Great, please go ahead; new testcases can be added into the tests/f2fs/ directory.
>
> Also, I think large-folio functionality might also need black-box testing such
> as fault-injection, where we force certain paths to return errors and verify
> behavior under failures. I’d appreciate your thoughts.
It's fine to introduce a new helper, f2fs_fsverity_verify_page(), and inject errors there.
Thanks,
>
> Thanks,
> Nanzhe
* Re: [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation
2026-01-07 3:44 ` Nanzhe Zhao
@ 2026-01-08 22:35 ` Barry Song
0 siblings, 0 replies; 21+ messages in thread
From: Barry Song @ 2026-01-08 22:35 UTC (permalink / raw)
To: Nanzhe Zhao; +Cc: Jaegeuk Kim, Chao Yu, linux-f2fs-devel, linux-kernel
On Wed, Jan 7, 2026 at 4:45 PM Nanzhe Zhao <nzzhao@126.com> wrote:
>
> Hi Barry:
>
> >At 2026-01-06 11:38:49, "Barry Song" <21cnbao@gmail.com> wrote:
> >>On Tue, Jan 6, 2026 at 12:12 AM Nanzhe Zhao <nzzhao@126.com> wrote:
> >>>
> >>> f2fs_folio_state is attached to folio->private and is expected to start
> >>> with read_pages_pending == 0. However, the structure was allocated from
> >>> ffs_entry_slab without being fully initialized, which can leave
> >>> read_pages_pending with stale values.
> >>>
> >>> Allocate the object with __GFP_ZERO so all fields are reliably zeroed at
> >>> creation time.
> >>>
> >>> Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
> >>
> >>
> >>We already have GFP_F2FS_ZERO, but it includes GFP_IO. Should we
> >>introduce another variant, such as GFP_F2FS_NOIO_ZERO (or similar)?
> >>Overall, LGTM.
> >>
>
> I still don't fully understand the exact semantics of GFP_NOIO vs GFP_NOFS.
> I did a bit of digging and, in the current buffered read / readahead context, it seems
> there may be no meaningful difference for the purpose of avoiding direct-reclaim
> recursion deadlocks.
With GFP_NOIO, we will not swap out pages, including anonymous folios.
if (folio_test_anon(folio) && folio_test_swapbacked(folio)) {
if (!folio_test_swapcache(folio)) {
if (!(sc->gfp_mask & __GFP_IO))
goto keep_locked;
When using GFP_NOFS, reclaim can still swap out an anon folio,
provided its swap entry is not filesystem-backed
(see folio_swap_flags(folio)).
static bool may_enter_fs(struct folio *folio, gfp_t gfp_mask)
{
if (gfp_mask & __GFP_FS)
return true;
if (!folio_test_swapcache(folio) || !(gfp_mask & __GFP_IO))
return false;
/*
* We can "enter_fs" for swap-cache with only __GFP_IO
* providing this isn't SWP_FS_OPS.
* ->flags can be updated non-atomicially (scan_swap_map_slots),
* but that will never affect SWP_FS_OPS, so the data_race
* is safe.
*/
return !data_race(folio_swap_flags(folio) & SWP_FS_OPS);
}
Note that swap may be backed either by a filesystem swapfile or
directly by a block device.
In short, GFP_NOIO is stricter than GFP_NOFS: it disallows any I/O,
even if the I/O does not involve a filesystem, whereas GFP_NOFS
still permits I/O that is not filesystem-related.
>
> My current (possibly incomplete) understanding is that in may_enter_fs(), GFP_NOIO
> only changes behavior for swapcache folios, rather than file-backed folios that are
> currently in the read IO path, and the swap writeback path won't recurse back into
> f2fs's own writeback function anyway. (On phones there usually isn't a swap partition;
> for zram I guess swap writeback is effectively writing to RAM via the zram block device?
> Sorry for not being very familiar with the details there.)
This can be the case for a swapfile on F2FS. Note that the check is
performed per folio. On a system with both zRAM and a filesystem-
backed swapfile, some folios may be swapped out while others may
not, depending on where their swap slots are allocated.
>
> I noticed iomap's ifs_alloc uses GFP_NOFS | __GFP_NOFAIL. So if GFP_NOFS is acceptable here,
> we could simply use GFP_F2FS_ZERO and avoid introducing a new GFP_F2FS_NOIO_ZERO variant?
>
> Just curious. I would vote for GFP_NOIO from a semantic-clarity perspective here.
In general, GFP_NOIO is used when handling bios or requests.
Thanks
Barry
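[Editor's illustration] The strictness relation Barry describes can be sketched with stand-in flag bits. The X_GFP_* values below are made up for illustration; the real __GFP_* constants live in include/linux/gfp_types.h. The structural point: GFP_NOFS drops only __GFP_FS, while GFP_NOIO drops both __GFP_IO and __GFP_FS, so GFP_NOIO is the stricter reclaim context.

```c
#include <assert.h>

#define X_GFP_RECLAIM	0x1u	/* stand-in for __GFP_RECLAIM */
#define X_GFP_IO	0x2u	/* stand-in for __GFP_IO */
#define X_GFP_FS	0x4u	/* stand-in for __GFP_FS */

/* Mirrors how the real GFP_* masks compose from the low-level bits. */
#define X_GFP_KERNEL	(X_GFP_RECLAIM | X_GFP_IO | X_GFP_FS)
#define X_GFP_NOFS	(X_GFP_RECLAIM | X_GFP_IO)
#define X_GFP_NOIO	(X_GFP_RECLAIM)

/* The kind of checks quoted above from shrink_folio_list()/may_enter_fs(). */
static int may_do_io(unsigned int gfp)
{
	return (gfp & X_GFP_IO) != 0;
}

static int may_enter_fs_mask(unsigned int gfp)
{
	return (gfp & X_GFP_FS) != 0;
}
```

Under this model, reclaim in a GFP_NOFS context may still issue non-filesystem I/O (e.g. block-device-backed swap), while a GFP_NOIO context forbids both, matching the summary above.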
end of thread, other threads:[~2026-01-08 22:35 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-01-05 15:30 [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Nanzhe Zhao
2026-01-05 15:30 ` [PATCH v1 1/5] f2fs: Zero f2fs_folio_state on allocation Nanzhe Zhao
2026-01-06 3:38 ` Barry Song
2026-01-07 3:44 ` Nanzhe Zhao
2026-01-08 22:35 ` Barry Song
2026-01-06 9:16 ` Chao Yu
2026-01-05 15:30 ` [PATCH v1 2/5] f2fs: Accounting large folio subpages before bio submission Nanzhe Zhao
2026-01-06 9:16 ` Chao Yu
2026-01-05 15:30 ` [PATCH v1 3/5] f2fs: add f2fs_block_needs_zeroing() to handle hole blocks Nanzhe Zhao
2026-01-06 9:19 ` Chao Yu
2026-01-06 11:25 ` Nanzhe Zhao
2026-01-06 9:30 ` kernel test robot
2026-01-05 15:31 ` [PATCH v1 4/5] f2fs: add 'folio_in_bio' to handle readahead folios with no BIO submission Nanzhe Zhao
2026-01-06 9:31 ` Chao Yu
2026-01-07 0:33 ` Nanzhe Zhao
2026-01-07 1:16 ` Chao Yu
2026-01-05 15:31 ` [PATCH v1 5/5] f2fs: advance index and offset after zeroing in large folio read Nanzhe Zhao
2026-01-06 9:35 ` Chao Yu
2026-01-07 3:08 ` [PATCH v1 0/5] f2fs: fix large folio read corner cases for immutable files Jaegeuk Kim
2026-01-08 2:17 ` Nanzhe Zhao
2026-01-08 9:23 ` Chao Yu