public inbox for linux-ext4@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] ext4: avoid __GFP_NOFAIL in __ext4_get_inode_loc allocation
@ 2026-04-27 22:23 Chao Shi
  2026-04-28  1:28 ` Theodore Tso
  0 siblings, 1 reply; 2+ messages in thread
From: Chao Shi @ 2026-04-27 22:23 UTC (permalink / raw)
  To: linux-ext4
  Cc: tytso, adilger.kernel, jack, Chao Shi, Sungwoo Kim, Dave Tian,
	Weidong Zhu

When kswapd shrinks the dcache, the last iput() on an ext4 inode can
trigger ext4_orphan_del(), which calls ext4_reserve_inode_write() and
ultimately __ext4_get_inode_loc().  That function calls sb_getblk(),
which wraps __getblk() and carries implicit __GFP_NOFAIL.  Because
kswapd runs with PF_MEMALLOC set, combining NOFAIL with a non-reclaimable
context trips WARN_ON_ONCE(current->flags & PF_MEMALLOC) inside
__alloc_pages_slowpath(), producing a spurious splat even though the
allocation could simply fail and return -ENOMEM to the caller.

Switch both sb_getblk() call sites in __ext4_get_inode_loc() to
sb_getblk_gfp() with the same flags minus __GFP_NOFAIL
(mapping_gfp_constraint(~__GFP_FS) | __GFP_MOVABLE), computing the gfp
value once and reusing it for the optional bitmap_bh optimisation fetch.
All callers of __ext4_get_inode_loc() -- reached via ext4_get_inode_loc(),
__ext4_get_inode_loc_noinmem(), and ext4_get_fc_inode_loc() -- already
propagate a non-zero return as an error without aborting the filesystem.
Both sb_getblk() call sites in __ext4_get_inode_loc() are converted; the
bitmap_bh fetch already falls back to make_io on NULL, so allowing it to
fail is a no-op there.

Reproduced under syzkaller+FEMU based fuzz tool (FuzzNvme) on x86_64 QEMU,
based on mainline 894009e2ef10:

  WARNING: CPU: 0 PID: 55 at mm/page_alloc.c:4722
                                    __alloc_pages_slowpath
  Comm: kswapd0 Not tainted 6.19.0+ #14
  Call Trace:
   __alloc_pages_slowpath
   alloc_pages_mpol
   folio_alloc_noprof
   filemap_alloc_folio_noprof
   __filemap_get_folio
   grow_dev_folio
   grow_buffers
   __getblk_slow
   bdev_getblk
   __ext4_get_inode_loc
   ext4_get_inode_loc
   ext4_reserve_inode_write
   ext4_orphan_del
   ext4_evict_inode
   evict
   iput
   dentry_unlink_inode
   __dentry_kill
   shrink_dentry_list
   prune_dcache_sb
   super_cache_scan
   do_shrink_slab
   shrink_slab
   shrink_node
   balance_pgdat
   kswapd
   kthread
   ret_from_fork

Related: see d8b90e6387a ("ext4: add ext4_sb_bread_nofail() helper
function for ext4_free_branches()") for the same strategy applied to
the read path in ext4_free_branches().
Link: https://lore.kernel.org/all/?q=PF_MEMALLOC+nofail+ext4+iput

Acked-by: Sungwoo Kim <iam@sung-woo.kim>
Acked-by: Dave Tian <daveti@purdue.edu>
Acked-by: Weidong Zhu <weizhu@fiu.edu>
Signed-off-by: Chao Shi <coshi036@gmail.com>
---
 fs/ext4/inode.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index c2c2d6ac7f3..1b2a7bd59b8 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4859,6 +4859,7 @@ static int __ext4_get_inode_loc(struct super_block *sb, unsigned long ino,
 	ext4_fsblk_t		block;
 	struct blk_plug		plug;
 	int			inodes_per_block, inode_offset;
+	gfp_t			gfp;
 
 	iloc->bh = NULL;
 	if (ino < EXT4_ROOT_INO ||
@@ -4887,7 +4888,14 @@ static int __ext4_get_inode_loc(struct super_block *sb, unsigned long ino,
 	}
 	block += (inode_offset / inodes_per_block);
 
-	bh = sb_getblk(sb, block);
+	/*
+	 * No __GFP_NOFAIL: this can run from reclaim context (kswapd
+	 * shrinker -> iput -> ext4_orphan_del path) where NOFAIL trips
+	 * WARN_ON_ONCE in __alloc_pages_slowpath().
+	 */
+	gfp = mapping_gfp_constraint(sb->s_bdev->bd_mapping, ~__GFP_FS) |
+		__GFP_MOVABLE;
+	bh = sb_getblk_gfp(sb, block, gfp);
 	if (unlikely(!bh))
 		return -ENOMEM;
 	if (ext4_buffer_uptodate(bh))
@@ -4912,7 +4920,7 @@ static int __ext4_get_inode_loc(struct super_block *sb, unsigned long ino,
 		start = inode_offset & ~(inodes_per_block - 1);
 
 		/* Is the inode bitmap in cache? */
-		bitmap_bh = sb_getblk(sb, ext4_inode_bitmap(sb, gdp));
+		bitmap_bh = sb_getblk_gfp(sb, ext4_inode_bitmap(sb, gdp), gfp);
 		if (unlikely(!bitmap_bh))
 			goto make_io;
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2026-04-28  1:29 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-27 22:23 [PATCH] ext4: avoid __GFP_NOFAIL in __ext4_get_inode_loc allocation Chao Shi
2026-04-28  1:28 ` Theodore Tso

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox