linux-fsdevel.vger.kernel.org archive mirror
* [PATCH] fuse: fix readahead reclaim deadlock
@ 2025-09-25 22:44 Joanne Koong
  2025-09-26  6:51 ` Gao Xiang
  2025-09-26  9:01 ` Miklos Szeredi
  0 siblings, 2 replies; 12+ messages in thread
From: Joanne Koong @ 2025-09-25 22:44 UTC
  To: miklos; +Cc: linux-fsdevel, osandov, kernel-team

A deadlock can occur if the server triggers reclaim while servicing a
readahead request, and reclaim attempts to evict the inode of the file
being read ahead:

>>> stack_trace(1504735)
 folio_wait_bit_common (mm/filemap.c:1308:4)
 folio_lock (./include/linux/pagemap.h:1052:3)
 truncate_inode_pages_range (mm/truncate.c:336:10)
 fuse_evict_inode (fs/fuse/inode.c:161:2)
 evict (fs/inode.c:704:3)
 dentry_unlink_inode (fs/dcache.c:412:3)
 __dentry_kill (fs/dcache.c:615:3)
 shrink_kill (fs/dcache.c:1060:12)
 shrink_dentry_list (fs/dcache.c:1087:3)
 prune_dcache_sb (fs/dcache.c:1168:2)
 super_cache_scan (fs/super.c:221:10)
 do_shrink_slab (mm/shrinker.c:435:9)
 shrink_slab (mm/shrinker.c:626:10)
 shrink_node (mm/vmscan.c:5951:2)
 shrink_zones (mm/vmscan.c:6195:3)
 do_try_to_free_pages (mm/vmscan.c:6257:3)
 do_swap_page (mm/memory.c:4136:11)
 handle_pte_fault (mm/memory.c:5562:10)
 handle_mm_fault (mm/memory.c:5870:9)
 do_user_addr_fault (arch/x86/mm/fault.c:1338:10)
 handle_page_fault (arch/x86/mm/fault.c:1481:3)
 exc_page_fault (arch/x86/mm/fault.c:1539:2)
 asm_exc_page_fault+0x22/0x27

During readahead, the folio is locked. When fuse_evict_inode() is
called, it attempts to remove all folios associated with the inode from
the page cache (truncate_inode_pages_range()), which requires acquiring
the folio lock. If the server triggers reclaim while servicing a
readahead request, reclaim will block indefinitely waiting for the folio
lock, while readahead cannot relinquish the lock because it is itself
blocked in reclaim, resulting in a deadlock.

The inode is only evicted if it has no remaining references after its
dentry is unlinked. Since readahead is asynchronous, it is not
guaranteed that the inode will have any references at this point.

Fix the deadlock by holding a reference on the inode while readahead is
in progress, which prevents the inode from being evicted until readahead
completes. This also prevents a malicious or buggy server from
indefinitely blocking kswapd if it never fulfills a readahead request.
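
For illustration only, the interaction described above can be sketched as
a small userspace model. This is not kernel code: struct model_inode,
the model_*() helpers and the reclaim_result enum are hypothetical names
invented for this sketch, and the model assumes a single folio per inode
and reports "would block" where the kernel would actually sleep on the
folio lock.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode; NOT the kernel's struct inode. */
struct model_inode {
	int refcount;       /* references currently held on the inode */
	bool folio_locked;  /* true while a readahead folio is locked */
	bool evicted;       /* true once reclaim has evicted the inode */
};

enum reclaim_result {
	RECLAIM_SKIPPED,      /* inode still referenced, left alone */
	RECLAIM_EVICTED,      /* inode evicted, page cache truncated */
	RECLAIM_WOULD_BLOCK   /* eviction would wait on the folio lock */
};

/* Models igrab(): take a reference unless the inode is already gone. */
static struct model_inode *model_igrab(struct model_inode *inode)
{
	if (inode->evicted)
		return NULL;
	inode->refcount++;
	return inode;
}

/* Models iput(): drop one reference. */
static void model_iput(struct model_inode *inode)
{
	inode->refcount--;
}

/* Readahead submission: the folio is locked until the server replies. */
static void model_readahead_start(struct model_inode *inode, bool hold_ref)
{
	inode->folio_locked = true;
	if (hold_ref)
		model_igrab(inode);
}

/* Readahead completion: unlock the folio and drop the reference, if any. */
static void model_readahead_end(struct model_inode *inode, bool hold_ref)
{
	inode->folio_locked = false;
	if (hold_ref)
		model_iput(inode);
}

/*
 * Reclaim only evicts unreferenced inodes, and eviction has to take the
 * folio lock to truncate the page cache. If reclaim runs in the server's
 * own context, RECLAIM_WOULD_BLOCK corresponds to the deadlock.
 */
static enum reclaim_result model_reclaim(struct model_inode *inode)
{
	if (inode->refcount > 0)
		return RECLAIM_SKIPPED;
	if (inode->folio_locked)
		return RECLAIM_WOULD_BLOCK;
	inode->evicted = true;
	return RECLAIM_EVICTED;
}
```

In this model, calling model_readahead_start() with hold_ref set to false
and then model_reclaim() yields RECLAIM_WOULD_BLOCK, while the same
sequence with hold_ref set to true yields RECLAIM_SKIPPED until
model_readahead_end() drops the reference.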

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reported-by: Omar Sandoval <osandov@fb.com>
---
 fs/fuse/file.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index f1ef77a0be05..8e759061b843 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -893,6 +893,7 @@ static void fuse_readpages_end(struct fuse_mount *fm, struct fuse_args *args,
 	if (ia->ff)
 		fuse_file_put(ia->ff, false);
 
+	iput(inode);
 	fuse_io_free(ia);
 }
 
@@ -973,6 +974,12 @@ static void fuse_readahead(struct readahead_control *rac)
 		ia = fuse_io_alloc(NULL, cur_pages);
 		if (!ia)
 			break;
+		/*
+		 *  Acquire the inode ref here to prevent reclaim from
+		 *  deadlocking. The ref gets dropped in fuse_readpages_end().
+		 */
+		igrab(inode);
+
 		ap = &ia->ap;
 
 		while (pages < cur_pages) {
-- 
2.47.3


Thread overview: 12+ messages
2025-09-25 22:44 [PATCH] fuse: fix readahead reclaim deadlock Joanne Koong
2025-09-26  6:51 ` Gao Xiang
2025-09-26  7:19   ` Gao Xiang
2025-09-29 17:25     ` Joanne Koong
2025-09-30  2:21       ` Gao Xiang
2025-09-30  2:35         ` Gao Xiang
2025-09-30 10:08         ` Miklos Szeredi
2025-09-30 18:47           ` Joanne Koong
2025-09-30 18:55             ` Miklos Szeredi
2025-10-01  0:18               ` Joanne Koong
2025-10-07  0:37           ` Joanne Koong
2025-09-26  9:01 ` Miklos Szeredi
