The Linux Kernel Mailing List
* [PATCH] btrfs: don't let shrinker touch extent_maps that are being logged
@ 2026-06-29 13:10 Jeff Layton
  2026-06-29 14:19 ` Filipe Manana
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Layton @ 2026-06-29 13:10 UTC
  To: Chris Mason, David Sterba, Filipe Manana, Josef Bacik
  Cc: linux-btrfs, linux-kernel, Jeff Layton

The extent map shrinker can free an extent map that is still owned by an
in-flight fsync and still linked on the inode's modified_extents list,
corrupting that list and eventually causing an RCU stall.

btrfs_scan_inode() currently skips EXTENT_FLAG_PINNED maps, then calls
btrfs_remove_extent_mapping() followed by btrfs_free_extent_map():

	if (em->flags & EXTENT_FLAG_PINNED)
		goto next;
	...
	btrfs_remove_extent_mapping(inode, em);
	btrfs_free_extent_map(em);

But btrfs_remove_extent_mapping() deliberately does NOT unlink a map that
has EXTENT_FLAG_LOGGING set:

	if (!(em->flags & EXTENT_FLAG_LOGGING))
		list_del_init(&em->list);
	remove_em(inode, em);

This sets up a use-after-free (UAF): the shrinker frees the extent_map
while it is still linked on the modified_extents list, and a later
fsync() can then trip over the freed entry.

Fix it by having the shrinker skip maps that are being logged, the same
way it skips pinned maps. Such a map is owned by the in-flight fsync and
will become reclaimable again once logging clears the flag.
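
The interaction described above can be illustrated with a small
user-space model. This is only a sketch, not kernel code: the model_*
names, the flag values, and the boolean fields standing in for list and
rb-tree membership are all assumptions made for illustration. It shows
that removing a map with the logging flag set leaves it on the modified
list, so freeing it afterwards leaves a dangling entry, and that
skipping such maps avoids this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for this model; not the kernel's definitions. */
enum {
	MODEL_FLAG_PINNED  = 1u << 0,
	MODEL_FLAG_LOGGING = 1u << 1,
};

/* Simplified stand-in for struct extent_map. */
struct model_em {
	unsigned int flags;
	bool on_modified_list;	/* models linkage via em->list */
	bool in_tree;		/* models membership in the inode's tree */
	bool freed;		/* models the effect of freeing the map */
};

/* Mirrors the quoted removal logic: a LOGGING map stays on the list. */
static void model_remove_mapping(struct model_em *em)
{
	if (!(em->flags & MODEL_FLAG_LOGGING))
		em->on_modified_list = false;
	em->in_tree = false;
}

/*
 * One shrinker step on a single map. With skip_logging == false this
 * models the old check (PINNED only); with true it models the new one.
 * Returns true if the map was reclaimed.
 */
static bool model_shrink_one(struct model_em *em, bool skip_logging)
{
	unsigned int skip = MODEL_FLAG_PINNED;

	assert(em->in_tree);	/* the scan only sees maps still in the tree */

	if (skip_logging)
		skip |= MODEL_FLAG_LOGGING;
	if (em->flags & skip)
		return false;

	model_remove_mapping(em);
	em->freed = true;
	return true;
}

/* True when a freed map is still reachable from the modified list. */
static bool model_is_dangling(const struct model_em *em)
{
	return em->freed && em->on_modified_list;
}
```

In this model, calling model_shrink_one() with skip_logging == false on
a map that has MODEL_FLAG_LOGGING set reclaims it and leaves
model_is_dangling() true. With skip_logging == true the same map is left
alone.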

Fixes: 956a17d9d050 ("btrfs: add a shrinker for extent maps")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
We've started hitting a number of these problems in our fleet. It
seems to mostly happen on arm64, but some WARN_ONs have popped on
x86_64 too.
---
 fs/btrfs/extent_map.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
index fce9c5cc0122..128f7800e101 100644
--- a/fs/btrfs/extent_map.c
+++ b/fs/btrfs/extent_map.c
@@ -1166,7 +1166,13 @@ static long btrfs_scan_inode(struct btrfs_inode *inode, struct btrfs_em_shrink_c
 		em = rb_entry(node, struct extent_map, rb_node);
 		ctx->scanned++;
 
-		if (em->flags & EXTENT_FLAG_PINNED)
+		/*
+		 * Skip extent maps that are pinned or are being logged. The
+		 * i_mmap_lock should prevent this from seeing LOGGING on extent_maps
+		 * directly associated with inode, but em may be associated with
+		 * other, dependent inodes and their locks are not held.
+		 */
+		if (em->flags & (EXTENT_FLAG_PINNED | EXTENT_FLAG_LOGGING))
 			goto next;
 
 		/*

---
base-commit: dc59e4fea9d83f03bad6bddf3fa2e52491777482
change-id: 20260629-btrfs-skip-logging-3e31701d9647

Best regards,
-- 
Jeff Layton <jlayton@kernel.org>



Thread overview: 7+ messages
2026-06-29 13:10 [PATCH] btrfs: don't let shrinker touch extent_maps that are being logged Jeff Layton
2026-06-29 14:19 ` Filipe Manana
2026-06-29 14:56   ` Jeff Layton
2026-06-29 15:06     ` Filipe Manana
2026-06-29 15:47       ` Filipe Manana
2026-06-30 12:28         ` Jeff Layton
2026-06-30 14:06           ` Filipe Manana
