linux-fsdevel.vger.kernel.org archive mirror
* [PATCH v2 0/8] mm: enhance migration work around on noref buffer-heads
@ 2025-04-10  1:49 Luis Chamberlain
  2025-04-10  1:49 ` [PATCH v2 1/8] migrate: fix skipping metadata buffer heads on migration Luis Chamberlain
                   ` (7 more replies)
  0 siblings, 8 replies; 38+ messages in thread
From: Luis Chamberlain @ 2025-04-10  1:49 UTC
  To: brauner, jack, tytso, adilger.kernel, linux-ext4, riel
  Cc: dave, willy, hannes, oliver.sang, david, axboe, hare, david,
	djwong, ritesh.list, linux-fsdevel, linux-block, linux-mm,
	gost.dev, p.raghav, da.gomez, mcgrof

We have an eyesore of a spin lock held during page migration, which
was added for an ext4 jbd corruption fix for which we have no clear
public corruption data. We want to remove the spin lock in mm/migrate
to help buffer-head filesystems embrace large folios, since we
can cond_resched() on large folios in folio_mc_copy(). We've managed
to reproduce a corruption by just removing the spinlock and stressing
ext4 with generic/750; in many runs the corruption shows up after 3 hours.
However, while developing an alternative fix based on feedback [0], we've
come to the conclusion that ext4 on vanilla Linux is still affected. We still
have a lingering jbd2 corruption issue.

The underlying race is in jbd2’s use of buffer_migrate_folio_norefs() without
proper synchronization, making it unsafe during folio migration.
ext4 uses jbd2 as its journaling backend. The corruption surfaces in ext4's
metadata operations, like ext4_ext_insert_extent(), when journal metadata fails
to be marked dirty due to the migration race. This leads to ENOSPC, journal
aborts, read-only fallback, and long-term filesystem corruption seen in replay
logs and "error count since last fsck".

This series simply skips folio migration on jbd2 metadata buffers to avoid races during
journal writes that can lead to filesystem corruption, but also paves the way
to enable jbd2 to eventually overcome this limitation and enable folio
migration, while also implementing some of the suggested enhancements on
__find_get_block_slow(). The suggested trylock idea is implemented, thereby
potentially reducing i_private_lock contention and leveraging folio_trylock()
when allowed.
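
To make the trylock idea concrete, here is a minimal user-space sketch. It is
not the code from this series: every toy_* name is hypothetical, the locks are
plain atomic flags, and the "when allowed" condition is not modelled. It only
illustrates the pattern of trying the per-folio lock first and falling back to
the shared spin lock (standing in for i_private_lock) when the folio lock is
already held.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel types. */
struct toy_lock {
	atomic_flag held;
};

static void toy_lock_init(struct toy_lock *l)
{
	atomic_flag_clear(&l->held);
}

static bool toy_trylock(struct toy_lock *l)
{
	/* test_and_set returns the previous value: false means we took it. */
	return !atomic_flag_test_and_set(&l->held);
}

static void toy_lock_spin(struct toy_lock *l)
{
	while (atomic_flag_test_and_set(&l->held))
		; /* busy-wait, like a spin lock */
}

static void toy_unlock(struct toy_lock *l)
{
	atomic_flag_clear(&l->held);
}

enum lock_used { USED_FOLIO_LOCK, USED_PRIVATE_LOCK };

struct toy_folio {
	struct toy_lock folio_lock;	/* models the folio lock */
	const int *blocks;		/* block numbers cached in this folio */
	size_t nr_blocks;
};

struct toy_mapping {
	struct toy_lock private_lock;	/* models i_private_lock */
	struct toy_folio *folio;
};

/*
 * Look up a block. Try the folio lock first; only if that fails, take the
 * shared spin lock. Returns the block's index, or -1 if it is not present,
 * and reports through *used which lock protected the walk.
 */
static int toy_find_block(struct toy_mapping *m, int block,
			  enum lock_used *used)
{
	struct toy_folio *f;
	bool folio_locked;
	int found = -1;

	assert(m && m->folio && used);
	f = m->folio;

	folio_locked = toy_trylock(&f->folio_lock);
	if (!folio_locked)
		toy_lock_spin(&m->private_lock);
	*used = folio_locked ? USED_FOLIO_LOCK : USED_PRIVATE_LOCK;

	for (size_t i = 0; i < f->nr_blocks; i++) {
		if (f->blocks[i] == block) {
			found = (int)i;
			break;
		}
	}

	if (folio_locked)
		toy_unlock(&f->folio_lock);
	else
		toy_unlock(&m->private_lock);
	return found;
}
```

In this model an uncontended lookup never touches the shared lock, which is the
sense in which contention on it could drop. Whether that holds in the kernel
depends on how often the folio lock is actually available on this path.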

The first patch is intended to go through Linus' tree, if agreeable, and then
the rest can be evaluated for fs-next. Although I did not intend to upstream
the debugfs interface, at this point I'm convinced the statistics are extremely
useful while enhancing this path, and should also prove useful in enhancing and
eventually enabling folio migration on jbd2 metadata buffers.

If you want this in tree form, see 20250409-bh-meta-migrate-optimal [1].

[0] https://lore.kernel.org/all/20250330064732.3781046-3-mcgrof@kernel.org/T/#mf2fb79c9ab0d20fab65c65142b7f53680e68d8fa
[1] https://web.git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=20250409-bh-meta-migrate-optimal

Changes on v2:

 - replaces the heuristic with a buffer_meta() check, as we're convinced the
   corruption issue still exists and jbd2 metadata buffers still need work
   to enable folio migration
 - implements community feedback and performance suggestions on the code,
   paving the way to eventually enable jbd2 metadata buffers to leverage
   folio migration
 - adds debugfs interface
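
As a rough illustration of what a buffer_meta()-style check amounts to, the
user-space sketch below walks the circular list of buffers attached to one
folio and refuses migration if any of them carries a metadata flag. The toy_*
names and flag values are hypothetical, and returning -EAGAIN is an assumption
made for the sketch, not a statement about what patch 1 returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; they only mimic the idea of per-buffer state. */
enum {
	TOY_BH_META  = 1 << 0,
	TOY_BH_DIRTY = 1 << 1,
};

struct toy_bh {
	unsigned int state;
	struct toy_bh *this_page;	/* circular list of buffers on a folio */
};

static bool toy_buffer_meta(const struct toy_bh *bh)
{
	return (bh->state & TOY_BH_META) != 0;
}

/*
 * Returns 0 if the folio's buffers may be migrated, or -EAGAIN (an
 * assumption of this sketch) if any buffer is marked as metadata.
 */
static int toy_migrate_check(const struct toy_bh *head)
{
	const struct toy_bh *bh = head;

	if (!head)
		return 0;	/* no buffers attached, nothing to skip */

	do {
		if (toy_buffer_meta(bh))
			return -EAGAIN;
		assert(bh->this_page != NULL);	/* the list must be circular */
		bh = bh->this_page;
	} while (bh != head);

	return 0;
}
```

The point of a flag check over a heuristic is that the decision depends only
on state the buffer already carries, so it gives the same answer on every run.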

Davidlohr Bueso (6):
  fs/buffer: try to use folio lock for pagecache lookups
  fs/buffer: introduce __find_get_block_nonatomic()
  fs/ocfs2: use sleeping version of __find_get_block()
  fs/jbd2: use sleeping version of __find_get_block()
  fs/ext4: use sleeping version of __find_get_block()
  mm/migrate: enable noref migration for jbd2

Luis Chamberlain (2):
  migrate: fix skipping metadata buffer heads on migration
  mm: add migration buffer-head debugfs interface

 fs/buffer.c                 |  76 ++++++++++----
 fs/ext4/ialloc.c            |   3 +-
 fs/ext4/inode.c             |   2 +
 fs/ext4/mballoc.c           |   3 +-
 fs/jbd2/revoke.c            |  15 +--
 fs/ocfs2/journal.c          |   2 +-
 include/linux/buffer_head.h |   9 ++
 mm/migrate.c                | 192 ++++++++++++++++++++++++++++++++++--
 8 files changed, 266 insertions(+), 36 deletions(-)

-- 
2.47.2


Thread overview: 38+ messages
2025-04-10  1:49 [PATCH v2 0/8] mm: enhance migration work around on noref buffer-heads Luis Chamberlain
2025-04-10  1:49 ` [PATCH v2 1/8] migrate: fix skipping metadata buffer heads on migration Luis Chamberlain
2025-04-10  3:18   ` Matthew Wilcox
2025-04-10 12:05   ` Jan Kara
2025-04-14 21:09     ` Luis Chamberlain
2025-04-14 22:19       ` Luis Chamberlain
2025-04-15  9:05         ` Christian Brauner
2025-04-15 15:47           ` Luis Chamberlain
2025-04-15 16:23             ` Jan Kara
2025-04-15 21:06               ` Luis Chamberlain
2025-04-16  2:02                 ` Davidlohr Bueso
2025-04-15 11:17         ` Jan Kara
2025-04-15 11:23       ` Jan Kara
2025-04-15 16:18     ` Luis Chamberlain
2025-04-15 16:28       ` Jan Kara
2025-04-16 16:58         ` Luis Chamberlain
2025-04-23 17:09           ` Jan Kara
2025-04-23 20:30             ` Luis Chamberlain
2025-04-25 22:51               ` Luis Chamberlain
2025-04-28 23:08                 ` Luis Chamberlain
2025-04-29  9:32                   ` Jan Kara
2025-04-15  1:36   ` Davidlohr Bueso
2025-04-15 11:25     ` Jan Kara
2025-04-15 18:14       ` Davidlohr Bueso
2025-04-10  1:49 ` [PATCH v2 2/8] fs/buffer: try to use folio lock for pagecache lookups Luis Chamberlain
2025-04-10 14:38   ` Jan Kara
2025-04-10 17:38     ` Davidlohr Bueso
2025-04-10  1:49 ` [PATCH v2 3/8] fs/buffer: introduce __find_get_block_nonatomic() Luis Chamberlain
2025-04-10  1:49 ` [PATCH v2 4/8] fs/ocfs2: use sleeping version of __find_get_block() Luis Chamberlain
2025-04-10  1:49 ` [PATCH v2 5/8] fs/jbd2: " Luis Chamberlain
2025-04-10  1:49 ` [PATCH v2 6/8] fs/ext4: " Luis Chamberlain
2025-04-10 13:36   ` Jan Kara
2025-04-10 17:32     ` Davidlohr Bueso
2025-04-10  1:49 ` [PATCH v2 7/8] mm/migrate: enable noref migration for jbd2 Luis Chamberlain
2025-04-10 13:40   ` Jan Kara
2025-04-10 17:30     ` Davidlohr Bueso
2025-04-14 12:12       ` Jan Kara
2025-04-10  1:49 ` [PATCH v2 8/8] mm: add migration buffer-head debugfs interface Luis Chamberlain
