From: Wang Yugui <wangyugui@e16-tech.com>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2 4/4] btrfs: keep folios locked inside run_delalloc_nocow()
Date: Tue, 05 Aug 2025 06:36:59 +0800 [thread overview]
Message-ID: <20250805063658.ED67.409509F4@e16-tech.com> (raw)
In-Reply-To: <a5a3c2f37bbe0b32ffaa8a15a85515ab9f173a28.1753687685.git.wqu@suse.com>
Hi,
> [BUG]
> There is a very low chance that DEBUG_WARN() inside
> btrfs_writepage_cow_fixup() can be triggered when
> CONFIG_BTRFS_EXPERIMENTAL is enabled.
>
> This only happens after run_delalloc_nocow() failed.
>
> Unfortunately I haven't hit it for a while thus no real world dmesg for
> now.
>
> [CAUSE]
> There is a race window where after run_delalloc_nocow() failed, error
> handling can race with writeback thread.
>
> Before we hit run_delalloc_nocow(), there is an inode with the following
> dirty pages: (4K page size, 4K block size, no large folio)
>
> 0 4K 8K 12K 16K
> |/////////|///////////|///////////|////////////|
>
> The inode also have NODATACOW flag, and the above dirty range will go
> through different extents during run_delalloc_range():
>
> 0 4K 8K 12K 16K
> | NOCOW | COW | COW | NOCOW |
>
> The race happen like this:
>
> writeback thread A | writeback thread B
> ----------------------------------+--------------------------------------
> Writeback for folio 0 |
> run_delalloc_nocow() |
> |- nocow_one_range() |
> | For range [0, 4K), ret = 0 |
> | |
> |- fallback_to_cow() |
> | For range [4K, 8K), ret = 0 |
> | Folio 4K *UNLOCKED* |
> | | Writeback for folio 4K
> |- fallback_to_cow() | extent_writepage()
> | For range [8K, 12K), failure | |- writepage_delalloc()
> | | |
> |- btrfs_cleanup_ordered_extents()| |
> |- btrfs_folio_clear_ordered() | |
> | Folio 0 still locked, safe | |
> | | | Ordered extent already allocated.
> | | | Nothing to do.
> | | |- extent_writepage_io()
> | | |- btrfs_writepage_cow_fixup()
> |- btrfs_folio_clear_ordered() | |
> Folio 4K hold by thread B, | |
> UNSAFE! | |- btrfs_test_ordered()
> | | Cleared by thread A,
> | |
> | |- DEBUG_WARN();
>
> This is only possible after run_delalloc_nocow() failure, as
> cow_file_range() will keep all folios and io tree range locked, until
> everything is finished or after error handling.
>
> The root cause is we allow fallback_to_cow() and nocow_one_range() to
> unlock the folios after a successful run, so that during error handling
> we're no longer safe to use btrfs_cleanup_ordered_extents() as the
> folios are already unlocked.
>
> [FIX]
> - Make fallback_to_cow() and nocow_one_range() to keep folios locked
> after a successful run
>
> For fallback_to_cow() we can pass COW_FILE_RANGE_KEEP_LOCKED flag
> into cow_file_range().
>
> For nocow_one_range() we have to remove the PAGE_UNLOCK flag from
> extent_clear_unlock_delalloc().
>
> - Unlock folios if everything is fine in run_delalloc_nocow()
>
> - Use extent_clear_unlock_delalloc() to handle range [@start,
> @cur_offset) inside run_delalloc_nocow()
> Since folios are still locked, we do not need
> cleanup_dirty_folios() to do the cleanup.
>
> extent_clear_unlock_delalloc() with "PAGE_START_WRIBACK |
> PAGE_END_WRITEBACK" will clear the dirty flags.
>
> - Remove cleanup_dirty_folios()
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
We need a 'Fix:' tag for kernel stable branch.
It seems that this problem does not happen (or less frequency) on kernel
6.6.y/6.1.y.
Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2025/08/05
next prev parent reply other threads:[~2025-08-04 22:37 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-28 8:27 [PATCH v2 0/4] btrfs: btrfs: fix possible race between error handling and writeback Qu Wenruo
2025-07-28 8:27 ` [PATCH v2 1/4] btrfs: rework the error handling of run_delalloc_nocow() Qu Wenruo
2025-07-28 8:27 ` [PATCH v2 2/4] btrfs: enhance error messages for delalloc range failure Qu Wenruo
2025-07-28 8:27 ` [PATCH v2 3/4] btrfs: make nocow_one_range() to do cleanup on error Qu Wenruo
2025-07-28 8:27 ` [PATCH v2 4/4] btrfs: keep folios locked inside run_delalloc_nocow() Qu Wenruo
2025-08-04 22:36 ` Wang Yugui [this message]
2025-08-04 22:53 ` Qu Wenruo
2025-08-18 15:48 ` [PATCH v2 0/4] btrfs: btrfs: fix possible race between error handling and writeback David Sterba
2025-08-18 21:47 ` Qu Wenruo
2025-08-21 13:52 ` David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250805063658.ED67.409509F4@e16-tech.com \
--to=wangyugui@e16-tech.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox