public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Wang Yugui <wangyugui@e16-tech.com>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2 4/4] btrfs: keep folios locked inside run_delalloc_nocow()
Date: Tue, 05 Aug 2025 06:36:59 +0800	[thread overview]
Message-ID: <20250805063658.ED67.409509F4@e16-tech.com> (raw)
In-Reply-To: <a5a3c2f37bbe0b32ffaa8a15a85515ab9f173a28.1753687685.git.wqu@suse.com>

Hi,

> [BUG]
> There is a very low chance that DEBUG_WARN() inside
> btrfs_writepage_cow_fixup() can be triggered when
> CONFIG_BTRFS_EXPERIMENTAL is enabled.
> 
> This only happens after run_delalloc_nocow() failed.
> 
> Unfortunately I haven't hit it for a while thus no real world dmesg for
> now.
> 
> [CAUSE]
> There is a race window where after run_delalloc_nocow() failed, error
> handling can race with writeback thread.
> 
> Before we hit run_delalloc_nocow(), there is an inode with the following
> dirty pages: (4K page size, 4K block size, no large folio)
> 
>   0         4K          8K          12K          16K
>   |/////////|///////////|///////////|////////////|
> 
> The inode also have NODATACOW flag, and the above dirty range will go
> through different extents during run_delalloc_range():
> 
>   0         4K          8K          12K          16K
>   |  NOCOW  |    COW    |    COW    |   NOCOW    |
> 
> The race happen like this:
> 
>     writeback thread A            |        writeback thread B
> ----------------------------------+--------------------------------------
> Writeback for folio 0             |
> run_delalloc_nocow()              |
> |- nocow_one_range()              |
> |  For range [0, 4K), ret = 0     |
> |                                 |
> |- fallback_to_cow()              |
> |  For range [4K, 8K), ret = 0    |
> |  Folio 4K *UNLOCKED*            |
> |                                 | Writeback for folio 4K
> |- fallback_to_cow()              | extent_writepage()
> |  For range [8K, 12K), failure   | |- writepage_delalloc()
> |				  | |
> |- btrfs_cleanup_ordered_extents()| |
>    |- btrfs_folio_clear_ordered() | |
>    |  Folio 0 still locked, safe  | |
>    |                              | |  Ordered extent already allocated.
>    |                              | |  Nothing to do.
>    |                              | |- extent_writepage_io()
>    |                              |    |- btrfs_writepage_cow_fixup()
>    |- btrfs_folio_clear_ordered() |    |
>       Folio 4K hold by thread B,  |    |
>       UNSAFE!                     |    |- btrfs_test_ordered()
>                                   |    |  Cleared by thread A,
> 				  |    |
>                                   |    |- DEBUG_WARN();
> 
> This is only possible after run_delalloc_nocow() failure, as
> cow_file_range() will keep all folios and io tree range locked, until
> everything is finished or after error handling.
> 
> The root cause is we allow fallback_to_cow() and nocow_one_range() to
> unlock the folios after a successful run, so that during error handling
> we're no longer safe to use btrfs_cleanup_ordered_extents() as the
> folios are already unlocked.
> 
> [FIX]
> - Make fallback_to_cow() and nocow_one_range() to keep folios locked
>   after a successful run
> 
>   For fallback_to_cow() we can pass COW_FILE_RANGE_KEEP_LOCKED flag
>   into cow_file_range().
> 
>   For nocow_one_range() we have to remove the PAGE_UNLOCK flag from
>   extent_clear_unlock_delalloc().
> 
> - Unlock folios if everything is fine in run_delalloc_nocow()
> 
> - Use extent_clear_unlock_delalloc() to handle range [@start,
>   @cur_offset) inside run_delalloc_nocow()
>   Since folios are still locked, we do not need
>   cleanup_dirty_folios() to do the cleanup.
> 
>   extent_clear_unlock_delalloc() with "PAGE_START_WRIBACK |
>   PAGE_END_WRITEBACK" will clear the dirty flags.
> 
> - Remove cleanup_dirty_folios()
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>

We need a 'Fix:' tag for kernel stable branch.

It seems that this problem does not happen (or less frequency) on  kernel
6.6.y/6.1.y.

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2025/08/05



  reply	other threads:[~2025-08-04 22:37 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-28  8:27 [PATCH v2 0/4] btrfs: btrfs: fix possible race between error handling and writeback Qu Wenruo
2025-07-28  8:27 ` [PATCH v2 1/4] btrfs: rework the error handling of run_delalloc_nocow() Qu Wenruo
2025-07-28  8:27 ` [PATCH v2 2/4] btrfs: enhance error messages for delalloc range failure Qu Wenruo
2025-07-28  8:27 ` [PATCH v2 3/4] btrfs: make nocow_one_range() to do cleanup on error Qu Wenruo
2025-07-28  8:27 ` [PATCH v2 4/4] btrfs: keep folios locked inside run_delalloc_nocow() Qu Wenruo
2025-08-04 22:36   ` Wang Yugui [this message]
2025-08-04 22:53     ` Qu Wenruo
2025-08-18 15:48 ` [PATCH v2 0/4] btrfs: btrfs: fix possible race between error handling and writeback David Sterba
2025-08-18 21:47   ` Qu Wenruo
2025-08-21 13:52     ` David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250805063658.ED67.409509F4@e16-tech.com \
    --to=wangyugui@e16-tech.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox