All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wang Yugui <wangyugui@e16-tech.com>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2 4/4] btrfs: keep folios locked inside run_delalloc_nocow()
Date: Tue, 05 Aug 2025 06:36:59 +0800	[thread overview]
Message-ID: <20250805063658.ED67.409509F4@e16-tech.com> (raw)
In-Reply-To: <a5a3c2f37bbe0b32ffaa8a15a85515ab9f173a28.1753687685.git.wqu@suse.com>

Hi,

> [BUG]
> There is a very low chance that DEBUG_WARN() inside
> btrfs_writepage_cow_fixup() can be triggered when
> CONFIG_BTRFS_EXPERIMENTAL is enabled.
> 
> This only happens after run_delalloc_nocow() failed.
> 
> Unfortunately I haven't hit it for a while thus no real world dmesg for
> now.
> 
> [CAUSE]
> There is a race window where after run_delalloc_nocow() failed, error
> handling can race with writeback thread.
> 
> Before we hit run_delalloc_nocow(), there is an inode with the following
> dirty pages: (4K page size, 4K block size, no large folio)
> 
>   0         4K          8K          12K          16K
>   |/////////|///////////|///////////|////////////|
> 
> The inode also have NODATACOW flag, and the above dirty range will go
> through different extents during run_delalloc_range():
> 
>   0         4K          8K          12K          16K
>   |  NOCOW  |    COW    |    COW    |   NOCOW    |
> 
> The race happen like this:
> 
>     writeback thread A            |        writeback thread B
> ----------------------------------+--------------------------------------
> Writeback for folio 0             |
> run_delalloc_nocow()              |
> |- nocow_one_range()              |
> |  For range [0, 4K), ret = 0     |
> |                                 |
> |- fallback_to_cow()              |
> |  For range [4K, 8K), ret = 0    |
> |  Folio 4K *UNLOCKED*            |
> |                                 | Writeback for folio 4K
> |- fallback_to_cow()              | extent_writepage()
> |  For range [8K, 12K), failure   | |- writepage_delalloc()
> |				  | |
> |- btrfs_cleanup_ordered_extents()| |
>    |- btrfs_folio_clear_ordered() | |
>    |  Folio 0 still locked, safe  | |
>    |                              | |  Ordered extent already allocated.
>    |                              | |  Nothing to do.
>    |                              | |- extent_writepage_io()
>    |                              |    |- btrfs_writepage_cow_fixup()
>    |- btrfs_folio_clear_ordered() |    |
>       Folio 4K hold by thread B,  |    |
>       UNSAFE!                     |    |- btrfs_test_ordered()
>                                   |    |  Cleared by thread A,
> 				  |    |
>                                   |    |- DEBUG_WARN();
> 
> This is only possible after run_delalloc_nocow() failure, as
> cow_file_range() will keep all folios and io tree range locked, until
> everything is finished or after error handling.
> 
> The root cause is we allow fallback_to_cow() and nocow_one_range() to
> unlock the folios after a successful run, so that during error handling
> we're no longer safe to use btrfs_cleanup_ordered_extents() as the
> folios are already unlocked.
> 
> [FIX]
> - Make fallback_to_cow() and nocow_one_range() to keep folios locked
>   after a successful run
> 
>   For fallback_to_cow() we can pass COW_FILE_RANGE_KEEP_LOCKED flag
>   into cow_file_range().
> 
>   For nocow_one_range() we have to remove the PAGE_UNLOCK flag from
>   extent_clear_unlock_delalloc().
> 
> - Unlock folios if everything is fine in run_delalloc_nocow()
> 
> - Use extent_clear_unlock_delalloc() to handle range [@start,
>   @cur_offset) inside run_delalloc_nocow()
>   Since folios are still locked, we do not need
>   cleanup_dirty_folios() to do the cleanup.
> 
>   extent_clear_unlock_delalloc() with "PAGE_START_WRIBACK |
>   PAGE_END_WRITEBACK" will clear the dirty flags.
> 
> - Remove cleanup_dirty_folios()
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>

We need a 'Fix:' tag for kernel stable branch.

It seems that this problem does not happen (or less frequency) on  kernel
6.6.y/6.1.y.

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2025/08/05



  reply	other threads:[~2025-08-04 22:37 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-28  8:27 [PATCH v2 0/4] btrfs: btrfs: fix possible race between error handling and writeback Qu Wenruo
2025-07-28  8:27 ` [PATCH v2 1/4] btrfs: rework the error handling of run_delalloc_nocow() Qu Wenruo
2025-07-28  8:27 ` [PATCH v2 2/4] btrfs: enhance error messages for delalloc range failure Qu Wenruo
2025-07-28  8:27 ` [PATCH v2 3/4] btrfs: make nocow_one_range() to do cleanup on error Qu Wenruo
2025-07-28  8:27 ` [PATCH v2 4/4] btrfs: keep folios locked inside run_delalloc_nocow() Qu Wenruo
2025-08-04 22:36   ` Wang Yugui [this message]
2025-08-04 22:53     ` Qu Wenruo
2025-08-18 15:48 ` [PATCH v2 0/4] btrfs: btrfs: fix possible race between error handling and writeback David Sterba
2025-08-18 21:47   ` Qu Wenruo
2025-08-21 13:52     ` David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250805063658.ED67.409509F4@e16-tech.com \
    --to=wangyugui@e16-tech.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.