public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/2] btrfs: remove COW fixup and checked folio flag
@ 2026-04-14  1:16 Qu Wenruo
  2026-04-14  1:16 ` [PATCH v2 1/2] btrfs: remove the COW fixup mechanism Qu Wenruo
  2026-04-14  1:16 ` [PATCH v2 2/2] btrfs: remove folio checked subpage bitmap tracking Qu Wenruo
  0 siblings, 2 replies; 4+ messages in thread
From: Qu Wenruo @ 2026-04-14  1:16 UTC (permalink / raw)
  To: linux-btrfs

Changelog:
v2:
- Also remove the fixup worker

- Slightly reword the cover letter

- Remove COW fixup related comments

- Update Kconfig to reflect this change

Since v6.15, experimental builds already reject dirty folios that do not
have the ordered flag set.

Unfortunately, even at v7.0 we have not yet removed the COW fixup
mechanism for non-experimental builds, as we can still trigger the
warning when we detect dirty folios without ordered extents.

So far all those problems only happen when errors are injected into the
IO path, e.g. generic/475, and I haven't yet seen it triggered without
error injection.

Although I would prefer to remove the COW fixup only after all error
handling problems are fixed, I have run out of ideas as to how those
cases can happen, and the current handling of treating such cases as
write errors will not make things any worse anyway.

Furthermore, looking ahead to huge folios (order 9, i.e. a 2M folio on
4K page size systems), we cannot afford the extra bitmap for a huge
folio.  In that case, a huge folio will need 64 bytes per bitmap, which
is no longer a small amount.
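The 64-byte figure above can be sanity-checked in userspace (an
illustrative sketch; bitmap_bytes() is a made-up helper, not a btrfs
function):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes needed for one per-block status bitmap covering a folio:
 * one bit per sectorsize block, rounded up to whole bytes.
 * (Illustrative helper, not btrfs code.)
 */
static size_t bitmap_bytes(size_t folio_size, size_t block_size)
{
	size_t blocks = folio_size / block_size;

	return (blocks + 7) / 8;
}
```

An order-9 folio on a 4K page size system is 2M, i.e. 512 blocks, hence
512 bits = 64 bytes per tracked flag.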

I believe it's time to remove the COW fixup mechanism even for
non-experimental builds, along with the checked folio flags.

Qu Wenruo (2):
  btrfs: remove the COW fixup mechanism
  btrfs: remove folio checked subpage bitmap tracking

 fs/btrfs/Kconfig            |   4 -
 fs/btrfs/defrag.c           |   1 -
 fs/btrfs/disk-io.c          |  16 +--
 fs/btrfs/extent_io.c        |  12 +--
 fs/btrfs/file.c             |  12 +--
 fs/btrfs/free-space-cache.c |   4 -
 fs/btrfs/fs.h               |   7 --
 fs/btrfs/inode.c            | 205 +++---------------------------------
 fs/btrfs/reflink.c          |   1 -
 fs/btrfs/subpage.c          |  39 +------
 fs/btrfs/subpage.h          |   5 +-
 11 files changed, 21 insertions(+), 285 deletions(-)

-- 
2.53.0


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v2 1/2] btrfs: remove the COW fixup mechanism
  2026-04-14  1:16 [PATCH v2 0/2] btrfs: remove COW fixup and checked folio flag Qu Wenruo
@ 2026-04-14  1:16 ` Qu Wenruo
  2026-04-14  2:14   ` David Sterba
  2026-04-14  1:16 ` [PATCH v2 2/2] btrfs: remove folio checked subpage bitmap tracking Qu Wenruo
  1 sibling, 1 reply; 4+ messages in thread
From: Qu Wenruo @ 2026-04-14  1:16 UTC (permalink / raw)
  To: linux-btrfs

[BACKGROUND]
Btrfs has a special mechanism called COW fixup, which detects dirty
folios without an ordered extent (i.e. without the folio ordered flag).

Normally a dirty folio must go through delayed allocation (delalloc)
before it can be submitted, and delalloc will create an ordered extent
for it and mark the range with the ordered flag.

However, older kernels had bugs related to get_user_pages() which could
leave a page marked dirty without notifying the fs to properly prepare
it for writeback.

In that case, without an ordered extent btrfs is unable to properly
submit such dirty folios, so the COW fixup mechanism was introduced,
which does the extra space reservation so that they can be written back
properly.

[MODERN SOLUTIONS]
The MM layer has since solved this properly with the introduction of
pin_user_pages*(), so the COW fixup is handling cases that can no
longer happen.

So commit 7ca3e84980ef ("btrfs: reject out-of-band dirty folios during
writeback") changed the behavior for experimental builds from going
through the COW fixup to rejecting such folios directly.

So far it works fine, but when errors are injected into the IO path we
see random failures triggering the new warnings.

It looks like we have an error path that clears the ordered flag but
leaves the folio dirty, which later triggers the warning.
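The behavior the series settles on can be modeled in a few lines of
userspace C (an illustrative sketch, not the real btrfs code;
range_state and writepage_check() are made-up names):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Toy model of the simplified writeback check: a dirty range must be
 * covered by an ordered extent.  If it is not, treat it as a hard
 * write error instead of queueing a fixup worker.
 */
struct range_state {
	bool dirty;
	bool ordered;
};

static int writepage_check(const struct range_state *r)
{
	if (!r->dirty)
		return 0;		/* nothing to write back */
	if (r->ordered)
		return 0;		/* delalloc set everything up */
	return -EUCLEAN;		/* out-of-band dirtying: hard error now */
}
```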

[REMOVAL OF COW FIXUP]
Although I hope to fix all those known warning cases, I just cannot
figure out the root cause yet.

But on the other hand, if we remove the ordered and checked flags in
the future and rely purely on the dirty flag and an ordered extent
search, we get much cleaner handling.

Considering normal IO paths no longer hit the COW fixup, I think it's
finally time to remove it completely.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/Kconfig     |   4 -
 fs/btrfs/disk-io.c   |  16 +---
 fs/btrfs/extent_io.c |  12 +--
 fs/btrfs/fs.h        |   7 --
 fs/btrfs/inode.c     | 202 ++++---------------------------------------
 5 files changed, 17 insertions(+), 224 deletions(-)

diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig
index 5e75438e0b73..5d785d010971 100644
--- a/fs/btrfs/Kconfig
+++ b/fs/btrfs/Kconfig
@@ -93,10 +93,6 @@ config BTRFS_EXPERIMENTAL
 
 	  Current list:
 
-	  - COW fixup worker warning - last warning before removing the
-				       functionality catching out-of-band page
-				       dirtying, not necessary since 5.8
-
 	  - RAID mirror read policy - additional read policies for balancing
 				      reading from redundant block group
 				      profiles (currently: pid, round-robin,
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index d450bfc6d89b..761362f4ab9f 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1736,7 +1736,6 @@ static int read_backup_root(struct btrfs_fs_info *fs_info, u8 priority)
 /* helper to cleanup workers */
 static void btrfs_stop_all_workers(struct btrfs_fs_info *fs_info)
 {
-	btrfs_destroy_workqueue(fs_info->fixup_workers);
 	btrfs_destroy_workqueue(fs_info->delalloc_workers);
 	btrfs_destroy_workqueue(fs_info->workers);
 	if (fs_info->endio_workers)
@@ -1944,9 +1943,6 @@ static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info)
 	fs_info->caching_workers =
 		btrfs_alloc_workqueue(fs_info, "cache", flags, max_active, 0);
 
-	fs_info->fixup_workers =
-		btrfs_alloc_ordered_workqueue(fs_info, "fixup", ordered_flags);
-
 	fs_info->endio_workers =
 		alloc_workqueue("btrfs-endio", flags, max_active);
 	fs_info->endio_meta_workers =
@@ -1972,7 +1968,7 @@ static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info)
 	      fs_info->endio_workers && fs_info->endio_meta_workers &&
 	      fs_info->endio_write_workers &&
 	      fs_info->endio_freespace_worker && fs_info->rmw_workers &&
-	      fs_info->caching_workers && fs_info->fixup_workers &&
+	      fs_info->caching_workers &&
 	      fs_info->delayed_workers && fs_info->qgroup_rescan_workers &&
 	      fs_info->discard_ctl.discard_workers)) {
 		return -ENOMEM;
@@ -4327,16 +4323,6 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info)
 	if (unlikely(BTRFS_FS_ERROR(fs_info)))
 		btrfs_error_commit_super(fs_info);
 
-	/*
-	 * Wait for any fixup workers to complete.
-	 * If we don't wait for them here and they are still running by the time
-	 * we call kthread_stop() against the cleaner kthread further below, we
-	 * get an use-after-free on the cleaner because the fixup worker adds an
-	 * inode to the list of delayed iputs and then attempts to wakeup the
-	 * cleaner kthread, which was already stopped and destroyed. We parked
-	 * already the cleaner, but below we run all pending delayed iputs.
-	 */
-	btrfs_flush_workqueue(fs_info->fixup_workers);
 	/*
 	 * Similar case here, we have to wait for delalloc workers before we
 	 * proceed below and stop the cleaner kthread, otherwise we trigger a
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index a8887285deda..862ef3c8886e 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1739,12 +1739,6 @@ static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode,
 	       start, len, folio_start, folio_size(folio));
 
 	ret = btrfs_writepage_cow_fixup(folio);
-	if (ret == -EAGAIN) {
-		/* Fixup worker will requeue */
-		folio_redirty_for_writepage(bio_ctrl->wbc, folio);
-		folio_unlock(folio);
-		return 1;
-	}
 	if (ret < 0) {
 		btrfs_folio_clear_dirty(fs_info, folio, start, len);
 		btrfs_folio_set_writeback(fs_info, folio, start, len);
@@ -1867,12 +1861,8 @@ static int extent_writepage(struct folio *folio, struct btrfs_bio_ctrl *bio_ctrl
 	 *
 	 * So here we check if the page has private set to rule out such
 	 * case.
-	 * But we also have a long history of relying on the COW fixup,
-	 * so here we only enable this check for experimental builds until
-	 * we're sure it's safe.
 	 */
-	if (IS_ENABLED(CONFIG_BTRFS_EXPERIMENTAL) &&
-	    unlikely(!folio_test_private(folio))) {
+	if (unlikely(!folio_test_private(folio))) {
 		WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
 		btrfs_err_rl(fs_info,
 	"root %lld ino %llu folio %llu is marked dirty without notifying the fs",
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index a4758d94b32e..5ccc907327a8 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -697,13 +697,6 @@ struct btrfs_fs_info {
 	struct btrfs_workqueue *endio_write_workers;
 	struct btrfs_workqueue *endio_freespace_worker;
 	struct btrfs_workqueue *caching_workers;
-
-	/*
-	 * Fixup workers take dirty pages that didn't properly go through the
-	 * cow mechanism and make them safe to write.  It happens for the
-	 * sys_munmap function call path.
-	 */
-	struct btrfs_workqueue *fixup_workers;
 	struct btrfs_workqueue *delayed_workers;
 
 	struct task_struct *transaction_kthread;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 808e52aa6ef2..29a624a2c611 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2833,206 +2833,34 @@ int btrfs_set_extent_delalloc(struct btrfs_inode *inode, u64 start, u64 end,
 				    EXTENT_DELALLOC | extra_bits, cached_state);
 }
 
-/* see btrfs_writepage_start_hook for details on why this is required */
-struct btrfs_writepage_fixup {
-	struct folio *folio;
-	struct btrfs_inode *inode;
-	struct btrfs_work work;
-};
-
-static void btrfs_writepage_fixup_worker(struct btrfs_work *work)
-{
-	struct btrfs_writepage_fixup *fixup =
-		container_of(work, struct btrfs_writepage_fixup, work);
-	struct btrfs_ordered_extent *ordered;
-	struct extent_state *cached_state = NULL;
-	struct extent_changeset *data_reserved = NULL;
-	struct folio *folio = fixup->folio;
-	struct btrfs_inode *inode = fixup->inode;
-	struct btrfs_fs_info *fs_info = inode->root->fs_info;
-	u64 page_start = folio_pos(folio);
-	u64 page_end = folio_next_pos(folio) - 1;
-	int ret = 0;
-	bool free_delalloc_space = true;
-
-	/*
-	 * This is similar to page_mkwrite, we need to reserve the space before
-	 * we take the folio lock.
-	 */
-	ret = btrfs_delalloc_reserve_space(inode, &data_reserved, page_start,
-					   folio_size(folio));
-again:
-	folio_lock(folio);
-
-	/*
-	 * Before we queued this fixup, we took a reference on the folio.
-	 * folio->mapping may go NULL, but it shouldn't be moved to a different
-	 * address space.
-	 */
-	if (!folio->mapping || !folio_test_dirty(folio) ||
-	    !folio_test_checked(folio)) {
-		/*
-		 * Unfortunately this is a little tricky, either
-		 *
-		 * 1) We got here and our folio had already been dealt with and
-		 *    we reserved our space, thus ret == 0, so we need to just
-		 *    drop our space reservation and bail.  This can happen the
-		 *    first time we come into the fixup worker, or could happen
-		 *    while waiting for the ordered extent.
-		 * 2) Our folio was already dealt with, but we happened to get an
-		 *    ENOSPC above from the btrfs_delalloc_reserve_space.  In
-		 *    this case we obviously don't have anything to release, but
-		 *    because the folio was already dealt with we don't want to
-		 *    mark the folio with an error, so make sure we're resetting
-		 *    ret to 0.  This is why we have this check _before_ the ret
-		 *    check, because we do not want to have a surprise ENOSPC
-		 *    when the folio was already properly dealt with.
-		 */
-		if (!ret) {
-			btrfs_delalloc_release_extents(inode, folio_size(folio));
-			btrfs_delalloc_release_space(inode, data_reserved,
-						     page_start, folio_size(folio),
-						     true);
-		}
-		ret = 0;
-		goto out_page;
-	}
-
-	/*
-	 * We can't mess with the folio state unless it is locked, so now that
-	 * it is locked bail if we failed to make our space reservation.
-	 */
-	if (ret)
-		goto out_page;
-
-	btrfs_lock_extent(&inode->io_tree, page_start, page_end, &cached_state);
-
-	/* already ordered? We're done */
-	if (folio_test_ordered(folio))
-		goto out_reserved;
-
-	ordered = btrfs_lookup_ordered_range(inode, page_start, PAGE_SIZE);
-	if (ordered) {
-		btrfs_unlock_extent(&inode->io_tree, page_start, page_end,
-				    &cached_state);
-		folio_unlock(folio);
-		btrfs_start_ordered_extent(ordered);
-		btrfs_put_ordered_extent(ordered);
-		goto again;
-	}
-
-	ret = btrfs_set_extent_delalloc(inode, page_start, page_end, 0,
-					&cached_state);
-	if (ret)
-		goto out_reserved;
-
-	/*
-	 * Everything went as planned, we're now the owner of a dirty page with
-	 * delayed allocation bits set and space reserved for our COW
-	 * destination.
-	 *
-	 * The page was dirty when we started, nothing should have cleaned it.
-	 */
-	BUG_ON(!folio_test_dirty(folio));
-	free_delalloc_space = false;
-out_reserved:
-	btrfs_delalloc_release_extents(inode, PAGE_SIZE);
-	if (free_delalloc_space)
-		btrfs_delalloc_release_space(inode, data_reserved, page_start,
-					     PAGE_SIZE, true);
-	btrfs_unlock_extent(&inode->io_tree, page_start, page_end, &cached_state);
-out_page:
-	if (ret) {
-		/*
-		 * We hit ENOSPC or other errors.  Update the mapping and page
-		 * to reflect the errors and clean the page.
-		 */
-		mapping_set_error(folio->mapping, ret);
-		btrfs_folio_clear_ordered(fs_info, folio, page_start,
-					  folio_size(folio));
-		btrfs_mark_ordered_io_finished(inode, page_start,
-					       folio_size(folio), !ret);
-		folio_clear_dirty_for_io(folio);
-	}
-	btrfs_folio_clear_checked(fs_info, folio, page_start, PAGE_SIZE);
-	folio_unlock(folio);
-	folio_put(folio);
-	kfree(fixup);
-	extent_changeset_free(data_reserved);
-	/*
-	 * As a precaution, do a delayed iput in case it would be the last iput
-	 * that could need flushing space. Recursing back to fixup worker would
-	 * deadlock.
-	 */
-	btrfs_add_delayed_iput(inode);
-}
-
 /*
- * There are a few paths in the higher layers of the kernel that directly
- * set the folio dirty bit without asking the filesystem if it is a
- * good idea.  This causes problems because we want to make sure COW
- * properly happens and the data=ordered rules are followed.
+ * There used to be bugs related to get_user_pages*() where a page could be
+ * dirtied without notifying the filesystem.
  *
- * In our case any range that doesn't have the ORDERED bit set
- * hasn't been properly setup for IO.  We kick off an async process
- * to fix it up.  The async helper will wait for ordered extents, set
- * the delalloc bit and make it safe to write the folio.
+ * Btrfs used to handle such cases by manually re-setting up the needed
+ * flags/states so that such folios could later be submitted for writeback.
+ *
+ * But nowadays the MM layer has addressed that problem, and we can only hit
+ * dirty folios without the ordered flag when some error handling is done
+ * incorrectly, e.g. clearing the ordered flag but not the dirty flag.
+ * In that case we just error out and treat it as a write error.
  */
 int btrfs_writepage_cow_fixup(struct folio *folio)
 {
 	struct inode *inode = folio->mapping->host;
 	struct btrfs_fs_info *fs_info = inode_to_fs_info(inode);
-	struct btrfs_writepage_fixup *fixup;
 
 	/* This folio has ordered extent covering it already */
 	if (folio_test_ordered(folio))
 		return 0;
 
-	/*
-	 * For experimental build, we error out instead of EAGAIN.
-	 *
-	 * We should not hit such out-of-band dirty folios anymore.
-	 */
-	if (IS_ENABLED(CONFIG_BTRFS_EXPERIMENTAL)) {
-		DEBUG_WARN();
-		btrfs_err_rl(fs_info,
+	DEBUG_WARN();
+	btrfs_err_rl(fs_info,
 	"root %lld ino %llu folio %llu is marked dirty without notifying the fs",
-			     btrfs_root_id(BTRFS_I(inode)->root),
-			     btrfs_ino(BTRFS_I(inode)),
-			     folio_pos(folio));
-		return -EUCLEAN;
-	}
-
-	/*
-	 * folio_checked is set below when we create a fixup worker for this
-	 * folio, don't try to create another one if we're already
-	 * folio_test_checked.
-	 *
-	 * The extent_io writepage code will redirty the foio if we send back
-	 * EAGAIN.
-	 */
-	if (folio_test_checked(folio))
-		return -EAGAIN;
-
-	fixup = kzalloc_obj(*fixup, GFP_NOFS);
-	if (!fixup)
-		return -EAGAIN;
-
-	/*
-	 * We are already holding a reference to this inode from
-	 * write_cache_pages.  We need to hold it because the space reservation
-	 * takes place outside of the folio lock, and we can't trust
-	 * folio->mapping outside of the folio lock.
-	 */
-	ihold(inode);
-	btrfs_folio_set_checked(fs_info, folio, folio_pos(folio), folio_size(folio));
-	folio_get(folio);
-	btrfs_init_work(&fixup->work, btrfs_writepage_fixup_worker, NULL);
-	fixup->folio = folio;
-	fixup->inode = BTRFS_I(inode);
-	btrfs_queue_work(fs_info->fixup_workers, &fixup->work);
-
-	return -EAGAIN;
+		     btrfs_root_id(BTRFS_I(inode)->root),
+		     btrfs_ino(BTRFS_I(inode)),
+		     folio_pos(folio));
+	return -EUCLEAN;
 }
 
 static int insert_reserved_file_extent(struct btrfs_trans_handle *trans,
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH v2 2/2] btrfs: remove folio checked subpage bitmap tracking
  2026-04-14  1:16 [PATCH v2 0/2] btrfs: remove COW fixup and checked folio flag Qu Wenruo
  2026-04-14  1:16 ` [PATCH v2 1/2] btrfs: remove the COW fixup mechanism Qu Wenruo
@ 2026-04-14  1:16 ` Qu Wenruo
  1 sibling, 0 replies; 4+ messages in thread
From: Qu Wenruo @ 2026-04-14  1:16 UTC (permalink / raw)
  To: linux-btrfs

The folio checked flag is only utilized by the COW fixup mechanism
inside btrfs.

Since the previous patch removed the COW fixup even from
non-experimental builds, there is no need to keep the checked subpage
bitmap.

This saves us some space for large folios; for example, for a single
256K large folio on a 4K page size system:

 Old bitmap size = 6 * (256K / 4K / 8) = 48 bytes
 New bitmap size = 5 * (256K / 4K / 8) = 40 bytes

The saving will be even more significant once we support huge folios
(order = 9).
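The numbers above can be reproduced with a trivial userspace helper
(an illustrative sketch; total_bitmap_bytes() is a made-up name, not a
btrfs function):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Total bitmap bytes for a folio: one bit per block for each tracked
 * flag.  6 flags before this patch, 5 after (checked removed).
 * (Illustrative helper, not btrfs code.)
 */
static size_t total_bitmap_bytes(size_t folio_size, size_t block_size,
				 size_t nr_flags)
{
	size_t blocks = folio_size / block_size;

	return nr_flags * ((blocks + 7) / 8);
}
```

For a 2M (order 9) folio the per-flag cost grows to 64 bytes, so
dropping one flag saves 64 bytes per folio.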

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/defrag.c           |  1 -
 fs/btrfs/file.c             | 12 +-----------
 fs/btrfs/free-space-cache.c |  4 ----
 fs/btrfs/inode.c            |  3 ---
 fs/btrfs/reflink.c          |  1 -
 fs/btrfs/subpage.c          | 39 ++-----------------------------------
 fs/btrfs/subpage.h          |  5 +----
 7 files changed, 4 insertions(+), 61 deletions(-)

diff --git a/fs/btrfs/defrag.c b/fs/btrfs/defrag.c
index 7e2db5d3a4d4..af40ad62009a 100644
--- a/fs/btrfs/defrag.c
+++ b/fs/btrfs/defrag.c
@@ -1179,7 +1179,6 @@ static int defrag_one_locked_target(struct btrfs_inode *inode,
 		if (start >= folio_next_pos(folio) ||
 		    start + len <= folio_pos(folio))
 			continue;
-		btrfs_folio_clamp_clear_checked(fs_info, folio, start, len);
 		btrfs_folio_clamp_set_dirty(fs_info, folio, start, len);
 	}
 	btrfs_delalloc_release_extents(inode, len);
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index cf1cb5c4db75..a6f641a41d99 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -49,14 +49,6 @@ static void btrfs_drop_folio(struct btrfs_fs_info *fs_info, struct folio *folio,
 	u64 block_len = round_up(pos + copied, fs_info->sectorsize) - block_start;
 
 	ASSERT(block_len <= U32_MAX);
-	/*
-	 * Folio checked is some magic around finding folios that have been
-	 * modified without going through btrfs_dirty_folio().  Clear it here.
-	 * There should be no need to mark the pages accessed as
-	 * prepare_one_folio() should have marked them accessed in
-	 * prepare_one_folio() via find_or_create_page()
-	 */
-	btrfs_folio_clamp_clear_checked(fs_info, folio, block_start, block_len);
 	folio_unlock(folio);
 	folio_put(folio);
 }
@@ -65,7 +57,7 @@ static void btrfs_drop_folio(struct btrfs_fs_info *fs_info, struct folio *folio,
  * After copy_folio_from_iter_atomic(), update the following things for delalloc:
  * - Mark newly dirtied folio as DELALLOC in the io tree.
  *   Used to advise which range is to be written back.
- * - Mark modified folio as Uptodate/Dirty and not needing COW fixup
+ * - Mark modified folio as Uptodate/Dirty
  * - Update inode size for past EOF write
  */
 int btrfs_dirty_folio(struct btrfs_inode *inode, struct folio *folio, loff_t pos,
@@ -107,7 +99,6 @@ int btrfs_dirty_folio(struct btrfs_inode *inode, struct folio *folio, loff_t pos
 		return ret;
 
 	btrfs_folio_clamp_set_uptodate(fs_info, folio, start_pos, num_bytes);
-	btrfs_folio_clamp_clear_checked(fs_info, folio, start_pos, num_bytes);
 	btrfs_folio_clamp_set_dirty(fs_info, folio, start_pos, num_bytes);
 
 	/*
@@ -1987,7 +1978,6 @@ static vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf)
 	if (zero_start != fsize)
 		folio_zero_range(folio, zero_start, folio_size(folio) - zero_start);
 
-	btrfs_folio_clear_checked(fs_info, folio, page_start, fsize);
 	btrfs_folio_set_dirty(fs_info, folio, page_start, end + 1 - page_start);
 	btrfs_folio_set_uptodate(fs_info, folio, page_start, end + 1 - page_start);
 
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index ab22e4f9ffdd..07567fd45634 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -433,10 +433,6 @@ static void io_ctl_drop_pages(struct btrfs_io_ctl *io_ctl)
 
 	for (i = 0; i < io_ctl->num_pages; i++) {
 		if (io_ctl->pages[i]) {
-			btrfs_folio_clear_checked(io_ctl->fs_info,
-					page_folio(io_ctl->pages[i]),
-					page_offset(io_ctl->pages[i]),
-					PAGE_SIZE);
 			unlock_page(io_ctl->pages[i]);
 			put_page(io_ctl->pages[i]);
 		}
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 29a624a2c611..b13a0a322e23 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5033,8 +5033,6 @@ int btrfs_truncate_block(struct btrfs_inode *inode, u64 offset, u64 start, u64 e
 	folio_zero_range(folio, zero_start - folio_pos(folio),
 			 zero_end - zero_start + 1);
 
-	btrfs_folio_clear_checked(fs_info, folio, block_start,
-				  block_end + 1 - block_start);
 	btrfs_folio_set_dirty(fs_info, folio, block_start,
 			      block_end + 1 - block_start);
 
@@ -7651,7 +7649,6 @@ static void btrfs_invalidate_folio(struct folio *folio, size_t offset,
 	 * did something wrong.
 	 */
 	ASSERT(!folio_test_ordered(folio));
-	btrfs_folio_clear_checked(fs_info, folio, folio_pos(folio), folio_size(folio));
 	if (!inode_evicting)
 		__btrfs_release_folio(folio, GFP_NOFS);
 	clear_folio_extent_mapped(folio);
diff --git a/fs/btrfs/reflink.c b/fs/btrfs/reflink.c
index 49865a463780..14742abe0f92 100644
--- a/fs/btrfs/reflink.c
+++ b/fs/btrfs/reflink.c
@@ -141,7 +141,6 @@ static int copy_inline_to_page(struct btrfs_inode *inode,
 		folio_zero_range(folio, datal, block_size - datal);
 
 	btrfs_folio_set_uptodate(fs_info, folio, file_offset, block_size);
-	btrfs_folio_clear_checked(fs_info, folio, file_offset, block_size);
 	btrfs_folio_set_dirty(fs_info, folio, file_offset, block_size);
 out_unlock:
 	if (!IS_ERR(folio)) {
diff --git a/fs/btrfs/subpage.c b/fs/btrfs/subpage.c
index f82e71f5d88b..8a09f34ea31e 100644
--- a/fs/btrfs/subpage.c
+++ b/fs/btrfs/subpage.c
@@ -508,35 +508,6 @@ void btrfs_subpage_clear_ordered(const struct btrfs_fs_info *fs_info,
 	spin_unlock_irqrestore(&bfs->lock, flags);
 }
 
-void btrfs_subpage_set_checked(const struct btrfs_fs_info *fs_info,
-			       struct folio *folio, u64 start, u32 len)
-{
-	struct btrfs_folio_state *bfs = folio_get_private(folio);
-	unsigned int start_bit = subpage_calc_start_bit(fs_info, folio,
-							checked, start, len);
-	unsigned long flags;
-
-	spin_lock_irqsave(&bfs->lock, flags);
-	bitmap_set(bfs->bitmaps, start_bit, len >> fs_info->sectorsize_bits);
-	if (subpage_test_bitmap_all_set(fs_info, folio, checked))
-		folio_set_checked(folio);
-	spin_unlock_irqrestore(&bfs->lock, flags);
-}
-
-void btrfs_subpage_clear_checked(const struct btrfs_fs_info *fs_info,
-				 struct folio *folio, u64 start, u32 len)
-{
-	struct btrfs_folio_state *bfs = folio_get_private(folio);
-	unsigned int start_bit = subpage_calc_start_bit(fs_info, folio,
-							checked, start, len);
-	unsigned long flags;
-
-	spin_lock_irqsave(&bfs->lock, flags);
-	bitmap_clear(bfs->bitmaps, start_bit, len >> fs_info->sectorsize_bits);
-	folio_clear_checked(folio);
-	spin_unlock_irqrestore(&bfs->lock, flags);
-}
-
 /*
  * Unlike set/clear which is dependent on each page status, for test all bits
  * are tested in the same way.
@@ -561,7 +532,6 @@ IMPLEMENT_BTRFS_SUBPAGE_TEST_OP(uptodate);
 IMPLEMENT_BTRFS_SUBPAGE_TEST_OP(dirty);
 IMPLEMENT_BTRFS_SUBPAGE_TEST_OP(writeback);
 IMPLEMENT_BTRFS_SUBPAGE_TEST_OP(ordered);
-IMPLEMENT_BTRFS_SUBPAGE_TEST_OP(checked);
 
 /*
  * Note that, in selftests (extent-io-tests), we can have empty fs_info passed
@@ -659,8 +629,6 @@ IMPLEMENT_BTRFS_PAGE_OPS(writeback, folio_start_writeback, folio_end_writeback,
 			 folio_test_writeback);
 IMPLEMENT_BTRFS_PAGE_OPS(ordered, folio_set_ordered, folio_clear_ordered,
 			 folio_test_ordered);
-IMPLEMENT_BTRFS_PAGE_OPS(checked, folio_set_checked, folio_clear_checked,
-			 folio_test_checked);
 
 #define GET_SUBPAGE_BITMAP(fs_info, folio, name, dst)			\
 {									\
@@ -782,7 +750,6 @@ void __cold btrfs_subpage_dump_bitmap(const struct btrfs_fs_info *fs_info,
 	unsigned long dirty_bitmap;
 	unsigned long writeback_bitmap;
 	unsigned long ordered_bitmap;
-	unsigned long checked_bitmap;
 	unsigned long locked_bitmap;
 	unsigned long flags;
 
@@ -795,20 +762,18 @@ void __cold btrfs_subpage_dump_bitmap(const struct btrfs_fs_info *fs_info,
 	GET_SUBPAGE_BITMAP(fs_info, folio, dirty, &dirty_bitmap);
 	GET_SUBPAGE_BITMAP(fs_info, folio, writeback, &writeback_bitmap);
 	GET_SUBPAGE_BITMAP(fs_info, folio, ordered, &ordered_bitmap);
-	GET_SUBPAGE_BITMAP(fs_info, folio, checked, &checked_bitmap);
 	GET_SUBPAGE_BITMAP(fs_info, folio, locked, &locked_bitmap);
 	spin_unlock_irqrestore(&bfs->lock, flags);
 
 	dump_page(folio_page(folio, 0), "btrfs folio state dump");
 	btrfs_warn(fs_info,
-"start=%llu len=%u page=%llu, bitmaps uptodate=%*pbl dirty=%*pbl locked=%*pbl writeback=%*pbl ordered=%*pbl checked=%*pbl",
+"start=%llu len=%u page=%llu, bitmaps uptodate=%*pbl dirty=%*pbl locked=%*pbl writeback=%*pbl ordered=%*pbl",
 		    start, len, folio_pos(folio),
 		    blocks_per_folio, &uptodate_bitmap,
 		    blocks_per_folio, &dirty_bitmap,
 		    blocks_per_folio, &locked_bitmap,
 		    blocks_per_folio, &writeback_bitmap,
-		    blocks_per_folio, &ordered_bitmap,
-		    blocks_per_folio, &checked_bitmap);
+		    blocks_per_folio, &ordered_bitmap);
 }
 
 void btrfs_get_subpage_dirty_bitmap(struct btrfs_fs_info *fs_info,
diff --git a/fs/btrfs/subpage.h b/fs/btrfs/subpage.h
index d81a0ade559f..fdea0b605bfc 100644
--- a/fs/btrfs/subpage.h
+++ b/fs/btrfs/subpage.h
@@ -41,11 +41,9 @@ enum {
 	btrfs_bitmap_nr_writeback,
 
 	/*
-	 * The ordered and checked flags are for COW fixup, already marked
-	 * deprecated, and will be removed eventually.
+	 * The ordered flags shows if the range has an ordered extent.
 	 */
 	btrfs_bitmap_nr_ordered,
-	btrfs_bitmap_nr_checked,
 
 	/*
 	 * The locked bit is for async delalloc range (compression), currently
@@ -182,7 +180,6 @@ DECLARE_BTRFS_SUBPAGE_OPS(uptodate);
 DECLARE_BTRFS_SUBPAGE_OPS(dirty);
 DECLARE_BTRFS_SUBPAGE_OPS(writeback);
 DECLARE_BTRFS_SUBPAGE_OPS(ordered);
-DECLARE_BTRFS_SUBPAGE_OPS(checked);
 
 /*
  * Helper for error cleanup, where a folio will have its dirty flag cleared,
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2 1/2] btrfs: remove the COW fixup mechanism
  2026-04-14  1:16 ` [PATCH v2 1/2] btrfs: remove the COW fixup mechanism Qu Wenruo
@ 2026-04-14  2:14   ` David Sterba
  0 siblings, 0 replies; 4+ messages in thread
From: David Sterba @ 2026-04-14  2:14 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Tue, Apr 14, 2026 at 10:46:41AM +0930, Qu Wenruo wrote:
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -1739,12 +1739,6 @@ static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode,
>  	       start, len, folio_start, folio_size(folio));
>  
>  	ret = btrfs_writepage_cow_fixup(folio);

btrfs_writepage_cow_fixup() is left to do the check, but now the name
does not match what it's doing: it basically verifies that the ordered
bit is set, or returns EUCLEAN.

Either here or in a separate patch, please rename it to something like
"verify ordered", or something describing what we expect here, so we
can forget about what 'fixup' meant.

> -	if (ret == -EAGAIN) {
> -		/* Fixup worker will requeue */
> -		folio_redirty_for_writepage(bio_ctrl->wbc, folio);
> -		folio_unlock(folio);
> -		return 1;
> -	}
>  	if (ret < 0) {
>  		btrfs_folio_clear_dirty(fs_info, folio, start, len);
>  		btrfs_folio_set_writeback(fs_info, folio, start, len);

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-04-14  2:14 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-14  1:16 [PATCH v2 0/2] btrfs: remove COW fixup and checked folio flag Qu Wenruo
2026-04-14  1:16 ` [PATCH v2 1/2] btrfs: remove the COW fixup mechanism Qu Wenruo
2026-04-14  2:14   ` David Sterba
2026-04-14  1:16 ` [PATCH v2 2/2] btrfs: remove folio checked subpage bitmap tracking Qu Wenruo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox