public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] btrfs: fix race in subpage sync writeback handling
@ 2025-04-14 19:32 Josef Bacik
  2025-04-14 19:52 ` Boris Burkov
  2025-04-14 21:45 ` Qu Wenruo
  0 siblings, 2 replies; 5+ messages in thread
From: Josef Bacik @ 2025-04-14 19:32 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

While debugging a fs corruption with subpage we noticed a potential race
in how we do tagging for writeback.  Boris came up with the following
diagram to describe the race potential

EB1                                                       EB2
btree_write_cache_pages
  tag_pages_for_writeback
  filemap_get_folios_tag(TOWRITE)
  submit_eb_page
    submit_eb_subpage
      for eb in folio:
        write_one_eb
                                                          set_extent_buffer_dirty
                                                          btrfs_meta_folio_set_dirty
                                                          ...
                                                          filemap_fdatawrite_range
                                                            btree_write_cache_pages
                                                            tag_pages_for_writeback
          folio_lock
          btrfs_meta_folio_clear_dirty
          btrfs_meta_folio_set_writeback // clear TOWRITE
          folio_unlock
                                                            filemap_get_folios_tag(TOWRITE) //missed

The problem here is that we'll call folio_start_writeback() the first
time we initiate writeback on one extent buffer, if we happened to dirty
the extent buffer behind the one we were currently writing in the first
task, and race in as described above, we would miss the TOWRITE tag as
it would have been cleared, and we will never initiate writeback on that
EB.

This is kind of complicated for us, the best thing to do here is to
simply leave the TOWRITE tag in place, and only clear it if we aren't
dirty after we finish processing all the eb's in the folio.

Fixes: c4aec299fa8f ("btrfs: introduce submit_eb_subpage() to submit a subpage metadata page")
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent_io.c | 23 +++++++++++++++++++++++
 fs/btrfs/subpage.c   |  2 +-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 6cfd286b8bbc..5d09a47c1c28 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2063,6 +2063,29 @@ static int submit_eb_subpage(struct folio *folio, struct writeback_control *wbc)
 		}
 		free_extent_buffer(eb);
 	}
+	/*
+	 * Normally folio_start_writeback() will clear TAG_TOWRITE, but for
+	 * subpage we use __folio_start_writeback(folio, true), which keeps it
+	 * from clearing TOWRITE.  This is because we walk the bitmap and
+	 * process each eb one at a time, and then locking the folio when we
+	 * process the eb.  We could have somebody dirty behind us, and then
+	 * subsequently mark this range as TOWRITE.  In that case we must not
+	 * clear TOWRITE or we will skip writing back the dirty folio.
+	 *
+	 * So here lock the folio, if it is clean we know we are done with it,
+	 * and we can clear TOWRITE.
+	 */
+	folio_lock(folio);
+	if (!folio_test_dirty(folio)) {
+		XA_STATE(xas, &folio->mapping->i_pages, folio_index(folio));
+		unsigned long flags;
+
+		xas_lock_irqsave(&xas, flags);
+		xas_load(&xas);
+		xas_clear_mark(&xas, PAGECACHE_TAG_TOWRITE);
+		xas_unlock_irqrestore(&xas, flags);
+	}
+	folio_unlock(folio);
 	return submitted;
 }
 
diff --git a/fs/btrfs/subpage.c b/fs/btrfs/subpage.c
index d4f019233493..53140a9dad9d 100644
--- a/fs/btrfs/subpage.c
+++ b/fs/btrfs/subpage.c
@@ -454,7 +454,7 @@ void btrfs_subpage_set_writeback(const struct btrfs_fs_info *fs_info,
 	spin_lock_irqsave(&subpage->lock, flags);
 	bitmap_set(subpage->bitmaps, start_bit, len >> fs_info->sectorsize_bits);
 	if (!folio_test_writeback(folio))
-		folio_start_writeback(folio);
+		__folio_start_writeback(folio, true);
 	spin_unlock_irqrestore(&subpage->lock, flags);
 }
 
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-04-15 17:22 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-14 19:32 [PATCH] btrfs: fix race in subpage sync writeback handling Josef Bacik
2025-04-14 19:52 ` Boris Burkov
2025-04-14 21:45 ` Qu Wenruo
2025-04-15  0:40   ` Boris Burkov
2025-04-15 17:22   ` Josef Bacik

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox