public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] btrfs: remove an ASSERT() which can cause false alert
@ 2022-02-17  9:49 Qu Wenruo
  2022-02-17 11:14 ` Filipe Manana
  2022-04-19 20:34 ` David Sterba
  0 siblings, 2 replies; 9+ messages in thread
From: Qu Wenruo @ 2022-02-17  9:49 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Su Yue

[BUG]
Su reported that with his VM running on an M1 mac, it has a high chance
to trigger the following ASSERT() with scrub and balance test cases:

		ASSERT(cache->start == chunk_offset);
		ret = scrub_chunk(sctx, cache, scrub_dev, found_key.offset,
				  dev_extent_len);

[CAUSE]
The ASSERT() is making sure that the block group cache (@cache) and the
chunk offset from the device extent match.

But the truth is, block group caches and dev extent items are deleted at
different timing.

For block group caches and items, after btrfs_remove_block_group() they
will be marked deleted, but the device extent and the chunk item is
still in dev tree and chunk tree respectively.

The device extents and chunk items are only deleted in
btrfs_remove_chunk(), which happens either a btrfs_delete_unused_bgs()
or btrfs_relocate_chunk().

This timing difference leaves a window where we can have a mismatch
between block group cache and device extent tree, and triggering the
ASSERT().

[FIX]

- Remove the ASSERT()

- Add a quick exit if our found bg doesn't match @chunk_offset

- Add a comment on the existing checks in scrub_chunk()
  This is the ultimate safenet, as it will iterate through all the stripes
  of the found chunk.
  And only scrub the stripe if it matches both device and physical
  offset.

  So even by some how we got a dev extent which is no longer owned
  by the target chunk, we will just skip this chunk completely, without
  any consequence.

Reported-by: Su Yue <l@damenly.su>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Changelog:
v2:
- Add a new quick exit with extra comments

- Add a new comment in the existing safenet in scrub_chunk()
---
 fs/btrfs/scrub.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 11089568b287..1c3ed4720ddd 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -3578,6 +3578,14 @@ static noinline_for_stack int scrub_chunk(struct scrub_ctx *sctx,
 		goto out;
 
 	map = em->map_lookup;
+
+	/*
+	 * Due to the delayed modification of device tree, this chunk
+	 * may not own this dev extent.
+	 *
+	 * So we need to iterate through all stripes, and only scrub
+	 * the stripe which matches both device and physical offset.
+	 */
 	for (i = 0; i < map->num_stripes; ++i) {
 		if (map->stripes[i].dev->bdev == scrub_dev->bdev &&
 		    map->stripes[i].physical == dev_offset) {
@@ -3699,6 +3707,18 @@ int scrub_enumerate_chunks(struct scrub_ctx *sctx,
 		if (!cache)
 			goto skip;
 
+		/*
+		 * Since dev extents modification is delayed, it's possible
+		 * we got a bg which doesn't really own this dev extent.
+		 *
+		 * Here just do a quick skip, scrub_chunk() has a more
+		 * comprehensive check in it.
+		 */
+		if (cache->start != chunk_offset) {
+			btrfs_put_block_group(cache);
+			goto skip;
+		}
+
 		if (sctx->is_dev_replace && btrfs_is_zoned(fs_info)) {
 			spin_lock(&cache->lock);
 			if (!cache->to_copy) {
@@ -3822,7 +3842,6 @@ int scrub_enumerate_chunks(struct scrub_ctx *sctx,
 		dev_replace->item_needs_writeback = 1;
 		up_write(&dev_replace->rwsem);
 
-		ASSERT(cache->start == chunk_offset);
 		ret = scrub_chunk(sctx, cache, scrub_dev, found_key.offset,
 				  dev_extent_len);
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-04-19 20:38 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-02-17  9:49 [PATCH v2] btrfs: remove an ASSERT() which can cause false alert Qu Wenruo
2022-02-17 11:14 ` Filipe Manana
2022-02-17 11:24   ` Qu Wenruo
2022-02-17 12:32     ` Filipe Manana
2022-02-17 13:00       ` Qu Wenruo
2022-02-17 13:02         ` Filipe Manana
2022-02-17 13:25           ` Qu Wenruo
2022-02-18  2:29           ` Su Yue
2022-04-19 20:34 ` David Sterba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox