public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org
Cc: Su Yue <l@damenly.su>
Subject: Re: [PATCH] btrfs: remove an ASSERT() which can cause false alert
Date: Thu, 17 Feb 2022 13:42:18 +0800	[thread overview]
Message-ID: <dbe7eeec-e3ed-0e2a-fb15-6758c2fc56fd@suse.com> (raw)
In-Reply-To: <8ff11d24e9650cbaed4f52c4ead6184899bcc7b8.1644900321.git.wqu@suse.com>



On 2022/2/15 12:46, Qu Wenruo wrote:
> [BUG]
> Su reported that with his VM running on an M1 mac, it has a high chance
> to trigger the following ASSERT() with scrub and balance test cases:
> 
> 		ASSERT(cache->start == chunk_offset);
> 		ret = scrub_chunk(sctx, cache, scrub_dev, found_key.offset,
> 				  dev_extent_len);
> 
> [CAUSE]
> The ASSERT() is making sure that the block group cache (@cache) and the
> chunk offset from the device extent match.
> 
> But the truth is, block group caches and dev extent items are deleted at
> different timing.
> 
> For block group caches and items, after btrfs_remove_block_group() they
> will be marked deleted, but the device extent and the chunk item is
> still in dev tree and chunk tree respectively.
> 
> The device extents and chunk items are only deleted in
> btrfs_remove_chunk(), which happens either a btrfs_delete_unused_bgs()
> or btrfs_relocate_chunk().
> 
> This timing difference leaves a window where we can have a mismatch
> between block group cache and device extent tree, and triggering the
> ASSERT().
> 
> [FIX]
> Just remove the ASSERT().
> 
> In fact there are several checks in scrub_chunk() to skip such device
> extents.
> In scrub_chunk() we do an chunk mapping search, and handling (!em) or
> (em->start != bg->start) case by simply exit with ret = 0.
> 
> So the ASSERT() is really not necessary.
> 
> Please fold this fix into the patch "btrfs: scrub: cleanup the argument
> list of scrub_chunk()".
> 
> Reported-by: Su Yue <l@damenly.su>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>   fs/btrfs/scrub.c | 1 -
>   1 file changed, 1 deletion(-)
> 
> diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
> index 11089568b287..9c6cf1149f08 100644
> --- a/fs/btrfs/scrub.c
> +++ b/fs/btrfs/scrub.c
> @@ -3822,7 +3822,6 @@ int scrub_enumerate_chunks(struct scrub_ctx *sctx,
>   		dev_replace->item_needs_writeback = 1;
>   		up_write(&dev_replace->rwsem);
>   
> -		ASSERT(cache->start == chunk_offset);

This check still has some sense, as we expect the bg cache to match with 
the dev extent.

>   		ret = scrub_chunk(sctx, cache, scrub_dev, found_key.offset,
>   				  dev_extent_len);

And since we have removed chunk_offset passed to scrub_chunk(), we have 
an extra check before we call scrub_stripe() against found_key.offset.

So removing the ASSERT() is still OK.

But it would be better to replace the ASSERT() with a full check between 
cache->start and chunk_offset.

I'd better find a way to reproduce the case, which I can not reproduce 
at all under x86_64 vms.

Thanks,
Qu

>   


      reply	other threads:[~2022-02-17  5:42 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-15  4:46 [PATCH] btrfs: remove an ASSERT() which can cause false alert Qu Wenruo
2022-02-17  5:42 ` Qu Wenruo [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dbe7eeec-e3ed-0e2a-fb15-6758c2fc56fd@suse.com \
    --to=wqu@suse.com \
    --cc=l@damenly.su \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox