From: David Arendt <admin@prnet.org>
To: "Qu Wenruo" <quwenruo.btrfs@gmx.com>,
"Stéphane Lesimple" <stephane_btrfs2@lesimple.fr>,
"Qu Wenruo" <wqu@suse.com>,
linux-btrfs@vger.kernel.org
Subject: Re: 5.6-5.10 balance regression?
Date: Tue, 29 Dec 2020 01:59:48 +0100 [thread overview]
Message-ID: <6dc40a44-bc3b-ea9b-327b-e2700b2efd62@prnet.org> (raw)
In-Reply-To: <5f819b6c-d737-bb73-5382-370875b599c1@gmx.com>
Hi,
Just for information: on my system, the error appeared on a filesystem
using space cache v1, so my problem might be unrelated to this one. If it
happens again, I will try to collect more information. Maybe I should try
a clear_cache mount to ensure the space cache is not corrupted.
Bye,
David Arendt
On 12/29/20 1:44 AM, Qu Wenruo wrote:
>
>
> On 2020/12/29 7:39 AM, Qu Wenruo wrote:
>>
>>
>> On 2020/12/29 3:58 AM, Stéphane Lesimple wrote:
>>>> I know it fails in relocate_block_group(), which returns -2, I'm
>>>> currently
>>>> adding a couple printk's here and there to try to pinpoint that
>>>> better.
>>>
>>> Okay, so btrfs_relocate_block_group() starts with stage
>>> MOVE_DATA_EXTENTS, which
>>> completes successfully, as relocate_block_group() returns 0:
>>>
>>> BTRFS info (device <unknown>): relocate_block_group:
>>> prepare_to_realocate = 0
>>> BTRFS info (device <unknown>): relocate_block_group loop: progress =
>>> 1, btrfs_start_transaction = ok
>>> [...]
>>> BTRFS info (device <unknown>): relocate_block_group loop: progress =
>>> 168, btrfs_start_transaction = ok
>>> BTRFS info (device <unknown>): relocate_block_group: returning err = 0
>>> BTRFS info (device dm-10): stage = move data extents,
>>> relocate_block_group = 0
>>> BTRFS info (device dm-10): found 167 extents, stage: move data extents
>>>
>>> Then it proceeds to the UPDATE_DATA_PTRS stage and calls
>>> relocate_block_group()
>>> again. This time it fails at the 92nd iteration of the loop:
>>>
>>> BTRFS info (device <unknown>): relocate_block_group loop: progress =
>>> 92, btrfs_start_transaction = ok
>>> BTRFS info (device <unknown>): relocate_block_group loop:
>>> extents_found = 92, item_size(53) >= sizeof(*ei)(24), flags = 1, ret
>>> = 0
>>> BTRFS info (device <unknown>): add_data_references:
>>> btrfs_find_all_leafs = 0
>>> BTRFS info (device <unknown>): add_data_references loop:
>>> read_tree_block ok
>>> BTRFS info (device <unknown>): add_data_references loop:
>>> delete_v1_space_cache = -2
>>
>> Damn it, if we find no v1 space cache for the block group, it means
>> we're fine to continue...
>>
>>> BTRFS info (device <unknown>): relocate_block_group loop:
>>> add_data_references = -2
>>>
>>> Then the -ENOENT goes all the way up the call stack and aborts the
>>> balance.
>>>
>>> So it fails in delete_v1_space_cache(), though it is worth noting that
>>> the
>>> FS we're talking about is actually using space_cache v2.
>>
>> Space cache v2, no wonder there's no v1 space cache.
>>
>>>
>>> Does it help? Shall I dig deeper?
>>
>> You're already at the point!
>>
>> Mind if I craft a fix with your Signed-off-by?
>
> The problem is more complex than I thought, but we at least have a
> workaround.
>
> Firstly, this happens when an old fs gets v2 space cache enabled but
> still has v1 space cache left over.
>
> A newer v2 mount should clean up v1 properly, but older kernels don't
> do the proper cleanup, leaving some v1 cache behind.
>
> Running btrfs balance on such an old fs then leads to the -ENOENT error.
> We can't ignore the error, as we have no way to relocate such leftover
> v1 cache (normally we would delete it completely, but with v2 cache
> enabled, we can't).
>
> So for now, all I can do is add a warning message for this problem.
>
> To solve your problem, I have also submitted a patch to btrfs-progs to
> force v1 space cache cleanup even when the fs has v2 space cache enabled.
>
> Or, you can clear the v2 space cache first, using "btrfs check
> --clear-space-cache v2", then run "btrfs check --clear-space-cache
> v1", and finally mount the fs with "space_cache=v2" again.
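> As a sketch (not part of the original thread), the whole sequence might
> look like this; /dev/sdX and /mnt are placeholders for the actual device
> and mount point:

```shell
# Hypothetical device and mount point; the fs must be unmounted
# before running btrfs check on it.
umount /mnt
btrfs check --clear-space-cache v2 /dev/sdX   # drop the free space tree (v2)
btrfs check --clear-space-cache v1 /dev/sdX   # drop leftover v1 cache files
mount -o space_cache=v2 /dev/sdX /mnt         # recreate the free space tree
```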
>
> To verify there is no v1 space cache left, you can run the following
> command:
>
> # btrfs ins dump-tree -t root <device> | grep EXTENT_DATA
>
> It should output nothing.
>
> Then please check whether you can balance all your data.
>
> Thanks,
> Qu
>
>>
>> Thanks,
>> Qu
>>
>>>
>>> Regards,
>>>
>>> Stéphane.
>>>
Thread overview: 13+ messages
2020-12-27 12:11 5.6-5.10 balance regression? Stéphane Lesimple
2020-12-27 13:11 ` David Arendt
2020-12-28 0:06 ` Qu Wenruo
2020-12-28 7:38 ` David Arendt
2020-12-28 7:48 ` Qu Wenruo
2020-12-28 17:43 ` Stéphane Lesimple
2020-12-28 19:58 ` Stéphane Lesimple
2020-12-28 23:39 ` Qu Wenruo
2020-12-29 0:44 ` Qu Wenruo
2020-12-29 0:59 ` David Arendt [this message]
2020-12-29 4:36 ` Qu Wenruo
2020-12-29 9:31 ` Stéphane Lesimple
2020-12-29 9:42 ` Martin Steigerwald