From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Illia Bobyr <illia.bobyr@gmail.com>, linux-btrfs@vger.kernel.org
Subject: Re: "parent transid verify failed" and mount usebackuproot does not seem to work
Date: Wed, 1 Jul 2020 18:48:41 +0800
Message-ID: <39558ad7-dfb3-05f7-1583-181f76f2a93d@gmx.com>
In-Reply-To: <2f22bd0a-aa48-d0f1-04d0-cb130897249d@gmail.com>
On 2020/7/1 6:16 PM, Illia Bobyr wrote:
> On 6/30/2020 6:36 PM, Qu Wenruo wrote:
>> On 2020/7/1 3:41 AM, Illia Bobyr wrote:
>>> Hi,
>>>
>>> I have a btrfs with bcache setup that failed during a boot yesterday.
>>> There is one SSD with bcache that is used as a cache for 3 btrfs HDDs.
>>>
>>> Reading through a number of discussions, I've decided to ask for advice here.
>>> Should I be running "btrfs check --recover"?
>>>
>>> The last message in the dmesg log is this one:
>>>
>>> Btrfs loaded, crc32c=crc32c-intel
>>> BTRFS: device label root devid 3 transid 138434 /dev/bcache2 scanned by btrfs (341)
>>> BTRFS: device label root devid 2 transid 138434 /dev/bcache1 scanned by btrfs (341)
>>> BTRFS: device label root devid 1 transid 138434 /dev/bcache0 scanned by btrfs (341)
>>> BTRFS info (device bcache0): disk space caching is enabled
>>> BTRFS info (device bcache0): has skinny extents
>>> BTRFS error (device bcache0): parent transid verify failed on 16984159518720 wanted 138414 found 138207
>>> BTRFS error (device bcache0): parent transid verify failed on 16984159518720 wanted 138414 found 138207
>>> BTRFS error (device bcache0): open_ctree failed
>> Looks like some tree blocks not written back correctly.
>>
>> Considering we don't have known write back related bugs with 5.6, I
>> guess bcache may be involved again?
>
> A bit more details: the system started to misbehave.
> Interactive session was saying that the main file system became read/only.
Is there any dmesg output from that RO event?
That would be the most valuable info to help us locate the bug and
fix it.
I guess something went wrong before that and somehow corrupted the
extent tree, breaking the life-saving COW of metadata and screwing up
everything.
> And then the SSH disconnected and did not reconnect any more.
> It did not seem to reboot correctly after I pressed the reboot
> button, so I did a hard reboot.
> And now it could not mount the root partition any more.
>>> Trying to mount it in the recovery mode does not seem to work:
>>>
>>> [...]
>>>
>>> I have tried booting using a live ISO with 5.8.0 kernel and btrfs v5.6.1
>>> from http://defender.exton.net/.
>>> After booting tried mounting the bcache using the same command as above.
>>> The only message in the console was "Killed".
>>> /dev/kmsg on the other hand lists messages very similar to the ones I've
>>> seen in the initramfs environment: https://pastebin.com/Vhy072Mx
>> It looks like there is a chance to recover, as there is a backup root
>> with a newer generation.
>>
>> However, tree-checker is rejecting the newer generation one.
>>
>> The kernel panic is caused by a corner case in the error handling of
>> backup root cleanup.
>> We need to fix it anyway.
>>
>> In this case, I guess "btrfs ins dump-super -fFa" output would help to
>> show if it's possible to recover.
>
> Here is the output: https://pastebin.com/raw/DtJd813y
OK, the backup root is fine.
So this means metadata COW is broken, which caused the transid mismatch.
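
To make the mismatch concrete, here is a minimal Python sketch of the idea behind "parent transid verify failed": a pointer in a parent node records the generation it expects of the child, and the reader compares that with the generation in the child's own header. The class and function names are made up for illustration and are not the kernel's code; the numbers are taken from the dmesg output quoted earlier.

```python
from dataclasses import dataclass


@dataclass
class KeyPtr:
    """Pointer stored in a parent node (hypothetical, simplified)."""
    bytenr: int       # logical address of the child tree block
    generation: int   # transid the parent expects the child to have


@dataclass
class TreeBlock:
    """Header fields of a tree block as read from disk (simplified)."""
    bytenr: int
    generation: int   # transid in which this block was last written


def verify_parent_transid(ptr: KeyPtr, child: TreeBlock) -> bool:
    """Return True if the child matches what the parent expects."""
    if child.generation == ptr.generation:
        return True
    print(f"parent transid verify failed on {child.bytenr} "
          f"wanted {ptr.generation} found {child.generation}")
    return False


# Values taken from the dmesg output quoted earlier in this thread.
ptr = KeyPtr(bytenr=16984159518720, generation=138414)
stale = TreeBlock(bytenr=16984159518720, generation=138207)
ok = verify_parent_transid(ptr, stale)
```

With working COW, a block that a parent points to is never overwritten in place, so the two generations should always agree; a stale child generation suggests the newer copy never reached the disk.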
>
>> Anyway, something looks strange.
>>
>> Backup roots having a newer generation while the super block is still
>> old doesn't look correct at all.
>
> Just in case, here is the output of "btrfs check", as suggested by "A L
> <mail@lechevalier.se>". It does not seem to contain any new information.
>
> parent transid verify failed on 16984014372864 wanted 138350 found 131117
> parent transid verify failed on 16984014405632 wanted 138350 found 131127
> parent transid verify failed on 16984013406208 wanted 138350 found 131112
> parent transid verify failed on 16984075436032 wanted 138384 found 131136
> parent transid verify failed on 16984075436032 wanted 138384 found 131136
> parent transid verify failed on 16984075436032 wanted 138384 found 131136
> Ignoring transid failure
> ERROR: child eb corrupted: parent bytenr=16984175853568 item=8 parent
> level=2 child level=0
> ERROR: failed to read block groups: Input/output error
The extent tree is completely screwed up, no wonder the transid error
happens.
I don't believe it's reasonably possible to restore the fs to RW status.
The only remaining method is btrfs-restore, then.
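
As a rough way to see how stale the blocks are, the following Python sketch parses the "parent transid verify failed" lines from the btrfs check output above and reports how many generations each block is behind. The helper name `transid_gaps` is hypothetical; only the log lines come from this thread.

```python
import re

# Lines copied from the "btrfs check" output quoted above.
LOG = """\
parent transid verify failed on 16984014372864 wanted 138350 found 131117
parent transid verify failed on 16984014405632 wanted 138350 found 131127
parent transid verify failed on 16984013406208 wanted 138350 found 131112
parent transid verify failed on 16984075436032 wanted 138384 found 131136
"""

PATTERN = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def transid_gaps(text):
    """Return {bytenr: (wanted, found, wanted - found)} for each failure."""
    gaps = {}
    for m in PATTERN.finditer(text):
        bytenr, wanted, found = (int(g) for g in m.groups())
        gaps[bytenr] = (wanted, found, wanted - found)
    return gaps


gaps = transid_gaps(LOG)
for bytenr, (wanted, found, gap) in gaps.items():
    print(f"{bytenr}: wanted {wanted}, found {found}, {gap} generations behind")
```

Every block here is more than 7000 generations behind what its parent expects, compared with a gap of 207 in the mount-time error.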
> ERROR: cannot open file system
> Opening filesystem to check...
>
> As I was running the commands I have accidentally run the following command:
>
> btrfs inspect-internal dump-super -fFa >/dev/bcache0 2>&1
>
> Effectively overwriting the first 10kb of the partition :(
That's not a problem at all.
Btrfs reserves the first 1M of the device, i.e. [0, 1M), so as long as
you don't screw up the super block at [64K, 68K) you're completely fine.
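
The arithmetic can be checked with a small Python sketch. It assumes the accidental write covered roughly the first 10 KiB of the device, as reported, and uses the offsets given above; the helper `overlaps` is made up for illustration.

```python
KIB = 1024

# Assumptions: primary super block at [64 KiB, 68 KiB) as stated above;
# the accidental write covered roughly the first 10 KiB of the device.
SUPER_START = 64 * KIB
SUPER_END = 68 * KIB
RESERVED_END = 1024 * KIB  # end of the reserved first 1 MiB


def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open ranges [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end


write_start, write_end = 0, 10 * KIB

hits_super = overlaps(write_start, write_end, SUPER_START, SUPER_END)
inside_reserved = write_end <= RESERVED_END

print("overwrites super block:", hits_super)
print("stays inside reserved area:", inside_reserved)
```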
Thanks,
Qu
>
> Seems like the superblock starts at 64kb. So, I hope, this would not
> cause any more damage.
>
> P.S. Thanks a lot for your reply Qu Wenruo!
>
> Thank you,
> Illia
>