From: "Tomáš Metelka" <tomas.metelka@metaliza.cz>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Broken chunk tree - Was: Mount issue, mount /dev/sdc2: can't read superblock
Date: Sun, 30 Dec 2018 01:48:23 +0100
Message-ID: <07b88bad-e1fa-7485-d410-ee261ace321c@metaliza.cz>
In-Reply-To: <8f59acfd-4d97-86d4-2063-25213e2770d0@gmx.com>
Ok, I've got it:-(
But just a few questions: I've tried (with btrfs-progs v4.19.1) to
recover files through btrfs restore -s -m -S -v -i ... and the following
events occurred:
1) Just 1 "hard" error:
ERROR: cannot map block logical 117058830336 length 1073741824: -2
Error copying data for /mnt/...
(a file whose absence really doesn't pain me :-))
2) For 24 files I got the "too many loops" warning (I mean this: "if
(loops >= 0 && loops++ >= 1024) { ..."). I always answered yes, but
I'm afraid these files are corrupted (at least 2 of them seem corrupted).
How bad is this? Does the error mentioned in #1 mean that it's the
only file which is totally lost? I can live without those 24 + 1 files,
so if #1 and #2 are the only errors then I could say the recovery
was successful ... but I'm afraid things aren't that easy :-)
Thanks
M.
Tomáš Metelka
Business & IT Analyst
Tel: +420 728 627 252
Email: tomas.metelka@metaliza.cz
On 24. 12. 18 15:19, Qu Wenruo wrote:
>
>
> On 2018/12/24 9:52 PM, Tomáš Metelka wrote:
>> On 24. 12. 18 14:02, Qu Wenruo wrote:
>>> btrfs check --readonly output please.
>>>
>>> btrfs check --readonly is always the most reliable and detailed output
>>> for any possible recovery.
>>
>> This is very weird because it prints only:
>> ERROR: cannot open file system
>
> A new place to enhance ;)
>
>>
>> I've also tried "btrfs check -r 75152310272" but it only says:
>> parent transid verify failed on 75152310272 wanted 2488742 found 2488741
>> parent transid verify failed on 75152310272 wanted 2488742 found 2488741
>> Ignoring transid failure
>> ERROR: cannot open file system
>>
>> I've tried that because:
>> backup 3:
>> backup_tree_root: 75152310272 gen: 2488741 level: 1
>>
>>> Also kernel message for the mount failure could help.
>>
>> Sorry, my fault, I should have started from this point:
>>
>> Dec 23 21:59:07 tisc5 kernel: [10319.442615] BTRFS: device fsid
>> be557007-42c9-4079-be16-568997e94cd9 devid 1 transid 2488742 /dev/loop0
>> Dec 23 22:00:49 tisc5 kernel: [10421.167028] BTRFS info (device loop0):
>> disk space caching is enabled
>> Dec 23 22:00:49 tisc5 kernel: [10421.167034] BTRFS info (device loop0):
>> has skinny extents
>> Dec 23 22:00:50 tisc5 kernel: [10421.807564] BTRFS critical (device
>> loop0): corrupt node: root=1 block=75150311424 slot=245, invalid NULL
>> node pointer
> This explains the problem.
>
> Your root tree has one node pointer which is not correct.
> A node pointer should never point to 0.
>
> This is pretty weird; at least, it is a corruption pattern I have never seen.
>
> Since your tree root got corrupted, there isn't much we can do
> other than try older tree roots.
>
> You could try all the backup roots, starting from the newest backup (with
> the highest generation), and check each backup root bytenr using:
> # btrfs check -r <backup root bytenr> <device>
>
> to see which one gives the fewest errors, but normally the chance is near 0.
>
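As a rough illustration of the procedure suggested above (try the backup roots from the highest generation downwards), the ordering could be scripted. This is only a sketch: it assumes the `backup_tree_root: <bytenr> gen: <gen> level: <n>` line format shown in the dump-super snippet earlier in this thread, and the function names `backup_roots_newest_first` and `check_commands` are hypothetical helpers, not part of btrfs-progs. It only prints the commands and does not run them.

```python
import re

# Matches lines such as "backup_tree_root: 75152310272 gen: 2488741 level: 1"
# (format taken from the dump-super snippet quoted in this thread).
BACKUP_ROOT_RE = re.compile(
    r"backup_tree_root:\s*(\d+)\s+gen:\s*(\d+)\s+level:\s*(\d+)"
)


def backup_roots_newest_first(dump_super_text):
    """Return unique (bytenr, generation) pairs, highest generation first."""
    roots = {
        (int(m.group(1)), int(m.group(2)))
        for m in BACKUP_ROOT_RE.finditer(dump_super_text)
    }
    # Assumption: a bytenr of 0 is an unused backup slot, so it is skipped.
    usable = [r for r in roots if r[0] != 0]
    return sorted(usable, key=lambda r: (r[1], r[0]), reverse=True)


def check_commands(dump_super_text, device):
    """Build the 'btrfs check -r' command lines in the order to try them."""
    return [
        f"btrfs check -r {bytenr} {device}"
        for bytenr, _gen in backup_roots_newest_first(dump_super_text)
    ]


if __name__ == "__main__":
    # Only the one backup root quoted in this thread is used as sample input.
    sample = "backup_tree_root: 75152310272 gen: 2488741 level: 1\n"
    for cmd in check_commands(sample, "/dev/loop0"):
        print(cmd)
```

Feeding it the full text of the dump-super output would list one command per backup root, newest first, which can then be run by hand and compared for the fewest errors.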
>> Dec 23 22:00:50 tisc5 kernel: [10421.807653] BTRFS error (device loop0):
>> failed to read block groups: -5
>> Dec 23 22:00:50 tisc5 kernel: [10421.877001] BTRFS error (device loop0):
>> open_ctree failed
>>
>>
>> So I tried:
>> 1) btrfs inspect-internal dump-super (with the snippet posted above)
>> 2) btrfs inspect-internal dump-tree -b 75150311424
>>
>> And it showed (header + snippet for items 243-248):
>> node 75150311424 level 1 items 249 free 244 generation 2488741 owner 2
>> fs uuid be557007-42c9-4079-be16-568997e94cd9
>> chunk uuid dbe69c7e-2d50-4001-af31-148c5475b48b
>> ...
>> key (14799519744 EXTENT_ITEM 4096) block 233423224832 (14247023) gen
>> 2484894
>> key (14811271168 EXTENT_ITEM 135168) block 656310272 (40058) gen 2488049
>
>
>> key (1505328190277054464 UNKNOWN.4 366981796979539968) block 0 (0) gen 0
>> key (0 UNKNOWN.0 1419267647995904) block 6468220747776 (394788864) gen
>> 7786775707648
>
> Pretty obviously, these two nodes are garbage.
> Something corrupted the memory at runtime, and we don't have a runtime
> check against such corruption yet.
>
> So IMHO the problem is that some kernel code, either btrfs or other
> parts, corrupted the memory.
> Then btrfs failed to detect it and wrote it back to disk, and the
> kernel only caught the problem when it later read the tree block
> back from disk.
>
> I could add such a check for nodes, but normally it needs
> CONFIG_BTRFS_FS_CHECK_INTEGRITY, so it makes no sense for normal users.
>
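To make concrete why those two slots stand out, here is a small sketch that scans dump-tree node output for key pointers that look wrong. It is not what the kernel or btrfs-progs does internally; it assumes the `key (...) block N (M) gen G` line format shown above, and the three heuristics (block pointer of 0, pointer generation higher than the node's own generation, an `UNKNOWN` key type) are simplifying assumptions. The name `suspicious_pointers` is hypothetical.

```python
import re

# Matches entries such as
# "key (14811271168 EXTENT_ITEM 135168) block 656310272 (40058) gen 2488049".
# \s+ is used between tokens so entries wrapped across lines still match.
KEY_PTR_RE = re.compile(
    r"key\s+\((\d+)\s+(\S+)\s+(\d+)\)\s+block\s+(\d+)\s+\((\d+)\)\s+gen\s+(\d+)"
)


def suspicious_pointers(node_generation, dump_text):
    """Return (block, reasons) for every key pointer that looks like garbage."""
    findings = []
    for m in KEY_PTR_RE.finditer(dump_text):
        key_type = m.group(2)
        block = int(m.group(4))
        gen = int(m.group(6))
        reasons = []
        if block == 0:
            reasons.append("NULL node pointer")
        # Assumption: a child cannot be newer than the node pointing to it.
        if gen > node_generation:
            reasons.append("generation newer than parent node")
        if key_type.startswith("UNKNOWN"):
            reasons.append("unknown key type")
        if reasons:
            findings.append((block, reasons))
    return findings


if __name__ == "__main__":
    # Entries copied from the node dump in this thread (generation 2488741).
    dump = """
    key (14811271168 EXTENT_ITEM 135168) block 656310272 (40058) gen 2488049
    key (1505328190277054464 UNKNOWN.4 366981796979539968) block 0 (0) gen 0
    key (0 UNKNOWN.0 1419267647995904) block 6468220747776 (394788864) gen
    7786775707648
    key (12884901888 EXTENT_ITEM 24576) block 816693248 (49847) gen 2484931
    """
    for block, reasons in suspicious_pointers(2488741, dump):
        print(block, ", ".join(reasons))
```

On the four entries copied from the dump, only the two `UNKNOWN` slots are reported; the neighbouring `EXTENT_ITEM` pointers pass all three checks.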
>> key (12884901888 EXTENT_ITEM 24576) block 816693248 (49847) gen 2484931
>> key (14902849536 EXTENT_ITEM 131072) block 75135844352 (4585928) gen
>> 2488739
>>
>>
>> I looked at those numbers for quite a while (also in hex) trying to figure
>> out what had happened (bit flips (it was on an SSD), byte shifts (I
>> also suspected a bad CPU ... because it died 2 months after
>> that)) and tried to guess "correct" values for those items ... but no
>> idea:-(
>
> I'm not so sure about that: unless you're super lucky (or unlucky in this
> case), such corruption would normally get caught by csum first.
>
>>
>> So this is why I asked about that log_root and whether there is a
>> chance to "log-replay things":-)
>
> For your case, definitely not related to log replay.
>
> Thanks,
> Qu
>
>>
>>
>> Thanks
>> M.
>