Linux Btrfs filesystem development
From: "Tomáš Metelka" <tomas.metelka@metaliza.cz>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Broken chunk tree - Was: Mount issue, mount /dev/sdc2: can't read superblock
Date: Sun, 30 Dec 2018 01:48:23 +0100
Message-ID: <07b88bad-e1fa-7485-d410-ee261ace321c@metaliza.cz>
In-Reply-To: <8f59acfd-4d97-86d4-2063-25213e2770d0@gmx.com>

Ok, I've got it:-(

But just a few questions: I've tried (with btrfs-progs v4.19.1) to
recover files through btrfs restore -s -m -S -v -i ... and the following
events occurred:

1) Just 1 "hard" error:
ERROR: cannot map block logical 117058830336 length 1073741824: -2
Error copying data for /mnt/...
(a file whose absence really doesn't pain me :-))

2) For 24 files I got the "too many loops" warning (I mean this: "if
(loops >= 0 && loops++ >= 1024) { ..."). I always answered yes, but
I'm afraid these files are corrupted (at least 2 of them seem corrupted).

How bad is this? Does the error mentioned in #1 mean that it's the
only file which is totally lost? I can live without those 24 + 1 files,
so if #1 and #2 are the only errors then I could say the recovery
was successful ... but I'm afraid things aren't that easy :-)

Thanks
M.


   Tomáš Metelka
   Business & IT Analyst

   Tel: +420 728 627 252
   Email: tomas.metelka@metaliza.cz



On 24. 12. 18 15:19, Qu Wenruo wrote:
> 
> 
> On 2018/12/24 9:52 PM, Tomáš Metelka wrote:
>> On 24. 12. 18 14:02, Qu Wenruo wrote:
>>> btrfs check --readonly output please.
>>>
>>> btrfs check --readonly is always the most reliable and detailed output
>>> for any possible recovery.
>>
>> This is very weird because it prints only:
>> ERROR: cannot open file system
> 
> A new place to enhance ;)
> 
>>
>> I've also tried "btrfs check -r 75152310272" but it only says:
>> parent transid verify failed on 75152310272 wanted 2488742 found 2488741
>> parent transid verify failed on 75152310272 wanted 2488742 found 2488741
>> Ignoring transid failure
>> ERROR: cannot open file system
>>
>> I've tried that because:
>>      backup 3:
>>   backup_tree_root:    75152310272    gen: 2488741 level: 1
>>
>>> Also kernel message for the mount failure could help.
>>
>> Sorry, my fault, I should have started from this point:
>>
>> Dec 23 21:59:07 tisc5 kernel: [10319.442615] BTRFS: device fsid
>> be557007-42c9-4079-be16-568997e94cd9 devid 1 transid 2488742 /dev/loop0
>> Dec 23 22:00:49 tisc5 kernel: [10421.167028] BTRFS info (device loop0):
>> disk space caching is enabled
>> Dec 23 22:00:49 tisc5 kernel: [10421.167034] BTRFS info (device loop0):
>> has skinny extents
>> Dec 23 22:00:50 tisc5 kernel: [10421.807564] BTRFS critical (device
>> loop0): corrupt node: root=1 block=75150311424 slot=245, invalid NULL
>> node pointer
> This explains the problem.
> 
> Your root tree has one node pointer which is not correct.
> A node pointer should never point to 0.
> 
> This is pretty weird, at least it's a corruption pattern I have never seen.
> 
> Since your tree root got corrupted, there isn't much we can do
> other than try to use older tree roots.
> 
> You could try all backup roots, starting from the newest backup (with the
> highest generation), and check each backup root bytenr using:
> # btrfs check -r <backup root bytenr> <device>
> 
> to see which one gives the fewest errors, but normally the chance is near 0.
> 
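The "try every backup root, newest first" step could be scripted roughly as
below. This is only a sketch under some assumptions: the output of
btrfs inspect-internal dump-super -f has already been saved as text, the
backup lines look like the backup_tree_root line quoted earlier in this
thread, and the function names are made up for illustration. It only builds
the btrfs check -r command lines and does not run anything.

```python
import re

# Matches lines such as (taken from the dump-super snippet in this thread):
#   backup_tree_root:    75152310272    gen: 2488741 level: 1
BACKUP_ROOT_RE = re.compile(
    r"backup_tree_root:\s+(\d+)\s+gen:\s+(\d+)\s+level:\s+(\d+)"
)


def backup_roots_newest_first(dump_super_text):
    """Return (bytenr, generation) pairs, highest generation first."""
    roots = [
        (int(m.group(1)), int(m.group(2)))
        for m in BACKUP_ROOT_RE.finditer(dump_super_text)
    ]
    return sorted(roots, key=lambda r: r[1], reverse=True)


def check_commands(dump_super_text, device):
    """Build one 'btrfs check -r <bytenr> <device>' line per backup root."""
    return [
        f"btrfs check -r {bytenr} {device}"
        for bytenr, _gen in backup_roots_newest_first(dump_super_text)
    ]


if __name__ == "__main__":
    # The first line is from this thread; the second is a made-up older
    # backup root, included only to show the ordering.
    sample = (
        "backup_tree_root:\t75152310272\tgen: 2488741\tlevel: 1\n"
        "backup_tree_root:\t75100000000\tgen: 2488740\tlevel: 1\n"
    )
    for cmd in check_commands(sample, "/dev/loop0"):
        print(cmd)
```

Printing the commands instead of executing them leaves the decision of what
to run against the image to the person doing the recovery.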
>> Dec 23 22:00:50 tisc5 kernel: [10421.807653] BTRFS error (device loop0):
>> failed to read block groups: -5
>> Dec 23 22:00:50 tisc5 kernel: [10421.877001] BTRFS error (device loop0):
>> open_ctree failed
>>
>>
>> So I tried to do:
>> 1) btrfs inspect-internal dump-super (with the snippet posted above)
>> 2) btrfs inspect-internal dump-tree -b 75150311424
>>
>> And it showed (header + snippet for items 243-248):
>> node 75150311424 level 1 items 249 free 244 generation 2488741 owner 2
>> fs uuid be557007-42c9-4079-be16-568997e94cd9
>> chunk uuid dbe69c7e-2d50-4001-af31-148c5475b48b
>> ...
>>    key (14799519744 EXTENT_ITEM 4096) block 233423224832 (14247023) gen
>> 2484894
>>    key (14811271168 EXTENT_ITEM 135168) block 656310272 (40058) gen 2488049
> 
> 
>>    key (1505328190277054464 UNKNOWN.4 366981796979539968) block 0 (0) gen 0
>>    key (0 UNKNOWN.0 1419267647995904) block 6468220747776 (394788864) gen
>> 7786775707648
> 
> Pretty obviously, these two nodes are garbage.
> Something corrupted the memory at runtime, and we don't have a runtime
> check against such corruption yet.
> 
> So IMHO the problem is that some kernel code, either btrfs or other
> parts, corrupted the memory.
> Then btrfs failed to detect it and wrote it back to disk, and finally
> the kernel got its chance to read the tree block from disk and
> caught the problem.
> 
> I could add such a check for nodes, but normally it would need
> CONFIG_BTRFS_FS_CHECK_INTEGRITY, so it makes no sense for normal users.
> 
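As an illustration of why those two slots stand out, here is a small sketch
that scans dump-tree key pointer lines for two red flags: a block pointer of
0, and a pointer generation higher than the generation in the node header.
The second rule is my assumption (a child should not be newer than the node
that points to it). This is not the kernel's tree-checker, and the function
name and the first_slot parameter are hypothetical.

```python
import re

# Matches dump-tree key pointer lines such as:
#   key (14811271168 EXTENT_ITEM 135168) block 656310272 (40058) gen 2488049
KEY_PTR_RE = re.compile(
    r"key \((\d+) (\S+) (\d+)\) block (\d+) \(\d+\) gen (\d+)"
)


def suspicious_pointers(dump_tree_text, node_generation, first_slot=0):
    """Return (slot, reasons) for key pointers that look like garbage."""
    # The mail wrapped some lines after "gen", so collapse all whitespace.
    text = " ".join(dump_tree_text.split())
    findings = []
    for offset, m in enumerate(KEY_PTR_RE.finditer(text)):
        block = int(m.group(4))
        gen = int(m.group(5))
        reasons = []
        if block == 0:
            reasons.append("NULL block pointer")
        if gen > node_generation:
            reasons.append("pointer generation newer than node")
        if reasons:
            findings.append((first_slot + offset, reasons))
    return findings


if __name__ == "__main__":
    # Items 243-248 of node 75150311424 (generation 2488741), as posted.
    sample = """
    key (14799519744 EXTENT_ITEM 4096) block 233423224832 (14247023) gen
    2484894
    key (14811271168 EXTENT_ITEM 135168) block 656310272 (40058) gen 2488049
    key (1505328190277054464 UNKNOWN.4 366981796979539968) block 0 (0) gen 0
    key (0 UNKNOWN.0 1419267647995904) block 6468220747776 (394788864) gen
    7786775707648
    key (12884901888 EXTENT_ITEM 24576) block 816693248 (49847) gen 2484931
    key (14902849536 EXTENT_ITEM 131072) block 75135844352 (4585928) gen
    2488739
    """
    for slot, reasons in suspicious_pointers(sample, 2488741, first_slot=243):
        print(slot, ", ".join(reasons))
```

With the posted items, the slot with block 0 is the same slot 245 that the
kernel complained about.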
>>    key (12884901888 EXTENT_ITEM 24576) block 816693248 (49847) gen 2484931
>>    key (14902849536 EXTENT_ITEM 131072) block 75135844352 (4585928) gen
>> 2488739
>>
>>
>> I looked at those numbers for quite a while (also in hex), trying to
>> figure out what had happened (bit flips (it was on an SSD), byte shifts
>> (I also suspected a bad CPU ... because it died 2 months after
>> that)) and tried to guess "correct" values for those items ... but no
>> idea :-(
> 
> I'm not so sure about that: unless you're super lucky (or unlucky in
> this case), such corruption would normally get caught by csum first.
> 
>>
>> So this is why I asked about that log_root and whether there is a
>> chance to "log-replay things" :-)
> 
> In your case, it's definitely not related to log replay.
> 
> Thanks,
> Qu
> 
>>
>>
>> Thanks
>> M.
> 
