* Unable to mount degraded RAID5
@ 2016-07-04 18:09 Tomáš Hrdina
2016-07-04 18:41 ` Chris Murphy
0 siblings, 1 reply; 25+ messages in thread
From: Tomáš Hrdina @ 2016-07-04 18:09 UTC (permalink / raw)
To: linux-btrfs
Hello,
one of my 3 disks failed in RAID5. After that, fs is unable to mount.
Any help on what to try next would be appreciated.
sudo btrfs version
btrfs-progs v4.6.1
-- I installed 4.6.1 just now. I ran rescue on 4.4
uname -a
Linux uncik-srv 4.4.0-24-generic #43-Ubuntu SMP Wed Jun 8 19:27:37 UTC
2016 x86_64 x86_64 x86_64 GNU/Linux
sudo mount -t btrfs -o ro,recovery /dev/sdc /shares
mount: wrong fs type, bad option, bad superblock on /dev/sdc,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so.
sudo btrfs rescue chunk-recover /dev/sdc
Scanning: 2517981163520 in dev0, 3166618750976 in dev1Segmentation fault
(core dumped)
sudo btrfs filesystem show /dev/sda
warning, device 3 is missing
checksum verify failed on 12678831570944 found 3DC57E3E wanted 771D2379
checksum verify failed on 12678831570944 found 3DC57E3E wanted 771D2379
bytenr mismatch, want=12678831570944, have=10160133442474442752
Couldn't read chunk tree
Label: none uuid: 2dab74bb-fc73-4c47-a413-a55840f6f71e
Total devices 3 FS bytes used 3.80TiB
devid 1 size 3.64TiB used 1.92TiB path /dev/sdb
devid 2 size 3.64TiB used 1.92TiB path /dev/sda
*** Some devices missing
sudo btrfs restore /dev/sda /mnt
warning, device 3 is missing
checksum verify failed on 12678831570944 found 3DC57E3E wanted 771D2379
checksum verify failed on 12678831570944 found 3DC57E3E wanted 771D2379
bytenr mismatch, want=12678831570944, have=10160133442474442752
Couldn't read chunk tree
Could not open root, trying backup super
warning, device 3 is missing
warning, device 1 is missing
bytenr mismatch, want=12678831570944, have=0
Couldn't read chunk tree
Could not open root, trying backup super
warning, device 3 is missing
warning, device 1 is missing
bytenr mismatch, want=12678831570944, have=0
Couldn't read chunk tree
Could not open root, trying backup super
sudo btrfs check /dev/sda
warning, device 3 is missing
checksum verify failed on 12678831570944 found 3DC57E3E wanted 771D2379
checksum verify failed on 12678831570944 found 3DC57E3E wanted 771D2379
bytenr mismatch, want=12678831570944, have=10160133442474442752
Couldn't read chunk tree
Couldn't open file system
Log
http://sebsauvage.net/paste/?236c3f6f238dbf26#4kM3tx+CjlA8ke9yMH+gD/QDsnjNnBK2i5Do4CXwD04=
Thank you
Tomas
---
This message was checked for viruses by Avast Antivirus.
https://www.avast.com/antivirus
* Re: Unable to mount degraded RAID5
2016-07-04 18:09 Unable to mount degraded RAID5 Tomáš Hrdina
@ 2016-07-04 18:41 ` Chris Murphy
[not found] ` <95f58623-95a4-b5d2-fa3a-bfb957840a31@gmail.com>
0 siblings, 1 reply; 25+ messages in thread
From: Chris Murphy @ 2016-07-04 18:41 UTC (permalink / raw)
To: Tomáš Hrdina; +Cc: Btrfs BTRFS

On Mon, Jul 4, 2016 at 12:09 PM, Tomáš Hrdina <thomas.rkh@gmail.com> wrote:

> sudo mount -t btrfs -o ro,recovery /dev/sdc /shares
> mount: wrong fs type, bad option, bad superblock on /dev/sdc,
> missing codepage or helper program, or other error
>
> In some cases useful info is found in syslog - try
> dmesg | tail or so.

It needs -o degraded, not recovery.

Do not use 'btrfs replace' on raid5 right now, it seems to be
unreliable. If you do not have a backup of this raid5 I highly
recommend that you mount -o ro,degraded and make a backup now before
you do anything else to the file system. Degraded raid56 is really
fragile on Btrfs, and still broadly considered experimental (or at
least has enough caveats and gotchas that it's really just for expert
usage).

> sudo btrfs rescue chunk-recover /dev/sdc
> Scanning: 2517981163520 in dev0, 3166618750976 in dev1Segmentation fault
> (core dumped)

This is not a good idea. Avoid randomly trying things, especially
things that have absolutely nothing to do with your problem.

> sudo btrfs filesystem show /dev/sda
> warning, device 3 is missing
> checksum verify failed on 12678831570944 found 3DC57E3E wanted 771D2379
> checksum verify failed on 12678831570944 found 3DC57E3E wanted 771D2379
> bytenr mismatch, want=12678831570944, have=10160133442474442752
> Couldn't read chunk tree
> Label: none uuid: 2dab74bb-fc73-4c47-a413-a55840f6f71e
> Total devices 3 FS bytes used 3.80TiB
> devid 1 size 3.64TiB used 1.92TiB path /dev/sdb
> devid 2 size 3.64TiB used 1.92TiB path /dev/sda
> *** Some devices missing
>
> sudo btrfs restore /dev/sda /mnt
> warning, device 3 is missing
> checksum verify failed on 12678831570944 found 3DC57E3E wanted 771D2379
> checksum verify failed on 12678831570944 found 3DC57E3E wanted 771D2379
> bytenr mismatch, want=12678831570944, have=10160133442474442752
> Couldn't read chunk tree
> Could not open root, trying backup super
> warning, device 3 is missing
> warning, device 1 is missing

It's concerning that this says device 1 is missing when 'btrfs fi
show' clearly shows it as not missing. There's a bug here somewhere,
either show is wrong, or restore is wrong. That restore sees two
missing devices means it probably can't reconstruct from parity, and
there will be csum errors. Hopefully that's all that's going on right
now.

--
Chris Murphy
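[Editor's sketch] The advice above (mount read-only and degraded, then back up before touching anything else) could be scripted roughly as follows. This is only a sketch: the function name try_mounts and the MOUNT override variable are made up for illustration, and the device and mount point are the ones from this thread. It tries each option set in the order given and stops at the first one that mounts.

```shell
#!/bin/sh
# MOUNT is a hypothetical override so the loop can be dry-run
# (for example MOUNT=echo); by default the real mount(8) is used.
MOUNT=${MOUNT:-mount}

# try_mounts DEVICE MOUNTPOINT OPTIONS...
# Attempts each comma-separated option set in order, read-only first.
try_mounts() {
    dev=$1
    mnt=$2
    shift 2
    for opts in "$@"; do
        if $MOUNT -t btrfs -o "$opts" "$dev" "$mnt"; then
            echo "mounted with -o $opts"
            return 0
        fi
        # On failure the kernel messages (dmesg | tail) are what the list asks for.
        echo "mount -o $opts failed" >&2
    done
    return 1
}

# Example (run as root), never dropping the ro option:
#   try_mounts /dev/sda /shares ro,degraded
```

If the mount succeeds, the backup should be taken from the read-only mount before any repair tool is run.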
[parent not found: <95f58623-95a4-b5d2-fa3a-bfb957840a31@gmail.com>]
* Re: Unable to mount degraded RAID5
[not found] ` <95f58623-95a4-b5d2-fa3a-bfb957840a31@gmail.com>
@ 2016-07-04 19:01 ` Chris Murphy
2016-07-04 19:11 ` Tomáš Hrdina
0 siblings, 1 reply; 25+ messages in thread
From: Chris Murphy @ 2016-07-04 19:01 UTC (permalink / raw)
To: Tomáš Hrdina, Btrfs BTRFS

On Mon, Jul 4, 2016 at 12:54 PM, Tomáš Hrdina <thomas.rkh@gmail.com> wrote:

> Degraded gives same result:
>
> sudo mount -t btrfs -o ro,degraded /dev/sda /shares
> mount: wrong fs type, bad option, bad superblock on /dev/sda,
> missing codepage or helper program, or other error
>
> In some cases useful info is found in syslog - try
> dmesg | tail or so.

That's bad. What are the kernel messages for the mount attempt? Do the
following and report back all the results.

# btrfs dev scan
# btrfs fi show
# blkid
# btrfs check /dev/sda

Also, make sure you reply to the btrfs list as well, not just to me
personally. And don't top post or it'll annoy some people on the list.

--
Chris Murphy
* Re: Unable to mount degraded RAID5
2016-07-04 19:01 ` Chris Murphy
@ 2016-07-04 19:11 ` Tomáš Hrdina
2016-07-04 20:43 ` Chris Murphy
0 siblings, 1 reply; 25+ messages in thread
From: Tomáš Hrdina @ 2016-07-04 19:11 UTC (permalink / raw)
To: Chris Murphy, Btrfs BTRFS

Result from dmesg:
http://sebsauvage.net/paste/?4e8e95b5eafbf675#ybToBzZ/WAoRjjugeH6N2YXZKEBlswaNI/J41GBmFYU=

sudo btrfs dev scan
Scanning for Btrfs filesystems

sudo btrfs fi show
warning, device 3 is missing
checksum verify failed on 12678831570944 found 3DC57E3E wanted 771D2379
checksum verify failed on 12678831570944 found 3DC57E3E wanted 771D2379
bytenr mismatch, want=12678831570944, have=10160133442474442752
Couldn't read chunk tree
Label: none uuid: 2dab74bb-fc73-4c47-a413-a55840f6f71e
Total devices 3 FS bytes used 3.80TiB
devid 1 size 3.64TiB used 1.92TiB path /dev/sdb
devid 2 size 3.64TiB used 1.92TiB path /dev/sda
*** Some devices missing

sudo blkid
/dev/sda: UUID="2dab74bb-fc73-4c47-a413-a55840f6f71e" UUID_SUB="7262d027-202a-4b29-aaf8-0bb8cf107d4d" TYPE="btrfs"
/dev/sdb: UUID="2dab74bb-fc73-4c47-a413-a55840f6f71e" UUID_SUB="afa4cff8-d037-472e-9eb1-070d331b6a20" TYPE="btrfs"
/dev/sdc1: UUID="6731963f-5bf9-458c-a44d-6f0bbea38357" TYPE="ext2" PARTUUID="ccb00f7e-01"
/dev/sdc5: UUID="522XeT-jWNr-O0oG-HeHV-akQu-AR3P-fzeUAy" TYPE="LVM2_member" PARTUUID="ccb00f7e-05"
/dev/sdd1: LABEL="Verbatim HDD" UUID="70947DE1947DAA6C" TYPE="ntfs" PARTUUID="58334162-01"
/dev/mapper/uncik--srv--vg-root: UUID="15b50e38-d4cc-454c-ba0f-80adbb4cd4e1" TYPE="ext4"
/dev/mapper/uncik--srv--vg-swap_1: UUID="c9db5981-acc3-43bb-a209-a213b80cc9cb" TYPE="swap"

sudo btrfs check /dev/sda
warning, device 3 is missing
checksum verify failed on 12678831570944 found 3DC57E3E wanted 771D2379
checksum verify failed on 12678831570944 found 3DC57E3E wanted 771D2379
bytenr mismatch, want=12678831570944, have=10160133442474442752
Couldn't read chunk tree
Couldn't open file system

Thank you
Tomas
* Re: Unable to mount degraded RAID5
2016-07-04 19:11 ` Tomáš Hrdina
@ 2016-07-04 20:43 ` Chris Murphy
2016-07-04 21:10 ` Tomáš Hrdina
2016-07-05 3:48 ` Andrei Borzenkov
0 siblings, 2 replies; 25+ messages in thread
From: Chris Murphy @ 2016-07-04 20:43 UTC (permalink / raw)
To: Tomáš Hrdina; +Cc: Chris Murphy, Btrfs BTRFS

On Mon, Jul 4, 2016 at 1:11 PM, Tomáš Hrdina <thomas.rkh@gmail.com> wrote:

> Result from dmesg:
> http://sebsauvage.net/paste/?4e8e95b5eafbf675#ybToBzZ/WAoRjjugeH6N2YXZKEBlswaNI/J41GBmFYU=

[10849.041749] BTRFS info (device sda): allowing degraded mounts
[10849.041754] BTRFS info (device sda): disk space caching is enabled
[10849.041756] BTRFS: has skinny extents
[10849.090553] BTRFS error (device sda): bad tree block start 10160120763642806272 12678831570944
[10849.090676] BTRFS error (device sda): bad tree block start 10160120763642806272 12678831570944
[10849.090700] BTRFS: failed to read chunk tree on sda
[10849.100153] BTRFS: open_ctree failed

Try 'mount -o ro,degraded,recovery'

> sudo btrfs check /dev/sda
> warning, device 3 is missing
> checksum verify failed on 12678831570944 found 3DC57E3E wanted 771D2379
> checksum verify failed on 12678831570944 found 3DC57E3E wanted 771D2379
> bytenr mismatch, want=12678831570944, have=10160133442474442752
> Couldn't read chunk tree
> Couldn't open file system

Want and have are way far apart. If the mount command above still
fails then I'd like to see:

# btrfs-show-super -fa /dev/sda
# btrfs-show-super -fa /dev/sdb

Pretty much look for any discrepancies in generation, root and
chunk_root addresses, both in the main part of the super as well as in
the backups.

# btrfs-find-root /dev/sda

Maybe it's possible to use a different tree to get it mounted. I don't
know what happened, but merely a failing device should neither break
checksums nor lose the ability to mount the proper tree; and for sure
one of the backups should work.

Have you done a scrub on this file system and do you know if anything
was fixed or if it always found no problem?

--
Chris Murphy
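[Editor's sketch] Looking for discrepancies between the two superblock dumps can be done by eye, but a small helper makes it less error-prone. The sketch below assumes each dump was saved to a file first (for example with btrfs-show-super -fa /dev/sda > sda.super) and that each field is printed as a name followed by whitespace and a value; the function names and file names are hypothetical. It only compares the first occurrence of each field, which with -a should be the primary superblock.

```shell
#!/bin/sh
# super_field NAME FILE
# Print every value of one field found in a saved superblock dump.
super_field() {
    awk -v f="$1" '$1 == f { print $2 }' "$2"
}

# compare_supers FILE_A FILE_B
# Report the fields named in the thread (plus chunk_root_generation)
# whose first value differs between the two dumps.
compare_supers() {
    for f in generation root chunk_root chunk_root_generation; do
        a=$(super_field "$f" "$1" | head -n 1)
        b=$(super_field "$f" "$2" | head -n 1)
        [ "$a" = "$b" ] || printf '%s differs: %s vs %s\n' "$f" "$a" "$b"
    done
}

# Example:
#   compare_supers sda.super sdb.super
```

No output means the compared fields agree; the backup root entries still need to be inspected separately.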
* Re: Unable to mount degraded RAID5
2016-07-04 20:43 ` Chris Murphy
@ 2016-07-04 21:10 ` Tomáš Hrdina
2016-07-04 22:42 ` Chris Murphy
2016-07-05 3:48 ` Andrei Borzenkov
1 sibling, 1 reply; 25+ messages in thread
From: Tomáš Hrdina @ 2016-07-04 21:10 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS

One disk got reallocated sectors in SMART, so I ran an extended SMART
test and it passed. Then I ran a scrub and it found nothing. Everything
was OK. After that, another extended SMART test started (it is
scheduled weekly), and I think the disk went offline sometime during
it. Another possible problem is that a different disk has the SMART
attribute Reported Uncorrect at 1.

sudo mount -o ro,degraded,recovery /dev/sda /shares
mount: wrong fs type, bad option, bad superblock on /dev/sda,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so.

sudo btrfs-show-super -fa /dev/sda and sdb
http://sebsauvage.net/paste/?39c73a3440b2e903#WZnUJXNFPNz/fFuOK3QquVeOWQUopcCl0JabtuYMWew=

sudo btrfs-find-root /dev/sda
warning, device 3 is missing
Couldn't read chunk tree
ERROR: open ctree failed

Thank you
Tomáš
* Re: Unable to mount degraded RAID5
2016-07-04 21:10 ` Tomáš Hrdina
@ 2016-07-04 22:42 ` Chris Murphy
2016-07-04 22:59 ` Chris Murphy
2016-07-05 7:12 ` Tomáš Hrdina
0 siblings, 2 replies; 25+ messages in thread
From: Chris Murphy @ 2016-07-04 22:42 UTC (permalink / raw)
To: Tomáš Hrdina; +Cc: Chris Murphy, Btrfs BTRFS

On Mon, Jul 4, 2016 at 3:10 PM, Tomáš Hrdina <thomas.rkh@gmail.com> wrote:

> http://sebsauvage.net/paste/?39c73a3440b2e903#WZnUJXNFPNz/fFuOK3QquVeOWQUopcCl0JabtuYMWew=

Both backup 0 and 1 have bad information for backup_fs_root.

backup_fs_root: 0 gen: 0 level: 0

Presumably it automatically tries backup 2 or 3 even though they have
older generations but I'm not sure.

> sudo btrfs-find-root /dev/sda
> warning, device 3 is missing
> Couldn't read chunk tree
> ERROR: open ctree failed

I'm gonna guess the system chunk is bad or damaged somehow and
therefore there's no way to get to the chunk tree.

What do you get for:

# btrfs-debug-tree -d /dev/sda

--
Chris Murphy
* Re: Unable to mount degraded RAID5
2016-07-04 22:42 ` Chris Murphy
@ 2016-07-04 22:59 ` Chris Murphy
2016-07-05 7:12 ` Tomáš Hrdina
1 sibling, 0 replies; 25+ messages in thread
From: Chris Murphy @ 2016-07-04 22:59 UTC (permalink / raw)
To: Chris Murphy; +Cc: Tomáš Hrdina, Btrfs BTRFS

I just tried btrfs rescue chunk-recover (btrfs-progs 4.6) on a new
Btrfs, 3x raid5 with 1 dev missing. I get:

[root@f24s ~]# btrfs rescue chunk-recover /dev/VG/2
Scanning: DONE in dev0, DONE in dev1
open with broken chunk error
Chunk tree recovery failed

So I don't think rescue chunk-recover can work degraded. At least,
it's not working now, and if it isn't meant to work it probably should
fail before it does the scanning, which takes a long time. I filed a
bug:

https://bugzilla.kernel.org/show_bug.cgi?id=121471

But in my case things still continue to work with btrfs-find-root and
degraded mount works OK, so offhand I don't think the rescue
chunk-recover made things worse.

--
Chris Murphy
* Re: Unable to mount degraded RAID5
2016-07-04 22:42 ` Chris Murphy
2016-07-04 22:59 ` Chris Murphy
@ 2016-07-05 7:12 ` Tomáš Hrdina
1 sibling, 0 replies; 25+ messages in thread
From: Tomáš Hrdina @ 2016-07-05 7:12 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS

sudo btrfs-debug-tree -d /dev/sdc
btrfs-progs v4.6.1
warning, device 3 is missing
checksum verify failed on 12678831570944 found 3DC57E3E wanted 771D2379
checksum verify failed on 12678831570944 found 3DC57E3E wanted 771D2379
bytenr mismatch, want=12678831570944, have=10160133442474442752
Couldn't read chunk tree
ERROR: unable to open /dev/sdc

Thank you
Tomas
* Re: Unable to mount degraded RAID5
2016-07-04 20:43 ` Chris Murphy
2016-07-04 21:10 ` Tomáš Hrdina
@ 2016-07-05 3:48 ` Andrei Borzenkov
2016-07-05 15:13 ` Chris Murphy
1 sibling, 1 reply; 25+ messages in thread
From: Andrei Borzenkov @ 2016-07-05 3:48 UTC (permalink / raw)
To: Chris Murphy, Tomáš Hrdina; +Cc: Btrfs BTRFS

04.07.2016 23:43, Chris Murphy wrote:
>
> Have you done a scrub on this file system and do you know if anything
> was fixed or if it always found no problem?
>

scrub on degraded RAID5 cannot fix anything by definition, because even
if scrub finds discrepancies, it does not have enough data to
reconstruct them. I would actually avoid it - the worst that can happen
is that it attempts to replace remaining data with something faked.
* Re: Unable to mount degraded RAID5
2016-07-05 3:48 ` Andrei Borzenkov
@ 2016-07-05 15:13 ` Chris Murphy
2016-07-05 18:40 ` Tomáš Hrdina
0 siblings, 1 reply; 25+ messages in thread
From: Chris Murphy @ 2016-07-05 15:13 UTC (permalink / raw)
To: Andrei Borzenkov; +Cc: Chris Murphy, Tomáš Hrdina, Btrfs BTRFS

On Mon, Jul 4, 2016 at 9:48 PM, Andrei Borzenkov <arvidjaar@gmail.com> wrote:

> 04.07.2016 23:43, Chris Murphy wrote:
>>
>> Have you done a scrub on this file system and do you know if anything
>> was fixed or if it always found no problem?
>>
>
> scrub on degraded RAID5 cannot fix anything by definition,

Right. In this case, he can't mount, so he can't do a scrub. My
concise question could be confusing in another situation as suggesting
he should do a scrub now, but I was asking if he had ever done a
scrub.

I was wondering if maybe he's run into this scrub problem where a data
strip is wrong but gets fixed from good parity and is then promptly
overwritten with wrongly computed parity. That leads to this same kind
of checksum error when degraded, because the wrong parity results in
wrong reconstruction of data. But that's not the case here, it seems.

So, how is it that this healthy, functioning raid5 totally implodes
like this with checksum errors just because of a single device
degraded? There are no device read errors or link resets in the kernel
messages. It seems to be a weakness of the chunk tree again, which at
least Qu has mentioned before.

> because even
> if scrub finds discrepancies, it does not have enough data to
> reconstruct them. I would actually avoid it - the worst that can happen
> is that it attempts to replace remaining data with something faked.

At the moment I would like all of the debugging tools to have a flag
to force ignoring checksum checks. Right now they fail on checksum
mismatch. Instead I'd rather see the output ignoring checksum
mismatches, but somehow indicate suspicious information because of a
checksum mismatch.

--
Chris Murphy
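[Editor's sketch] The point about wrongly computed parity can be illustrated with a toy stripe. This is not Btrfs code: it models a three-device raid5 stripe as two single-byte data strips plus their XOR parity, with arbitrary example values, and shows that a degraded read rebuilds the missing strip correctly only when the stored parity is correct.

```shell
#!/bin/sh
# XOR of two integers, the parity operation used by raid5.
xor() { echo $(( $1 ^ $2 )); }

d1=165   # data strip on the first device (arbitrary example byte)
d2=60    # data strip on the second device (arbitrary example byte)
parity=$(xor "$d1" "$d2")

# Degraded read with good parity: the device holding d2 is gone,
# so d2 is rebuilt from the surviving strip and the parity.
rebuilt=$(xor "$d1" "$parity")

# Same read after the parity was overwritten with a wrong value
# (one flipped bit): the rebuilt strip no longer matches d2, so its
# checksum would fail.
bad_parity=$(( parity ^ 1 ))
rebuilt_bad=$(xor "$d1" "$bad_parity")

echo "original d2:             $d2"
echo "rebuilt with good parity: $rebuilt"
echo "rebuilt with bad parity:  $rebuilt_bad"
```

With all devices present the bad parity goes unnoticed, because data is read directly from the data strips; it only surfaces once a device is missing.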
* Re: Unable to mount degraded RAID5
2016-07-05 15:13 ` Chris Murphy
@ 2016-07-05 18:40 ` Tomáš Hrdina
2016-07-05 23:19 ` Chris Murphy
0 siblings, 1 reply; 25+ messages in thread
From: Tomáš Hrdina @ 2016-07-05 18:40 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS

I don't know if it would be a good idea, but my disk, which
disconnected, is connected again. Maybe it could help in getting data
to the right state, so the other two disks could be mounted alone. But
I don't know if it would stay connected for some work, or if it would
make things even worse.

Thank you
Tomas
* Re: Unable to mount degraded RAID5
2016-07-05 18:40 ` Tomáš Hrdina
@ 2016-07-05 23:19 ` Chris Murphy
2016-07-06 8:07 ` Tomáš Hrdina
0 siblings, 1 reply; 25+ messages in thread
From: Chris Murphy @ 2016-07-05 23:19 UTC (permalink / raw)
To: Tomáš Hrdina; +Cc: Chris Murphy, Btrfs BTRFS

On Tue, Jul 5, 2016 at 12:40 PM, Tomáš Hrdina <thomas.rkh@gmail.com> wrote:

> I don't know if it would be a good idea, but my disk, which
> disconnected, is connected again. Maybe it could help in getting data
> to the right state, so the other two disks could be mounted alone. But
> I don't know if it would stay connected for some work, or if it would
> make things even worse.

I'd stick to the read only commands:

btrfs check
btrfs-debug-tree -d
btrfs-find-root

Also, I'm surprised I didn't ask (seeing as I'm on a rampage about
this these days)

smartctl -l scterc /dev/sdX ## for each drive
smartctl -a /dev/sdX ## for each drive
cat /sys/block/sdX/device/timeout ## for each drive

We should find out if there are bad sectors, and if they can even be
properly corrected by the Btrfs self healing mechanism. The normal
default on Linux prevents this with consumer drives.

--
Chris Murphy
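[Editor's sketch] Running the three per-drive queries by hand is easy to get wrong (note that the sysfs path takes the bare device name, not the /dev path). The helper below, whose name drive_cmds is made up for illustration, only prints the commands for each device so they can be reviewed first; nothing is executed until the output is piped to a shell.

```shell
#!/bin/sh
# drive_cmds DEVICE...
# Print the read-only SMART and timeout queries for each device.
drive_cmds() {
    for dev in "$@"; do
        name=${dev##*/}   # /dev/sda -> sda, as used under /sys/block
        printf 'smartctl -l scterc %s\n' "$dev"
        printf 'smartctl -a %s\n' "$dev"
        printf 'cat /sys/block/%s/device/timeout\n' "$name"
    done
}

# Example: review the list, then run it as root.
#   drive_cmds /dev/sda /dev/sdb /dev/sdc
#   drive_cmds /dev/sda /dev/sdb /dev/sdc | sudo sh -x
```

All three commands only read state, so they are safe to run on a degraded array.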
* Re: Unable to mount degraded RAID5
2016-07-05 23:19 ` Chris Murphy
@ 2016-07-06 8:07 ` Tomáš Hrdina
2016-07-06 16:08 ` Chris Murphy
0 siblings, 1 reply; 25+ messages in thread
From: Tomáš Hrdina @ 2016-07-06 8:07 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS

Now with 3 disks:

sudo btrfs check /dev/sda
parent transid verify failed on 7008807157760 wanted 70175 found 70133
parent transid verify failed on 7008807157760 wanted 70175 found 70133
checksum verify failed on 7008807157760 found F192848C wanted 1571393A
checksum verify failed on 7008807157760 found F192848C wanted 1571393A
bytenr mismatch, want=7008807157760, have=65536
Checking filesystem on /dev/sda
UUID: 2dab74bb-fc73-4c47-a413-a55840f6f71e
checking extents
parent transid verify failed on 7009468874752 wanted 70180 found 70133
parent transid verify failed on 7009468874752 wanted 70180 found 70133
checksum verify failed on 7009468874752 found 2B10421A wanted CFF3FFAC
checksum verify failed on 7009468874752 found 2B10421A wanted CFF3FFAC
bytenr mismatch, want=7009468874752, have=65536
parent transid verify failed on 7008859045888 wanted 70175 found 70133
parent transid verify failed on 7008859045888 wanted 70175 found 70133
checksum verify failed on 7008859045888 found 7313A127 wanted 97F01C91
checksum verify failed on 7008859045888 found 7313A127 wanted 97F01C91
bytenr mismatch, want=7008859045888, have=65536
parent transid verify failed on 7008899547136 wanted 70175 found 70133
parent transid verify failed on 7008899547136 wanted 70175 found 70133
checksum verify failed on 7008899547136 found 2B6F9045 wanted CF8C2DF3
parent transid verify failed on 7008899547136 wanted 70175 found 70133
Ignoring transid failure
leaf parent key incorrect 7008899547136
bad block 7008899547136
Errors found in extent allocation tree or chunk allocation
parent transid verify failed on 7009074167808 wanted 70175 found 70133
parent transid verify failed on 7009074167808 wanted 70175 found 70133
checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46
checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46
bytenr mismatch, want=7009074167808, have=65536

sudo btrfs-debug-tree -d /dev/sdc
http://sebsauvage.net/paste/?d690b2c9d130008d#cni3fnKUZ7Y/oaXm+nsOw0afoWDFXNl26eC+vbJmcRA=

sudo btrfs-find-root /dev/sdc
parent transid verify failed on 7008807157760 wanted 70175 found 70133
parent transid verify failed on 7008807157760 wanted 70175 found 70133
Superblock thinks the generation is 70182
Superblock thinks the level is 1
Found tree root at 6062830010368 gen 70182 level 1
Well block 6062434418688(gen: 70181 level: 1) seems good, but generation/level doesn't match, want gen: 70182 level: 1
Well block 6062497202176(gen: 69186 level: 0) seems good, but generation/level doesn't match, want gen: 70182 level: 1
Well block 6062470332416(gen: 69186 level: 0) seems good, but generation/level doesn't match, want gen: 70182 level: 1

sudo smartctl -l scterc /dev/sda
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-24-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
SCT Error Recovery Control:
Read: Disabled
Write: Disabled

sudo smartctl -l scterc /dev/sdb
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-24-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
SCT Error Recovery Control:
Read: 70 (7.0 seconds)
Write: 70 (7.0 seconds)

sudo smartctl -l scterc /dev/sdc
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-24-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
SCT Error Recovery Control:
Read: Disabled
Write: Disabled

sudo smartctl -a /dev/sdx
http://sebsauvage.net/paste/?aab1d282ceb1e1cf#auxFRkK5GCW8j1gR7mwgzR1z92Qn9oqtc6EEC2C6sEE=

cat /sys/block/sda/device/timeout
30
cat /sys/block/sdb/device/timeout
30
cat /sys/block/sdc/device/timeout
30

Thank you
Tomas
* Re: Unable to mount degraded RAID5
2016-07-06 8:07 ` Tomáš Hrdina
@ 2016-07-06 16:08 ` Chris Murphy
2016-07-06 17:50 ` Tomáš Hrdina
0 siblings, 1 reply; 25+ messages in thread
From: Chris Murphy @ 2016-07-06 16:08 UTC (permalink / raw)
To: Tomáš Hrdina; +Cc: Chris Murphy, Btrfs BTRFS

On Wed, Jul 6, 2016 at 2:07 AM, Tomáš Hrdina <thomas.rkh@gmail.com> wrote:

> Now with 3 disks:
>
> sudo btrfs check /dev/sda
> parent transid verify failed on 7008807157760 wanted 70175 found 70133
> parent transid verify failed on 7008807157760 wanted 70175 found 70133
> checksum verify failed on 7008807157760 found F192848C wanted 1571393A
> checksum verify failed on 7008807157760 found F192848C wanted 1571393A
> bytenr mismatch, want=7008807157760, have=65536
> Checking filesystem on /dev/sda
> UUID: 2dab74bb-fc73-4c47-a413-a55840f6f71e
> checking extents
> parent transid verify failed on 7009468874752 wanted 70180 found 70133
> parent transid verify failed on 7009468874752 wanted 70180 found 70133
> checksum verify failed on 7009468874752 found 2B10421A wanted CFF3FFAC
> checksum verify failed on 7009468874752 found 2B10421A wanted CFF3FFAC
> bytenr mismatch, want=7009468874752, have=65536
> parent transid verify failed on 7008859045888 wanted 70175 found 70133
> parent transid verify failed on 7008859045888 wanted 70175 found 70133
> checksum verify failed on 7008859045888 found 7313A127 wanted 97F01C91
> checksum verify failed on 7008859045888 found 7313A127 wanted 97F01C91
> bytenr mismatch, want=7008859045888, have=65536
> parent transid verify failed on 7008899547136 wanted 70175 found 70133
> parent transid verify failed on 7008899547136 wanted 70175 found 70133
> checksum verify failed on 7008899547136 found 2B6F9045 wanted CF8C2DF3
> parent transid verify failed on 7008899547136 wanted 70175 found 70133
> Ignoring transid failure
> leaf parent key incorrect 7008899547136
> bad block 7008899547136
> Errors found in extent allocation tree or chunk allocation
> parent transid verify failed on 7009074167808 wanted 70175 found 70133
> parent transid verify failed on 7009074167808 wanted 70175 found 70133
> checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46
> checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46
> bytenr mismatch, want=7009074167808, have=65536

OK, much better than before; these all seem sane with a limited number
of problems. Maybe --repair can fix it, but don't do that yet.

> sudo btrfs-debug-tree -d /dev/sdc
> http://sebsauvage.net/paste/?d690b2c9d130008d#cni3fnKUZ7Y/oaXm+nsOw0afoWDFXNl26eC+vbJmcRA=

OK good, so now it finds the chunk tree OK. This is good news. I would
try to mount it ro first, if you need to make or refresh a backup. So
in order:

mount -o ro
mount -o ro,recovery

If those don't work let's see what the user and kernel errors are.

> sudo btrfs-find-root /dev/sdc
> parent transid verify failed on 7008807157760 wanted 70175 found 70133
> parent transid verify failed on 7008807157760 wanted 70175 found 70133
> Superblock thinks the generation is 70182
> Superblock thinks the level is 1
> Found tree root at 6062830010368 gen 70182 level 1
> Well block 6062434418688(gen: 70181 level: 1) seems good, but
> generation/level doesn't match, want gen: 70182 level: 1
> Well block 6062497202176(gen: 69186 level: 0) seems good, but
> generation/level doesn't match, want gen: 70182 level: 1
> Well block 6062470332416(gen: 69186 level: 0) seems good, but
> generation/level doesn't match, want gen: 70182 level: 1

This is also a good sign that you can probably get btrfs rescue to
work and point it to one of these older tree roots, if mount won't
work.
>
>
> sudo smartctl -l scterc /dev/sda
> smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-24-generic] (local build)
> Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
>
> SCT Error Recovery Control:
> Read: Disabled
> Write: Disabled
>
>
> sudo smartctl -l scterc /dev/sdb
> smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-24-generic] (local build)
> Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
>
> SCT Error Recovery Control:
> Read: 70 (7.0 seconds)
> Write: 70 (7.0 seconds)
>
>
> sudo smartctl -l scterc /dev/sdc
> smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-24-generic] (local build)
> Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
>
> SCT Error Recovery Control:
> Read: Disabled
> Write: Disabled

There's good news and bad news. The good news is all the drives
support SCT ERC. The bad news is two of the drives have the wrong
setting for raid1+, including raid5.

Issue:

smartctl -l scterc,70,70 /dev/sdX   #for each drive

This is not a persistent setting. The drive being powered off (maybe
even reset) will revert the setting to drive default. Some people use
a udev rule to set this during startup. I think it can also be done
with a systemd unit. You'd want to specify the drives by id, wwn if
available, so that it's always consistent across boots.

The point of this setting is to force the drive to give up on errors
quickly, allowing Btrfs in this case to be informed of the exact
problem (media error and what sector) so that Btrfs can reconstruct
the data from parity and then fix the bad sector(s). In your current
configuration the fixup can't happen, so problems start to accumulate.
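
A rough sketch of how the per-boot setting described above could be
scripted, for anyone who wants a starting point. Only the two smartctl
invocations come from this thread; the function names and the choice
of the /dev/disk/by-id/wwn-* links are assumptions made for
illustration, and the script has to run as root.

```sh
#!/bin/sh
# Sketch only: re-apply a 7.0 second SCT ERC limit to drives that
# report it as disabled. Function names are hypothetical.

# Reads a `smartctl -l scterc` report on stdin.
# Exit status 0 means the read or write limit is reported as Disabled.
scterc_disabled() {
    grep -Eq '^[[:space:]]*(Read|Write):[[:space:]]+Disabled'
}

# Walks the whole-disk WWN links (an assumed naming scheme that stays
# stable across boots) and sets 70 deciseconds where needed.
apply_scterc() {
    for dev in /dev/disk/by-id/wwn-*; do
        [ -e "$dev" ] || continue                 # glob matched nothing
        case "$dev" in *-part*) continue ;; esac  # skip partition links
        if smartctl -l scterc "$dev" | scterc_disabled; then
            smartctl -l scterc,70,70 "$dev"
        fi
    done
}
```

Calling apply_scterc from a boot-time hook (a udev rule or a oneshot
systemd unit, as mentioned above) would be one way to use it. Drives
that do not support SCT ERC at all are not handled here.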
> sudo smartctl -a /dev/sdx
> http://sebsauvage.net/paste/?aab1d282ceb1e1cf#auxFRkK5GCW8j1gR7mwgzR1z92Qn9oqtc6EEC2C6sEE=

sudo smartctl -a /dev/sda
=== START OF INFORMATION SECTION ===
Model Family: Seagate NAS HDD
Device Model: ST4000VN000-1H4168
Serial Number: Z302YVSZ

5 Reallocated_Sector_Ct 0x0033 089 089 010 Pre-fail Always - 14648

That's too many reallocated sectors. The good news is none are
pending. But for a NAS drive I think this is too high; get it replaced
under warranty. It certainly means that the unrecoverable read spec
for this particular drive is being busted, so they should replace it
without question. It's possible this value is high by a factor of 8 if
they're counting 512 byte logical sectors, where the actual physical
sector is 4096 bytes. So it might not be as big of a problem as it
seems, but it has still busted the spec.

sudo smartctl -a /dev/sdb
=== START OF INFORMATION SECTION ===
Model Family: Seagate NAS HDD
Device Model: ST4000VN000-2AH166
Serial Number: WDH00SM8
LU WWN Device Id: 5 000c50 09bbd3af2

Error 1 occurred at disk power-on lifetime: 453 hours (18 days + 21 hours)
When the command that caused the error occurred, the device was
active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455

This drive has recently experienced an explicit read error. That
probably was fixed by Btrfs 18 days ago; if you have logs going back
that long you'd likely see a fixup for this same sector LBA value.

/dev/sdc looks OK.

What's interesting looking at all smartctl outputs is that all three
are NAS models of Seagate but *two* of them do not have SCT ERC
enabled by default. That is very eyebrow raising as it relates to the
potential spread of misconfigurations of RAID.

Device Model: ST4000VN000-1H4168
Device Model: ST4000VN000-2AH166   ## this one has SCT ERC set to 70 deciseconds
Device Model: ST4000VN000-1H4168

Seems like a bad idea for a NAS drive to default to SCT ERC disabled,
I would expect the overwhelming use case for NAS drives will be raid1,
5, or 6, all of which need SCT ERC enabled. Very weird choice by
Seagate in my opinion.

Anyway, you should enable this on the other two drives. That way there
are fast error recoveries. If it turns out Btrfs can't reconstruct
something upon error, we can deal with that later. The main thing is
you want to get this raid5 as healthy as possible before the
previously failed device fails again, or gets replaced.

--
Chris Murphy
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Unable to mount degraded RAID5
2016-07-06 16:08 ` Chris Murphy
@ 2016-07-06 17:50 ` Tomáš Hrdina
2016-07-06 18:12 ` Chris Murphy
0 siblings, 1 reply; 25+ messages in thread
From: Tomáš Hrdina @ 2016-07-06 17:50 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS
sudo mount -o ro /dev/sdc /shares
mount: wrong fs type, bad option, bad superblock on /dev/sdc,
missing codepage or helper program, or other error

In some cases useful info is found in syslog - try
dmesg | tail or so.

sudo mount -o ro,recovery /dev/sdc /shares
mount: wrong fs type, bad option, bad superblock on /dev/sdc,
missing codepage or helper program, or other error

In some cases useful info is found in syslog - try
dmesg | tail or so.

dmesg
http://sebsauvage.net/paste/?04d1162dc44d7e55#uY0kIaX66o7Kh+TZAGK2T+CKdRk2jorIWM3w5gfXp8I=

Do you want any other log to see?

For all 3 disks:
sudo smartctl -l scterc,70,70 /dev/sdx
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-24-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

SCT Error Recovery Control set to:
Read: 70 (7.0 seconds)
Write: 70 (7.0 seconds)

Thank you
Tomas
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Unable to mount degraded RAID5
2016-07-06 17:50 ` Tomáš Hrdina
@ 2016-07-06 18:12 ` Chris Murphy
2016-07-09 17:30 ` Tomáš Hrdina
0 siblings, 1 reply; 25+ messages in thread
From: Chris Murphy @ 2016-07-06 18:12 UTC (permalink / raw)
To: Tomáš Hrdina; +Cc: Chris Murphy, Btrfs BTRFS
On Wed, Jul 6, 2016 at 11:50 AM, Tomáš Hrdina <thomas.rkh@gmail.com> wrote:
> sudo mount -o ro /dev/sdc /shares
> mount: wrong fs type, bad option, bad superblock on /dev/sdc,
> missing codepage or helper program, or other error
>
> In some cases useful info is found in syslog - try
> dmesg | tail or so.
>
>
> sudo mount -o ro,recovery /dev/sdc /shares
> mount: wrong fs type, bad option, bad superblock on /dev/sdc,
> missing codepage or helper program, or other error
>
> In some cases useful info is found in syslog - try
> dmesg | tail or so.

[ 275.688919] BTRFS error (device sda): parent transid verify failed
on 7008533413888 wanted 70175 found 70132

Looks like the generation is too far back for backup roots. Just for
grins, now that all drives are present, what do you get for

# btrfs rescue super-recover -v /dev/sda

Next I suggest btrfs-image -c9 -t4 and optionally -s to sanitize file
names. And also btrfs-debug-tree (this time no -d) redirected to a
file. These two files can be big, about the size of the used amount of
metadata chunks. These go in the cloud at some point, reference them
in a bugzilla.kernel.org bug report by URL. Expect it to be months
before a dev looks at it.

So now what you want to try to do is use restore.
https://btrfs.wiki.kernel.org/index.php/Restore

You can use the information from btrfs-find-root to give restore a -t
value to try.
For example:

>Found tree root at 6062830010368 gen 70182 level 1
>Well block 6062434418688(gen: 70181 level: 1) seems good, but
>generation/level doesn't match, want gen: 70182 level: 1
>Well block 6062497202176(gen: 69186 level: 0) seems good, but
>generation/level doesn't match, want gen: 70182 level: 1
>Well block 6062470332416(gen: 69186 level: 0) seems good, but
>generation/level doesn't match, want gen: 70182 level: 1

btrfs restore -t 6062830010368 -v -i /dev/sda <pathtowhereyouwantdatatogo>

If that fails totally you can try the next bytenr, for the -t value,
6062434418688. And then the next. Each value down is going backward in
time, so it implies some data loss.

This is not the end. It's just that it's the safest since no changes
to the fs have happened. If you set up some kind of overlay you can be
more aggressive like going right for btrfs check --repair and seeing
if it can fix things, but without the overlay it's possible to totally
break the fs such that even restore won't work.

Once you pretty much have everything important off the volume, you can
get more aggressive with trying to fix it. OR just blow it away and
start over. But I think it's valid to gather as much information about
the file system and try to fix it because the autopsy is the main way
to make Btrfs better.

--
Chris Murphy
^ permalink raw reply [flat|nested] 25+ messages in thread
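
The "try the next bytenr" procedure in the message above could be
sketched roughly as follows. The btrfs-find-root output format and the
btrfs restore flags are the ones shown in this thread; the function
names are hypothetical, and treating a zero exit status from restore
as "good enough" is an assumption, since in practice you would look at
what actually got restored before deciding to stop.

```sh
#!/bin/sh
# Sketch only: feed btrfs-find-root candidates to `btrfs restore -t`,
# newest first, stopping at the first attempt that exits cleanly.

# Reads btrfs-find-root output on stdin and prints one candidate tree
# root bytenr per line, in the order the tool listed them.
find_root_candidates() {
    sed -n \
        -e 's/^Found tree root at \([0-9][0-9]*\) .*/\1/p' \
        -e 's/^Well block \([0-9][0-9]*\)(gen.*/\1/p'
}

# try_restore <device> <destination directory>
try_restore() {
    dev="$1"
    dest="$2"
    btrfs-find-root "$dev" 2>&1 | find_root_candidates |
    while read -r bytenr; do
        echo "trying tree root $bytenr" >&2
        # stdin is redirected so restore cannot swallow the candidate list
        if btrfs restore -t "$bytenr" -v -i "$dev" "$dest" </dev/null; then
            break
        fi
    done
}
```

Restore only reads from the damaged file system, so a loop like this
stays on the safe side of the line drawn above between restore and
check --repair.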
* Re: Unable to mount degraded RAID5
2016-07-06 18:12 ` Chris Murphy
@ 2016-07-09 17:30 ` Tomáš Hrdina
2016-07-09 18:33 ` Chris Murphy
0 siblings, 1 reply; 25+ messages in thread
From: Tomáš Hrdina @ 2016-07-09 17:30 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS
sudo btrfs rescue super-recover -v /dev/sda
All Devices:
Device: id = 1, name = /dev/sdc
Device: id = 2, name = /dev/sdb
Device: id = 3, name = /dev/sda

Before Recovering:
[All good supers]:
device name = /dev/sdc
superblock bytenr = 65536

device name = /dev/sdc
superblock bytenr = 67108864

device name = /dev/sdc
superblock bytenr = 274877906944

device name = /dev/sdb
superblock bytenr = 65536

device name = /dev/sdb
superblock bytenr = 67108864

device name = /dev/sdb
superblock bytenr = 274877906944

device name = /dev/sda
superblock bytenr = 65536

device name = /dev/sda
superblock bytenr = 67108864

device name = /dev/sda
superblock bytenr = 274877906944

[All bad supers]:

All supers are valid, no need to recover

I hope I did it right:

sudo btrfs-image -c9 -t4 /dev/sda /mnt/btrfs-image
parent transid verify failed on 7008807157760 wanted 70175 found 70133
parent transid verify failed on 7008807157760 wanted 70175 found 70133
checksum verify failed on 7008807157760 found F192848C wanted 1571393A
checksum verify failed on 7008807157760 found F192848C wanted 1571393A
bytenr mismatch, want=7008807157760, have=65536
parent transid verify failed on 7009074167808 wanted 70175 found 70133
parent transid verify failed on 7009074167808 wanted 70175 found 70133
checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46
checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46
bytenr mismatch, want=7009074167808, have=65536
Error going to next leaf -5
create failed (Success)

sudo btrfs-debug-tree /dev/sda > /mnt/btrfs-debug-tree 2> /mnt/btrfs-debug-tree-err

The btrfs-debug-tree file is 10MB.
btrfs-debug-tree-err:
http://sebsauvage.net/paste/?12cf2fb771b93bdd#Ajv5gPoxDKjaWExcJnMZLVhcU5wVw77abeZ4tIGTazU=

I used btrfs restore and everything except the newest files was restored.
I can get those files again from the internet, so now it is safe to
make changes to the filesystem and try to repair it.
In the end, I will create a new fs, but I can try to repair this one
first and hopefully gather some helpful information.

Thank you for help...
Tomas
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Unable to mount degraded RAID5 2016-07-09 17:30 ` Tomáš Hrdina @ 2016-07-09 18:33 ` Chris Murphy 2016-07-10 7:01 ` Tomáš Hrdina 0 siblings, 1 reply; 25+ messages in thread From: Chris Murphy @ 2016-07-09 18:33 UTC (permalink / raw) To: Tomáš Hrdina; +Cc: Chris Murphy, Btrfs BTRFS On Sat, Jul 9, 2016 at 11:30 AM, Tomáš Hrdina <thomas.rkh@gmail.com> wrote: > sudo btrfs rescue super-recover -v /dev/sda > All Devices: > Device: id = 1, name = /dev/sdc > Device: id = 2, name = /dev/sdb > Device: id = 3, name = /dev/sda > > Before Recovering: > [All good supers]: > device name = /dev/sdc > superblock bytenr = 65536 > > device name = /dev/sdc > superblock bytenr = 67108864 > > device name = /dev/sdc > superblock bytenr = 274877906944 > > device name = /dev/sdb > superblock bytenr = 65536 > > device name = /dev/sdb > superblock bytenr = 67108864 > > device name = /dev/sdb > superblock bytenr = 274877906944 > > device name = /dev/sda > superblock bytenr = 65536 > > device name = /dev/sda > superblock bytenr = 67108864 > > device name = /dev/sda > superblock bytenr = 274877906944 > > [All bad supers]: > > All supers are valid, no need to recover > > > I hope, a made it right: > > > sudo btrfs-image -c9 -t4 /dev/sda /mnt/btrfs-image > parent transid verify failed on 7008807157760 wanted 70175 found 70133 > parent transid verify failed on 7008807157760 wanted 70175 found 70133 > checksum verify failed on 7008807157760 found F192848C wanted 1571393A > checksum verify failed on 7008807157760 found F192848C wanted 1571393A > bytenr mismatch, want=7008807157760, have=65536 > parent transid verify failed on 7009074167808 wanted 70175 found 70133 > parent transid verify failed on 7009074167808 wanted 70175 found 70133 > checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46 > checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46 > bytenr mismatch, want=7009074167808, have=65536 > Error going to next leaf -5 > create failed (Success) I've always 
found that to be a confusing message at the end. If you've got
something in the realm of the total used amount of metadata block
groups (what you'd see with 'fi df' or 'fi usage'), then it's probably
OK.

>
>
> sudo btrfs-debug-tree /dev/sda > /mnt/btrfs-debug-tree 2>
> /mnt/btrfs-debug-tree-err
>
> btrfs-debug-tree file have 10MB
> btrfs-debug-tree-err
> http://sebsauvage.net/paste/?12cf2fb771b93bdd#Ajv5gPoxDKjaWExcJnMZLVhcU5wVw77abeZ4tIGTazU=

Huh, so not very useful compared to with -d; it must be tripping up on
something. I guess you could try btrfs-progs 4.6.1 and see if you get
any different results; at the least, btrfs-debug-tree shouldn't crash.

>
> I used btrfs restore and everything except the newest files was restored.
> I can get those files again from the internet, so now it is save to do
> changes to filesystem and try to repair it.
> In the end, I will create new fs, but I can try to repair it and
> hopefully gather some helpful information.
>
> Thank you for help...

So now it's an open question what to try and in what order, and I'm
afraid I'm only making educated guesses. Ideally you'd set up some
kind of overlay so that you can try different sequences in a way
that's non-destructive to the original, but just make sure whatever
overlay method you use obscures the original file system from the
kernel or you run into the problem of having the same volume UUID
exposed more than once, which presently can corrupt both copies.

I would try them in the following order, where you try to mount after
each and only try the next one if mount fails.

btrfs check --repair
btrfs check -r <tree root bytenr> --repair

Use the tree root bytenr you used for mostly successful recovery for
'btrfs restore -t', but if that doesn't work then I'd try all the tree
roots that btrfs-find-root reports.

btrfs check --repair --init-csum-tree
btrfs check --repair --init-extent-tree

Those aren't really related to your problem at all, it's just
spaghetti at a wall.
Likewise the following two:

btrfs rescue chunk-recover
btrfs rescue zero-log

In particular there's nothing about your situation that suggests zero
log ought to fix anything. But stranger things have happened.

--
Chris Murphy
^ permalink raw reply [flat|nested] 25+ messages in thread
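
The "run a step, try to mount, only continue if mount fails" sequence
from the message above could be driven by something like the sketch
below. This is illustrative only: the function names are hypothetical,
the last two steps are written as btrfs rescue subcommands (which is
where chunk-recover and zero-log live in btrfs-progs), and every step
writes to the file system, so it only makes sense after the data has
been restored elsewhere or on top of an overlay.

```sh
#!/bin/sh
# Sketch only: walk the repair attempts in the suggested order and
# stop as soon as a read-only mount succeeds.

# repair_steps <device> <tree root bytenr>
# Prints one command per line, in the order given in the thread.
repair_steps() {
    printf '%s\n' \
        "btrfs check --repair $1" \
        "btrfs check -r $2 --repair $1" \
        "btrfs check --repair --init-csum-tree $1" \
        "btrfs check --repair --init-extent-tree $1" \
        "btrfs rescue chunk-recover $1" \
        "btrfs rescue zero-log $1"
}

# run_until_mounts <device> <mount point> <tree root bytenr>
run_until_mounts() {
    dev="$1"
    mnt="$2"
    root="$3"
    # The step list is read from fd 3 so each command keeps its normal stdin.
    while read -r step <&3; do
        echo "running: $step" >&2
        $step || echo "step exited non-zero: $step" >&2
        if mount -o ro "$dev" "$mnt"; then
            echo "mounted after: $step" >&2
            return 0
        fi
    done 3<<EOF
$(repair_steps "$dev" "$root")
EOF
    return 1
}
```

Trying several tree roots for the second step, as suggested above,
would mean calling repair_steps once per bytenr; that is left out to
keep the sketch short.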
* Re: Unable to mount degraded RAID5
2016-07-09 18:33 ` Chris Murphy
@ 2016-07-10 7:01 ` Tomáš Hrdina
2016-07-10 20:08 ` Chris Murphy
0 siblings, 1 reply; 25+ messages in thread
From: Tomáš Hrdina @ 2016-07-10 7:01 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS
After every step, I tried to mount the fs with ro, with ro,recovery
and with ro,degraded,recovery. If that failed, I moved on to the next
step.

sudo btrfs check --repair /dev/sdc
enabling repair mode
parent transid verify failed on 7008807157760 wanted 70175 found 70133
parent transid verify failed on 7008807157760 wanted 70175 found 70133
checksum verify failed on 7008807157760 found F192848C wanted 1571393A
checksum verify failed on 7008807157760 found F192848C wanted 1571393A
bytenr mismatch, want=7008807157760, have=65536
Checking filesystem on /dev/sdc
UUID: 2dab74bb-fc73-4c47-a413-a55840f6f71e
checking extents
parent transid verify failed on 7009468874752 wanted 70180 found 70133
parent transid verify failed on 7009468874752 wanted 70180 found 70133
checksum verify failed on 7009468874752 found 2B10421A wanted CFF3FFAC
checksum verify failed on 7009468874752 found 2B10421A wanted CFF3FFAC
bytenr mismatch, want=7009468874752, have=65536
parent transid verify failed on 7008859045888 wanted 70175 found 70133
parent transid verify failed on 7008859045888 wanted 70175 found 70133
checksum verify failed on 7008859045888 found 7313A127 wanted 97F01C91
checksum verify failed on 7008859045888 found 7313A127 wanted 97F01C91
bytenr mismatch, want=7008859045888, have=65536
parent transid verify failed on 7008899547136 wanted 70175 found 70133
parent transid verify failed on 7008899547136 wanted 70175 found 70133
checksum verify failed on 7008899547136 found 2B6F9045 wanted CF8C2DF3
parent transid verify failed on 7008899547136 wanted 70175 found 70133
Ignoring transid failure
leaf parent key incorrect 7008899547136
bad block 7008899547136
Errors found in extent allocation tree or chunk allocation
parent transid verify failed on
7009074167808 wanted 70175 found 70133 parent transid verify failed on 7009074167808 wanted 70175 found 70133 checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46 checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46 bytenr mismatch, want=7009074167808, have=65536 btrfs check -r <tree root bytenr> --repair I didn't use any bytenr for recovery. Recovery worked without -t. sudo btrfs-find-root /dev/sdc parent transid verify failed on 7008807157760 wanted 70175 found 70133 parent transid verify failed on 7008807157760 wanted 70175 found 70133 Superblock thinks the generation is 70182 Superblock thinks the level is 1 Found tree root at 6062830010368 gen 70182 level 1 Well block 6062434418688(gen: 70181 level: 1) seems good, but generation/level doesn't match, want gen: 70182 level: 1 Well block 6062497202176(gen: 69186 level: 0) seems good, but generation/level doesn't match, want gen: 70182 level: 1 Well block 6062470332416(gen: 69186 level: 0) seems good, but generation/level doesn't match, want gen: 70182 level: 1 sudo btrfs check -r 6062830010368 --repair /dev/sdc enabling repair mode parent transid verify failed on 7008807157760 wanted 70175 found 70133 parent transid verify failed on 7008807157760 wanted 70175 found 70133 checksum verify failed on 7008807157760 found F192848C wanted 1571393A checksum verify failed on 7008807157760 found F192848C wanted 1571393A bytenr mismatch, want=7008807157760, have=65536 Checking filesystem on /dev/sdc UUID: 2dab74bb-fc73-4c47-a413-a55840f6f71e checking extents parent transid verify failed on 7009468874752 wanted 70180 found 70133 parent transid verify failed on 7009468874752 wanted 70180 found 70133 checksum verify failed on 7009468874752 found 2B10421A wanted CFF3FFAC checksum verify failed on 7009468874752 found 2B10421A wanted CFF3FFAC bytenr mismatch, want=7009468874752, have=65536 parent transid verify failed on 7008859045888 wanted 70175 found 70133 parent transid verify failed on 
7008859045888 wanted 70175 found 70133 checksum verify failed on 7008859045888 found 7313A127 wanted 97F01C91 checksum verify failed on 7008859045888 found 7313A127 wanted 97F01C91 bytenr mismatch, want=7008859045888, have=65536 parent transid verify failed on 7008899547136 wanted 70175 found 70133 parent transid verify failed on 7008899547136 wanted 70175 found 70133 checksum verify failed on 7008899547136 found 2B6F9045 wanted CF8C2DF3 parent transid verify failed on 7008899547136 wanted 70175 found 70133 Ignoring transid failure leaf parent key incorrect 7008899547136 bad block 7008899547136 Errors found in extent allocation tree or chunk allocation parent transid verify failed on 7009074167808 wanted 70175 found 70133 parent transid verify failed on 7009074167808 wanted 70175 found 70133 checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46 checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46 bytenr mismatch, want=7009074167808, have=65536 sudo btrfs check -r 6062434418688 --repair /dev/sdc enabling repair mode parent transid verify failed on 6062434418688 wanted 70182 found 70181 parent transid verify failed on 6062434418688 wanted 70182 found 70181 checksum verify failed on 6062434418688 found F868085E wanted 1C8BB5E8 parent transid verify failed on 6062434418688 wanted 70182 found 70181 Ignoring transid failure parent transid verify failed on 7008807157760 wanted 70175 found 70133 parent transid verify failed on 7008807157760 wanted 70175 found 70133 checksum verify failed on 7008807157760 found F192848C wanted 1571393A checksum verify failed on 7008807157760 found F192848C wanted 1571393A bytenr mismatch, want=7008807157760, have=65536 Checking filesystem on /dev/sdc UUID: 2dab74bb-fc73-4c47-a413-a55840f6f71e checking extents parent transid verify failed on 6063240511488 wanted 70175 found 70132 parent transid verify failed on 6063240511488 wanted 70175 found 70132 checksum verify failed on 6063240511488 found E9831D76 
wanted 0D60A0C0
checksum verify failed on 6063240511488 found E9831D76 wanted 0D60A0C0
bytenr mismatch, want=6063240511488, have=65536
parent transid verify failed on 7398301417472 wanted 70180 found 70134
parent transid verify failed on 7398301417472 wanted 70180 found 70134
checksum verify failed on 7398301417472 found B2FA98AB wanted 5619251D
checksum verify failed on 7398301417472 found B2FA98AB wanted 5619251D
bytenr mismatch, want=7398301417472, have=65536
parent transid verify failed on 7398262865920 wanted 70180 found 70133
parent transid verify failed on 7398262865920 wanted 70180 found 70133
checksum verify failed on 7398262865920 found 37B272E5 wanted D351CF53
parent transid verify failed on 7398262865920 wanted 70180 found 70133
Ignoring transid failure
leaf parent key incorrect 7398262865920
parent transid verify failed on 7398398099456 wanted 70180 found 70134
parent transid verify failed on 7398398099456 wanted 70180 found 70134
checksum verify failed on 7398398099456 found 1923B74F wanted FDC00AF9
checksum verify failed on 7398398099456 found 1923B74F wanted FDC00AF9
bytenr mismatch, want=7398398099456, have=65536
parent transid verify failed on 7398398099456 wanted 70180 found 70134
parent transid verify failed on 7398398099456 wanted 70180 found 70134
checksum verify failed on 7398398099456 found 1923B74F wanted FDC00AF9
checksum verify failed on 7398398099456 found 1923B74F wanted FDC00AF9
bytenr mismatch, want=7398398099456, have=65536
parent transid verify failed on 7009449263104 wanted 70180 found 70133
parent transid verify failed on 7009449263104 wanted 70180 found 70133
checksum verify failed on 7009449263104 found AD1A4120 wanted 49F9FC96
checksum verify failed on 7009449263104 found AD1A4120 wanted 49F9FC96
bytenr mismatch, want=7009449263104, have=65536
parent transid verify failed on 7398308003840 wanted 70180 found 70134
parent transid verify failed on 7398308003840 wanted 70180 found 70134
checksum verify failed on 7398308003840 found 9162951D wanted 758128AB
checksum verify failed on 7398308003840 found 9162951D wanted 758128AB
bytenr mismatch, want=7398308003840, have=65536
parent transid verify failed on 7009456766976 wanted 70180 found 70133
parent transid verify failed on 7009456766976 wanted 70180 found 70133
checksum verify failed on 7009456766976 found 0A20BD0C wanted EEC300BA
checksum verify failed on 7009456766976 found 0A20BD0C wanted EEC300BA
bytenr mismatch, want=7009456766976, have=65536
parent transid verify failed on 7398971736064 wanted 70180 found 70134
parent transid verify failed on 7398971736064 wanted 70180 found 70134
checksum verify failed on 7398971736064 found 39868CDB wanted DD65316D
checksum verify failed on 7398971736064 found 39868CDB wanted DD65316D
bytenr mismatch, want=7398971736064, have=65536
parent transid verify failed on 7398171967488 wanted 70180 found 70133
parent transid verify failed on 7398171967488 wanted 70180 found 70133
checksum verify failed on 7398171967488 found 372EF754 wanted D3CD4AE2
checksum verify failed on 7398171967488 found 372EF754 wanted D3CD4AE2
bytenr mismatch, want=7398171967488, have=65536
parent transid verify failed on 7009468596224 wanted 70180 found 70133
parent transid verify failed on 7009468596224 wanted 70180 found 70133
checksum verify failed on 7009468596224 found CE38C9D6 wanted 2ADB7460
parent transid verify failed on 7009468596224 wanted 70180 found 70133
Ignoring transid failure
leaf parent key incorrect 7009468596224
parent transid verify failed on 7398199115776 wanted 70180 found 70133
parent transid verify failed on 7398199115776 wanted 70180 found 70133
checksum verify failed on 7398199115776 found 90F857D8 wanted 741BEA6E
checksum verify failed on 7398199115776 found 90F857D8 wanted 741BEA6E
bytenr mismatch, want=7398199115776, have=65536
parent transid verify failed on 7398207799296 wanted 70180 found 70133
parent transid verify failed on 7398207799296 wanted 70180 found 70133
checksum verify failed on 7398207799296 found 99BAD070 wanted 7D596DC6
checksum verify failed on 7398207799296 found 99BAD070 wanted 7D596DC6
bytenr mismatch, want=7398207799296, have=65536
parent transid verify failed on 7009468874752 wanted 70180 found 70133
parent transid verify failed on 7009468874752 wanted 70180 found 70133
checksum verify failed on 7009468874752 found 2B10421A wanted CFF3FFAC
checksum verify failed on 7009468874752 found 2B10421A wanted CFF3FFAC
bytenr mismatch, want=7009468874752, have=65536
parent transid verify failed on 7008859045888 wanted 70175 found 70133
parent transid verify failed on 7008859045888 wanted 70175 found 70133
checksum verify failed on 7008859045888 found 7313A127 wanted 97F01C91
checksum verify failed on 7008859045888 found 7313A127 wanted 97F01C91
bytenr mismatch, want=7008859045888, have=65536
parent transid verify failed on 7008899547136 wanted 70175 found 70133
parent transid verify failed on 7008899547136 wanted 70175 found 70133
checksum verify failed on 7008899547136 found 2B6F9045 wanted CF8C2DF3
parent transid verify failed on 7008899547136 wanted 70175 found 70133
Ignoring transid failure
leaf parent key incorrect 7008899547136
bad block 7008899547136
Errors found in extent allocation tree or chunk allocation
parent transid verify failed on 7398682050560 wanted 70180 found 70134
parent transid verify failed on 7398682050560 wanted 70180 found 70134
checksum verify failed on 7398682050560 found 3B3B2ADC wanted DFD8976A
checksum verify failed on 7398682050560 found 3B3B2ADC wanted DFD8976A
bytenr mismatch, want=7398682050560, have=65536

sudo btrfs check -r 6062497202176 --repair /dev/sdc
enabling repair mode
parent transid verify failed on 6062497202176 wanted 70182 found 69186
parent transid verify failed on 6062497202176 wanted 70182 found 69186
checksum verify failed on 6062497202176 found 41994FE1 wanted A57AF257
checksum verify failed on 6062497202176 found 41994FE1 wanted A57AF257
bytenr mismatch, want=6062497202176, have=65536
Couldn't read tree root
Couldn't open file system

sudo btrfs check -r 6062470332416 --repair /dev/sdc
enabling repair mode
parent transid verify failed on 6062470332416 wanted 70182 found 69186
parent transid verify failed on 6062470332416 wanted 70182 found 69186
checksum verify failed on 6062470332416 found 46F2EC59 wanted A21151EF
checksum verify failed on 6062470332416 found 46F2EC59 wanted A21151EF
bytenr mismatch, want=6062470332416, have=65536
Couldn't read tree root
Couldn't open file system

sudo btrfs check --repair --init-csum-tree /dev/sdc
enabling repair mode
Creating a new CRC tree
parent transid verify failed on 7008807157760 wanted 70175 found 70133
parent transid verify failed on 7008807157760 wanted 70175 found 70133
checksum verify failed on 7008807157760 found F192848C wanted 1571393A
checksum verify failed on 7008807157760 found F192848C wanted 1571393A
bytenr mismatch, want=7008807157760, have=65536
Checking filesystem on /dev/sdc
UUID: 2dab74bb-fc73-4c47-a413-a55840f6f71e
Reinit crc root
parent transid verify failed on 7009074167808 wanted 70175 found 70133
parent transid verify failed on 7009074167808 wanted 70175 found 70133
checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46
checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46
bytenr mismatch, want=7009074167808, have=65536
parent transid verify failed on 7009074167808 wanted 70175 found 70133
parent transid verify failed on 7009074167808 wanted 70175 found 70133
checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46
checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46
bytenr mismatch, want=7009074167808, have=65536
Unable to find block group for 0
extent-tree.c:289: find_search_start: Assertion `1` failed.
btrfs(btrfs_reserve_extent+0x8f9)[0x45140a]
btrfs(btrfs_alloc_free_block+0x60)[0x451794]
btrfs[0x41d2d5]
btrfs(cmd_check+0xfe8)[0x42d0f5]
btrfs(main+0x155)[0x40a433]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7ffa55325830]
btrfs(_start+0x29)[0x40a029]

sudo btrfs check --repair --init-extent-tree /dev/sdc
enabling repair mode
Checking filesystem on /dev/sdc
UUID: 2dab74bb-fc73-4c47-a413-a55840f6f71e
Creating a new extent tree
Failed to find [6062434598912, 168, 16384]
btrfs unable to find ref byte nr 6062830010368 parent 0 root 1 owner 1 offset 0
Failed to find [6062434615296, 168, 16384]
btrfs unable to find ref byte nr 6062830174208 parent 0 root 1 owner 0 offset 1
parent transid verify failed on 7398212452352 wanted 70180 found 70133
parent transid verify failed on 7398212452352 wanted 70180 found 70133
checksum verify failed on 7398212452352 found B2C4F638 wanted 56274B8E
checksum verify failed on 7398212452352 found B2C4F638 wanted 56274B8E
bytenr mismatch, want=7398212452352, have=65536
Error reading data reloc tree
error resetting the pending balance
transaction.h:42: btrfs_start_transaction: Assertion `fs_info->running_transaction` failed.
btrfs[0x4468e6]
btrfs(close_ctree_fs_info+0x184)[0x448d59]
btrfs(cmd_check+0x3010)[0x42f11d]
btrfs(main+0x155)[0x40a433]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f34be581830]
btrfs(_start+0x29)[0x40a029]

I believe you meant rescue command:
sudo btrfs rescue chunk-recover -v /dev/sdc
All Devices:
Device: id = 1, name = /dev/sdd
Device: id = 3, name = /dev/sdb
Device: id = 2, name = /dev/sdc

Scanning: 2808088981504 in dev0, 2759780548608 in dev1, 2978000535552 in dev2scan chunk headers error
Chunk tree recovery aborted

sudo btrfs rescue zero-log /dev/sdc
parent transid verify failed on 7008807157760 wanted 70175 found 70133
parent transid verify failed on 7008807157760 wanted 70175 found 70133
checksum verify failed on 7008807157760 found F192848C wanted 1571393A
checksum verify failed on 7008807157760 found F192848C wanted 1571393A
bytenr mismatch, want=7008807157760, have=65536
Clearing log on /dev/sdc, previous log_root 0, level 0
parent transid verify failed on 7009074167808 wanted 70175 found 70133
parent transid verify failed on 7009074167808 wanted 70175 found 70133
checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46
checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46
bytenr mismatch, want=7009074167808, have=65536
parent transid verify failed on 7009074167808 wanted 70175 found 70133
parent transid verify failed on 7009074167808 wanted 70175 found 70133
checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46
checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46
bytenr mismatch, want=7009074167808, have=65536
Unable to find block group for 0
extent-tree.c:289: find_search_start: Assertion `1` failed.
btrfs(btrfs_reserve_extent+0x8f9)[0x45140a]
btrfs(btrfs_alloc_free_block+0x60)[0x451794]
btrfs(__btrfs_cow_block+0x1a7)[0x4406dc]
btrfs(btrfs_cow_block+0x102)[0x441161]
btrfs[0x446ce3]
btrfs(btrfs_commit_transaction+0xec)[0x448ac7]
btrfs[0x432d3d]
btrfs(handle_command_group+0x5d)[0x40a2d9]
btrfs(cmd_rescue+0x15)[0x432d71]
btrfs(main+0x155)[0x40a433]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7fb21969b830]
btrfs(_start+0x29)[0x40a029]

So far, no luck. I still can't mount.

Thank you
Tomas

------------------------------------------------------------------------
*From:* Chris Murphy
*Sent:* Saturday, July 09, 2016 8:33PM
*To:* Tomáš Hrdina
*Cc:* Chris Murphy, Btrfs Btrfs
*Subject:* Re: Unable to mount degraded RAID5
btrfs check --repair

---
This message was checked for viruses by Avast Antivirus.
https://www.avast.com/antivirus

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Unable to mount degraded RAID5
2016-07-10 7:01 ` Tomáš Hrdina
@ 2016-07-10 20:08 ` Chris Murphy
2016-07-11 17:17 ` Tomáš Hrdina
0 siblings, 1 reply; 25+ messages in thread
From: Chris Murphy @ 2016-07-10 20:08 UTC (permalink / raw)
To: Tomáš Hrdina; +Cc: Chris Murphy, Btrfs BTRFS

On Sun, Jul 10, 2016 at 1:01 AM, Tomáš Hrdina <thomas.rkh@gmail.com> wrote:
> sudo btrfs check --repair /dev/sdc
> enabling repair mode
> parent transid verify failed on 7008807157760 wanted 70175 found 70133
> parent transid verify failed on 7008807157760 wanted 70175 found 70133
> checksum verify failed on 7008807157760 found F192848C wanted 1571393A
> checksum verify failed on 7008807157760 found F192848C wanted 1571393A
> bytenr mismatch, want=7008807157760, have=65536
> Checking filesystem on /dev/sdc
> UUID: 2dab74bb-fc73-4c47-a413-a55840f6f71e
> checking extents
> parent transid verify failed on 7009468874752 wanted 70180 found 70133
> parent transid verify failed on 7009468874752 wanted 70180 found 70133
> checksum verify failed on 7009468874752 found 2B10421A wanted CFF3FFAC
> checksum verify failed on 7009468874752 found 2B10421A wanted CFF3FFAC
> bytenr mismatch, want=7009468874752, have=65536
> parent transid verify failed on 7008859045888 wanted 70175 found 70133
> parent transid verify failed on 7008859045888 wanted 70175 found 70133
> checksum verify failed on 7008859045888 found 7313A127 wanted 97F01C91
> checksum verify failed on 7008859045888 found 7313A127 wanted 97F01C91
> bytenr mismatch, want=7008859045888, have=65536
> parent transid verify failed on 7008899547136 wanted 70175 found 70133
> parent transid verify failed on 7008899547136 wanted 70175 found 70133
> checksum verify failed on 7008899547136 found 2B6F9045 wanted CF8C2DF3
> parent transid verify failed on 7008899547136 wanted 70175 found 70133
> Ignoring transid failure
> leaf parent key incorrect 7008899547136
> bad block 7008899547136
> Errors found in extent allocation tree or chunk allocation
> parent transid verify failed on 7009074167808 wanted 70175 found 70133
> parent transid verify failed on 7009074167808 wanted 70175 found 70133
> checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46
> checksum verify failed on 7009074167808 found FDA6D1F0 wanted 19456C46
> bytenr mismatch, want=7009074167808, have=65536

OK well it was all a goose chase then. These are all the same messages
from 4 days ago also.

The central problem appears to be checksum verification failures on
multiple blocks, which really doesn't make sense to me because it
should be able to reconstruct from parity. How is it possible to have
four root trees, all of which point to different leaves/nodes, all of
which have some kind of checksum failure? None of them are good? And
none of them can be reconstructed? Sounds fishy.

You could try plugging each of those bytenrs into btrfs-debug-tree -b
<bytenr> and see if it'll show you what leaf information is there that
it doesn't like. But if there's a csum mismatch, it may refuse to show
anything, rather than show it and say it's unreliable due to csum
mismatch.

If it refuses to show it, you could plug each of those failed bytenrs
into btrfs-map-logical -l <bytenr> and get a device and physical
sector. Then you can get the entire leaf, compute a new csum and
overwrite the current one. That way it now passes csum, and you can see
if that's the only problem or if there's another brick wall later. Of
course, if the csum was correct, and it's the metadata that's bad,
honoring bad metadata as valid might cause a bad fix and then the whole
thing implodes. But you're pretty much there already I'd say.

If I were to pick an address to start with, it'd be this one.

> leaf parent key incorrect 7008899547136
> bad block 7008899547136

But other than that, I'm out of ideas. It's completely reasonable to
just give up at this point.

--
Chris Murphy

^ permalink raw reply [flat|nested] 25+ messages in thread
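Note on the "compute a new csum" step in the message above: a minimal
sketch of what that computation looks like, assuming the default crc32c
checksum type and a tree block that has already been read into memory
(16384 bytes here, the block size that appears in the check output
earlier in the thread). The function names are hypothetical and are not
part of btrfs-progs. The first 32 bytes of a tree block are reserved for
the checksum, which is computed over the rest of the block and stored
little-endian in the first 4 bytes. The sketch only builds a patched
copy in memory; it writes nothing to disk.

```python
import struct

CSUM_SIZE = 32  # bytes reserved for the checksum at the start of a tree block


def _make_table():
    # Byte-wise lookup table for CRC-32C (reflected polynomial 0x82F63B78).
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data: bytes) -> int:
    """Standard CRC-32C: initial value 0xFFFFFFFF, final XOR 0xFFFFFFFF."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def tree_block_csum(block: bytes) -> int:
    """Checksum of everything after the 32-byte checksum field."""
    return crc32c(block[CSUM_SIZE:])


def with_fresh_csum(block: bytes) -> bytes:
    """Return a copy of the block whose stored csum matches its contents."""
    csum = struct.pack("<I", tree_block_csum(block))
    # Only the first 4 bytes change; the rest of the block is kept as-is.
    return csum + block[len(csum):]
```

As the message above warns, making a block pass its checksum says
nothing about whether the metadata inside it is valid.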
* Re: Unable to mount degraded RAID5
2016-07-10 20:08 ` Chris Murphy
@ 2016-07-11 17:17 ` Tomáš Hrdina
2016-07-11 19:25 ` Chris Murphy
0 siblings, 1 reply; 25+ messages in thread
From: Tomáš Hrdina @ 2016-07-11 17:17 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS

sudo btrfs-debug-tree /dev/sdc
It has 200000 lines. Don't know what you use for bigger files.

sudo btrfs-debug-tree -b 6062434418688 /dev/sdc
http://sebsauvage.net/paste/?fc156dee1d1deb3b#YpG/TA0H3I313jMuC4pgsdj++TcuDaFwWIBeuuOXfCA=

sudo btrfs-debug-tree -b 6062497202176 /dev/sdc
http://sebsauvage.net/paste/?86621abec9c239bd#kwTpZ7BZLcLw71yCfr3jHKZT08zsaXK3RgdFo7MFoFc=

sudo btrfs-debug-tree -b 6062470332416 /dev/sdc
http://sebsauvage.net/paste/?4ff40fa0b6b201c9#nFk7pT9MLj2w9egUJlgXdkmCkWyp1vSG0kADfq3J7eA=

It got some results, but I don't know what to look for.

sudo btrfs-map-logical -l 7008899547136 /dev/sdc
parent transid verify failed on 7008807157760 wanted 70175 found 70133
parent transid verify failed on 7008807157760 wanted 70175 found 70133
checksum verify failed on 7008807157760 found F192848C wanted 1571393A
checksum verify failed on 7008807157760 found F192848C wanted 1571393A
bytenr mismatch, want=7008807157760, have=65536
mirror 1 logical 7008899547136 physical 735226609664 device /dev/sdb
mirror 2 logical 7008899547136 physical 3166748524544 device /dev/sdc

Also I don't know what to do with this. How to compute new csum.

For me, it would be ok to give up and just start fresh.

Thank you
Tomas

------------------------------------------------------------------------
*From:* Chris Murphy
*Sent:* Sunday, July 10, 2016 10:08PM
*To:* Tomáš Hrdina
*Cc:* Chris Murphy, Btrfs Btrfs
*Subject:* Re: Unable to mount degraded RAID5
btrfs-debug-tree -b

---
This message was checked for viruses by Avast Antivirus.
https://www.avast.com/antivirus

^ permalink raw reply [flat|nested] 25+ messages in thread
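Note on the btrfs-map-logical output in the message above: a minimal
sketch of how the two "mirror" lines could be turned into a raw read of
the leaf. It assumes output lines of exactly the form shown above and a
16384-byte nodesize; parse_map_logical and read_block are hypothetical
helper names, not btrfs-progs functions. Whether the physical offsets
reported for a degraded raid5 filesystem with a damaged chunk tree can
be trusted is a separate question that this sketch does not answer. It
only reads; nothing is written to the devices.

```python
import re

# Matches lines such as:
# mirror 1 logical 7008899547136 physical 735226609664 device /dev/sdb
MIRROR_RE = re.compile(
    r"^mirror (\d+) logical (\d+) physical (\d+) device (\S+)$"
)

SAMPLE = """\
bytenr mismatch, want=7008807157760, have=65536
mirror 1 logical 7008899547136 physical 735226609664 device /dev/sdb
mirror 2 logical 7008899547136 physical 3166748524544 device /dev/sdc
"""


def parse_map_logical(output: str):
    """Return one dict per 'mirror' line, skipping warnings and errors."""
    mirrors = []
    for line in output.splitlines():
        match = MIRROR_RE.match(line.strip())
        if match:
            mirrors.append({
                "mirror": int(match.group(1)),
                "logical": int(match.group(2)),
                "physical": int(match.group(3)),
                "device": match.group(4),
            })
    return mirrors


def read_block(device: str, physical: int, nodesize: int = 16384) -> bytes:
    """Read one tree block from a device (needs read access, e.g. root)."""
    with open(device, "rb") as dev:
        dev.seek(physical)
        return dev.read(nodesize)


if __name__ == "__main__":
    for entry in parse_map_logical(SAMPLE):
        print(entry["device"], entry["physical"])
```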
* Re: Unable to mount degraded RAID5
2016-07-11 17:17 ` Tomáš Hrdina
@ 2016-07-11 19:25 ` Chris Murphy
0 siblings, 0 replies; 25+ messages in thread
From: Chris Murphy @ 2016-07-11 19:25 UTC (permalink / raw)
To: Tomáš Hrdina; +Cc: Chris Murphy, Btrfs BTRFS

On Mon, Jul 11, 2016 at 11:17 AM, Tomáš Hrdina <thomas.rkh@gmail.com> wrote:
> sudo btrfs-debug-tree /dev/sdc
> It has 200000 lines. Don't know what you use for bigger files.
>
>
> sudo btrfs-debug-tree -b 6062434418688 /dev/sdc
> http://sebsauvage.net/paste/?fc156dee1d1deb3b#YpG/TA0H3I313jMuC4pgsdj++TcuDaFwWIBeuuOXfCA=
>
> sudo btrfs-debug-tree -b 6062497202176 /dev/sdc
> http://sebsauvage.net/paste/?86621abec9c239bd#kwTpZ7BZLcLw71yCfr3jHKZT08zsaXK3RgdFo7MFoFc=
>
>
> sudo btrfs-debug-tree -b 6062470332416 /dev/sdc
> http://sebsauvage.net/paste/?4ff40fa0b6b201c9#nFk7pT9MLj2w9egUJlgXdkmCkWyp1vSG0kADfq3J7eA=

None of these have anything useful in them; there's no tree root there.

>
> It got some results, but I don't know what to look for.
>
>
> sudo btrfs-map-logical -l 7008899547136 /dev/sdc
> parent transid verify failed on 7008807157760 wanted 70175 found 70133
> parent transid verify failed on 7008807157760 wanted 70175 found 70133
> checksum verify failed on 7008807157760 found F192848C wanted 1571393A
> checksum verify failed on 7008807157760 found F192848C wanted 1571393A
> bytenr mismatch, want=7008807157760, have=65536
> mirror 1 logical 7008899547136 physical 735226609664 device /dev/sdb
> mirror 2 logical 7008899547136 physical 3166748524544 device /dev/sdc
>
>
> Also I don't know what to do with this. How to compute new csum.

Right, that's pretty tricky to do manually.

>
> For me, it would be ok to give up and just start fresh.

OK in that case do that.

--
Chris Murphy

^ permalink raw reply [flat|nested] 25+ messages in thread
[parent not found: <CAFDLS-CtnVDtD8d=Wtp0tVokKJ6pjptpX7MR862dThBJvSPC5g@mail.gmail.com>]
* Fwd: Unable to mount degraded RAID5
[not found] <CAFDLS-CtnVDtD8d=Wtp0tVokKJ6pjptpX7MR862dThBJvSPC5g@mail.gmail.com>
@ 2016-07-06 17:12 ` Gonzalo Gomez-Arrue Azpiazu
2016-07-06 18:19 ` Chris Murphy
0 siblings, 1 reply; 25+ messages in thread
From: Gonzalo Gomez-Arrue Azpiazu @ 2016-07-06 17:12 UTC (permalink / raw)
To: linux-btrfs

Hello,

I had a RAID5 with 3 disks and one failed; now the filesystem cannot be mounted.

None of the recommendations that I found seem to work. The situation
seems to be similar to this one:
http://www.spinics.net/lists/linux-btrfs/msg56825.html

Any suggestion on what to try next?

Thanks a lot beforehand!

sudo btrfs version
btrfs-progs v4.4

uname -a
Linux ubuntu 4.4.0-21-generic #37-Ubuntu SMP Mon Apr 18 18:33:37 UTC
2016 x86_64 x86_64 x86_64 GNU/Linux

sudo btrfs fi show
warning, device 2 is missing
checksum verify failed on 2339175972864 found A781ADC2 wanted 43621074
checksum verify failed on 2339175972864 found A781ADC2 wanted 43621074
bytenr mismatch, want=2339175972864, have=65536
Couldn't read chunk root
Label: none uuid: 495efbc6-2f62-4cd7-962b-7ae3d0e929f1
Total devices 3 FS bytes used 1.29TiB
devid 1 size 2.73TiB used 674.03GiB path /dev/sdc1
devid 3 size 2.73TiB used 674.03GiB path /dev/sdd1
*** Some devices missing

sudo mount -t btrfs -o ro,degraded,recovery /dev/sdc1 /btrfs
mount: wrong fs type, bad option, bad superblock on /dev/sdc1,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so.
dmesg | tail
[ 2440.036368] BTRFS info (device sdd1): allowing degraded mounts
[ 2440.036383] BTRFS info (device sdd1): enabling auto recovery
[ 2440.036390] BTRFS info (device sdd1): disk space caching is enabled
[ 2440.037928] BTRFS warning (device sdd1): devid 2 uuid 0c7d7db2-6a27-4b19-937b-b6266ba81257 is missing
[ 2440.652085] BTRFS info (device sdd1): bdev (null) errs: wr 1413, rd 362, flush 471, corrupt 0, gen 0
[ 2441.359066] BTRFS error (device sdd1): bad tree block start 0 833766391808
[ 2441.359306] BTRFS error (device sdd1): bad tree block start 0 833766391808
[ 2441.359330] BTRFS: Failed to read block groups: -5
[ 2441.383793] BTRFS: open_ctree failed

sudo btrfs restore /dev/sdc1 /bkp
warning, device 2 is missing
checksum verify failed on 2339175972864 found A781ADC2 wanted 43621074
checksum verify failed on 2339175972864 found A781ADC2 wanted 43621074
bytenr mismatch, want=2339175972864, have=65536
Couldn't read chunk root
Could not open root, trying backup super
warning, device 2 is missing
warning, device 3 is missing
checksum verify failed on 2339175972864 found A781ADC2 wanted 43621074
checksum verify failed on 2339175972864 found A781ADC2 wanted 43621074
bytenr mismatch, want=2339175972864, have=65536
Couldn't read chunk root
Could not open root, trying backup super
warning, device 2 is missing
warning, device 3 is missing
checksum verify failed on 2339175972864 found A781ADC2 wanted 43621074
checksum verify failed on 2339175972864 found A781ADC2 wanted 43621074
bytenr mismatch, want=2339175972864, have=65536
Couldn't read chunk root
Could not open root, trying backup super

sudo btrfs-show-super -fa /dev/sdc1
http://sebsauvage.net/paste/?d79e9e9c385cf1a5#fNwoEj5o2aQ6T7nDl4vjrFqEJG0SHeVpmGknbbCVnd0=

sudo btrfs-find-root /dev/sdc1
warning, device 2 is missing
Couldn't read chunk root
Open ctree failed

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Unable to mount degraded RAID5
2016-07-06 17:12 ` Fwd: " Gonzalo Gomez-Arrue Azpiazu
@ 2016-07-06 18:19 ` Chris Murphy
2016-07-07 12:24 ` Gonzalo Gomez-Arrue Azpiazu
0 siblings, 1 reply; 25+ messages in thread
From: Chris Murphy @ 2016-07-06 18:19 UTC (permalink / raw)
To: Gonzalo Gomez-Arrue Azpiazu; +Cc: Btrfs BTRFS

On Wed, Jul 6, 2016 at 11:12 AM, Gonzalo Gomez-Arrue Azpiazu
<ggomarr@gmail.com> wrote:
> Hello,
>
> I had a RAID5 with 3 disks and one failed; now the filesystem cannot be mounted.
>
> None of the recommendations that I found seem to work. The situation
> seems to be similar to this one:
> http://www.spinics.net/lists/linux-btrfs/msg56825.html
>
> Any suggestion on what to try next?

Basically if you are degraded *and* it runs into additional errors,
then it's broken because raid5 only protects against one device error.
The main problem is if it can't read the chunk root it's hard for any
tool to recover data because the chunk tree mapping is vital to
finding data.

What do you get for:
btrfs rescue super-recover -v /dev/sdc1

It's a problem with the chunk tree because all of your super blocks
point to the same chunk tree root so there isn't another one to try.

>sudo btrfs-find-root /dev/sdc1
>warning, device 2 is missing
>Couldn't read chunk root
>Open ctree failed

It's bad news. I'm not even sure 'btrfs restore' can help this case.


--
Chris Murphy

^ permalink raw reply [flat|nested] 25+ messages in thread
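Note on "raid5 only protects against one device error" in the message
above: a minimal sketch of single-parity reconstruction, assuming a
simplified 3-device stripe with two data strips and one XOR parity
strip. The strip contents are made-up placeholders, and this shows the
principle only, not how btrfs lays out or verifies its stripes.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length strips byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))


# Two data strips and their parity, one strip per device.
d0 = b"chunk tree "
d1 = b"extent tree"
parity = xor_bytes(d0, d1)

# Case 1: the device holding d1 is missing. The surviving data strip
# and the parity strip are enough to rebuild it.
rebuilt_d1 = xor_bytes(d0, parity)

# Case 2: the device holding d1 is missing *and* d0 comes back damaged.
# The same XOR now yields garbage, and nothing is left to correct it.
damaged_d0 = b"CHUNK tree "
wrong_d1 = xor_bytes(damaged_d0, parity)

print(rebuilt_d1 == d1)  # reconstruction works with one failure
print(wrong_d1 == d1)    # and fails once a second error is present
```

With one strip gone, every remaining strip has to be intact; a checksum
can detect the damage in case 2 but parity alone cannot repair it.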
* Re: Unable to mount degraded RAID5
2016-07-06 18:19 ` Chris Murphy
@ 2016-07-07 12:24 ` Gonzalo Gomez-Arrue Azpiazu
0 siblings, 0 replies; 25+ messages in thread
From: Gonzalo Gomez-Arrue Azpiazu @ 2016-07-07 12:24 UTC (permalink / raw)
To: linux-btrfs

Thanks a lot, your willingness to help out someone you do not know (and
who is obviously way out of his depth) is inspiring.

This is what it says:

btrfs rescue super-recover -v /dev/sdc1
All Devices:
Device: id = 3, name = /dev/sdd1
Device: id = 1, name = /dev/sdc1

Before Recovering:
[All good supers]:
device name = /dev/sdd1
superblock bytenr = 65536

device name = /dev/sdd1
superblock bytenr = 67108864

device name = /dev/sdd1
superblock bytenr = 274877906944

device name = /dev/sdc1
superblock bytenr = 65536

device name = /dev/sdc1
superblock bytenr = 67108864

device name = /dev/sdc1
superblock bytenr = 274877906944

[All bad supers]:

All supers are valid, no need to recover

Any suggestion on what to do next? (again, really appreciated - I hope
to be able to give back the support I am receiving at some point!)

On Wed, Jul 6, 2016 at 9:19 PM, Chris Murphy <lists@colorremedies.com> wrote:
> On Wed, Jul 6, 2016 at 11:12 AM, Gonzalo Gomez-Arrue Azpiazu
> <ggomarr@gmail.com> wrote:
>> Hello,
>>
>> I had a RAID5 with 3 disks and one failed; now the filesystem cannot be mounted.
>>
>> None of the recommendations that I found seem to work. The situation
>> seems to be similar to this one:
>> http://www.spinics.net/lists/linux-btrfs/msg56825.html
>>
>> Any suggestion on what to try next?
>
> Basically if you are degraded *and* it runs into additional errors,
> then it's broken because raid5 only protects against one device error.
> The main problem is if it can't read the chunk root it's hard for any
> tool to recover data because the chunk tree mapping is vital to
> finding data.
>
> What do you get for:
> btrfs rescue super-recover -v /dev/sdc1
>
> It's a problem with the chunk tree because all of your super blocks
> point to the same chunk tree root so there isn't another one to try.
>
>>sudo btrfs-find-root /dev/sdc1
>>warning, device 2 is missing
>>Couldn't read chunk root
>>Open ctree failed
>
> It's bad news. I'm not even sure 'btrfs restore' can help this case.
>
>
> --
> Chris Murphy

^ permalink raw reply [flat|nested] 25+ messages in thread
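Note on the "superblock bytenr" values in the super-recover output
above: they are the fixed locations where btrfs keeps its superblock
copies, 64 KiB, 64 MiB and 256 GiB into each device. A small arithmetic
check, with a hypothetical helper (copies_that_fit is not a btrfs-progs
function) showing which copies a device of a given size has room for,
assuming a 4096-byte superblock:

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

# Fixed byte offsets of the primary superblock and its two mirrors.
SUPERBLOCK_OFFSETS = [64 * KIB, 64 * MIB, 256 * GIB]


def copies_that_fit(device_size: int, sb_size: int = 4096):
    """Offsets of the superblock copies that fit on a device this large."""
    return [off for off in SUPERBLOCK_OFFSETS if off + sb_size <= device_size]


if __name__ == "__main__":
    print(SUPERBLOCK_OFFSETS)
    # A hypothetical 100 GiB device only has room for the first two copies.
    print(copies_that_fit(100 * GIB))
```

All six copies being reported as good only says the superblocks are
intact; as the quoted reply explains, they all point at the same
unreadable chunk tree root.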
Thread overview: 25+ messages
2016-07-04 18:09 Unable to mount degraded RAID5 Tomáš Hrdina
2016-07-04 18:41 ` Chris Murphy
[not found] ` <95f58623-95a4-b5d2-fa3a-bfb957840a31@gmail.com>
2016-07-04 19:01 ` Chris Murphy
2016-07-04 19:11 ` Tomáš Hrdina
2016-07-04 20:43 ` Chris Murphy
2016-07-04 21:10 ` Tomáš Hrdina
2016-07-04 22:42 ` Chris Murphy
2016-07-04 22:59 ` Chris Murphy
2016-07-05 7:12 ` Tomáš Hrdina
2016-07-05 3:48 ` Andrei Borzenkov
2016-07-05 15:13 ` Chris Murphy
2016-07-05 18:40 ` Tomáš Hrdina
2016-07-05 23:19 ` Chris Murphy
2016-07-06 8:07 ` Tomáš Hrdina
2016-07-06 16:08 ` Chris Murphy
2016-07-06 17:50 ` Tomáš Hrdina
2016-07-06 18:12 ` Chris Murphy
2016-07-09 17:30 ` Tomáš Hrdina
2016-07-09 18:33 ` Chris Murphy
2016-07-10 7:01 ` Tomáš Hrdina
2016-07-10 20:08 ` Chris Murphy
2016-07-11 17:17 ` Tomáš Hrdina
2016-07-11 19:25 ` Chris Murphy
[not found] <CAFDLS-CtnVDtD8d=Wtp0tVokKJ6pjptpX7MR862dThBJvSPC5g@mail.gmail.com>
2016-07-06 17:12 ` Fwd: " Gonzalo Gomez-Arrue Azpiazu
2016-07-06 18:19 ` Chris Murphy
2016-07-07 12:24 ` Gonzalo Gomez-Arrue Azpiazu