* Help Recovering BTRFS array
From: grondinm @ 2017-09-18 17:14 UTC
To: linux-btrfs
Hello,
I will try to provide all the information pertinent to the situation I find myself in.
Yesterday, while trying to write some data to a BTRFS filesystem sitting on top of an mdadm raid5 array encrypted with dm-crypt (four 1TB HDDs), my system became unresponsive and I had no choice but to hard reset. The system came back up without a problem and the array in question mounted without complaint. Once I tried to write data to it again, however, the system became unresponsive again and required another hard reset. Again the system came back up and everything mounted with no complaints.
This time I decided to run some checks. I ran a raid check by issuing 'echo check > /sys/block/md0/md/sync_action'; this completed without a single error. So I performed a proper restart, just because, and once the system came back up I initiated a scrub on the btrfs filesystem.
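(I don't have the exact terminal history, but the sequence was roughly the following; mismatch_cnt is how I read the check result, and it showed no mismatches:)
echo check > /sys/block/md0/md/sync_action    # md-level consistency check
cat /sys/block/md0/md/mismatch_cnt            # 0, i.e. the check found nothing
btrfs scrub start /media/Storage2             # btrfs-level scrub, after the reboot
Checking the scrub's status gave me my first indication that something was wrong: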
btrfs sc stat /media/Storage2
scrub status for e5bd5cf3-c736-48ff-b1c6-c9f678567788
scrub started at Mon Sep 18 06:05:21 2017, running for 07:40:47
total bytes scrubbed: 1.03TiB with 1 errors
error details: super=1
corrected errors: 0, uncorrectable errors: 0, unverified errors: 0
I was concerned, but since it was still scrubbing I left it. Now things look really bleak...
Every few minutes the scrub process goes into a D state as shown by htop; it eventually keeps going and, as far as I can see, is still scrubbing (slowly). I decided to check something else (based on the error above): I ran btrfs inspect-internal dump-super -a -f /dev/md0, which gave me this:
superblock: bytenr=65536, device=/dev/md0
---------------------------------------------------------
ERROR: bad magic on superblock on /dev/md0 at 65536
superblock: bytenr=67108864, device=/dev/md0
---------------------------------------------------------
ERROR: bad magic on superblock on /dev/md0 at 67108864
superblock: bytenr=274877906944, device=/dev/md0
---------------------------------------------------------
ERROR: bad magic on superblock on /dev/md0 at 274877906944
Now I'm really panicked. Is the FS toast? Can any recovery be attempted?
Here is the output of dump-super with the -F option (which forces a dump even when the superblock magic doesn't match):
superblock: bytenr=65536, device=/dev/md0
---------------------------------------------------------
csum_type 43668 (INVALID)
csum_size 32
csum 0x76c647b04abf1057f04e40d1dc52522397258064b98a1b8f6aa6934c74c0dd55 [DON'T MATCH]
bytenr 6376050623103086821
flags 0x7edcc412b742c79f
( WRITTEN |
RELOC |
METADUMP |
unknown flag: 0x7edcc410b742c79c )
magic ..l~...q [DON'T MATCH]
fsid 2cf827fa-7ab8-e290-b152-1735c2735a37
label .a.9.@.=....4.#.|.D...]..dh=d....,..k..n..~.5.....i.8...(.._.tl.a......1sX@..2..Qi....dJ.>Hy.U......{X5.....kG0.)....t..;..../.2...@.T.|.u.<.`!........J*9./....8...&.g....\.V...*.,/95.uEs..W.i..z..h...n(...VGn^F.......H.......5.DT..3.A..mK...~..}.1......n.
generation 1769598730239175261
root 14863846352370317867
sys_array_size 1744503544
chunk_root_generation 18100024505086712407
root_level 79
chunk_root 10848092274453435018
chunk_root_level 156
log_root 7514172289378668244
log_root_transid 6227239369566282426
log_root_level 18
total_bytes 5481087866519986730
bytes_used 13216280034370888020
sectorsize 4102056786
nodesize 1038279258
leafsize 276348297
stripesize 2473897044
root_dir 12090183195204234845
num_devices 12836127619712721941
compat_flags 0xf98ff436fc954bd4
compat_ro_flags 0x3fe8246616164da7
( FREE_SPACE_TREE |
FREE_SPACE_TREE_VALID |
unknown flag: 0x3fe8246616164da4 )
incompat_flags 0x3989a5037330bfd8
( COMPRESS_LZO |
COMPRESS_LZOv2 |
EXTENDED_IREF |
RAID56 |
SKINNY_METADATA |
NO_HOLES |
unknown flag: 0x3989a5037330bc10 )
cache_generation 10789185961859482334
uuid_tree_generation 14921288820846890813
dev_item.uuid e6e382b3-de66-4c25-7cc9-3cc43cde9c24
dev_item.fsid f8430e37-12ca-adaf-b038-f0ee10ce6327 [DON'T MATCH]
dev_item.type 7909001383421391155
dev_item.total_bytes 4839925749276763097
dev_item.bytes_used 14330418354255459170
dev_item.io_align 4136652250
dev_item.io_width 1113335506
dev_item.sector_size 1197062542
dev_item.devid 16559830033162408461
dev_item.dev_group 3271056113
dev_item.seek_speed 113
dev_item.bandwidth 35
dev_item.generation 15723849675231264550
sys_chunk_array[2048]:
item 0 key (12549421470303619499 FREE_SPACE_EXTENT 12715844338991310897)
ERROR: unexpected item type 199 in sys_array at offset 17
backup_roots[4]:
backup 0:
backup_tree_root: 5918676053091157562 gen: 4599194864203214963 level: 144
backup_chunk_root: 8975008186939396462 gen: 9664304934392981307 level: 235
backup_extent_root: 7756240529876556730 gen: 11265888412945796540 level: 76
backup_fs_root: 14967994453532855475 gen: 14136483082973799135 level: 83
backup_dev_root: 4954192454133918329 gen: 10448900784675204431 level: 88
backup_csum_root: 17904990185861936348 gen: 16603807809187172614 level: 7
backup_total_bytes: 9741488400785312878
backup_bytes_used: 1720631489440696202
backup_num_devices: 7931588452052946894
backup 1:
backup_tree_root: 13684698678564474567 gen: 33043910689084607 level: 140
backup_chunk_root: 6974714140403509989 gen: 16650332720823634889 level: 9
backup_extent_root: 3516452618597086684 gen: 6721909999031129955 level: 223
backup_fs_root: 5571396873850442777 gen: 9961414730092117271 level: 191
backup_dev_root: 12512234014059269698 gen: 15283952794535835341 level: 219
backup_csum_root: 1602565535990769763 gen: 2136912881039835078 level: 131
backup_total_bytes: 14489406064823832967
backup_bytes_used: 7530374675662623713
backup_num_devices: 11808512257789822277
backup 2:
backup_tree_root: 1996397391337300003 gen: 18162243921156386044 level: 44
backup_chunk_root: 4216057549420117484 gen: 17057376761029942685 level: 206
backup_extent_root: 7825952091071540206 gen: 16405179489532307152 level: 196
backup_fs_root: 3619121322246444309 gen: 2206909528630697779 level: 138
backup_dev_root: 13438213769779314384 gen: 16262689976944350697 level: 71
backup_csum_root: 11158912803848319140 gen: 4080194652350711178 level: 102
backup_total_bytes: 8272238769505738558
backup_bytes_used: 2576815956740779944
backup_num_devices: 3304053143692865210
backup 3:
backup_tree_root: 6075859188169321540 gen: 3028573589797862085 level: 29
backup_chunk_root: 15239419444928921132 gen: 2443952642303102760 level: 12
backup_extent_root: 6347616010285651669 gen: 2556158545601457306 level: 0
backup_fs_root: 12762145105300866492 gen: 16088334343865581774 level: 111
backup_dev_root: 17265158067002101903 gen: 3661539768757457899 level: 128
backup_csum_root: 5238561502854589650 gen: 13986831142192892137 level: 71
backup_total_bytes: 1680040557947059672
backup_bytes_used: 9379121512784539833
backup_num_devices: 12952468782229959036
superblock: bytenr=67108864, device=/dev/md0
---------------------------------------------------------
csum_type 65211 (INVALID)
csum_size 32
csum 0x4e1b51fe2fcccfeaff3977e08e271e9479da4ce9d3f67819483cca9caab206bb [DON'T MATCH]
bytenr 6151545373820009586
flags 0xa350e057fb4826b7
( WRITTEN |
RELOC |
SEEDING |
METADUMP |
METADUMP_V2 |
unknown flag: 0xa350e050fb4826b4 )
magic .x..V... [DON'T MATCH]
fsid ee64660a-2116-e3d1-bc81-57b2815ec175
label .....Z.(...B.u.f.../_....?x........!....4.)P+4......s.....6....>#+.U.xD...N.],7.;..z...i...5.g]l....`..."..k.Po.....
generation 9204480862653147048
root 15374020348068925257
sys_array_size 140002398
chunk_root_generation 5853066660345508532
root_level 163
chunk_root 3974206076299456518
chunk_root_level 221
log_root 6202016390021464484
log_root_transid 14498449761713243809
log_root_level 201
total_bytes 14339628566885106025
bytes_used 17973936159968091942
sectorsize 4203000179
nodesize 2861897139
leafsize 804415224
stripesize 1724696015
root_dir 6813288129527782924
num_devices 13544860601977045288
compat_flags 0x17ba4d6371928997
compat_ro_flags 0x80f879377aba010a
( FREE_SPACE_TREE_VALID |
unknown flag: 0x80f879377aba0108 )
incompat_flags 0xf35623bf46fd1c61
( MIXED_BACKREF |
BIG_METADATA |
EXTENDED_IREF |
unknown flag: 0xf35623bf46fd1c00 )
cache_generation 11325987263756088267
uuid_tree_generation 4050409366984057912
dev_item.uuid bff94652-989f-2b33-72ec-29da20f3b479
dev_item.fsid 9354d3c0-6e85-ef63-4890-6788401fc26d [DON'T MATCH]
dev_item.type 14799036415562724559
dev_item.total_bytes 8381745741598706536
dev_item.bytes_used 15795328293767567346
dev_item.io_align 2308776314
dev_item.io_width 941981659
dev_item.sector_size 1063805735
dev_item.devid 12559357670314928468
dev_item.dev_group 942585008
dev_item.seek_speed 160
dev_item.bandwidth 39
dev_item.generation 18008986788126981566
sys_chunk_array[2048]:
item 0 key (13965545642286451913 UNKNOWN.255 7678635361941794030)
ERROR: unexpected item type 255 in sys_array at offset 17
backup_roots[4]:
backup 0:
backup_tree_root: 1087501316855813684 gen: 12596045152199850401 level: 181
backup_chunk_root: 6557971774951271289 gen: 10208907930953549416 level: 246
backup_extent_root: 17481591668916579532 gen: 17453085530363153960 level: 243
backup_fs_root: 5601146632676646164 gen: 1465306770263226833 level: 12
backup_dev_root: 10902104218396152245 gen: 6092605485111932734 level: 75
backup_csum_root: 4442090594103425789 gen: 16911190536176118325 level: 97
backup_total_bytes: 801355142582085876
backup_bytes_used: 15028783946215022130
backup_num_devices: 16206375141624920634
backup 1:
backup_tree_root: 6351022532463796747 gen: 12652270837466734092 level: 253
backup_chunk_root: 2129606227456051738 gen: 7578364203725752851 level: 56
backup_extent_root: 11797685505978116035 gen: 10821301879062427211 level: 123
backup_fs_root: 288698854728497504 gen: 3081381278827790906 level: 13
backup_dev_root: 12061358643958707086 gen: 3512250687651845182 level: 123
backup_csum_root: 7522723451362117623 gen: 8328549283143475261 level: 218
backup_total_bytes: 3787127650231681413
backup_bytes_used: 12262463780944988829
backup_num_devices: 6357846707684110191
backup 2:
backup_tree_root: 12526877211419749609 gen: 1680728446797631925 level: 161
backup_chunk_root: 16243421936188306600 gen: 14674402491217930546 level: 170
backup_extent_root: 11709157221414435615 gen: 10762394217518893925 level: 7
backup_fs_root: 10470162224093528971 gen: 3157933380710680810 level: 141
backup_dev_root: 9370932758463595836 gen: 8195536874915306599 level: 87
backup_csum_root: 12862340132242466490 gen: 9499669518915523221 level: 70
backup_total_bytes: 4872773697414434976
backup_bytes_used: 1090914674363382351
backup_num_devices: 7250132742877406898
backup 3:
backup_tree_root: 4211606369908709000 gen: 9090072647248956366 level: 35
backup_chunk_root: 2842930413763548032 gen: 12574535419719702846 level: 227
backup_extent_root: 14215872060939819175 gen: 14386746624218700558 level: 247
backup_fs_root: 15269817613881593429 gen: 13012357796004765397 level: 82
backup_dev_root: 3593902900654541720 gen: 13357519206322201065 level: 75
backup_csum_root: 5284950982560001279 gen: 7162449733826645409 level: 134
backup_total_bytes: 2093390614757276942
backup_bytes_used: 871113832302380517
backup_num_devices: 820958227809071147
superblock: bytenr=274877906944, device=/dev/md0
---------------------------------------------------------
csum_type 26389 (INVALID)
csum_size 32
csum 0xe03c6b131b4ec6cb2114a4419277489fbf40b002906b4ed953638893397f6602 [DON'T MATCH]
bytenr 13664584384672311294
flags 0x59bdb26c99b54734
( CHANGING_FSID |
METADUMP_V2 |
unknown flag: 0x59bdb26099b54734 )
magic .F.&8... [DON'T MATCH]
fsid 847b0be0-2312-b08f-a471-273a39e385d4
label .C..l...xIg}B>~....&w.T>.<..Ay9 .)p...btH1....;...l..gk.J.....(.....f.|}....n^.M.|.3E.....I....:...)......H...}1.......L......B.l).s........]\[..\.\.X.~.A.x.......h......,.\.&h.y.6..m=.?..t.S.........[>a...(vt@...X......<o...v|.....U.gj...Lq...O.[u.....)0X
generation 12015793396287018420
root 8265308008657923830
sys_array_size 2413164638
chunk_root_generation 17965737574707260379
root_level 207
chunk_root 14573544619304334351
chunk_root_level 163
log_root 4378696952282061090
log_root_transid 15003646425698952082
log_root_level 164
total_bytes 11252436429982035164
bytes_used 17981997017222930567
sectorsize 2924702014
nodesize 3928239295
leafsize 3697142194
stripesize 3756194375
root_dir 702748544839699505
num_devices 4380849543563758050
compat_flags 0xd9bcafa18166c944
compat_ro_flags 0xb7789875e93811c2
( FREE_SPACE_TREE_VALID |
unknown flag: 0xb7789875e93811c0 )
incompat_flags 0x41b06fab5f783f39
( MIXED_BACKREF |
COMPRESS_LZO |
COMPRESS_LZOv2 |
BIG_METADATA |
SKINNY_METADATA |
NO_HOLES |
unknown flag: 0x41b06fab5f783c10 )
cache_generation 2790664258309518082
uuid_tree_generation 14560986793057547729
dev_item.uuid 9e68c747-6dd8-f3fc-1051-88123e6a0adf
dev_item.fsid 75e5ac67-228c-114d-a0bf-10a13b374c0c [DON'T MATCH]
dev_item.type 15303811306433940600
dev_item.total_bytes 15956271228860727248
dev_item.bytes_used 12663614323963693149
dev_item.io_align 742657370
dev_item.io_width 1198324740
dev_item.sector_size 3703211052
dev_item.devid 12808946261635193203
dev_item.dev_group 3986818674
dev_item.seek_speed 91
dev_item.bandwidth 25
dev_item.generation 16511076442895313212
sys_chunk_array[2048]:
item 0 key (0x01b68e621940ad8f UUID_KEY_SUBVOL 0x07e8f4b15269df74)
ERROR: unexpected item type 251 in sys_array at offset 17
backup_roots[4]:
backup 0:
backup_tree_root: 10929332989678721259 gen: 2410752186462201473 level: 168
backup_chunk_root: 12005036188293733529 gen: 6017658543726827580 level: 4
backup_extent_root: 5772439121786530440 gen: 9771362028433917927 level: 213
backup_fs_root: 15873974038452009066 gen: 5162464511863081610 level: 102
backup_dev_root: 10880196960241275662 gen: 5444842538697534609 level: 76
backup_csum_root: 732437746216976091 gen: 12889163759242890293 level: 17
backup_total_bytes: 4726471422280643128
backup_bytes_used: 7638379647440950994
backup_num_devices: 16086614005965457184
backup 1:
backup_tree_root: 365908921308135486 gen: 7955498484582942906 level: 222
backup_chunk_root: 2320600280382439909 gen: 16307549195721352448 level: 255
backup_extent_root: 63915797200689228 gen: 7223963135006628664 level: 122
backup_fs_root: 12093496594110329491 gen: 8673088152230422327 level: 176
backup_dev_root: 7071018315773203076 gen: 2123646002765713136 level: 68
backup_csum_root: 2256649538333048592 gen: 1242771109749269014 level: 209
backup_total_bytes: 13370841764183390953
backup_bytes_used: 1581695008980451655
backup_num_devices: 15998246508876073400
backup 2:
backup_tree_root: 2377137025235210394 gen: 3690996016117903730 level: 29
backup_chunk_root: 15708758458351285828 gen: 17427968661543407470 level: 251
backup_extent_root: 13419652187983998816 gen: 7021427764963897079 level: 140
backup_fs_root: 5475447812522438330 gen: 7816979085263009751 level: 33
backup_dev_root: 16577324916178280834 gen: 6576710502850587987 level: 204
backup_csum_root: 12227591344944562665 gen: 17430186024557184819 level: 199
backup_total_bytes: 18360656019908872927
backup_bytes_used: 3948283139050748388
backup_num_devices: 9142284180596408193
backup 3:
backup_tree_root: 18431410624458069403 gen: 16232386146948169557 level: 215
backup_chunk_root: 6863200809691904167 gen: 12426138819402850753 level: 233
backup_extent_root: 12822236089199474581 gen: 13741181971180163621 level: 146
backup_fs_root: 14102329897235867390 gen: 7763083095823881448 level: 186
backup_dev_root: 17346249016785915998 gen: 582294204541486187 level: 40
backup_csum_root: 5772528336744043720 gen: 1284328790601276594 level: 18
backup_total_bytes: 5440822434798926871
backup_bytes_used: 12538610060624890670
backup_num_devices: 7489880256855801674
Thank you
Marc Grondin
* Re: Help Recovering BTRFS array
From: Duncan @ 2017-09-19 3:45 UTC
To: linux-btrfs
grondinm posted on Mon, 18 Sep 2017 14:14:08 -0300 as excerpted:
> superblock: bytenr=65536, device=/dev/md0
> ---------------------------------------------------------
> ERROR: bad magic on superblock on /dev/md0 at 65536
>
> superblock: bytenr=67108864, device=/dev/md0
> ---------------------------------------------------------
> ERROR: bad magic on superblock on /dev/md0 at 67108864
>
> superblock: bytenr=274877906944, device=/dev/md0
> ---------------------------------------------------------
> ERROR: bad magic on superblock on /dev/md0 at 274877906944
>
> Now i'm really panicked. Is the FS toast? Can any recovery be attempted?
First, I'm a user and list regular, not a dev. With luck the devs can help
beyond the suggestions below...
However, there's no need to panic in any case, due to the sysadmin's
first rule of backups: The true value of any data is defined by the
number of backups of that data you consider(ed) it worth having.
As a result, there are precisely two possibilities, neither one of which
calls for panic.
1) No need to panic because you have a backup, and recovery is as simple
as restoring from that backup.
2) You don't have a backup, in which case the lack of that backup means
you defined the value of the data as only trivial, worth less than the
time/trouble/resources you saved by not making that backup. Because the
data was only of trivial value anyway, and you kept the more valuable
time/trouble/resources you would otherwise have put into that backup,
you still saved what you considered most valuable, so again, no need to
panic.
It's a binary state: there's no third possibility, and no possibility
that you lost what your actions (or lack of them, in the no-backup case)
defined as most valuable to you.
(As for the freshness of that backup, the same rule applies, but to the
data delta between the state as of the backup and the current state. If
the changed data is worth backing up to you, you'll have freshened your
backup. If not, you defined it as of such trivial value as not to be
worth the time/trouble/resources to do so.)
That said, at the time you're calculating the value of the data against
the value of the time/trouble/resources required to back it up, the loss
potential remains theoretical. Once something actually happens to the
data, it's no longer theoretical, and the data, while of trivial enough
value to be worth the risk when it was theoretical, may still be valuable
enough to you to spend at least some time/trouble on trying to recover it.
In that case, since you can still mount, I'd suggest mounting read-only
to prevent any further damage, and then copying off what data you can to
a different, unaffected filesystem.
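Something along these lines, with the mount point and rescue destination
purely illustrative:
mount -o remount,ro /media/Storage2            # stop all further writes
rsync -a /media/Storage2/ /mnt/rescue/copy/    # copy off what still reads cleanly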
Then, if there's still data you want that you couldn't simply copy off,
you can try btrfs restore. While I do have backups here, a couple of
times when things went bad, btrfs restore was able to get back pretty
much everything to current, whereas had I had to restore from backups,
I'd have lost enough changed data to hurt, even if I had judged it of
trivial enough value, while the risk remained theoretical, that I hadn't
yet freshened the backup. (Since then I've upgraded the rest of my
storage to SSD, lowering the time and hassle cost of backups and
encouraging me to do them more frequently. Speaking of which, I need to
freshen them in the near future. It's now on my list for my next day
off...)
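For reference, restore runs against the unmounted device and copies
whatever it can reconstruct to a path on another filesystem, along these
lines (target path again illustrative):
umount /media/Storage2
btrfs restore -v /dev/md0 /mnt/rescue/restored/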
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: Help Recovering BTRFS array
From: grondinm @ 2017-09-21 13:40 UTC
To: linux-btrfs
Hi Duncan,
I'm not sure if this will attach to my original message...
Thank you for your reply. For some reason I'm not getting list messages even though I know I am subscribed.
I know all too well the golden rule of data; it has bitten me a few times. The data on this array is mostly data that I don't really care about, and I was able to copy off what I wanted. The main reason I sent it to the list was just to see if I could somehow return the FS to a working state without having to recreate it. I'm just surprised that all 3 copies of the superblock got corrupted. Probably my lack of understanding, but I always assumed that if one copy got corrupted it would be replaced by a good copy, therefore leaving all copies in a good state. Is that not the case? If it is, then what bad luck that all 3 got messed up at the same time.
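(From what I can tell from the docs, the tool for the case where at least one copy survives would be something like the following, though with all three copies bad here it presumably has nothing good to copy from:)
btrfs rescue super-recover -v /dev/md0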
Some information I forgot to include in my original message:
uname -a
Linux thebeach 4.12.13-gentoo-GMAN #1 SMP Sat Sep 16 15:28:26 ADT 2017 x86_64 Intel(R) Core(TM) i5-2320 CPU @ 3.00GHz GenuineIntel GNU/Linux
btrfs --version
btrfs-progs v4.10.2
Anyway, thank you again for your reply. I will leave the FS intact for a few days in case any more details could help BTRFS development, and maybe help prevent this from happening again or lead to a recovery option.
Marc