* Any hope of pool recovery?
@ 2015-07-01 15:39 Donald Pearson
  2015-07-01 15:50 ` Chris Murphy
  0 siblings, 1 reply; 23+ messages in thread
From: Donald Pearson @ 2015-07-01 15:39 UTC
To: linux-btrfs

Hello,

"darkling" was helping me on IRC for a while before he had to drop off; thanks for the help, darkling. To pick up where we left off...

In summary, I have a 10-disk raid6 pool that I cannot mount.

btrfs fi show output is here -> http://pastebin.com/aidGV20e

'tank' is the pool in question.

Mounting fails, with the errors below in dmesg, with or without the recovery,degraded,ro options.

[ 142.588443] BTRFS: device label tank devid 1 transid 14796 /dev/sdc
[ 142.589646] BTRFS info (device sdc): enabling auto recovery
[ 142.589658] BTRFS info (device sdc): allowing degraded mounts
[ 142.589665] BTRFS info (device sdc): disk space caching is enabled
[ 142.589669] BTRFS: has skinny extents
[ 142.592199] BTRFS: failed to read chunk root on sdc
[ 142.612988] BTRFS: open_ctree failed

What precipitated all this was horrible performance from the pool: service times for /dev/sdg were ~3 seconds, and smartctl reported many sector issues with /dev/sdg.

I issued the command btrfs device delete /dev/sdg and then monitored btrfs fi show, but saw no change in the data allocated to /dev/sdg for several hours.

I then attempted wipefs -a /dev/sdg, but the drive was still listed in btrfs fi show.

I then rebooted, which brings me to where I am now. I figured it's best to stop breaking things and ask for help, if this can be recovered.

Thank you,
Donald (seijirou)
* Re: Any hope of pool recovery?
From: Chris Murphy @ 2015-07-01 15:50 UTC
To: Btrfs BTRFS

Your btrfs-progs version is 4.0; which kernel versions have you tried to mount with?

I suggest running btrfs check (without --repair) and including the full output. There are a lot of changes in btrfs-progs 4.1, but offhand I don't know that they'd affect btrfs check results.

Chris Murphy
* Re: Any hope of pool recovery?
From: Donald Pearson @ 2015-07-01 16:09 UTC
To: Chris Murphy; +Cc: Btrfs BTRFS

Thanks Chris,

To my shame, it turns out darkling didn't drop off IRC after all; I'm new to all this and learning quickly that I need to sit on my hands.

I admit that, despite darkling's suggestion that my usertools are probably fine, I pulled down a newer kernel from elrepo, so currently I'm running
4.1.1-1.el7.elrepo.x86_64

I started with 4.0.2-1.el7.elrepo.x86_64

I also have btrfs-progs 4.1, which I got from git.

Here is the 4.0 output
[root@san01 btrfs-progs]# btrfs check /dev/sdc
checksum verify failed on 21364736 found E4E3BDB6 wanted 00000000
checksum verify failed on 21364736 found E4E3BDB6 wanted 00000000
checksum verify failed on 21364736 found EC809498 wanted 0863292E
checksum verify failed on 21364736 found 925303CE wanted 09150E74
checksum verify failed on 21364736 found 925303CE wanted 09150E74
bytenr mismatch, want=21364736, have=1065943040
Couldn't read chunk tree
Couldn't open file system

Here is the 4.1 output
[root@san01 btrfs-progs]# ./btrfs check /dev/sdc
checksum verify failed on 21364736 found E4E3BDB6 wanted 00000000
checksum verify failed on 21364736 found E4E3BDB6 wanted 00000000
checksum verify failed on 21364736 found EC809498 wanted 0863292E
checksum verify failed on 21364736 found 925303CE wanted 09150E74
checksum verify failed on 21364736 found 925303CE wanted 09150E74
bytenr mismatch, want=21364736, have=1065943040
Couldn't read chunk tree
Couldn't open file system

Finally, before I learned of this mailing list, I started a run of btrfs rescue chunk-recover
[root@san01 btrfs-progs]# ./btrfs rescue chunk-recover -v /dev/sdc

I can see now through iostat that all 10 drives are reading as fast as they can, and my understanding is that this will take a long time, but I've since learned (not only that darkling was still alive on IRC) that this probably won't solve my problem.

Regards,
Donald (seijirou)
* Re: Any hope of pool recovery?
From: Donald Pearson @ 2015-07-01 18:58 UTC
To: Chris Murphy; +Cc: Btrfs BTRFS

Small update on this, with no idea if this is useful information or not.

At some point within the last hour, iostat started showing that /dev/sdg is no longer under heavy reads.

The other 9 drives, however, are still reading as fast as they are able. There is no new output on the `btrfs rescue chunk-recover` screen, so I expect it's still running.

There are 4 other drives with the same total capacity as sdg, so I would have expected them all to complete at about the same time.

Regards,
Donald
* Re: Any hope of pool recovery?
From: Donald Pearson @ 2015-07-01 19:05 UTC
To: Chris Murphy; +Cc: Btrfs BTRFS

I should have thought to check this and add it earlier. I'm seeing errors for /dev/sdg in dmesg (not surprised; I wanted this drive out of the pool to begin with because it's sick).

[ 142.612988] BTRFS: open_ctree failed
[11836.105577] sd 0:0:6:0: [sdg] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[11836.105585] sd 0:0:6:0: [sdg] Sense Key : Medium Error [current]
[11836.105589] sd 0:0:6:0: [sdg] Add. Sense: Unrecovered read error
[11836.105592] sd 0:0:6:0: [sdg] CDB: Read(10) 28 00 5a 5b f1 b8 00 01 00 00
[11836.105596] blk_update_request: critical medium error, dev sdg, sector 1515975096
[11839.044815] mpt2sas0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
[11839.044843] sd 0:0:6:0: [sdg] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[11839.044848] sd 0:0:6:0: [sdg] Sense Key : Medium Error [current]
[11839.044857] sd 0:0:6:0: [sdg] Add. Sense: Unrecovered read error
[11839.044862] sd 0:0:6:0: [sdg] CDB: Read(10) 28 00 5a 5b f2 b8 00 01 00 00
[11839.044865] blk_update_request: critical medium error, dev sdg, sector 1515975352
[11842.009545] sd 0:0:6:0: [sdg] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[11842.009554] sd 0:0:6:0: [sdg] Sense Key : Medium Error [current]
[11842.009558] sd 0:0:6:0: [sdg] Add. Sense: Unrecovered read error
[11842.009562] sd 0:0:6:0: [sdg] CDB: Read(10) 28 00 5a 5b f2 80 00 00 08 00
[11842.009565] blk_update_request: critical medium error, dev sdg, sector 1515975296
[11842.009934] Buffer I/O error on dev sdg, logical block 189496912, async page read
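A note for anyone reading these sdg errors: the CDB bytes can be checked against the sector numbers the kernel prints. The sketch below is an illustration rather than anything posted in the thread. It assumes the standard READ(10) layout (opcode 0x28, big-endian LBA in bytes 2-5, transfer length in bytes 7-8), and the helper name decode_read10 is made up for the example. It also assumes the 8:1 ratio between "sector" and "logical block" in the Buffer I/O line comes from 4096-byte pages over 512-byte sectors.

```python
def decode_read10(cdb_hex):
    """Return (starting LBA, transfer length in blocks) for a READ(10) CDB."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    length = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: number of blocks to read
    return lba, length


# Assumption: 512-byte sectors and 4096-byte pages, hence 8 sectors per page.
SECTORS_PER_PAGE = 4096 // 512

# CDBs copied from the dmesg lines in the thread.
for cdb in (
    "28 00 5a 5b f1 b8 00 01 00 00",
    "28 00 5a 5b f2 b8 00 01 00 00",
    "28 00 5a 5b f2 80 00 00 08 00",
):
    lba, length = decode_read10(cdb)
    print(f"LBA {lba}, {length} blocks, page index {lba // SECTORS_PER_PAGE}")
```

The decoded LBAs (1515975096, 1515975352, 1515975296) match the sector numbers in the blk_update_request lines, and 1515975296 // 8 gives the logical block 189496912 from the Buffer I/O line, so all three failures sit within a few hundred sectors of each other on the disk.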
* Re: Any hope of pool recovery?
From: Donald Pearson @ 2015-07-01 21:35 UTC
To: Chris Murphy; +Cc: Btrfs BTRFS

Here is the result of the attempted rescue chunk-recover

[root@san01 btrfs-progs]# ./btrfs rescue chunk-recover -v /dev/sdc
All Devices:
Device: id = 7, name = /dev/sdl
Device: id = 8, name = /dev/sdm
Device: id = 9, name = /dev/sdn
Device: id = 3, name = /dev/sdf
Device: id = 6, name = /dev/sdi
Device: id = 4, name = /dev/sdg
Device: id = 5, name = /dev/sdh
Device: id = 2, name = /dev/sdd
Device: id = 10, name = /dev/sdq
Device: id = 1, name = /dev/sdc

*** Error in `./btrfs': free(): invalid next size (fast): 0x0000000001332100 ***
Segmentation fault
* Re: Any hope of pool recovery?
From: Chris Murphy @ 2015-07-01 23:29 UTC
To: Btrfs BTRFS

On Wed, Jul 1, 2015 at 3:35 PM, Donald Pearson <donaldwhpearson@gmail.com> wrote:

> *** Error in `./btrfs': free(): invalid next size (fast): 0x0000000001332100 ***
> Segmentation fault

Blek. Well that's a bug then too. If you have space somewhere to put a btrfs-image -c9 -t4, I'd do that now before making any more changes. Write up a bugzilla.kernel.org bug, and include the URL for the image file (which will be large). Include the URL for the bug in this thread. And then it's wait time, basically. I'm not a dev, but this sounds rather serious.

The pisser is that this is exactly the use case for raid6. You have a failed drive and want an extra margin to cover possible additional errors; you get a "BTRFS: failed to read chunk root on sdc", which could be construed as a problem with sdc, so a 2nd failure, and yet there is no reconstruction of the necessary metadata.

Is metadata also raid6? Or just data? I don't see a 'btrfs fi df', probably because you can't mount the volume. Do you know if it was created with -d raid6 -m raid6 at mkfs time? (Include this info in the bug report.)

Failing device handling with Btrfs is still weak. In many cases it will keep trying to use a device that produces spurious or even failed read and write errors. It's possible this caused some confusion.

I propose trying the following. You could wait to see if someone else has better suggestions, but this seems reasonably safe.

- Physically remove sdg from the system, reboot, and see if you can mount the volume with the most conservative mount options: -o ro,recovery,degraded,skip_balance

If that doesn't work, and you still get the message about chunk root on devid 1/sdc, then yuck. (When you remove sdg it's possible drive letters will change, so be sure to correlate any errors to devid by using a current 'btrfs fi show' listing.)

I would try chunk recover again, now that the known bad drive sdg is physically removed. Do you get a different result, or still a seg fault?

If those two things still fail, what's next is a toss-up between two options:

- Find or build a "4.2" kernel (there is no rc1 yet). Fedora has several "4.2"/linux-next binaries already built in the koji build system, so your distro might have extremely new kernels available somewhere for bleeding edgers. Try this with the above mount options again. In the recent git pull for this kernel there were nearly 2000 lines added, and nearly that many deleted. A lot of changes. So it's worth a shot. It could produce a good result, a worse result, or the same result. *shrug* What I probably wouldn't try while running the 4.2 kernel is another chunk recover. It seems doubtful it will make much difference.

and the other option:

- Physically remove the device that still produces the "BTRFS: failed to read chunk root on sdX" error, which in the state you posted was /dev/sdc (devid 1). Reboot. Then retry the same mount options from above and see what that results in. If there were no problems with your file system, removing two devices and mounting degraded should work without errors (I've done it), so it seems like a valid thing to try, seeing as two devices are giving you a hard time. Will a 3rd? Dunno.

Anyway, not good news. But you're helping make Btrfs better!

--
Chris Murphy
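On correlating errors to devid rather than to drive letter: one way to keep track is to turn a device listing into a devid-to-path table and look errors up there. The sketch below is only an illustration. It parses the "All Devices" listing that chunk-recover printed earlier in the thread (copied verbatim), not 'btrfs fi show' output, and the function name devid_map is made up for the example. Note that this listing was taken before sdg was pulled, so the paths may well differ after a reboot; the devids should not.

```python
import re

# Listing copied from the chunk-recover run posted earlier in the thread.
LISTING = """\
Device: id = 7, name = /dev/sdl
Device: id = 8, name = /dev/sdm
Device: id = 9, name = /dev/sdn
Device: id = 3, name = /dev/sdf
Device: id = 6, name = /dev/sdi
Device: id = 4, name = /dev/sdg
Device: id = 5, name = /dev/sdh
Device: id = 2, name = /dev/sdd
Device: id = 10, name = /dev/sdq
Device: id = 1, name = /dev/sdc
"""

DEVICE_RE = re.compile(r"Device: id = (\d+), name = (\S+)")


def devid_map(text):
    """Map each devid (int) to the device path shown in the listing."""
    return {int(m.group(1)): m.group(2) for m in DEVICE_RE.finditer(text)}


devices = devid_map(LISTING)
for devid in sorted(devices):
    print(f"devid {devid:2d} -> {devices[devid]}")
```

With this table, "failed to read chunk root on sdc" corresponds to devid 1 and the failing disk sdg to devid 4, at least for the boot in which the listing was captured.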
* Re: Any hope of pool recovery?
From: Donald Pearson @ 2015-07-02 1:38 UTC
To: Chris Murphy; +Cc: Btrfs BTRFS

Thanks Chris.

Everything is/was raid6. Oddly, when I created the filesystem there was a mix of raid1 and raid6, but a balance dconvert mconvert after creation set everything to raid6.

I did previously try a btrfs-image, as I found that as a "first thing to do" through some google searching, but that command won't run, failing with essentially the same errors (there are additional "device is missing" errors now, but this is otherwise identical to what I saw before). I'm happy to help post a bug report, but can I still provide actionable information without btrfs-image working?

[root@san01 btrfs-progs]# ./btrfs-image -c9 -t4 /dev/sdc /mnt2/backup/sdc.img
warning, device 4 is missing
warning devid 4 not found already
checksum verify failed on 21364736 found EC809498 wanted 0863292E
checksum verify failed on 21364736 found 925303CE wanted 09150E74
checksum verify failed on 21364736 found 925303CE wanted 09150E74
bytenr mismatch, want=21364736, have=1065943040
Couldn't read chunk tree
Open ctree failed
create failed (Bad file descriptor)

So after the chunk-recover failed, I postulated that there may be some correlation with the read of /dev/sdg stopping early. I say early because the other 4 drives of the same capacity continued reading for quite some time. So I tested a dd of sdg to a file, and after it ran for about 2 hours it stopped prematurely after 700 some-odd gigs and left some errors in the logs (I'll just tack them on the end of the email for the curious).

At this point I decided sdg was done and couldn't be doing any good while installed, so I yanked it out. Still unable to mount, I rebooted.

Unfortunately I am still unable to mount after the reboot (and I tried again just now with all the options you posted, no dice), so I am running the chunk-recover command again.

It would be neat if I can somehow contribute!

Thanks again,
Donald

Here's the drive vomiting in my logs after it got halfway through the dd image attempt.

Jul 1 17:05:51 san01 kernel: sd 0:0:6:0: [sdg] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jul 1 17:05:51 san01 kernel: sd 0:0:6:0: [sdg] Sense Key : Medium Error [current]
Jul 1 17:05:51 san01 kernel: sd 0:0:6:0: [sdg] Add. Sense: Unrecovered read error
Jul 1 17:05:51 san01 kernel: sd 0:0:6:0: [sdg] CDB: Read(10) 28 00 5a 5b f1 e0 00 01 00 00
Jul 1 17:05:51 san01 kernel: blk_update_request: critical medium error, dev sdg, sector 1515975136
Jul 1 17:05:57 san01 kernel: sd 0:0:6:0: [sdg] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jul 1 17:05:57 san01 kernel: sd 0:0:6:0: [sdg] Sense Key : Medium Error [current]
Jul 1 17:05:57 san01 kernel: sd 0:0:6:0: [sdg] Add. Sense: Unrecovered read error
Jul 1 17:05:57 san01 kernel: sd 0:0:6:0: [sdg] CDB: Read(10) 28 00 5a 5b f2 e0 00 01 00 00
* Re: Any hope of pool recovery?
From: Chris Murphy @ 2015-07-02 2:31 UTC
To: Btrfs BTRFS

On Wed, Jul 1, 2015 at 7:38 PM, Donald Pearson <donaldwhpearson@gmail.com> wrote:

> Here's the drive vomiting in my logs after it got halfway through the
> dd image attempt.
>
> Jul 1 17:05:51 san01 kernel: sd 0:0:6:0: [sdg] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Jul 1 17:05:51 san01 kernel: sd 0:0:6:0: [sdg] Sense Key : Medium Error [current]
> Jul 1 17:05:51 san01 kernel: sd 0:0:6:0: [sdg] Add. Sense: Unrecovered read error
> Jul 1 17:05:51 san01 kernel: sd 0:0:6:0: [sdg] CDB: Read(10) 28 00 5a 5b f1 e0 00 01 00 00
> Jul 1 17:05:51 san01 kernel: blk_update_request: critical medium error, dev sdg, sector 1515975136
> Jul 1 17:05:57 san01 kernel: sd 0:0:6:0: [sdg] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Jul 1 17:05:57 san01 kernel: sd 0:0:6:0: [sdg] Sense Key : Medium Error [current]
> Jul 1 17:05:57 san01 kernel: sd 0:0:6:0: [sdg] Add. Sense: Unrecovered read error
> Jul 1 17:05:57 san01 kernel: sd 0:0:6:0: [sdg] CDB: Read(10) 28 00 5a 5b f2 e0 00 01 00 00

This looks like a typical URE. There are a number of reasons why a sector can be bad, but basically the drive ECC has given up on being able to correct the problem, and the drive reports the command, the error, and the sector involved.

What *should* happen is that Btrfs reconstructs the data (or metadata) on that sector, and then writes it (since kernel 3.19) back to the bad sector LBA. The drive tries to write to that bad sector, and verifies. If there is a persistent failure, then that LBA is mapped to a different physical sector and the bad one is removed (has no LBA). There will be no kernel messages for this; it's all handled in the drive itself.

But this sounds like a dd read of the raw device, where Btrfs is not involved (because you can't mount the volume), so none of this correction happens.

What I wonder, though, is whether this same problem shows up in the much earlier logs, from when the volume was mounted: did Btrfs try to fix the problem, and were there problems fixing it? So it might be useful to see whether there's something in /var/log/messages or journalctl -bX from the time the original problem was first developing.

Bad sectors are completely ordinary. They're not really common; out of maybe 50 drives, I've had two exhibit this. But drives are designed to take this into account, and so are hardware, and linux kernel md raid, and LVM raid, and Btrfs, and ZFS. So... it's kinda important to know more about this edge case to find out where the problem is.

--
Chris Murphy
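To make "reconstructs the data on that sector" a little more concrete, here is a toy sketch of parity reconstruction. It is not Btrfs code, and it deliberately covers only the simple XOR parity (the "P" stripe); raid6 additionally keeps a second, differently computed syndrome ("Q") so that two missing stripes can be recovered, which is not shown here. The block contents and the function name xor_blocks are invented for the illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data stripes plus their XOR parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend stripe 1 sits on a sector the drive can no longer read.
lost = 1
survivors = [block for i, block in enumerate(data) if i != lost]

# XOR of the surviving data and the parity yields the missing stripe.
rebuilt = xor_blocks(survivors + [parity])
print(rebuilt)
```

The rebuilt bytes are what a filesystem or RAID layer would then write back to the failing LBA, which is the step that gives the drive the chance to remap the sector.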
* Re: Any hope of pool recovery? 2015-07-02 2:31 ` Chris Murphy @ 2015-07-02 14:49 ` Donald Pearson 2015-07-02 16:58 ` Chris Murphy 2015-07-02 17:00 ` Chris Murphy 0 siblings, 2 replies; 23+ messages in thread From: Donald Pearson @ 2015-07-02 14:49 UTC (permalink / raw) To: Chris Murphy; +Cc: Btrfs BTRFS Hello, At the bottom of this email are the results of the latest chunk-recover. I only included one example of the output that was printed prior to the summary information but it went up to the end of my screen buffer and beyond. So it looks like the command executed properly when none of the drives give up on a read. That said my issue with mounting still exists unfortunately. The errors in dmesg now complain about /dev/sdd. [56496.014539] BTRFS (device sdd): bad tree block start 0 21364736 Which is curious because this is device id 2, where previously the complaint was about device id 1. So can I believe dmesg about which drive is actually the issue or is the drive that's printed in dmesg just whichever drive happens to be the last in some loop of code? Theoretically I should be able to kick another drive out of the pool safely, but I'm not sure which one to actually kick out or if that is the appropriate next step. I do see plenty of complaints about the sdg drive (previously sde) in /var/log/messages from the 28th which is when I started noticing issues. Nothing is jumping out at me claiming the btrfs is taking action but I may not know what to look for. journalctl I'm not familiar with. journalctl -bX returns with "failed to parse relative boot ID number 'X'" but perhaps you meant X to be a variable of some value? journalctl -b does run, but I'm not sure what to look for. So, what does the audience suggest? Shall I compile a newer kernel, kick out another drive (which?), or take what's behind door #3 (which is...?) 
Thanks again everybody,
Donald

Chunk: start = 6643489177600, len = 1073741824, type = 104, num_stripes = 10
    Stripes list:
    [ 0] Stripe: devid = 8, offset = 817549672448
    [ 1] Stripe: devid = 7, offset = 817549672448
    [ 2] Stripe: devid = 10, offset = 817549672448
    [ 3] Stripe: devid = 9, offset = 817549672448
    [ 4] Stripe: devid = 3, offset = 817549672448
    [ 5] Stripe: devid = 0, offset = 0
    [ 6] Stripe: devid = 0, offset = 0
    [ 7] Stripe: devid = 0, offset = 0
    [ 8] Stripe: devid = 0, offset = 0
    [ 9] Stripe: devid = 0, offset = 0
    Block Group: start = 6643489177600, len = 1073741824, flag = 104
    Device extent list:
        [ 0]Device extent: devid = 3, start = 817549672448, len = 134217728, chunk offset = 6643489177600
        [ 1]Device extent: devid = 9, start = 817549672448, len = 134217728, chunk offset = 6643489177600
        [ 2]Device extent: devid = 10, start = 817549672448, len = 134217728, chunk offset = 6643489177600
        [ 3]Device extent: devid = 7, start = 817549672448, len = 134217728, chunk offset = 6643489177600
        [ 4]Device extent: devid = 8, start = 817549672448, len = 134217728, chunk offset = 6643489177600
        [ 5]Device extent: devid = 4, start = 817549672448, len = 134217728, chunk offset = 6643489177600
        [ 6]Device extent: devid = 2, start = 817549672448, len = 134217728, chunk offset = 6643489177600
        [ 7]Device extent: devid = 1, start = 817569595392, len = 134217728, chunk offset = 6643489177600
        [ 8]Device extent: devid = 6, start = 817549672448, len = 134217728, chunk offset = 6643489177600
        [ 9]Device extent: devid = 5, start = 817549672448, len = 134217728, chunk offset = 6643489177600
Chunk: start = 6886154829824, len = 8589934592, type = 101, num_stripes = 0
    Stripes list:
    Block Group: start = 6886154829824, len = 8589934592, flag = 101
    No device extent.
Chunk: start = 6894744764416, len = 8589934592, type = 101, num_stripes = 0
    Stripes list:
    Block Group: start = 6894744764416, len = 8589934592, flag = 101
    No device extent.
Chunk: start = 6903334699008, len = 8589934592, type = 101, num_stripes = 0
    Stripes list:
    Block Group: start = 6903334699008, len = 8589934592, flag = 101
    No device extent.

Total Chunks: 805
  Recoverable: 567
  Unrecoverable: 238

Orphan Block Groups:

Orphan Device Extents:
    Device extent: devid = 4, start = 819831373824, len = 1073741824, chunk offset = 6661742788608
    Device extent: devid = 2, start = 819831373824, len = 1073741824, chunk offset = 6661742788608
    Device extent: devid = 1, start = 819851296768, len = 1073741824, chunk offset = 6661742788608
    Device extent: devid = 9, start = 819831373824, len = 1073741824, chunk offset = 6661742788608
    Device extent: devid = 10, start = 819831373824, len = 1073741824, chunk offset = 6661742788608
    Device extent: devid = 8, start = 819831373824, len = 1073741824, chunk offset = 6661742788608
    Device extent: devid = 7, start = 819831373824, len = 1073741824, chunk offset = 6661742788608
    Device extent: devid = 3, start = 819831373824, len = 1073741824, chunk offset = 6661742788608
    Device extent: devid = 6, start = 819831373824, len = 1073741824, chunk offset = 6661742788608
    Device extent: devid = 5, start = 819831373824, len = 1073741824, chunk offset = 6661742788608

open with broken chunk error
Fail to recover the chunk tree.

^ permalink raw reply [flat|nested] 23+ messages in thread
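The numbers in the chunk-recover output above can be sanity-checked with a little arithmetic. The following is a rough sketch, not something taken from btrfs-progs: it ASSUMES the `type`/`flag` field is printed in hexadecimal (the only reading that gives a sensible profile for a raid6 pool) and uses the block group flag bits from the Btrfs on-disk format. The helper names are hypothetical.

```python
# Block group flag bits from the Btrfs on-disk format.
BLOCK_GROUP_FLAGS = {
    0x001: "DATA",
    0x002: "SYSTEM",
    0x004: "METADATA",
    0x008: "RAID0",
    0x010: "RAID1",
    0x020: "DUP",
    0x040: "RAID10",
    0x080: "RAID5",
    0x100: "RAID6",
}


def decode_type(type_field):
    """Decode a chunk 'type'/'flag' value, ASSUMING it is printed in hex."""
    value = int(type_field, 16)
    return [name for bit, name in BLOCK_GROUP_FLAGS.items() if value & bit]


def raid6_chunk_len(num_devices, dev_extent_len):
    """Logical chunk length when two of the stripes hold parity."""
    return (num_devices - 2) * dev_extent_len


print(decode_type("104"))
print(decode_type("101"))
# 10 device extents of 128 MiB each vs. the 1 GiB chunk length.
print(raid6_chunk_len(10, 134217728) == 1073741824)
# 10 orphan device extents of 1 GiB each vs. the 8 GiB length of the 101 chunks.
print(raid6_chunk_len(10, 1073741824) == 8589934592)
# The summary counts add up.
print(567 + 238 == 805)
```

Under the hex assumption, 104 decodes to METADATA plus RAID6 and 101 to DATA plus RAID6, and the lengths are consistent with ten-device raid6 chunks carrying eight data stripes. That only says the printed geometry is self-consistent; it says nothing about why the chunk tree could not be rebuilt.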
* Re: Any hope of pool recovery? 2015-07-02 14:49 ` Donald Pearson @ 2015-07-02 16:58 ` Chris Murphy 2015-07-02 17:00 ` Chris Murphy 1 sibling, 0 replies; 23+ messages in thread From: Chris Murphy @ 2015-07-02 16:58 UTC (permalink / raw) To: Donald Pearson; +Cc: Chris Murphy, Btrfs BTRFS

On Thu, Jul 2, 2015 at 8:49 AM, Donald Pearson <donaldwhpearson@gmail.com> wrote:

> Which is curious because this is device id 2, where previously the
> complaint was about device id 1. So can I believe dmesg about which
> drive is actually the issue or is the drive that's printed in dmesg
> just whichever drive happens to be the last in some loop of code?

devid is static/reliable
/dev/sdX is dynamic/unreliable and related to logic board's firmware

Some systems are more stable in this regard than others. I've worked
with systems that have a different drive order every boot, even when
the hardware configuration is unchanged. When the config changes, it's
a good bet the drive letters will change.

> Theoretically I should be able to kick another drive out of the pool
> safely, but I'm not sure which one to actually kick out or if that is
> the appropriate next step.

My limited understanding at this point is that once you get "open with
broken chunk error Fail to recover the chunk tree." from chunk-recover,
you've reached the limits of the current state of the recovery tools.
But that it completed suggests it might be possible to get a complete
btrfs-image, and get that to a developer who can then use it to improve
the recovery tools.

> I do see plenty of complaints about the sdg drive (previously sde) in
> /var/log/messages from the 28th which is when I started noticing
> issues. Nothing is jumping out at me claiming the btrfs is taking
> action but I may not know what to look for.
>
> journalctl I'm not familiar with. journalctl -bX returns with "failed
> to parse relative boot ID number 'X'" but perhaps you meant X to be a
> variable of some value? journalctl -b does run, but I'm not sure
> what to look for.

I don't have a raid56 example handy for what this looks like before
this message appears:

[48466.853589] BTRFS: fixed up error at logical 20971520 on dev /dev/sdb

But that's what I get for corrupt metadata where the metadata profile
is DUP. The messages for missing metadata that needs reconstruction
would be different, but I'd expect to still see the "fixed up" message.
I'd also look at
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/btrfs/raid56.c?id=refs/tags/v4.1
and read the comments and possible raid56 related error messages.

It's similar for data.

[ 1540.865534] BTRFS: bdev /dev/sdb errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
[ 1540.866944] BTRFS: unable to fixup (regular) error at logical 12845056 on dev /dev/sdb

Again this is a corruption example, not a read failure example. It
can't be fixed because the data profile is single in this case.

> So, what does the audience suggest? Shall I compile a newer kernel,
> kick out another drive (which?), or take what's behind door #3 (which
> is...?)

If there's data on this volume you need, put all the drives back in and
look at btrfs restore to try and extract what you can. And then try a
btrfs-image again; maybe it'll work too if there aren't read errors.
Once you've gotten what you need out of it, you can decide if it's
worth continuing to try to fix it (seems doubtful to me, but I am not a
developer). I'd probably just start over. The one change to make going
forward is more frequent scrubs, to hopefully find and fix up any bad
sectors before they start to cause this problem again.

Maybe someone with more knowledge will say if any of the btrfs kernel
debug features are worth enabling? I suspect those debug features are
only useful to gather more information as the file system is being
used and encounters the first problem, the URE, and any subsequent
events that caused confusion and then the self-corruption of the fs
beyond repair. If so, that implies a whole new fs, and then trying to
reproduce the conditions that caused the problem.

Which brings me to... hdparm has a dangerous --make-bad-sector option
for testing RAID. I wonder if qemu has such an option? I'd rather test
this in a VM than use a "do not ever use" option in hdparm.

--
Chris Murphy

^ permalink raw reply [flat|nested] 23+ messages in thread
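For anyone wondering "what to look for" in a large log, the three sample kernel messages in this reply can be turned into a small filter. This is a sketch only: the regular expressions are written against the exact wording quoted above, and other kernel versions may word these messages differently, so treat the patterns and the `summarize` helper as assumptions rather than a stable interface.

```python
import re

SAMPLE_LOG = """\
[48466.853589] BTRFS: fixed up error at logical 20971520 on dev /dev/sdb
[ 1540.865534] BTRFS: bdev /dev/sdb errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
[ 1540.866944] BTRFS: unable to fixup (regular) error at logical 12845056 on dev /dev/sdb
"""

# Patterns matched to the wording of the sample lines only.
FIXUP_RE = re.compile(
    r"BTRFS: (fixed up|unable to fixup \(\w+\)) error at logical (\d+) on dev (\S+)"
)
ERRS_RE = re.compile(
    r"BTRFS: bdev (\S+) errs: wr (\d+), rd (\d+), flush (\d+), corrupt (\d+), gen (\d+)"
)
COUNTER_NAMES = ("wr", "rd", "flush", "corrupt", "gen")


def summarize(log_text):
    """Split Btrfs repair messages into repaired / unrepaired and collect error counters."""
    repaired, unrepaired, counters = [], [], {}
    for line in log_text.splitlines():
        m = FIXUP_RE.search(line)
        if m:
            target = repaired if m.group(1) == "fixed up" else unrepaired
            target.append((m.group(3), int(m.group(2))))
            continue
        m = ERRS_RE.search(line)
        if m:
            values = [int(v) for v in m.groups()[1:]]
            counters[m.group(1)] = dict(zip(COUNTER_NAMES, values))
    return repaired, unrepaired, counters


repaired, unrepaired, counters = summarize(SAMPLE_LOG)
print("repaired:  ", repaired)
print("unrepaired:", unrepaired)
print("counters:  ", counters)
```

The same function could be fed the text of /var/log/messages, or kernel messages from an earlier boot (for example the output of `journalctl -k -b -1`, where the number after -b selects the boot), to see whether Btrfs ever reported repairing or failing to repair anything while the pool was still mounted.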
* Re: Any hope of pool recovery? 2015-07-02 14:49 ` Donald Pearson 2015-07-02 16:58 ` Chris Murphy @ 2015-07-02 17:00 ` Chris Murphy 2015-07-02 18:19 ` Donald Pearson 1 sibling, 1 reply; 23+ messages in thread From: Chris Murphy @ 2015-07-02 17:00 UTC (permalink / raw) To: Donald Pearson; +Cc: Chris Murphy, Btrfs BTRFS On Thu, Jul 2, 2015 at 8:49 AM, Donald Pearson <donaldwhpearson@gmail.com> wrote: > I do see plenty of complaints about the sdg drive (previously sde) in > /var/log/messages from the 28th which is when I started noticing > issues. Nothing is jumping out at me claiming the btrfs is taking > action but I may not know what to look for. I'd include that entire log with the bug report. I'd like to skim it at least. Even logs from earlier might be useful. -- Chris Murphy ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Any hope of pool recovery? 2015-07-02 17:00 ` Chris Murphy @ 2015-07-02 18:19 ` Donald Pearson 2015-07-02 18:26 ` Chris Murphy 2015-07-03 9:31 ` Duncan 0 siblings, 2 replies; 23+ messages in thread From: Donald Pearson @ 2015-07-02 18:19 UTC (permalink / raw) To: Chris Murphy; +Cc: Btrfs BTRFS

Unfortunately btrfs-image fails with "couldn't read chunk tree".

btrfs restore complains that every device is missing except the one
that you specify on executing the command. Multiple devices as a
parameter isn't an option. Specifying /dev/disk/by-uuid/<uuid> claims
that all devices are missing.

I went ahead and dropped the drive that dmesg is still complaining
about. Mounting still fails, so I'm going to try btrfs rescue
chunk-recover again (for science!).

If anybody has any other ideas to try, or data to gather (and methods
to gather it) as a case study for any devs, please let me know. I'll
assemble all the data that I know how to and follow that link Chris
suggested for filing a bug.

On Thu, Jul 2, 2015 at 12:00 PM, Chris Murphy <lists@colorremedies.com> wrote:
> On Thu, Jul 2, 2015 at 8:49 AM, Donald Pearson
> <donaldwhpearson@gmail.com> wrote:
>
>> I do see plenty of complaints about the sdg drive (previously sde) in
>> /var/log/messages from the 28th which is when I started noticing
>> issues. Nothing is jumping out at me claiming the btrfs is taking
>> action but I may not know what to look for.
>
> I'd include that entire log with the bug report. I'd like to skim it
> at least. Even logs from earlier might be useful.
>
> --
> Chris Murphy

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Any hope of pool recovery? 2015-07-02 18:19 ` Donald Pearson @ 2015-07-02 18:26 ` Chris Murphy 2015-07-02 18:32 ` Donald Pearson 2015-07-03 9:31 ` Duncan 1 sibling, 1 reply; 23+ messages in thread From: Chris Murphy @ 2015-07-02 18:26 UTC (permalink / raw) To: Btrfs BTRFS

On Thu, Jul 2, 2015 at 12:19 PM, Donald Pearson <donaldwhpearson@gmail.com> wrote:
> Unfortunately btrfs-image fails with "couldn't read chunk tree".
>
> btrfs restore complains that every device is missing except the one
> that you specify on executing the command. Multiple devices as a
> parameter isn't an option. Specifying /dev/disk/by-uuid/<uuid> claims
> that all devices are missing.

Sounds like restore isn't raid56 aware yet?

--
Chris Murphy

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Any hope of pool recovery? 2015-07-02 18:26 ` Chris Murphy @ 2015-07-02 18:32 ` Donald Pearson 2015-07-02 18:37 ` Chris Murphy 0 siblings, 1 reply; 23+ messages in thread From: Donald Pearson @ 2015-07-02 18:32 UTC (permalink / raw) To: Chris Murphy; +Cc: Btrfs BTRFS I think it is. I have another raid5 pool that I've created to test the restore function on, and it worked. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Any hope of pool recovery? 2015-07-02 18:32 ` Donald Pearson @ 2015-07-02 18:37 ` Chris Murphy 2015-07-02 18:45 ` Donald Pearson 0 siblings, 1 reply; 23+ messages in thread From: Chris Murphy @ 2015-07-02 18:37 UTC (permalink / raw) To: Btrfs BTRFS On Thu, Jul 2, 2015 at 12:32 PM, Donald Pearson <donaldwhpearson@gmail.com> wrote: > I think it is. I have another raid5 pool that I've created to test > the restore function on, and it worked. So you have all devices for this raid6 available, and yet when you use restore, you get missing device message for all devices except the one specified? But that doesn't happen with the raid5 volume? -- Chris Murphy ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Any hope of pool recovery? 2015-07-02 18:37 ` Chris Murphy @ 2015-07-02 18:45 ` Donald Pearson 2015-07-02 18:54 ` Donald Pearson 0 siblings, 1 reply; 23+ messages in thread From: Donald Pearson @ 2015-07-02 18:45 UTC (permalink / raw) To: Chris Murphy; +Cc: Btrfs BTRFS That is correct. I'm going to rebalance my raid5 pool as raid6 and re-test just because. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Any hope of pool recovery? 2015-07-02 18:45 ` Donald Pearson @ 2015-07-02 18:54 ` Donald Pearson 0 siblings, 0 replies; 23+ messages in thread From: Donald Pearson @ 2015-07-02 18:54 UTC (permalink / raw) To: Chris Murphy; +Cc: Btrfs BTRFS

Yes it works with raid6 as well.

[root@san01 btrfs-progs]# ./btrfs fi show
Label: 'rockstor_rockstor' uuid: 08d14b6f-18df-4b1b-a91e-4b33e7c90c29
    Total devices 1 FS bytes used 19.25GiB
    devid 1 size 457.40GiB used 457.40GiB path /dev/sdt3

warning, device 4 is missing
warning, device 2 is missing
warning devid 2 not found already
warning devid 4 not found already
checksum verify failed on 21364736 found 925303CE wanted 09150E74
checksum verify failed on 21364736 found 925303CE wanted 09150E74
bytenr mismatch, want=21364736, have=1065943040
Couldn't read chunk tree
Label: 'backup' uuid: 68be4632-93ba-4478-9098-2ecb23ee6c94
    Total devices 5 FS bytes used 978.72MiB
    devid 1 size 2.73TiB used 1.62GiB path /dev/sdb
    devid 3 size 2.73TiB used 1.62GiB path /dev/sdi
    devid 4 size 2.73TiB used 1.62GiB path /dev/sdj
    devid 5 size 2.73TiB used 1.62GiB path /dev/sdn
    devid 6 size 2.73TiB used 1.62GiB path /dev/sdq

Label: 'tank' uuid: 8a03f8e8-8b84-4d1b-b27d-e23ef8ebe21d
    Total devices 10 FS bytes used 5.67TiB
    devid 1 size 1.36TiB used 792.67GiB path /dev/sdc
    devid 3 size 1.82TiB used 792.65GiB path /dev/sdf
    devid 5 size 1.36TiB used 792.65GiB path /dev/sdg
    devid 6 size 1.36TiB used 792.65GiB path /dev/sdh
    devid 7 size 1.82TiB used 792.65GiB path /dev/sdk
    devid 8 size 1.82TiB used 792.65GiB path /dev/sdl
    devid 9 size 1.82TiB used 792.65GiB path /dev/sdm
    devid 10 size 1.82TiB used 792.65GiB path /dev/sdp
    *** Some devices missing

btrfs-progs v4.1
[root@san01 btrfs-progs]# mount /dev/sdb /mnt2/backup
[root@san01 btrfs-progs]# ./btrfs fi df /mnt2/backup
Data, RAID6: total=3.00GiB, used=977.56MiB
System, RAID6: total=96.00MiB, used=16.00KiB
Metadata, RAID6: total=1.03GiB, used=1.14MiB
GlobalReserve, single: total=16.00MiB, used=0.00B
[root@san01 btrfs-progs]# ll /mnt2/backup
total 1000000
-rw-r--r-- 1 root root 1024000000 Jul 2 13:48 test_file_1gb
[root@san01 btrfs-progs]# umount /mnt2/backup
[root@san01 btrfs-progs]# ./btrfs restore -xmv /dev/sdb ~
Restoring /root/test_file_1gb
Done searching
[root@san01 btrfs-progs]# ll ~
total 1000004
-rw-------. 1 root root 1101 Jun 20 23:18 anaconda-ks.cfg
drwxr-xr-x 1 root root 22 Jul 1 09:48 git
-rw-r--r-- 1 root root 1024000000 Jul 2 13:48 test_file_1gb
[root@san01 btrfs-progs]#

^ permalink raw reply [flat|nested] 23+ messages in thread
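Because devid is the stable identifier and /dev/sdX is not, it can help to pull the devid-to-path mapping out of the `btrfs fi show` text rather than reading it by eye. The sketch below parses the 'tank' stanza from the session above. It ASSUMES devids run contiguously from 1 to "Total devices", which happens to hold for this pool but is not guaranteed in general (devids can have gaps after device removals); the function name `devid_map` is made up.

```python
import re

TANK_STANZA = """\
Label: 'tank' uuid: 8a03f8e8-8b84-4d1b-b27d-e23ef8ebe21d
    Total devices 10 FS bytes used 5.67TiB
    devid 1 size 1.36TiB used 792.67GiB path /dev/sdc
    devid 3 size 1.82TiB used 792.65GiB path /dev/sdf
    devid 5 size 1.36TiB used 792.65GiB path /dev/sdg
    devid 6 size 1.36TiB used 792.65GiB path /dev/sdh
    devid 7 size 1.82TiB used 792.65GiB path /dev/sdk
    devid 8 size 1.82TiB used 792.65GiB path /dev/sdl
    devid 9 size 1.82TiB used 792.65GiB path /dev/sdm
    devid 10 size 1.82TiB used 792.65GiB path /dev/sdp
    *** Some devices missing
"""

DEVID_RE = re.compile(r"devid\s+(\d+)\s+size\s+\S+\s+used\s+\S+\s+path\s+(\S+)")


def devid_map(stanza):
    """Return ({devid: path}, [devids presumed missing]) for one filesystem stanza."""
    total = int(re.search(r"Total devices\s+(\d+)", stanza).group(1))
    present = {int(m.group(1)): m.group(2) for m in DEVID_RE.finditer(stanza)}
    # Assumption: devids are 1..total with no gaps.
    missing = [d for d in range(1, total + 1) if d not in present]
    return present, missing


present, missing = devid_map(TANK_STANZA)
print(present)
print(missing)
```

For this stanza the presumed-missing devids come out as 2 and 4, which agrees with the "device 2 is missing" and "device 4 is missing" warnings printed in the same session.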
* Re: Any hope of pool recovery? 2015-07-02 18:19 ` Donald Pearson 2015-07-02 18:26 ` Chris Murphy @ 2015-07-03 9:31 ` Duncan 2015-07-03 13:29 ` Martin Steigerwald 1 sibling, 1 reply; 23+ messages in thread From: Duncan @ 2015-07-03 9:31 UTC (permalink / raw) To: linux-btrfs

Donald Pearson posted on Thu, 02 Jul 2015 13:19:41 -0500 as excerpted:

> btrfs restore complains that every device is missing except the one that
> you specify on executing the command. Multiple devices as a parameter
> isn't an option. Specifying /dev/disk/by-uuid/<uuid> claims that all
> devices are missing.

That sounds like the kernel lost track of which devices correspond to
the filesystem. Try issuing the btrfs device scan command before
restore, and see if that changes things. (If it doesn't, try btrfs
device scan --all-devices, just in case.)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Any hope of pool recovery? 2015-07-03 9:31 ` Duncan @ 2015-07-03 13:29 ` Martin Steigerwald 2015-07-03 15:05 ` Donald Pearson 0 siblings, 1 reply; 23+ messages in thread From: Martin Steigerwald @ 2015-07-03 13:29 UTC (permalink / raw) To: linux-btrfs

On Friday 03 July 2015 09:31:03 Duncan wrote:
> Donald Pearson posted on Thu, 02 Jul 2015 13:19:41 -0500 as excerpted:
> > btrfs restore complains that every device is missing except the one that
> > you specify on executing the command. Multiple devices as a parameter
> > isn't an option. Specifying /dev/disk/by-uuid/<uuid> claims that all
> > devices are missing.
>
> That sounds like the kernel lost track of what devices correspond with
> the filesystem. Try issuing the btrfs device scan command before
> restore, and see if that changes things. (If it doesn't, try btrfs
> device scan --all-devices, just in case.)

Also, does running blkid and/or file -sk on each device show that the
BTRFS signatures are still there?

--
Martin

^ permalink raw reply [flat|nested] 23+ messages in thread
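What blkid and file look for on a Btrfs member device is, roughly, the superblock magic. As a hedged illustration of that check (not a replacement for either tool), the sketch below tests for the 8-byte magic `_BHRfS_M` at offset 0x40 inside the primary superblock, which sits 64 KiB into the device. The function name is invented, and the demo runs on in-memory images so it needs no real disk.

```python
import io

BTRFS_SUPER_OFFSET = 0x10000   # primary superblock starts 64 KiB into the device
BTRFS_MAGIC_OFFSET = 0x40      # offset of the magic field within the superblock
BTRFS_MAGIC = b"_BHRfS_M"


def has_btrfs_signature(dev):
    """Return True if the binary file object carries the primary Btrfs magic."""
    dev.seek(BTRFS_SUPER_OFFSET + BTRFS_MAGIC_OFFSET)
    return dev.read(len(BTRFS_MAGIC)) == BTRFS_MAGIC


# Demo on two fake "devices": one zeroed, one with the magic written in place.
blank_image = io.BytesIO(bytes(0x11000))

signed = bytearray(0x11000)
start = BTRFS_SUPER_OFFSET + BTRFS_MAGIC_OFFSET
signed[start:start + len(BTRFS_MAGIC)] = BTRFS_MAGIC
signed_image = io.BytesIO(bytes(signed))

print(has_btrfs_signature(blank_image))
print(has_btrfs_signature(signed_image))
```

On a real system the same function could be called on a device opened read-only, for example `open("/dev/sdc", "rb")` as root (the path is just an example). This is relevant to this thread because the first message mentions running wipefs -a on /dev/sdg, and erasing signatures is exactly what that command is for.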
* Re: Any hope of pool recovery? 2015-07-03 13:29 ` Martin Steigerwald @ 2015-07-03 15:05 ` Donald Pearson 2015-07-03 17:51 ` Chris Murphy 0 siblings, 1 reply; 23+ messages in thread From: Donald Pearson @ 2015-07-03 15:05 UTC (permalink / raw) To: Martin Steigerwald; +Cc: Btrfs BTRFS

Thanks for the input, guys. Yes, I did learn to perform a device scan
--all-devices. It seems that the chunk tree is vital to a lot of
functionality and the recovery tools are no exception.

I suspect that I ran into the raid56 caveat "btrfs does not deal well
with a drive that is present but not working" described here:
http://marc.merlins.org/perso/btrfs/post_2014-03-23_Btrfs-Raid5-Status.html

That page also mentions raid56 caveats where drives that are no good
and/or out of sync are still picked right up by the pool, leading to
data loss. I should have studied it more closely earlier, because
issuing a btrfs device delete is exactly what I tried to do, and things
went downhill from there.

I did some more digging and found that I had a lot of errors on
basically every drive. I think the incident that caused my backup pool
to fail did more harm than I initially thought. Last night I
disassembled the box and inspected/re-seated everything, and now I'm
scanning each device for read and write errors.

So one particularly bad drive + multiple raid56 caveats + all the
other drives behaving a little funny + a guy who didn't read as well
as he should, and I really can't be surprised at the result.

So I've decided to cash in the pool for some valuable lessons and
move on. Thanks to everybody for your thoughts and help here and on
IRC (a responsive IRC, that's a refreshing difference) and I'll see
what adventures btrfs round 2 brings.

Regards, Donald (seijirou) ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Any hope of pool recovery? 2015-07-03 15:05 ` Donald Pearson @ 2015-07-03 17:51 ` Chris Murphy 2015-07-06 12:08 ` Austin S Hemmelgarn 0 siblings, 1 reply; 23+ messages in thread From: Chris Murphy @ 2015-07-03 17:51 UTC (permalink / raw) To: Btrfs BTRFS

On Fri, Jul 3, 2015 at 9:05 AM, Donald Pearson <donaldwhpearson@gmail.com> wrote:

> I did some more digging and found that I had a lot of errors on
> basically every drive.

Ick. Sucks for you, but it makes this less of a Btrfs problem, because
Btrfs can really only do so much if more drives than the number of
spares have problems. It does suggest a more aggressive need for the
volume to go read-only in such cases though, before it gets this
corrupt.

Multiple disk problems like this, though, suggest a shared hardware
problem like a controller or expander.

> So one particularly bad drive + multiple raid56 caveats + all the
> other drives behaving a little funny + a guy who didn't read as well
> as he should, and I really can't be surprised at the result.
>
> So I've decided to cash in the pool for some valuable lessons and
> move on. Thanks to everybody for your thoughts and help here and on
> IRC (a responsive IRC, that's a refreshing difference) and I'll see
> what adventures btrfs round 2 brings.

Yep!

--
Chris Murphy

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Any hope of pool recovery? 2015-07-03 17:51 ` Chris Murphy @ 2015-07-06 12:08 ` Austin S Hemmelgarn 0 siblings, 0 replies; 23+ messages in thread From: Austin S Hemmelgarn @ 2015-07-06 12:08 UTC (permalink / raw) To: Chris Murphy, Btrfs BTRFS

On 2015-07-03 13:51, Chris Murphy wrote:
> On Fri, Jul 3, 2015 at 9:05 AM, Donald Pearson
> <donaldwhpearson@gmail.com> wrote:
>
>> I did some more digging and found that I had a lot of errors basically
>> every drive.
>
> Ick. Sucks for you but then makes this less of a Btrfs problem because
> it can really only do so much if more than the number of spares have
> problems. It does suggest a more aggressive need for the volume to go
> read only in such cases though, before it gets this corrupt.

I'd almost say this is something that should be configurable. The
default should probably be that the fs goes read-only once there have
been errors on at least as many drives as there are spares, while
still providing the option to choose between that, going read-only
immediately on the first error, or only going read-only on write
errors.

> Multiple disk problems like this though suggest a shared hardware
> problem like a controller or expander.

I have to agree with this statement. I've seen stuff like this before
(although thankfully not on BTRFS), and 100% of the time the root
cause was either the storage controller or system RAM.

^ permalink raw reply [flat|nested] 23+ messages in thread
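The three behaviours suggested in this message can be written down as a small decision function. To be clear, this is a purely hypothetical sketch of the proposal: Btrfs had no such option according to this thread, and every name below (`ReadOnlyPolicy`, `should_go_read_only`, the parameters) is invented for illustration. "Spares" is read here as the number of device failures the profile tolerates, for example 2 for raid6.

```python
from enum import Enum


class ReadOnlyPolicy(Enum):
    """Hypothetical policy names; not an existing Btrfs mount option."""
    ERRORS_REACH_SPARES = "errors-reach-spares"   # the proposed default
    FIRST_ERROR = "first-error"
    WRITE_ERRORS_ONLY = "write-errors-only"


def should_go_read_only(policy, devices_with_errors, devices_with_write_errors, spares):
    """Decide whether the filesystem should flip to read-only under a policy."""
    if policy is ReadOnlyPolicy.FIRST_ERROR:
        return devices_with_errors >= 1
    if policy is ReadOnlyPolicy.WRITE_ERRORS_ONLY:
        return devices_with_write_errors >= 1
    # Default: errors on at least as many drives as there are spares.
    return devices_with_errors >= spares


# raid6 tolerates two failed devices, so spares=2 in these examples.
one_bad_drive = should_go_read_only(ReadOnlyPolicy.ERRORS_REACH_SPARES, 1, 0, spares=2)
two_bad_drives = should_go_read_only(ReadOnlyPolicy.ERRORS_REACH_SPARES, 2, 0, spares=2)
strict = should_go_read_only(ReadOnlyPolicy.FIRST_ERROR, 1, 0, spares=2)
print(one_bad_drive, two_bad_drives, strict)
```

Under the proposed default, a single misbehaving drive in a raid6 pool would leave the filesystem writable, while errors on a second drive would stop writes, well before the "errors on basically every drive" situation described earlier in the thread.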