Any hope of pool recovery?

* Any hope of pool recovery?
@ 2015-07-01 15:39 Donald Pearson
  2015-07-01 15:50 ` Chris Murphy
  0 siblings, 1 reply; 23+ messages in thread
From: Donald Pearson @ 2015-07-01 15:39 UTC (permalink / raw)
  To: linux-btrfs

Hello,

"darkling" was helping me on IRC for a while before he had to drop
off, thanks for the help darkling.

To pick up where we left off...
In summary, I have a 10 disk raid6 pool that I cannot mount.

btrfs fi show output is here ->  http://pastebin.com/aidGV20e
'tank' is the pool in question.

Mounting fails with errors in dmesg, with or without the
recovery,degraded,ro options.

[  142.588443] BTRFS: device label tank devid 1 transid 14796 /dev/sdc
[  142.589646] BTRFS info (device sdc): enabling auto recovery
[  142.589658] BTRFS info (device sdc): allowing degraded mounts
[  142.589665] BTRFS info (device sdc): disk space caching is enabled
[  142.589669] BTRFS: has skinny extents
[  142.592199] BTRFS: failed to read chunk root on sdc
[  142.612988] BTRFS: open_ctree failed

What precipitated all this was horrible performance from the pool:
service times for /dev/sdg were ~3 seconds and smartctl
reported many sector issues with /dev/sdg.
I issued the command btrfs device delete /dev/sdg and then monitored
btrfs fi show, but saw no change in the data allocated to /dev/sdg for
several hours.
I then attempted wipefs -a /dev/sdg, but the device was still listed in
btrfs fi show.
I then rebooted, and that is where I am now.  I figured
it's best to stop breaking things and ask for help, if this can be
recovered.

Thank you,
Donald (seijirou)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Any hope of pool recovery?
  2015-07-01 15:39 Any hope of pool recovery? Donald Pearson
@ 2015-07-01 15:50 ` Chris Murphy
  2015-07-01 16:09   ` Donald Pearson
  0 siblings, 1 reply; 23+ messages in thread
From: Chris Murphy @ 2015-07-01 15:50 UTC (permalink / raw)
  To: Btrfs BTRFS

btrfs-progs version is 4.0; which kernel versions have you tried
to mount with?

I suggest running btrfs check (without --repair) and including the
full output. There are a lot of changes in btrfs-progs 4.1, but
offhand I don't know that they'd affect btrfs check results.

Chris Murphy


* Re: Any hope of pool recovery?
  2015-07-01 15:50 ` Chris Murphy
@ 2015-07-01 16:09   ` Donald Pearson
  2015-07-01 18:58     ` Donald Pearson
  0 siblings, 1 reply; 23+ messages in thread
From: Donald Pearson @ 2015-07-01 16:09 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

Thanks Chris,

To my shame it turns out darkling didn't drop off IRC after all; I'm
new to all this and learning quickly that I need to sit on my hands.
I admit that, despite darkling's suggestion that my usertools are
probably fine, I pulled down a newer kernel from elrepo, so currently
I'm running 4.1.1-1.el7.elrepo.x86_64

I started with 4.0.2-1.el7.elrepo.x86_64

I also do have btrfs-progs 4.1 that I got from git.

Here is the 4.0 output
[root@san01 btrfs-progs]# btrfs check /dev/sdc
checksum verify failed on 21364736 found E4E3BDB6 wanted 00000000
checksum verify failed on 21364736 found E4E3BDB6 wanted 00000000
checksum verify failed on 21364736 found EC809498 wanted 0863292E
checksum verify failed on 21364736 found 925303CE wanted 09150E74
checksum verify failed on 21364736 found 925303CE wanted 09150E74
bytenr mismatch, want=21364736, have=1065943040
Couldn't read chunk tree
Couldn't open file system

Here is the 4.1 output
[root@san01 btrfs-progs]# ./btrfs check /dev/sdc
checksum verify failed on 21364736 found E4E3BDB6 wanted 00000000
checksum verify failed on 21364736 found E4E3BDB6 wanted 00000000
checksum verify failed on 21364736 found EC809498 wanted 0863292E
checksum verify failed on 21364736 found 925303CE wanted 09150E74
checksum verify failed on 21364736 found 925303CE wanted 09150E74
bytenr mismatch, want=21364736, have=1065943040
Couldn't read chunk tree
Couldn't open file system
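
For anyone skimming the two runs above: they are identical, and every
failure is at the same bytenr. Below is a minimal Python sketch for
tallying the distinct found/wanted pairs in such output. It is only an
illustration: the names CSUM_RE and summarize are made up here and are
not part of btrfs-progs, and the regex assumes the exact line format
shown above.

```python
import re
from collections import Counter

# Assumed line format, copied from the btrfs check output above:
#   checksum verify failed on 21364736 found E4E3BDB6 wanted 00000000
CSUM_RE = re.compile(
    r"checksum verify failed on (\d+) found ([0-9A-Fa-f]{8}) wanted ([0-9A-Fa-f]{8})"
)


def summarize(output):
    """Count each distinct (bytenr, found, wanted) triple in the output."""
    return Counter(
        (int(m.group(1)), m.group(2).upper(), m.group(3).upper())
        for m in CSUM_RE.finditer(output)
    )


# Output of the btrfs check run, pasted verbatim from this message.
CHECK_OUTPUT = """\
checksum verify failed on 21364736 found E4E3BDB6 wanted 00000000
checksum verify failed on 21364736 found E4E3BDB6 wanted 00000000
checksum verify failed on 21364736 found EC809498 wanted 0863292E
checksum verify failed on 21364736 found 925303CE wanted 09150E74
checksum verify failed on 21364736 found 925303CE wanted 09150E74
bytenr mismatch, want=21364736, have=1065943040
Couldn't read chunk tree
Couldn't open file system
"""

summary = summarize(CHECK_OUTPUT)
for (bytenr, found, wanted), count in sorted(summary.items()):
    print(f"bytenr {bytenr}: found {found}, wanted {wanted} (x{count})")
```

All five failures land on bytenr 21364736, with three distinct
found/wanted pairs among them.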

Finally, before I learned of this mailing list I started a run of
btrfs rescue chunk-recover
[root@san01 btrfs-progs]# ./btrfs rescue chunk-recover -v /dev/sdc

I can see now through iostat that all 10 drives are reading as fast as
they can and my understanding is this will take a long time, but I've
since learned (not only that darkling was still alive on IRC) that
this probably won't solve my problem.

Regards,
Donald (seijirou)


* Re: Any hope of pool recovery?
  2015-07-01 16:09   ` Donald Pearson
@ 2015-07-01 18:58     ` Donald Pearson
  2015-07-01 19:05       ` Donald Pearson
  0 siblings, 1 reply; 23+ messages in thread
From: Donald Pearson @ 2015-07-01 18:58 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

Small update on this, with no idea if this is useful information or not.

At some point within the last hour iostat shows that /dev/sdg is no
longer under heavy reads.

The other 9 drives however are still reading as fast as they are able.
There is no new output on the `btrfs rescue chunk-recover` screen so I
expect it's still running.

There are 4 other drives with the same total capacity as sdg, so I
would have expected them all to complete at about the same
time.

Regards,
Donald


* Re: Any hope of pool recovery?
  2015-07-01 18:58     ` Donald Pearson
@ 2015-07-01 19:05       ` Donald Pearson
  2015-07-01 21:35         ` Donald Pearson
  0 siblings, 1 reply; 23+ messages in thread
From: Donald Pearson @ 2015-07-01 19:05 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

I should have thought to check this and include it earlier.  I'm seeing
errors for /dev/sdg in dmesg (not surprised; I wanted this drive out of
the pool to begin with because it's sick).

[  142.612988] BTRFS: open_ctree failed
[11836.105577] sd 0:0:6:0: [sdg] FAILED Result: hostbyte=DID_OK
driverbyte=DRIVER_SENSE
[11836.105585] sd 0:0:6:0: [sdg] Sense Key : Medium Error [current]
[11836.105589] sd 0:0:6:0: [sdg] Add. Sense: Unrecovered read error
[11836.105592] sd 0:0:6:0: [sdg] CDB: Read(10) 28 00 5a 5b f1 b8 00 01 00 00
[11836.105596] blk_update_request: critical medium error, dev sdg,
sector 1515975096
[11839.044815] mpt2sas0: log_info(0x31080000): originator(PL),
code(0x08), sub_code(0x0000)
[11839.044843] sd 0:0:6:0: [sdg] FAILED Result: hostbyte=DID_OK
driverbyte=DRIVER_SENSE
[11839.044848] sd 0:0:6:0: [sdg] Sense Key : Medium Error [current]
[11839.044857] sd 0:0:6:0: [sdg] Add. Sense: Unrecovered read error
[11839.044862] sd 0:0:6:0: [sdg] CDB: Read(10) 28 00 5a 5b f2 b8 00 01 00 00
[11839.044865] blk_update_request: critical medium error, dev sdg,
sector 1515975352
[11842.009545] sd 0:0:6:0: [sdg] FAILED Result: hostbyte=DID_OK
driverbyte=DRIVER_SENSE
[11842.009554] sd 0:0:6:0: [sdg] Sense Key : Medium Error [current]
[11842.009558] sd 0:0:6:0: [sdg] Add. Sense: Unrecovered read error
[11842.009562] sd 0:0:6:0: [sdg] CDB: Read(10) 28 00 5a 5b f2 80 00 00 08 00
[11842.009565] blk_update_request: critical medium error, dev sdg,
sector 1515975296
[11842.009934] Buffer I/O error on dev sdg, logical block 189496912,
async page read
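
The CDB bytes in these messages can be cross-checked against the sector
numbers the kernel reports. In a SCSI READ(10) command, byte 0 is the
opcode (0x28), bytes 2-5 are the big-endian logical block address, and
bytes 7-8 are the transfer length in blocks. A small Python sketch of
that decoding follows; the function name decode_read10 is made up for
illustration.

```python
def decode_read10(cdb_hex):
    """Return (lba, transfer_length_in_blocks) for a SCSI READ(10) CDB."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    length = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length
    return lba, length


# The three CDBs from the dmesg output above.
for cdb_hex in (
    "28 00 5a 5b f1 b8 00 01 00 00",
    "28 00 5a 5b f2 b8 00 01 00 00",
    "28 00 5a 5b f2 80 00 00 08 00",
):
    lba, length = decode_read10(cdb_hex)
    print(f"{cdb_hex}: LBA {lba}, {length} blocks")
```

The decoded LBAs are 1515975096, 1515975352 and 1515975296, the same
sector numbers that blk_update_request prints. The last read is only 8
blocks long, and 1515975296 / 8 = 189496912, the "logical block" in the
Buffer I/O error line. That fits a 4 KiB buffer made of 8 x 512-byte
sectors, though the buffer size is my inference from the numbers rather
than something the log states.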


* Re: Any hope of pool recovery?
  2015-07-01 19:05       ` Donald Pearson
@ 2015-07-01 21:35         ` Donald Pearson
  2015-07-01 23:29           ` Chris Murphy
  0 siblings, 1 reply; 23+ messages in thread
From: Donald Pearson @ 2015-07-01 21:35 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

Here is the result of the attempted rescue chunk-recover

[root@san01 btrfs-progs]# ./btrfs rescue chunk-recover -v /dev/sdc
All Devices:
        Device: id = 7, name = /dev/sdl
        Device: id = 8, name = /dev/sdm
        Device: id = 9, name = /dev/sdn
        Device: id = 3, name = /dev/sdf
        Device: id = 6, name = /dev/sdi
        Device: id = 4, name = /dev/sdg
        Device: id = 5, name = /dev/sdh
        Device: id = 2, name = /dev/sdd
        Device: id = 10, name = /dev/sdq
        Device: id = 1, name = /dev/sdc

*** Error in `./btrfs': free(): invalid next size (fast): 0x0000000001332100 ***
Segmentation fault


* Re: Any hope of pool recovery?
  2015-07-01 21:35         ` Donald Pearson
@ 2015-07-01 23:29           ` Chris Murphy
  2015-07-02  1:38             ` Donald Pearson
  0 siblings, 1 reply; 23+ messages in thread
From: Chris Murphy @ 2015-07-01 23:29 UTC (permalink / raw)
  To: Btrfs BTRFS

On Wed, Jul 1, 2015 at 3:35 PM, Donald Pearson
<donaldwhpearson@gmail.com> wrote:

> *** Error in `./btrfs': free(): invalid next size (fast): 0x0000000001332100 ***
> Segmentation fault

Blek. Well that's a bug then too. If you have space somewhere to put a
btrfs-image -c9 -t4, I'd do that now before making any more changes.
Write up a bugzilla.kernel.org bug, include the URL for the image file
(which will be large). Include the URL for the bug in this thread. And
then it's wait time basically. I'm not a dev but this sounds rather
serious.

The pisser is that this is exactly the use case for raid6. You have a
failed drive, want an extra margin to cover possible additional
errors, you get a "BTRFS: failed to read chunk root on sdc" which
could be construed as a problem with sdc, so a 2nd failure, and yet no
reconstruction of the necessary metadata.

Is metadata also raid6? Or just data? I don't see a 'btrfs fi df'
probably because you can't mount the volume. Do you know if it was
created with -d raid6 -m raid6 at mkfs time? (Include this info in the
bug report.)

Failing device handling with Btrfs is still weak. In many cases it
will keep trying to use a device that produces spurious errors or
outright failed reads and writes. It's possible this caused some
confusion.

I propose trying the following. You could wait to see if someone else
has better suggestions, but this seems reasonably safe.

- Physically remove sdg from the system, reboot, and see if you can
mount the volume with the most conservative mount option: -o
ro,recovery,degraded,skip_balance

If that doesn't work, and you still get the message about chunk root
on devid 1/sdc (thing is, when you remove sdg it's possible drive
letters will change, so be sure to correlate any errors to devid by
using a current 'btrfs fi show' listing), then yuck.
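
A rough sketch of that correlation step, in Python. It parses the
"All Devices" listing format that chunk-recover -v printed earlier in
this thread (not the 'btrfs fi show' format, which is laid out
differently). The names DEVICE_RE and devid_map are made up for
illustration, and the listing below was captured while sdg was still
attached, so it would need to be re-captured after pulling the drive.

```python
import re

# Assumed line format, from the chunk-recover -v output in this thread:
#         Device: id = 7, name = /dev/sdl
DEVICE_RE = re.compile(r"Device: id = (\d+), name = (\S+)")


def devid_map(listing):
    """Build a {devid: device name} mapping from an 'All Devices' listing."""
    return {int(devid): name for devid, name in DEVICE_RE.findall(listing)}


# Listing as posted earlier in the thread (sdg still present).
LISTING = """\
All Devices:
        Device: id = 7, name = /dev/sdl
        Device: id = 8, name = /dev/sdm
        Device: id = 9, name = /dev/sdn
        Device: id = 3, name = /dev/sdf
        Device: id = 6, name = /dev/sdi
        Device: id = 4, name = /dev/sdg
        Device: id = 5, name = /dev/sdh
        Device: id = 2, name = /dev/sdd
        Device: id = 10, name = /dev/sdq
        Device: id = 1, name = /dev/sdc
"""

devices = devid_map(LISTING)
for devid in sorted(devices):
    print(f"devid {devid:2d} -> {devices[devid]}")
```

Keying everything on devid rather than on the sdX name means an error
message can still be matched to the right physical drive after the
kernel reshuffles the letters.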

I would try chunk recover again, now that known bad drive sdg is
physically removed. Do you get a different result, or still a seg
fault?

If those two things still fail, what's next is a toss-up between two options:

- Find or build a "4.2" kernel (there is no rc1 yet); Fedora has
several "4.2"/linux-next binaries already built in the koji build
system, so your distro might have extremely new kernels available
somewhere for bleeding edgers. Try this with the above mount options
again. In the recent git pull for this kernel there were nearly 2000
lines added, and nearly that many deleted. A lot of changes. So it's
worth a shot. It could produce a good result or a worse result, or the
same result. *shrug* What I probably wouldn't try while running the
4.2 kernel is another chunk recover. Seems doubtful it will make much
difference.

and the other option:

- Physically remove the device that still produces the "BTRFS: failed
to read chunk root on sdX" error, which in the current state as you
posted it, was /dev/sdc (devid 1). Physically remove it. Reboot. And
then retry the same mount options from above and see what that results
in. If there were no problems with your file system, removing two
devices and mounting degraded should work without errors (I've done
it), so it seems like a valid thing to try seeing as two devices are
giving you a hard time. Will a 3rd? Dunno.

Anyway, not good news. But you're helping make Btrfs better!

-- 
Chris Murphy


* Re: Any hope of pool recovery?
  2015-07-01 23:29           ` Chris Murphy
@ 2015-07-02  1:38             ` Donald Pearson
  2015-07-02  2:31               ` Chris Murphy
  0 siblings, 1 reply; 23+ messages in thread
From: Donald Pearson @ 2015-07-02  1:38 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

Thanks Chris.

Everything is/was raid6.  Oddly when I created the filesystem there
was a mix of raid1 and raid6 but a balance dconvert mconvert after
creation set everything to raid6.

I did previously try a btrfs-image, as I found that as a "first thing
to do" through some Google searching, but that command won't run,
failing with essentially the same errors (there are additional "device
is missing" warnings now, but this is otherwise identical to what I
saw before).

I'm happy to help post a bug report but can I still provide actionable
information without btrfs-image working?

[root@san01 btrfs-progs]# ./btrfs-image -c9 -t4 /dev/sdc /mnt2/backup/sdc.img
warning, device 4 is missing
warning devid 4 not found already
checksum verify failed on 21364736 found EC809498 wanted 0863292E
checksum verify failed on 21364736 found 925303CE wanted 09150E74
checksum verify failed on 21364736 found 925303CE wanted 09150E74
bytenr mismatch, want=21364736, have=1065943040
Couldn't read chunk tree
Open ctree failed
create failed (Bad file descriptor)

So after the chunk-recover failed I postulated that there may be some
correlation with the read of /dev/sdg stopping early.  I say early
because the other 4 drives of the same capacity continued reading for
quite some time.

So I tested a dd of sdg to a file, and after it ran for about 2 hours
it stopped prematurely after 700 some-odd gigs and left some errors in
the logs (I'll just tack them on the end of the email for the
curious).

At this point I decided sdg was done and couldn't be doing any good
while installed, so I yanked it out.  Still unable to mount, I
rebooted.  Unfortunately I am still unable to mount after the reboot
(and I tried again just now with all the options you posted, no dice),
so I am running the chunk-recover command again.

That would be neat if I can somehow contribute!

Thanks again,
Donald

Here's the drive vomiting in my logs after it got halfway through the
dd image attempt.

Jul  1 17:05:51 san01 kernel: sd 0:0:6:0: [sdg] FAILED Result:
hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jul  1 17:05:51 san01 kernel: sd 0:0:6:0: [sdg] Sense Key : Medium
Error [current]
Jul  1 17:05:51 san01 kernel: sd 0:0:6:0: [sdg] Add. Sense:
Unrecovered read error
Jul  1 17:05:51 san01 kernel: sd 0:0:6:0: [sdg] CDB: Read(10) 28 00 5a
5b f1 e0 00 01 00 00
Jul  1 17:05:51 san01 kernel: blk_update_request: critical medium
error, dev sdg, sector 1515975136
Jul  1 17:05:57 san01 kernel: sd 0:0:6:0: [sdg] FAILED Result:
hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jul  1 17:05:57 san01 kernel: sd 0:0:6:0: [sdg] Sense Key : Medium
Error [current]
Jul  1 17:05:57 san01 kernel: sd 0:0:6:0: [sdg] Add. Sense:
Unrecovered read error
Jul  1 17:05:57 san01 kernel: sd 0:0:6:0: [sdg] CDB: Read(10) 28 00 5a
5b f2 e0 00 01 00 00
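
As a sanity check on where the dd died: the kernel block layer reports
sector numbers in 512-byte units, so the failing sector can be turned
into a byte offset and compared with how much dd managed to copy. A
minimal Python sketch follows; the names SECTOR_SIZE and
sector_to_offset are made up for illustration.

```python
SECTOR_SIZE = 512  # block-layer sectors are 512 bytes


def sector_to_offset(sector):
    """Convert a kernel sector number to a byte offset on the device."""
    return sector * SECTOR_SIZE


# Sector from the blk_update_request line in the log above.
failing_sector = 1515975136
offset = sector_to_offset(failing_sector)
print(f"{offset} bytes = {offset / 10**9:.1f} GB = {offset / 2**30:.1f} GiB")
```

That works out to roughly 776 GB (about 723 GiB) into the drive, which
lines up with dd giving up after 700-some gigs.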


* Re: Any hope of pool recovery?
  2015-07-02  1:38             ` Donald Pearson
@ 2015-07-02  2:31               ` Chris Murphy
  2015-07-02 14:49                 ` Donald Pearson
  0 siblings, 1 reply; 23+ messages in thread
From: Chris Murphy @ 2015-07-02  2:31 UTC (permalink / raw)
  To: Btrfs BTRFS

On Wed, Jul 1, 2015 at 7:38 PM, Donald Pearson
<donaldwhpearson@gmail.com> wrote:

> Here's the drive vomiting in my logs after it got halfway through the
> dd image attempt.
>
> Jul  1 17:05:51 san01 kernel: sd 0:0:6:0: [sdg] FAILED Result:
> hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Jul  1 17:05:51 san01 kernel: sd 0:0:6:0: [sdg] Sense Key : Medium
> Error [current]
> Jul  1 17:05:51 san01 kernel: sd 0:0:6:0: [sdg] Add. Sense:
> Unrecovered read error
> Jul  1 17:05:51 san01 kernel: sd 0:0:6:0: [sdg] CDB: Read(10) 28 00 5a
> 5b f1 e0 00 01 00 00
> Jul  1 17:05:51 san01 kernel: blk_update_request: critical medium
> error, dev sdg, sector 1515975136
> Jul  1 17:05:57 san01 kernel: sd 0:0:6:0: [sdg] FAILED Result:
> hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Jul  1 17:05:57 san01 kernel: sd 0:0:6:0: [sdg] Sense Key : Medium
> Error [current]
> Jul  1 17:05:57 san01 kernel: sd 0:0:6:0: [sdg] Add. Sense:
> Unrecovered read error
> Jul  1 17:05:57 san01 kernel: sd 0:0:6:0: [sdg] CDB: Read(10) 28 00 5a
> 5b f2 e0 00 01 00 00

This looks like a typical URE. There are a number of reasons why a
sector can be bad, but basically the drive ECC has given up being able
to correct the problem, and it reports the command, the error, and the
sector involved. What *should* happen is Btrfs reconstructs the data
(or metadata) on that sector, and then writes it (since kernel 3.19)
back to the bad sector LBA. The drive tries to write to that bad
sector, and verifies. If there is a persistent failure then that LBA
is remapped to a different physical sector and the bad one is removed
(has no LBA). There will be no kernel messages for this; it's all
handled in the drive itself.
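
As a cross-check on the log excerpt, the failing LBA can be read straight
out of the Read(10) CDB: byte 0 is the opcode (0x28), bytes 2-5 are the
big-endian LBA, and bytes 7-8 are the transfer length in blocks. A minimal
sketch in POSIX sh; the function name read10_decode is made up for
illustration and is not part of any tool:

```shell
#!/bin/sh
# Decode a SCSI Read(10) CDB as printed by the kernel
# ("CDB: Read(10) 28 00 5a 5b f1 e0 00 01 00 00").
# read10_decode is a hypothetical helper name.
read10_decode() {
    # $1..$10 are the ten CDB bytes in hex, without a 0x prefix.
    if [ "$#" -ne 10 ] || [ "$1" != "28" ]; then
        echo "not a Read(10) CDB" >&2
        return 1
    fi
    # Bytes 2-5 (counting from 0) hold the LBA, bytes 7-8 the length.
    lba=$(( 0x$3$4$5$6 ))
    blocks=$(( 0x$8$9 ))
    echo "lba=$lba blocks=$blocks"
}

read10_decode 28 00 5a 5b f1 e0 00 01 00 00
```

This prints lba=1515975136 blocks=256, which agrees with the sector
number in the blk_update_request line of the same log.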

But this sounds like a dd read of the raw device, where Btrfs is not
involved (because you can't mount the volume), so none of this
correction happens. What I wonder, though, is whether this same problem
shows up in the much earlier logs, from when the volume was mounted:
did Btrfs try to fix it, and were there problems fixing it?

So it might be useful if there's something in /var/log/messages or
journalctl -bX at the time the original problem was first developing.
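
On the journalctl side, the X reads as a placeholder for a boot offset,
so "-b -1" would select the previous boot (assuming the journal is kept
across reboots). A rough sketch of narrowing a log down to storage-related
lines; the helper name storage_errors and its pattern are assumptions
based on the messages quoted in this thread:

```shell
#!/bin/sh
# Keep only log lines that look like disk or Btrfs trouble.
# storage_errors is a hypothetical helper; the pattern is a guess at
# what is worth reading, not an exhaustive list.
storage_errors() {
    grep -E 'BTRFS|blk_update_request|Medium Error|Unrecovered read error'
}

# Typical use on a systemd machine (not run here):
#   journalctl --list-boots                  # find the boot of interest
#   journalctl -k -b -1 | storage_errors     # kernel messages, previous boot
#   grep ' kernel: ' /var/log/messages | storage_errors
```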

Bad sectors are completely ordinary. They're not really common: out of
maybe 50 drives I've had two exhibit this. But the drives are
designed to take this into account, and so are hardware RAID, Linux
kernel md raid, LVM raid, Btrfs, and ZFS. So... it's kinda
important to know more about this edge case to find out where the
problem is.

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Any hope of pool recovery?
  2015-07-02  2:31               ` Chris Murphy
@ 2015-07-02 14:49                 ` Donald Pearson
  2015-07-02 16:58                   ` Chris Murphy
  2015-07-02 17:00                   ` Chris Murphy
  0 siblings, 2 replies; 23+ messages in thread
From: Donald Pearson @ 2015-07-02 14:49 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

Hello,

At the bottom of this email are the results of the latest
chunk-recover.  I only included one example of the output that was
printed prior to the summary information but it went up to the end of
my screen buffer and beyond.

So it looks like the command executes properly when none of the drives
gives up on a read.  That said, my issue with mounting still exists,
unfortunately.  The errors in dmesg now complain about /dev/sdd.

[56496.014539] BTRFS (device sdd): bad tree block start 0 21364736

Which is curious because this is device id 2, where previously the
complaint was about device id 1.  So can I believe dmesg about which
drive is actually the issue or is the drive that's printed in dmesg
just whichever drive happens to be the last in some loop of code?
Theoretically I should be able to kick another drive out of the pool
safely, but I'm not sure which one to actually kick out or if that is
the appropriate next step.

I do see plenty of complaints about the sdg drive (previously sde) in
/var/log/messages from the 28th which is when I started noticing
issues.  Nothing is jumping out at me claiming the btrfs is taking
action but I may not know what to look for.

journalctl I'm not familiar with.  journalctl -bX returns with "failed
to parse relative boot ID number 'X'" but perhaps you meant X to be a
variable of some value?    journalctl -b does run, but I'm not sure
what to look for.

So, what does the audience suggest?  Shall I compile a newer kernel,
kick out another drive (which?), or take what's behind door #3 (which
is...?)

Thanks again everybody,
Donald

  Chunk: start = 6643489177600, len = 1073741824, type = 104, num_stripes = 10
      Stripes list:
      [ 0] Stripe: devid = 8, offset = 817549672448
      [ 1] Stripe: devid = 7, offset = 817549672448
      [ 2] Stripe: devid = 10, offset = 817549672448
      [ 3] Stripe: devid = 9, offset = 817549672448
      [ 4] Stripe: devid = 3, offset = 817549672448
      [ 5] Stripe: devid = 0, offset = 0
      [ 6] Stripe: devid = 0, offset = 0
      [ 7] Stripe: devid = 0, offset = 0
      [ 8] Stripe: devid = 0, offset = 0
      [ 9] Stripe: devid = 0, offset = 0
      Block Group: start = 6643489177600, len = 1073741824, flag = 104
      Device extent list:
          [ 0]Device extent: devid = 3, start = 817549672448, len =
134217728, chunk offset = 6643489177600
          [ 1]Device extent: devid = 9, start = 817549672448, len =
134217728, chunk offset = 6643489177600
          [ 2]Device extent: devid = 10, start = 817549672448, len =
134217728, chunk offset = 6643489177600
          [ 3]Device extent: devid = 7, start = 817549672448, len =
134217728, chunk offset = 6643489177600
          [ 4]Device extent: devid = 8, start = 817549672448, len =
134217728, chunk offset = 6643489177600
          [ 5]Device extent: devid = 4, start = 817549672448, len =
134217728, chunk offset = 6643489177600
          [ 6]Device extent: devid = 2, start = 817549672448, len =
134217728, chunk offset = 6643489177600
          [ 7]Device extent: devid = 1, start = 817569595392, len =
134217728, chunk offset = 6643489177600
          [ 8]Device extent: devid = 6, start = 817549672448, len =
134217728, chunk offset = 6643489177600
          [ 9]Device extent: devid = 5, start = 817549672448, len =
134217728, chunk offset = 6643489177600
  Chunk: start = 6886154829824, len = 8589934592, type = 101, num_stripes = 0
      Stripes list:
      Block Group: start = 6886154829824, len = 8589934592, flag = 101
      No device extent.
  Chunk: start = 6894744764416, len = 8589934592, type = 101, num_stripes = 0
      Stripes list:
      Block Group: start = 6894744764416, len = 8589934592, flag = 101
      No device extent.
  Chunk: start = 6903334699008, len = 8589934592, type = 101, num_stripes = 0
      Stripes list:
      Block Group: start = 6903334699008, len = 8589934592, flag = 101
      No device extent.

Total Chunks:           805
  Recoverable:          567
  Unrecoverable:        238

Orphan Block Groups:

Orphan Device Extents:
  Device extent: devid = 4, start = 819831373824, len = 1073741824,
chunk offset = 6661742788608
  Device extent: devid = 2, start = 819831373824, len = 1073741824,
chunk offset = 6661742788608
  Device extent: devid = 1, start = 819851296768, len = 1073741824,
chunk offset = 6661742788608
  Device extent: devid = 9, start = 819831373824, len = 1073741824,
chunk offset = 6661742788608
  Device extent: devid = 10, start = 819831373824, len = 1073741824,
chunk offset = 6661742788608
  Device extent: devid = 8, start = 819831373824, len = 1073741824,
chunk offset = 6661742788608
  Device extent: devid = 7, start = 819831373824, len = 1073741824,
chunk offset = 6661742788608
  Device extent: devid = 3, start = 819831373824, len = 1073741824,
chunk offset = 6661742788608
  Device extent: devid = 6, start = 819831373824, len = 1073741824,
chunk offset = 6661742788608
  Device extent: devid = 5, start = 819831373824, len = 1073741824,
chunk offset = 6661742788608

open with broken chunk error
Fail to recover the chunk tree.
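
A note on reading the dump above, assuming chunk-recover prints the
type/flag field in hexadecimal (the sizes are consistent with that
reading): 104 would be METADATA|RAID6 and 101 DATA|RAID6, using the
block group flag bits of the Btrfs on-disk format. The stripe arithmetic
fits too: a RAID6 chunk across 10 devices has 8 data stripes, and
8 x 134217728 = 1073741824. A small sketch with a made-up helper name:

```shell
#!/bin/sh
# Decode a Btrfs block group type given as a hex string (no 0x prefix).
# bg_flags is a hypothetical helper; the bit values are the
# BTRFS_BLOCK_GROUP_* constants from the on-disk format.
bg_flags() {
    v=$(( 0x$1 ))
    out=""
    for pair in 1:DATA 2:SYSTEM 4:METADATA 8:RAID0 16:RAID1 32:DUP \
                64:RAID10 128:RAID5 256:RAID6; do
        bit=${pair%%:*}
        name=${pair#*:}
        if [ $(( v & bit )) -ne 0 ]; then
            out="${out:+$out|}$name"
        fi
    done
    echo "${out:-none}"
}

bg_flags 104    # the metadata chunk in the dump
bg_flags 101    # the data chunks listed with num_stripes = 0
# Data bytes per chunk on 10 devices with two parity stripes:
echo $(( (10 - 2) * 134217728 ))
```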


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Any hope of pool recovery?
  2015-07-02 14:49                 ` Donald Pearson
@ 2015-07-02 16:58                   ` Chris Murphy
  2015-07-02 17:00                   ` Chris Murphy
  1 sibling, 0 replies; 23+ messages in thread
From: Chris Murphy @ 2015-07-02 16:58 UTC (permalink / raw)
  To: Donald Pearson; +Cc: Chris Murphy, Btrfs BTRFS

On Thu, Jul 2, 2015 at 8:49 AM, Donald Pearson
<donaldwhpearson@gmail.com> wrote:

> Which is curious because this is device id 2, where previously the
> complaint was about device id 1.  So can I believe dmesg about which
> drive is actually the issue or is the drive that's printed in dmesg
> just whichever drive happens to be the last in some loop of code?

devid is static/reliable
/dev/sdX is dynamic/unreliable and related to logic board's firmware

Some systems are more stable in this regard than others, I've worked
with systems that have different drive order every boot, even when
hardware configuration is unchanged. When the config changes, good bet
the drive letters will change.
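
Since devid stays fixed while /dev/sdX can move between boots, it can
help to record the current mapping before pulling anything. A minimal
sketch that extracts devid/path pairs from "btrfs filesystem show"
output; devid_map is a made-up helper name:

```shell
#!/bin/sh
# Print "devid path" pairs from `btrfs filesystem show` output on stdin.
# devid_map is a hypothetical helper name.
devid_map() {
    awk '$1 == "devid" && $(NF-1) == "path" { print $2, $NF }'
}

# Typical use (needs root and btrfs-progs, not run here):
#   btrfs filesystem show tank | devid_map
#   ls -l /dev/disk/by-id/     # stable names to match against drive serials
```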

> Theoretically I should be able to kick another drive out of the pool
> safely, but I'm not sure which one to actually kick out or if that is
> the appropriate next step.

My limited understanding at this point is that once you get
"open with broken chunk error / Fail to recover the chunk tree."
from chunk-recover, you've reached the limits of the current state
of the recovery tools.

But that it completed suggests it might be possible to get a complete
btrfs image, and get that to a developer who can then use it to
improve the recovery tools.

>
> I do see plenty of complaints about the sdg drive (previously sde) in
> /var/log/messages from the 28th which is when I started noticing
> issues.  Nothing is jumping out at me claiming the btrfs is taking
> action but I may not know what to look for.
>
> journalctl I'm not familiar with.  journalctl -bX returns with "failed
> to parse relative boot ID number 'X'" but perhaps you meant X to be a
> variable of some value?    journalctl -b does run, but I'm not sure
> what to look for.

I don't have a raid56 example handy for what this looks like before
this message appears:

[48466.853589] BTRFS: fixed up error at logical 20971520 on dev /dev/sdb

But that's what I get for corrupt metadata where metadata profile is
DUP. The messages for missing metadata that needs reconstruction would
be different but I'd expect to still see the fixed up message. But I'd
also look at
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/btrfs/raid56.c?id=refs/tags/v4.1
and read comments and possible raid56 related error messages.

It's similar for data.

[ 1540.865534] BTRFS: bdev /dev/sdb errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
[ 1540.866944] BTRFS: unable to fixup (regular) error at logical
12845056 on dev /dev/sdb

Again this is a corruption example, not a read failure example. It
can't be fixed because the data profile is single in this case.
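
When skimming a long log for these, the per-device counters in the
"bdev ... errs:" lines are a quick thing to total up. A small sketch
that parses the line format quoted above; errs_total is a hypothetical
helper name. On a mounted filesystem, "btrfs device stats" reports
the same kind of counters.

```shell
#!/bin/sh
# Sum the counters in kernel "bdev ... errs:" lines read from stdin and
# print "<device> <total>" for each line that matches.
# errs_total is a hypothetical helper name.
errs_total() {
    sed -n 's/.*bdev \([^ ]*\) errs: wr \([0-9]*\), rd \([0-9]*\), flush \([0-9]*\), corrupt \([0-9]*\), gen \([0-9]*\).*/\1 \2 \3 \4 \5 \6/p' |
    while read -r dev wr rd fl co ge; do
        echo "$dev $(( wr + rd + fl + co + ge ))"
    done
}

echo 'BTRFS: bdev /dev/sdb errs: wr 0, rd 0, flush 0, corrupt 1, gen 0' | errs_total
```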

>
> So, what does the audience suggest?  Shall I compile a newer kernel,
> kick out another drive (which?), or take what's behind door #3 (which
> is...?)

If there's data on this volume you need, put all the drives back in
and look at btrfs restore to try and extract what you can. And then try
a btrfs-image again, maybe it'll work too if there aren't read errors.

Once you've gotten what you need out of it, you can decide if it's
worth continuing to try to fix it (seems doubtful to me but I am not a
developer). I'd probably just start over. The one change to make going
forward is more frequent scrubs to hopefully find and fixup any bad
sectors before it starts to cause this problem again.
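
A regular scrub could be wired up along these lines. This is only a
sketch: scrub_all, the script path, and the schedule are assumptions,
and the BTRFS variable exists so the commands can be previewed with
BTRFS=echo before running them for real.

```shell
#!/bin/sh
# Run a foreground scrub on each mount point given as an argument.
# scrub_all is a hypothetical wrapper; set BTRFS=echo to preview the
# commands without touching any filesystem.
scrub_all() {
    for mnt in "$@"; do
        # -B: stay in the foreground until done, -d: per-device statistics
        "${BTRFS:-btrfs}" scrub start -B -d "$mnt" ||
            echo "scrub failed on $mnt" >&2
    done
}

# Example cron entry (an assumption, adjust path and schedule to taste):
#   0 3 1 * * /usr/local/sbin/scrub-all.sh /mnt2/backup
```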

Maybe someone with more knowledge will say if any of the btrfs kernel
debug features are worth enabling? I suspect those debug features are
only useful to gather more information as the file system is being
used and encounters the first problem, the URE, and any subsequent
events that caused confusion and then the self-corruption of the fs
beyond repair. If so, that implies a whole new fs, and then trying to
reproduce the conditions that caused the problem. Which brings me
to...

hdparm has a dangerous --make-bad-sector option for testing RAID. I
wonder if qemu has such an option? I'd rather test this in a VM than
use a "do not ever use" option in hdparm.
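
One alternative that avoids damaging real media, and which is not
something proposed in this thread, is the device-mapper "error" target:
a scratch device is mapped linearly except for a small range that fails
every I/O. It fails writes as well as reads, so it only approximates a
URE. A sketch that generates such a table; bad_sector_table and the
mapping name flaky0 are made up for illustration:

```shell
#!/bin/sh
# Emit a device-mapper table that maps a device linearly except for one
# range of sectors, which is backed by the "error" target.
# bad_sector_table is a hypothetical helper name.
# Arguments: backing device, total sectors, first bad sector, bad length.
bad_sector_table() {
    dev=$1 total=$2 bad=$3 len=$4
    after=$(( bad + len ))
    [ "$bad" -gt 0 ] && echo "0 $bad linear $dev 0"
    echo "$bad $len error"
    [ "$after" -lt "$total" ] &&
        echo "$after $(( total - after )) linear $dev $after"
    return 0
}

# Typical use as root on a scratch loop device (not run here):
#   bad_sector_table /dev/loop0 "$(blockdev --getsz /dev/loop0)" 2048 8 |
#       dmsetup create flaky0
```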

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Any hope of pool recovery?
  2015-07-02 14:49                 ` Donald Pearson
  2015-07-02 16:58                   ` Chris Murphy
@ 2015-07-02 17:00                   ` Chris Murphy
  2015-07-02 18:19                     ` Donald Pearson
  1 sibling, 1 reply; 23+ messages in thread
From: Chris Murphy @ 2015-07-02 17:00 UTC (permalink / raw)
  To: Donald Pearson; +Cc: Chris Murphy, Btrfs BTRFS

On Thu, Jul 2, 2015 at 8:49 AM, Donald Pearson
<donaldwhpearson@gmail.com> wrote:

> I do see plenty of complaints about the sdg drive (previously sde) in
> /var/log/messages from the 28th which is when I started noticing
> issues.  Nothing is jumping out at me claiming the btrfs is taking
> action but I may not know what to look for.

I'd include that entire log with the bug report. I'd like to skim it
at least. Even logs from earlier might be useful.


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Any hope of pool recovery?
  2015-07-02 17:00                   ` Chris Murphy
@ 2015-07-02 18:19                     ` Donald Pearson
  2015-07-02 18:26                       ` Chris Murphy
  2015-07-03  9:31                       ` Duncan
  0 siblings, 2 replies; 23+ messages in thread
From: Donald Pearson @ 2015-07-02 18:19 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

Unfortunately btrfs-image fails with "couldn't read chunk tree".

btrfs restore complains that every device is missing except the one
that you specify on executing the command.  Multiple devices as a
parameter isn't an option.  Specifying /dev/disk/by-uuid/<uuid> claims
that all devices are missing.

I went ahead and dropped the drive that dmesg is still complaining
about.  Mounting still fails, so I'm going to try rescue chunk-recover
again (for science!).

If anybody has any other ideas to try or data to gather/methods to
gather them as a case study for any devs please let me know.  I'll
assemble all the data that I know how to and follow that link Chris
suggested for filing a bug.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Any hope of pool recovery?
  2015-07-02 18:19                     ` Donald Pearson
@ 2015-07-02 18:26                       ` Chris Murphy
  2015-07-02 18:32                         ` Donald Pearson
  2015-07-03  9:31                       ` Duncan
  1 sibling, 1 reply; 23+ messages in thread
From: Chris Murphy @ 2015-07-02 18:26 UTC (permalink / raw)
  To: Btrfs BTRFS

On Thu, Jul 2, 2015 at 12:19 PM, Donald Pearson
<donaldwhpearson@gmail.com> wrote:
> Unfortunately btrfs-image fails with "couldn't read chunk tree".
>
> btrfs restore complains that every device is missing except the one
> that you specify on executing the command.  Multiple devices as a
> parameter isn't an option.  Specifying /dev/disk/by-uuid/<uuid> claims
> that all devices are missing.

Sounds like restore isn't raid56 aware yet?



-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Any hope of pool recovery?
  2015-07-02 18:26                       ` Chris Murphy
@ 2015-07-02 18:32                         ` Donald Pearson
  2015-07-02 18:37                           ` Chris Murphy
  0 siblings, 1 reply; 23+ messages in thread
From: Donald Pearson @ 2015-07-02 18:32 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

I think it is.  I have another raid5 pool that I've created to test
the restore function on, and it worked.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Any hope of pool recovery?
  2015-07-02 18:32                         ` Donald Pearson
@ 2015-07-02 18:37                           ` Chris Murphy
  2015-07-02 18:45                             ` Donald Pearson
  0 siblings, 1 reply; 23+ messages in thread
From: Chris Murphy @ 2015-07-02 18:37 UTC (permalink / raw)
  To: Btrfs BTRFS

On Thu, Jul 2, 2015 at 12:32 PM, Donald Pearson
<donaldwhpearson@gmail.com> wrote:
> I think it is.  I have another raid5 pool that I've created to test
> the restore function on, and it worked.

So you have all devices for this raid6 available, and yet when you use
restore, you get missing device message for all devices except the one
specified? But that doesn't happen with the raid5 volume?


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Any hope of pool recovery?
  2015-07-02 18:37                           ` Chris Murphy
@ 2015-07-02 18:45                             ` Donald Pearson
  2015-07-02 18:54                               ` Donald Pearson
  0 siblings, 1 reply; 23+ messages in thread
From: Donald Pearson @ 2015-07-02 18:45 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

That is correct.  I'm going to rebalance my raid5 pool as raid6 and
re-test just because.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Any hope of pool recovery?
  2015-07-02 18:45                             ` Donald Pearson
@ 2015-07-02 18:54                               ` Donald Pearson
  0 siblings, 0 replies; 23+ messages in thread
From: Donald Pearson @ 2015-07-02 18:54 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

Yes it works with raid6 as well.

[root@san01 btrfs-progs]# ./btrfs fi show
Label: 'rockstor_rockstor'  uuid: 08d14b6f-18df-4b1b-a91e-4b33e7c90c29
        Total devices 1 FS bytes used 19.25GiB
        devid    1 size 457.40GiB used 457.40GiB path /dev/sdt3

warning, device 4 is missing
warning, device 2 is missing
warning devid 2 not found already
warning devid 4 not found already
checksum verify failed on 21364736 found 925303CE wanted 09150E74
checksum verify failed on 21364736 found 925303CE wanted 09150E74
bytenr mismatch, want=21364736, have=1065943040
Couldn't read chunk tree
Label: 'backup'  uuid: 68be4632-93ba-4478-9098-2ecb23ee6c94
        Total devices 5 FS bytes used 978.72MiB
        devid    1 size 2.73TiB used 1.62GiB path /dev/sdb
        devid    3 size 2.73TiB used 1.62GiB path /dev/sdi
        devid    4 size 2.73TiB used 1.62GiB path /dev/sdj
        devid    5 size 2.73TiB used 1.62GiB path /dev/sdn
        devid    6 size 2.73TiB used 1.62GiB path /dev/sdq

Label: 'tank'  uuid: 8a03f8e8-8b84-4d1b-b27d-e23ef8ebe21d
        Total devices 10 FS bytes used 5.67TiB
        devid    1 size 1.36TiB used 792.67GiB path /dev/sdc
        devid    3 size 1.82TiB used 792.65GiB path /dev/sdf
        devid    5 size 1.36TiB used 792.65GiB path /dev/sdg
        devid    6 size 1.36TiB used 792.65GiB path /dev/sdh
        devid    7 size 1.82TiB used 792.65GiB path /dev/sdk
        devid    8 size 1.82TiB used 792.65GiB path /dev/sdl
        devid    9 size 1.82TiB used 792.65GiB path /dev/sdm
        devid   10 size 1.82TiB used 792.65GiB path /dev/sdp
        *** Some devices missing

btrfs-progs v4.1
[root@san01 btrfs-progs]# mount /dev/sdb /mnt2/backup
[root@san01 btrfs-progs]# ./btrfs fi df /mnt2/backup
Data, RAID6: total=3.00GiB, used=977.56MiB
System, RAID6: total=96.00MiB, used=16.00KiB
Metadata, RAID6: total=1.03GiB, used=1.14MiB
GlobalReserve, single: total=16.00MiB, used=0.00B
[root@san01 btrfs-progs]# ll /mnt2/backup
total 1000000
-rw-r--r-- 1 root root 1024000000 Jul  2 13:48 test_file_1gb
[root@san01 btrfs-progs]# umount /mnt2/backup
[root@san01 btrfs-progs]# ./btrfs restore -xmv /dev/sdb ~
Restoring /root/test_file_1gb
Done searching
[root@san01 btrfs-progs]# ll ~
total 1000004
-rw-------. 1 root root       1101 Jun 20 23:18 anaconda-ks.cfg
drwxr-xr-x  1 root root         22 Jul  1 09:48 git
-rw-r--r--  1 root root 1024000000 Jul  2 13:48 test_file_1gb
[root@san01 btrfs-progs]#


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Any hope of pool recovery?
  2015-07-02 18:19                     ` Donald Pearson
  2015-07-02 18:26                       ` Chris Murphy
@ 2015-07-03  9:31                       ` Duncan
  2015-07-03 13:29                         ` Martin Steigerwald
  1 sibling, 1 reply; 23+ messages in thread
From: Duncan @ 2015-07-03  9:31 UTC (permalink / raw)
  To: linux-btrfs

Donald Pearson posted on Thu, 02 Jul 2015 13:19:41 -0500 as excerpted:

> btrfs restore complains that every device is missing except the one that
> you specify on executing the command.  Multiple devices as a parameter
> isn't an option.  Specifying /dev/disk/by-uuid/<uuid> claims that all
> devices are missing.

That sounds like the kernel lost track of what devices correspond with 
the filesystem.  Try issuing the btrfs device scan command before 
restore, and see if that changes things.  (If it doesn't, try btrfs 
device scan --all-devices, just in case.)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Any hope of pool recovery?
  2015-07-03  9:31                       ` Duncan
@ 2015-07-03 13:29                         ` Martin Steigerwald
  2015-07-03 15:05                           ` Donald Pearson
  0 siblings, 1 reply; 23+ messages in thread
From: Martin Steigerwald @ 2015-07-03 13:29 UTC (permalink / raw)
  To: linux-btrfs

On Friday 03 July 2015 09:31:03 Duncan wrote:
> Donald Pearson posted on Thu, 02 Jul 2015 13:19:41 -0500 as excerpted:
> > btrfs restore complains that every device is missing except the one that
> > you specify on executing the command.  Multiple devices as a parameter
> > isn't an option.  Specifying /dev/disk/by-uuid/<uuid> claims that all
> > devices are missing.
> 
> That sounds like the kernel lost track of what devices correspond with
> the filesystem.  Try issuing the btrfs device scan command before
> restore, and see if that changes things.  (If it doesn't, try btrfs
> device scan --all-devices, just in case.)

Also, does blkid and/or file -sk on each device show that the BTRFS
signatures are still there?
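
A quick way to run that check across several devices might look like
the sketch below; check_sigs is a made-up helper name. Given that
wipefs -a was run on one drive earlier in the thread, at least that
drive would be expected to show no signature.

```shell
#!/bin/sh
# Report whether each device still carries a btrfs signature according
# to blkid. check_sigs is a hypothetical helper name.
check_sigs() {
    for dev in "$@"; do
        fstype=$(blkid -o value -s TYPE "$dev" 2>/dev/null) || fstype=""
        if [ "$fstype" = "btrfs" ]; then
            echo "$dev: btrfs signature present"
        else
            echo "$dev: no btrfs signature (blkid says: ${fstype:-nothing})"
        fi
    done
}

# Typical use as root (not run here):
#   btrfs device scan
#   check_sigs /dev/sdc /dev/sdf /dev/sdg
```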

-- 
Martin

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Any hope of pool recovery?
  2015-07-03 13:29                         ` Martin Steigerwald
@ 2015-07-03 15:05                           ` Donald Pearson
  2015-07-03 17:51                             ` Chris Murphy
  0 siblings, 1 reply; 23+ messages in thread
From: Donald Pearson @ 2015-07-03 15:05 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: Btrfs BTRFS

Thanks for the inputs guys.

Yes I did learn to perform a device scan --all-devices.  It seems that
the chunk tree is vital to a lot of functionality and the recovery
tools are no exception.

I suspect that I ran into the raid56 caveat "btrfs does not deal well
with a drive that is present but not working" described here:
http://marc.merlins.org/perso/btrfs/post_2014-03-23_Btrfs-Raid5-Status.html
That page also mentions that with raid56, drives that are no good
and/or out of sync still get picked right up by the pool, leading to
data loss.
I should have studied more closely earlier, because issuing a btrfs
device delete is exactly what I tried to do and things went downhill
from there.

I did some more digging and found that I had a lot of errors on
basically every drive.  I think the incident that caused my backup pool
to fail did more harm than I initially thought.  Last night I
disassembled the box and inspected/re-seated everything and now I'm
scanning each device for read and write errors.

So one particularly bad drive + multiple raid56 caveats + all the
other drives behaving a little funny + a guy who didn't read as well
as he should, and I really can't be surprised at the result.

So I've decided to cash in the pool for some valuable lessons and
move on.  Thanks to everybody for your thoughts and help here and on
IRC (a responsive IRC, that's a refreshing difference) and I'll see
what adventures btrfs round 2 brings.

Regards,
Donald (seijirou)


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Any hope of pool recovery?
  2015-07-03 15:05                           ` Donald Pearson
@ 2015-07-03 17:51                             ` Chris Murphy
  2015-07-06 12:08                               ` Austin S Hemmelgarn
  0 siblings, 1 reply; 23+ messages in thread
From: Chris Murphy @ 2015-07-03 17:51 UTC (permalink / raw)
  To: Btrfs BTRFS

On Fri, Jul 3, 2015 at 9:05 AM, Donald Pearson
<donaldwhpearson@gmail.com> wrote:

> I did some more digging and found that I had a lot of errors on
> basically every drive.

Ick. Sucks for you, but then it makes this less of a Btrfs problem because
it can really only do so much if more than the number of spares have
problems. It does suggest a need for the volume to go read-only more
aggressively in such cases though, before it gets this corrupt.

Multiple disk problems like this though suggest a shared hardware
problem like a controller or expander.

> So one particularly bad drive + multiple raid56 caveats + all the
> other drives behaving a little funny + a guy who didn't read as well
> as he should, and I really can't be surprised at the result.
>
> So I've decided to cash in the pool for some valuable lessons and
> move on.  Thanks to everybody for your thoughts and help here and on
> IRC (a responsive IRC, that's a refreshing difference) and I'll see
> what adventures btrfs round 2 brings.

Yep!

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Any hope of pool recovery?
  2015-07-03 17:51                             ` Chris Murphy
@ 2015-07-06 12:08                               ` Austin S Hemmelgarn
  0 siblings, 0 replies; 23+ messages in thread
From: Austin S Hemmelgarn @ 2015-07-06 12:08 UTC (permalink / raw)
  To: Chris Murphy, Btrfs BTRFS

On 2015-07-03 13:51, Chris Murphy wrote:
> On Fri, Jul 3, 2015 at 9:05 AM, Donald Pearson
> <donaldwhpearson@gmail.com> wrote:
>
>> I did some more digging and found that I had a lot of errors on
>> basically every drive.
>
> Ick. Sucks for you but then makes this less of a Btrfs problem because
> it can really only do so much if more than the number of spares have
> problems. It does suggest a more aggressive need for the volume to go
> read only in such cases though, before it gets this corrupt.
I'd almost say this is something that should be configurable.  The
default should probably be that the fs goes read-only once there have
been errors on at least as many drives as there are spares; but it
should still provide the option to choose between that, going read-only
immediately on the first error, or only going read-only on write errors.
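
To make the three behaviours concrete, here is a rough Python sketch of
the decision.  It is purely illustrative: the policy names, the function
and the per-device error counts are made up for this example and are not
an existing btrfs interface, and "spares" is assumed to mean the number
of device failures the profile can tolerate (2 for raid6).

```python
from enum import Enum


class ReadOnlyPolicy(Enum):
    """Hypothetical policy names; not an actual btrfs mount option."""
    SPARES_EXHAUSTED = "spares-exhausted"    # proposed default
    FIRST_ERROR = "first-error"              # strictest
    WRITE_ERRORS_ONLY = "write-errors-only"  # most permissive


def should_go_read_only(policy, errors_by_device, spares):
    """Decide whether the filesystem should switch to read-only.

    errors_by_device maps a device name to {"read": n, "write": m}.
    spares is the number of devices that may fail without data loss.
    """
    devices_with_errors = sum(
        1 for c in errors_by_device.values() if c["read"] + c["write"] > 0
    )
    any_write_error = any(c["write"] > 0 for c in errors_by_device.values())

    if policy is ReadOnlyPolicy.FIRST_ERROR:
        return devices_with_errors > 0
    if policy is ReadOnlyPolicy.WRITE_ERRORS_ONLY:
        return any_write_error
    # Default: errors on at least as many drives as there are spares.
    # max(..., 1) keeps a healthy pool with no redundancy writable.
    return devices_with_errors >= max(spares, 1)
```

With spares=2, read errors on a single drive would leave the pool
writable under the default policy, while errors on a second drive would
flip it to read-only.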
> Multiple disk problems like this though suggest a shared hardware
> problem like a controller or expander.
>
I have to agree with this statement; I've seen stuff like this before
(although thankfully not on BTRFS), and 100% of the time the root cause
was either the storage controller or system RAM.


Thread overview: 23+ messages
2015-07-01 15:39 Any hope of pool recovery? Donald Pearson
2015-07-01 15:50 ` Chris Murphy
2015-07-01 16:09   ` Donald Pearson
2015-07-01 18:58     ` Donald Pearson
2015-07-01 19:05       ` Donald Pearson
2015-07-01 21:35         ` Donald Pearson
2015-07-01 23:29           ` Chris Murphy
2015-07-02  1:38             ` Donald Pearson
2015-07-02  2:31               ` Chris Murphy
2015-07-02 14:49                 ` Donald Pearson
2015-07-02 16:58                   ` Chris Murphy
2015-07-02 17:00                   ` Chris Murphy
2015-07-02 18:19                     ` Donald Pearson
2015-07-02 18:26                       ` Chris Murphy
2015-07-02 18:32                         ` Donald Pearson
2015-07-02 18:37                           ` Chris Murphy
2015-07-02 18:45                             ` Donald Pearson
2015-07-02 18:54                               ` Donald Pearson
2015-07-03  9:31                       ` Duncan
2015-07-03 13:29                         ` Martin Steigerwald
2015-07-03 15:05                           ` Donald Pearson
2015-07-03 17:51                             ` Chris Murphy
2015-07-06 12:08                               ` Austin S Hemmelgarn
