* Damaged Root Tree(s)
@ 2018-01-21 19:16 Liwei
2018-01-21 21:45 ` Chris Murphy
2018-01-22 1:11 ` Qu Wenruo
0 siblings, 2 replies; 7+ messages in thread
From: Liwei @ 2018-01-21 19:16 UTC (permalink / raw)
To: linux-btrfs
Hi list,
====TLDR====
1. Can I mount a filesystem using one of the roots found with btrfs-find-root?
2. Can btrfs check just fix the damaged root without attempting any
other repairs?
3. If the above is not possible, how should I proceed given that I
seem to have lost both the main and backup roots?
====Background Information====
I have a 2x10TB raid0 volume (20TB, raid0 provided by md) that, my
theory is, experienced a head crash while updating the root tree, or
maybe while it was carrying out background defragmentation.
This occurred while I was setting up redundancy using LVM mirroring,
so in the logs you'll see some dm errors. Unfortunately the lost data
had not been mirrored yet (what are the chances, given that the
mirror was 97% complete when this happened?).
Running a scrub on the raid shows that I have 1000+ unreadable
sectors, amounting to about 800kB of data. So I've got spare drives
and imaged the offending drive. Currently ddrescue is still trying to
read those sectors, but it seems unlikely that those reads will ever succeed.
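For reference, the imaging is being done with something along these
lines (device names illustrative, not the exact ones here):

# ddrescue -d -r3 /dev/sdX /dev/sdY edata-member.map

where -d uses direct disc access on the source and -r3 makes three
retry passes over the bad areas.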
====Problem====
So with an imaged copy of the array, I tried remounting the
filesystem, but it refuses to mount, even with 'usebackuproot'.
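The mount invocation was roughly the following (mount point illustrative):

# mount -o usebackuproot /dev/mapper/datavol-edata /mnt/edata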
With usebackuproot:
[ 1610.788527] device-mapper: raid1: Mirror read failed.
[ 1610.788799] device-mapper: raid1: Mirror read failed.
[ 1610.788939] Buffer I/O error on dev dm-15, logical block
5371800560, async page read
[ 1610.823141] BTRFS: device label edata devid 1 transid 318593
/dev/mapper/datavol-edata
[ 1616.778563] BTRFS info (device dm-15): trying to use backup root at
mount time
[ 1616.778758] BTRFS info (device dm-15): disk space caching is enabled
[ 1617.961152] device-mapper: raid1: Mirror read failed.
[ 1618.238198] device-mapper: raid1: Mirror read failed.
[ 1618.238498] BTRFS warning (device dm-15): failed to read tree root
[ 1618.238700] device-mapper: raid1: Mirror read failed.
[ 1618.238878] device-mapper: raid1: Mirror read failed.
[ 1618.239050] BTRFS warning (device dm-15): failed to read tree root
[ 1618.239207] device-mapper: raid1: Mirror read failed.
[ 1618.239372] device-mapper: raid1: Mirror read failed.
[ 1618.239590] BTRFS warning (device dm-15): failed to read tree root
[ 1618.239775] device-mapper: raid1: Mirror read failed.
[ 1618.240055] device-mapper: raid1: Mirror read failed.
[ 1618.240298] BTRFS warning (device dm-15): failed to read tree root
[ 1618.240492] device-mapper: raid1: Mirror read failed.
[ 1618.240744] device-mapper: raid1: Mirror read failed.
[ 1618.240989] BTRFS warning (device dm-15): failed to read tree root
[ 1618.363234] BTRFS error (device dm-15): open_ctree failed
Without usebackuproot:
[ 2149.015427] device-mapper: raid1: Mirror read failed.
[ 2149.015700] device-mapper: raid1: Mirror read failed.
[ 2149.015840] Buffer I/O error on dev dm-15, logical block
5371800560, async page read
[ 2154.172102] BTRFS info (device dm-15): disk space caching is enabled
[ 2155.325134] device-mapper: raid1: Mirror read failed.
[ 2155.715439] device-mapper: raid1: Mirror read failed.
[ 2155.715795] BTRFS warning (device dm-15): failed to read tree root
[ 2155.851599] BTRFS error (device dm-15): open_ctree failed
It appears that the damaged data has affected both the main and
backup roots.
Next I ran btrfs-find-root, which gave me the following:
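(The invocation was roughly: # btrfs-find-root /dev/mapper/datavol-edata)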
Superblock thinks the generation is 318593
Superblock thinks the level is 1
Well block 25826479144960(gen: 318346 level: 1) seems good, but
generation/level doesn't match, want gen: 318593 level: 1
Well block 25826450505728(gen: 318345 level: 1) seems good, but
generation/level doesn't match, want gen: 318593 level: 1
Well block 25826461237248(gen: 318344 level: 1) seems good, but
generation/level doesn't match, want gen: 318593 level: 1
Well block 25826479669248(gen: 318342 level: 0) seems good, but
generation/level doesn't match, want gen: 318593 level: 1
Well block 25826479603712(gen: 318342 level: 0) seems good, but
generation/level doesn't match, want gen: 318593 level: 1
Well block 25826468495360(gen: 318342 level: 0) seems good, but
generation/level doesn't match, want gen: 318593 level: 1
Well block 25826465923072(gen: 318342 level: 0) seems good, but
generation/level doesn't match, want gen: 318593 level: 1
Well block 25826477654016(gen: 318341 level: 0) seems good, but
generation/level doesn't match, want gen: 318593 level: 1
...[truncated]
I tried running btrfs check with the top 5 roots, but only the
first 3 seem to be usable. However, even with the first 3, btrfs
check gives me a lot of:
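(Each run was roughly of the form # btrfs check --tree-root
25826479144960 /dev/mapper/datavol-edata, with the bytenr taken from
the btrfs-find-root output above.)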
bytenr mismatch, want=26008292753408, have=0
bytenr mismatch, want=26353175658496, have=0
bytenr mismatch, want=26353188618240, have=0
bytenr mismatch, want=26353513299968, have=0
and thousands of extent errors, etc. I do see references to
directories within the filesystem though, so I'd think the tree root
is at least pretty good.
Just to see if btrfs check can reach a usable state, I made a COW
snapshot of the imaged drive and ran btrfs check --repair. However,
it eventually gave up and seems to have wrecked the FS.
Is there a way to mount/repair the filesystem with the found root
instead? I'd like to copy the files off the image, but prefer not to
use btrfs restore. Can btrfs check just copy the alternative root and
not try to repair anything else?
====Misc info====
# uname -a
Linux tvm 4.14.0-3-amd64 #1 SMP Debian 4.14.13-1 (2018-01-14) x86_64 GNU/Linux
# btrfs --version
btrfs-progs v4.13.3
Thanks for the help!
Liwei
* Re: Damaged Root Tree(s)
2018-01-21 19:16 Damaged Root Tree(s) Liwei
@ 2018-01-21 21:45 ` Chris Murphy
2018-01-22 1:11 ` Qu Wenruo
1 sibling, 0 replies; 7+ messages in thread
From: Chris Murphy @ 2018-01-21 21:45 UTC (permalink / raw)
To: Liwei; +Cc: Btrfs BTRFS
On Sun, Jan 21, 2018 at 12:16 PM, Liwei <xieliwei@gmail.com> wrote:
> Hi list,
>
> ====TLDR====
> 1. Can I mount a filesystem using one of the roots found with btrfs-find-root?
Not necessarily, because more than just the tree root needs to be
readable to do a mount.
But there's a decent chance it's possible to do an offline scrape
using one of those tree roots with btrfs restore.
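A rough sketch of such a restore run (the bytenr here is just the
newest block from your find-root output, and the destination
directory is illustrative):

# btrfs restore -t 25826479144960 -i -v /dev/mapper/datavol-edata /mnt/recovery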
>
> ====Background Information====
> I have a 2x10TB raid0 (20TB, raid0 provided by md) volume that (my
> theory is) experienced a headcrash while updating the root tree, or
> maybe while it was carrying out background defragmentation.
>
> This occurred while I was setting up redundancy by using LVM
> mirroring, so in the logs you'll see some dm errors. Unfortunately the
> lost data has not been mirrored yet (what are the chances, given that
> the mirror was 97% complete when this happened).
>
> Running a scrub on the raid shows that I have 1000+ unreadable
> sectors, amounting to about 800kB of data. So I've got spare drives
> and imaged the offending drive. Currently ddrescue is still trying to
> read those sectors, but it seems unlikely that they'll ever succeed.
Bad luck. What's the metadata profile? Single or DUP?
>
> Next I ran btrfs-find-root, which gave me the following:
> Superblock thinks the generation is 318593
> Superblock thinks the level is 1
> Well block 25826479144960(gen: 318346 level: 1) seems good, but
> generation/level doesn't match, want gen: 318593 level: 1
Given that there's a big gap in generation between what's wanted and
what's found, a bunch of those more recent trees must be colocated
and are probably missing.
Anyway, I think it's best to look at restore; in my limited
experience it tends to be more successful when restoring from snapshots.
--
Chris Murphy
* Re: Damaged Root Tree(s)
2018-01-21 19:16 Damaged Root Tree(s) Liwei
2018-01-21 21:45 ` Chris Murphy
@ 2018-01-22 1:11 ` Qu Wenruo
2018-01-22 1:14 ` Qu Wenruo
1 sibling, 1 reply; 7+ messages in thread
From: Qu Wenruo @ 2018-01-22 1:11 UTC (permalink / raw)
To: Liwei, linux-btrfs
On 2018年01月22日 03:16, Liwei wrote:
> Hi list,
>
> ====TLDR====
> 1. Can I mount a filesystem using one of the roots found with btrfs-find-root?
Depends on the tree.
If it's the root tree, it's possible.
Otherwise those found trees don't help much.
> 2. Can btrfs check just fix the damaged root without attempting any
> other repairs?
No.
But in most cases it's not a single corrupted tree; normally it's multiple.
> 3. If the above is not possible, how should I proceed given that I
> seem to have lost both the main and backup roots?
In theory, it's possible to use a specified fs tree root to salvage a
filesystem.
But in most cases metadata is protected by a safer profile, so this
isn't implemented in btrfs-progs.
Your current best bet would be to manually scan through all the tree
root backups, which needs extra info.
Please provide the following info:
# btrfs inspect dump-super -FfA <device> | grep backup_tree_root | sort | uniq
And try them one by one:
# btrfs check --tree-root <number from above output> <device>
If any one of them can proceed, then use it to repair:
# btrfs check --repair --tree-root <number> <device>
And good luck.
Thanks,
Qu
>
> ====Background Information====
> I have a 2x10TB raid0 (20TB, raid0 provided by md) volume that (my
> theory is) experienced a headcrash while updating the root tree, or
> maybe while it was carrying out background defragmentation.>
> This occurred while I was setting up redundancy by using LVM
> mirroring, so in the logs you'll see some dm errors. Unfortunately the
> lost data has not been mirrored yet (what are the chances, given that
> the mirror was 97% complete when this happened).
>
> Running a scrub on the raid shows that I have 1000+ unreadable
> sectors, amounting to about 800kB of data. So I've got spare drives
> and imaged the offending drive. Currently ddrescue is still trying to
> read those sectors, but it seems unlikely that they'll ever succeed.
>
> ====Problem====
> So with an imaged copy of the array, I tried remounting the
> filesystem, but it refuses to mount even using 'usebackuproot':
>
> With usebackuproot:
> [ 1610.788527] device-mapper: raid1: Mirror read failed.
> [ 1610.788799] device-mapper: raid1: Mirror read failed.
> [ 1610.788939] Buffer I/O error on dev dm-15, logical block
> 5371800560, async page read
> [ 1610.823141] BTRFS: device label edata devid 1 transid 318593
> /dev/mapper/datavol-edata
> [ 1616.778563] BTRFS info (device dm-15): trying to use backup root at
> mount time
> [ 1616.778758] BTRFS info (device dm-15): disk space caching is enabled
> [ 1617.961152] device-mapper: raid1: Mirror read failed.
> [ 1618.238198] device-mapper: raid1: Mirror read failed.
> [ 1618.238498] BTRFS warning (device dm-15): failed to read tree root
> [ 1618.238700] device-mapper: raid1: Mirror read failed.
> [ 1618.238878] device-mapper: raid1: Mirror read failed.
> [ 1618.239050] BTRFS warning (device dm-15): failed to read tree root
> [ 1618.239207] device-mapper: raid1: Mirror read failed.
> [ 1618.239372] device-mapper: raid1: Mirror read failed.
> [ 1618.239590] BTRFS warning (device dm-15): failed to read tree root
> [ 1618.239775] device-mapper: raid1: Mirror read failed.
> [ 1618.240055] device-mapper: raid1: Mirror read failed.
> [ 1618.240298] BTRFS warning (device dm-15): failed to read tree root
> [ 1618.240492] device-mapper: raid1: Mirror read failed.
> [ 1618.240744] device-mapper: raid1: Mirror read failed.
> [ 1618.240989] BTRFS warning (device dm-15): failed to read tree root
> [ 1618.363234] BTRFS error (device dm-15): open_ctree failed
>
> Without usebackuproot:
> [ 2149.015427] device-mapper: raid1: Mirror read failed.
> [ 2149.015700] device-mapper: raid1: Mirror read failed.
> [ 2149.015840] Buffer I/O error on dev dm-15, logical block
> 5371800560, async page read
> [ 2154.172102] BTRFS info (device dm-15): disk space caching is enabled
> [ 2155.325134] device-mapper: raid1: Mirror read failed.
> [ 2155.715439] device-mapper: raid1: Mirror read failed.
> [ 2155.715795] BTRFS warning (device dm-15): failed to read tree root
> [ 2155.851599] BTRFS error (device dm-15): open_ctree failed
>
> It appears that the damaged data has affected both the main and
> backup roots.
>
> Next I ran btrfs-find-root, which gave me the following:
> Superblock thinks the generation is 318593
> Superblock thinks the level is 1
> Well block 25826479144960(gen: 318346 level: 1) seems good, but
> generation/level doesn't match, want gen: 318593 level: 1
> Well block 25826450505728(gen: 318345 level: 1) seems good, but
> generation/level doesn't match, want gen: 318593 level: 1
> Well block 25826461237248(gen: 318344 level: 1) seems good, but
> generation/level doesn't match, want gen: 318593 level: 1
> Well block 25826479669248(gen: 318342 level: 0) seems good, but
> generation/level doesn't match, want gen: 318593 level: 1
> Well block 25826479603712(gen: 318342 level: 0) seems good, but
> generation/level doesn't match, want gen: 318593 level: 1
> Well block 25826468495360(gen: 318342 level: 0) seems good, but
> generation/level doesn't match, want gen: 318593 level: 1
> Well block 25826465923072(gen: 318342 level: 0) seems good, but
> generation/level doesn't match, want gen: 318593 level: 1
> Well block 25826477654016(gen: 318341 level: 0) seems good, but
> generation/level doesn't match, want gen: 318593 level: 1
> ...[truncated]
>
> I tried running btrfs check with the top 5 roots, but only the
> first 3 seems to be usable. However, even with the first 3, btrfs
> check gives me a lot of:
> bytenr mismatch, want=26008292753408, have=0
> bytenr mismatch, want=26353175658496, have=0
> bytenr mismatch, want=26353188618240, have=0
> bytenr mismatch, want=26353513299968, have=0
> and thousands of extent errors, etc. I do see references to
> directories within the filesystem though, so I'd think the tree root
> is at least pretty good.
>
> Just to see if btrfs check can reach a usable state, I made a COW
> snapshot of the imaged drive, and ran btrfs check --repair. However,
> it eventually gives up, and seemed to have wrecked the FS.
>
> Is there a way to mount/repair the filesystem with the found root
> instead? I'd like to copy the files off the image, but prefer not to
> use btrfs restore. Can btrfs check just copy the alternative root and
> not try to repair anything else?
>
> ====Misc info====
> # uname -a
> Linux tvm 4.14.0-3-amd64 #1 SMP Debian 4.14.13-1 (2018-01-14) x86_64 GNU/Linux
> # btrfs --version
> btrfs-progs v4.13.3
>
> Thanks for the help!
> Liwei
* Re: Damaged Root Tree(s)
2018-01-22 1:11 ` Qu Wenruo
@ 2018-01-22 1:14 ` Qu Wenruo
0 siblings, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2018-01-22 1:14 UTC (permalink / raw)
To: Liwei, linux-btrfs
On 2018年01月22日 09:11, Qu Wenruo wrote:
>
>
> On 2018年01月22日 03:16, Liwei wrote:
>> Hi list,
>>
>> ====TLDR====
>> 1. Can I mount a filesystem using one of the roots found with btrfs-find-root?
>
> Depends on the tree.
>
> If it's root tree, it's possible.
>
> Otherwise those found trees don't help much.
>
>
>> 2. Can btrfs check just fix the damaged root without attempting any
>> other repairs?
>
> No.
> But under most case, it's not a single corrupted tree but normally multiple.
>
>> 3. If the above is not possible, how should I proceed given that I
>> seem to have lost both the main and backup roots?
>
> In theory, it's possible to use specified fs tree root to salvage a
> filesystem.
>
> But under most case, metadata is protected by safer profile.
> So it's not implemented in btrfs-progs.
>
> Your current best try would be manually scanning through all tree backups.
> Which need extra info.
>
> Please provide the following info:
>
> # btrfs inspect dump-super -FfA <device> | grep backup_tree_root | sort
> | uniq
>
> And try them one by one:
>
> # btrfs check --tree-root <number from above output> <device>
And the find-root output can also be tried here.
But please keep in mind: the older the generation, the lower the chance.
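For example, with the newest block from your find-root output (just
an illustration; any of the reported blocks can be substituted):

# btrfs check --tree-root 25826479144960 <device>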
Thanks,
Qu
>
> If any one can proceed, then use it to repair:
>
> # btrfs check --tree-root <number> <device>
>
> And good luck.
>
> Thanks,
> Qu
>
>>
>> ====Background Information====
>> I have a 2x10TB raid0 (20TB, raid0 provided by md) volume that (my
>> theory is) experienced a headcrash while updating the root tree, or
>> maybe while it was carrying out background defragmentation.>
>> This occurred while I was setting up redundancy by using LVM
>> mirroring, so in the logs you'll see some dm errors. Unfortunately the
>> lost data has not been mirrored yet (what are the chances, given that
>> the mirror was 97% complete when this happened).
>>
>> Running a scrub on the raid shows that I have 1000+ unreadable
>> sectors, amounting to about 800kB of data. So I've got spare drives
>> and imaged the offending drive. Currently ddrescue is still trying to
>> read those sectors, but it seems unlikely that they'll ever succeed.
>>
>> ====Problem====
>> So with an imaged copy of the array, I tried remounting the
>> filesystem, but it refuses to mount even using 'usebackuproot':
>>
>> With usebackuproot:
>> [ 1610.788527] device-mapper: raid1: Mirror read failed.
>> [ 1610.788799] device-mapper: raid1: Mirror read failed.
>> [ 1610.788939] Buffer I/O error on dev dm-15, logical block
>> 5371800560, async page read
>> [ 1610.823141] BTRFS: device label edata devid 1 transid 318593
>> /dev/mapper/datavol-edata
>> [ 1616.778563] BTRFS info (device dm-15): trying to use backup root at
>> mount time
>> [ 1616.778758] BTRFS info (device dm-15): disk space caching is enabled
>> [ 1617.961152] device-mapper: raid1: Mirror read failed.
>> [ 1618.238198] device-mapper: raid1: Mirror read failed.
>> [ 1618.238498] BTRFS warning (device dm-15): failed to read tree root
>> [ 1618.238700] device-mapper: raid1: Mirror read failed.
>> [ 1618.238878] device-mapper: raid1: Mirror read failed.
>> [ 1618.239050] BTRFS warning (device dm-15): failed to read tree root
>> [ 1618.239207] device-mapper: raid1: Mirror read failed.
>> [ 1618.239372] device-mapper: raid1: Mirror read failed.
>> [ 1618.239590] BTRFS warning (device dm-15): failed to read tree root
>> [ 1618.239775] device-mapper: raid1: Mirror read failed.
>> [ 1618.240055] device-mapper: raid1: Mirror read failed.
>> [ 1618.240298] BTRFS warning (device dm-15): failed to read tree root
>> [ 1618.240492] device-mapper: raid1: Mirror read failed.
>> [ 1618.240744] device-mapper: raid1: Mirror read failed.
>> [ 1618.240989] BTRFS warning (device dm-15): failed to read tree root
>> [ 1618.363234] BTRFS error (device dm-15): open_ctree failed
>>
>> Without usebackuproot:
>> [ 2149.015427] device-mapper: raid1: Mirror read failed.
>> [ 2149.015700] device-mapper: raid1: Mirror read failed.
>> [ 2149.015840] Buffer I/O error on dev dm-15, logical block
>> 5371800560, async page read
>> [ 2154.172102] BTRFS info (device dm-15): disk space caching is enabled
>> [ 2155.325134] device-mapper: raid1: Mirror read failed.
>> [ 2155.715439] device-mapper: raid1: Mirror read failed.
>> [ 2155.715795] BTRFS warning (device dm-15): failed to read tree root
>> [ 2155.851599] BTRFS error (device dm-15): open_ctree failed
>>
>> It appears that the damaged data has affected both the main and
>> backup roots.
>>
>> Next I ran btrfs-find-root, which gave me the following:
>> Superblock thinks the generation is 318593
>> Superblock thinks the level is 1
>> Well block 25826479144960(gen: 318346 level: 1) seems good, but
>> generation/level doesn't match, want gen: 318593 level: 1
>> Well block 25826450505728(gen: 318345 level: 1) seems good, but
>> generation/level doesn't match, want gen: 318593 level: 1
>> Well block 25826461237248(gen: 318344 level: 1) seems good, but
>> generation/level doesn't match, want gen: 318593 level: 1
>> Well block 25826479669248(gen: 318342 level: 0) seems good, but
>> generation/level doesn't match, want gen: 318593 level: 1
>> Well block 25826479603712(gen: 318342 level: 0) seems good, but
>> generation/level doesn't match, want gen: 318593 level: 1
>> Well block 25826468495360(gen: 318342 level: 0) seems good, but
>> generation/level doesn't match, want gen: 318593 level: 1
>> Well block 25826465923072(gen: 318342 level: 0) seems good, but
>> generation/level doesn't match, want gen: 318593 level: 1
>> Well block 25826477654016(gen: 318341 level: 0) seems good, but
>> generation/level doesn't match, want gen: 318593 level: 1
>> ...[truncated]
>>
>> I tried running btrfs check with the top 5 roots, but only the
>> first 3 seems to be usable. However, even with the first 3, btrfs
>> check gives me a lot of:
>> bytenr mismatch, want=26008292753408, have=0
>> bytenr mismatch, want=26353175658496, have=0
>> bytenr mismatch, want=26353188618240, have=0
>> bytenr mismatch, want=26353513299968, have=0
>> and thousands of extent errors, etc. I do see references to
>> directories within the filesystem though, so I'd think the tree root
>> is at least pretty good.
>>
>> Just to see if btrfs check can reach a usable state, I made a COW
>> snapshot of the imaged drive, and ran btrfs check --repair. However,
>> it eventually gives up, and seemed to have wrecked the FS.
>>
>> Is there a way to mount/repair the filesystem with the found root
>> instead? I'd like to copy the files off the image, but prefer not to
>> use btrfs restore. Can btrfs check just copy the alternative root and
>> not try to repair anything else?
>>
>> ====Misc info====
>> # uname -a
>> Linux tvm 4.14.0-3-amd64 #1 SMP Debian 4.14.13-1 (2018-01-14) x86_64 GNU/Linux
>> # btrfs --version
>> btrfs-progs v4.13.3
>>
>> Thanks for the help!
>> Liwei
>
* Re: Damaged Root Tree(s)
@ 2018-01-22 3:30 Liwei
0 siblings, 0 replies; 7+ messages in thread
From: Liwei @ 2018-01-22 3:30 UTC (permalink / raw)
To: linux-btrfs
Hi Chris,
> On Sun, Jan 21, 2018 at 12:16 PM, Liwei <xieli...@gmail.com> wrote:
> > Hi list,
> >
> > ====TLDR====
> > 1. Can I mount a filesystem using one of the roots found with btrfs-find-root?
>
> Not necessarily because more than just the tree root needs to be
> readable to do a mount.
>
> But decent chance it's possible to do an offline scrape using one of
> those root trees with btrfs restore.
>
It's starting to look like that. I'll probably have to send a
separate email troubleshooting it, as there seem to be some errors
occurring even with the best root I've found.
>
> >
> > ====Background Information====
> > I have a 2x10TB raid0 (20TB, raid0 provided by md) volume that (my
> > theory is) experienced a headcrash while updating the root tree, or
> > maybe while it was carrying out background defragmentation.
> >
> > This occurred while I was setting up redundancy by using LVM
> > mirroring, so in the logs you'll see some dm errors. Unfortunately the
> > lost data has not been mirrored yet (what are the chances, given that
> > the mirror was 97% complete when this happened).
> >
> > Running a scrub on the raid shows that I have 1000+ unreadable
> > sectors, amounting to about 800kB of data. So I've got spare drives
> > and imaged the offending drive. Currently ddrescue is still trying to
> > read those sectors, but it seems unlikely that they'll ever succeed.
>
> Bad luck. What's the metadata profile? Single or DUP?
Metadata profile is DUP, but it seems like there is only one
up-to-date tree root at any time?
>
>
> >
> > Next I ran btrfs-find-root, which gave me the following:
> > Superblock thinks the generation is 318593
> > Superblock thinks the level is 1
> > Well block 25826479144960(gen: 318346 level: 1) seems good, but
> > generation/level doesn't match, want gen: 318593 level: 1
>
>
> That there's a big gap in generation between what's wanted and what's
> found, a bunch of those more recent trees must be colocated and are
> probably missing.
I thought so too. Is there a reason why they ended up being colocated?
I'm surprised that, with all the redundancy btrfs is capable of, this
can happen. Was it because the volume was starting to become full?
(This whole exercise of turning on mirroring was because we're
migrating to bigger disks.)
>
> Anyway I think it's best to look at restore, and my limited experience
> it tends to be more successful when restoring from snapshots
Seems like that's the way forward indeed.
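I'll probably start with something along these lines (the bytenr is
the newest block from find-root, the destination is illustrative, and
-s should also pull in snapshots):

# btrfs restore -s -t 25826479144960 /dev/datavol/edata /mnt/recovery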
* Re: Damaged Root Tree(s)
@ 2018-01-22 6:11 Liwei
2018-01-22 6:26 ` Qu Wenruo
0 siblings, 1 reply; 7+ messages in thread
From: Liwei @ 2018-01-22 6:11 UTC (permalink / raw)
To: linux-btrfs
Hi Wenruo,
> On 2018年01月22日 09:11, Qu Wenruo wrote:
> >
> >
> > On 2018年01月22日 03:16, Liwei wrote:
> >> Hi list,
> >>
> >> ====TLDR====
> > >> 1. Can I mount a filesystem using one of the roots found with
> >> btrfs-find-root?
> >
> > Depends on the tree.
> >
> > If it's root tree, it's possible.
> >
> > Otherwise those found trees don't help much.
> >
> >
> >> 2. Can btrfs check just fix the damaged root without attempting any
> >> other repairs?
> >
> > No.
> > But under most case, it's not a single corrupted tree but normally multiple.
> >
> >> 3. If the above is not possible, how should I proceed given that I
> >> seem to have lost both the main and backup roots?
> >
> > In theory, it's possible to use specified fs tree root to salvage a
> > filesystem.
> >
> > But under most case, metadata is protected by safer profile.
> > So it's not implemented in btrfs-progs.
> >
> > Your current best try would be manually scanning through all tree backups.
> > > Which need extra info.
> >
> > Please provide the following info:
> >
> > # btrfs inspect dump-super -FfA <device> | grep backup_tree_root | sort
> > | uniq
backup_tree_root: 26008360648704 gen: 318590 level: 1
backup_tree_root: 26008365793280 gen: 318591 level: 1
backup_tree_root: 26008367398912 gen: 318592 level: 1
backup_tree_root: 26008375640064 gen: 318593 level: 1
> >
> > And try them one by one:
> >
> > # btrfs check --tree-root <number from above output> <device>
Seems like they all fall within the drive's bad sectors:
# btrfs check --tree-root 26008360648704 /dev/datavol/edata
bytenr mismatch, want=26008360648704, have=0
Couldn't read tree root
ERROR: cannot open file system
# btrfs check --tree-root 26008365793280 /dev/datavol/edata
bytenr mismatch, want=26008365793280, have=0
Couldn't read tree root
ERROR: cannot open file system
# btrfs check --tree-root 26008367398912 /dev/datavol/edata
bytenr mismatch, want=26008367398912, have=0
Couldn't read tree root
ERROR: cannot open file system
# btrfs check --tree-root 26008375640064 /dev/datavol/edata
bytenr mismatch, want=26008375640064, have=0
Couldn't read tree root
ERROR: cannot open file system
>
> And find-root output can also be tried here.
>
> But please keep in mind, the older generation is, the less chance.
After the first 10 or so entries from btrfs-find-root, btrfs check
wouldn't even recognise the root nodes. So it seems like this is a
lost cause?
>
> Thanks,
> Qu
>
> >
> > If any one can proceed, then use it to repair:
> >
> > # btrfs check --tree-root <number> <device>
> >
> > And good luck.
> >
> > Thanks,
> > Qu
> >
> >>
> >> ====Background Information====
> >> I have a 2x10TB raid0 (20TB, raid0 provided by md) volume that (my
> >> theory is) experienced a headcrash while updating the root tree, or
> >> maybe while it was carrying out background defragmentation.>
> >> This occurred while I was setting up redundancy by using LVM
> >> mirroring, so in the logs you'll see some dm errors. Unfortunately the
> >> lost data has not been mirrored yet (what are the chances, given that
> >> the mirror was 97% complete when this happened).
> >>
> >> Running a scrub on the raid shows that I have 1000+ unreadable
> >> sectors, amounting to about 800kB of data. So I've got spare drives
> >> and imaged the offending drive. Currently ddrescue is still trying to
> >> read those sectors, but it seems unlikely that they'll ever succeed.
> >>
> >> ====Problem====
> >> So with an imaged copy of the array, I tried remounting the
> >> filesystem, but it refuses to mount even using 'usebackuproot':
> >>
> >> With usebackuproot:
> >> [ 1610.788527] device-mapper: raid1: Mirror read failed.
> >> [ 1610.788799] device-mapper: raid1: Mirror read failed.
> >> [ 1610.788939] Buffer I/O error on dev dm-15, logical block
> >> 5371800560, async page read
> >> [ 1610.823141] BTRFS: device label edata devid 1 transid 318593
> >> /dev/mapper/datavol-edata
> >> [ 1616.778563] BTRFS info (device dm-15): trying to use backup root at
> >> mount time
> >> [ 1616.778758] BTRFS info (device dm-15): disk space caching is enabled
> >> [ 1617.961152] device-mapper: raid1: Mirror read failed.
> >> [ 1618.238198] device-mapper: raid1: Mirror read failed.
> >> [ 1618.238498] BTRFS warning (device dm-15): failed to read tree root
> >> [ 1618.238700] device-mapper: raid1: Mirror read failed.
> >> [ 1618.238878] device-mapper: raid1: Mirror read failed.
> >> [ 1618.239050] BTRFS warning (device dm-15): failed to read tree root
> >> [ 1618.239207] device-mapper: raid1: Mirror read failed.
> >> [ 1618.239372] device-mapper: raid1: Mirror read failed.
> >> [ 1618.239590] BTRFS warning (device dm-15): failed to read tree root
> >> [ 1618.239775] device-mapper: raid1: Mirror read failed.
> >> [ 1618.240055] device-mapper: raid1: Mirror read failed.
> >> [ 1618.240298] BTRFS warning (device dm-15): failed to read tree root
> >> [ 1618.240492] device-mapper: raid1: Mirror read failed.
> >> [ 1618.240744] device-mapper: raid1: Mirror read failed.
> >> [ 1618.240989] BTRFS warning (device dm-15): failed to read tree root
> >> [ 1618.363234] BTRFS error (device dm-15): open_ctree failed
> >>
> >> Without usebackuproot:
> >> [ 2149.015427] device-mapper: raid1: Mirror read failed.
> >> [ 2149.015700] device-mapper: raid1: Mirror read failed.
> >> [ 2149.015840] Buffer I/O error on dev dm-15, logical block
> >> 5371800560, async page read
> >> [ 2154.172102] BTRFS info (device dm-15): disk space caching is enabled
> >> [ 2155.325134] device-mapper: raid1: Mirror read failed.
> >> [ 2155.715439] device-mapper: raid1: Mirror read failed.
> >> [ 2155.715795] BTRFS warning (device dm-15): failed to read tree root
> >> [ 2155.851599] BTRFS error (device dm-15): open_ctree failed
> >>
> >> It appears that the damaged data has affected both the main and
> >> backup roots.
> >>
> >> Next I ran btrfs-find-root, which gave me the following:
> >> Superblock thinks the generation is 318593
> >> Superblock thinks the level is 1
> >> Well block 25826479144960(gen: 318346 level: 1) seems good, but
> >> generation/level doesn't match, want gen: 318593 level: 1
> >> Well block 25826450505728(gen: 318345 level: 1) seems good, but
> >> generation/level doesn't match, want gen: 318593 level: 1
> >> Well block 25826461237248(gen: 318344 level: 1) seems good, but
> >> generation/level doesn't match, want gen: 318593 level: 1
> >> Well block 25826479669248(gen: 318342 level: 0) seems good, but
> >> generation/level doesn't match, want gen: 318593 level: 1
> >> Well block 25826479603712(gen: 318342 level: 0) seems good, but
> >> generation/level doesn't match, want gen: 318593 level: 1
> >> Well block 25826468495360(gen: 318342 level: 0) seems good, but
> >> generation/level doesn't match, want gen: 318593 level: 1
> >> Well block 25826465923072(gen: 318342 level: 0) seems good, but
> >> generation/level doesn't match, want gen: 318593 level: 1
> >> Well block 25826477654016(gen: 318341 level: 0) seems good, but
> >> generation/level doesn't match, want gen: 318593 level: 1
> >> ...[truncated]
> >>
> >> I tried running btrfs check with the top 5 roots, but only the
> >> first 3 seems to be usable. However, even with the first 3, btrfs
> >> check gives me a lot of:
> >> bytenr mismatch, want=26008292753408, have=0
> >> bytenr mismatch, want=26353175658496, have=0
> >> bytenr mismatch, want=26353188618240, have=0
> >> bytenr mismatch, want=26353513299968, have=0
> >> and thousands of extent errors, etc. I do see references to
> >> directories within the filesystem though, so I'd think the tree root
> >> is at least pretty good.
> >>
> >> Just to see if btrfs check can reach a usable state, I made a COW
> >> snapshot of the imaged drive, and ran btrfs check --repair. However,
> >> it eventually gives up, and seemed to have wrecked the FS.
> >>
> >> Is there a way to mount/repair the filesystem with the found root
> >> instead? I'd like to copy the files off the image, but prefer not to
> >> use btrfs restore. Can btrfs check just copy the alternative root and
> >> not try to repair anything else?
> >>
> >> ====Misc info====
> >> # uname -a
> >> Linux tvm 4.14.0-3-amd64 #1 SMP Debian 4.14.13-1 (2018-01-14) x86_64
> >> GNU/Linux
> >> # btrfs --version
> >> btrfs-progs v4.13.3
> >>
> >> Thanks for the help!
> >> Liwei
> >
* Re: Damaged Root Tree(s)
2018-01-22 6:11 Liwei
@ 2018-01-22 6:26 ` Qu Wenruo
0 siblings, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2018-01-22 6:26 UTC (permalink / raw)
To: Liwei, linux-btrfs
On 2018年01月22日 14:11, Liwei wrote:
> Hi Wenruo,
>
>> On 2018年01月22日 09:11, Qu Wenruo wrote:
>>>
>>>
>>> On 2018年01月22日 03:16, Liwei wrote:
>>>> Hi list,
>>>>
>>>> ====TLDR====
>>>>> 1. Can I mount a filesystem using one of the roots found with
>>>> btrfs-find-root?
>>>
>>> Depends on the tree.
>>>
>>> If it's root tree, it's possible.
>>>
>>> Otherwise those found trees don't help much.
>>>
>>>
>>>> 2. Can btrfs check just fix the damaged root without attempting any
>>>> other repairs?
>>>
>>> No.
>>> But under most case, it's not a single corrupted tree but normally multiple.
>>>
>>>> 3. If the above is not possible, how should I proceed given that I
>>>> seem to have lost both the main and backup roots?
>>>
>>> In theory, it's possible to use specified fs tree root to salvage a
>>> filesystem.
>>>
>>> But under most case, metadata is protected by safer profile.
>>> So it's not implemented in btrfs-progs.
>>>
>>> Your current best try would be manually scanning through all tree backups.
>>>> Which need extra info.
>>>
>>> Please provide the following info:
>>>
>>> # btrfs inspect dump-super -FfA <device> | grep backup_tree_root | sort
>>> | uniq
>
> backup_tree_root: 26008360648704 gen: 318590 level: 1
> backup_tree_root: 26008365793280 gen: 318591 level: 1
> backup_tree_root: 26008367398912 gen: 318592 level: 1
> backup_tree_root: 26008375640064 gen: 318593 level: 1
>
>>>
>>> And try them one by one:
>>>
>>> # btrfs check --tree-root <number from above output> <device>
>
> Seems like they're all part of the drive's bad sectors:
>
> # btrfs check --tree-root 26008360648704 /dev/datavol/edata
> bytenr mismatch, want=26008360648704, have=0
> Couldn't read tree root
> ERROR: cannot open file system
> # btrfs check --tree-root 26008365793280 /dev/datavol/edata
> bytenr mismatch, want=26008365793280, have=0
> Couldn't read tree root
> ERROR: cannot open file system
> # btrfs check --tree-root 26008367398912 /dev/datavol/edata
> bytenr mismatch, want=26008367398912, have=0
> Couldn't read tree root
> ERROR: cannot open file system
> # btrfs check --tree-root 26008375640064 /dev/datavol/edata
> bytenr mismatch, want=26008375640064, have=0
> Couldn't read tree root
> ERROR: cannot open file system
>
>>
>> And find-root output can also be tried here.
>>
>> But please keep in mind, the older generation is, the less chance.
>
> After the first 10 or so entries from btrfs-find-root, btrfs check
> wouldn't even recognise the root nodes. So it seems like this is a
> gone case?
Unfortunately, it's gone.
And I doubt btrfs restore can restore anything, since the root tree
is corrupted.
The remaining idea would be to try the backup_fs_root values and see
if any of them passes "btrfs-debug-tree -b <bytenr>" with a good
enough result.
In that case, you may need a patched version of btrfs-debug-tree
(btrfs inspect dump-tree) to follow a tree node and verify whether
the tree is good enough.
Then pass the best bytenr to "btrfs restore -f <bytenr>" to have a
higher chance of salvaging your fs.
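Roughly, that sequence would look like the following (bytenrs and the
destination path are placeholders):

# btrfs inspect dump-super -FfA <device> | grep backup_fs_root | sort | uniq
# btrfs-debug-tree -b <backup_fs_root bytenr> <device>
# btrfs restore -f <best bytenr> <device> /path/to/recovery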
Thanks,
Qu
>
>>
>> Thanks,
>> Qu
>>
>>>
>>> If any one can proceed, then use it to repair:
>>>
>>> # btrfs check --tree-root <number> <device>
>>>
>>> And good luck.
>>>
>>> Thanks,
>>> Qu
>>>
>>>>
>>>> ====Background Information====
>>>> I have a 2x10TB raid0 (20TB, raid0 provided by md) volume that (my
>>>> theory is) experienced a headcrash while updating the root tree, or
>>>> maybe while it was carrying out background defragmentation.>
>>>> This occurred while I was setting up redundancy by using LVM
>>>> mirroring, so in the logs you'll see some dm errors. Unfortunately the
>>>> lost data has not been mirrored yet (what are the chances, given that
>>>> the mirror was 97% complete when this happened).
>>>>
>>>> Running a scrub on the raid shows that I have 1000+ unreadable
>>>> sectors, amounting to about 800kB of data. So I've got spare drives
>>>> and imaged the offending drive. Currently ddrescue is still trying to
>>>> read those sectors, but it seems unlikely that they'll ever succeed.
>>>>
>>>> ====Problem====
>>>> So with an imaged copy of the array, I tried remounting the
>>>> filesystem, but it refuses to mount even using 'usebackuproot':
>>>>
>>>> With usebackuproot:
>>>> [ 1610.788527] device-mapper: raid1: Mirror read failed.
>>>> [ 1610.788799] device-mapper: raid1: Mirror read failed.
>>>> [ 1610.788939] Buffer I/O error on dev dm-15, logical block
>>>> 5371800560, async page read
>>>> [ 1610.823141] BTRFS: device label edata devid 1 transid 318593
>>>> /dev/mapper/datavol-edata
>>>> [ 1616.778563] BTRFS info (device dm-15): trying to use backup root at
>>>> mount time
>>>> [ 1616.778758] BTRFS info (device dm-15): disk space caching is enabled
>>>> [ 1617.961152] device-mapper: raid1: Mirror read failed.
>>>> [ 1618.238198] device-mapper: raid1: Mirror read failed.
>>>> [ 1618.238498] BTRFS warning (device dm-15): failed to read tree root
>>>> [ 1618.238700] device-mapper: raid1: Mirror read failed.
>>>> [ 1618.238878] device-mapper: raid1: Mirror read failed.
>>>> [ 1618.239050] BTRFS warning (device dm-15): failed to read tree root
>>>> [ 1618.239207] device-mapper: raid1: Mirror read failed.
>>>> [ 1618.239372] device-mapper: raid1: Mirror read failed.
>>>> [ 1618.239590] BTRFS warning (device dm-15): failed to read tree root
>>>> [ 1618.239775] device-mapper: raid1: Mirror read failed.
>>>> [ 1618.240055] device-mapper: raid1: Mirror read failed.
>>>> [ 1618.240298] BTRFS warning (device dm-15): failed to read tree root
>>>> [ 1618.240492] device-mapper: raid1: Mirror read failed.
>>>> [ 1618.240744] device-mapper: raid1: Mirror read failed.
>>>> [ 1618.240989] BTRFS warning (device dm-15): failed to read tree root
>>>> [ 1618.363234] BTRFS error (device dm-15): open_ctree failed
>>>>
>>>> Without usebackuproot:
>>>> [ 2149.015427] device-mapper: raid1: Mirror read failed.
>>>> [ 2149.015700] device-mapper: raid1: Mirror read failed.
>>>> [ 2149.015840] Buffer I/O error on dev dm-15, logical block
>>>> 5371800560, async page read
>>>> [ 2154.172102] BTRFS info (device dm-15): disk space caching is enabled
>>>> [ 2155.325134] device-mapper: raid1: Mirror read failed.
>>>> [ 2155.715439] device-mapper: raid1: Mirror read failed.
>>>> [ 2155.715795] BTRFS warning (device dm-15): failed to read tree root
>>>> [ 2155.851599] BTRFS error (device dm-15): open_ctree failed
>>>>
>>>> It appears that the damaged data has affected both the main and
>>>> backup roots.
>>>>
>>>> Next I ran btrfs-find-root, which gave me the following:
>>>> Superblock thinks the generation is 318593
>>>> Superblock thinks the level is 1
>>>> Well block 25826479144960(gen: 318346 level: 1) seems good, but
>>>> generation/level doesn't match, want gen: 318593 level: 1
>>>> Well block 25826450505728(gen: 318345 level: 1) seems good, but
>>>> generation/level doesn't match, want gen: 318593 level: 1
>>>> Well block 25826461237248(gen: 318344 level: 1) seems good, but
>>>> generation/level doesn't match, want gen: 318593 level: 1
>>>> Well block 25826479669248(gen: 318342 level: 0) seems good, but
>>>> generation/level doesn't match, want gen: 318593 level: 1
>>>> Well block 25826479603712(gen: 318342 level: 0) seems good, but
>>>> generation/level doesn't match, want gen: 318593 level: 1
>>>> Well block 25826468495360(gen: 318342 level: 0) seems good, but
>>>> generation/level doesn't match, want gen: 318593 level: 1
>>>> Well block 25826465923072(gen: 318342 level: 0) seems good, but
>>>> generation/level doesn't match, want gen: 318593 level: 1
>>>> Well block 25826477654016(gen: 318341 level: 0) seems good, but
>>>> generation/level doesn't match, want gen: 318593 level: 1
>>>> ...[truncated]
>>>>
>>>> I tried running btrfs check with the top 5 roots, but only the
>>>> first 3 seems to be usable. However, even with the first 3, btrfs
>>>> check gives me a lot of:
>>>> bytenr mismatch, want=26008292753408, have=0
>>>> bytenr mismatch, want=26353175658496, have=0
>>>> bytenr mismatch, want=26353188618240, have=0
>>>> bytenr mismatch, want=26353513299968, have=0
>>>> and thousands of extent errors, etc. I do see references to
>>>> directories within the filesystem though, so I'd think the tree root
>>>> is at least pretty good.
>>>>
>>>> Just to see if btrfs check can reach a usable state, I made a COW
>>>> snapshot of the imaged drive, and ran btrfs check --repair. However,
>>>> it eventually gives up, and seemed to have wrecked the FS.
>>>>
>>>> Is there a way to mount/repair the filesystem with the found root
>>>> instead? I'd like to copy the files off the image, but prefer not to
>>>> use btrfs restore. Can btrfs check just copy the alternative root and
>>>> not try to repair anything else?
>>>>
>>>> ====Misc info====
>>>> # uname -a
>>>> Linux tvm 4.14.0-3-amd64 #1 SMP Debian 4.14.13-1 (2018-01-14) x86_64
>>>> GNU/Linux
>>>> # btrfs --version
>>>> btrfs-progs v4.13.3
>>>>
>>>> Thanks for the help!
>>>> Liwei
>>>