linux-btrfs.vger.kernel.org archive mirror
* degraded BTRFS RAID 1 not mountable: open_ctree failed, unable to find block group for 0
@ 2016-11-16 10:25 Martin Steigerwald
  2016-11-16 10:43 ` Roman Mamedov
  0 siblings, 1 reply; 17+ messages in thread
From: Martin Steigerwald @ 2016-11-16 10:25 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Martin

Hello!

A degraded BTRFS RAID 1 from one 3TB SATA HDD of my former workstation is not mountable.

Debian 4.8 kernel + btrfs-tools 4.7.3.

A btrfs restore seems to work well enough, so on the one hand there is no
urgency. On the other hand, I want to repurpose the hard disk, probably
next weekend. So if you want me to gather some debug data, please speak
up quickly. Thank you.

AFAIR I have been able to mount the filesystems in degraded mode, but
this may have been on the other SATA HDD, which I have already wiped
with the shred command.


I have this:

    merkaba:~> btrfs fi sh
    […]
    warning, device 2 is missing
    warning, device 2 is missing
    warning, device 2 is missing
    Label: 'debian'  uuid: […]
            Total devices 2 FS bytes used 20.10GiB
            devid    1 size 50.00GiB used 29.03GiB path /dev/mapper/satafp1-debian
            *** Some devices missing

    Label: 'daten'  uuid: […]
            Total devices 2 FS bytes used 135.02GiB
            devid    1 size 1.00TiB used 142.06GiB path /dev/mapper/satafp1-daten
            *** Some devices missing

    Label: 'backup'  uuid: […]
            Total devices 2 FS bytes used 88.38GiB
            devid    1 size 1.00TiB used 93.06GiB path /dev/mapper/satafp1-backup
            *** Some devices missing

But none of these filesystems seem to be mountable. Here are some attempts:

    merkaba:~#130> LANG=C mount -o degraded /dev/satafp1/backup /mnt/zeit
    mount: wrong fs type, bad option, bad superblock on /dev/mapper/satafp1-daten,
          missing codepage or helper program, or other error

          In some cases useful info is found in syslog - try
          dmesg | tail or so.
    merkaba:~> dmesg | tail -5
    [ 2945.155943] BTRFS info (device dm-13): allowing degraded mounts
    [ 2945.155953] BTRFS info (device dm-13): disk space caching is enabled
    [ 2945.155957] BTRFS info (device dm-13): has skinny extents
    [ 2945.611236] BTRFS warning (device dm-13): missing devices (1) exceeds the limit (0), writeable mount is not allowed
    [ 2945.646719] BTRFS: open_ctree failed


    merkaba:~> LANG=C mount -o usebackuproot /dev/satafp1/daten /mnt/zeit         
    mount: wrong fs type, bad option, bad superblock on /dev/mapper/satafp1-daten,
          missing codepage or helper program, or other error

          In some cases useful info is found in syslog - try
          dmesg | tail or so.
    merkaba:~#32> dmesg | tail -5                                           
    [ 5739.051433] BTRFS info (device dm-12): trying to use backup root at mount time
    [ 5739.051441] BTRFS info (device dm-12): disk space caching is enabled
    [ 5739.051444] BTRFS info (device dm-12): has skinny extents
    [ 5739.103153] BTRFS error (device dm-12): failed to read chunk tree: -5
    [ 5739.130304] BTRFS: open_ctree failed


    merkaba:~> LANG=C mount -o degraded,usebackuproot /dev/satafp1/daten /mnt/zeit
    mount: wrong fs type, bad option, bad superblock on /dev/mapper/satafp1-daten,
          missing codepage or helper program, or other error

          In some cases useful info is found in syslog - try
          dmesg | tail or so.
    merkaba:~#32> dmesg | tail -5                                                    
    [ 5801.704202] BTRFS info (device dm-12): trying to use backup root at mount time
    [ 5801.704206] BTRFS info (device dm-12): disk space caching is enabled
    [ 5801.704208] BTRFS info (device dm-12): has skinny extents
    [ 5803.928059] BTRFS warning (device dm-12): missing devices (1) exceeds the limit (0), writeable mount is not allowed
    [ 5804.064638] BTRFS: open_ctree failed


`btrfs check` reports:

    merkaba:~#32> btrfs check /dev/satafp1/backup 
    warning, device 2 is missing
    Checking filesystem on /dev/satafp1/backup
    UUID: 01cf0493-476f-42e8-8905-61ef205313db
    checking extents
    checking free space cache
    failed to load free space cache for block group 58003030016
    failed to load free space cache for block group 60150513664
    failed to load free space cache for block group 62297997312
    […]
    checking fs roots
    ^C

I aborted it at this point because I wanted to try the clear_cache mount
option after seeing this. I can redo the check after btrfs restore has
completed.

    merkaba:~> mount -o degraded,clear_cache /dev/satafp1/backup /mnt/zeit
    mount: wrong file system type, invalid options, the
    superblock of /dev/mapper/satafp1-backup is damaged, missing
    codepage, or another error

          Sometimes the system log provides valuable information –
          try  dmesg | tail  or similar
    merkaba:~#32> dmesg | tail -6
    [ 3080.120687] BTRFS info (device dm-13): allowing degraded mounts
    [ 3080.120699] BTRFS info (device dm-13): force clearing of disk cache
    [ 3080.120703] BTRFS info (device dm-13): disk space caching is enabled
    [ 3080.120706] BTRFS info (device dm-13): has skinny extents
    [ 3080.150957] BTRFS warning (device dm-13): missing devices (1) exceeds the limit (0), writeable mount is not allowed
    [ 3080.195941] BTRFS: open_ctree failed

    merkaba:~> mount -o degraded,clear_cache,usebackuproot /dev/satafp1/backup /mnt/zeit
    mount: wrong file system type, invalid options, the
    superblock of /dev/mapper/satafp1-backup is damaged, missing
    codepage, or another error

          Sometimes the system log provides valuable information –
          try  dmesg | tail  or similar

    merkaba:~> dmesg | tail -7
    [ 3173.784713] BTRFS info (device dm-13): allowing degraded mounts
    [ 3173.784728] BTRFS info (device dm-13): force clearing of disk cache
    [ 3173.784737] BTRFS info (device dm-13): trying to use backup root at mount time
    [ 3173.784742] BTRFS info (device dm-13): disk space caching is enabled
    [ 3173.784746] BTRFS info (device dm-13): has skinny extents
    [ 3173.816983] BTRFS warning (device dm-13): missing devices (1) exceeds the limit (0), writeable mount is not allowed
    [ 3173.865199] BTRFS: open_ctree failed
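
The two kernel messages in the attempts above point in different
directions: "failed to read chunk tree" appeared when the degraded
option was left out, while "writeable mount is not allowed" only rules
out a read-write mount. A minimal sketch that maps these two messages to
a next step follows. The function name btrfs_mount_hint is hypothetical,
and the read-only degraded mount is a suggestion to try, not something
that was tested in this report.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log text on stdin and prints a hint
# based on the two BTRFS messages quoted in this report.
btrfs_mount_hint() {
    log=$(cat)
    case $log in
        *"writeable mount is not allowed"*)
            # Only read-write mounts are refused by this check.
            echo "try: mount -o ro,degraded" ;;
        *"failed to read chunk tree"*)
            # Seen above when -o degraded was missing.
            echo "try: add -o degraded" ;;
        *)
            echo "no known pattern" ;;
    esac
}

# Example (needs root; device and mount point as used in the report):
#   dmesg | tail -5 | btrfs_mount_hint
#   mount -o ro,degraded /dev/satafp1/backup /mnt/zeit
```

The helper only looks at log text, so it can be tried on a saved dmesg
excerpt without touching the filesystem.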

I aborted repairing after this assert:

    merkaba:~#130> btrfs check --repair /dev/satafp1/backup &| stdbuf -oL tee btrfs-check-repair-satafp1-backup.log
    enabling repair mode
    warning, device 2 is missing
    Checking filesystem on /dev/satafp1/backup
    UUID: 01cf0493-476f-42e8-8905-61ef205313db
    checking extents
    Unable to find block group for 0
    extent-tree.c:289: find_search_start: Assertion `1` failed.
    btrfs[0x43e418]
    btrfs(btrfs_reserve_extent+0x5c9)[0x4425df]
    btrfs(btrfs_alloc_free_block+0x63)[0x44297c]
    btrfs(__btrfs_cow_block+0xfc)[0x436636]
    btrfs(btrfs_cow_block+0x8b)[0x436bd8]
    btrfs[0x43ad82]
    btrfs(btrfs_commit_transaction+0xb8)[0x43c5dc]
    btrfs[0x4268b4]
    btrfs(cmd_check+0x1111)[0x427d6d]
    btrfs(main+0x12f)[0x40a341]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7fb2e6bec2b1]
    btrfs(_start+0x2a)[0x40a37a]

    merkaba:~#130> btrfs --version
    btrfs-progs v4.7.3

(Honestly I think asserts like this need to be gone from btrfs-tools for good)

About this I only found this unanswered mailing list post:

btrfs-convert: Unable to find block group for 0
Date: Fri, 24 Jun 2016 11:09:27 +0200
https://www.spinics.net/lists/linux-btrfs/msg56478.html


Out of curiosity I tried:

    merkaba:~#1> btrfs rescue zero-log //dev/satafp1/daten
    warning, device 2 is missing
    Clearing log on //dev/satafp1/daten, previous log_root 0, level 0
    Unable to find block group for 0
    extent-tree.c:289: find_search_start: Assertion `1` failed.
    btrfs[0x43e418]
    btrfs(btrfs_reserve_extent+0x5c9)[0x4425df]
    btrfs(btrfs_alloc_free_block+0x63)[0x44297c]
    btrfs(__btrfs_cow_block+0xfc)[0x436636]
    btrfs(btrfs_cow_block+0x8b)[0x436bd8]
    btrfs[0x43ad82]
    btrfs(btrfs_commit_transaction+0xb8)[0x43c5dc]
    btrfs[0x42c0d4]
    btrfs(main+0x12f)[0x40a341]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7fb2f16a82b1]
    btrfs(_start+0x2a)[0x40a37a]

(I didn't expect much, as this is an issue that AFAIK does not happen
easily anymore, but I also thought it could not do much harm.)

Superblocks themselves seem to be sane:

    merkaba:~#1> btrfs rescue super-recover //dev/satafp1/daten
    All supers are valid, no need to recover

So "btrfs restore" it is:

    merkaba:[…]> btrfs restore -mxs /dev/satafp1/daten daten-restore

This prints out a ton of:

    Trying another mirror
    Trying another mirror

But it actually works, somewhat. I just got:

    Trying another mirror
    We seem to be looping a lot on daten-restore/[…]/virtualbox-4.1.18-dfsg/out/lib/vboxsoap.a, do you want to keep going on ? (y/N/a):

after about 35 GiB of data had been restored. I answered no to this one,
and now it is at about 53 GiB already. I just got another one of these,
but again not for a file I actually need.
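
To get a feel for how often restore had to fall back to the other copy,
the messages can be counted from a saved log. This is only a sketch: the
function name count_mirror_retries and the log file name restore.log are
made up for illustration, and it assumes the "Trying another mirror"
lines were captured together with the rest of the output.

```shell
#!/bin/sh
# Hypothetical helper: counts "Trying another mirror" lines read from stdin.
count_mirror_retries() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c 'Trying another mirror' || true
}

# Example (log file name is an assumption):
#   btrfs restore -mxs /dev/satafp1/daten daten-restore 2>&1 | tee restore.log
#   count_mirror_retries < restore.log
```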

Thanks,

-- 
Martin Steigerwald  | Trainer

teamix GmbH
Südwestpark 43
90449 Nürnberg

Tel.:  +49 911 30999 55 | Fax: +49 911 30999 99
mail: martin.steigerwald@teamix.de | web:  http://www.teamix.de | blog: http://blog.teamix.de

Nürnberg Local Court (Amtsgericht), HRB 18320 | Managing Directors: Oliver Kügow, Richard Müller

teamix Support Hotline: +49 911 30999-112
 


* Re: degraded BTRFS RAID 1 not mountable: open_ctree failed, unable to find block group for 0
@ 2017-08-22  9:31 g6094199
  2017-08-22 10:28 ` Dmitrii Tcvetkov
  0 siblings, 1 reply; 17+ messages in thread
From: g6094199 @ 2017-08-22  9:31 UTC (permalink / raw)
  To: linux-btrfs; +Cc: rm

Hi guys,

I'm picking up this old topic because I'm running into a similar problem.

I'm running an Ubuntu 16.04 server (HWE kernel 4.8) with two NVMe SSDs
as RAID 1 for /. One NVMe died and I had to replace it, which is where
the trouble began. I replaced the NVMe, booted degraded, added the new
disk to the RAID (btrfs dev add) and removed the missing/dead device
(btrfs dev del). Everything worked well. BUT when I rebooted I ran into
the "BTRFS RAID 1 not mountable: open_ctree failed, unable to find block
group for 0" problem because of a MISSING disk?! I checked the btrfs
list and found that there was a patch that enabled stricter handling of
missing devices (I can't find the related patch anymore at the moment),
which was merged some kernels before 4.8 but was NOT in 4.4. So I
installed the Ubuntu 4.4 kernel and the system started booting and
working again. My problem is that I can't update this production machine
to anything after 4.4. :-(
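
The replacement steps described above can be sketched as a small script.
This is an illustration under assumptions, not the exact commands that
were run: the wrapper names, the device paths in the example call and
the /mnt mount point are hypothetical, and on a root filesystem the
degraded mount would come from the boot options instead. By default the
script only prints the commands.

```shell
#!/bin/sh
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical wrapper: $1 = surviving device, $2 = new device, $3 = mount point.
replace_missing_raid1_member() {
    run mount -o degraded "$1" "$3"
    run btrfs device add "$2" "$3"
    run btrfs device delete missing "$3"
    # Convert chunks written as "single" while degraded back to RAID1.
    run btrfs balance start -mconvert=raid1,soft -dconvert=raid1,soft "$3"
    # Afterwards no device should be reported as missing.
    run btrfs filesystem show "$3"
}

# Example call with illustrative device names (dry run by default).
replace_missing_raid1_member /dev/nvme0n1p1 /dev/sda1 /mnt
```

Setting DRY_RUN=0 would execute the commands for real, which needs root
and should only be done after checking the printed list.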

So, first: we should investigate why the disk did not get removed
correctly. btrfs dev del should remove the device correctly, right? Is
there a bug?

Second: was the restriction on handling missing devices too strict? Is
there a bug?

Third: I saw https://patchwork.kernel.org/patch/9419189/ from Roman. Did
he receive any comments on his patch? It could help with this problem,
too.


Regards

Sash


* Re: degraded BTRFS RAID 1 not mountable: open_ctree failed, unable to find block group for 0
@ 2017-08-23 13:12 g6094199
  0 siblings, 0 replies; 17+ messages in thread
From: g6094199 @ 2017-08-23 13:12 UTC (permalink / raw)
  To: Dmitrii Tcvetkov; +Cc: linux-btrfs

> -----Original Message-----
> From: Dmitrii Tcvetkov 
> Sent: Tue. 22.08.2017 12:28
> To: g6094199@freenet.de
> Cc: linux-btrfs@vger.kernel.org
> Subject: Re: degraded BTRFS RAID 1 not mountable: open_ctree failed, unable to find block group for 0
>
> On Tue, 22 Aug 2017 11:31:23 +0200
> g6094199@freenet.de wrote:
>> So 1st should be investigating why did the disk not get removed
>> correctly? Btrfs dev del should remove the device correctly, right? Is
>> there a bug?
>
> It should and probably did. To check that we need to see output of 
> btrfs filesystem show 
> and output of 
> btrfs filesystem usage

Hi Dmitrii!

Thanks for your suggestions.

root@vHost1:~# btrfs fi show /
Label: 'System'  uuid: 35fdbfd4-5809-4a30-94c1-5e3ca206ca4d
    Total devices 2 FS bytes used 17.00GiB
    devid    6 size 50.00GiB used 22.03GiB path /dev/nvme0n1p1
    devid    7 size 48.83GiB used 22.03GiB path /dev/sda1

root@vHost1:~# btrfs fi df /
Data, RAID1: total=19.00GiB, used=15.68GiB
System, RAID1: total=32.00MiB, used=16.00KiB
Metadata, RAID1: total=3.00GiB, used=1.32GiB
GlobalReserve, single: total=384.00MiB, used=0.00B

root@vHost1:/var/log#  btrfs filesystem usag /
Overall:
    Device size:                  98.83GiB
    Device allocated:             44.06GiB
    Device unallocated:           54.76GiB
    Device missing:                  0.00B
    Used:                         34.11GiB
    Free (estimated):             30.68GiB    (min: 30.68GiB)
    Data ratio:                       2.00
    Metadata ratio:                   2.00
    Global reserve:              384.00MiB    (used: 0.00B)

Data,RAID1: Size:19.00GiB, Used:15.70GiB
   /dev/nvme0n1p1      19.00GiB
   /dev/sda1           19.00GiB

Metadata,RAID1: Size:3.00GiB, Used:1.36GiB
   /dev/nvme0n1p1       3.00GiB
   /dev/sda1            3.00GiB

System,RAID1: Size:32.00MiB, Used:16.00KiB
   /dev/nvme0n1p1      32.00MiB
   /dev/sda1           32.00MiB

Unallocated:
   /dev/nvme0n1p1      27.97GiB
   /dev/sda1           26.80GiB
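
The "Free (estimated)" figure in the output above appears to be
consistent with the free space inside the data chunks plus the
unallocated raw space divided by the data ratio. A small sketch of that
arithmetic, with the numbers copied from the output, follows. The
function name estimate_free is hypothetical, and the formula is an
assumption about how the tool arrives at the value, not a quote from
its source.

```shell
#!/bin/sh
# Hypothetical helper. Arguments in GiB:
#   $1 data size, $2 data used, $3 raw unallocated space, $4 data ratio
estimate_free() {
    awk -v size="$1" -v used="$2" -v unalloc="$3" -v ratio="$4" \
        'BEGIN { printf "%.2f\n", (size - used) + unalloc / ratio }'
}

# Values from the usage output: (19.00 - 15.70) + 54.76 / 2.00
estimate_free 19.00 15.70 54.76 2.00   # prints 30.68
```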

 
> If there are non-raid1 chunks then you need to do soft balance:
> btrfs balance start -mconvert=raid1,soft -dconvert=raid1,soft

Yes, of course I did a balance after replacing the disk (see above). I'm aware of the problems that occur with single chunks, etc. I ran a soft balance again as you suggested, and it finished within seconds.
 
> The balance should finish very quickly as you probably have only one of
> data and metadata single chunks. They appeared during writes when the
> filesystem was mounted read-write in degraded mode.

I guess the typical errors are now sorted out. I will reboot the machine with a current HWE kernel and send the logs. Is there anything else I can do?
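
Whether any non-RAID1 chunks are left, as asked about in the quoted
advice, can be read off the `btrfs fi df` output. A minimal sketch,
assuming the output format shown earlier in this message; the function
name non_raid1_chunks is hypothetical. GlobalReserve is skipped because
it is reported as "single" in the output above even though all chunk
types are RAID1.

```shell
#!/bin/sh
# Hypothetical helper: reads `btrfs filesystem df` output on stdin and
# prints every chunk type whose profile is not RAID1.
non_raid1_chunks() {
    awk -F'[,:] *' '
        NF > 1 && $1 != "GlobalReserve" && $2 != "RAID1" {
            print $1 " (" $2 ")"
        }
    '
}

# Example (empty output means nothing needs converting):
#   btrfs filesystem df / | non_raid1_chunks
```

With the `btrfs fi df /` output shown above, this prints nothing.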

regards
sash






end of thread, other threads:[~2017-08-23 13:18 UTC | newest]

Thread overview: 17+ messages
2016-11-16 10:25 degraded BTRFS RAID 1 not mountable: open_ctree failed, unable to find block group for 0 Martin Steigerwald
2016-11-16 10:43 ` Roman Mamedov
2016-11-16 10:55   ` Martin Steigerwald
2016-11-16 11:00     ` Roman Mamedov
2016-11-16 11:04       ` Martin Steigerwald
2016-11-16 12:57         ` Austin S. Hemmelgarn
2016-11-16 17:06           ` Martin Steigerwald
2016-11-17 20:05             ` Chris Murphy
2016-11-17 20:20               ` Austin S. Hemmelgarn
2016-11-19 20:27                 ` Chris Murphy
2016-11-20 11:58                 ` Niccolò Belli
2016-11-17 20:46               ` Martin Steigerwald
2016-11-16 11:18     ` Martin Steigerwald
2016-11-16 12:48     ` Austin S. Hemmelgarn
  -- strict thread matches above, loose matches on Subject: below --
2017-08-22  9:31 g6094199
2017-08-22 10:28 ` Dmitrii Tcvetkov
2017-08-23 13:12 g6094199
