From: "Maximilian Bräutigam" <m@xbra.de>
To: linux-btrfs@vger.kernel.org
Subject: Re: [PARTIALLY SOLVED] Btrfs RAID1 corrupted after crash
Date: Mon, 14 Apr 2014 09:12:44 +0200
Message-ID: <534B8A6C.4090808@xbra.de>
In-Reply-To: <pan$b81be$f24a3d23$40041e1f$3538cdeb@cox.net>

On 14.04.2014 00:42, Duncan wrote:
> Maximilian Bräutigam posted on Sun, 13 Apr 2014 22:18:21 +0200 as
> excerpted:
>
>> unfortunately, I am very, very desperate and I highly appreciate any help.
>> One week ago, I moved my entire system to btrfs to set up a RAID1. I
>> created the RAID between device /dev/sdb and /dev/sdc with no partition
>> table on normal HDDs. Everything was working smoothly until my computer
>> crashed and at reboot I was not able to mount the device (my home dir)
>> again and got the following messages:
>
> You did your research before switching to a new filesystem and know that
> (as the btrfs kernel config option implies, and as the mkfs.btrfs command
> said at least last I used it, tho that was the v3.12 version) btrfs isn't
> entirely stable yet, and that (even more than with fully stable
> filesystems, where the general principle still applies) you should keep
> tested-to-be-usable backups when running it, or by action if not words,
> you're demonstrating that you really don't care about the data you place
> on it and don't mind if it gets trashed, right?
>
> Good. Then you either have a backup and can simply mkfs from your rescue
> method and restore from that backup, or you've demonstrated by your
> actions that the data wasn't of any major value to you anyway. No big
> deal either way! =:^)
>
> In case you didn't, well, you still have a reasonably good chance at
> recovery =:^), but regardless of whether it's recovered or not, do chalk
> this up to a learning experience and do your research and have those
> backups ready and tested next time, OK?
>
> [snip dmesg output from first attempt to mount]
>
>> So I cleared the cache with trying the mount option clear_cache
>
> Good. First thing to try. =:^)
>
>> but it stayed problematic and I was not able to mount it:
>>
>> [ 368.159594] BTRFS: error (device sdc) in __btrfs_free_extent:5755:
>> errno=-5 IO failure
>> [ 368.159602] BTRFS: error (device sdc) in
>> btrfs_run_delayed_refs:2713: errno=-5 IO failure
>> [ 368.165584] BTRFS warning (device sdc): Skipping commit of aborted
>> transaction.
>> [ 368.165589] BTRFS: error (device sdc) in cleanup_transaction:1545:
>> errno=-5 IO failure
>> [ 368.165787] BTRFS: error (device sdc) in
>> open_ctree:2839: errno=-5 IO failure (Failed to recover log tree)
>> [ 368.227161] BTRFS: open_ctree failed
>
> OK, there's several things to try based on that output...
>
>> Now, if I tried to mount it manually with degraded option enabled:
>>
>> # mount -t btrfs -o degraded /dev/sdb /mnt/sonst/
>> mount: wrong fs type, bad option, bad superblock on /dev/sdb,
>> missing codepage or helper program, or other error
>>
>> In some cases useful info is found in syslog - try dmesg | tail
>> or so.
>
> FWIW, the degraded option could be used if you didn't have both devices
> available, but the above dmesg got beyond that, so degraded isn't likely
> to help here.
>
>
>> Now I run btrfsck with repair option enabled but still I cannot mount
>> it.
>
> That was a mistake, as you'd have known if you had read this list before
> you tried your btrfs test. btrfsck --repair can fix some problems, but
> the code is rather new and not well tested and it can also make some
> problems it doesn't know about worse, so the recommendation is to try it
> last, after all other attempts to either fix the problem or simply
> recover the data have failed and the next step would be a mkfs, so you're
> not losing anything by trying it anyway. Either that, or run it in
> repair mode (without --repair it's OK since it's read-only and thus can't
> do further damage) only after being told to do so by a dev who can read
> the output from the read-only run and other diagnostics and is thus
> relatively confident it will fix the problems without doing further
> damage.
>
>> Here you can find the dmesg and btrfsck outputs:
>> dmesg: http://pastebin.com/zsaKQ0h1
>> btrfsck: http://pastebin.com/xva6uJwT
>>
>> Please, help me! ;( Are there other options to investigate my RAID or to
>> even temporarily mount it to get some data? What went wrong here? What
>> can I do? Why is a simple crash making my RAID unusable? Can I use other
>> tools for a recovery?
>
>> Archlinux, linux-3.14-5, btrfs-progs-3.14-1
>
> Good. You're using current kernel and tools. =:^)
>
> As hinted above, there are indeed additional tools to try, and there's a
> fair chance you can at least recover some/most of the data. =:^) Tho
> you didn't do yourself any favors running btrfsck --repair before trying
> them. =:^(
>
> Please read the wiki and manpages before doing anything else so as to
> increase the chances of recovery without further damage, but there's the
> recovery mount option (which often works best with ro), and tools to
> bypass the log tree and to recover from previous tree roots, among other
> things.
>
> wiki start page (suitable for memory or bookmarking):
>
> https://btrfs.wiki.kernel.org
>
> Here's the wiki's btrfsck page, which has a nice list of other things to
> try before you use it with --repair (and a link to the page of a list
> regular with further detail, too), but they will hopefully work afterward
> as well. Given the log-tree error in your dmesg, the btrfs-zero-log tool
> might be useful. But I'd definitely try mount -o ro,recovery first, and
> if that works, get everything to backup before trying anything else.
>
> https://btrfs.wiki.kernel.org/index.php/Btrfsck
>

Hi Duncan,

I was not really afraid for my data, since I have several external backups
of the important data and git repos of what I do for work. But I would
have lost some very recent photos, which would not have been nice. And I
am (still) afraid of setting up and configuring a properly working home dir
on another fs again. This is just time-consuming. Furthermore, I thought
that btrfs had reached a certain level of maturity, which to me means some
fail-safety. But "filesystem disk format is no longer unstable"
[1] obviously does not mean that there is an intact ecosystem of repair
tools (or, better said, one program that simply tries its best).

I tried several things according to [2]:

1) btrfs restore
   This did not really work; it recovered only a few GB of my data.
2) Then I noticed some "transid verify failed" messages, so I ran
   btrfs-zero-log DEVICE
3) From here I was able to mount my volume again – so I could save my
latest photos. ;)
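
The three steps above can be sketched as a dry run. This is only an illustration: the function name recovery_plan and the arguments (device, restore target, mount point) are placeholders, not the real paths from this thread, and the function only prints the commands in order instead of running them.

```shell
#!/bin/sh
# Dry run: prints the recovery commands in order, executes nothing.
# The three arguments are hypothetical placeholders.
recovery_plan() {
    dev=$1; out=$2; mnt=$3
    # 1) copy whatever is still readable off the unmounted device
    echo "btrfs restore $dev $out"
    # 2) discard the log tree that could not be replayed at mount time
    echo "btrfs-zero-log $dev"
    # 3) mount again and copy the remaining data off
    echo "mount -t btrfs $dev $mnt"
}

recovery_plan /dev/sdX /mnt/rescue /mnt/home
```

Note that the quoted advice was to try mount -o ro,recovery first, before btrfs-zero-log; the sketch only records the order used here.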

When I mount my volume with autodefrag,compress=lzo,subvolid=0, I end up
with a device mounted "rw". Then I copy some data, e.g. with rsync, and at
some point it turns "ro". I noticed this when I wanted to scrub the
devices, which naturally only works on writable mounts. And it is still
not possible to boot from the device again; I don't know why.
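
One way to catch the moment the mount flips from "rw" to "ro" is to look at the options column of /proc/mounts. The helper below is a minimal sketch under that assumption; mount_mode is a made-up name, and it reads /proc/mounts-style lines from stdin so it can be tried on sample text as well.

```shell
#!/bin/sh
# Prints "ro" or "rw" for the mount point given as $1.
# Input on stdin: lines in /proc/mounts format
#   device mountpoint fstype options dump pass
# If the mount point appears more than once, the last entry wins.
mount_mode() {
    awk -v mp="$1" '
        $2 == mp {
            n = split($4, opts, ",")
            m = "rw"
            # only an exact "ro" option counts, not e.g. "compress=lzo"
            for (i = 1; i <= n; i++) if (opts[i] == "ro") m = "ro"
            mode = m
        }
        END { if (mode != "") print mode }'
}
```

Usage on a live system would be something like: mount_mode /home < /proc/mounts, run before and after the rsync.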

Things to do next: try again with the recovery mount option. If that does
not work: roll back to ext4. But I really like the idea behind COW,
subvolumes, no partitioning, RAID and everything in one fs. Snapshots
against user mistakes, RAID against disk failure: perfectly safe, if it
were not for the fs itself.

So far, so good. The problem is that even if I can get back to a fully
working device or RAID again, the workload (that I have to put in just
because my computer crashed) is much too high for something as fundamental
as a home dir.

Duncan, I appreciate your email. Unfortunately, the only thing I have
learned so far is to give btrfs some more decades to age. ;)

Best wishes and thanks again,
Max
[1] https://btrfs.wiki.kernel.org/index.php/Main_Page
[2] https://unix.stackexchange.com/questions/32440/how-do-i-fix-btrfs