From: Ferry Toth <ftoth@exalondelft.nl>
To: linux-btrfs@vger.kernel.org
Subject: Re: raid10 array lost with single disk failure?
Date: Sun, 9 Jul 2017 23:13:16 +0000 (UTC)
Message-ID: <ojudab$mcr$1@blaine.gmane.org>
In-Reply-To: pan$b8d9f$672ac146$c6d8fd45$3b1ed204@cox.net
On Sat, 08 Jul 2017 20:51:41 +0000, Duncan wrote:
> Adam Bahe posted on Fri, 07 Jul 2017 23:26:31 -0500 as excerpted:
>
>> I did recently upgrade the kernel a few days ago from
>> 4.8.7-1.el7.elrepo.x86_64 to 4.10.6-1.el7.elrepo.x86_64. I had also
>> added a new 6TB disk a few days ago but I'm not sure if the balance
>> finished as it locked up sometime today when I was at work. Any ideas
>> how I can recover? Even if I have 1 bad disk, raid10 should have kept
>> my data safe no? Is there anything I can do to recover?
>
> Yes, btrfs raid10 should be fine with a single bad device. That's
> unlikely to be the issue.
I'm wondering about that.
The btrfs wiki rates raid10 as 'mostly ok' and mentions that btrfs needs to
be able to create 2 copies of a file to avoid going into an irreversible
read-only state.
To me that sounds like you are safe against a single failing drive if you
had 5 to start with. Is that correct?
'Mostly ok' is not a very useful message. We need to know what to do to be
safe, and to have defined procedures that prevent screwing up
'unrecoverably' when something bad happens.
Any advice?
> But you did well to bring up the balance. Have you tried mounting with
> the "skip_balance" mount option?
>
> Sometimes a balance will run into a previously undetected problem with
> the filesystem and crash. While mounting would otherwise still work, as
> soon as the filesystem goes active at the kernel level and before the
> mount call returns to userspace, the kernel will see the in-progress
> balance and attempt to continue it. But if it crashed while processing
> a particular block group (aka chunk), of course that's the first one in
> line to continue the balance with, which will naturally crash again as
> it comes to the same inconsistency that triggered the crash the first
> time.
>
> So the skip_balance mount option was invented to create a work-around
> and allow you to mount the filesystem again. =:^)
>
> The fact that it sits there for awhile trying to do IO on all devices
> before it crashes is another clue it's probably the resumed balance
> crashing things as it comes to the same inconsistency that triggered the
> original crash during balance, so it's very likely that skip_balance
> will help. =:^)
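For reference, a minimal sketch of such a mount. The device and mount point
are placeholders I made up (they are not from Adam's report), and the script
only prints the command unless DRY_RUN=0 is set, so nothing touches the array
by accident.

```shell
#!/bin/sh
# Placeholders, not taken from the thread: adjust to the real array.
DEV="${DEV:-/dev/sdX}"
MNT="${MNT:-/mnt/array}"

# Print the command by default; with DRY_RUN=0 execute it (needs root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

mount_skip_balance() {
    # skip_balance leaves the interrupted balance paused instead of
    # letting the kernel resume it during the mount.
    run mount -o skip_balance "$DEV" "$MNT"
}

mount_skip_balance
```

Run as is, it shows the command it would execute; with DRY_RUN=0 it performs
the mount.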
>
> Assuming that lets you mount, the next thing I'd try is a btrfs scrub.
> Chances are it'll find some checksum problems, but given that you're
> running raid10, there's a second copy it can try to use to correct the
> bad one and there's a reasonably good chance scrub will find and fix
> your problems. Even if it can't fix them all, it should get you closer,
> with less chance at making things worse instead of better than more
> risky options such as btrfs check with --repair.
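A sketch of the scrub step as I understand it, assuming the filesystem is
already mounted with skip_balance. The mount point is a placeholder, and the
commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Placeholder mount point, not taken from the thread.
MNT="${MNT:-/mnt/array}"

# Print the command by default; with DRY_RUN=0 execute it (needs root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

scrub_array() {
    # -B keeps the scrub in the foreground, -d reports statistics per device.
    run btrfs scrub start -B -d "$MNT" &&
    # Show the summary again, including corrected and uncorrectable errors.
    run btrfs scrub status "$MNT"
}

scrub_array
```

The number to watch in the summary is the uncorrectable error count, since
those are the blocks for which no good second copy was found.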
>
> If a scrub completes with no uncorrected errors, I'd do an umount/mount
> cycle or reboot just to be sure -- don't forget the skip_balance option
> again tho -- and then, ensuring you're not doing anything that a crash
> would interrupt and have taken the opportunity presented to update your
> backups if you need to and assuming you consider the data worth more
> than the time/trouble/resources required for a backup, try a balance
> resume.
>
> Once the balance resume gets reasonably past the time it otherwise took
> to crash, you can reasonably assume you've safely corrected at least
> /that/ inconsistency, and hope the scrub took care of any others before
> you got to them.
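The remount and resume sequence described above could look roughly like this.
Device and mount point are placeholders, and by default the commands are only
printed (set DRY_RUN=0 to execute them).

```shell
#!/bin/sh
# Placeholders, not taken from the thread: adjust to the real array.
DEV="${DEV:-/dev/sdX}"
MNT="${MNT:-/mnt/array}"

# Print the command by default; with DRY_RUN=0 execute it (needs root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

resume_balance() {
    # Remount cycle, again with skip_balance so the mount itself
    # does not restart the balance.
    run umount "$MNT" &&
    run mount -o skip_balance "$DEV" "$MNT" &&
    # Explicitly continue the paused balance.
    run btrfs balance resume "$MNT"
}

resume_balance
```

While the resume is running, "btrfs balance status" on the same mount point,
issued from another terminal, shows how far it has got.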
>
> But of course all scrub does is verify checksums and where there's a
> second copy (as there is with dup, raid1 and raid10 modes) attempt a
> repair of the bad copy with the second one, of course verifying it as
> well in the process. If the second copy of that block is bad too or in
> cases where there isn't such a second copy, it'll detect but not be able
> to fix the block with a bad checksum, and if the block has a valid
> checksum but is logically invalid for other reasons, scrub won't detect
> it, because /all/ it does is verify checksums, not actual filesystem
> consistency. That's what the somewhat more risky (if --repair or other
> fix option is used, not in read-only mode, which detects but doesn't
> attempt to fix) btrfs check is for.
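For the read-only variant of btrfs check mentioned here, a minimal sketch. It
assumes the filesystem is unmounted, and the device name is a placeholder. As
with any dry run, the command is only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Placeholder device, not taken from the thread.
DEV="${DEV:-/dev/sdX}"

# Print the command by default; with DRY_RUN=0 execute it (needs root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

check_readonly() {
    # --readonly only reports inconsistencies; it does not modify the
    # filesystem, unlike --repair.
    run btrfs check --readonly "$DEV"
}

check_readonly
```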
>
> So if skip_balance doesn't work, or it does but scrub can't fix all the
> errors it finds, or scrub fixes everything it detects but a balance
> resume still crashes, then it's time to try riskier fixes. I'll let
> others guide you there if needed, but will leave you with one
> reminder...
>
> Sysadmin's first rule of backups:
>
> Don't test fate and challenge reality! Have your backups or regardless
> of claims to the contrary you're defining your data as throw-away value,
> and eventually, fate and reality are going to call you on it!
>
> So don't worry too much even if you lose the filesystem. Either you
> have backups and can restore from them should it be necessary, or you
> defined the data as not worth the trouble of those backups, and losing
> it isn't a big deal, because in either case you saved what was truly
> important to you, either the data because it was important enough to you
> to have backups, or the time/resources/trouble you would have spent
> doing those backups, which you still saved regardless of whether you can
> save the data or not. =:^)
>
> --
> Duncan - List replies preferred. No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master." Richard Stallman
Thread overview: 7+ messages
2017-07-08 4:26 raid10 array lost with single disk failure? Adam Bahe
2017-07-08 4:40 ` Adam Bahe
2017-07-08 19:37 ` Duncan
2017-07-08 20:51 ` Duncan
2017-07-09 23:13 ` Ferry Toth [this message]
2017-07-10 2:13 ` Adam Bahe
2017-07-10 12:49 ` Austin S. Hemmelgarn