Re: raid10 array lost with single disk failure?

From: Ferry Toth <ftoth@exalondelft.nl>
To: linux-btrfs@vger.kernel.org
Subject: Re: raid10 array lost with single disk failure?
Date: Sun, 9 Jul 2017 23:13:16 +0000 (UTC)
Message-ID: <ojudab$mcr$1@blaine.gmane.org>
In-Reply-To: pan$b8d9f$672ac146$c6d8fd45$3b1ed204@cox.net

On Sat, 08 Jul 2017 20:51:41 +0000, Duncan wrote:

> Adam Bahe posted on Fri, 07 Jul 2017 23:26:31 -0500 as excerpted:
> 
>> I did recently upgrade the kernel a few days ago from
>> 4.8.7-1.el7.elrepo.x86_64 to 4.10.6-1.el7.elrepo.x86_64. I had also
>> added a new 6TB disk a few days ago but I'm not sure if the balance
>> finished as it locked up sometime today when I was at work. Any ideas
>> how I can recover? Even if I have 1 bad disk, raid10 should have kept
>> my data safe no? Is there anything I can do to recover?
> 
> Yes, btrfs raid10 should be fine with a single bad device.  That's
> unlikely to be the issue.

I'm wondering about that.

The btrfs wiki rates raid10 as 'mostly ok' and mentions that btrfs needs to
be able to create 2 copies of a file to avoid ending up in an irreversible
read-only state.

To me that sounds like you are safe against a single failing drive if you
had 5 to start with. Is that correct?
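
To make the arithmetic behind my question concrete, here is a minimal
sketch. It is only my reading of the wiki, not a statement of what the
kernel actually does: it assumes raid10 needs at least 4 devices to
allocate a new chunk on the 4.x kernels discussed here. The names
MIN_DEVICES and can_allocate_after_failures are made up for illustration.

```python
# Assumed minimum number of devices needed to allocate a new chunk in each
# profile (my reading of the btrfs wiki for 4.x kernels, not verified here).
MIN_DEVICES = {"single": 1, "dup": 1, "raid1": 2, "raid10": 4}


def can_allocate_after_failures(profile: str, devices: int, failed: int = 1) -> bool:
    """Return True if new chunks of `profile` could still be allocated
    after `failed` of the original `devices` have dropped out."""
    remaining = devices - failed
    return remaining >= MIN_DEVICES[profile]


if __name__ == "__main__":
    for n in (4, 5):
        ok = can_allocate_after_failures("raid10", n)
        print(f"raid10 with {n} devices, 1 failed: can allocate = {ok}")
```

Under that assumption a 4-device raid10 that loses one device can no
longer write new raid10 chunks, while a 5-device array still can.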

'Mostly ok' is not a very useful message. We need to know what to do to be
safe, and to have defined procedures that keep us from screwing up
'unrecoverably' when something bad happens.

Any advice?

> But you did well to bring up the balance.  Have you tried mounting with
> the "skip_balance" mount option?
> 
> Sometimes a balance will run into a previously undetected problem with
> the filesystem and crash.  While mounting would otherwise still work, as
> soon as the filesystem goes active at the kernel level and before the
> mount call returns to userspace, the kernel will see the in-progress
> balance and attempt to continue it.  But if it crashed while processing
> a particular block group (aka chunk), of course that's the first one in
> line to continue the balance with, which will naturally crash again as
> it comes to the same inconsistency that triggered the crash the first
> time.
> 
> So the skip_balance mount option was invented to create a work-around
> and allow you to mount the filesystem again. =:^)
> 
> The fact that it sits there for a while trying to do IO on all devices
> before it crashes is another clue it's probably the resumed balance
> crashing things as it comes to the same inconsistency that triggered the
> original crash during balance, so it's very likely that skip_balance
> will help. =:^)
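
For anyone following along, a rough sketch of what that first step could
look like. It only builds the command lines and prints them, it does not
run anything. The device /dev/sdX and the mount point /mnt/array are
placeholders, and the helper names are hypothetical. The skip_balance
mount option and "btrfs balance status" are the only real interfaces
used.

```python
import shlex


def skip_balance_mount_cmd(device: str, mountpoint: str) -> list[str]:
    """Build (but do not run) a mount command that leaves an interrupted
    balance paused instead of resuming it at mount time."""
    return ["mount", "-o", "skip_balance", device, mountpoint]


def balance_status_cmd(mountpoint: str) -> list[str]:
    """Build the command that reports whether a balance is paused or running."""
    return ["btrfs", "balance", "status", mountpoint]


if __name__ == "__main__":
    # Placeholder device and mount point; substitute your own.
    for cmd in (skip_balance_mount_cmd("/dev/sdX", "/mnt/array"),
                balance_status_cmd("/mnt/array")):
        print(shlex.join(cmd))
```

Checking the balance status after the mount is a cheap way to confirm
that the balance is still pending and was not resumed.
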
> 
> Assuming that lets you mount, the next thing I'd try is a btrfs scrub.
> Chances are it'll find some checksum problems, but given that you're
> running raid10, there's a second copy it can try to use to correct the
> bad one and there's a reasonably good chance scrub will find and fix
> your problems.  Even if it can't fix them all, it should get you closer,
> with less chance of making things worse instead of better than riskier
> options such as btrfs check with --repair.
> 
> If a scrub completes with no uncorrected errors, I'd do an umount/mount
> cycle or reboot just to be sure -- don't forget the skip_balance option
> again tho.  Then make sure you're not doing anything that a crash would
> interrupt, and take the opportunity to update your backups if you need
> to (assuming you consider the data worth more than the
> time/trouble/resources required for a backup).  After that, try a
> balance resume.
> 
> Once the balance resume gets reasonably past the time it otherwise took
> to crash, you can reasonably assume you've safely corrected at least
> /that/ inconsistency, and hope the scrub took care of any others before
> you got to them.
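
The order of operations described so far could be written down as the
sketch below. The device and mount point are placeholders, and
recovery_plan and run_plan are hypothetical helper names. By default
nothing is executed, the steps are only printed. The backup update
between the remount and the balance resume is a manual step and is not
modelled.

```python
import shlex
import subprocess


def recovery_plan(device: str, mountpoint: str) -> list[list[str]]:
    """Return the steps in order: mount without resuming the balance,
    scrub, remount, then resume the balance."""
    mount = ["mount", "-o", "skip_balance", device, mountpoint]
    return [
        mount,
        # -B keeps scrub in the foreground, -d prints per-device statistics
        ["btrfs", "scrub", "start", "-B", "-d", mountpoint],
        ["umount", mountpoint],
        mount,
        ["btrfs", "balance", "resume", mountpoint],
    ]


def run_plan(plan: list[list[str]], dry_run: bool = True) -> list[str]:
    """Print each step and return the printed lines. Commands are only
    executed when dry_run is False; check=True aborts the sequence as
    soon as one step exits non-zero."""
    shown = []
    for argv in plan:
        line = shlex.join(argv)
        shown.append(line)
        print(line)
        if not dry_run:
            subprocess.run(argv, check=True)
    return shown


if __name__ == "__main__":
    run_plan(recovery_plan("/dev/sdX", "/mnt/array"))
```

Stopping at the first failing step is deliberate: there is little point
in resuming the balance if the scrub did not finish cleanly.
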
> 
> But of course all scrub does is verify checksums and where there's a
> second copy (as there is with dup, raid1 and raid10 modes) attempt a
> repair of the bad copy with the second one, of course verifying it as
> well in the process.  If the second copy of that block is bad too or in
> cases where there isn't such a second copy, it'll detect but not be able
> to fix the block with a bad checksum, and if the block has a valid
> checksum but is logically invalid for other reasons, scrub won't detect
> it, because /all/ it does is verify checksums, not actual filesystem
> consistency.  That's what the somewhat more risky (if --repair or other
> fix option is used, not in read-only mode, which detects but doesn't
> attempt to fix) btrfs check is for.
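
As I understand that description, scrub behaves roughly like the toy
model below. This is not btrfs code: zlib.crc32 merely stands in for the
real checksum, and scrub_block is an invented name. It only illustrates
the three outcomes described above.

```python
import zlib


def checksum(data: bytes) -> int:
    # Stand-in checksum for illustration only, not what btrfs uses on disk.
    return zlib.crc32(data)


def scrub_block(copies: list[bytes], expected: int) -> str:
    """Verify every copy of a block against its stored checksum and
    overwrite bad copies (in place) with a verified good one.
    Returns 'ok', 'repaired' or 'uncorrectable'."""
    good = [c for c in copies if checksum(c) == expected]
    if not good:
        # No copy matches: the error is detected but cannot be fixed.
        return "uncorrectable"
    if len(good) == len(copies):
        return "ok"
    for i, c in enumerate(copies):
        if checksum(c) != expected:
            copies[i] = good[0]
    return "repaired"


if __name__ == "__main__":
    data = b"metadata block"
    stored = checksum(data)
    mirrored = [data, b"metadata blocK"]   # second copy is corrupted
    print(scrub_block(mirrored, stored))   # bad copy rewritten from the good one
    print(scrub_block([b"garbage"], stored))  # single copy, nothing to repair from
```

Note that a block whose contents are logically wrong but whose checksum
matches comes back as 'ok' here as well, which is exactly the gap that
btrfs check is meant to cover.
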
> 
> So if skip_balance doesn't work, or it does but scrub can't fix all the
> errors it finds, or scrub fixes everything it detects but a balance
> resume still crashes, then it's time to try riskier fixes.  I'll let
> others guide you there if needed, but will leave you with one
> reminder...
> 
> Sysadmin's first rule of backups:
> 
> Don't test fate and challenge reality!  Have your backups or regardless
> of claims to the contrary you're defining your data as throw-away value,
> and eventually, fate and reality are going to call you on it!
> 
> So don't worry too much even if you lose the filesystem.  Either you
> have backups and can restore from them should it be necessary, or you
> defined the data as not worth the trouble of those backups, and losing
> it isn't a big deal, because in either case you saved what was truly
> important to you, either the data because it was important enough to you
> to have backups, or the time/resources/trouble you would have spent
> doing those backups, which you still saved regardless of whether you can
> save the data or not. =:^)
> 
> --
> Duncan - List replies preferred.   No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master."  Richard Stallman

Thread overview: 7+ messages
2017-07-08  4:26 raid10 array lost with single disk failure? Adam Bahe
2017-07-08  4:40 ` Adam Bahe
2017-07-08 19:37   ` Duncan
2017-07-08 20:51 ` Duncan
2017-07-09 23:13   ` Ferry Toth [this message]
2017-07-10  2:13     ` Adam Bahe
2017-07-10 12:49       ` Austin S. Hemmelgarn