From: Chris Mason <clm@fb.com>
To: "1i5t5.duncan@cox.net" <1i5t5.duncan@cox.net>
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: Scrubbing with BTRFS Raid 5
Date: Wed, 22 Jan 2014 20:45:38 +0000
Message-ID: <1390423628.1198.49.camel@ret.masoncoding.com>
In-Reply-To: <pan$226f4$e1329373$53c2dcca$438e95e@cox.net>

On Tue, 2014-01-21 at 17:08 +0000, Duncan wrote:
> Graham Fleming posted on Tue, 21 Jan 2014 01:06:37 -0800 as excerpted:
> 
> > Thanks for all the info guys.
> > 
> > I ran some tests on the latest 3.12.8 kernel. I set up 3 1GB files and
> > attached them to /dev/loop{1..3} and created a BTRFS RAID 5 volume with
> > them.
> > 
> > I copied some data (from /dev/urandom) into two test files and got their
> > MD5 sums and saved them to a text file.
> > 
> > I then unmounted the volume, trashed Disk3 and created a new Disk4 file,
> > attached to /dev/loop4.
> > 
> > I mounted the BTRFS RAID 5 volume degraded and the md5 sums were fine. I
> > added /dev/loop4 to the volume and then deleted the missing device and
> > it rebalanced. I had data spread out on all three devices now. MD5 sums
> > unchanged on test files.
> > 
> > This, to me, implies BTRFS RAID 5 is working quite well and I can, in
> > fact, replace a dead drive.
> > 
> > Am I missing something?
> 
> What you're missing is that device death and replacement rarely happen
> as neatly as your test (clean unmounts and all, no middle-of-process 
> power-loss, etc).  You tested best-case, not real-life or worst-case.
> 
> Try that again, setting up the raid5, setting up a big write to it, 
> disconnect one device in the middle of that write (I'm not sure if just 
> dropping the loop works or if the kernel gracefully shuts down the loop 
> device), then unplugging the system without unmounting... and /then/ see 
> what sense btrfs can make of the resulting mess.  In theory, with an 
> atomic write btree filesystem such as btrfs, even that should work fine, 
> minus perhaps the last few seconds of file-write activity, but the 
> filesystem should remain consistent on degraded remount and device add, 
> device remove, and rebalance, even if another power-pull happens in the 
> middle of /that/.
> 
> But given btrfs' raid5 incompleteness, I don't expect that will work.
> 

raid5/6 deals with IO errors from one or two drives, and it is able to
reconstruct the missing blocks from the data and parity on the remaining
drives and give you good data.

If we hit a crc error, the raid5/6 code will try a parity reconstruction
to make good data, and if the reconstructed data passes the crc check,
it'll return that up to userland.

In other words, for those cases it works just like raid1/10.  What it
won't do (yet) is write that good data back to the storage.  It'll stay
bad until you remove the device or run balance to rewrite everything.
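
As a rough illustration of the read path described above, here is a toy
Python sketch. It is not the kernel code: ToyStripe and xor_blocks are
made-up names, zlib.crc32 stands in for the crc32c that btrfs uses, and
it models single XOR parity only (raid5, not raid6). The point is just
that a read can return good data while the bad block stays on disk.

```python
import zlib
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class ToyStripe:
    """Hypothetical model of one raid5-style stripe: data blocks + XOR parity."""

    def __init__(self, data_blocks):
        self.blocks = [bytes(b) for b in data_blocks]
        self.parity = xor_blocks(self.blocks)
        # Checksums recorded at write time (assumption: plain crc32 here).
        self.csums = [zlib.crc32(b) for b in self.blocks]

    def read(self, index):
        block = self.blocks[index]
        if zlib.crc32(block) == self.csums[index]:
            return block  # stored block is fine
        # crc mismatch: rebuild this block from the other blocks plus parity.
        others = [b for i, b in enumerate(self.blocks) if i != index]
        rebuilt = xor_blocks(others + [self.parity])
        if zlib.crc32(rebuilt) != self.csums[index]:
            raise IOError("reconstruction did not produce good data")
        # Good data goes back to the caller, but nothing is written back:
        # self.blocks[index] is still the corrupted block.
        return rebuilt


if __name__ == "__main__":
    stripe = ToyStripe([b"AAAA", b"BBBB", b"CCCC"])
    stripe.blocks[1] = b"BxBB"          # simulate on-disk corruption
    print(stripe.read(1))               # reconstructed from parity
    print(stripe.blocks[1])             # stored copy is still bad
```

In this model, only rewriting the stripe (which is what balance
effectively does) replaces the bad stored block.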

Balance will reconstruct from parity to get good data as it balances.
This isn't as useful as scrub, but scrub support for raid5/6 is coming.

-chris



