Re: How does btrfs handle bad blocks in raid1?

From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: How does btrfs handle bad blocks in raid1?
Date: Thu, 9 Jan 2014 12:41:56 +0000 (UTC)
Message-ID: <pan$ab233$2067aff9$a13049e6$e769dc55@cox.net>
In-Reply-To: 20140109104247.GH15634@carfax.org.uk

Hugo Mills posted on Thu, 09 Jan 2014 10:42:47 +0000 as excerpted:

> On Thu, Jan 09, 2014 at 11:26:26AM +0100, Clemens Eisserer wrote:
>> Hi,
>> 
>> I am running write-intensive (well sort of, one write every 10s)
>> workloads on cheap flash media which proved to be horribly unreliable.
>> A 32GB microSDHC card reported bad blocks after 4 days, while a usb pen
>> drive returns bogus data without any warning at all.
>> 
>> So I wonder, how would btrfs behave in raid1 on two such devices? Would
>> it simply mark bad blocks as "bad" and continue to be operational, or
>> will it bail out when some block can not be read/written anymore on one
>> of the two devices?
> 
> If a block is read and fails its checksum, then the other copy (in
> RAID-1) is checked and used if it's good. The bad copy is rewritten to
> use the good data.

This is why I'm so looking forward to the planned N-way-mirroring, aka 
true-raid-1, feature, as opposed to btrfs' current 2-way-only mirroring.  
(I'm semi-impatient about it, but not being a coder, I have little 
choice, and I do see advances happening.)  Having checksumming is good, 
and a second copy in case one fails the checksum is nice, but what if 
they BOTH do?  I'd love to have the choice of (at least) 
three-way-mirroring, as for me that seems the best practical hassle/cost 
vs. risk balance I could get, but it's not yet possible. =:^(
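
To make the both-copies-bad case concrete, here is a toy model of the
read path Hugo describes. It is only a sketch, not btrfs code: the
function names are made up for illustration, the "devices" are entries
in a Python list, and zlib's plain CRC-32 stands in for whatever
checksum the filesystem actually stores.

```python
import zlib


def checksum(data: bytes) -> int:
    # Stand-in checksum for this sketch (plain CRC-32 from zlib).
    return zlib.crc32(data)


def read_mirrored(copies: list, expected: int) -> bytes:
    """Return the first copy whose checksum matches `expected`.

    `copies` models one block as stored on each mirror. Any copy that
    fails its checksum is overwritten in place with the good data.
    If no copy verifies, there is nothing left to repair from.
    """
    good = None
    for data in copies:
        if checksum(data) == expected:
            good = data
            break

    if good is None:
        raise OSError("every copy failed its checksum")

    # Repair step: rewrite each bad copy from the good one.
    for i, data in enumerate(copies):
        if checksum(data) != expected:
            copies[i] = good

    return good
```

With two copies, one bad copy is survivable and gets repaired, while
two bad copies raise the error. With a third entry in the list, the
same function still returns good data after two copies have gone bad,
which is the whole appeal of three-way-mirroring.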

For (at least) a year now, the roadmap has had N-way-mirroring on the 
list for after raid5/6, as the developers want to build on raid5/6's 
features.  But (like much of the btrfs work) raid5/6 took about three 
kernels longer to introduce than originally thought.  Even when 
introduced, the raid5/6 feature lacked some critical parts (like scrub) 
and wasn't considered real-world usable, as integrity over a crash 
and/or device failure, the primary feature of raid5/6, couldn't be 
assured.  That itself was about three kernels ago now, and the raid5/6 
functionality remains partial -- it writes the data and parities as it 
should, but scrub and recovery remain only partially coded.  So it 
looks like that'll /still/ be a few more kernels before that's fully 
implemented and most bugs worked out, with very likely a similar story 
to play out for N-way-mirroring after that, thus placing it late this 
year for introduction and early next for actually usable stability.

But it remains on the roadmap and btrfs should have it... eventually.  
Meanwhile, I keep telling myself that this is filesystem code which a LOT 
of folks including me stake the survival of their data on, and I along 
with all the others would definitely rather have it done CORRECTLY, even 
if it takes TEN years longer than intended, than have it sloppily and 
unreliably implemented sooner.

But it's still hard to wait, when sometimes I begin to think of it like 
that carrot suspended in front of the donkey, never to actually be 
reached.  Except... I *DO* see changes.  After my original btrfs 
investigation I found it unusable in its then-current state and took 
off for a few months.  Upon coming back about 5 months later, usability 
and stability on current features had improved to the point that I'm 
actually using it now, which attests to that progress *NOT* being a 
simple illusion.  So it'll come, even if it /does/ sometimes seem it's 
Duke-Nukem-Forever.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
