From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: btrfs [raid56] stability
Date: Fri, 27 May 2016 01:28:42 +0000 (UTC)
Message-ID: <pan$dff8$271eaaf1$15ab6462$128a4291@cox.net>
In-Reply-To: CAKg6aQzTx7NLjC6gZxKUKkuACo313N3ZqxQ0SXO4WXEwGPFNkg@mail.gmail.com
Diego Torres posted on Fri, 27 May 2016 00:42:07 +0200 as excerpted:
> I've been using btrfs with a raid5 configuration with 3 disks for 6
> months, and then with 4 disks for a couple of months more. I run a
> weekly scrub, and a monthly balance. Btrfs is the only fs that can add
> drives one by one to an existing raid setup, and use the new space
> immediately, without replacing all the drives. For me, this is one of
> the strongest points.
>
> And, as far as I understand, if I keep an eye on the free space
> available, and no drives fail, the filesystem would last indefinitely.
> However, the code to replace a failed/missing drive is not yet final,
> as I have discovered reading some wikis and this mailing list. Maybe I'm
> wrong.
>
> I haven't been able to find a timeline/roadmap about when the replace
> command will be stable/ready for use.
>
> Is this someone's priority? Is it planned for the next one, two or three
> years coming?
[I took the liberty of updating the title, since you're not really asking
about btrfs stability in general, but about btrfs raid56 mode
stability...]
You ask a very good question... with a rather complicated answer, at
least if I try to answer what I consider the real, unstated question.
The shortest accurate answer is that raid56 (that is, parity-raid) mode
is (still) not recommended, due to bugs that AFAIK have not yet been
fully traced. It nominally works, but in some (not all) reported cases
device replacement turns out not to be practical: it takes waaayyy too
long, think weeks, easily enough time for another device to die before
replacement of the first is complete, thereby possibly killing the
array. Provided the bugs (two of them) can be properly traced, a fix
should be available relatively soon, and raid56 mode would then be
rather cautiously considered usable, tho still newer and less mature
than redundancy-raid mode (raid1, raid10). I'd say 1-3 kernel cycles...
unless something else comes up or the bugs prove extremely difficult to
trace and fix.
A longer, more complicated answer will note that the raid56 code
(including replace, I believe) was considered nominally complete with
3.19, altho a couple of critical bugs were found and fixed in the
early going, so the LTS stable 4.1 series is considered the absolute
minimum for raid56 mode, and 4.4 LTS or current is strongly recommended.
This is very likely what most of the resources you read were referring
to: the period between the original introduction of the runtime code in
(AFAIK) 3.9 and either nominal completion in 3.19 or the fix of the
initial critical bugs in 4.1. Those resources likely simply haven't been
updated since, altho given the current state, perhaps it's better that
they haven't been, as if they were, more people would be trying it and
running into these other bugs.
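To make the version guidance above concrete, here is a minimal Python
sketch. The function names and the labels returned are hypothetical,
made up for illustration; they are not part of btrfs-progs or the
kernel. Only the thresholds (4.1 as the absolute minimum, 4.4 as
recommended) come from the paragraph above, and passing this check says
nothing about the replace/restripe bugs discussed in the rest of this
message.

```python
import re

MINIMUM = (4, 1)      # absolute minimum for raid56 mode, per the text above
RECOMMENDED = (4, 4)  # 4.4 LTS or current is strongly recommended


def parse_kernel_version(release):
    """Extract (major, minor) from a string such as '4.4.0-22-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def raid56_kernel_status(release):
    """Classify a kernel release against the two thresholds (hypothetical helper)."""
    version = parse_kernel_version(release)
    if version < MINIMUM:
        return "too old"
    if version < RECOMMENDED:
        return "minimum only"
    return "recommended or newer"
```

On a running system the release string could be taken from
platform.release() in the Python standard library, e.g.
raid56_kernel_status(platform.release()).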
By late 4.3 and early 4.4, I was actually beginning to (extremely
cautiously still) consider raid56 mode usable... but then the reports of
these two further bugs, likely related, started coming in.
As mentioned above, the problem from the user perspective is that device
replacement or restriping to a different width (as you'd do using a
balance-convert if you started with N devices and then decided to expand
the array) /can/ /sometimes/ take effectively /forever/, *far* longer
than would be expected, and definitely long enough that there's a
reasonable risk of further device death, killing the entire raid. So the
raid56 parity guarantees cannot be relied on in terms of device
replacement, which pretty well breaks the whole reason people would
choose raid56 mode, as opposed to something else, in the first place.
That's why it's not currently recommended.
The problem from the developer perspective is different. It's that
replace and/or restripe works perfectly fine for some people, while
others are affected by this pair of bugs, and AFAIK, it hasn't yet been
possible to find the exact circumstances which trigger the bug(s), making
it about impossible to reliably reproduce in any predictable manner, thus
making it extremely difficult to reliably trace and fix.
Still, given that it's a known bug (or two) affecting enough people that
it can't be a one-off, chances are pretty good that they'll have it
traced and fixed within three kernel cycles. I'd say one kernel cycle,
as was the case for a couple of similarly widely seen bugs in other
areas, but for the fact that pretty much /everything/ related to raid56
mode has seemed to take at least twice as long as people expected. So
I'm allowing three kernel cycles from now (4.6), which is four from 4.5,
the cycle we were in when we had enough reports of the problem to
realize it was /not/ a one-off.
So I expect a fix by 4.9, but would recommend giving it another couple
of cycles after that fix, just to see if any other raid56-related bugs
turn up, before actually considering it reasonably usable: until 4.8 if
the fix actually gets into the 4.6 release, or 4.11 if it's actually 4.9
before the fix is integrated. And definitely ask again then (if you
haven't been following the list and further raid56 development in the
meantime) before you start relying on it, just in case it either hasn't
been fixed, or some other serious bug has been found.
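
The cycle arithmetic behind those estimates can be spelled out in a few
lines of Python. This is only a sketch of the counting used above: the
helper names are made up for illustration, and it assumes (as the
estimates do) that the minor version simply goes up by one per kernel
cycle within the 4.x series.

```python
def cycles_later(version, cycles):
    """Return the release `cycles` kernel cycles after `version`.

    Assumes the minor number increments once per cycle and ignores any
    major-version bump (an assumption, matching the estimates above).
    """
    major, minor = version
    return (major, minor + cycles)


def fmt(version):
    """Format a (major, minor) tuple as 'major.minor'."""
    return "%d.%d" % version


SETTLE_CYCLES = 2  # "another couple of cycles" after the fix lands

# Pessimistic case: three cycles from 4.6, which is also four from 4.5.
fix_late = cycles_later((4, 6), 3)
assert fix_late == cycles_later((4, 5), 4)

# Optimistic case: the fix gets into 4.6 itself.
fix_early = (4, 6)

for fix in (fix_early, fix_late):
    print("fix in", fmt(fix), "-> reconsider at",
          fmt(cycles_later(fix, SETTLE_CYCLES)))
```

Both cases land on the versions named above: 4.8 for a fix in 4.6, and
4.11 for a fix in 4.9.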
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman