linux-btrfs.vger.kernel.org archive mirror
* raid5
@ 2016-03-01 14:24 John Smith
  2016-03-01 21:44 ` raid5 Duncan
  0 siblings, 1 reply; 4+ messages in thread
From: John Smith @ 2016-03-01 14:24 UTC (permalink / raw)
  To: linux-btrfs

Hi,
What is the status of btrfs raid5 in kernel 4.4? Thank you.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: raid5
  2016-03-01 14:24 raid5 John Smith
@ 2016-03-01 21:44 ` Duncan
  2016-03-02 13:43   ` raid5 Austin S. Hemmelgarn
  0 siblings, 1 reply; 4+ messages in thread
From: Duncan @ 2016-03-01 21:44 UTC (permalink / raw)
  To: linux-btrfs

John Smith posted on Tue, 01 Mar 2016 15:24:04 +0100 as excerpted:

> What is the status of btrfs raid5 in kernel 4.4? Thank you.

That is a very good question. =:^)

The answer, as best I can give it: btrfs raid56 mode has no known 
outstanding bugs specific to it at this time (unless a dev knows of one, 
but I've not seen any confirmed on-list), and hasn't had any major ones 
since early in the 4.1 cycle, so 4.2 thru 4.4 should be clean of /known/ 
raid56 bugs.

However, there was (at least) /one/ report of a problem on raid56 that, 
to my knowledge, hasn't been traced, so it's unknown whether it would 
also have happened on, say, raid1 or raid10, as opposed to the raid5/6 
it did happen on.  Of course, with raid56 mode still being relatively 
new, that's one suspect, but we simply don't know at this point, and 
I've seen nothing further on that thread in the last ten days or so, so 
I'm not sure we ever will.

So the best recommendation I can give is that raid56 mode is definitely 
coming along, but depending on how cautious you are, it may or may not 
yet be considered quite as stable as the rest of btrfs in general.

If your use-case calls for raid5 or raid6, and provided you have backups 
in place if you value the data (a caveat that still applies to btrfs in 
general, even more than to fully stable and mature filesystems, since 
btrfs overall is "stabilizing, but not yet fully stable and mature", and 
raid56 mode incrementally more so), then raid56 mode is in what I'd call 
the barely-acceptable, tolerably-stable range.  But I'd still be much 
more comfortable with, and recommend, waiting another couple of kernel 
cycles just to be sure, if you don't have an /immediate/ need, or 
alternatively, using say raid10 mode.
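
For anyone weighing that alternative: the profile is chosen at mkfs time 
and can be converted later with a balance filter, so starting on raid10 
doesn't lock you in.  A minimal sketch (the device names and mountpoint 
here are hypothetical):

```shell
# Create a four-device filesystem with raid10 for both data and
# metadata -- the more conservative choice while raid56 stabilizes.
mkfs.btrfs -d raid10 -m raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde
mount /dev/sdb /mnt

# Later, if/when raid56 is judged stable enough, convert in place;
# keeping metadata on raid1 is a common conservative choice.
btrfs balance start -dconvert=raid5 -mconvert=raid1 /mnt

# Verify which profiles are actually in use:
btrfs filesystem df /mnt
```

The conversion balance rewrites every chunk, so expect it to take a 
while on a filesystem of any size.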

Again, that's assuming backups, and that you're prepared to use them if 
necessary, if you care about the data.  But again, that still applies to 
btrfs in general, and indeed, to a slightly lesser extent, to /any/ data 
on /any/ filesystem.  Because as the sysadmin's rule of backups states 
(in simple form), you either have at least one level of backup, or by 
your (in)action you are defining that data as worth less than the time 
and resources necessary to make that backup.  Which means that if you 
lose the data and don't have a backup to restore it from, you can still 
be happy: obviously you considered the time and resources the backup 
would have taken as worth more than the data, so even in losing the 
data, you saved what was, by your own choice, more important to you.

So take heed, and don't decide only AFTER you've lost it that the data 
was actually worth more than the time and resources you DIDN'T spend 
backing it up.  And while that definitely applies a bit more to btrfs in 
its current "stabilizing but not yet fully stable and mature" state than 
it does to fully stable and mature filesystems, it applies well enough 
to all of them that discovering the data was worth more than you thought 
is /always/ an experience you'd rather avoid, /regardless/ of the 
filesystem and hardware the data is on. =:^)
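
And for btrfs in particular, the filesystem's own snapshot and 
send/receive support makes keeping that backup fairly cheap.  A minimal 
sketch, assuming a hypothetical source filesystem at /mnt/data and a 
separate backup filesystem at /mnt/backup:

```shell
# Take a read-only snapshot (btrfs send requires read-only):
btrfs subvolume snapshot -r /mnt/data /mnt/data/snap-1

# Full copy to a *separate* filesystem -- a snapshot on the same
# device is not a backup:
btrfs send /mnt/data/snap-1 | btrfs receive /mnt/backup

# Later runs can send only the changes relative to a snapshot
# that exists on both sides:
btrfs subvolume snapshot -r /mnt/data /mnt/data/snap-2
btrfs send -p /mnt/data/snap-1 /mnt/data/snap-2 | btrfs receive /mnt/backup
```

The incremental form (-p) keeps the ongoing cost of the backup down to 
roughly the size of the changes since the last run.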

And with it either backed up or of only trivial value regardless, you 
don't have anything to lose, and even if raid56 mode /doesn't/ prove so 
stable for you after all, you can still rest easy knowing you aren't 
going to lose anything of value. =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: raid5
  2016-03-01 21:44 ` raid5 Duncan
@ 2016-03-02 13:43   ` Austin S. Hemmelgarn
  2016-03-03  4:16     ` raid5 Duncan
  0 siblings, 1 reply; 4+ messages in thread
From: Austin S. Hemmelgarn @ 2016-03-02 13:43 UTC (permalink / raw)
  To: linux-btrfs

On 2016-03-01 16:44, Duncan wrote:
> John Smith posted on Tue, 01 Mar 2016 15:24:04 +0100 as excerpted:
>
>> What is the status of btrfs raid5 in kernel 4.4? Thank you.
>
> That is a very good question. =:^)
>
> The answer, as best I can give it: btrfs raid56 mode has no known
> outstanding bugs specific to it at this time (unless a dev knows of one,
> but I've not seen any confirmed on-list), and hasn't had any major ones
> since early in the 4.1 cycle, so 4.2 thru 4.4 should be clean of /known/
> raid56 bugs.
That really depends on what you consider to be a bug...

For example, for most production usage, the insanely long 
rebuild/rebalance times that people are seeing with BTRFS raid56 (on the 
order of multiple days per terabyte of data to be rebuilt, compared to a 
couple of hours for a rebuild on the same hardware using MDRAID or LVM) 
would very much be considered a serious bug, as it significantly 
increases the chances of data loss due to further disk failures. 
Personally, my recommendation would be to not use BTRFS raid56 for 
anything other than testing if you're working with data-sets bigger than 
about 250G until this particular issue gets fixed, which may be a while 
as we can't seem to figure out what exactly is causing the problem.
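
For context, the rebuild in question is typically driven by btrfs 
replace (or a device delete plus balance), and its progress can be 
watched while it runs.  A sketch, with /dev/sdd as a hypothetical 
failing disk and /dev/sde its replacement:

```shell
# Start the rebuild onto the new disk -- on raid56 this is the
# operation reported to take days per terabyte:
btrfs replace start /dev/sdd /dev/sde /mnt

# Poll completion percentage and error counts while it runs:
btrfs replace status /mnt

# If the old disk has already vanished, name it by devid instead
# (devids are listed by 'btrfs filesystem show'):
btrfs replace start 3 /dev/sde /mnt   # 3 = hypothetical devid
```

Timing a replace like this on a test filesystem before committing real 
data is one way to find out whether your setup hits the slow path.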



* Re: raid5
  2016-03-02 13:43   ` raid5 Austin S. Hemmelgarn
@ 2016-03-03  4:16     ` Duncan
  0 siblings, 0 replies; 4+ messages in thread
From: Duncan @ 2016-03-03  4:16 UTC (permalink / raw)
  To: linux-btrfs

Austin S. Hemmelgarn posted on Wed, 02 Mar 2016 08:43:17 -0500 as
excerpted:

> On 2016-03-01 16:44, Duncan wrote:
>> John Smith posted on Tue, 01 Mar 2016 15:24:04 +0100 as excerpted:
>>
>>> What is the status of btrfs raid5 in kernel 4.4? Thank you.
>>
>> That is a very good question. =:^)
>>
>> The answer, as best I can give it: btrfs raid56 mode has no known
>> outstanding bugs specific to it at this time (unless a dev knows of
>> one, but I've not seen any confirmed on-list), and hasn't had any
>> major ones since early in the 4.1 cycle, so 4.2 thru 4.4 should be
>> clean of /known/ raid56 bugs.
> That really depends on what you consider to be a bug...
> 
> For example, for most production usage, the insanely long
> rebuild/rebalance times that people are seeing with BTRFS raid56 (on the
> order of multiple days per terabyte of data to be rebuilt, compared to a
> couple of hours for a rebuild on the same hardware using MDRAID or LVM)

Very good point.  I wasn't considering that a bug, as it's not a direct 
data-loss danger (only the indirect danger of another device dying 
during the extremely long rebuild), but you're correct: in practice it's 
potentially a blocker-level bug.

But from what I've seen, it isn't affecting everyone, which is of course 
part of the problem from a developer POV, since that makes it harder to 
replicate and trace down.  And it's equally a problem from a user POV: 
until we actually pin down what's triggering it, there's no way of 
knowing whether or when it /might/ trigger, so even if your testing 
demonstrates it isn't affecting you at the moment, you have to assume it 
will affect you if you end up trying to do data recovery.

So agreed, though the effect is much the same as my preferred 
recommendation in any case: effectively, hold off another couple of 
kernel cycles and ask again.  I simply wasn't thinking of this specific 
bug at the time, and thus couldn't mention it as a concrete reason.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


