* RAID6 stable enough for production?
@ 2015-10-14 20:19 Sjoerd
2015-10-14 20:23 ` Donald Pearson
2015-10-15 1:55 ` Duncan
0 siblings, 2 replies; 11+ messages in thread
From: Sjoerd @ 2015-10-14 20:19 UTC (permalink / raw)
To: linux-btrfs
Hi all,
Is RAID6 still considered too unstable to use in production?
The most recent test report I could find is more than a year old
(http://marc.merlins.org/perso/btrfs/post_2014-03-23_Btrfs-Raid5-Status.html)
I want to build a new NAS (6 disks of 4TB) using RAID6 and would prefer
btrfs over zfs, but the latter is proven stable and I am unsure about btrfs...
My main requirements are being able to replace 1 or 2 failing (or about to
fail) drives and to extend the array with more disks in the future. The data
on it shouldn't get corrupted unless the building it's in is destroyed ;)
So should I go for btrfs?
NB: I am happily running a btrfs RAID5 with 4x2TB disks, but you only really
learn the value of a filesystem when something goes wrong. Yes, I know RAID5/6
is not a backup ;)
Cheers,
Sjoerd
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RAID6 stable enough for production?
2015-10-14 20:19 RAID6 stable enough for production? Sjoerd
@ 2015-10-14 20:23 ` Donald Pearson
2015-10-14 20:34 ` Lionel Bouton
2015-10-15 1:55 ` Duncan
1 sibling, 1 reply; 11+ messages in thread
From: Donald Pearson @ 2015-10-14 20:23 UTC (permalink / raw)
To: Sjoerd; +Cc: Btrfs BTRFS
I would not use Raid56 in production. I've tried using it a few
different ways but have run into trouble with stability and
performance. Raid10 has been working excellently for me.
On Wed, Oct 14, 2015 at 3:19 PM, Sjoerd <sjoerd@sjomar.eu> wrote:
> Hi all,
>
> Is RAID6 still considered too unstable to use in production?
> The most recent test report I could find is more than a year old
> (http://marc.merlins.org/perso/btrfs/post_2014-03-23_Btrfs-Raid5-Status.html)
>
> I want to build a new NAS (6 disks of 4TB) using RAID6 and would prefer
> btrfs over zfs, but the latter is proven stable and I am unsure about btrfs...
> My main requirements are being able to replace 1 or 2 failing (or about to
> fail) drives and to extend the array with more disks in the future. The data
> on it shouldn't get corrupted unless the building it's in is destroyed ;)
>
> So should I go for btrfs?
>
> NB: I am happily running a btrfs RAID5 with 4x2TB disks, but you only really
> learn the value of a filesystem when something goes wrong. Yes, I know RAID5/6
> is not a backup ;)
>
> Cheers,
> Sjoerd
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RAID6 stable enough for production?
2015-10-14 20:23 ` Donald Pearson
@ 2015-10-14 20:34 ` Lionel Bouton
2015-10-14 20:53 ` Donald Pearson
0 siblings, 1 reply; 11+ messages in thread
From: Lionel Bouton @ 2015-10-14 20:34 UTC (permalink / raw)
To: Donald Pearson, Sjoerd; +Cc: Btrfs BTRFS
On 14/10/2015 22:23, Donald Pearson wrote:
> I would not use Raid56 in production. I've tried using it a few
> different ways but have run into trouble with stability and
> performance. Raid10 has been working excellently for me.
Hi, could you elaborate on the stability and performance problems you
had? Which kernels were you using at the time you were testing?
I'm interested because I have some 7-disk RAID10 installations that
don't need much write performance (large backup servers with few clients
and few updates but very large datasets) and that I plan to migrate to
RAID6 when they approach their storage capacity (at least theoretically,
with 7 disks this will give better read performance and better protection
against disk failures). 3.19 brought full RAID5/6 support; from what I
remember there were some initial quirks, but I'm unaware of any big
RAID5/6 problem in 4.1+ kernels.
Best regards,
Lionel
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RAID6 stable enough for production?
2015-10-14 20:34 ` Lionel Bouton
@ 2015-10-14 20:53 ` Donald Pearson
2015-10-14 21:15 ` Rich Freeman
2015-10-14 21:16 ` Lionel Bouton
0 siblings, 2 replies; 11+ messages in thread
From: Donald Pearson @ 2015-10-14 20:53 UTC (permalink / raw)
To: Lionel Bouton; +Cc: Sjoerd, Btrfs BTRFS
I've used it from 3.8-something to current; it does not handle drive
failure well at all, which is the whole point of parity RAID. I had a
10-disk RAID6 array on 4.1.1 and a drive failure put the filesystem in an
irrecoverable state. Scrub speeds are also an order of magnitude or
more slower in my own experience. The issue isn't filesystem
read/write performance, it's maintenance and operation.
That 10-drive system was rebuilt as raid10 and I haven't had problems
since, and it has handled HDD problems reasonably well.
I finally moved away from raid56 yesterday because of the time it took
to scrub. This was a 4x3TB raid6 array that I only used for backups.
I attempted to just rebalance into raid10, but partway through the
balance the filesystem ran into problems, forcing it read-only. I
tried some things to overcome that but ultimately just wiped it out
and recreated it as raid10. I suspect one of the drives may be having
problems, so I'm running tests on it now.
Personally I would still recommend zfs on illumos in production,
because it's nearly unshakeable and the creative things you can do to
deal with problems are pretty remarkable. The unfortunate reality,
though, is that over time your system will probably grow and expand,
and zfs is very much locked into its original configuration. Adding
vdevs is a poor solution, IMO.
On Wed, Oct 14, 2015 at 3:34 PM, Lionel Bouton
<lionel-subscription@bouton.name> wrote:
> On 14/10/2015 22:23, Donald Pearson wrote:
>> I would not use Raid56 in production. I've tried using it a few
>> different ways but have run into trouble with stability and
>> performance. Raid10 has been working excellently for me.
>
> Hi, could you elaborate on the stability and performance problems you
> had? Which kernels were you using at the time you were testing?
>
> I'm interested because I have some 7-disk RAID10 installations that
> don't need much write performance (large backup servers with few clients
> and few updates but very large datasets) and that I plan to migrate to
> RAID6 when they approach their storage capacity (at least theoretically,
> with 7 disks this will give better read performance and better protection
> against disk failures). 3.19 brought full RAID5/6 support; from what I
> remember there were some initial quirks, but I'm unaware of any big
> RAID5/6 problem in 4.1+ kernels.
>
> Best regards,
>
> Lionel
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RAID6 stable enough for production?
2015-10-14 20:53 ` Donald Pearson
@ 2015-10-14 21:15 ` Rich Freeman
2015-10-14 21:19 ` Donald Pearson
2015-10-15 1:47 ` Chris Murphy
2015-10-14 21:16 ` Lionel Bouton
1 sibling, 2 replies; 11+ messages in thread
From: Rich Freeman @ 2015-10-14 21:15 UTC (permalink / raw)
To: Donald Pearson; +Cc: Lionel Bouton, Sjoerd, Btrfs BTRFS
On Wed, Oct 14, 2015 at 4:53 PM, Donald Pearson
<donaldwhpearson@gmail.com> wrote:
>
> Personally I would still recommend zfs on illumos in production,
> because it's nearly unshakeable and the creative things you can do to
> deal with problems are pretty remarkable. The unfortunate reality,
> though, is that over time your system will probably grow and expand,
> and zfs is very much locked into its original configuration. Adding
> vdevs is a poor solution, IMO.
>
This is the main thing that has kept me away from zfs - you can't
modify a vdev, like you can with an md array or btrfs. I don't think
zfs makes use of all your space if you have mixed disk sizes in a
raid-z either - it works like mdadm. I'm not sure whether btrfs will
be any better in that regard (if I have 2x3TB and 3x1TB drives in a
RAID5 I should get 6TB of usable space, not 4TB, without messing with
partitioning).
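To make that arithmetic concrete, here is a rough sketch in Python (not
btrfs or zfs code; the greedy loop is only an approximation of how btrfs
allocates raid5 chunks) comparing the two models for that drive mix:

def fixed_stripe_raid5(sizes_tb):
    # mdadm/raidz style: every member only contributes the smallest size.
    return min(sizes_tb) * (len(sizes_tb) - 1)

def flexible_raid5(sizes_tb):
    # Rough btrfs-style model: keep allocating stripes across every device
    # that still has free space, with one strip per stripe used for parity.
    free = list(sizes_tb)
    usable = 0
    while len([f for f in free if f > 0]) >= 2:
        live = [f for f in free if f > 0]
        step = min(live)                  # widest stripe that still fits
        usable += step * (len(live) - 1)  # one strip per stripe is parity
        free = [f - step if f > 0 else 0 for f in free]
    return usable

drives = [3, 3, 1, 1, 1]                  # 2x3TB + 3x1TB
print(fixed_stripe_raid5(drives))         # 4 (TB)
print(flexible_raid5(drives))             # 6 (TB)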
So, I am running raid1 btrfs in the hope that I'll be able to move to
something more efficient in the future.
However, I would not personally be using raid5/6 for anything but pure
experimentation on btrfs anytime soon. I don't even trust the 4.1
kernel series for btrfs at all just yet, and you're not going to be
running older than that for raid5/6.
--
Rich
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RAID6 stable enough for production?
2015-10-14 20:53 ` Donald Pearson
2015-10-14 21:15 ` Rich Freeman
@ 2015-10-14 21:16 ` Lionel Bouton
1 sibling, 0 replies; 11+ messages in thread
From: Lionel Bouton @ 2015-10-14 21:16 UTC (permalink / raw)
To: Donald Pearson; +Cc: Sjoerd, Btrfs BTRFS
On 14/10/2015 22:53, Donald Pearson wrote:
> I've used it from 3.8-something to current; it does not handle drive
> failure well at all, which is the whole point of parity RAID. I had a
> 10-disk RAID6 array on 4.1.1 and a drive failure put the filesystem in an
> irrecoverable state. Scrub speeds are also an order of magnitude or
> more slower in my own experience. The issue isn't filesystem
> read/write performance, it's maintenance and operation.
Thanks, I'll proceed with caution...
When 3.19 came out I ran various tests with loopback devices in RAID6
(for example, dd'ing /dev/random over the middle of one loopback device
that was guaranteed to hold file data while the filesystem was in use) and
didn't manage to break it, but these were admittedly simple situations
(either a missing device or corrupted data on a device, not something
behaving really erratically like failing hardware).
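For anyone who wants to repeat that kind of experiment, a rough sketch of
the procedure (Python driving the usual CLI tools; the image paths, sizes,
mount point and corruption offset below are made up, and it needs root on
a throwaway machine):

import os
import subprocess

IMAGES = ["/tmp/btrfs-raid6-%d.img" % i for i in range(4)]
MNT = "/mnt/raid6-test"

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Create sparse backing files and attach them as loop devices.
loops = []
for img in IMAGES:
    run(["truncate", "-s", "2G", img])
    out = subprocess.run(["losetup", "--find", "--show", img],
                         check=True, capture_output=True, text=True)
    loops.append(out.stdout.strip())

# 2. Make a RAID6 filesystem (data and metadata) and mount it.
run(["mkfs.btrfs", "-f", "-d", "raid6", "-m", "raid6"] + loops)
run(["btrfs", "device", "scan"])
os.makedirs(MNT, exist_ok=True)
run(["mount", loops[0], MNT])

# 3. Put some file data on it, then overwrite part of one member.
run(["dd", "if=/dev/urandom", "of=%s/payload" % MNT, "bs=1M", "count=512"])
run(["sync"])
# Offset chosen to avoid the superblock copies and (hopefully) land inside
# allocated data; adjust for your layout.
run(["dd", "if=/dev/urandom", "of=" + loops[1], "bs=1M", "count=128",
     "seek=256", "conv=notrunc"])

# 4. Scrub and check whether the corruption is detected and repaired.
run(["btrfs", "scrub", "start", "-B", MNT])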
Lionel
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RAID6 stable enough for production?
2015-10-14 21:15 ` Rich Freeman
@ 2015-10-14 21:19 ` Donald Pearson
2015-10-15 1:47 ` Chris Murphy
1 sibling, 0 replies; 11+ messages in thread
From: Donald Pearson @ 2015-10-14 21:19 UTC (permalink / raw)
To: Rich Freeman; +Cc: Lionel Bouton, Sjoerd, Btrfs BTRFS
btrfs does handle mixed device sizes really well actually. And you're
right, zfs is limited to the smallest drive x vdev width; the rest
goes unused. You can do things like pre-slice the drives with sparse
files and create zfs on those files, but then you'll load the larger
drives with a lot more I/O requests and you may dramatically slow
things down. ZFS can expand a vdev by replacing its drives with larger
ones. The other drawback is that a single vdev gives roughly single-drive
performance regardless of the number of drives in it. I love zfs for
a lot of reasons, and dislike it for a lot too. I ultimately decided
to use btrfs on my personal equipment because it promises to be more
organic, and my commodity hardware definitely likes to play the organic
role. :)
On Wed, Oct 14, 2015 at 4:15 PM, Rich Freeman
<r-btrfs@thefreemanclan.net> wrote:
> On Wed, Oct 14, 2015 at 4:53 PM, Donald Pearson
> <donaldwhpearson@gmail.com> wrote:
>>
>> Personally I would still recommend zfs on illumos in production,
>> because it's nearly unshakeable and the creative things you can do to
>> deal with problems are pretty remarkable. The unfortunate reality,
>> though, is that over time your system will probably grow and expand,
>> and zfs is very much locked into its original configuration. Adding
>> vdevs is a poor solution, IMO.
>>
>
> This is the main thing that has kept me away from zfs - you can't
> modify a vdev, like you can with an md array or btrfs. I don't think
> zfs makes use of all your space if you have mixed disk sizes in a
> raid-z either - it works like mdadm. I'm not sure whether btrfs will
> be any better in that regard (if I have 2x3TB and 3x1TB drives in a
> RAID5 I should get 6TB of usable space, not 4TB, without messing with
> partitioning).
>
> So, I am running raid1 btrfs in the hope that I'll be able to move to
> something more efficient in the future.
>
> However, I would not personally be using raid5/6 for anything but pure
> experimentation on btrfs anytime soon. I don't even trust the 4.1
> kernel series for btrfs at all just yet, and you're not going to be
> running older than that for raid5/6.
>
> --
> Rich
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RAID6 stable enough for production?
2015-10-14 21:15 ` Rich Freeman
2015-10-14 21:19 ` Donald Pearson
@ 2015-10-15 1:47 ` Chris Murphy
2015-10-15 16:40 ` Rich Freeman
1 sibling, 1 reply; 11+ messages in thread
From: Chris Murphy @ 2015-10-15 1:47 UTC (permalink / raw)
To: Rich Freeman, Btrfs BTRFS
On Wed, Oct 14, 2015 at 3:15 PM, Rich Freeman
<r-btrfs@thefreemanclan.net> wrote:
> This is the main thing that has kept me away from zfs - you can't
> modify a vdev, like you can with an md array or btrfs.
A possible workaround is ZoL (ZFS on Linux) used as a GlusterFS brick.
For that matter, now that GlusterFS has checksums and snapshots, if
your workflow permits a glusterfs-only workflow (using SMB or NFS
for Windows or OS X, and the libvirt glusterfs backend for images), you
could build a conventional md/lvm RAID+XFS brick and run glusterfs on
that. And with distributed-replicated volumes you can bail on raid6 and
just go raid5. Heck, if you have enough bricks you could do raid0 and
just let the bricks implode if need be.
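As a very rough sketch of that layout (hypothetical host names, devices
and paths; Python just shelling out to the usual tools), each server
carries an md RAID5 + XFS brick and GlusterFS provides the cross-server
redundancy:

import subprocess

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# On each brick server: conventional md RAID5 + XFS as the local brick.
run(["mdadm", "--create", "/dev/md0", "--level=5", "--raid-devices=4",
     "/dev/sdb", "/dev/sdc", "/dev/sdd", "/dev/sde"])
run(["mkfs.xfs", "/dev/md0"])
run(["mount", "/dev/md0", "/bricks/b0"])

# On one node: a distributed-replicated volume (replica 2, two replica
# pairs across four bricks), so a whole brick or server can be lost.
run(["gluster", "volume", "create", "gv0", "replica", "2",
     "server1:/bricks/b0/vol", "server2:/bricks/b0/vol",
     "server3:/bricks/b0/vol", "server4:/bricks/b0/vol"])
run(["gluster", "volume", "start", "gv0"])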
--
Chris Murphy
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RAID6 stable enough for production?
2015-10-14 20:19 RAID6 stable enough for production? Sjoerd
2015-10-14 20:23 ` Donald Pearson
@ 2015-10-15 1:55 ` Duncan
1 sibling, 0 replies; 11+ messages in thread
From: Duncan @ 2015-10-15 1:55 UTC (permalink / raw)
To: linux-btrfs
Sjoerd posted on Wed, 14 Oct 2015 22:19:50 +0200 as excerpted:
> Is RAID6 still considered too unstable to use in production?
> The most recent test report I could find is more than a year old
> (http://marc.merlins.org/perso/btrfs/post_2014-03-23_Btrfs-Raid5-Status.html)
>
> I want to build a new NAS (6 disks of 4TB) using RAID6 and would prefer
> btrfs over zfs, but the latter is proven stable and I am unsure about
> btrfs...
My general recommendation for new btrfs features is to let them stabilize
for a year -- five kernel series -- after they are nominally complete,
before considering them roughly as stable as btrfs itself. Given that
btrfs itself is definitely stabilizing, but not yet fully stable and
mature (a status it's likely to maintain for probably another year at
least, IMO; I'll skip the supporting arguments here), that would be the
"stable" target for new features as well.
Since btrfs raid56 mode was nominally complete in 3.19, that would place
"stable as btrfs in general" at 4.4, and raid56 mode does indeed seem to
be healthy and maturing, with no new show-stopping bugs since 4.1, such
that I'm reasonably confident in that 4.4 prediction.
Throwing in another variable, that of LTS-stable kernels, which are very
likely to be the ones of interest to folks looking at production usage:
the two most recent LTS series are 3.18, which predates raid56 completion,
and 4.1, which came just after the last known-to-date raid56 show-stopping
bug.
Given that situation and the above 4.4 prediction, the absolute earliest
LTS-series "production" recommendation I could comfortably make would be
the 4.1 series, after it has had time to integrate any 4.4-series
back-patches, so say around kernel 4.5 or possibly 4.6, but deploying
(after site testing, of course) the 4.1 LTS series as stable at that point.
An arguably more conservative position would be to declare the 4.1 LTS
series still too early, since it didn't originate after the one-year
stabilization period, and wait for the next LTS series /after/ 4.4
(possibly including 4.4 itself if it is picked up as such). Once that
series, whatever it may be, is declared LTS, start site-testing it, and
deploy it once you're comfortable with the results of those tests,
presumably at least a couple of kernel cycles later. So if, for example,
4.4 itself is declared LTS, that again lands around 4.6 or so, but
deploying the LTS series, not the just-released kernel.
That, of course, is based on my general btrfs feature stability rules as
developed over some time as a regular on this list, not on any
raid56-specific experience such as Donald Pearson and others in-thread
are reporting, since my own use-case is btrfs raid1 mode (deployed as
multiple small pair-device btrfs raid1 filesystems on partitioned SSDs,
without use of the subvolume/snapshot or btrfs send/receive features).
Though it can be noted that my recommendations and theirs dovetail at
this point: btrfs raid56 isn't ready /yet/ for production usage. I'm
just predicting, based on general btrfs experience, that it should be
"as ready as btrfs itself is" come 4.4 or the next LTS kernel series
thereafter, whereas I don't read them as making such predictions at this
point... which of course they couldn't, since a feature-specific,
experience-based recommendation obviously rests on current experience,
and everyone appears to agree that it simply isn't there at this point.
I'm simply predicting that it well /could/ be, by LTS-after-4.4 time,
even if it isn't now.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RAID6 stable enough for production?
2015-10-15 1:47 ` Chris Murphy
@ 2015-10-15 16:40 ` Rich Freeman
2015-10-15 19:04 ` Chris Murphy
0 siblings, 1 reply; 11+ messages in thread
From: Rich Freeman @ 2015-10-15 16:40 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS
On Wed, Oct 14, 2015 at 9:47 PM, Chris Murphy <lists@colorremedies.com> wrote:
>
> For that matter, now that GlusterFS has checksums and snapshots...
Interesting - I haven't kept up with that. Does it actually do
end-to-end checksums? That is, compute the checksum at the time of
storage, store the checksum in the metadata somehow, and ensure the
checksum matches when data is retrieved?
I forget whether it was glusterfs or ceph I was looking at, but some
of those distributed filesystems only checksum data while it is in
transit, not while it is at rest. So, if a server claims it has a
copy of the file, it is assumed to be a good copy, and you never
realize that, even though you have 5 copies of that file distributed
around, the copy on the server you used differs from the other 4.
I'm also not sure if it supports an n+1/n+2 model like raid5/6, or if it
is just a 2*n model like raid1. If I want to store 5TB of data with
redundancy, I'd prefer not to need 10TB worth of drives to do it,
regardless of how many systems they're spread across.
--
Rich
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RAID6 stable enough for production?
2015-10-15 16:40 ` Rich Freeman
@ 2015-10-15 19:04 ` Chris Murphy
0 siblings, 0 replies; 11+ messages in thread
From: Chris Murphy @ 2015-10-15 19:04 UTC (permalink / raw)
To: Rich Freeman; +Cc: Chris Murphy, Btrfs BTRFS
On Thu, Oct 15, 2015 at 10:40 AM, Rich Freeman
<r-btrfs@thefreemanclan.net> wrote:
> On Wed, Oct 14, 2015 at 9:47 PM, Chris Murphy <lists@colorremedies.com> wrote:
>>
>> For that matter, now that GlusterFS has checksums and snapshots...
>
> Interesting - I haven't kept up with that. Does it actually do
> end-to-end checksums? That is, compute the checksum at the time of
> storage, store the checksum in the metadata somehow, and ensure the
> checksum matches when data is retrieved?
http://www.gluster.org/community/documentation/index.php/Features/BitRot
It could be argued that since checksums are computed after writing,
silent data corruption (SDC) may already have happened by the time the
checksum is computed and written. (I also don't know whether the metadata
itself is checksummed, so the checksum itself could be wrong and we
wouldn't know.)
But yes, it's a stored checksum. Checking on read/open can be enabled,
but by default verification only happens during scrubs. The checksums are
per file and SHA256-based. So it's different from Btrfs, and it's passive.
But it's there.
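As a toy illustration of that per-file, after-the-fact model (plain
Python, not GlusterFS code; the xattr name is made up and the xattr calls
are Linux-only):

import hashlib
import os

def sha256_of(path, chunk=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

def sign(path):
    # The digest is computed only after the file is already on disk, so
    # anything corrupted before this point is baked into the "good" value.
    os.setxattr(path, "user.demo.sha256", sha256_of(path).encode())

def scrub(path):
    # Passive verification: nothing is checked until someone runs this.
    stored = os.getxattr(path, "user.demo.sha256").decode()
    ok = (stored == sha256_of(path))
    print("%s: %s" % (path, "OK" if ok else "CORRUPT"))
    return ok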
--
Chris Murphy
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread
Thread overview: 11+ messages
2015-10-14 20:19 RAID6 stable enough for production? Sjoerd
2015-10-14 20:23 ` Donald Pearson
2015-10-14 20:34 ` Lionel Bouton
2015-10-14 20:53 ` Donald Pearson
2015-10-14 21:15 ` Rich Freeman
2015-10-14 21:19 ` Donald Pearson
2015-10-15 1:47 ` Chris Murphy
2015-10-15 16:40 ` Rich Freeman
2015-10-15 19:04 ` Chris Murphy
2015-10-14 21:16 ` Lionel Bouton
2015-10-15 1:55 ` Duncan