* RAID56 status?
  From: Jan Vales @ 2017-01-22 21:22 UTC
  To: linux-btrfs

Hi there!

I've been using btrfs since around 2.28 on my laptop without too many
problems. Now I'm thinking of going further and running btrfs raid6 on my
PC. I don't mind restoring backups once in a while, but it should not be on
anything like a weekly basis.

Therefore my question: what is the status of raid5/6 in btrfs?
Is it somehow "production"-ready by now? As in: if used on / on my PC, do
you expect it to brick itself if the fs fills up, a power cycle happens, or
some other reasonably common error case occurs, to the point where I will
need to restore backups?

regards,
Jan Vales
--
I only read plaintext emails.
Someone @ irc://irc.fsinf.at:6667/tuwien
webIRC: https://frost.fsinf.at/iris/
* Re: RAID56 status? 2017-01-22 21:22 RAID56 status? Jan Vales @ 2017-01-22 22:35 ` Christoph Anton Mitterer 2017-01-22 22:39 ` Hugo Mills 0 siblings, 1 reply; 14+ messages in thread From: Christoph Anton Mitterer @ 2017-01-22 22:35 UTC (permalink / raw) To: Jan Vales, linux-btrfs [-- Attachment #1: Type: text/plain, Size: 369 bytes --] On Sun, 2017-01-22 at 22:22 +0100, Jan Vales wrote: > Therefore my question: whats the status of raid5/6 is in btrfs? > Is it somehow "production"-ready by now? AFAIK, what's on the - apparently already no longer updated - https://btrfs.wiki.kernel.org/index.php/Status still applies, and RAID56 is not yet usable for anything near production. Cheers, Chris. [-- Attachment #2: smime.p7s --] [-- Type: application/x-pkcs7-signature, Size: 5930 bytes --] ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: RAID56 status?
  From: Hugo Mills @ 2017-01-22 22:39 UTC
  To: Christoph Anton Mitterer; Cc: Jan Vales, linux-btrfs

On Sun, Jan 22, 2017 at 11:35:49PM +0100, Christoph Anton Mitterer wrote:
> On Sun, 2017-01-22 at 22:22 +0100, Jan Vales wrote:
> > Therefore my question: what is the status of raid5/6 in btrfs?
> > Is it somehow "production"-ready by now?
> AFAIK, what's on the - apparently no longer updated -
> https://btrfs.wiki.kernel.org/index.php/Status still applies, and
> RAID56 is not yet usable for anything near production.

It's still all valid. Nothing's changed.

How would you like it to be updated? "Nope, still broken"?

Hugo.

--
Hugo Mills             | I went to a fight once, and an ice hockey match
hugo@... carfax.org.uk | broke out.
http://carfax.org.uk/  | PGP: E2AB1DE4
* Re: RAID56 status?
  From: Waxhead @ 2017-01-22 22:48 UTC
  To: Hugo Mills, Christoph Anton Mitterer, Jan Vales, linux-btrfs

Hugo Mills wrote:
> On Sun, Jan 22, 2017 at 11:35:49PM +0100, Christoph Anton Mitterer wrote:
>> On Sun, 2017-01-22 at 22:22 +0100, Jan Vales wrote:
>>> Therefore my question: what is the status of raid5/6 in btrfs?
>>> Is it somehow "production"-ready by now?
>> AFAIK, what's on the - apparently no longer updated -
>> https://btrfs.wiki.kernel.org/index.php/Status still applies, and
>> RAID56 is not yet usable for anything near production.
> It's still all valid. Nothing's changed.
>
> How would you like it to be updated? "Nope, still broken"?
>
> Hugo.

I risked updating the wiki to show kernel version 4.9 instead of 4.7, then...
* Re: RAID56 status?
  From: Christoph Anton Mitterer @ 2017-01-22 22:56 UTC
  To: Hugo Mills; Cc: linux-btrfs

On Sun, 2017-01-22 at 22:39 +0000, Hugo Mills wrote:
> It's still all valid. Nothing's changed.
>
> How would you like it to be updated? "Nope, still broken"?

The kernel version mentioned there is 4.7... so no one (at least among end
users) really knows whether it's just no longer maintained, or still
up-to-date with nothing changed... :(

Cheers,
Chris
* Re: RAID56 status?
  From: Jan Vales @ 2017-01-23 0:25 UTC
  To: Hugo Mills, Christoph Anton Mitterer, linux-btrfs

On 01/22/2017 11:39 PM, Hugo Mills wrote:
> On Sun, Jan 22, 2017 at 11:35:49PM +0100, Christoph Anton Mitterer wrote:
>> On Sun, 2017-01-22 at 22:22 +0100, Jan Vales wrote:
>>> Therefore my question: what is the status of raid5/6 in btrfs?
>>> Is it somehow "production"-ready by now?
>> AFAIK, what's on the - apparently no longer updated -
>> https://btrfs.wiki.kernel.org/index.php/Status still applies, and
>> RAID56 is not yet usable for anything near production.
>
> It's still all valid. Nothing's changed.
>
> How would you like it to be updated? "Nope, still broken"?
>
> Hugo.

As the changelog stops at 4.7, the wiki seemed a little dead - "still broken
as of $(date)" or something like that would be nice ^.^

Also some more exact documentation/definition of btrfs' raid levels would be
cool, as they seem to mismatch traditional raid levels - or at least I, as an
ignorant user, fail to understand them...

Correct me if I'm wrong...
* It seems raid1 (btrfs) is actually raid10, as there are no more than 2
  copies of data, regardless of the count of devices.
** Is there a way to duplicate data n times?
** If there are only 3 devices and the wrong device dies... is it dead?
* What's the difference between raid1 (btrfs) and raid10 (btrfs)?
** After reading like 5 different wiki pages, I understood that there are
   differences... but not what they are and how they affect me :/
* What's the difference between raid0 (btrfs) and "normal" multi-device
  operation, which seems like a traditional raid0 to me?

Maybe rename/alias the raid levels that do not match traditional raid
levels, so one cannot expect behaviour that is not there.
The extreme example is imho raid1 (btrfs) vs. raid1.
I would expect that if I have 5 btrfs-raid1 devices, 4 may die and btrfs
should be able to fully recover - which, if I understand correctly, by far
does not hold.
If you named that raid level, say, "george"... I would need to consult the
docs, and I obviously would not expect any particular behaviour. :)

regards,
Jan Vales
--
I only read plaintext emails.
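[Aside: for readers who want to see how this affects their own systems,
btrfs-progs itself reports which profile a filesystem is using and how its
chunks are spread across devices. A minimal sketch, assuming the filesystem
is mounted at /mnt (the mount point is illustrative):

    # Which profiles (single/DUP/RAID0/1/10/5/6) the data and metadata
    # chunks of this filesystem use:
    btrfs filesystem df /mnt

    # Per-device view of how those chunks are allocated:
    btrfs filesystem usage /mnt
    btrfs device usage /mnt

This does not answer the semantics questions - those follow below - but it
shows what a given setup actually did with the devices.]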
* Re: RAID56 status?
  From: Qu Wenruo @ 2017-01-23 1:34 UTC
  To: Jan Vales, Hugo Mills, Christoph Anton Mitterer, linux-btrfs

At 01/23/2017 08:25 AM, Jan Vales wrote:
> On 01/22/2017 11:39 PM, Hugo Mills wrote:
>> On Sun, Jan 22, 2017 at 11:35:49PM +0100, Christoph Anton Mitterer wrote:
>>> On Sun, 2017-01-22 at 22:22 +0100, Jan Vales wrote:
>>>> Therefore my question: what is the status of raid5/6 in btrfs?
>>>> Is it somehow "production"-ready by now?
>>> AFAIK, what's on the - apparently no longer updated -
>>> https://btrfs.wiki.kernel.org/index.php/Status still applies, and
>>> RAID56 is not yet usable for anything near production.
>>
>> It's still all valid. Nothing's changed.
>>
>> How would you like it to be updated? "Nope, still broken"?
>>
>> Hugo.

I'd like to update the wiki to "More and more RAID5/6 bugs are found" :)

OK, no kidding: we did expose several new bugs, and the reports have existed
for a while on the mailing list. Some examples are:

1) RAID5/6 scrub will repair data while corrupting parity.
   Quite ironic - the repair just changes one corruption into another.

2) RAID5/6 scrub can report false alerts on csum errors.

3) Dev-replace cancel can sometimes cause a kernel panic.

And if we find more bugs, I won't be surprised at all.

So, if you really want to use RAID5/6, please use soft raid and then build a
single-volume btrfs on top of it.

I'm seriously considering re-implementing btrfs RAID5/6 using device mapper,
which is tried and true.

> As the changelog stops at 4.7, the wiki seemed a little dead - "still
> broken as of $(date)" or something like that would be nice ^.^
>
> Also some more exact documentation/definition of btrfs' raid levels would
> be cool, as they seem to mismatch traditional raid levels - or at least I,
> as an ignorant user, fail to understand them...

man mkfs.btrfs has a quite good table of the btrfs profiles.

> Correct me if I'm wrong...
> * It seems raid1 (btrfs) is actually raid10, as there are no more than 2
>   copies of data, regardless of the count of devices.

Somewhat right, except that the stripe size of RAID10 is 64K while RAID1
stripes at chunk size (normally 1G for data), and the large stripe size of
RAID1 makes it meaningless to call it RAID0.

> ** Is there a way to duplicate data n times?

The only supported n-times duplication is 3-times duplication, which uses
RAID6 on 3 devices, and I don't consider it safe compared to RAID1.

> ** If there are only 3 devices and the wrong device dies... is it dead?

For RAID1/10/5/6, theoretically it's still alive.
RAID5/6 of course has no problem with that case.

For RAID1 there are always 2 mirrors, and the mirrors are always located on
different devices, so no matter which mirror dies, btrfs can still read the
data out.

But in practice, it's btrfs - you know, right?

> * What's the difference between raid1 (btrfs) and raid10 (btrfs)?

RAID1: pure mirror, no striping

    Disk 1                     |  Disk 2
    ----------------------------------------------------------------
    Data Data Data Data Data   |  Data Data Data Data Data
               \                          /
                ------ full one chunk ----

Since chunks are always allocated on the device with the most unallocated
space, you can think of it as extent-level RAID1 with chunk-level RAID0.

RAID10: RAID1 first, then RAID0 on top.
IIRC the RAID0 stripe size is 64K.

    Disk 1 | Data 1 (64K)  Data 4 (64K)
    Disk 2 | Data 1 (64K)  Data 4 (64K)
    ---------------------------------------
    Disk 3 | Data 2 (64K)
    Disk 4 | Data 2 (64K)
    ---------------------------------------
    Disk 5 | Data 3 (64K)
    Disk 6 | Data 3 (64K)

> ** After reading like 5 different wiki pages, I understood that there are
>    differences... but not what they are and how they affect me :/

Chunk-level striping doesn't give any obvious performance advantage, while
64K-level striping does.

> * What's the difference between raid0 (btrfs) and "normal" multi-device
>   operation, which seems like a traditional raid0 to me?

What is "normal" or traditional RAID0? Doesn't it use all devices for
striping? Or just 2?

Btrfs RAID0 always uses a stripe size of 64K (not only RAID0, but also
RAID10/5/6).

Btrfs chunk allocation also provides chunk-size-level striping, which is 1G
for data (assuming your fs is larger than 10G) or 256M for metadata. But
that striping size doesn't provide anything useful, so you can just forget
about the chunk-level thing.

Apart from that, btrfs RAID should match normal RAID quite closely.

Thanks,
Qu

> Maybe rename/alias the raid levels that do not match traditional raid
> levels, so one cannot expect behaviour that is not there.
> The extreme example is imho raid1 (btrfs) vs. raid1.
> I would expect that if I have 5 btrfs-raid1 devices, 4 may die and btrfs
> should be able to fully recover - which, if I understand correctly, by far
> does not hold.
> If you named that raid level, say, "george"... I would need to consult the
> docs, and I obviously would not expect any particular behaviour. :)
>
> regards,
> Jan Vales
> --
> I only read plaintext emails.
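[Aside: a minimal sketch of the layout Qu suggests above - conventional
Linux software RAID underneath, with btrfs used as a single-device
filesystem on top. The device names, array name, and mount point are purely
illustrative:

    # Build an md RAID6 array from four spare devices (this destroys any
    # existing data on them):
    mdadm --create /dev/md0 --level=6 --raid-devices=4 \
        /dev/sdb /dev/sdc /dev/sdd /dev/sde

    # Put btrfs on the array as a single device; parity and rebuilds are
    # handled entirely by md, not by btrfs:
    mkfs.btrfs -d single -m dup /dev/md0
    mount /dev/md0 /mnt

In this layout btrfs checksums can still detect corruption, but for data
there is no second copy inside btrfs to repair from - redundancy and disk
replacement are md's job.]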
[parent not found: <CALqCWCXNoKqAJR=7c4wzOvVjSBxNRMsUYMvkfRMcVk14dkp27Q@mail.gmail.com>]
* Re: RAID56 status?
  From: Qu Wenruo @ 2017-01-23 5:24 UTC
  To: Zane Zakraisek, Jan Vales, Hugo Mills, Christoph Anton Mitterer, linux-btrfs

At 01/23/2017 12:42 PM, Zane Zakraisek wrote:
> Hi Qu,
> I've seen a good amount of RAID56 patches come in from you on the mailing
> list. Do these catch a large portion of the RAID56 bugs, or are they only
> the beginning? :)

Hard to say - it could be just the tip of an iceberg, or the beginning of
RAID56 doom.

What I can do is fix the bugs reported by users and let the patches go
through xfstests and our internal test scripts.

So the patches catch a large portion of *known* RAID56 bugs; I don't know
how many are still hidden.

Thanks,
Qu

> ZZ
>
> [ snip: full quote of Qu's previous message ]
* Re: RAID56 status?
  From: Christoph Anton Mitterer @ 2017-01-23 17:53 UTC
  To: linux-btrfs

Just wondered... is there any larger known RAID56 deployment? I mean
something with real-world production systems, and ideally many different IO
scenarios, failures, disks pulled at random, and perhaps even so many disks
that it's also likely to hit something like silent data corruption (at the
disk level)?

Has CM already migrated all of Facebook's storage to btrfs RAID56?! ;-)
Well, at least facebook.com seems to still be online ;-P *kidding*

I mean, the good thing about having such a massive production-like
environment - especially when it's not just one homogeneous usage pattern -
is that it would help build up quite some trust in the code (once the
already-known bugs are fixed).

Cheers,
Chris.
* Re: RAID56 status?
  From: Chris Mason @ 2017-01-23 23:18 UTC
  To: Christoph Anton Mitterer; Cc: linux-btrfs

On Mon, Jan 23, 2017 at 06:53:21PM +0100, Christoph Anton Mitterer wrote:
> Just wondered... is there any larger known RAID56 deployment? I mean
> something with real-world production systems, and ideally many different
> IO scenarios, failures, disks pulled at random, and perhaps even so many
> disks that it's also likely to hit something like silent data corruption
> (at the disk level)?
>
> Has CM already migrated all of Facebook's storage to btrfs RAID56?! ;-)
> Well, at least facebook.com seems to still be online ;-P *kidding*
>
> I mean, the good thing about having such a massive production-like
> environment - especially when it's not just one homogeneous usage pattern
> - is that it would help build up quite some trust in the code (once the
> already-known bugs are fixed).

We've been focusing on the single-drive use cases internally. This year
that's changing as we ramp up more users in different places.
Performance/stability work and raid5/6 are at the top of my list right now.

-chris
* Re: RAID56 status?
  From: Christoph Anton Mitterer @ 2017-01-23 23:31 UTC
  To: linux-btrfs

On Mon, 2017-01-23 at 18:18 -0500, Chris Mason wrote:
> We've been focusing on the single-drive use cases internally. This year
> that's changing as we ramp up more users in different places.
> Performance/stability work and raid5/6 are at the top of my list right
> now.

+1

It would be nice to get some feedback on what happens behind the scenes...
actually, I think a regular btrfs development blog could generally be a
nice thing :)

Cheers,
Chris.
* Re: RAID56 status?
  From: Niccolò Belli @ 2017-01-24 14:36 UTC
  To: Christoph Anton Mitterer; Cc: linux-btrfs

+1

On Tuesday, 24 January 2017 at 00:31:42 CET, Christoph Anton Mitterer wrote:
> On Mon, 2017-01-23 at 18:18 -0500, Chris Mason wrote:
>> We've been focusing on the single-drive use cases internally. This year
>> that's changing as we ramp up more users in different places.
>> Performance/stability work and raid5/6 are at the top of my list right
>> now.
> +1
>
> It would be nice to get some feedback on what happens behind the
> scenes... actually, I think a regular btrfs development blog could
> generally be a nice thing :)
>
> Cheers,
> Chris.
* Re: RAID56 status?
  From: Brendan Hide @ 2017-01-23 6:57 UTC
  To: Jan Vales, Hugo Mills, Christoph Anton Mitterer, Qu Wenruo, linux-btrfs

Hey, all

Long-time lurker/commenter here. Production-ready RAID5/6 and N-way
mirroring are the two features I've been anticipating most, so I've
commented regularly when this sort of thing pops up. :)

I'm only addressing some of the RAID-type queries, as Qu already has a
handle on the rest.

Small-yet-important hint: If you don't have a backup of it, it isn't
important.

On 01/23/2017 02:25 AM, Jan Vales wrote:
> [ snip ]
> Correct me if I'm wrong...
> * It seems raid1 (btrfs) is actually raid10, as there are no more than 2
>   copies of data, regardless of the count of devices.

The original "definition" of raid1 is two mirrored devices. The *nix
industry-standard implementation (mdadm) extends this to any number of
mirrored devices. Thus confusion here is understandable.

> ** Is there a way to duplicate data n times?

This is a planned feature, especially in view of feature parity with mdadm,
though the priority isn't particularly high right now. It has been referred
to as "N-way mirroring". The last time I recall this being discussed, the
hope was to start work on it after raid5/6 was stable.

> ** If there are only 3 devices and the wrong device dies... is it dead?

Qu has the right answers. Generally, if you're using anything other than
dup, raid0, or single, one disk failure is "okay". More than one failure is
closer to "undefined" - except with RAID6, where you need more than two
disk failures before you have lost data.

> * What's the difference between raid1 (btrfs) and raid10 (btrfs)?

Some nice illustrations from Qu there. :)

> ** After reading like 5 different wiki pages, I understood that there are
>    differences... but not what they are and how they affect me :/
> * What's the difference between raid0 (btrfs) and "normal" multi-device
>   operation, which seems like a traditional raid0 to me?

raid0 stripes data in 64K chunks (I think this size is tunable) across all
devices, which is generally far faster in terms of throughput for both
writing and reading data.

By '"normal" multi-device' I will assume you mean "single" with multiple
devices. New writes with "single" use a 1GB chunk on one device until the
chunk is full, at which point a new chunk is allocated, usually on the disk
with the most available free space. There is no particular optimisation in
place comparable to raid0 here.

> Maybe rename/alias the raid levels that do not match traditional raid
> levels, so one cannot expect behaviour that is not there.
> The extreme example is imho raid1 (btrfs) vs. raid1.
> I would expect that if I have 5 btrfs-raid1 devices, 4 may die and btrfs
> should be able to fully recover - which, if I understand correctly, by far
> does not hold.
> If you named that raid level, say, "george"... I would need to consult the
> docs, and I obviously would not expect any particular behaviour. :)

We've discussed this a couple of times. Hugo came up with a notation since
dubbed "csp" notation: c -> copies, s -> stripes, and p -> parities.
Examples of this would be:

    raid1: 2c
    3-way mirroring across 3 (or more*) devices: 3c
    raid0 (2 or more devices): 2s
    raid0 (3 or more): 3s
    raid5 (5 or more): 4s1p
    raid16 (12 or more): 2c4s2p

* note the "or more": mdadm cannot use fewer mirrors or stripes than it has
devices, whereas there is no particular reason why btrfs couldn't.

A minor problem with csp notation is that it implies a complete
implementation of *any* combination of these, whereas the idea was simply
to create a consistent way of referring to the "raid" levels.

I hope this brings some clarity. :)

> regards,
> Jan Vales
> --
> I only read plaintext emails.

--
__________
Brendan Hide
http://swiftspirit.co.za/
http://www.webafrica.co.za/?AFF1E97
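[Aside: until anything like csp notation exists, the practical interface to
all of these profiles is mkfs.btrfs and balance. A short sketch, with device
names and mount point purely illustrative:

    # Create a filesystem with explicit profiles, e.g. raid10 for data and
    # raid1 for metadata across four devices:
    mkfs.btrfs -d raid10 -m raid1 /dev/sdb /dev/sdc /dev/sdd /dev/sde

    # Profiles can be changed later on the mounted filesystem with a
    # balance, e.g. converting everything to raid1:
    btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt

The conversion rewrites existing chunks in the new profile, so it can take
a long time on a large filesystem.]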
* Re: RAID56 status?
  From: Janos Toth F. @ 2017-01-23 8:34 UTC
  To: Btrfs BTRFS

On Mon, Jan 23, 2017 at 7:57 AM, Brendan Hide <brendan@swiftspirit.co.za> wrote:
> raid0 stripes data in 64K chunks (I think this size is tunable) across all
> devices, which is generally far faster in terms of throughput for both
> writing and reading data.

I remember seeing some proposals for a configurable stripe size in the form
of patches (which changed a lot over time), but I don't think the idea
reached a consensus (let alone whether a final patch materialized and got
merged). I think it would be a nice feature, though.