* Actual effect of mkfs.btrfs -m raid10 </dev/sdX> ... -d raid10 </dev/sdX> ...
@ 2013-11-19 5:12 deadhorseconsulting
2013-11-19 9:06 ` Hugo Mills
2013-11-21 17:14 ` Jeff Mahoney
0 siblings, 2 replies; 14+ messages in thread
From: deadhorseconsulting @ 2013-11-19 5:12 UTC (permalink / raw)
To: linux-btrfs
In theory (going by the man page and available documentation, which are
not 100% clear), does the following command actually work as advertised
and place and keep metadata only on the "devices" specified after the
"-m" flag?
Thus given the following example:
mkfs.btrfs -L foo -m raid10 <ssd> <ssd> <ssd> <ssd> -d raid10 <rust>
<rust> <rust> <rust>
Would btrfs stripe/mirror and only keep metadata on the 4 specified SSD devices?
Likewise then stripe/mirror and only keep data on the specified 4 spinning rust?
In trying to create this type of setup, it appears that data is also
being stored on the devices specified as "metadata devices". This is
observed via "btrfs filesystem show": after committing a large amount
of data to the filesystem, the data devices have balanced data as
expected with plenty of free space, but the SSD devices are reported
as either nearly or completely full.
- DHC
* Re: Actual effect of mkfs.btrfs -m raid10 </dev/sdX> ... -d raid10 </dev/sdX> ...
2013-11-19 5:12 Actual effect of mkfs.btrfs -m raid10 </dev/sdX> ... -d raid10 </dev/sdX> deadhorseconsulting
@ 2013-11-19 9:06 ` Hugo Mills
2013-11-19 19:24 ` deadhorseconsulting
2013-11-19 23:16 ` Duncan
2013-11-21 17:14 ` Jeff Mahoney
1 sibling, 2 replies; 14+ messages in thread
From: Hugo Mills @ 2013-11-19 9:06 UTC (permalink / raw)
To: deadhorseconsulting; +Cc: linux-btrfs
On Mon, Nov 18, 2013 at 11:12:03PM -0600, deadhorseconsulting wrote:
> In theory (going by the man page and available documentation, which are
> not 100% clear), does the following command actually work as advertised
> and place and keep metadata only on the "devices" specified after the
> "-m" flag?
>
> Thus given the following example:
> mkfs.btrfs -L foo -m raid10 <ssd> <ssd> <ssd> <ssd> -d raid10 <rust>
> <rust> <rust> <rust>
>
> Would btrfs stripe/mirror and only keep metadata on the 4 specified SSD devices?
> Likewise then stripe/mirror and only keep data on the specified 4 spinning rust?
No. The devices are general purpose. The -d and -m options only
specify the type of redundancy, not the devices to use. There's a
project[1] to look at this kind of more intelligent chunk allocator,
but it's not been updated in a while.
[1] https://btrfs.wiki.kernel.org/index.php/Project_ideas#Device_IO_Priorities
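To be concrete, the invocation above is treated as one eight-device
pool; the equivalent, clearer spelling is (device names and mount point
below are placeholders, not taken from your mail):

    mkfs.btrfs -L foo -m raid10 -d raid10 /dev/sda /dev/sdb /dev/sdc \
        /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh
    mount /dev/sda /mnt
    btrfs filesystem df /mnt     # per-profile usage (Data, Metadata)
    btrfs filesystem show        # per-device usage across the pool

Both data and metadata chunks can then end up on any of the eight
devices.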
> In trying to create this type of setup, it appears that data is also
> being stored on the devices specified as "metadata devices". This is
> observed via "btrfs filesystem show": after committing a large amount
> of data to the filesystem, the data devices have balanced data as
> expected with plenty of free space, but the SSD devices are reported
> as either nearly or completely full.
This will happen with RAID-10. The allocator will write stripes as
wide as it can: in this case, the first stripes will run across all 8
devices, until the SSDs are full, and then will write across the
remaining 4 devices.
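As a rough worked example under that model (sizes hypothetical): with
4 x 240 GB SSDs and 4 x 2 TB rust in RAID-10, each byte of data consumes
two bytes of raw space spread evenly over all eight drives, so every
drive fills at the same rate. The SSDs reach capacity after about
8 x 240 GB = 1920 GB of raw allocation, i.e. roughly 960 GB of data;
chunks allocated after that stripe across the four rust drives only,
which matches the "SSDs nearly full, rust mostly free" picture you see.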
Hugo.
--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- If it ain't broke, hit it again. ---
* Re: Actual effect of mkfs.btrfs -m raid10 </dev/sdX> ... -d raid10 </dev/sdX> ...
2013-11-19 9:06 ` Hugo Mills
@ 2013-11-19 19:24 ` deadhorseconsulting
2013-11-19 21:04 ` Duncan
2013-11-20 6:41 ` Martin
2013-11-19 23:16 ` Duncan
1 sibling, 2 replies; 14+ messages in thread
From: deadhorseconsulting @ 2013-11-19 19:24 UTC (permalink / raw)
To: Hugo Mills, deadhorseconsulting, linux-btrfs
Interesting; this confirms what I was observing.
Given the wording in the man page for "-m" and "-d", which states
"Specify how the metadata or data must be spanned across the devices
specified.", I took "devices specified" to literally mean the devices
specified after the corresponding switch.
- DHC
On Tue, Nov 19, 2013 at 3:06 AM, Hugo Mills <hugo@carfax.org.uk> wrote:
> On Mon, Nov 18, 2013 at 11:12:03PM -0600, deadhorseconsulting wrote:
>> In theory (going by the man page and available documentation, which are
>> not 100% clear), does the following command actually work as advertised
>> and place and keep metadata only on the "devices" specified after the
>> "-m" flag?
>>
>> Thus given the following example:
>> mkfs.btrfs -L foo -m raid10 <ssd> <ssd> <ssd> <ssd> -d raid10 <rust>
>> <rust> <rust> <rust>
>>
>> Would btrfs stripe/mirror and only keep metadata on the 4 specified SSD devices?
>> Likewise then stripe/mirror and only keep data on the specified 4 spinning rust?
>
> No. The devices are general purpose. The -d and -m options only
> specify the type of redundancy, not the devices to use. There's a
> project[1] to look at this kind of more intelligent chunk allocator,
> but it's not been updated in a while.
>
> [1] https://btrfs.wiki.kernel.org/index.php/Project_ideas#Device_IO_Priorities
>
>> In trying to create this type of setup, it appears that data is also
>> being stored on the devices specified as "metadata devices". This is
>> observed via "btrfs filesystem show": after committing a large amount
>> of data to the filesystem, the data devices have balanced data as
>> expected with plenty of free space, but the SSD devices are reported
>> as either nearly or completely full.
>
> This will happen with RAID-10. The allocator will write stripes as
> wide as it can: in this case, the first stripes will run across all 8
> devices, until the SSDs are full, and then will write across the
> remaining 4 devices.
>
> Hugo.
>
> --
> === Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
> PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
> --- If it ain't broke, hit it again. ---
* Re: Actual effect of mkfs.btrfs -m raid10 </dev/sdX> ... -d raid10 </dev/sdX> ...
2013-11-19 19:24 ` deadhorseconsulting
@ 2013-11-19 21:04 ` Duncan
2013-11-20 6:41 ` Martin
1 sibling, 0 replies; 14+ messages in thread
From: Duncan @ 2013-11-19 21:04 UTC (permalink / raw)
To: linux-btrfs
deadhorseconsulting posted on Tue, 19 Nov 2013 13:24:01 -0600 as
excerpted:
> Interesting; this confirms what I was observing.
> Given the wording in the man page for "-m" and "-d", which states
> "Specify how the metadata or data must be spanned across the devices
> specified.", I took "devices specified" to literally mean the devices
> specified after the corresponding switch.
It's all in how you read the documentation. After years of doing so...
While I can see how you might get that from reading the -m and -d option
text descriptions, the synopsis indicates differently (excerpt quotes
reformatted for posting):
SYNOPSIS
mkfs.btrfs [ -A alloc-start ] [ -b byte-count ] [ -d data-profile ]
[ -f ] [ -n nodesize ] [ -l leafsize ] [ -L label ] [ -m metadata
profile ] [ -M mixed data+metadata ] [ -s sectorsize ] [ -r rootdir ]
[ -K ] [ -O feature1,feature2,... ] [ -h ] [ -V ] device [ device ... ]
Here, you can see that the -d and -m options take only a single
parameter, the profile, and that the device list goes at the end and is
thus a general device list, not specifically linked to the -d and -m
options.
Similarly, the option lines themselves:
-d, --data type
-m, --metadata profile
... not...
-d, --data type [ device [ device ... ]]
-m, --metadata profile [ device [ device ...]]
Those are from the manpage. Similarly, the usage line from the output of
mkfs.btrfs --help (which mkfs.btrfs reports as an unrecognized option,
though it still prints the usage anyway):
usage: mkfs.btrfs [options] dev [ dev ... ]
options:
-d --data data profile, raid0, raid1, raid5, raid6, raid10, dup or single
-m --metadata metadata profile, values like data profile
All options come first, with no indication of a per-option device list,
followed by the general-purpose device list.
So I'd argue that the documentation is reasonably clear as-is: there is
no per-option device list, just a general-purpose device list at the
end, and thus no ability to specify data-specific and metadata-specific
device lists.
(Of course it can happen that the code gets out of sync with the
documentation, but that wasn't the argument here.)
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: Actual effect of mkfs.btrfs -m raid10 </dev/sdX> ... -d raid10 </dev/sdX> ...
2013-11-19 9:06 ` Hugo Mills
2013-11-19 19:24 ` deadhorseconsulting
@ 2013-11-19 23:16 ` Duncan
2013-11-20 6:35 ` Martin
2013-11-20 8:09 ` Hugo Mills
1 sibling, 2 replies; 14+ messages in thread
From: Duncan @ 2013-11-19 23:16 UTC (permalink / raw)
To: linux-btrfs
Hugo Mills posted on Tue, 19 Nov 2013 09:06:02 +0000 as excerpted:
> This will happen with RAID-10. The allocator will write stripes as wide
> as it can: in this case, the first stripes will run across all 8
> devices, until the SSDs are full, and then will write across the
> remaining 4 devices.
Hugo, it doesn't change the outcome for this case, but either your
assertion above is incorrect, or the wiki discussion is incorrect (of
course, or possibly I'm the one misunderstanding something, in which case
hopefully replies to this will correct my understanding).
Because I distinctly recall reading on the wiki that for raid, regardless
of the raid level, btrfs always allocates in pairs (well, I guess it'd be
pairs of pairs for raid10 mode, and I believe that statement pre-dated
raid5/6 support so that isn't included). I was actually shocked by that
because while I knew that was the case for raid1, I had thought that
other raid levels would stripe as widely as possible, which is what you
assert above as well.
Now I just have to find where I read that on the wiki...
OK, here's one spot, FAQ, md-raid/device-mapper-raid/btrfs-raid
differences, btrfs:
https://btrfs.wiki.kernel.org/index.php/FAQ#btrfs
>>>>
btrfs combines all the drives into a storage pool first, and then
duplicates the chunks as file data is created. RAID-1 is defined
currently as "2 copies of all the data on different disks". This differs
from MD-RAID and dmraid, in that those make exactly n copies for n disks.
In a btrfs RAID-1 on 3 1TB drives we get 1.5TB of usable data. Because
each block is only copied to 2 drives, writing a given block only
requires exactly 2 drives to spin up, and reading requires only 1 drive
to spin up.
RAID-0 is similarly defined, with the stripe split among exactly 2 disks.
3 1TB drives yield 3TB usable space, but to read a given stripe only
requires 2 disks.
RAID-10 is built on top of these definitions. Every stripe is split
across exactly 2 RAID1 sets and those RAID1 sets are written to
exactly 2 disks (hence the 4-disk minimum). A btrfs raid-10 volume with 6 1TB
drives will yield 3TB usable space with 2 copies of all data, but only 4
<<<<
[Yes, that ending sentence is incomplete in the wiki.]
So we have:
1) raid1 is exactly two copies of data, paired devices.
2) raid0 is a stripe exactly two devices wide (reinforced by the statement
that reading a stripe takes only two devices), so again paired devices.
3) raid10 is a combination of the above raid0 and raid1 definitions,
exactly two raid1 pairs, paired in raid0.
So btrfs raid10 is pairs of pairs, each raid0 stripe being a pair of
raid1 mirrors. If there are 8 devices, four smaller and four larger, the
first allocated chunks should use one per device; once the smaller
devices fill up it'll chunk across the remaining four, but it'll be
pairs of pairs of pairs -- two pair(0)-of-pair(1) stripes wide instead
of a single quad(0)-of-pair(1) stripe.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: Actual effect of mkfs.btrfs -m raid10 </dev/sdX> ... -d raid10 </dev/sdX> ...
2013-11-19 23:16 ` Duncan
@ 2013-11-20 6:35 ` Martin
2013-11-20 10:16 ` Chris Murphy
2013-11-20 8:09 ` Hugo Mills
1 sibling, 1 reply; 14+ messages in thread
From: Martin @ 2013-11-20 6:35 UTC (permalink / raw)
To: linux-btrfs
On 19/11/13 23:16, Duncan wrote:
> So we have:
>
> 1) raid1 is exactly two copies of data, paired devices.
>
> 2) raid0 is a stripe exactly two devices wide (reinforced by the statement
> that reading a stripe takes only two devices), so again paired devices.
Which is fine for some occasions and a very good starting point.
However, I'm sure there is a strong wish to be able to specify n copies
of data/metadata spread across m devices, or even to specify 'hot spares'.
This would be a great way to overcome the problem of a set of drives
becoming "read-only" when one btrfs drive fails or is removed.
(Or should we always mount with the "degraded" option?)
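For reference, the manual recovery path today looks roughly like this
(device names hypothetical):

    mount -o degraded /dev/sdb /mnt     # bring the array up minus the failed drive
    btrfs device add /dev/sdf /mnt      # add a replacement device
    btrfs device delete missing /mnt    # rebuild onto it and drop the missing one

which is exactly the kind of hands-on step a hot-spare or n-copies
scheme would automate.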
Regards,
Martin
* Re: Actual effect of mkfs.btrfs -m raid10 </dev/sdX> ... -d raid10 </dev/sdX> ...
2013-11-19 19:24 ` deadhorseconsulting
2013-11-19 21:04 ` Duncan
@ 2013-11-20 6:41 ` Martin
1 sibling, 0 replies; 14+ messages in thread
From: Martin @ 2013-11-20 6:41 UTC (permalink / raw)
To: linux-btrfs
On 19/11/13 19:24, deadhorseconsulting wrote:
> Interesting; this confirms what I was observing.
> Given the wording in the man page for "-m" and "-d", which states
> "Specify how the metadata or data must be spanned across the devices
> specified.", I took "devices specified" to literally mean the devices
> specified after the corresponding switch.
That sounds like a hangover from too many years' use of the mdadm
command and, more recently, of tools such as sgdisk...
;-)
Myself, I like the btrfs way of specifying the list of parameters,
which then all get applied as a whole.
The one bugbear at the moment with multiple disks is that any actions
seem to be applied to the list of devices in sequence, one by one.
There's no apparent intelligence to consider the "present pool" ->
"new pool" of devices as a whole.
More development!
Regards,
Martin
* Re: Actual effect of mkfs.btrfs -m raid10 </dev/sdX> ... -d raid10 </dev/sdX> ...
2013-11-19 23:16 ` Duncan
2013-11-20 6:35 ` Martin
@ 2013-11-20 8:09 ` Hugo Mills
2013-11-20 16:43 ` Duncan
1 sibling, 1 reply; 14+ messages in thread
From: Hugo Mills @ 2013-11-20 8:09 UTC (permalink / raw)
To: Duncan; +Cc: linux-btrfs
On Tue, Nov 19, 2013 at 11:16:58PM +0000, Duncan wrote:
> Hugo Mills posted on Tue, 19 Nov 2013 09:06:02 +0000 as excerpted:
>
> > This will happen with RAID-10. The allocator will write stripes as wide
> > as it can: in this case, the first stripes will run across all 8
> > devices, until the SSDs are full, and then will write across the
> > remaining 4 devices.
>
> Hugo, it doesn't change the outcome for this case, but either your
> assertion above is incorrect, or the wiki discussion is incorrect (of
> course, or possibly I'm the one misunderstanding something, in which case
> hopefully replies to this will correct my understanding).
>
> Because I distinctly recall reading on the wiki that for raid, regardless
> of the raid level, btrfs always allocates in pairs (well, I guess it'd be
> pairs of pairs for raid10 mode, and I believe that statement pre-dated
> raid5/6 support so that isn't included). I was actually shocked by that
> because while I knew that was the case for raid1, I had thought that
> other raid levels would stripe as widely as possible, which is what you
> assert above as well.
That's incorrect. I used to think that, a few years ago, and it got
into at least one piece of documentation as a result, but once I
worked out the actual behaviour, I did try to correct it (I definitely
remember fixing the sysadmin guide this way). For striped levels
(RAID-0, 10, 5, 6), the FS will use as many stripes as possible -- for
RAID-10, this means an even number of devices; for the others, it is all
the devices with free space on them, down to a RAID-level-dependent
minimum.
RAID-0: min 2 devices
RAID-10: min 4 devices
RAID-5: min 2 devices (I think)
RAID-6: min 3 devices (I think)
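So, for instance, mkfs.btrfs will accept a four-device RAID-10 but
refuse a three-device one (device names hypothetical):

    mkfs.btrfs -d raid10 -m raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde   # OK
    mkfs.btrfs -d raid10 -m raid10 /dev/sdb /dev/sdc /dev/sdd            # too few devices, refused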
> Now I just have to find where I read that on the wiki...
>
> OK, here's one spot, FAQ, md-raid/device-mapper-raid/btrfs-raid
> differences, btrfs:
>
> https://btrfs.wiki.kernel.org/index.php/FAQ#btrfs
>
> >>>>
>
> btrfs combines all the drives into a storage pool first, and then
> duplicates the chunks as file data is created. RAID-1 is defined
> currently as "2 copies of all the data on different disks". This differs
> from MD-RAID and dmraid, in that those make exactly n copies for n disks.
> In a btrfs RAID-1 on 3 1TB drives we get 1.5TB of usable data. Because
> each block is only copied to 2 drives, writing a given block only
> requires exactly 2 drives to spin up, and reading requires only 1 drive
> to spin up.
This is correct.
> RAID-0 is similarly defined, with the stripe split among exactly 2 disks.
> 3 1TB drives yield 3TB usable space, but to read a given stripe only
> requires 2 disks.
This is definitely wrong. RAID-0 will use all 3 drives for each
stripe.
> RAID-10 is built on top of these definitions. Every stripe is split
> across exactly 2 RAID1 sets and those RAID1 sets are written to
> exactly 2 disks (hence the 4-disk minimum). A btrfs raid-10 volume with 6 1TB
> drives will yield 3TB usable space with 2 copies of all data, but only 4
This is also wrong. You will get 3 TB of usable space out of 6 × 1 TB
drives, but the individual stripes will be 3 drives wide. You would have
the same behaviour (2 copies, each striped 3 drives wide) on a 7-device
array.
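Spelled out with hypothetical numbers: on 7 × 1 TB devices, each
RAID-10 chunk still uses an even number of devices (here 6 of the 7: two
copies of a 3-wide stripe), and successive chunks go to the devices with
the most free space, so all seven drives fill evenly and usable space
comes out at roughly 7 TB / 2 = 3.5 TB.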
> <<<<
>
> [Yes, that ending sentence is incomplete in the wiki.]
>
> So we have:
>
> 1) raid1 is exactly two copies of data, paired devices.
>
> 2) raid0 is a stripe exactly two devices wide (reinforced by the statement
> that reading a stripe takes only two devices), so again paired devices.
>
> 3) raid10 is a combination of the above raid0 and raid1 definitions,
> exactly two raid1 pairs, paired in raid0.
>
> So btrfs raid10 is pairs of pairs, each raid0 stripe being a pair of
> raid1 mirrors. If there are 8 devices, four smaller and four larger, the
> first allocated chunks should use one per device; once the smaller
> devices fill up it'll chunk across the remaining four, but it'll be
> pairs of pairs of pairs -- two pair(0)-of-pair(1) stripes wide instead
> of a single quad(0)-of-pair(1) stripe.
If the RAID code used pairs for its stripes, that'd be the case,
but it doesn't...
Hugo.
--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- emacs: Emacs Makes A Computer Slow. ---
* Re: Actual effect of mkfs.btrfs -m raid10 </dev/sdX> ... -d raid10 </dev/sdX> ...
2013-11-20 6:35 ` Martin
@ 2013-11-20 10:16 ` Chris Murphy
2013-11-20 10:22 ` Russell Coker
0 siblings, 1 reply; 14+ messages in thread
From: Chris Murphy @ 2013-11-20 10:16 UTC (permalink / raw)
To: Btrfs BTRFS
On Nov 19, 2013, at 11:35 PM, Martin <m_btrfs@ml1.co.uk> wrote:
> On 19/11/13 23:16, Duncan wrote:
>
>> So we have:
>>
>> 1) raid1 is exactly two copies of data, paired devices.
>>
>> 2) raid0 is a stripe exactly two devices wide (reinforced by the statement
>> that reading a stripe takes only two devices), so again paired devices.
>
> Which is fine for some occasions and a very good start point.
>
> However, I'm sure there is a strong wish to be able to specify n-copies
> of data/metadata spread across m devices. Or even to specify 'hot spares'.
Hot spares are worse than useless. Especially for raid10. The drive
takes up space doing nothing but suck power, rather than adding space or
performance. Somehow this idea comes from cheap companies who seem to
think their data is so valuable they need hot spares, yet they don't
have 24/7 staff on hand to do a hot swap. (As if the only problem that
can occur is a dead drive.) So I think those companies can develop this
otherwise unneeded feature.
n-copies raid1 is a good idea and I think it's being worked on.
Chris Murphy
* Re: Actual effect of mkfs.btrfs -m raid10 </dev/sdX> ... -d raid10 </dev/sdX> ...
2013-11-20 10:16 ` Chris Murphy
@ 2013-11-20 10:22 ` Russell Coker
0 siblings, 0 replies; 14+ messages in thread
From: Russell Coker @ 2013-11-20 10:22 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS
On Wed, 20 Nov 2013, Chris Murphy <lists@colorremedies.com> wrote:
> Hot spares are worse than useless. Especially for raid10. The drive takes
> up space doing nothing but suck power, rather than adding space or
> performance. Somehow this idea comes from cheap companies who seem to
> think their data is so valuable they need hot spares, yet they don't have
> 24/7 staff on hand to do a hot swap. (As if the only problem that can
> occur is a dead drive.) So I think those companies can develop this
> otherwise unneeded feature.
>
> n-copies raid1 is a good idea and I think it's being worked on.
N-copies RAID-1 is definitely more useful than RAID-1 with a hot spare.
But for RAID-5/RAID-6 a hot spare can provide real value: not having to
pay someone to make a special rushed visit to replace a disk is a
definite benefit.
Also, when a disk isn't being used it doesn't draw much power. Last time
I tested such things I found an IDE disk used about 7W while spinning,
and made no measurable difference to overall system power use when
spun down.
--
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/
* Re: Actual effect of mkfs.btrfs -m raid10 </dev/sdX> ... -d raid10 </dev/sdX> ...
2013-11-20 8:09 ` Hugo Mills
@ 2013-11-20 16:43 ` Duncan
2013-11-20 16:52 ` Hugo Mills
0 siblings, 1 reply; 14+ messages in thread
From: Duncan @ 2013-11-20 16:43 UTC (permalink / raw)
To: linux-btrfs
Hugo Mills posted on Wed, 20 Nov 2013 08:09:58 +0000 as excerpted:
> RAID-0: min 2 devices
> RAID-10: min 4 devices
> RAID-5: min 2 devices (I think)
> RAID-6: min 3 devices (I think)
RAID-5 should be 3-device minimum (each stripe consisting of two data
segments and one parity segment, each on a different device).
And RAID-6 similarly four devices (two data and two parity).
Perhaps it's time I get that wiki account and edit some of this stuff
myself...
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: Actual effect of mkfs.btrfs -m raid10 </dev/sdX> ... -d raid10 </dev/sdX> ...
2013-11-20 16:43 ` Duncan
@ 2013-11-20 16:52 ` Hugo Mills
2013-11-20 21:13 ` Duncan
0 siblings, 1 reply; 14+ messages in thread
From: Hugo Mills @ 2013-11-20 16:52 UTC (permalink / raw)
To: Duncan; +Cc: linux-btrfs
On Wed, Nov 20, 2013 at 04:43:57PM +0000, Duncan wrote:
> Hugo Mills posted on Wed, 20 Nov 2013 08:09:58 +0000 as excerpted:
>
> > RAID-0: min 2 devices
> > RAID-10: min 4 devices
> > RAID-5: min 2 devices (I think)
> > RAID-6: min 3 devices (I think)
>
> RAID-5 should be 3-device minimum (each stripe consisting of two data
> segments and one parity segment, each on a different device).
You can successfully run RAID-5 on two devices: one data device(*),
plus its parity. The parity check of a single piece of data is that
data, so it's equivalent to RAID-1 in that configuration. IIRC, the
MD-RAID code allows this; I can't remember if the btrfs RAID code does
or not, but it probably should do if it doesn't.
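(Spelling out the parity: for data blocks d1..dn in a stripe, the XOR
parity is p = d1 xor d2 xor ... xor dn; with a single data block, p = d1,
so the "parity" device just holds a copy of the data -- i.e. a mirror.)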
> And RAID-6 similarly four devices (two data and two parity).
Similarly for RAID-6: it's a single data device(*), plus an
XOR-based parity (effectively a mirror), plus a more complex parity
calculation.
> Perhaps it's time I get that wiki account and edit some of this stuff
> myself...
Do check the assumptions first. :)
Hugo.
(*) Yeah, OK, rotate the data/parity position as you move through the
stripes because it's not RAID-4.
--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- In my day, we didn't have fancy high numbers. We had "nowt", ---
"one", "twain" and "multitudes".
* Re: Actual effect of mkfs.btrfs -m raid10 </dev/sdX> ... -d raid10 </dev/sdX> ...
2013-11-20 16:52 ` Hugo Mills
@ 2013-11-20 21:13 ` Duncan
0 siblings, 0 replies; 14+ messages in thread
From: Duncan @ 2013-11-20 21:13 UTC (permalink / raw)
To: linux-btrfs
Hugo Mills posted on Wed, 20 Nov 2013 16:52:47 +0000 as excerpted:
>> Perhaps it's time I get that wiki account and edit some of this stuff
>> myself...
>
> Do check the assumptions first. :)
Of course. =:^)
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: Actual effect of mkfs.btrfs -m raid10 </dev/sdX> ... -d raid10 </dev/sdX> ...
2013-11-19 5:12 Actual effect of mkfs.btrfs -m raid10 </dev/sdX> ... -d raid10 </dev/sdX> deadhorseconsulting
2013-11-19 9:06 ` Hugo Mills
@ 2013-11-21 17:14 ` Jeff Mahoney
1 sibling, 0 replies; 14+ messages in thread
From: Jeff Mahoney @ 2013-11-21 17:14 UTC (permalink / raw)
To: deadhorseconsulting, linux-btrfs
On 11/19/13, 12:12 AM, deadhorseconsulting wrote:
> In theory (going by the man page and available documentation, which are
> not 100% clear), does the following command actually work as advertised
> and place and keep metadata only on the "devices" specified after the
> "-m" flag?
>
> Thus given the following example:
> mkfs.btrfs -L foo -m raid10 <ssd> <ssd> <ssd> <ssd> -d raid10 <rust>
> <rust> <rust> <rust>
>
> Would btrfs stripe/mirror and only keep metadata on the 4 specified SSD devices?
> Likewise then stripe/mirror and only keep data on the specified 4 spinning rust?
>
> In trying to create this type of setup, it appears that data is also
> being stored on the devices specified as "metadata devices". This is
> observed via "btrfs filesystem show": after committing a large amount
> of data to the filesystem, the data devices have balanced data as
> expected with plenty of free space, but the SSD devices are reported
> as either nearly or completely full.
Others have noted that's not how it works, but I wanted to add a comment.
I had a feature request from a customer recently that was pretty much
exactly this. I think it'd be pretty easy to implement by allocating all
(except for overhead) of the devices to chunks immediately at mkfs time,
bypassing the kernel's dynamic chunk allocation. Since you don't *want*
to mix allocation profiles, the usual reason for doing it dynamically
doesn't apply. Extending an existing file system created in such a
manner so that the added devices are set up with the right kinds of
chunks would require other extensions, though.
I have a few things on my plate right now, but I'll probably dig into
this in the next month or so.
-Jeff
--
Jeff Mahoney
SUSE Labs