* RAID10 and 'writemostly' support
From: Reindl Harald @ 2017-02-16 14:08 UTC (permalink / raw)
To: linux-raid
Hi
i am new and was redirected to this list from the bugtracker
please have a look at https://bugzilla.kernel.org/show_bug.cgi?id=194551
currently "writemostly" seems to be supported only on "real" RAID1, while
i was hoping that, since RAID10 is conceptually more or less
RAID0+RAID1, it would also work on RAID10 (and on the virtual machine where
i tested mdadm with the flag before buying the disks, it did not complain)
RAID10 with "writemostly" makes a lot of sense for large storage, to make
it fast *and* reliable without making it extremely expensive
* you don't want RAID5/RAID6 rebuild over many TB
* very large SSD for RAID1 are much more expensive than smaller ones
so with 4x2 TB disks you get 4 TB usable storage, and with "writemostly"
(which in the best case would be "writeonly") you have a lightning-fast
RAID0 on SSD with good redundancy
most workloads are read-intensive with fewer writes (rsync with --checksum
enabled, booting, starting large applications...)
another benefit: different technologies - it's very unlikely that both
disks of a stripe fail at the same time or during a rebuild when one half is
an SSD and the other an HDD
* Re: RAID10 and 'writemostly' support
From: Anthony Youngman @ 2017-02-17 1:24 UTC (permalink / raw)
To: Reindl Harald, linux-raid
On 16/02/17 14:08, Reindl Harald wrote:
> Hi
>
> i am new and was redirected to this list from the bugtracker
> please have a look at https://bugzilla.kernel.org/show_bug.cgi?id=194551
>
> currently "writemostly" seems to be supported only on "real" RAID1, while
> i was hoping that, since RAID10 is conceptually more or less
> RAID0+RAID1, it would also work on RAID10 (and on the virtual machine where
> i tested mdadm with the flag before buying the disks, it did not complain)
Be careful. Don't confuse Raid10 with Raid1+0. They are NOT the same
thing (on linux at least), although they are very similar.
Cheers,
Wol
* Re: RAID10 and 'writemostly' support
From: Reindl Harald @ 2017-02-17 10:03 UTC (permalink / raw)
To: linux-raid
Am 17.02.2017 um 02:24 schrieb Anthony Youngman:
> On 16/02/17 14:08, Reindl Harald wrote:
>> Hi
>>
>> i am new and was redirected to this list from the bugtracker
>> please have a look at https://bugzilla.kernel.org/show_bug.cgi?id=194551
>>
>> currently "writemostly" seems to be supported only on "real" RAID1, while
>> i was hoping that, since RAID10 is conceptually more or less
>> RAID0+RAID1, it would also work on RAID10 (and on the virtual machine where
>> i tested mdadm with the flag before buying the disks, it did not complain)
>
> Be careful. Don't confuse Raid10 with Raid1+0. They are NOT the same
> thing (on linux at least), although they are very similar
yeah, i realized that, but i thought the "writemostly" logic might be
there too and the docs just not up to date
sadly i can't write a patch on my own, only point out how useful it would be
let's say you need fast and really large storage for a mostly-read
workload - take 10x4 TB disks - RAID5/RAID6 is horrible in case of a drive
failure and rebuild, and 10 x 4 TB SSD is horrible in terms of price
5x4 TB SSD = 5 x 1400 = 7000
5x4 TB HDD = 5 x 100 = 500
total price 7500 versus 14000 for flash-only
sure, you can set up 5 RAID1 arrays with RAID0 or LVM on top for such a large
setup if you start from scratch - on the other hand my current RAID10
is from 2011, when SSDs were not such a topic, and since the operating
system is on RAID10 too the initial setup is not that easy, and after
Fedora/RHEL "reworked" anaconda it's painful to impossible (i even
had enough of the manual partition tool in a virtual machine installing
CentOS7 and created the data partitions after the OS install)
* Re: RAID10 and 'writemostly' support
From: Phil Turmel @ 2017-02-18 22:20 UTC (permalink / raw)
To: Reindl Harald, linux-raid
On 02/17/2017 05:03 AM, Reindl Harald wrote:
>> Be careful. Don't confuse Raid10 with Raid1+0. They are NOT the
>> same thing (on linux at least), although they are very similar
>
> yeah, i realized that, but i thought the "writemostly" logic might be
> there too and the docs just not up to date
Linux MD raid10 doesn't have a requirement that the number of devices
be a multiple of the number of data copies. Which creates "interesting"
data layouts with odd numbers of devices or similar effects with ,n3 or
,f3 layouts. Which makes it difficult if not impossible to designate
specific devices as write mostly without weird operational asymmetries
across the assembled array.
In other words, it is not at all like raid 1 on top of raid 0, except in
certain very limited cases, and your assumptions are simply wrong.
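To illustrate the layout point, here is a minimal Python sketch of how the "near" layout spreads copies across devices. It is a simplified model, assuming copies of each chunk are placed consecutively and wrap around the devices row by row; the function names are made up for this example and are not part of mdadm or the kernel.

```python
def near_layout(raid_disks, copies, chunks):
    """Return, for each data chunk, the devices holding its copies.

    Simplified model of the 'near' layout (an assumption for illustration):
    copies are placed consecutively, wrapping across the devices.
    """
    placement = []
    for chunk in range(chunks):
        first = chunk * copies  # linear position of the chunk's first copy
        placement.append([(first + c) % raid_disks for c in range(copies)])
    return placement


def mirror_partners(raid_disks, copies):
    """Map each device to the set of devices it shares at least one chunk with."""
    partners = {d: set() for d in range(raid_disks)}
    # raid_disks chunks are enough to cover one full repeat of the pattern
    for devices in near_layout(raid_disks, copies, raid_disks):
        for d in devices:
            partners[d].update(x for x in devices if x != d)
    return partners


if __name__ == "__main__":
    for n in (4, 3):
        print(n, "devices:", mirror_partners(n, 2))
```

In this model, four devices with two copies pair up as (0, 1) and (2, 3), which is the limited case that resembles raid1+0. With three devices every device mirrors part of both others, so there is no fixed "other half" that could be flagged write-mostly.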
If there are features (other than layouts) of raid10 that make you
prefer it to raid1, it would make sense to ask for those features to
be implemented in raid1.
> sadly i can't write a patch on my own, only point out how useful it
> would be
That's unfortunate. Patches are generally welcome.
Phil
* Re: RAID10 and 'writemostly' support
From: Reindl Harald @ 2017-02-18 23:35 UTC (permalink / raw)
To: linux-raid
Am 18.02.2017 um 23:20 schrieb Phil Turmel:
> On 02/17/2017 05:03 AM, Reindl Harald wrote:
>
>>> Be careful. Don't confuse Raid10 with Raid1+0. They are NOT the
>>> same thing (on linux at least), although they are very similar
>>
>> yeah, i realized that, but i thought the "writemostly" logic might be
>> there too and the docs just not up to date
>
> Linux MD raid10 doesn't have a requirement that the number of devices
> be a multiple of the number of data copies. Which creates "interesting"
> data layouts with odd numbers of devices or similar effects with ,n3 or
> ,f3 layouts. Which makes it difficult if not impossible to designate
> specific devices as write mostly without weird operational asymmetries
> across the assembled array.
but since --write-mostly doesn't get set without manual intervention,
those cases would be unchanged (besides, they are unlikely)
> In other words, it is not at all like raid 1 on top of raid 0, except in
> certain very limited cases, and your assumptions are simply wrong.
>
> If there are features (other than layouts) of raid10 that make you
> prefer it to raid1, it would make sense to ask for those features to
> be implemented in raid1.
writemostly is also very appealing on existing setups; the machine
i am typing from was installed in 2011
RAID1 doesn't have the benefit of doubled performance (also for writes; on
a hybrid RAID slower, but still faster than RAID1) *and* doubled space
compared to a single disk, combined with mirroring
another example: on machines like an HP microserver with only 4 drive
slots you could easily improve read performance, which for many
workloads is the most important part, by just switching half of the disks to SSD
price calculation for a hybrid RAID10 with 10 disks:
5x4 TB SSD = 5 x 1400€ = 7000€
5x4 TB HDD = 5 x 100€ = 500€
total price 7500€ versus 14000€ for flash-only
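The arithmetic above can be checked with a few lines of Python. This is only a sketch using the per-disk prices quoted in the message (1400€ per 4 TB SSD, 100€ per 4 TB HDD); the function and constant names are illustrative.

```python
SSD_PRICE_EUR = 1400  # per 4 TB SSD, figure taken from the message
HDD_PRICE_EUR = 100   # per 4 TB HDD, figure taken from the message


def array_cost(disks, unit_price):
    """Total price of a set of identical disks."""
    return disks * unit_price


# hybrid RAID10 with 10 disks: half SSD, half HDD
hybrid = array_cost(5, SSD_PRICE_EUR) + array_cost(5, HDD_PRICE_EUR)
# the same 10 slots filled with SSDs only
flash_only = array_cost(10, SSD_PRICE_EUR)
savings = flash_only - hybrid

if __name__ == "__main__":
    print(f"hybrid: {hybrid} EUR, flash-only: {flash_only} EUR, "
          f"difference: {savings} EUR")
```

With these prices the hybrid setup costs 7500€ and the flash-only one 14000€, a difference of 6500€.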
>> sadly i can't write a patch on my own, only point out how useful it
>> would be
>
> That's unfortunate. Patches are generally welcome
i would be *seriously* willing to pay any kernel maintainer who takes
it on for the initial patch - Fedora regularly does kernel rebases on GA
versions
* Re: RAID10 and 'writemostly' support
From: Phil Turmel @ 2017-02-19 17:31 UTC (permalink / raw)
To: Reindl Harald, linux-raid
On 02/18/2017 06:35 PM, Reindl Harald wrote:
>
> Am 18.02.2017 um 23:20 schrieb Phil Turmel:
>> If there are features (other than layouts) of raid10 that make you
>> prefer it to raid1, it would make sense to ask for those features to
>> be implemented in raid1.
>
> writemostly is also very appealing on existing setups; the machine
> i am typing from was installed in 2011
>
> RAID1 doesn't have the benefit of doubled performance (also for writes; on
> a hybrid RAID slower, but still faster than RAID1) *and* doubled space
> compared to a single disk, combined with mirroring
Doubled capacity? Vs. raid1? No. Raid10,n2 (,n2 is default) on two
devices yields the same capacity as raid1 on two devices. Unless I'm
misunderstanding your point.
> another example: on machines like an HP microserver with only 4 drive
> slots you could easily improve read performance, which for many
> workloads is the most important part, by just switching half of the disks to SSD
>
> price calculation for a hybrid RAID10 with 10 disks:
> 5x4 TB SSD = 5 x 1400€ = 7000€
> 5x4 TB HDD = 5 x 100€ = 500€
> total price 7500€ versus 14000€ for flash-only
What is preventing you from using the existing raid1 in pairs with
write mostly, then layering raid0 on top of them for the capacity you
are trying to achieve? No new code required. What you are asking for
really is raid1+0, which MD raid allows you to assemble yourself.
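As a rough sketch of that layering, the following Python builds (but does not run) the mdadm command lines for RAID1 pairs with the HDD flagged write-mostly and a RAID0 on top. The device names and the /dev/md numbering are hypothetical; mdadm's --write-mostly applies to the devices listed after it, which is why the SSD comes first. Treat it as an outline to adapt, not a tested procedure.

```python
def raid1_pair_cmd(md_dev, ssd, hdd):
    """argv for a two-device RAID1 where only the HDD is flagged write-mostly."""
    return ["mdadm", "--create", md_dev, "--level=1", "--raid-devices=2",
            ssd, "--write-mostly", hdd]


def raid0_over_cmd(md_dev, members):
    """argv for a RAID0 striped across already-created mirror devices."""
    return ["mdadm", "--create", md_dev, "--level=0",
            f"--raid-devices={len(members)}", *members]


def raid1_plus_0_plan(pairs, top="/dev/md100"):
    """Return the command lines for raid1 pairs plus a raid0 on top.

    `pairs` is a list of (ssd, hdd) tuples; all device names are
    placeholders chosen for this example.
    """
    cmds, mirrors = [], []
    for i, (ssd, hdd) in enumerate(pairs, start=1):
        md = f"/dev/md{i}"
        cmds.append(raid1_pair_cmd(md, ssd, hdd))
        mirrors.append(md)
    cmds.append(raid0_over_cmd(top, mirrors))
    return cmds


if __name__ == "__main__":
    plan = raid1_plus_0_plan([("/dev/sda1", "/dev/sdb1"),
                              ("/dev/sdc1", "/dev/sdd1")])
    for argv in plan:
        print(" ".join(argv))  # printed only, nothing is executed
```

Printing the commands instead of running them keeps the example harmless; creating arrays on real devices destroys existing data on them.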
> i would be *seriously* willing to pay any kernel maintainer who takes
> it on for the initial patch - Fedora regularly does kernel rebases on GA
> versions
Since no new kernel code is needed to achieve what you desire, I doubt
a kernel patch for it would be accepted. (But I'm not a maintainer, so
YMMV.) This is really a user-space question, along the lines of
"should/could mdadm automate creation of dual layers like raid1+0?"
Phil
* Re: RAID10 and 'writemostly' support
From: Reindl Harald @ 2017-02-19 17:48 UTC (permalink / raw)
To: linux-raid
Am 19.02.2017 um 18:31 schrieb Phil Turmel:
> On 02/18/2017 06:35 PM, Reindl Harald wrote:
>>
>> Am 18.02.2017 um 23:20 schrieb Phil Turmel:
>
>>> If there are features (other than layouts) of raid10 that make you
>>> prefer it to raid1, it would make sense to ask for those features to
>>> be implemented in raid1.
>>
>> writemostly is also very appealing on existing setups; the machine
>> i am typing from was installed in 2011
>>
>> RAID1 doesn't have the benefit of doubled performance (also for writes; on
>> a hybrid RAID slower, but still faster than RAID1) *and* doubled space
>> compared to a single disk, combined with mirroring
>
> Doubled capacity? Vs. raid1? No. Raid10,n2 (,n2 is default) on two
> devices yields the same capacity as raid1 on two devices. Unless I'm
> misunderstanding your point.
you are misunderstanding
RAID1: 2x2 TB = 2 TB usable
RAID10: 4x2 TB = 4 TB usable
typically smaller disks are cheaper, and when i installed the 4x2 TB
RAID10, 4 TB disks were not that common and 4 TB SSDs not available at
all (and 2 TB SSDs unaffordable)
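The capacity figures on both sides of this exchange follow from one simple rule, sketched here in Python under the assumption that every block is stored `copies` times and all devices have the same size. The function name is illustrative, and rounding to chunk size and metadata overhead are ignored.

```python
def usable_tb(devices, size_tb, copies=2):
    """Usable capacity of a mirrored array with `copies` copies of each block."""
    return devices * size_tb / copies


# two-device RAID1 (two copies) versus four-device RAID10 with two copies
raid1_2x2 = usable_tb(2, 2)
raid10_4x2 = usable_tb(4, 2)
# the comparison from the quoted reply: raid10,n2 on just two devices
raid10_2x2 = usable_tb(2, 2)

if __name__ == "__main__":
    print(raid1_2x2, raid10_4x2, raid10_2x2)
```

So both statements hold: on the same two devices raid10,n2 gives the same 2 TB as raid1, while the doubling to 4 TB comes from using four devices rather than from the RAID level itself.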
>> another example: on machines like an HP microserver with only 4 drive
>> slots you could easily improve read performance, which for many
>> workloads is the most important part, by just switching half of the disks to SSD
>>
>> price calculation for a hybrid RAID10 with 10 disks:
>> 5x4 TB SSD = 5 x 1400€ = 7000€
>> 5x4 TB HDD = 5 x 100€ = 500€
>> total price 7500€ versus 14000€ for flash-only
>
> What is preventing you from using the existing raid1 in pairs with
> write mostly, then layering raid0 on top of them for the capacity you
> are trying to achieve? No new code required. What you are asking for
> really is raid1+0, which MD raid allows you to assemble yourself.
already existing setups, and the fact that RAID10 is easier to configure
than wrapping 2 RAID1 arrays into a RAID0, especially at initial setup time
when you also have to handle the OS setup itself
/dev/md0  ext4  485M   33M  448M   7%  /boot
/dev/md1  ext4   29G  6,8G   22G  24%  /
/dev/md2  ext4  3,6T  2,3T  1,4T  63%  /mnt/data
md0: RAID1
md1: RAID10
md2: RAID10
it's really no fun to change that existing layout from RAID10 to
RAID0+RAID1
>> i would be *seriously* willing to pay any kernel maintainer who takes
>> it on for the initial patch - Fedora regularly does kernel rebases on GA
>> versions
>
> Since no new kernel code is needed to achieve what you desire, I doubt
> a kernel patch for it would be accepted. (But I'm not a maintainer, so
> YMMV.) This is really a user-space question, along the lines of
> "should/could mdadm automate creation of dual layers like raid1+0?"
at the very least, "mdadm" in its current state should just refuse
"--write-mostly" when the array is a RAID10 - in that case i would have
known from testing it, based on http://www.tansi.org/hybrid/, in a virtual
machine that it *really* doesn't work with RAID10
obviously code is needed to achieve "writemostly" on the most
common setup of 4 disks in a RAID10, where you later replace half
of the disks with SSDs and use the remaining HDDs for writes only
there are so many workloads where read performance is more important
(boot, start of large applications, starting virtual machines, rsync of large
data...)