* RAID1 round robin read support in md?
@ 2011-12-04 23:31 Borg Onion
[not found] ` <CALFpzo48C2-HHgw=iOefdHaAUCwEJfjoiXjq=Ngv62a-0yEpcQ@mail.gmail.com>
2011-12-05 5:33 ` NeilBrown
0 siblings, 2 replies; 6+ messages in thread
From: Borg Onion @ 2011-12-04 23:31 UTC (permalink / raw)
To: linux-raid
Hello,
Is it possible to implement round-robin read support for raid1.c in
md? I sure would appreciate the extra speed given my RAID1x3SSD
setup. A possible algorithm might go like this:
1) Given the RAID1 drives are enumerated 0,1,2
2) For each read request, call an atomic increment
3) mod the returned value with # of drives (3)
4) Use the modulus result as the drive # to read from for this operation
Part of the implementation might look like this:
#include <stdint.h>

uint32_t get_next_read_drive(uint32_t num_drives)
{
	/* GCC atomic builtin: one shared counter, no explicit lock needed */
	static uint32_t op_count;

	return __sync_fetch_and_add(&op_count, 1) % num_drives;
}
I'm assuming the md code is multi-threaded. This particular
implementation won't work on all platforms; on some, explicit locking
would be needed instead of the atomic builtin. The num_drives value
and the drive enumeration are assumed to exclude write-mostly
devices. Temporary balancing-fairness errors while num_drives changes
(drives being added or removed) should be innocuous, since these are
just read ops on mirrored data.
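In kernel terms, the same idea might look roughly like the sketch
below. atomic_inc_return() is a real kernel primitive, but the helper
itself and how it would be wired into raid1.c's read selection are
assumptions on my part, not existing md code:

#include <linux/atomic.h>

/*
 * Hypothetical helper, not current raid1.c code: pick the next mirror
 * in round-robin order.  num_drives is assumed to already exclude
 * write-mostly devices.
 */
static atomic_t rr_op_count = ATOMIC_INIT(0);

static unsigned int get_next_read_drive(unsigned int num_drives)
{
	/* atomic_inc_return() is SMP-safe, so no extra locking is needed */
	return (unsigned int)atomic_inc_return(&rr_op_count) % num_drives;
}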
Thoughts?
--
Borg Onion
* Re: RAID1 round robin read support in md?
[not found] ` <CAAs5ODv01LE6HRZjteYYsaMF5stTNv-8pdK9JDP18u4QeNGUOQ@mail.gmail.com>
@ 2011-12-05 3:13 ` Marcus Sorensen
2011-12-05 5:21 ` Borg Onion
0 siblings, 1 reply; 6+ messages in thread
From: Marcus Sorensen @ 2011-12-05 3:13 UTC (permalink / raw)
To: Borg Onion; +Cc: linux-raid
Keep forgetting to reply all on these.
Here is my dd test, split ~50/50. Again, one thread will leave half of
each disk idle, as it makes requests serially:
[root@server ~]# dd if=/dev/md0 of=/dev/null & iostat -xk 1 /dev/sda /dev/sdb
[1] 5293
Linux 2.6.32-71.29.1.el6.x86_64 (server) 12/04/2011 _x86_64_ (4 CPU)
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.65    0.00    6.67    0.03    0.00   90.65

Device:  rrqm/s  wrqm/s    r/s   w/s    rkB/s   wkB/s avgrq-sz avgqu-sz  await  svctm  %util
sdb        0.44    0.44   4.24  9.50   172.99   56.46    33.39     0.07   4.78   2.41   3.31
sda        0.43    0.44   4.26  9.50   173.30   56.46    33.39     0.07   4.81   2.39   3.29

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           6.42    0.00   21.48   22.96    0.00   49.14

Device:  rrqm/s  wrqm/s    r/s   w/s    rkB/s   wkB/s avgrq-sz avgqu-sz  await  svctm  %util
sdb     8776.00    0.00 292.00 10.00 36320.00   83.00   241.08     1.07   3.56   1.85  55.90
sda     8080.00    2.00 269.00  0.00 33272.00    0.00   247.38     0.98   3.40   1.81  48.80

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           6.90    0.00   21.67   22.91    0.00   48.52

Device:  rrqm/s  wrqm/s    r/s   w/s    rkB/s   wkB/s avgrq-sz avgqu-sz  await  svctm  %util
sdb     6903.00    0.00 231.00 40.00 28288.00   47.00   209.11     2.01   7.41   2.21  59.80
sda     7264.00    1.00 242.00 54.00 30148.00  131.50   204.59     1.64   5.76   1.77  52.50
On Sun, Dec 4, 2011 at 6:42 PM, Borg Onion <borg.onion@gmail.com> wrote:
> According to what sar is showing, it only reads from ONE drive (dd
> if=... is my test tool). md IO performance should not depend on
> multi-threaded clients; a large read request can be broken down
> internally to take advantage of round robin.
>
> --Bart
>
>
> On 4 December 2011 17:11, Marcus Sorensen <shadowsor@gmail.com> wrote:
>> Is there a reason to do round robin when it already does simultaneous reads
>> from all member devices? Note this is only obvious with a multithreaded
>> benchmark.
>>
>> On Dec 4, 2011 4:31 PM, "Borg Onion" <borg.onion@gmail.com> wrote:
>>>
>>> Hello,
>>>
>>> Is it possible to implement round-robin read support for raid1.c in
>>> md? I sure would appreciate the extra speed given my RAID1x3SSD
>>> setup. A possible algorithm might go like this:
>>>
>>> 1) Given the RAID1 drives are enumerated 0,1,2
>>> 2) For each read request, call an atomic increment
>>> 3) mod the returned value with # of drives (3)
>>> 4) Use the modulus result as the drive # to read from for this operation
>>>
>>> Part of implementation might look like this:
>>>
>>> uint32_t get_next_read_drive(uint32_t num_drives)
>>> {
>>> static uint32_t op_count;
>>>
>>> return __sync_fetch_and_add(&op_count, 1) % num_drives;
>>> }
>>>
>>> I'm making the assumption that the md code is multi-threaded. This
>>> implementation won't work on all platforms, others might have to do
>>> some locking. The num_drives and drive list enumeration is assumed to
>>> exclude write-mostly devices. Temporary balancing fairness errors
>>> from the num_drives variable changing value (adding or removing of
>>> drives) should be innocuous given these are just read ops on mirrored
>>> data.
>>>
>>> Thoughts?
>>>
>>> --
>>> Borg Onion
>
>
>
> --
> --Borg Onion
* Re: RAID1 round robin read support in md?
2011-12-05 3:13 ` Marcus Sorensen
@ 2011-12-05 5:21 ` Borg Onion
2011-12-05 5:53 ` Doug Dumitru
0 siblings, 1 reply; 6+ messages in thread
From: Borg Onion @ 2011-12-05 5:21 UTC (permalink / raw)
To: Marcus Sorensen; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 37381 bytes --]
On 4 December 2011 19:13, Marcus Sorensen <shadowsor@gmail.com> wrote:
> Keep forgetting to reply all on these.
>
> Here is my dd test, split ~50/50. Again, one thread will leave half of
> each disk idle, as it makes requests serially:
>
> [root@server ~]# dd if=/dev/md0 of=/dev/null & iostat -xk 1 /dev/sda /dev/sdb
> [1] 5293
> Linux 2.6.32-71.29.1.el6.x86_64 (server) 12/04/2011 _x86_64_ (4 CPU)
>
> avg-cpu: %user %nice %system %iowait %steal %idle
> 2.65 0.00 6.67 0.03 0.00 90.65
>
> Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
> avgrq-sz avgqu-sz await svctm %util
> sdb 0.44 0.44 4.24 9.50 172.99 56.46
> 33.39 0.07 4.78 2.41 3.31
> sda 0.43 0.44 4.26 9.50 173.30 56.46
> 33.39 0.07 4.81 2.39 3.29
>
> avg-cpu: %user %nice %system %iowait %steal %idle
> 6.42 0.00 21.48 22.96 0.00 49.14
>
> Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
> avgrq-sz avgqu-sz await svctm %util
> sdb 8776.00 0.00 292.00 10.00 36320.00 83.00
> 241.08 1.07 3.56 1.85 55.90
> sda 8080.00 2.00 269.00 0.00 33272.00 0.00
> 247.38 0.98 3.40 1.81 48.80
>
> avg-cpu: %user %nice %system %iowait %steal %idle
> 6.90 0.00 21.67 22.91 0.00 48.52
>
> Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
> avgrq-sz avgqu-sz await svctm %util
> sdb 6903.00 0.00 231.00 40.00 28288.00 47.00
> 209.11 2.01 7.41 2.21 59.80
> sda 7264.00 1.00 242.00 54.00 30148.00 131.50
> 204.59 1.64 5.76 1.77 52.50
OK, it seems to be sporadic. There are long periods of time when one
drive is idle. Pardon the long trace, but you'll see how the load
shifts to just one drive many times. I've produced a graph that is a
lot easier to read, here: http://i.imgur.com/XfYhY.gif . It's also
attached, if the list doesn't strip it. I'm not sure how to check for
write-mostly; I think the 0.90 superblock doesn't even support that?
Could that be part of the problem?
server # mdadm --detail /dev/md127 && dd if=/dev/md127 of=/dev/null bs=128k & iostat -xk 1 /dev/md127 /dev/sdy /dev/sdz
[1] 26208
Linux 3.1.4-gentoo (server) 12/04/11 _x86_64_ (24 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
1.06 0.00 0.27 0.12 0.00 98.54
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 293.37 57.63 14.33 7.15 1385.16 259.74
153.18 0.06 2.87 1.02 6.59 0.42 0.91
sdz 32.46 57.61 6.29 7.17 299.74 259.74
83.15 0.04 3.29 0.57 5.68 0.24 0.33
md127 0.00 0.00 346.36 64.43 1684.66 257.72
9.46 0.00 0.00 0.00 0.00 0.00 0.00
/dev/md127:
Version : 0.90
Creation Time : Mon Nov 28 20:36:04 2011
Raid Level : raid1
Array Size : 232286272 (221.53 GiB 237.86 GB)
Used Dev Size : 232286272 (221.53 GiB 237.86 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 127
Persistence : Superblock is persistent
Update Time : Sun Dec 4 20:41:34 2011
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
UUID : 57f9a386:d4e46e7a:78a9b883:5f5f67d5
Events : 0.3971
Number Major Minor RaidDevice State
0 65 130 0 active sync /dev/sdy2
1 65 146 1 active sync /dev/sdz2
avg-cpu: %user %nice %system %iowait %steal %idle
0.51 0.00 10.06 5.83 0.00 83.59
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 28285.00 0.00 976.00 2.00 117316.00 8.00
239.93 1.01 1.03 1.02 5.00 0.58 57.00
sdz 23379.00 0.00 958.00 2.00 97264.00 8.00
202.65 0.73 0.76 0.75 5.00 0.47 45.00
md127 0.00 0.00 53644.00 0.00 214576.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.46 0.00 5.06 4.14 0.00 90.34
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 24623.00 0.00 841.00 0.00 101664.00 0.00
241.77 0.83 0.99 0.99 0.00 0.55 46.00
sdz 29971.00 0.00 1010.00 0.00 124200.00 0.00
245.94 0.97 0.96 0.96 0.00 0.52 53.00
md127 0.00 0.00 56466.00 0.00 225864.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.77 0.00 5.83 3.97 0.00 89.43
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 9069.00 0.00 333.00 0.00 37948.00 0.00
227.92 0.30 0.90 0.90 0.00 0.51 17.00
sdz 52843.00 0.00 1743.00 0.00 218100.00 0.00
250.26 1.50 0.86 0.86 0.00 0.48 83.00
md127 0.00 0.00 64012.00 0.00 256048.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.16 0.00 3.50 2.13 0.00 94.21
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 7412.00 0.00 289.00 0.00 30896.00 0.00
213.81 0.20 0.69 0.69 0.00 0.42 12.00
sdz 55156.00 0.00 1853.00 0.00 228288.00 0.00
246.40 1.64 0.89 0.89 0.00 0.48 89.00
md127 0.00 0.00 64765.00 0.00 259060.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.21 0.00 5.07 3.44 0.00 91.29
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 25766.00 1060.00 869.00 56.00 106628.00 4464.00
240.20 1.41 1.52 0.92 10.89 0.51 47.00
sdz 30453.00 1067.00 1064.00 47.00 126312.00 4464.00
235.42 2.29 2.06 0.93 27.66 0.51 57.00
md127 0.00 0.00 58234.00 1114.00 232936.00 4456.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.70 0.00 7.20 5.53 0.00 86.57
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 15413.00 0.00 524.00 0.00 63812.00 0.00
243.56 0.43 0.82 0.82 0.00 0.42 22.00
sdz 43092.00 0.00 1424.00 0.00 178128.00 0.00
250.18 1.47 1.03 1.03 0.00 0.55 78.00
md127 0.00 0.00 60517.00 0.00 242068.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.21 0.00 1.25 6.51 0.00 92.03
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 35370.00 196.00 1146.00 9.00 146068.00 824.00
254.36 1.24 1.07 1.08 0.00 0.57 66.00
sdz 19614.00 194.00 681.00 12.00 81256.00 824.00
236.88 0.61 0.88 0.90 0.00 0.48 33.00
md127 0.00 0.00 56799.00 204.00 227196.00 816.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.39 0.00 4.13 11.59 0.00 83.89
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 41724.00 0.00 1439.00 0.00 172532.00 0.00
239.79 1.38 0.96 0.96 0.00 0.51 74.00
sdz 20375.00 0.00 692.00 0.00 84440.00 0.00
244.05 0.50 0.72 0.72 0.00 0.39 27.00
md127 0.00 0.00 64275.00 0.00 257100.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.86 0.00 2.97 8.82 0.00 87.35
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 22819.00 0.00 783.00 0.00 94756.00 0.00
242.03 0.64 0.82 0.82 0.00 0.42 33.00
sdz 48043.00 0.00 1603.00 0.00 198664.00 0.00
247.87 1.39 0.87 0.87 0.00 0.42 68.00
md127 0.00 0.00 73291.00 0.00 293164.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.21 0.00 1.41 3.85 0.00 94.54
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 38957.00 0.00 1289.00 0.00 160908.00 0.00
249.66 1.39 1.08 1.08 0.00 0.61 78.00
sdz 15029.00 0.00 487.00 0.00 62064.00 0.00
254.88 0.37 0.76 0.76 0.00 0.45 22.00
md127 0.00 0.00 55783.00 0.00 223132.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.37 0.00 4.70 8.00 0.00 86.93
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 38457.00 614.00 1303.00 40.00 159248.00 2620.00
241.05 1.11 0.83 0.72 4.25 0.42 56.00
sdz 28957.00 613.00 961.00 42.00 119500.00 2620.00
243.51 0.99 1.00 0.80 5.48 0.44 44.00
md127 0.00 0.00 69711.00 653.00 278844.00 2612.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.81 0.00 5.08 9.52 0.00 84.59
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 6050.00 0.00 202.00 0.00 25008.00 0.00
247.60 0.18 0.89 0.89 0.00 0.45 9.00
sdz 77234.00 0.00 2542.00 0.00 319236.00 0.00
251.17 1.57 0.62 0.62 0.00 0.36 91.00
md127 0.00 0.00 86061.00 0.00 344244.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.71 0.00 4.35 11.83 0.00 83.11
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 35507.00 0.00 1250.00 0.00 147148.00 0.00
235.44 0.78 0.62 0.62 0.00 0.37 46.00
sdz 40901.00 0.00 1364.00 0.00 169404.00 0.00
248.39 1.01 0.74 0.74 0.00 0.40 54.00
md127 0.00 0.00 79138.00 0.00 316552.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.23 0.00 3.12 6.28 0.00 90.38
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 50797.00 0.00 1779.00 0.00 210564.00 0.00
236.72 1.27 0.71 0.71 0.00 0.37 65.00
sdz 25779.00 0.00 923.00 0.00 106968.00 0.00
231.78 0.61 0.66 0.66 0.00 0.38 35.00
md127 0.00 0.00 79383.00 0.00 317532.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.87 0.00 2.48 10.73 0.00 85.93
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 17467.00 0.00 673.00 0.00 72860.00 0.00
216.52 0.46 0.68 0.68 0.00 0.37 25.00
sdz 45804.00 0.00 1606.00 0.00 189584.00 0.00
236.09 1.39 0.87 0.87 0.00 0.47 75.00
md127 0.00 0.00 65611.00 0.00 262444.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.15 0.00 1.84 5.00 0.00 93.02
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 24057.00 0.00 822.00 0.00 99464.00 0.00
242.00 0.67 0.82 0.82 0.00 0.46 38.00
sdz 33170.00 0.00 1106.00 0.00 137364.00 0.00
248.40 1.10 0.99 0.99 0.00 0.56 62.00
md127 0.00 0.00 59219.00 0.00 236872.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.06 0.00 1.55 5.54 0.00 92.85
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 25155.00 1076.00 903.00 42.00 104600.00 4472.00
230.84 1.36 1.44 1.09 9.05 0.60 57.00
sdz 28702.00 1078.00 978.00 40.00 118588.00 4472.00
241.77 1.22 1.20 0.82 10.50 0.45 46.00
md127 0.00 0.00 55785.00 1116.00 223144.00 4464.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
1.11 0.00 3.40 12.15 0.00 83.34
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 12051.00 0.00 408.00 0.00 49636.00 0.00
243.31 0.33 0.81 0.81 0.00 0.44 18.00
sdz 60427.00 0.00 1987.00 0.00 249988.00 0.00
251.62 1.53 0.77 0.77 0.00 0.41 82.00
md127 0.00 0.00 74910.00 0.00 299636.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.07 0.00 3.06 6.11 0.00 90.76
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 28577.00 0.00 960.00 0.00 118220.00 0.00
246.29 0.75 0.78 0.78 0.00 0.42 40.00
sdz 44592.00 0.00 1471.00 0.00 184296.00 0.00
250.57 1.04 0.71 0.71 0.00 0.41 61.00
md127 0.00 0.00 75625.00 0.00 302504.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.46 0.00 4.76 9.64 0.00 85.13
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 57918.00 0.00 1883.00 0.00 239452.00 0.00
254.33 1.48 0.79 0.79 0.00 0.45 84.00
sdz 11124.00 0.00 428.00 0.00 46084.00 0.00
215.35 0.29 0.68 0.68 0.00 0.37 16.00
md127 0.00 0.00 71384.00 0.00 285536.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.55 0.00 3.01 6.02 0.00 90.42
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 64465.00 0.00 2081.00 0.00 266060.00 0.00
255.70 1.68 0.81 0.81 0.00 0.47 97.00
sdz 2545.00 0.00 88.00 0.00 10780.00 0.00
245.00 0.06 0.68 0.68 0.00 0.34 3.00
md127 0.00 0.00 69210.00 0.00 276840.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.24 0.00 1.99 4.70 0.00 93.07
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 26655.00 0.00 945.00 0.00 110552.00 0.00
233.97 0.66 0.70 0.70 0.00 0.39 37.00
sdz 43003.00 0.00 1410.00 0.00 177652.00 0.00
251.99 1.11 0.79 0.79 0.00 0.45 63.00
md127 0.00 0.00 72051.00 0.00 288204.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.44 0.00 3.61 11.48 0.00 84.46
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 36943.00 1069.00 1192.00 60.00 152544.00 4516.00
250.89 2.02 1.61 1.21 9.67 0.65 82.00
sdz 15378.00 1068.00 498.00 61.00 63504.00 4516.00
243.36 1.14 2.04 0.60 13.77 0.45 25.00
md127 0.00 0.00 54012.00 1127.00 216048.00 4508.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.93 0.00 6.51 7.90 0.00 84.66
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 37451.00 0.00 1225.00 0.00 154872.00 0.00
252.85 0.93 0.76 0.76 0.00 0.44 54.00
sdz 39693.00 0.00 1311.00 0.00 163848.00 0.00
249.96 0.88 0.67 0.67 0.00 0.36 47.00
md127 0.00 0.00 79680.00 0.00 318720.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.17 0.00 4.52 6.45 0.00 88.87
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdz 76542.00 0.00 2487.00 0.00 316288.00 0.00
254.35 1.78 0.72 0.72 0.00 0.40 100.00
md127 0.00 0.00 79040.00 0.00 316160.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.16 0.00 3.09 4.93 0.00 91.82
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 1779.00 0.00 66.00 0.00 7132.00 0.00
216.12 0.05 0.76 0.76 0.00 0.45 3.00
sdz 68544.00 0.00 2234.00 0.00 283296.00 0.00
253.62 1.68 0.76 0.76 0.00 0.43 97.00
md127 0.00 0.00 72639.00 0.00 290556.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.37 0.00 3.54 5.49 0.00 90.60
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 71774.00 477.00 2327.00 35.00 296432.00 2048.00
252.73 1.90 0.80 0.75 4.57 0.42 100.00
sdz 1.00 475.00 3.00 37.00 16.00 2048.00
103.20 0.07 1.75 0.00 1.89 0.25 1.00
md127 0.00 0.00 74112.00 510.00 296448.00 2040.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.29 0.00 2.15 5.42 0.00 92.14
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 71460.00 579.00 2317.00 18.00 295152.00 2388.00
254.85 2.84 1.22 0.77 58.89 0.43 100.00
sdz 1.00 579.00 3.00 18.00 16.00 2388.00
228.95 0.09 4.29 0.00 5.00 0.48 1.00
md127 0.00 0.00 73792.00 595.00 295168.00 2380.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.21 0.00 3.72 5.19 0.00 90.88
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 81107.00 0.00 2646.00 0.00 335096.00 0.00
253.28 1.66 0.63 0.63 0.00 0.38 100.00
sdz 0.00 0.00 2.00 0.00 8.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
md127 0.00 0.00 83776.00 0.00 335104.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.78 0.00 4.13 9.21 0.00 85.87
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 49380.00 0.00 1609.00 0.00 204044.00 0.00
253.63 1.19 0.74 0.74 0.00 0.44 70.00
sdz 22655.00 0.00 732.00 0.00 93548.00 0.00
255.60 0.56 0.77 0.77 0.00 0.41 30.00
md127 0.00 0.00 74389.00 0.00 297556.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.92 0.00 3.18 11.37 0.00 84.54
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 72799.00 0.00 2362.00 0.00 300760.00 0.00
254.67 1.98 0.84 0.84 0.00 0.42 100.00
sdz 0.00 0.00 1.00 0.00 4.00 0.00
8.00 0.01 10.00 10.00 0.00 10.00 1.00
md127 0.00 0.00 75200.00 0.00 300800.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.28 0.00 3.53 6.13 0.00 90.06
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 75862.00 1.00 2522.00 68.00 313668.00 276.00
242.43 1.74 0.68 0.69 0.00 0.39 100.00
sdz 0.00 1.00 2.00 68.00 8.00 276.00
8.11 0.00 0.00 0.00 0.00 0.00 0.00
md127 0.00 0.00 78419.00 67.00 313676.00 268.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.80 0.00 3.20 9.16 0.00 86.84
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 62589.00 1061.00 2061.00 87.00 258684.00 4600.00
245.14 2.36 1.10 0.73 9.77 0.40 86.00
sdz 10630.00 1061.00 371.00 87.00 44112.00 4600.00
212.72 1.34 2.93 1.51 8.97 0.41 19.00
md127 0.00 0.00 75696.00 1148.00 302784.00 4592.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.32 0.00 2.40 7.51 0.00 89.77
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 41672.00 0.00 1365.00 0.00 172300.00 0.00
252.45 1.02 0.75 0.75 0.00 0.41 56.00
sdz 37268.00 0.00 1252.00 0.00 154212.00 0.00
246.35 0.80 0.64 0.64 0.00 0.36 45.00
md127 0.00 0.00 81631.00 0.00 326524.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.21 0.00 4.73 8.36 0.00 86.70
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 74771.00 0.00 2507.00 0.00 309352.00 0.00
246.79 1.72 0.69 0.69 0.00 0.40 100.00
sdz 0.00 0.00 6.00 0.00 24.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
md127 0.00 0.00 77344.00 0.00 309376.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.72 0.00 4.46 7.52 0.00 87.30
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 78441.00 0.00 2561.00 0.00 323960.00 0.00
252.99 1.70 0.66 0.66 0.00 0.39 99.00
sdz 0.00 0.00 2.00 0.00 8.00 0.00
8.00 0.01 5.00 5.00 0.00 5.00 1.00
md127 0.00 0.00 80992.00 0.00 323968.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.23 0.00 3.45 5.95 0.00 90.36
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 77922.00 0.00 2569.00 0.00 322244.00 0.00
250.87 1.72 0.67 0.67 0.00 0.39 100.00
sdz 1.00 0.00 12.00 0.00 60.00 0.00
10.00 0.00 0.00 0.00 0.00 0.00 0.00
md127 0.00 0.00 80576.00 0.00 322304.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.29 0.00 4.05 4.93 0.00 90.73
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 76787.00 0.00 2478.00 0.00 317056.00 0.00
255.90 1.68 0.68 0.68 0.00 0.40 100.00
sdz 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00
md127 0.00 0.00 79264.00 0.00 317056.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.64 0.00 4.56 5.72 0.00 89.08
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 75114.00 1060.00 2422.00 57.00 310144.00 4468.00
253.82 2.11 0.85 0.67 8.60 0.40 100.00
sdz 0.00 1058.00 0.00 59.00 0.00 4468.00
151.46 0.53 8.98 0.00 8.98 0.51 3.00
md127 0.00 0.00 77536.00 1115.00 310144.00 4460.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.27 0.00 2.89 5.27 0.00 91.58
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 75165.00 0.00 2444.00 0.00 310364.00 0.00
253.98 1.69 0.69 0.69 0.00 0.41 99.00
sdz 4.00 0.00 5.00 0.00 36.00 0.00
14.40 0.00 0.00 0.00 0.00 0.00 0.00
md127 0.00 0.00 77600.00 0.00 310400.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.12 0.00 3.05 4.79 0.00 92.04
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 75971.00 0.00 2485.00 0.00 313972.00 0.00
252.69 1.70 0.68 0.68 0.00 0.40 99.00
sdz 1.00 0.00 2.00 0.00 12.00 0.00
12.00 0.00 0.00 0.00 0.00 0.00 0.00
md127 0.00 0.00 78496.00 0.00 313984.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.91 0.00 3.42 10.65 0.00 85.02
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 75028.00 481.00 2464.00 13.00 310096.00 1976.00
251.98 1.88 0.76 0.74 4.62 0.40 100.00
sdz 0.00 481.00 2.00 13.00 8.00 1976.00
264.53 0.02 1.33 0.00 1.54 0.67 1.00
md127 0.00 0.00 77526.00 492.00 310104.00 1968.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.31 0.00 2.41 8.45 0.00 88.84
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 77012.00 0.00 2537.00 0.00 318204.00 0.00
250.85 1.92 0.76 0.76 0.00 0.39 100.00
sdz 0.00 0.00 1.00 0.00 4.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
md127 0.00 0.00 79552.00 0.00 318208.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.48 0.00 5.38 9.56 0.00 84.58
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 73600.00 596.00 2466.00 43.00 304464.00 2564.00
244.74 1.91 0.76 0.74 2.09 0.40 100.00
sdz 262.00 594.00 12.00 46.00 1096.00 2564.00
126.21 0.04 0.69 0.00 0.87 0.17 1.00
md127 0.00 0.00 76390.00 639.00 305560.00 2556.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.92 0.00 4.39 9.15 0.00 85.54
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 74577.00 0.00 2475.00 0.00 308360.00 0.00
249.18 1.73 0.70 0.70 0.00 0.40 100.00
sdz 3.00 0.00 11.00 0.00 68.00 0.00
12.36 0.00 0.00 0.00 0.00 0.00 0.00
md127 0.00 0.00 77107.00 0.00 308428.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.34 0.00 2.28 6.92 0.00 90.46
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 70017.00 0.00 2316.00 0.00 289496.00 0.00
250.00 1.87 0.81 0.81 0.00 0.43 100.00
sdz 2.00 0.00 8.00 0.00 40.00 0.00
10.00 0.07 8.75 8.75 0.00 1.25 1.00
md127 0.00 0.00 72384.00 0.00 289536.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.09 0.00 3.15 8.33 0.00 88.43
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 64289.00 0.00 2129.00 3.00 266028.00 12.00
249.57 1.64 0.77 0.77 0.00 0.43 91.00
sdz 5375.00 0.00 175.00 3.00 22080.00 12.00
248.22 0.14 0.79 0.80 0.00 0.51 9.00
md127 0.00 0.00 71995.00 1.00 287980.00 4.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.76 0.00 3.94 11.82 0.00 83.47
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 34027.00 0.00 1102.00 0.00 140392.00 0.00
254.79 0.89 0.81 0.81 0.00 0.46 51.00
sdz 40199.00 0.00 1318.00 0.00 166244.00 0.00
252.27 0.94 0.71 0.71 0.00 0.39 51.00
md127 0.00 0.00 76691.00 0.00 306764.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.37 0.00 4.12 10.92 0.00 84.58
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 63785.00 1081.00 2134.00 25.00 263672.00 4432.00
248.36 2.53 1.17 1.05 11.20 0.46 99.00
sdz 49.00 1080.00 6.00 27.00 220.00 4432.00
281.94 0.25 7.58 1.67 8.89 0.91 3.00
md127 0.00 0.00 65973.00 1106.00 263892.00 4424.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.28 0.00 2.44 5.93 0.00 91.35
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 22142.00 0.00 762.00 0.00 91992.00 0.00
241.45 0.43 0.56 0.56 0.00 0.35 27.00
sdz 45677.00 0.00 1531.00 0.00 188784.00 0.00
246.62 1.34 0.88 0.88 0.00 0.48 73.00
md127 0.00 0.00 70194.00 0.00 280776.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.43 0.00 2.70 5.72 0.00 91.16
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 48812.00 0.00 1679.00 0.00 202004.00 0.00
240.62 1.24 0.74 0.74 0.00 0.42 70.00
sdz 23931.00 0.00 834.00 0.00 99352.00 0.00
238.25 0.52 0.62 0.62 0.00 0.37 31.00
md127 0.00 0.00 75339.00 0.00 301356.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.30 0.00 2.54 5.06 0.00 92.10
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 58871.00 0.00 1964.00 0.00 243464.00 0.00
247.93 1.33 0.68 0.68 0.00 0.39 77.00
sdz 19190.00 0.00 651.00 0.00 79448.00 0.00
244.08 0.43 0.66 0.66 0.00 0.37 24.00
md127 0.00 0.00 80728.00 0.00 322912.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.17 0.00 1.81 5.74 0.00 92.28
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 47404.00 0.00 1562.00 0.00 196172.00 0.00
251.18 1.67 1.07 1.07 0.00 0.59 92.00
sdz 3889.00 0.00 153.00 0.00 15976.00 0.00
208.84 0.16 1.05 1.05 0.00 0.59 9.00
md127 0.00 0.00 53037.00 0.00 212148.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.85 0.00 4.49 8.08 0.00 86.58
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 9363.00 1053.00 304.00 58.00 38668.00 4444.00
238.19 0.68 1.88 0.89 7.07 0.47 17.00
sdz 62479.00 1060.00 2016.00 48.00 258008.00 4444.00
254.31 2.44 1.18 0.70 21.25 0.41 85.00
md127 0.00 0.00 74169.00 1109.00 296676.00 4436.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.42 0.00 5.22 9.73 0.00 84.63
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sdy 20569.00 0.00 676.00 0.00 84772.00 0.00
250.80 0.38 0.56 0.56 0.00 0.37 25.00
sdz 64949.00 0.00 2098.00 0.00 268408.00 0.00
255.87 1.34 0.64 0.64 0.00 0.36 75.00
md127 0.00 0.00 88293.00 0.00 353180.00 0.00
8.00 0.00 0.00 0.00 0.00 0.00 0.00
^C
server #
--Borg Onion
[-- Attachment #2: raid1.gif --]
[-- Type: image/gif, Size: 20618 bytes --]
* Re: RAID1 round robin read support in md?
2011-12-04 23:31 RAID1 round robin read support in md? Borg Onion
[not found] ` <CALFpzo48C2-HHgw=iOefdHaAUCwEJfjoiXjq=Ngv62a-0yEpcQ@mail.gmail.com>
@ 2011-12-05 5:33 ` NeilBrown
1 sibling, 0 replies; 6+ messages in thread
From: NeilBrown @ 2011-12-05 5:33 UTC (permalink / raw)
To: Borg Onion; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 1402 bytes --]
On Sun, 4 Dec 2011 15:31:30 -0800 Borg Onion <borg.onion@gmail.com> wrote:
> Hello,
>
> Is it possible to implement round-robin read support for raid1.c in
> md? I sure would appreciate the extra speed given my RAID1x3SSD
> setup. A possible algorithm might go like this:
>
> 1) Given the RAID1 drives are enumerated 0,1,2
> 2) For each read request, call an atomic increment
> 3) mod the returned value with # of drives (3)
> 4) Use the modulus result as the drive # to read from for this operation
>
> Part of implementation might look like this:
>
> uint32_t get_next_read_drive(uint32_t num_drives)
> {
> static uint32_t op_count;
>
> return __sync_fetch_and_add(&op_count, 1) % num_drives;
> }
>
> I'm making the assumption that the md code is multi-threaded. This
> implementation won't work on all platforms, others might have to do
> some locking. The num_drives and drive list enumeration is assumed to
> exclude write-mostly devices. Temporary balancing fairness errors
> from the num_drives variable changing value (adding or removing of
> drives) should be innocuous given these are just read ops on mirrored
> data.
>
> Thoughts?
I would suggest using RAID10 in 'far' mode instead.
Then reads are distributed across all your drives just like with RAID0, and
you get to choose the chunk size for distribution.
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
* Re: RAID1 round robin read support in md?
2011-12-05 5:21 ` Borg Onion
@ 2011-12-05 5:53 ` Doug Dumitru
2011-12-05 12:28 ` Roberto Spadim
0 siblings, 1 reply; 6+ messages in thread
From: Doug Dumitru @ 2011-12-05 5:53 UTC (permalink / raw)
To: Borg Onion; +Cc: Marcus Sorensen, linux-raid
What you are seeing is very SSD specific.
With rotating media, it is very important to intentionally stay on one
disk even if it leaves other mirrors quiet. Rotating disks do "in the
drive" read-ahead and take advantage of the heads being on the correct
track, so streaming straight-line reads are efficient.
With SSDs in an array, things are very different. Drives don't really
read ahead at all (actually they do, but this is more of a side effect
of error correction than of performance tuning, and the lengths are
short). If your application is spitting out 4MB read requests, they
get cut into 512K (1024-sector) bio calls and sent to a single drive
if they are linear. Because the code is optimized for HDDs, subsequent
linear calls go to the same drive, since an HDD is very likely to have
at least some of the requested sectors in its read-ahead cache.
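Roughly, that HDD-oriented balancing heuristic amounts to something
like the sketch below. This is only an illustration of the idea, not
the actual read_balance() code in raid1.c; the structure and names
here are made up:

/* Hypothetical per-mirror state: the sector just past the last read. */
struct mirror_state {
	unsigned long long head_position;
};

/*
 * Sketch of an HDD-friendly balancer: prefer the mirror whose head is
 * already at (or nearest to) the requested sector, so sequential reads
 * keep hitting the same drive and benefit from its read-ahead cache.
 */
static int pick_hdd_friendly_mirror(const struct mirror_state *mirrors,
                                    int num_mirrors,
                                    unsigned long long this_sector)
{
	int i, best = 0;
	unsigned long long best_dist = ~0ULL;

	for (i = 0; i < num_mirrors; i++) {
		unsigned long long dist =
			mirrors[i].head_position > this_sector ?
			mirrors[i].head_position - this_sector :
			this_sector - mirrors[i].head_position;
		if (dist == 0)
			return i;	/* exact sequential continuation */
		if (dist < best_dist) {
			best_dist = dist;
			best = i;
		}
	}
	return best;
}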
A different algorithm for SSDs would be better, but one concern is
that this might slow down short read requests in a multi-threaded
environment. Actually managing a mix intelligently is probably best
started with a Google literature search for SSD scheduling papers. I
suspect that UCSD's super-computing department might have done some
work in this area.
With the same data available from two drives, for low thread count
applications, it might be better to actually cut up the inbound
requests into even smaller chunks, and send them in parallel to the
drives. A quick test on a Crucial C300 shows the following transfer
rates at different block sizes.
512K 319 MB/sec
256K 299 MB/sec
128K 298 MB/sec
64K 287 MB/sec
32K 275 MB/sec
This is with a single 'dd' process and 'iflag=direct' bypassing linux
read-ahead and buffer caching. The test was only a second long or so,
so the noise could be quite high. Also, C300s may behave very
differently with this workload than other drives, so you have to test
each type of disk.
What this implies is that if the md raid-1 layer "were to be" SSD
aware, it should consider cutting up long requests and keeping all
drives busy. The logic would be something like:
* If any request is >= 32K, split it into 'n' parts and issue them
in parallel (a rough sketch of this follows below).
This would be best implemented "down low" in the md stack.
Unfortunately, the queuing where requests are collated happens below
md completely (I think), so there is no easy point to insert this.
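To make the splitting idea concrete, here is a userspace sketch of
just the arithmetic; it is not kernel bio code, and the 32K piece size
and the function names are arbitrary choices for illustration. A large
request is cut into fixed-size pieces and each piece is assigned to a
mirror in turn, so all drives stay busy:

#include <stdio.h>

#define SPLIT_CHUNK_KB 32	/* arbitrary piece size for the sketch */

/*
 * Hypothetical illustration: cut a large read into SPLIT_CHUNK_KB
 * pieces and spread the pieces across the mirrors.
 */
static void plan_split_read(unsigned long long offset_kb,
                            unsigned long long length_kb,
                            int num_mirrors)
{
	unsigned long long pos = offset_kb;
	int piece = 0;

	while (pos < offset_kb + length_kb) {
		unsigned long long len = offset_kb + length_kb - pos;

		if (len > SPLIT_CHUNK_KB)
			len = SPLIT_CHUNK_KB;
		printf("piece %d: %lluK at %lluK -> mirror %d\n",
		       piece, len, pos, piece % num_mirrors);
		pos += len;
		piece++;
	}
}

/*
 * Example: a 512K read on a 2-way mirror becomes 16 pieces of 32K,
 * alternating between the two drives.
 */
int main(void)
{
	plan_split_read(0, 512, 2);
	return 0;
}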
The idea of round-robin scheduling the requests is probably a little
off-base. The important part is, with SSDs, to cut up the requests
into smaller sizes, and push them in parallel. A round-robin might
trick the scheduler into this sometimes, but is probably only an
edge-case solution.
This same logic applies to raid-0, raid-5/6, and raid-10 arrays. With
HDDs it is often more efficient to keep the stripe size large so that
individual in-drive read-ahead is exploited. With SSDs, smaller
stripes are often better (at least on reads) because they tend to keep
all of the drives busy.
Now it is important to note that this discussion is 100% about reads.
SSD writes are a much more complicated animal.
--
Doug Dumitru
EasyCo LLC
* Re: RAID1 round robin read support in md?
2011-12-05 5:53 ` Doug Dumitru
@ 2011-12-05 12:28 ` Roberto Spadim
0 siblings, 0 replies; 6+ messages in thread
From: Roberto Spadim @ 2011-12-05 12:28 UTC (permalink / raw)
To: doug; +Cc: Borg Onion, Marcus Sorensen, linux-raid
Check this old topic (kernel 2.6.37, I think):
http://www.spadim.com.br/raid1/
http://www.issociate.de/board/post/507493/raid1_new_read_balance,_first_test,_some_doubt,_can_anyone_help?.html
You will not get a very big read-speed gain, though. RAID1 is for
multi-threaded work; RAID10 'far' is for fewer threads doing more
sequential reads. Check which is better for you.
The code at the top of this thread implements a round robin. If you
switch devices on every single read you will get a slow md device;
switching every 10, 100, or 1000 reads works better; see the sketch
below. (I think the problem is CPU use, or something else; check with
iostat and other statistics tools.)
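As a sketch only, in the same spirit as the snippet at the top of the
thread (the batch size here is a made-up number to tune with iostat),
switching every N reads instead of every read just changes the counter
arithmetic:

#include <stdint.h>

#define READS_PER_SWITCH 100	/* assumed batch size; tune with iostat */

/*
 * Variant of the earlier round-robin helper: stay on one drive for a
 * batch of reads before moving to the next, so sequential streams are
 * not broken up on every request.
 */
uint32_t get_next_read_drive_batched(uint32_t num_drives)
{
	static uint32_t op_count;

	return (__sync_fetch_and_add(&op_count, 1) / READS_PER_SWITCH)
	       % num_drives;
}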
2011/12/5 Doug Dumitru <doug@easyco.com>
>
> What you are seeing is very SSD specific.
>
> With rotating media, it is very important to intentionally stay on one
> disk even if it leaves other mirrors quiet. Rotating disks do "in the
> drive" read-ahead and take advantage of the heads being on the correct
> track, so streaming straight-line reads are efficient.
>
> With SSDs in an array, things are very different. Drives don't really
> read ahead at all (actually they do, but this is more of a side effect
> of error correction than performance tuning, and the lengths are
> short). If your application is spitting out 4MB read requests, they
> get cut into 512K (1024 sector) bio calls, and sent to a single drive
> if they are linear. Because the code is optimized for HDDs, future
> linear calls should go to the same drive because an HDD is very likely
> to have at least some of the read sectors in the read-ahead cache.
>
> A different algorithm for SSDs would be better, but one concern is
> that this might slow down short read requests in a multi-threaded
> environment. Actually managing a mix intelligently is probably best
> started with a Google literature search for SSD scheduling papers. I
> suspect that UCSD's super-computing department might have done some
> work in this area.
>
> With the same data available from two drives, for low thread count
> applications, it might be better to actually cut up the inbound
> requests into even smaller chunks, and send them in parallel to the
> drives. A quick test on a Crucial C300 shows the following transfer
> rates at different block sizes.
>
> 512K 319 MB/sec
> 256K 299 MB/sec
> 128K 298 MB/sec
> 64K 287 MB/sec
> 32K 275 MB/sec
>
> This is with a single 'dd' process and 'iflag=direct' bypassing linux
> read-ahead and buffer caching. The test was only a second long or so,
> so the noise could be quite high. Also, C300s may behave very
> differently with this workload than other drives, so you have to test
> each type of disk.
>
> What this implies is that if the md raid-1 layer "were to be" SSD
> aware, it should consider cutting up long requests and keeping all
> drives busy. The logic would be something like:
>
> * If any request is >= 32K, split it into 'n' parts', and issue them
> in parallel.
>
> This would be best implemented "down low" in the md stack.
> Unfortunately, the queuing where requests are collated, happens below
> md completely (I think), so there is no easy point to insert this.
>
> The idea of round-robin scheduling the requests is probably a little
> off-base. The important part is, with SSDs, to cut up the requests
> into smaller sizes, and push them in parallel. A round-robin might
> trick the scheduler into this sometimes, but is probably only an
> edge-case solution.
>
> This same logic applies to raid-0, raid-5/6, and raid-10 arrays. With
> HDDs is is often more efficient to keep the stripe size large so that
> individual in-drive read-ahead is exploited. With SSDs, smaller
> stripes are often better (at least on reads) because it tends to keep
> all of the drive busy.
>
> Now it is important to note that this discussion is 100% about reads.
> SSD writes are a much more complicated animal.
>
> --
> Doug Dumitru
> EasyCo LLC
--
Roberto Spadim
Spadim Technology / SPAEmpresarial