linux-raid.vger.kernel.org archive mirror
* raid6 low performance 8x3tb drives in singledegraded mode(=7x3tb)
@ 2012-09-12  0:03 Kasper Sandberg
  2012-09-12  9:42 ` Peter Grandi
  2012-09-19  6:25 ` NeilBrown
  0 siblings, 2 replies; 7+ messages in thread
From: Kasper Sandberg @ 2012-09-12  0:03 UTC (permalink / raw)
  To: linux-raid

Hello.

I have set up an array of 7x 3 TB WD30EZRX drives, though it is meant
for 8, so it runs in single-degraded mode.

The issue is that I get very poor performance, generally only roughly
25 MB/s writes. Individually the disks are fine.

iowait and idle are high:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.29    0.00    5.85   22.14    0.00   69.72
 
Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb            1370.00   732.00  109.00   30.00 11824.00  5864.00   127.25     1.50   10.71   2.45  34.00
sdf            1392.00   839.00   88.00   33.00 11840.00  6976.00   155.50     3.42   28.26   4.43  53.60
sdd            1422.00   863.00   64.00   29.00 13448.00  6384.00   213.25     2.28   34.54   4.56  42.40
sdg            1388.00   446.00   92.00   18.00 11840.00  3248.00   137.16     0.63    5.71   1.60  17.60
sdc            1395.00   857.00   85.00   40.00 11840.00  6944.00   150.27     1.07    8.58   1.86  23.20
sda            1370.00   985.00  111.00   41.00 11888.00  7744.00   129.16     5.30   35.79   5.21  79.20
sde            1417.00   669.00   70.00   21.00 13528.00  5040.00   204.04     1.94   32.79   4.53  41.20
sdh               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md0               0.00     0.00    0.00   86.00     0.00 17656.00   205.30     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00  4174.84    0.00   0.00 100.00

http://paste.kde.org/547370/ - the same as above, but on a pastebin, since
it might be annoying to read in a mail client depending on the font used.

(sdh is not part of the array)

mdadm detail:
/dev/md0:
        Version : 1.2
  Creation Time : Sat Sep  8 23:01:11 2012
     Raid Level : raid6
     Array Size : 17581590528 (16767.11 GiB 18003.55 GB)
  Used Dev Size : 2930265088 (2794.52 GiB 3000.59 GB)
   Raid Devices : 8
  Total Devices : 7
    Persistence : Superblock is persistent

    Update Time : Wed Sep 12 01:55:49 2012
          State : active, degraded
 Active Devices : 7
Working Devices : 7
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           Name : mainserver:0  (local to host mainserver)
           UUID : d48566eb:ca2fce69:907602f4:84120ee4
         Events : 26413

    Number   Major   Minor   RaidDevice State
       0       8        0        0      active sync   /dev/sda
       1       8       16        1      active sync   /dev/sdb
       2       8       32        2      active sync   /dev/sdc
       3       8       48        3      active sync   /dev/sdd
       4       8       64        4      active sync   /dev/sde
       5       8       80        5      active sync   /dev/sdf
       8       8       96        6      active sync   /dev/sdg
       7       0        0        7      removed


It should be noted that I tested the chunk sizes extensively, from 4k to
2048k, and the default seemed to offer the best all-round performance:
very close to the best performers for every workload, and much, much
better than the worst.

I conducted tests with dd directly on md0, with xfs on md0, and with
dm-crypt on top of md0 (both dd and xfs on the dm-0 device). All of these
showed only marginal performance differences, so I must assume the issue
is in the raid layer.
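
For reference, the raw-device write test was roughly of this shape; the
block size, count and flags below are illustrative only, and writing to
the raw device of course destroys any data on it:

  # illustrative sequential write test against the bare md device;
  # oflag=direct bypasses the page cache so the array speed dominates
  dd if=/dev/zero of=/dev/md0 bs=1M count=4096 oflag=direct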

kernel is:
Linux mainserver 3.2.0-0.bpo.1-amd64 #1 SMP Sat Feb 11 08:41:32 UTC 2012
x86_64 GNU/Linux

Could this be due to being in single-degraded mode? After I'm done
copying files to this array I will be adding the last disk.
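
Adding it will be the usual mdadm hot-add, something like the following
(the device name is only a placeholder for whatever the new disk ends up
as):

  # placeholder device name; hot-add the 8th disk so md rebuilds onto it
  mdadm /dev/md0 --add /dev/sdX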

Any input is welcome.

Oh, and I'm not subscribed to the list, so please CC me.

-- 
Kasper Sandberg
Sandberg Enterprises
+45 51944242
http://www.sandbergenterprises.dk



* Re: raid6 low performance 8x3tb drives in singledegraded mode(=7x3tb)
  2012-09-12  0:03 raid6 low performance 8x3tb drives in singledegraded mode(=7x3tb) Kasper Sandberg
@ 2012-09-12  9:42 ` Peter Grandi
  2012-09-12 13:38   ` Peter Grandi
  2012-09-19  6:25 ` NeilBrown
  1 sibling, 1 reply; 7+ messages in thread
From: Peter Grandi @ 2012-09-12  9:42 UTC (permalink / raw)
  To: Linux RAID

[ ... ]

> I have set up an array of 7x 3 TB WD30EZRX drives, though it is
> meant for 8, so it runs in single-degraded mode.

An (euphemism) amazing concept, with (euphemism) surprising
effects:

> The issue is that I get very poor performance, generally only
> roughly 25 MB/s writes. [ ... ]

> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s [ ... ]
> sdb            1370.00   732.00  109.00   30.00 11824.00  5864.00 [ ... ]
> sdf            1392.00   839.00   88.00   33.00 11840.00  6976.00 [ ... ]
> sdd            1422.00   863.00   64.00   29.00 13448.00  6384.00 [ ... ]
> sdg            1388.00   446.00   92.00   18.00 11840.00  3248.00 [ ... ]
> sdc            1395.00   857.00   85.00   40.00 11840.00  6944.00 [ ... ]
> sda            1370.00   985.00  111.00   41.00 11888.00  7744.00 [ ... ] 
> sde            1417.00   669.00   70.00   21.00 13528.00  5040.00 [ ... ]
> sdh               0.00     0.00    0.00    0.00     0.00     0.00 [ ... ]
> md0               0.00     0.00    0.00   86.00     0.00 17656.00 [ ... ]
> dm-0              0.00     0.00    0.00    0.00     0.00     0.00 [ ... ]

It is (euphemism) mysterious that 'rsec/s' is nearly twice as
high as 'wsec/s', and (euphemism) obscure that the RAID set is
also 'sync'ing:

>     Number   Major   Minor   RaidDevice State
>        0       8        0        0      active sync   /dev/sda
>        1       8       16        1      active sync   /dev/sdb
>        2       8       32        2      active sync   /dev/sdc
>        3       8       48        3      active sync   /dev/sdd
>        4       8       64        4      active sync   /dev/sde
>        5       8       80        5      active sync   /dev/sdf
>        8       8       96        6      active sync   /dev/sdg
>        7       0        0        7      removed

> [ ... ] so I must assume the issue is in the raid layer.

Or perhaps in the greyware layer :-).


* Re: raid6 low performance 8x3tb drives in singledegraded mode(=7x3tb)
  2012-09-12  9:42 ` Peter Grandi
@ 2012-09-12 13:38   ` Peter Grandi
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Grandi @ 2012-09-12 13:38 UTC (permalink / raw)
  To: Linux RAID

[ ... ]

> It is (euphemism) mysterious that 'rsec/s' is nearly twice as
> high as 'wsec/s', and (euphemism) obscure that the RAID set is
> also 'sync'ing:

Oops, I meant "sync"'ed. It is actually slightly obscure, because twice
as many reads as writes should not happen even in degraded mode, if
writes are somewhat naturally aligned. It would be easier to think that
MD is rebuilding, but it is not.
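
A quick way to confirm that no rebuild or resync is in progress is the
kernel's own summary of the array:

  # a rebuild/resync would show up as a progress line under md0
  cat /proc/mdstat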

>> Number   Major   Minor   RaidDevice State
>> 0       8        0        0      active sync   /dev/sda
>> [ ... ]
>> 8       8       96        6      active sync   /dev/sdg
>> 7       0        0        7      removed

Perhaps writes are indeed poorly aligned, and a way to get this
(euphemism) astutely degraded-by-design setup to be less slow would be
to greatly increase the stripe cache size; but that might not work, as
this reply by NeilB for a similar previous case hints that things are
somewhat complicated:

  http://www.spinics.net/lists/raid/msg38922.html

BAARF.com! :-)

[ ... ]


* Re: raid6 low performance 8x3tb drives in singledegraded mode(=7x3tb)
  2012-09-12  0:03 raid6 low performance 8x3tb drives in singledegraded mode(=7x3tb) Kasper Sandberg
  2012-09-12  9:42 ` Peter Grandi
@ 2012-09-19  6:25 ` NeilBrown
  2012-09-19  7:40   ` Roman Mamedov
  1 sibling, 1 reply; 7+ messages in thread
From: NeilBrown @ 2012-09-19  6:25 UTC (permalink / raw)
  To: Kasper Sandberg; +Cc: linux-raid

On Wed, 12 Sep 2012 02:03:31 +0200 Kasper Sandberg
<kontakt@sandberg-consult.dk> wrote:

> Hello.
> 
> I have set up an array of 7x 3 TB WD30EZRX drives, though it is meant
> for 8, so it runs in single-degraded mode.
> 
> The issue is that I get very poor performance, generally only roughly
> 25 MB/s writes. Individually the disks are fine.
> 
> iowait and idle are high:
> 
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            2.29    0.00    5.85   22.14    0.00   69.72
>  
> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
> sdb            1370.00   732.00  109.00   30.00 11824.00  5864.00   127.25     1.50   10.71   2.45  34.00
> sdf            1392.00   839.00   88.00   33.00 11840.00  6976.00   155.50     3.42   28.26   4.43  53.60
> sdd            1422.00   863.00   64.00   29.00 13448.00  6384.00   213.25     2.28   34.54   4.56  42.40
> sdg            1388.00   446.00   92.00   18.00 11840.00  3248.00   137.16     0.63    5.71   1.60  17.60
> sdc            1395.00   857.00   85.00   40.00 11840.00  6944.00   150.27     1.07    8.58   1.86  23.20
> sda            1370.00   985.00  111.00   41.00 11888.00  7744.00   129.16     5.30   35.79   5.21  79.20
> sde            1417.00   669.00   70.00   21.00 13528.00  5040.00   204.04     1.94   32.79   4.53  41.20
> sdh               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
> md0               0.00     0.00    0.00   86.00     0.00 17656.00   205.30     0.00    0.00   0.00   0.00
> dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00  4174.84    0.00   0.00 100.00
> 
> http://paste.kde.org/547370/ - the same as above, but on a pastebin, since
> it might be annoying to read in a mail client depending on the font used.
> 
> (sdh is not part of the array)
> 
> mdadm detail:
> /dev/md0:
>         Version : 1.2
>   Creation Time : Sat Sep  8 23:01:11 2012
>      Raid Level : raid6
>      Array Size : 17581590528 (16767.11 GiB 18003.55 GB)
>   Used Dev Size : 2930265088 (2794.52 GiB 3000.59 GB)
>    Raid Devices : 8
>   Total Devices : 7
>     Persistence : Superblock is persistent
> 
>     Update Time : Wed Sep 12 01:55:49 2012
>           State : active, degraded
>  Active Devices : 7
> Working Devices : 7
>  Failed Devices : 0
>   Spare Devices : 0
> 
>          Layout : left-symmetric
>      Chunk Size : 512K
> 
>            Name : mainserver:0  (local to host mainserver)
>            UUID : d48566eb:ca2fce69:907602f4:84120ee4
>          Events : 26413
> 
>     Number   Major   Minor   RaidDevice State
>        0       8        0        0      active sync   /dev/sda
>        1       8       16        1      active sync   /dev/sdb
>        2       8       32        2      active sync   /dev/sdc
>        3       8       48        3      active sync   /dev/sdd
>        4       8       64        4      active sync   /dev/sde
>        5       8       80        5      active sync   /dev/sdf
>        8       8       96        6      active sync   /dev/sdg
>        7       0        0        7      removed
> 
> 
> It should be noted that I tested the chunk sizes extensively, from 4k to
> 2048k, and the default seemed to offer the best all-round performance:
> very close to the best performers for every workload, and much, much
> better than the worst.
> 
> I conducted tests with dd directly on md0, with xfs on md0, and with
> dm-crypt on top of md0 (both dd and xfs on the dm-0 device). All of these
> showed only marginal performance differences, so I must assume the issue
> is in the raid layer.
> 
> kernel is:
> Linux mainserver 3.2.0-0.bpo.1-amd64 #1 SMP Sat Feb 11 08:41:32 UTC 2012
> x86_64 GNU/Linux
> 
> Could this be due to being in single-degraded mode? After I'm done
> copying files to this array I will be adding the last disk.

A good way to test that would be to create a non-degraded 7-device array and
see how that performs.
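
Roughly like this (the device names below are placeholders only, and
--create will of course destroy whatever is on those disks):

  # placeholder devices; builds a full (non-degraded) 7-device RAID6
  # with the same 512K chunk for an apples-to-apples comparison
  mdadm --create /dev/md1 --level=6 --chunk=512 --raid-devices=7 \
        /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh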

RAID5/RAID6 write speed is never going to be brilliant, but there probably is
room for improvement.  Hopefully one day I figure out how to effect that
improvement.

NeilBrown


> 
> Any input is welcomed.
> 
> Oh, and im not subscribed to the list, so please CC me.
> 




* Re: raid6 low performance 8x3tb drives in singledegraded mode(=7x3tb)
  2012-09-19  6:25 ` NeilBrown
@ 2012-09-19  7:40   ` Roman Mamedov
  2012-09-19 16:31     ` Stan Hoeppner
  2012-09-20  0:26     ` Kasper Sandberg
  0 siblings, 2 replies; 7+ messages in thread
From: Roman Mamedov @ 2012-09-19  7:40 UTC (permalink / raw)
  To: NeilBrown; +Cc: Kasper Sandberg, linux-raid

On Wed, 19 Sep 2012 16:25:57 +1000
NeilBrown <neilb@suse.de> wrote:

> RAID5/RAID6 write speed is never going to be brilliant, but there probably is
> room for improvement.  Hopefully one day I figure out how to effect that
> improvement.

"/sys/block/md*/md/stripe_cache_size" is absolutely the first thing one should
look at when facing a problem with md RAID5/6 write performance. It is
understandable that the default value can't be high (as it consumes a lot of
RAM on large disk counts), but increasing it if you have the RAM can increase
the write speed by 200 to 400%:

  http://peterkieser.com/2009/11/29/raid-mdraid-stripe_cache_size-vs-write-transfer/
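
For example (the value here is only illustrative; each unit of
stripe_cache_size costs roughly 4 KiB per member device, so 8192 on an
8-disk array works out to around 256 MiB of RAM):

  # illustrative value; not persistent across reboots
  echo 8192 > /sys/block/md0/md/stripe_cache_size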

-- 
With respect,
Roman

~~~~~~~~~~~~~~~~~~~~~~~~~~~
"Stallman had a printer,
with code he could not see.
So he began to tinker,
and set the software free."



* Re: raid6 low performance 8x3tb drives in singledegraded mode(=7x3tb)
  2012-09-19  7:40   ` Roman Mamedov
@ 2012-09-19 16:31     ` Stan Hoeppner
  2012-09-20  0:26     ` Kasper Sandberg
  1 sibling, 0 replies; 7+ messages in thread
From: Stan Hoeppner @ 2012-09-19 16:31 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: NeilBrown, Kasper Sandberg, linux-raid

On 9/19/2012 2:40 AM, Roman Mamedov wrote:

> "/sys/block/md*/md/stripe_cache_size" is absolutely the first thing one should
> look at when facing a problem with md RAID5/6 write performance.

No.  The first thing one should do is verify that the md/RAID stripe is
properly aligned to physical sector boundaries on the drives, and that
the filesystem is properly aligned to the md/RAID stripe size.  If these
are not correct then tweaking stripe_cache_size isn't going to help
much, if at all.
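
As an illustration of the second check, with the array described earlier
(512 KiB chunk, 8 RAID6 devices, so 6 data disks) and a 4 KiB XFS block
size, the expected geometry would be:

  # expected XFS stripe geometry for chunk=512KiB, 6 data disks, 4KiB blocks:
  #   sunit  = 512 KiB / 4 KiB = 128 blocks
  #   swidth = 128 * 6         = 768 blocks
  # compare against what the filesystem reports (mount point is a placeholder)
  xfs_info /mnt/array | grep -E 'sunit|swidth'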

-- 
Stan



* Re: raid6 low performance 8x3tb drives in singledegraded mode(=7x3tb)
  2012-09-19  7:40   ` Roman Mamedov
  2012-09-19 16:31     ` Stan Hoeppner
@ 2012-09-20  0:26     ` Kasper Sandberg
  1 sibling, 0 replies; 7+ messages in thread
From: Kasper Sandberg @ 2012-09-20  0:26 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: NeilBrown, linux-raid

On 19/09/12 09:40, Roman Mamedov wrote:
> On Wed, 19 Sep 2012 16:25:57 +1000
> NeilBrown <neilb@suse.de> wrote:
>
>> RAID5/RAID6 write speed is never going to be brilliant, but there probably is
>> room for improvement.  Hopefully one day I figure out how to effect that
>> improvement.
> "/sys/block/md*/md/stripe_cache_size" is absolutely the first thing one should
> look at when facing a problem with md RAID5/6 write performance. It is
> understandable that the default value can't be high (as it consumes a lot of
> RAM on large disk counts), but increasing it if you have the RAM can increase
> the write speed by 200 to 400%:
>
>   http://peterkieser.com/2009/11/29/raid-mdraid-stripe_cache_size-vs-write-transfer/
>

Thank you for your answer.

I did try this; it had zero effect.

And Neil, thank you for your answer too.

I have now put the disks into the other box and resynced with the 8th
disk, and things are fast now. How fast I cannot say exactly, as the
filesystem sits on top of dm-crypt, which caps me at about 100 MB/s on
this CPU, but that would seem to indicate that things are functioning as
they should.
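
If you ever want to confirm that dm-crypt really is the ceiling, newer
cryptsetup releases can benchmark the ciphers in isolation, independent
of the disks (assuming your version has it):

  # measures raw cipher throughput on this CPU only
  cryptsetup benchmark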


-- 
Kasper Sandberg
Sandberg Enterprises
+45 51944242
http://www.sandbergenterprises.dk


