Linux RAID subsystem development
* Slow(?) raid5 to raid6 reshape speed
@ 2010-02-16 11:50 Michael
  2010-02-16 11:53 ` Mikael Abrahamsson
  2010-02-16 17:34 ` Bill Davidsen
  0 siblings, 2 replies; 6+ messages in thread
From: Michael @ 2010-02-16 11:50 UTC (permalink / raw)
  To: linux-raid

Hello,

I am reshaping my 4-drive RAID5 to a 5-drive RAID6, but the speed is a
little slow.

md2 : active raid6 sdc3[0] sdg3[4] sdf3[3] sda3[2] sdd3[1]
      2898182016 blocks super 0.91 level 6, 64k chunk, algorithm 18 [5/4] [UUUU_]
      [======>..............]  reshape = 34.2% (330530816/966060672) finish=2897.5min speed=3655K/sec
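
As a sanity check on the progress line above, the remaining time can be recomputed from the block counts and the reported speed. This is a minimal sketch, assuming the block counts are in 1 KiB units (matching the K/sec figure); the function name is made up for illustration.

```python
import re

# The progress line from the /proc/mdstat output above, joined onto one line.
MDSTAT_LINE = ("[======>..............]  reshape = 34.2% (330530816/966060672) "
               "finish=2897.5min speed=3655K/sec")

def remaining_minutes(line: str) -> float:
    """Estimate the minutes left from the (done/total) counts and the speed."""
    m = re.search(r"\((\d+)/(\d+)\).*?speed=(\d+)K/sec", line)
    if m is None:
        raise ValueError("not an mdstat progress line")
    done, total, speed_kib = (int(g) for g in m.groups())
    # Assumption: block counts are 1 KiB units, the same unit as the speed.
    return (total - done) / speed_kib / 60

minutes = remaining_minutes(MDSTAT_LINE)
print(f"{minutes:.0f} min left, about {minutes / 60 / 24:.1f} days")
```

The result lands within a minute of the finish=2897.5min that md itself reports, so at the current rate the remaining 66% of the reshape needs roughly two more days.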

I know it is an expensive process, but my system is nearly idle, so
something may be wrong.
With top, I can see no noticeable CPU load from the md2_raid6 process or
any other.
With iotop, I can see mdadm reading and writing a little data once a
second, but not continuously.

I think the kernel's RAID I/O is not visible in iotop, or am I wrong?

I have moved the --backup-file from my USB drive to an internal IDE hard
drive and gained 800K/sec.
At ~3000-4000K/sec the reshape will not take forever, but it could be
faster, right?

I have tried playing around with sync_speed_min and sync_speed_max without
any result.
Setting stripe_cache_size to 8192 or so did not show a real performance
gain.

Is my speed bad, good, or normal? Any ideas how to "tune" it a bit?

Now some information:
Linux raw 2.6.32-ARCH #1 SMP PREEMPT

The reshape continued like this (from dmesg):
raid5: reshape will continue
raid5: device sdc3 operational as raid disk 0
raid5: device sdf3 operational as raid disk 3
raid5: device sda3 operational as raid disk 2
raid5: device sdd3 operational as raid disk 1
raid5: allocated 5259kB for md2
0: w=1 pa=18 pr=5 m=2 a=2 r=5 op1=0 op2=0
4: w=1 pa=18 pr=5 m=2 a=2 r=5 op1=1 op2=0
3: w=2 pa=18 pr=5 m=2 a=2 r=5 op1=0 op2=0
2: w=3 pa=18 pr=5 m=2 a=2 r=5 op1=0 op2=0
1: w=4 pa=18 pr=5 m=2 a=2 r=5 op1=0 op2=0
raid5: raid level 6 set md2 active with 4 out of 5 devices, algorithm 2
RAID5 conf printout:
 --- rd:5 wd:4
 disk 0, o:1, dev:sdc3
 disk 1, o:1, dev:sdd3
 disk 2, o:1, dev:sda3
 disk 3, o:1, dev:sdf3
 disk 4, o:1, dev:sdg3
...ok start reshape thread
md2: detected capacity change from 0 to 2967738384384
md: md2 switched to read-write mode.
md: reshape of RAID array md2
md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
md: using 128k window, over a total of 966060672 blocks.
 md2: unknown partition table


[root@raw S02-complete-]mdadm --detail /dev/md2
/dev/md2:
        Version : 0.91
  Creation Time : Thu Feb 11 16:01:12 2010
     Raid Level : raid6
     Array Size : 2898182016 (2763.92 GiB 2967.74 GB)
  Used Dev Size : 966060672 (921.31 GiB 989.25 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Tue Feb 16 12:45:13 2010
          State : clean, degraded, recovering
 Active Devices : 4
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric-6
     Chunk Size : 64K

 Reshape Status : 34% complete
     New Layout : left-symmetric

           UUID : 9815a2c6:c83a9a53:2a8015ce:9d8e5e8c (local to host raw)
         Events : 0.375234

    Number   Major   Minor   RaidDevice State
       0       8       35        0      active sync   /dev/sdc3
       1       8       51        1      active sync   /dev/sdd3
       2       8        3        2      active sync   /dev/sda3
       3       8       83        3      active sync   /dev/sdf3
       4       8       99        4      spare rebuilding   /dev/sdg3


thanks, michael.


* Re: Slow(?) raid5 to raid6 reshape speed
  2010-02-16 11:50 Slow(?) raid5 to raid6 reshape speed Michael
@ 2010-02-16 11:53 ` Mikael Abrahamsson
  2010-02-16 12:18   ` Michael
  2010-02-16 17:34 ` Bill Davidsen
  1 sibling, 1 reply; 6+ messages in thread
From: Mikael Abrahamsson @ 2010-02-16 11:53 UTC (permalink / raw)
  To: Michael; +Cc: linux-raid

On Tue, 16 Feb 2010, Michael wrote:

> i think the kernel's raid io is not visible at iotop or iam wrong?

Use "iostat -x 5" to see what is going on. Personally, I used
"sync_speed_min" to keep my drives around 50-80% busy, because for some
reason it used a much lower speed otherwise.

If your drives are already at 100% then there is nothing more to do than 
to wait it out...

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se


* Re: Slow(?) raid5 to raid6 reshape speed
  2010-02-16 11:53 ` Mikael Abrahamsson
@ 2010-02-16 12:18   ` Michael
       [not found]     ` <alpine.DEB.1.10.1002161334170.4986@uplift.swm.pp.se>
  0 siblings, 1 reply; 6+ messages in thread
From: Michael @ 2010-02-16 12:18 UTC (permalink / raw)
  To: Mikael Abrahamsson; +Cc: linux-raid

Here is the output I have been watching:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.87    0.00    2.23    8.52    0.00   85.38

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda             859.60   857.60   74.40   21.20 14192.00  7030.40   221.99     0.98   10.33   6.35  60.72
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdd             831.40   840.60  100.40   43.20 14198.40  7070.40   148.11     0.50    3.50   1.73  24.86
sdc             831.20   836.40  100.60   44.20 14198.40  7044.80   146.71     0.51    3.52   1.81  26.14
sde               0.00     0.00    0.00   40.40     0.00 19673.60   486.97     8.72  215.79   4.46  18.00
sdf             854.60   865.40   77.20   17.20 14174.40  7060.80   224.95     1.28   13.61   4.95  46.72
sdg               0.00   854.80    0.20   27.80     1.60  7060.80   252.23     0.69   24.60  11.17  31.28

A minute later (just to show another sample):
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.15    0.00    2.16   10.28    0.00   84.41

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda             811.60   813.80   92.20   16.60 16446.40  6643.20   212.22     0.97    9.05   5.54  60.32
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdd             785.00   790.60  117.60   40.60 16436.80  6649.60   145.93     0.47    2.98   1.50  23.66
sdc             785.80   791.20  116.80   40.00 16436.80  6649.60   147.23     0.49    3.14   1.57  24.68
sde               0.00     0.00    0.00   45.20     0.00 21844.80   483.29    10.51  197.05   4.38  19.80
sdf             809.00   815.60   94.80   14.60 16446.40  6641.60   211.04     1.26   11.47   4.27  46.74
sdg               0.00   808.00    0.00   23.40     0.00  6651.20   284.24     0.64   27.41  12.33  28.86

so this is not 100%, right?
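
To put the iostat figures into the same unit as the reshape speed, the sector rates can be converted to KiB/s. This is a small sketch using the rsec/s, wsec/s and %util columns of the first sample; iostat counts 512-byte sectors, and the helper name is made up for illustration.

```python
SECTOR_BYTES = 512  # iostat's rsec/s and wsec/s count 512-byte sectors

# (device, rsec/s, wsec/s, %util) copied from the first iostat sample above
SAMPLE = [
    ("sda", 14192.00, 7030.40, 60.72),
    ("sdd", 14198.40, 7070.40, 24.86),
    ("sdc", 14198.40, 7044.80, 26.14),
    ("sde", 0.00, 19673.60, 18.00),
    ("sdf", 14174.40, 7060.80, 46.72),
    ("sdg", 1.60, 7060.80, 31.28),
]

def sectors_to_kib(sectors_per_sec: float) -> float:
    """Convert a rate in 512-byte sectors per second to KiB per second."""
    return sectors_per_sec * SECTOR_BYTES / 1024

for dev, rsec, wsec, util in SAMPLE:
    print(f"{dev}: read {sectors_to_kib(rsec):7.0f} KiB/s, "
          f"write {sectors_to_kib(wsec):7.0f} KiB/s, {util:5.1f}% busy")
```

Each array member writes about 3.5 MiB/s, which is in the same range as the 3655K/sec reported in /proc/mdstat, and the four original members read about twice that. sde is not an array member; it is presumably the disk holding the --backup-file, though the thread does not say so.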

On Tue, 16 Feb 2010 12:53:27 +0100 (CET), Mikael Abrahamsson
<swmike@swm.pp.se> wrote:
> On Tue, 16 Feb 2010, Michael wrote:
> 
>> i think the kernel's raid io is not visible at iotop or iam wrong?
> 
> Use "iostat -x 5" to see what is going on. Personally, I used
> "sync_speed_min" to keep my drives around 50-80% busy, because for some
> reason it used a much lower speed otherwise.
> 
> If your drives are already at 100% then there is nothing more to do than
> to wait it out...


* Re: Slow(?) raid5 to raid6 reshape speed
       [not found]     ` <alpine.DEB.1.10.1002161334170.4986@uplift.swm.pp.se>
@ 2010-02-16 13:31       ` Michael
  0 siblings, 0 replies; 6+ messages in thread
From: Michael @ 2010-02-16 13:31 UTC (permalink / raw)
  To: Mikael Abrahamsson; +Cc: Linux raid

It already is, and has been the whole time:

[root@raw ~]cat /sys/block/md2/md/sync_speed_*
200000 (system)
200000 (local)

so that's not the limit. any other ideas?
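
Because the cat output above does not show which file each line came from, it may be safer to read the two limits by name. This is a minimal sketch, assuming the usual sysfs layout under /sys/block/md2/md; the function names are made up for illustration. md marks a value "(system)" when the system-wide default applies and "(local)" when it was set for this array.

```python
from pathlib import Path

def parse_speed_limit(text: str) -> tuple[int, str]:
    """Split a value such as '200000 (system)' into (KB/sec, origin)."""
    value, _, origin = text.strip().partition(" ")
    return int(value), origin.strip("()")

def read_limits(md_dir: str = "/sys/block/md2/md") -> dict[str, tuple[int, str]]:
    """Read both limits by file name, so min and max cannot be confused."""
    return {name: parse_speed_limit((Path(md_dir) / name).read_text())
            for name in ("sync_speed_min", "sync_speed_max")}

# The two lines printed by the cat command above. The shell expands the glob
# alphabetically, so the first line is presumably sync_speed_max and the
# second sync_speed_min.
for line in ("200000 (system)", "200000 (local)"):
    print(parse_speed_limit(line))
```

Calling read_limits() on the machine itself would return both values keyed by name. If the glob order assumption holds, the minimum was already raised locally to the same 200000 as the maximum.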

On Tue, 16 Feb 2010 13:34:36 +0100 (CET), Mikael Abrahamsson
<swmike@swm.pp.se> wrote:
> On Tue, 16 Feb 2010, Michael wrote:
> 
>> so this is not 100%, right?
> 
> I'd increase speed_min more.


* Re: Slow(?) raid5 to raid6 reshape speed
  2010-02-16 11:50 Slow(?) raid5 to raid6 reshape speed Michael
  2010-02-16 11:53 ` Mikael Abrahamsson
@ 2010-02-16 17:34 ` Bill Davidsen
  2010-02-16 19:11   ` Michael
  1 sibling, 1 reply; 6+ messages in thread
From: Bill Davidsen @ 2010-02-16 17:34 UTC (permalink / raw)
  To: Michael; +Cc: linux-raid

Michael wrote:
> Hello,
>
> I am reshaping my 4-drive RAID5 to a 5-drive RAID6, but the speed is a
> little slow.
>
> md2 : active raid6 sdc3[0] sdg3[4] sdf3[3] sda3[2] sdd3[1]
>       2898182016 blocks super 0.91 level 6, 64k chunk, algorithm 18 [5/4] [UUUU_]
>       [======>..............]  reshape = 34.2% (330530816/966060672) finish=2897.5min speed=3655K/sec
>
> I know it is an expensive process, but my system is nearly idle, so
> something may be wrong.
> With top, I can see no noticeable CPU load from the md2_raid6 process or
> any other.
> With iotop, I can see mdadm reading and writing a little data once a
> second, but not continuously.
>
> I think the kernel's RAID I/O is not visible in iotop, or am I wrong?
>
> I have moved the --backup-file from my USB drive to an internal IDE hard
> drive and gained 800K/sec.
> At ~3000-4000K/sec the reshape will not take forever, but it could be
> faster, right?
>
> I have tried playing around with sync_speed_min and sync_speed_max without
> any result.
> Setting stripe_cache_size to 8192 or so did not show a real performance
> gain.
>
>   
You can go larger than that: double it, or try 32k. I have heard reports
of evil at 64k; I wouldn't go there other than with test data.

> Is my speed bad, good, or normal? Any ideas how to "tune" it a bit?
>   


-- 
Bill Davidsen <davidsen@tmr.com>
  "We can't solve today's problems by using the same thinking we
   used in creating them." - Einstein



* Re: Slow(?) raid5 to raid6 reshape speed
  2010-02-16 17:34 ` Bill Davidsen
@ 2010-02-16 19:11   ` Michael
  0 siblings, 0 replies; 6+ messages in thread
From: Michael @ 2010-02-16 19:11 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: linux-raid

On Tue, 16 Feb 2010 12:34:48 -0500, Bill Davidsen <davidsen@tmr.com>
wrote:
> Michael wrote:
>> Setting stripe_cache_size to 8192 or so did not show a real performance
>> gain.
>>
> You can go larger than that: double it, or try 32k. I have heard reports
> of evil at 64k; I wouldn't go there other than with test data.

Well, my "evil" value is 32k; it crashed my system into an unusable state.
This could be related to having 4 GB of memory in a 32-bit system. I think
people on 64-bit can go higher.
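
A rough estimate of the memory the stripe cache needs may help explain this. The sketch below assumes one 4 KiB page per device per cache entry and ignores per-stripe bookkeeping, so it is a lower bound rather than the kernel's exact accounting; the function name is made up for illustration.

```python
PAGE_BYTES = 4096   # assumption: 4 KiB pages, the usual size on 32-bit x86
RAID_DISKS = 5      # md2 has 5 raid devices

def stripe_cache_mib(entries: int, disks: int = RAID_DISKS) -> float:
    """Lower bound in MiB: one page per device per stripe cache entry."""
    return entries * disks * PAGE_BYTES / 2**20

for entries in (256, 8192, 16384, 32768):
    print(f"stripe_cache_size={entries:>5}: ~{stripe_cache_mib(entries):.0f} MiB")
```

With 256 entries this gives 5 MiB, close to the "allocated 5259kB for md2" line in the dmesg output, assuming the cache held 256 entries at that point. At 32768 entries the same estimate is 640 MiB, which on a 32-bit kernel, where directly mapped kernel memory is typically below 1 GB, could plausibly exhaust it. The thread does not confirm that this was the cause of the crash.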

Something from my log file can be read here, if it is of interest:
http://pastebin.ca/1798794
My X server crashed, and I was no longer able to do anything related to
the RAID device.
E.g. echo 1024 > /sys/block/md2/md/stripe_cache_size hangs, as does cat
/proc/mdstat.

It was a nice crash that stopped my heart for a few seconds. Thanks to Neil
for making it possible to resume reshape processes.

