linux-raid.vger.kernel.org archive mirror
* Stress testing system?
@ 2004-10-08 21:32 Robin Bowes
  2004-10-08 22:00 ` Mike Hardy
  2004-10-08 22:02 ` Gordon Henderson
  0 siblings, 2 replies; 15+ messages in thread
From: Robin Bowes @ 2004-10-08 21:32 UTC (permalink / raw)
  To: linux-raid

Hi,

I've got six 250GB Maxtor drives connected to 2 Promise SATA controllers 
  configured as follows:

Each disk has two partitions: 1.5G and 248.5G.

/dev/sda1 & /dev/sdd1 are mirrored and form the root filesystem.

/dev/sd[abcdef]2 are configured as a RAID5 array with one hot spare.

I use lvm to create a 10G /usr partition, a 5G /var partition, and the 
rest of the array (994G) in /home.

The system in which I installed these drives was rock-solid before I 
added the RAID storage (it had a single 120G drive). However, since 
adding the 6 disks I have experienced the system simply powering down 
and requiring filesystem recovery when it restarted.

I suspected this was down to an inadequate power supply (it was 400W) so 
I've upgraded to an OCZ 520W PSU.

I'd like to stress test the system to see if the new PSU has sorted the 
problem, i.e. really work the disks.

What's the best way to get all six drives working as hard as possible?

Thanks,

R.
-- 
http://robinbowes.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Stress testing system?
  2004-10-08 21:32 Stress testing system? Robin Bowes
@ 2004-10-08 22:00 ` Mike Hardy
  2004-10-08 22:07   ` Gordon Henderson
  2004-10-08 23:49   ` Guy
  2004-10-08 22:02 ` Gordon Henderson
  1 sibling, 2 replies; 15+ messages in thread
From: Mike Hardy @ 2004-10-08 22:00 UTC (permalink / raw)
  To: 'linux-raid@vger.kernel.org'


This is a little off topic, but I stress systems with four loops

loop one unpacks the kernel source, moves it to a new name, unpacks the 
kernel source again, and diffs the two, deletes and repeats (tests 
memory, disk caching)

loop two unpacks the kernel source, does a make allconfig and a make -j5 
bzImage modules, then a make clean, and repeats. That should get the CPU burning

loop three should run bonnie++ on the array

loop four should work with another machine. Each machine should wget 
some very large file (1GBish) with output to /dev/null so that the NIC 
has to serve interrupts at max
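
In sh, those four loops might look roughly like this (the tarball name,
the config target and the URL are placeholders, not anything from a
particular setup):

  #!/bin/sh
  # loop one: unpack, rename, unpack again, diff, delete, repeat
  while :; do
      tar xjf linux-2.6.8.tar.bz2 && mv linux-2.6.8 tree-a
      tar xjf linux-2.6.8.tar.bz2 && mv linux-2.6.8 tree-b
      diff -r tree-a tree-b
      rm -rf tree-a tree-b
  done

  #!/bin/sh
  # loop two: repeated kernel builds to keep the CPU busy
  # (allyesconfig stands in here for the "make allconfig" above)
  while :; do
      tar xjf linux-2.6.8.tar.bz2
      ( cd linux-2.6.8 && make allyesconfig && make -j5 bzImage modules && make clean )
      rm -rf linux-2.6.8
  done

  #!/bin/sh
  # loop three: keep bonnie++ running against the array (tune -s/-n to taste)
  while :; do
      bonnie++ -d /home -u0
  done

  #!/bin/sh
  # loop four: run on both machines, each pulling a big file from the other
  while :; do
      wget -O /dev/null http://other-box/bigfile.bin
  done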

If that doesn't cook your machine in 48 hours or so, I can't think of 
anything that will.

This catches out every machine I try it on for some reason or another, 
but after a couple of tweaks it's usually solid.

Slightly more on-topic, one thing that I have to do frequently is boot 
with noapic or acpi=off due to interrupt handling problems with various 
motherboards.

Additionally, I think there have been reports of problems with raid and 
LVM, and there have also been problems with SATA and possibly with 
Maxtor drives, so you may have some tweaking to do. Mentioning 
versions of things (distribution, kernel, hardware parts and part 
numbers etc) would help

I'm interested to hear what other people do to burn their machines in 
though...

Good luck.
-Mike

Robin Bowes wrote:
> Hi,
> 
> I've got six 250GB Maxtor drives connected to 2 Promise SATA controllers 
>  configured as follows:
> 
> Each disk has two partitions: 1.5G and 248.5G.
> 
> /dev/sda1 & /dev/sdd1 are mirrored and form the root filesystem.
> 
> /dev/sd[abcdef]2 are configured as a RAID5 array with one hot spare.
> 
> I use lvm to create a 10G /usr partition, a 5G /var partition, and the 
> rest of the array (994G) in /home.
> 
> The system in which I installed these drives was rock-solid before I 
> added the RAID storage (it had a single 120G drive). However, since 
> adding the 6 disks I have experienced the system simply powering down 
> and requiring filesystem recovery when it restarted.
> 
> I suspected this was down to an inadequate power supply (it was 400W) so 
> I've upgraded to an OCZ 520W PSU.
> 
> I'd like to stress test the system to see if the new PSU has sorted the 
> problem, i.e. really work the disks.
> 
> What's the best way to get all six drives working as hard as possible?
> 
> Thanks,
> 
> R.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Stress testing system?
  2004-10-08 21:32 Stress testing system? Robin Bowes
  2004-10-08 22:00 ` Mike Hardy
@ 2004-10-08 22:02 ` Gordon Henderson
  2004-10-08 23:44   ` Robin Bowes
  1 sibling, 1 reply; 15+ messages in thread
From: Gordon Henderson @ 2004-10-08 22:02 UTC (permalink / raw)
  To: Robin Bowes; +Cc: linux-raid

On Fri, 8 Oct 2004, Robin Bowes wrote:

> What's the best way to get all six drives working as hard as possible?

I always run 'bonnie' on each partition (sometimes 2 to a partition) when
soak-testing a new server. Try to leave it running for as long as
possible. (ie. days)

That seems to get the disk heads moving (start the bonnies off at 1-2
minute intervals; they desynchronise over time anyway, so some end up
reading, some writing, and some seeking) and I suspect head movement is
what's going to cause a disk drive to consume the most current after
startup.
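
Something along those lines, say (the mount points are just examples,
and -s wants to be roughly twice your RAM):

  #!/bin/sh
  # one endless bonnie++ per partition, started a couple of minutes
  # apart so the runs drift out of step with each other
  for dir in /home /usr /var; do
      ( while :; do
            bonnie++ -u0 -g0 -n0 -s 1024 -d $dir
        done ) &
      sleep 120
  done
  wait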

If I have a handy PC next to it, then I'll also run some scripted network
FTPs (via wget) of 1GB binary files copying to /dev/null on the recipient
side.

I certainly see the motherboard temperature rise when I do this
(lm-sensors is your friend, but can be a PITA to get going)

Gordon

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Stress testing system?
  2004-10-08 22:00 ` Mike Hardy
@ 2004-10-08 22:07   ` Gordon Henderson
  2004-10-08 23:49   ` Guy
  1 sibling, 0 replies; 15+ messages in thread
From: Gordon Henderson @ 2004-10-08 22:07 UTC (permalink / raw)
  To: Mike Hardy; +Cc: 'linux-raid@vger.kernel.org'

On Fri, 8 Oct 2004, Mike Hardy wrote:

> Slightly more on-topic, one thing that I have to do frequently is boot
> with noapic or acpi=off due to interrupt handling problems with various
> motherboards.

I've seen this (and have mentioned it here in the past); it basically
boils down to cheap motherboards with buggy chipsets )-: Some BIOSes let
you select PIC or APIC - on these buggy ones, I stick to PIC.

> Additionally, I think there have been reports of problems with raid and
> LVM, and there have also been problems with SATA and possibly with
> Maxtor drives, so you may have some tweaking to do. Mentioning
> versions of things (distribution, kernel, hardware parts and part
> numbers etc) would help

I had problems some 2 years ago when I first dabbled with LVM. I quickly
found problems with performance and random crashes and, not having time to
pursue it further, left it alone and haven't ventured back since!

> I'm interested to hear what other people do to burn their machines in
> though...

Looks like we do something similar... So that can't be bad! I'm sure there
is plenty of CPU left over for me to do stuff like compiles, etc.

Many years ago I used to write diagnostics for computer systems - memory,
CPU, IO, etc. There were cases where the systems would pass all the diags
faultlessly for days on end, then crash 2 minutes into the application, so
there's no substitute for real-life testing...

Gordon

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Stress testing system?
  2004-10-08 22:02 ` Gordon Henderson
@ 2004-10-08 23:44   ` Robin Bowes
  2004-10-08 23:48     ` Guy
  2004-10-10 20:36     ` Gordon Henderson
  0 siblings, 2 replies; 15+ messages in thread
From: Robin Bowes @ 2004-10-08 23:44 UTC (permalink / raw)
  To: Gordon Henderson; +Cc: linux-raid

Gordon Henderson wrote:
> On Fri, 8 Oct 2004, Robin Bowes wrote:
> 
> 
>>What's the best way to get all six drives working as hard as possible?
> 
> 
> I always run 'bonnie' on each partition (sometimes 2 to a partition) when
> soak-testing a new server. Try to leave it running for as long as
> possible. (ie. days)

Hi Gordon,

I tried this - just a simple command to start with:

# bonnie++ -d /home -s10 -r4 -u0

This gave the following results:

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
dude.robinbowes 10M 11482  92 +++++ +++ +++++ +++ 15370 100 +++++ +++ 13406 124
                     ------Sequential Create------ --------Random Create--------
                     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                  16   347  88 +++++ +++ 19794  91   332  86 +++++ +++  1106  93
dude.robinbowes.com,10M,11482,92,+++++,+++,+++++,+++,15370,100,+++++,+++,13406.0,124,16,347,88,+++++,+++,19794,91,332,86,+++++,+++,1106,93


I then noticed that my raid array was using a lot of CPU:

top - 00:41:28 up 33 min,  2 users,  load average: 1.80, 1.78, 1.57
Tasks:  89 total,   1 running,  88 sleeping,   0 stopped,   0 zombie
Cpu(s):  4.1% us, 32.9% sy,  0.0% ni, 59.8% id,  0.0% wa,  0.5% hi,  2.6% si
Mem:   1554288k total,   368212k used,  1186076k free,    70520k buffers
Swap:        0k total,        0k used,        0k free,   200140k cached

   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
   239 root      15   0     0    0    0 S 61.0  0.0  20:08.37 md5_raid5
  1414 slimserv  15   0 43644  38m 5772 S  9.0  2.5   2:38.99 slimserver.pl
   241 root      15   0     0    0    0 D  6.3  0.0   2:05.45 md5_resync
  1861 root      16   0  2888  908 1620 R  1.0  0.1   0:00.28 top
  1826 root      16   0  9332 2180 4232 S  0.3  0.1   0:00.28 sshd

So I checked the array:

[root@dude root]# mdadm --detail /dev/md5
/dev/md5:
         Version : 00.90.01
   Creation Time : Thu Jul 29 21:41:38 2004
      Raid Level : raid5
      Array Size : 974566400 (929.42 GiB 997.96 GB)
     Device Size : 243641600 (232.35 GiB 249.49 GB)
    Raid Devices : 5
   Total Devices : 6
Preferred Minor : 5
     Persistence : Superblock is persistent

     Update Time : Sat Oct  9 00:08:22 2004
           State : dirty, resyncing
  Active Devices : 5
Working Devices : 6
  Failed Devices : 0
   Spare Devices : 1

          Layout : left-symmetric
      Chunk Size : 128K

  Rebuild Status : 12% complete

            UUID : a4bbcd09:5e178c5b:3bf8bd45:8c31d2a1
          Events : 0.1410301

     Number   Major   Minor   RaidDevice State
        0       8        2        0      active sync   /dev/sda2
        1       8       18        1      active sync   /dev/sdb2
        2       8       34        2      active sync   /dev/sdc2
        3       8       50        3      active sync   /dev/sdd2
        4       8       66        4      active sync   /dev/sde2

        5       8       82        -      spare   /dev/sdf2

Is this normal? Should running bonnie++ result in the array being dirty and requiring resyncing?

R.
-- 
http://robinbowes.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: Stress testing system?
  2004-10-08 23:44   ` Robin Bowes
@ 2004-10-08 23:48     ` Guy
  2004-10-09  9:52       ` Robin Bowes
  2004-10-10 20:36     ` Gordon Henderson
  1 sibling, 1 reply; 15+ messages in thread
From: Guy @ 2004-10-08 23:48 UTC (permalink / raw)
  To: 'Robin Bowes', 'Gordon Henderson'; +Cc: linux-raid

Not normal.  Was it ever synced?
Wait for it to re-sync, then reboot and check the status.
Some people have problems after a reboot.

Cat /proc/mdstat for the ETA and status.
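
For example:

  cat /proc/mdstat              # one-off look at the state and resync ETA
  watch -n 60 cat /proc/mdstat  # or keep an eye on it as it goes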

Guy

-----Original Message-----
From: linux-raid-owner@vger.kernel.org
[mailto:linux-raid-owner@vger.kernel.org] On Behalf Of Robin Bowes
Sent: Friday, October 08, 2004 7:44 PM
To: Gordon Henderson
Cc: linux-raid@vger.kernel.org
Subject: Re: Stress testing system?

Gordon Henderson wrote:
> On Fri, 8 Oct 2004, Robin Bowes wrote:
> 
> 
>>What's the best way to do get all six drives working as hard as possible?
> 
> 
> I always run 'bonnie' on each partition (sometimes 2 to a partition) when
> soak-testing a new server. Try to leave it running for as long as
> possible. (ie. days)

Hi Gordon,

I tried this - just a simple command to start with:

# bonnie++ -d /home -s10 -r4 -u0

This gave the following results:

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
dude.robinbowes 10M 11482  92 +++++ +++ +++++ +++ 15370 100 +++++ +++ 13406 124
                     ------Sequential Create------ --------Random Create--------
                     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                  16   347  88 +++++ +++ 19794  91   332  86 +++++ +++  1106  93
dude.robinbowes.com,10M,11482,92,+++++,+++,+++++,+++,15370,100,+++++,+++,13406.0,124,16,347,88,+++++,+++,19794,91,332,86,+++++,+++,1106,93


I then noticed that my raid array was using a lot of CPU:

top - 00:41:28 up 33 min,  2 users,  load average: 1.80, 1.78, 1.57
Tasks:  89 total,   1 running,  88 sleeping,   0 stopped,   0 zombie
Cpu(s):  4.1% us, 32.9% sy,  0.0% ni, 59.8% id,  0.0% wa,  0.5% hi,  2.6% si
Mem:   1554288k total,   368212k used,  1186076k free,    70520k buffers
Swap:        0k total,        0k used,        0k free,   200140k cached

   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
   239 root      15   0     0    0    0 S 61.0  0.0  20:08.37 md5_raid5
  1414 slimserv  15   0 43644  38m 5772 S  9.0  2.5   2:38.99 slimserver.pl
   241 root      15   0     0    0    0 D  6.3  0.0   2:05.45 md5_resync
  1861 root      16   0  2888  908 1620 R  1.0  0.1   0:00.28 top
  1826 root      16   0  9332 2180 4232 S  0.3  0.1   0:00.28 sshd

So I checked the array:

[root@dude root]# mdadm --detail /dev/md5
/dev/md5:
         Version : 00.90.01
   Creation Time : Thu Jul 29 21:41:38 2004
      Raid Level : raid5
      Array Size : 974566400 (929.42 GiB 997.96 GB)
     Device Size : 243641600 (232.35 GiB 249.49 GB)
    Raid Devices : 5
   Total Devices : 6
Preferred Minor : 5
     Persistence : Superblock is persistent

     Update Time : Sat Oct  9 00:08:22 2004
           State : dirty, resyncing
  Active Devices : 5
Working Devices : 6
  Failed Devices : 0
   Spare Devices : 1

          Layout : left-symmetric
      Chunk Size : 128K

  Rebuild Status : 12% complete

            UUID : a4bbcd09:5e178c5b:3bf8bd45:8c31d2a1
          Events : 0.1410301

     Number   Major   Minor   RaidDevice State
        0       8        2        0      active sync   /dev/sda2
        1       8       18        1      active sync   /dev/sdb2
        2       8       34        2      active sync   /dev/sdc2
        3       8       50        3      active sync   /dev/sdd2
        4       8       66        4      active sync   /dev/sde2

        5       8       82        -      spare   /dev/sdf2

Is this normal? Should running bonnie++ result in the array being dirty and
requiring resyncing?

R.
-- 
http://robinbowes.com


^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: Stress testing system?
  2004-10-08 22:00 ` Mike Hardy
  2004-10-08 22:07   ` Gordon Henderson
@ 2004-10-08 23:49   ` Guy
  1 sibling, 0 replies; 15+ messages in thread
From: Guy @ 2004-10-08 23:49 UTC (permalink / raw)
  To: linux-raid

When using wget, run it on both systems (full duplex).  This will transfer
about twice as much data per second.  Assuming you have a full duplex
network!  If not, switches are very low cost today.
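
i.e. something like this running on each box, pointed at the other (the
URL is only an example):

  while :; do wget -O /dev/null http://the-other-box/1GB.bin; done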

I think this is what you intended, but your note was vague IMO.

Guy

-----Original Message-----
From: linux-raid-owner@vger.kernel.org
[mailto:linux-raid-owner@vger.kernel.org] On Behalf Of Mike Hardy
Sent: Friday, October 08, 2004 6:00 PM
To: 'linux-raid@vger.kernel.org'
Subject: Re: Stress testing system?


This is a little off topic, but I stress systems with four loops

loop one unpacks the kernel source, moves it to a new name, unpacks the 
kernel source again, and diffs the two, deletes and repeats (tests 
memory, disk caching)

loop two unpacks the kernel source, does a make allconfig and a make -j5 
bzImage modules, then a make clean, and repeats. That should get the CPU burning

loop three should run bonnie++ on the array

loop four should work with another machine. Each machine should wget 
some very large file (1GBish) with output to /dev/null so that the NIC 
has to serve interrupts at max

If that doesn't cook your machine in 48 hours or so, I can't think of 
anything that will.

This catches out every machine I try it on for some reason or another, 
but after a couple of tweaks it's usually solid.

Slightly more on-topic, one thing that I have to do frequently is boot 
with noapic or acpi=off due to interrupt handling problems with various 
motherboards.

Additionally, I think there have been reports of problems with raid and 
LVM, and there have also been problems with SATA and possibly with 
Maxtor drives, so you may have some tweaking to do. Mentioning 
versions of things (distribution, kernel, hardware parts and part 
numbers etc) would help

I'm interested to hear what other people do to burn their machines in 
though...

Good luck.
-Mike

Robin Bowes wrote:
> Hi,
> 
> I've got six 250GB Maxtor drives connected to 2 Promise SATA controllers 
>  configured as follows:
> 
> Each disk has two partitions: 1.5G and 248.5G.
> 
> /dev/sda1 & /dev/sdd1 are mirrored and form the root filesystem.
> 
> /dev/sd[abcdef]2 are configured as a RAID5 array with one hot spare.
> 
> I use lvm to create a 10G /usr partition, a 5G /var partition, and the 
> rest of the array (994G) in /home.
> 
> The system in which I installed these drives was rock-solid before I 
> added the RAID storage (it had a single 120G drive). However, since 
> adding the 6 disks I have experienced the system simply powering down 
> and requiring filesystem recovery when it restarted.
> 
> I suspected this was down to an inadequate power supply (it was 400W) so 
> I've upgraded to an OCZ 520W PSU.
> 
> I'd like to stress test the system to see if the new PSU has sorted the 
> problem, i.e. really work the disks.
> 
> What's the best way to get all six drives working as hard as possible?
> 
> Thanks,
> 
> R.



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Stress testing system?
  2004-10-08 23:48     ` Guy
@ 2004-10-09  9:52       ` Robin Bowes
  2004-10-09 16:58         ` Guy
  0 siblings, 1 reply; 15+ messages in thread
From: Robin Bowes @ 2004-10-09  9:52 UTC (permalink / raw)
  To: Guy; +Cc: 'Gordon Henderson', linux-raid

Guy wrote:
> Not normal.  Was it ever synced?
> Wait for it to re-sync, then reboot and check the status.
> Some people have problems after a reboot.
> 
> Cat /proc/mdstat for the ETA and status.

Well, the array re-synced overnight and I ran bonnie++ again a couple 
of times this morning with no apparent ill effects.

To be honest, I'd just had some problems with a loose power cable so I 
suspect that the array was already re-syncing before I ran bonnie.

Thanks,

R.
-- 
http://robinbowes.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: Stress testing system?
  2004-10-09  9:52       ` Robin Bowes
@ 2004-10-09 16:58         ` Guy
  2004-10-09 17:19           ` Robin Bowes
  0 siblings, 1 reply; 15+ messages in thread
From: Guy @ 2004-10-09 16:58 UTC (permalink / raw)
  To: 'Robin Bowes'; +Cc: linux-raid

Once a drive fails, md will not re-sync it automatically.  It will just sit
there in a failed state.  If you had a spare, then it would re-sync
automatically.  I am not 100% sure, but I think if you were to reboot,
md would resync after the reboot.

If you reboot before the re-sync is done, the re-sync will start over, at
least with my version.
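
If you do end up with a failed disk you want back in the array, something
like this (device names only as an example) should put it back:

  mdadm /dev/md5 --remove /dev/sdf2   # drop the failed partition from the array
  mdadm /dev/md5 --add /dev/sdf2      # add it back; md then rebuilds onto it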

Guy

-----Original Message-----
From: linux-raid-owner@vger.kernel.org
[mailto:linux-raid-owner@vger.kernel.org] On Behalf Of Robin Bowes
Sent: Saturday, October 09, 2004 5:52 AM
To: Guy
Cc: 'Gordon Henderson'; linux-raid@vger.kernel.org
Subject: Re: Stress testing system?

Guy wrote:
> Not normal.  Was it ever synced?
> Wait for it to re-sync, then reboot and check the status.
> Some people have problems after a reboot.
> 
> Cat /proc/mdstat for the ETA and status.

Well, the array re-synced overnight and I ran bonnie++ again a couple 
of times this morning with no apparent ill effects.

To be honest, I'd just had some problems with a loose power cable so I 
suspect that the array was already re-syncing before I ran bonnie.

Thanks,

R.
-- 
http://robinbowes.com


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Stress testing system?
  2004-10-09 16:58         ` Guy
@ 2004-10-09 17:19           ` Robin Bowes
  0 siblings, 0 replies; 15+ messages in thread
From: Robin Bowes @ 2004-10-09 17:19 UTC (permalink / raw)
  To: Guy; +Cc: linux-raid

Guy wrote:
> Once a drive fails, md will not re-sync it automatically.  It will just sit
> there in a failed state.  If you had a spare, then it would re-sync
> automatically.  I am not 100% sure, but I think if you were to reboot,
> md would resync after the reboot.
> 
> If you reboot before the re-sync is done, the re-sync will start over, at
> least with my version.

I think that's what must have happened.

I replaced the power supply and brought the box back up. It then froze 
on me, so I powered down again and investigated, bringing the box up and 
down a few times in the process. I then noticed the loose power 
connection, fixed it, and brought the box back up. I reckon the array 
must have been re-syncing from that point and it had nothing to do with 
bonnie++.

Incidentally, I do have a spare drive:

[root@dude home]# mdadm --detail /dev/md5
/dev/md5:
         Version : 00.90.01
   Creation Time : Thu Jul 29 21:41:38 2004
      Raid Level : raid5
      Array Size : 974566400 (929.42 GiB 997.96 GB)
     Device Size : 243641600 (232.35 GiB 249.49 GB)
    Raid Devices : 5
   Total Devices : 6
Preferred Minor : 5
     Persistence : Superblock is persistent

     Update Time : Sat Oct  9 18:18:54 2004
           State : clean
  Active Devices : 5
Working Devices : 6
  Failed Devices : 0
   Spare Devices : 1

          Layout : left-symmetric
      Chunk Size : 128K

            UUID : a4bbcd09:5e178c5b:3bf8bd45:8c31d2a1
          Events : 0.1410301

     Number   Major   Minor   RaidDevice State
        0       8        2        0      active sync   /dev/sda2
        1       8       18        1      active sync   /dev/sdb2
        2       8       34        2      active sync   /dev/sdc2
        3       8       50        3      active sync   /dev/sdd2
        4       8       66        4      active sync   /dev/sde2

        5       8       82        -      spare   /dev/sdf2

Cheers,

R.
-- 
http://robinbowes.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Stress testing system?
  2004-10-08 23:44   ` Robin Bowes
  2004-10-08 23:48     ` Guy
@ 2004-10-10 20:36     ` Gordon Henderson
  2004-10-10 21:35       ` Robin Bowes
  1 sibling, 1 reply; 15+ messages in thread
From: Gordon Henderson @ 2004-10-10 20:36 UTC (permalink / raw)
  To: Robin Bowes; +Cc: linux-raid

On Sat, 9 Oct 2004, Robin Bowes wrote:

> I tried this - just a simple command to start with:
>
> # bonnie++ -d /home -s10 -r4 -u0
>
> This gave the following results:
>
> Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
>                      -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> dude.robinbowes 10M 11482  92 +++++ +++ +++++ +++ 15370 100 +++++ +++ 13406 124

Why are you removing the speed - is it something to be embarrassed about?

If it's very slow, are you sure the devices are operating in a DMA mode?
PIO mode will use a lot of CPU and generally make things really clunky...
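
(For drives that show up as /dev/hdX you can check and time that with
hdparm, e.g.:)

  hdparm -d /dev/hda    # shows "using_dma = 1 (on)" if DMA is enabled
  hdparm -tT /dev/hda   # quick cached / buffered read timings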

> Is this normal? Should running bonnie++ result in the array being dirty
> and requiring resyncing?

No - but reading some of the later replies it seems it might not have been
fully synced to start with?

Have you let it sync now and run the tests again?

Ah right - I've just run that bonnie myself - it's +++'d out the times as
10MB is really too small a file to do anything accurate with and you've
told it you only have 4MB of RAM. It'll all end up in memory cache. I got
similar results with that command.

Don't bother with the -n option, and do get it to use a filesize of double
your RAM size. You really just want to move data into & out of the disks,
who cares (at this point) about actual file, seek, etc. IO. I use the
following scripts when testing:

/usr/local/bin/doBon:

  #!/bin/csh
  @ n = 1
  while (1)
    echo Pass number $n
    bonnie -u0 -g0 -n0 -s 1024
    @ n = $n + 1
  end

/usr/local/bin/doBon2:

  #!/bin/csh
  doBon & sleep 120
  doBon

and usually run a "doBon2" on each partition. Memory size here is 512MB.

Gordon

Ps. stop this with killall doBon2 ; killall doBon ; killall Bonnie
  ...  then rm the Bonnie* files...

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Stress testing system?
  2004-10-10 20:36     ` Gordon Henderson
@ 2004-10-10 21:35       ` Robin Bowes
  2004-10-10 22:38         ` Guy
  2004-10-11  8:38         ` Gordon Henderson
  0 siblings, 2 replies; 15+ messages in thread
From: Robin Bowes @ 2004-10-10 21:35 UTC (permalink / raw)
  To: Gordon Henderson; +Cc: linux-raid

Gordon Henderson wrote:
> On Sat, 9 Oct 2004, Robin Bowes wrote:
>>Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
>>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
>>Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
>>dude.robinbowes 10M 11482  92 +++++ +++ +++++ +++ 15370 100 +++++ +++ 13406 124
> 
> 
> Why are you removing the speed - is it something to be embarrassed about?

As you found out, bonnie does this without any carbon-based intervention!

>>Is this normal? Should running bonnie++ result in the array being dirty
>>and requiring resyncing?
> 
> 
> No - but reading some of the later replies it seems it might not have been
> fully synced to start with?

On reflection, I'm pretty sure it wasn't. It is now.

> Have you let it sync now and run the tests again?

Yes. It was faster when the array had re-synced :)

> Ah right - I've just run that bonnie myself - it's +++'d out the times as
> 10MB is really too small a file to do anything accurate with and you've
> told it you only have 4MB of RAM. It'll all end up in memory cache. I got
> similar results with that command.
> 
> Don't bother with the -n option, and do get it to use a filesize of double
> your RAM size. You really just want to move data into & out of the disks,
> who cares (at this point) about actual file, seek, etc. IO. I use the
> following scripts when testing:
> 
> /usr/local/bin/doBon:
> 
>   #!/bin/csh
>   @ n = 1
>   while (1)
>     echo Pass number $n
>     bonnie -u0 -g0 -n0 -s 1024
>     @ n = $n + 1
>   end
> 
> /usr/local/bin/doBon2:
> 
>   #!/bin/csh
>   doBon & sleep 120
>   doBon
> 
> and usually run a "doBon2" on each partition. Memory size here is 512MB.

OK, I've tried:

    bonnie++ -d /home -u0 -g0 -n0 -s 3096

(I've got 1.5G of RAM here - RAM's so cheap it's daft not to!)

This gave the following results:

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
dude.robinbow 3096M 13081  95 34159  75 12617  21 15311  92 40429  30 436.1   3
dude.robinbowes.com,3096M,13081,95,34159,75,12617,21,15311,92,40429,30,436.1,3,,,,,,,,,,,,,

I don't actually know what the figures mean - is this fast??

R.
-- 
http://robinbowes.com


^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: Stress testing system?
  2004-10-10 21:35       ` Robin Bowes
@ 2004-10-10 22:38         ` Guy
  2004-10-11  8:38         ` Gordon Henderson
  1 sibling, 0 replies; 15+ messages in thread
From: Guy @ 2004-10-10 22:38 UTC (permalink / raw)
  To: 'Robin Bowes', 'Gordon Henderson'; +Cc: linux-raid

My system is a 500 MHz P3 with 2 CPUs and 512 Meg of RAM.
My array is faster!  Hehe :)

I have a 14 disk raid5 array.  18 Gig SCSI disks, 3 SCSI buses.

bonnie++ -u0 -g0 -n0 -s 1024
Version  1.03 ------Sequential Output------ --Sequential Input- --Random-
              -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine  Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
watkins-ho 1G  3293  97 34587  89 22306  53  3549  99 67492  63 500.1   9
dude.ro 3096M 13081  95 34159  75 12617  21 15311  92 40429  30 436.1   3

2 at the same time.  This used both CPUs.  Over 100 Meg per second reads!
watkins-ho 1G  3091  85 18593  44  9733  24  3443  97 59895  60 249.6   6
watkins-ho 1G  2980  87 21176  54 10167  23  3478  99 44525  44 384.2   9
               ----     -----     -----      ----    ------     -----
Total          6071     39769     19900      6921    104420     633.8

You win on "Per Chr", this is CPU bound since it reads only 1 byte at a
time.  This is more of a CPU speed test than a disk speed test, IMHO.
During the "Per Chr" test, only 1 CPU had a load, it was at about 100%.
My guess is you have a real computer!  Maybe 1.5 Ghz.

In the other tests your CPU usage was lower, which is good for you.

Random seeks... My guess is having 14 moving heads helps me a lot on this
one!  Since my disks are old.  But they are 10,000 RPM.

The bottom line:  I don't know if my array is considered fast.  I bet my
array is slow.  Today disks are so much faster than what I have.  But I have
more of them, which helps performance.

Guy

-----Original Message-----
From: linux-raid-owner@vger.kernel.org
[mailto:linux-raid-owner@vger.kernel.org] On Behalf Of Robin Bowes
Sent: Sunday, October 10, 2004 5:35 PM
To: Gordon Henderson
Cc: linux-raid@vger.kernel.org
Subject: Re: Stress testing system?

Gordon Henderson wrote:
> On Sat, 9 Oct 2004, Robin Bowes wrote:
>>Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
>>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
>>Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
>>dude.robinbowes 10M 11482  92 +++++ +++ +++++ +++ 15370 100 +++++ +++ 13406 124
> 
> 
> Why are you removing the speed - is it something to be embarrassed about?

As you found out, bonnie does this without any carbon-based intervention!

>>Is this normal? Should running bonnie++ result in the array being dirty
>>and requiring resyncing?
> 
> 
> No - but reading some of the later replies it seems it might not have been
> fully synced to start with?

On reflection, I'm pretty sure it wasn't. It is now.

> Have you let it sync now and run the tests again?

Yes. It was faster when the array had re-synced :)

> Ah right - I've just run that bonnie myself - it's +++'d out the times as
> 10MB is really too small a file to do anything accurate with and you've
> told it you only have 4MB of RAM. It'll all end up in memory cache. I got
> similar results with that command.
> 
> Don't bother with the -n option, and do get it to use a filesize of double
> your RAM size. You really just want to move data into & out of the disks,
> who cares (at this point) about actual file, seek, etc. IO. I use the
> following scripts when testing:
> 
> /usr/local/bin/doBon:
> 
>   #!/bin/csh
>   @ n = 1
>   while (1)
>     echo Pass number $n
>     bonnie -u0 -g0 -n0 -s 1024
>     @ n = $n + 1
>   end
> 
> /usr/local/bin/doBon2:
> 
>   #!/bin/csh
>   doBon & sleep 120
>   doBon
> 
> and usually run a "doBon2" on each partition. Memory size here is 512MB.

OK, I've tried:

    bonnie++ -d /home -u0 -g0 -n0 -s 3096

(I've got 1.5G of RAM here - RAM's so cheap it's daft not to!)

This gave the following results:

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
dude.robinbow 3096M 13081  95 34159  75 12617  21 15311  92 40429  30 436.1   3
dude.robinbowes.com,3096M,13081,95,34159,75,12617,21,15311,92,40429,30,436.1,3,,,,,,,,,,,,,

I don't actually know what the figures mean - is this fast??

R.
-- 
http://robinbowes.com



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Stress testing system?
  2004-10-10 21:35       ` Robin Bowes
  2004-10-10 22:38         ` Guy
@ 2004-10-11  8:38         ` Gordon Henderson
  2004-10-11  9:01           ` Brad Campbell
  1 sibling, 1 reply; 15+ messages in thread
From: Gordon Henderson @ 2004-10-11  8:38 UTC (permalink / raw)
  To: Robin Bowes; +Cc: linux-raid

On Sun, 10 Oct 2004, Robin Bowes wrote:

> This gave the following results:
>
> Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
>                      -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> dude.robinbow 3096M 13081  95 34159  75 12617  21 15311  92 40429  30 436.1   3
> dude.robinbowes.com,3096M,13081,95,34159,75,12617,21,15311,92,40429,30,436.1,3,,,,,,,,,,,,,
>
> I don't actually know what the figures mean - is this fast??

It's not brilliant, but reasonable for a RAID5 on IDE drives. Disk head
bandwidth for commodity 7200 RPM drives is about 55MB/sec peak - although I
haven't been able to get that with the SATA controllers I've used so far;
I suspect that's because they are running in PATA mode.

This is a Dell with 4 SCSI drives split over 2 controllers:

Version 1.02b       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
pixel         2000M 20492  72 54865  12 28260   6 25413  86 108190  13 332.9   0


This is a dual-Athlon with 5 IDE drives (4 + a hot spare):

Version 1.02b       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
red           1023M 14216  99 26663  25 17697  13 13767  95 72211  34 239.5   2

This is another dual Athlon with just 4 IDE drives:

Version 1.02b       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
blue          1023M 12003  99 40147  44 24048  19 12078  98 87887  46 232.6   2

Your block input seems a shade low, but this is what I experienced on a
server with SATA drives which look like /dev/hdX drives. I suspect the
drivers have a bit more development to go through though.

And in any case, depending on what you are using it for, it's probably
fast enough anyway... 100Mb Ethernet can only chuck files out at about
10MB/sec, but it's always nice to have bandwidth in hand!

Gordon

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Stress testing system?
  2004-10-11  8:38         ` Gordon Henderson
@ 2004-10-11  9:01           ` Brad Campbell
  0 siblings, 0 replies; 15+ messages in thread
From: Brad Campbell @ 2004-10-11  9:01 UTC (permalink / raw)
  To: Gordon Henderson; +Cc: Robin Bowes, linux-raid

Gordon Henderson wrote:
> On Sun, 10 Oct 2004, Robin Bowes wrote:
> 
> 
>>This gave the following results:
>>
>>Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
>>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
>>Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
>>dude.robinbow 3096M 13081  95 34159  75 12617  21 15311  92 40429  30 436.1   3
>>dude.robinbowes.com,3096M,13081,95,34159,75,12617,21,15311,92,40429,30,436.1,3,,,,,,,,,,,,,
>>
>>I don't actually know what the figures mean - is this fast??
> 
> 
> It's not brilliant, but reasonable for a RAID5 on IDE drives. Disk head
> bandwidth for commodity 7200 RPM drives is about 55MB/sec peak - although I
> haven't been able to get that with the SATA controllers I've used so far;
> I suspect that's because they are running in PATA mode.
> 

Just a point of reference. This is on a single Athlon 2600+ with 10 7200 RPM Maxtor drives on 3 
Promise SATA150TX4 controllers:

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
srv              1G 14502  45 23251  12 17102   9 23390  67 67688  24 638.4   1
srv,1G,14502,45,23251,12,17102,9,23390,67,67688,24,638.4,1,,,,,,,,,,,,,

Brad

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2004-10-11  9:01 UTC | newest]

Thread overview: 15+ messages
-- links below jump to the message on this page --
2004-10-08 21:32 Stress testing system? Robin Bowes
2004-10-08 22:00 ` Mike Hardy
2004-10-08 22:07   ` Gordon Henderson
2004-10-08 23:49   ` Guy
2004-10-08 22:02 ` Gordon Henderson
2004-10-08 23:44   ` Robin Bowes
2004-10-08 23:48     ` Guy
2004-10-09  9:52       ` Robin Bowes
2004-10-09 16:58         ` Guy
2004-10-09 17:19           ` Robin Bowes
2004-10-10 20:36     ` Gordon Henderson
2004-10-10 21:35       ` Robin Bowes
2004-10-10 22:38         ` Guy
2004-10-11  8:38         ` Gordon Henderson
2004-10-11  9:01           ` Brad Campbell
