raid1 performance


* raid1 performance
@ 2002-04-30 12:23 Jaime Medrano
  2002-04-30 12:38 ` Arjan van de Ven
  0 siblings, 1 reply; 26+ messages in thread
From: Jaime Medrano @ 2002-04-30 12:23 UTC (permalink / raw)
  To: linux-kernel

I have several raid arrays (level 0 and 1) in my machine and I have
noticed that raid1 is much slower than I expected.

The arrays are made from two identical hds (/dev/hde, /dev/hdg). Some
numbers on read performance:

/dev/hde: 29 MB/s
/dev/hdg: 29 MB/s
/dev/md0: 27 MB/s (raid1)
/dev/md1: 56 MB/s (raid0)
/dev/md2: 27 MB/s (raid1)

These numbers come from hdparm -tT. I have noticed very poor
performance when sequentially reading a large file from raid1 (I suppose
this is what hdparm does).

I have taken a look at the read balancing code in raid1.c and I have found
that when a sequential read happens no balancing is done, and so all the
reading is done from only one of the mirrors while the others are idle.

I have tried to modify the balancing algorithm in order to balance
sequential access as well, but I got almost the same numbers.

I think the reason may be that some layer below is making
reads of greater size than the chunks in which I balance, and so the same
work is being done twice; but I don't know how to find this out.

Does anybody know how this works?

Regards,
Jaime Medrano


* Re: raid1 performance
  2002-04-30 12:23 raid1 performance Jaime Medrano
@ 2002-04-30 12:38 ` Arjan van de Ven
  2002-04-30 14:21   ` Kent Borg
  0 siblings, 1 reply; 26+ messages in thread
From: Arjan van de Ven @ 2002-04-30 12:38 UTC (permalink / raw)
  To: Jaime Medrano; +Cc: linux-kernel

Jaime Medrano wrote:
> 
> I have several raid arrays (level 0 and 1) in my machine and I have
> noticed that raid1 is much slower than I expected.
>
> The arrays are made from two identical hds (/dev/hde, /dev/hdg). Some
> numbers on read performance:
>
> /dev/hde: 29 MB/s
> /dev/hdg: 29 MB/s
> /dev/md0: 27 MB/s (raid1)
> /dev/md1: 56 MB/s (raid0)
> /dev/md2: 27 MB/s (raid1)
>
> These numbers come from hdparm -tT. I have noticed very poor
> performance when sequentially reading a large file from raid1 (I suppose
> this is what hdparm does).
>
> I have taken a look at the read balancing code in raid1.c and I have found
> that when a sequential read happens no balancing is done, and so all the
> reading is done from only one of the mirrors while the others are idle.

Yes this is expected. Sequential reads from RAID1 with the 
current on disk format are as fast as the fastest disk.
The reason for this is simple: 

<ascii art of the on disk layout, each letter is a "block">

Disk 1:  ABCDEFGHIJK
Disk 2:  ABCDEFGHIJK

If you read block A from disk 1, then to get more than the speed of just
one disk you would need to read block B from disk 2 *in parallel*. So far
so good. However, you then need to read block C, and to do it in parallel
you need to read it from disk 1. But disk 1's head was at block A, so you
get a head seek. Or, if the drive is trying to be intelligent, it will read
block B into its own cache anyway and then block C after that (which is the
more common case), and so on.

This latter case effectively means that disk 1 will still read ALL blocks
from the platter into the drive's cache, and of course disk 2 will do
likewise. In just about all cases you care about, the platter transfer rate
is the limiting factor and not the "disk to host" rate. So both disk 1 and
disk 2 are reading ALL the data at platter speed, which means the maximum
speed at which you can get the data is platter speed.

Now if the disk wasn't smart and was doing seeks, it would suck much, much
more due to the high cost of seeks...

The only way to get the "1 thread sequential read" case faster is by
modifying the disk layout to be

Disk 1: ACEGIKBDFHJ
Disk 2: ACEGIKBDFHJ

where disk 1 again reads block A, and disk 2 reads block B.
To read block C, disk 1 doesn't have to move its head or read and discard
a dummy block; it can read block C sequentially, and disk 2 can read
block D the same way.

That way each disk actually reads only the relevant blocks, sequentially,
and you get (in theory) 2x the performance of one disk.
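
The two layouts above can be compared with a small sketch. This is only an
illustration of the argument, not md code: the names interleaved_layout and
platter_span are made up here, and it assumes the drive reads through
unwanted blocks instead of seeking over them. It counts how many physical
blocks pass under each disk's head when disk 1 serves A, C, E, ... and
disk 2 serves B, D, F, ...

```python
import string


def interleaved_layout(n_blocks):
    """Physical order of logical blocks: even-indexed blocks first, then odd."""
    evens = list(range(0, n_blocks, 2))
    odds = list(range(1, n_blocks, 2))
    return evens + odds


def platter_span(layout, wanted):
    """Physical blocks passing under the head while one disk reads the
    logical blocks in `wanted` front to back without seeking."""
    positions = [layout.index(block) for block in wanted]
    return max(positions) - min(positions) + 1


n = 11                               # blocks A..K, as in the diagrams
plain = list(range(n))               # ABCDEFGHIJK
modified = interleaved_layout(n)     # ACEGIKBDFHJ
disk1_blocks = list(range(0, n, 2))  # A, C, E, G, I, K
disk2_blocks = list(range(1, n, 2))  # B, D, F, H, J

names = string.ascii_uppercase
print("".join(names[b] for b in modified))   # ACEGIKBDFHJ
for label, layout in (("plain", plain), ("modified", modified)):
    # plain: 11 and 9 blocks pass under the heads; modified: 6 and 5
    print(label, platter_span(layout, disk1_blocks),
          platter_span(layout, disk2_blocks))
```

With the plain layout each disk still sweeps over nearly the whole range to
deliver half the data; with the modified layout each disk only sweeps over
the blocks it actually returns.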

Greetings,
    Arjan van de Ven


* Re: raid1 performance
  2002-04-30 12:38 ` Arjan van de Ven
@ 2002-04-30 14:21   ` Kent Borg
  2002-05-01 16:35     ` Jakob Østergaard
  0 siblings, 1 reply; 26+ messages in thread
From: Kent Borg @ 2002-04-30 14:21 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Jaime Medrano, linux-kernel

On Tue, Apr 30, 2002 at 01:38:16PM +0100, Arjan van de Ven wrote, very
roughly: 
[that RAID 1 is only as fast in reading as the fastest disk because of
seeking over alternate blocks, and ]

> The only way to get the "1 thread sequential read" case faster is by
> modifying the disk layout to be
> 
> Disk 1: ACEGIKBDFHJ
> Disk 2: ACEGIKBDFHJ
> 
> where disk 1 again reads block A, and disk 2 reads block B.  To read
> block C, disk 1 doesn't have to move its head or read and discard a
> dummy block; it can read block C sequentially, and disk 2 can read
> block D the same way.
>
> That way the disks actually each only read the relevant blocks in a
> sequential way and you get (in theory) 2x the performance of 1 disk.

I am confused.  

Assuming a big enough read is requested to allow a parallelizing to
two disks, why can't the second disk be told not to read alternate
blocks but to start reading sequential blocks starting half way up the
request?
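
For illustration, the split being asked about could look like the sketch
below. The function split_request is hypothetical and only shows the
arithmetic of handing each mirror one contiguous piece of a large request;
it says nothing about whether md can do this in practice.

```python
def split_request(start, length, n_disks):
    """Split the sector range [start, start + length) into contiguous
    pieces, one per mirror, as (disk index, first sector, sector count)."""
    base, extra = divmod(length, n_disks)
    pieces = []
    offset = start
    for disk in range(n_disks):
        # Spread any remainder over the first `extra` disks.
        size = base + (1 if disk < extra else 0)
        if size:
            pieces.append((disk, offset, size))
        offset += size
    return pieces


# A 1024-sector read on a two-disk mirror: disk 1 starts half way up.
print(split_request(0, 1024, 2))   # [(0, 0, 512), (1, 512, 512)]
```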

Also, why does hdparm give me significantly faster read numbers on
/dev/md<whatever> than it does on /dev/hd<whatever>?  I had assumed
there was parallelizing going on.  Does this mean I would get a speed
improvement if I ran my single disk notebook as a single disk RAID 1
because there is some bigger or better buffering going on in that code
even without parallelizing?

Thanks,

-kb


* Re: raid1 performance
  2002-04-30 14:21   ` Kent Borg
@ 2002-05-01 16:35     ` Jakob Østergaard
  2002-05-01 17:01       ` Kent Borg
  0 siblings, 1 reply; 26+ messages in thread
From: Jakob Østergaard @ 2002-05-01 16:35 UTC (permalink / raw)
  To: Kent Borg; +Cc: Arjan van de Ven, Jaime Medrano, linux-kernel

On Tue, Apr 30, 2002 at 10:21:48AM -0400, Kent Borg wrote:
> On Tue, Apr 30, 2002 at 01:38:16PM +0100, Arjan van de Ven wrote, very
> roughly: 
> [that RAID 1 is only as fast in reading as the fastest disk because of
> seeking over alternate blocks, and ]
> 
> > The only way to get the "1 thread sequential read" case faster is by
> > modifying the disk layout to be
> > 
> > Disk 1: ACEGIKBDFHJ
> > Disk 2: ACEGIKBDFHJ
> > 
> > where disk 1 again reads block A, and disk 2 reads block B.  To read
> > block C, disk 1 doesn't have to move its head or read and discard a
> > dummy block; it can read block C sequentially, and disk 2 can read
> > block D the same way.
> >
> > That way the disks actually each only read the relevant blocks in a
> > sequential way and you get (in theory) 2x the performance of 1 disk.
> 
> I am confused.  
> 
> Assuming a big enough read is requested to allow a parallelizing to
> two disks, why can't the second disk be told not to read alternate
> blocks but to start reading sequential blocks starting half way up the
> request?

This is *not* as simple as it sounds.  Believe me, I spent a week trying...

However, with ext2 (and other filesystems as well), a large sequential file
read is *not* sequential on the disk.  You should actually see better performance
on RAID-1 than on a single disk for very large reads, because some of the lookups
needed (block indirection or whatever) will be run by the "best" disk in the given
situation.

> 
> Also, why does hdparm give me significantly faster read numbers on
> /dev/md<whatever> than it does on /dev/hd<whatever>?  I had assumed
> there was parallelizing going on.  Does this mean I would get a speed
> improvement if I ran my single disk notebook as a single disk RAID 1
> because there is some bigger or better buffering going on in that code
> even without parallelizing?

hdparm is not a good benchmark for this.

Use bonnie, bonnie++, tiotest, or even 'dd' with *huge* files.

-- 
................................................................
:   jakob@unthought.net   : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:


* Re: raid1 performance
  2002-05-01 16:35     ` Jakob Østergaard
@ 2002-05-01 17:01       ` Kent Borg
  2002-05-01 17:16         ` Justin Cormack
  2002-05-01 21:23         ` Bernd Eckenfels
  0 siblings, 2 replies; 26+ messages in thread
From: Kent Borg @ 2002-05-01 17:01 UTC (permalink / raw)
  To: Jakob Østergaard, Arjan van de Ven, Jaime Medrano,
	linux-kernel

On Wed, May 01, 2002 at 06:35:53PM +0200, Jakob Østergaard wrote:
> This is *not* as simple as it sounds.  Believe me, I spent a week trying...
> 
> However, with ext2 (and other filesystems as well), a large sequential file
> read is *not* sequential on the disk.  You should actually see better performance
> on RAID-1 than on a single disk for very large reads, because some of the lookups
> needed (block indirection or whatever) will be run by the "best" disk in the given
> situation.

Lemme see if I am getting closer.  

When reading the disk there will be head seeks necessary.  When there
are two disks, each with its own complete copy of all the data, there
is no reason to keep the two disks' heads in the same place.  If their
heads are in different places, a read can be issued to the disk whose
heads are closer to the desired location.

This then brings up two more questions:

  1. Does the OS even know where the heads are in a modern IDE disk?

  2. Is "closer" any more finely grained than a binary
     positioned/not-positioned?

And I guess another question: How much does RAID 1 help and under what
kinds of usage?

Thanks,

-kb, the Kent who is getting smarter.


* Re: raid1 performance
  2002-05-01 17:01       ` Kent Borg
@ 2002-05-01 17:16         ` Justin Cormack
  2002-05-01 21:23         ` Bernd Eckenfels
  1 sibling, 0 replies; 26+ messages in thread
From: Justin Cormack @ 2002-05-01 17:16 UTC (permalink / raw)
  To: Kent Borg; +Cc: linux-kernel

> Lemme see if I am getting closer.  
> 
> When reading the disk there will be head seeks necessary.  When there
> are two disks, each with its own complete copy of all the data, there
> is no reason to keep the two disks' heads in the same place.  If their
> heads are in different places, a read can be issued to the disk whose
> heads are closer to the desired location.

Yes. Look at raid1.c: the code is quite clear. Older versions didn't do this.

> This then brings up two more questions:
> 
>   1. Does the OS even know where the heads are in a modern IDE disk?

Not really. But there is probably a vague correspondence. Especially if
you haven't remapped any bad sectors.

>   2. Is "closer" any more finely grained than a binary
>      positioned/not-positioned?

I think so. You can see different performance regions on disks (ie they
are faster on the outside for example). You could of course write a program
to test seek times from different areas and build up a real locality map.
It might not be worth it though.

> And I guess another question: How much does RAID 1 help and under what
> kinds of usage?

The latency is noticeably lower in some cases, as the seeks should be shorter
on average. I have found this useful sometimes.

Justin


* Re: raid1 performance
  2002-05-01 17:01       ` Kent Borg
  2002-05-01 17:16         ` Justin Cormack
@ 2002-05-01 21:23         ` Bernd Eckenfels
  2002-05-02 16:37           ` Jakob Østergaard
  1 sibling, 1 reply; 26+ messages in thread
From: Bernd Eckenfels @ 2002-05-01 21:23 UTC (permalink / raw)
  To: linux-kernel

In article <20020501130127.A10936@borg.org> you wrote:
>  1. Does the OS even know where the heads are in a modern IDE disk?

>  2. Is "closer" any more finely grained than a binary
>     positioned/not-positioned?

> And I guess another question: How much does RAID 1 help and under what
> kinds of usage?

No, you just distribute the reads round robin; this means each disk has only
half the seeks it had before. As long as you do not spread contiguous blocks
(readahead) across the disks, the stats are good and you actually reduce
overall seeks. This helps even if no seek is involved, because you need to
wait for the beginning of a track to read it.

Greetings
Bernd


* Re: raid1 performance
  2002-05-01 21:23         ` Bernd Eckenfels
@ 2002-05-02 16:37           ` Jakob Østergaard
  2002-06-29  0:01             ` Bernd Eckenfels
  0 siblings, 1 reply; 26+ messages in thread
From: Jakob Østergaard @ 2002-05-02 16:37 UTC (permalink / raw)
  To: Bernd Eckenfels; +Cc: linux-kernel

On Wed, May 01, 2002 at 11:23:23PM +0200, Bernd Eckenfels wrote:
> In article <20020501130127.A10936@borg.org> you wrote:
> >  1. Does the OS even know where the heads are in a modern IDE disk?
> 
> >  2. Is "closer" any more finely grained than a binary
> >     positioned/not-positioned?
> 
> > And I guess another question: How much does RAID 1 help and under what
> > kinds of usage?
> 
> No, you just distribute the reads round robin; this means each disk has only
> half the seeks it had before.

No, this is the way it was done a long time ago.

It turns out to be an incredibly bad idea.  In fact, it is the most CPU-efficient
way of guaranteeing the largest average seek times on your disks  ;)

The RAID-1 code now looks at which disk worked closest to the wanted position
last, and picks that disk for the seek.

> As long as you do not spread contiguous blocks
> (readahead) across the disks, the stats are good and you actually reduce
> overall seeks. This helps even if no seek is involved, because you need to
> wait for the beginning of a track to read it.

The "new" code (which is not that new anymore) will allow one disk to keep
on a single sequential read for a long time (eventually it will kick in the
idle disk(s) though).

-- 
................................................................
:   jakob@unthought.net   : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:


* Re: raid1 performance
  2002-05-02 16:37           ` Jakob Østergaard
@ 2002-06-29  0:01             ` Bernd Eckenfels
  0 siblings, 0 replies; 26+ messages in thread
From: Bernd Eckenfels @ 2002-06-29  0:01 UTC (permalink / raw)
  To: linux-kernel

In article <20020502183758.Q31556@unthought.net> you wrote:
>> No, you just distribute the reads round robin; this means each disk has only
>> half the seeks it had before.

> No, this is the way it was done a long time ago.

> It turns out to be an incredibly bad idea.  In fact, it is the most CPU-efficient
> way of guaranteeing the largest average seek times on your disks  ;)

> The RAID-1 code now looks at which disk worked closest to the wanted position
> last, and picks that disk for the seek.

That's right, it is done on the distance in sector numbers. That's a simple
comparison; I'm not sure one could do better.

raid1.c:raid1_read_balance()
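
As a rough illustration of the two policies discussed in this thread (and
explicitly not the kernel's raid1_read_balance() code), the sketch below
compares "pick the disk whose last position is nearest in sector numbers"
with plain round robin. The helper names and the request sequence are made
up, and seek cost is simply assumed to be the sector distance.

```python
def pick_closest(heads, sector, index):
    """Disk whose last known position is nearest to `sector`."""
    return min(range(len(heads)), key=lambda d: abs(heads[d] - sector))


def pick_round_robin(heads, sector, index):
    """Disk chosen purely by request order."""
    return index % len(heads)


def total_seek(requests, n_disks, policy):
    """Sum of sector distances travelled when `policy` places each read."""
    heads = [0] * n_disks
    total = 0
    for index, sector in enumerate(requests):
        disk = policy(heads, sector, index)
        total += abs(heads[disk] - sector)
        heads[disk] = sector
    return total


# Two readers working in different regions, interleaved in pairs.
requests = [100, 110, 900, 910, 120, 130, 920, 930]
print(total_seek(requests, 2, pick_closest))      # 1060
print(total_seek(requests, 2, pick_round_robin))  # 4970
```

On this sequence round robin keeps dragging both heads back and forth
between the two regions, while the distance-based choice lets each disk
settle into one region.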

Greetings
Bernd


* RAID1: Performance.
@ 2004-04-21  5:57 Mike Mestnik
  0 siblings, 0 replies; 26+ messages in thread
From: Mike Mestnik @ 2004-04-21  5:57 UTC (permalink / raw)
  To: Linux-RAID


I have looked extensively at the code in raid1.c and found that it has a
few bugs that cause it not to do what was intended.
The biggest problem is that it seems not to consider that a write will
move all the heads to the same spot.  In the case of multi-process I/O, idle
drives may stay idle if their heads are not CLOSEST.  The other problems I
see were already discussed in a thread of a similar name.  The solution I
have come up with will, I think, satisfy everyone's concerns.

The idea of ordering the blocks ACEBDF is very close to the right
solution.  When drives are idle, is there any point in not having them read
ahead?
If the heads are in the same place, having just finished a write, they are
both just as close.  Let's use this to our advantage: after selecting which
drive we will use, take the start of the read plus that drive's read-ahead,
and have an idle disk read from there for the length of its own read-ahead
value.  This means the md device will have "N * read-ahead" read-ahead, or
the sum of all read-aheads; this should then be documented.

In the case of multi-I/O, using idle disks, though they may not be as
close, will be better than taking a disk that is working and asking it to
move to the end of the drive only to have it move back less than 1/4 of
the disk.  Here is how...

20 <-- write.
80 <-- read(1)
96 <-- read(1) NOT 2 or 3 as they are still on 20
81 <-- read(1)
97 <-- read(1)
73 <-- read(1)
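
The trace above can be replayed with a small sketch. It is a simplified
model, not the raid1.c logic: the function replay is invented here, it
assumes a mirrored write leaves every head at the written position, and it
breaks ties in favour of the lowest-numbered disk.

```python
def replay(trace, n_disks):
    """Return the 1-based disk chosen for each read in `trace`."""
    heads = [0] * n_disks
    served = []
    for op, position in trace:
        if op == "write":
            # A mirrored write goes to every disk, so all heads end up here.
            heads = [position] * n_disks
        else:
            disk = min(range(n_disks), key=lambda d: abs(heads[d] - position))
            heads[disk] = position
            served.append(disk + 1)
    return served


trace = [("write", 20), ("read", 80), ("read", 96),
         ("read", 81), ("read", 97), ("read", 73)]
print(replay(trace, 3))   # [1, 1, 1, 1, 1]
```

Under these assumptions disks 2 and 3 never move off position 20, so they
are never the closest and stay idle for the whole trace.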

Taking non-sequential read requests and handing them out round-robin
might look better for the example above.  IMHO the only thing saving the
current code is that it somewhat randomly round-robins the disks.  If you
take that out you will see that one drive takes the brunt of more
than 70% of the load.

I don't think the oldest-used disk is a good way to go; this only works if
we know that the read-ahead is full of unused data.  This would mean that
the drive is idle and it can be counted in the search for the closest
drive for a new read.  Keep in mind that any write erases all of our bookkeeping.


* raid1 performance
@ 2010-07-19 12:14 Marco
  0 siblings, 0 replies; 26+ messages in thread
From: Marco @ 2010-07-19 12:14 UTC (permalink / raw)
  To: linux-raid

Hi all,
doing some simple performance tests I obtained some very unexpected results: if I
issue hdparm -t /dev/md2 I obtain 61 - 65 MB/s, while issuing the same test
directly on the partitions which compose md2 (/dev/sda3 and /dev/sdb3) I obtain
84 - 87 MB/s. I didn't expect such a big difference between md2 and one of its
members. What can cause this difference?

I'm running the test on CentOS 5.4. /dev/md2 is a mounted and used block
device (LVM on top of md2 and the root partition on the LVM volume group), but the
machine was quite idle during the tests. The controller is an Intel ICH9 with
AHCI enabled.

Thank you in advance!

 Marco


* raid1 performance
@ 2010-07-25 14:58 Marco
  2010-07-25 15:19 ` Roman Mamedov
  0 siblings, 1 reply; 26+ messages in thread
From: Marco @ 2010-07-25 14:58 UTC (permalink / raw)
  To: linux-raid

Hi all,
I'm posting the same message again because I had some problems subscribing to
the list, so I'm not sure it was received:

doing some simple performance tests I obtained some very unexpected results: if I
issue hdparm -t /dev/md2 I obtain 61 - 65 MB/s, while issuing the same test
directly on the partitions which compose md2 (/dev/sda3 and /dev/sdb3) I obtain
84 - 87 MB/s. I didn't expect such a big difference between md2 and one of its
members. What can cause this difference?

I'm running the test on CentOS 5.4. /dev/md2 is a mounted and used block
device (LVM on top of md2 and the root partition on the LVM volume group), but the
machine was quite idle during the tests. The controller is an Intel ICH9 with
AHCI enabled.

Thank you in advance!

Marco


* Re: raid1 performance
  2010-07-25 14:58 raid1 performance Marco
@ 2010-07-25 15:19 ` Roman Mamedov
  2010-07-26  9:37   ` Marco
  0 siblings, 1 reply; 26+ messages in thread
From: Roman Mamedov @ 2010-07-25 15:19 UTC (permalink / raw)
  To: Marco; +Cc: linux-raid


On Sun, 25 Jul 2010 14:58:37 +0000 (GMT)
Marco <jjletho67-diar@yahoo.it> wrote:

> doing some simple performance tests I obtained some very unexpected results: if
> I issue hdparm -t /dev/md2 I obtain 61 - 65 MB/s, while issuing the same test
> directly on the partitions which compose md2 (/dev/sda3 and /dev/sdb3) I
> obtain 84 - 87 MB/s. I didn't expect such a big difference between md2 and one
> of its members. What can cause this difference?

Maybe their read-ahead settings are different?
Check out "blockdev --getra /dev/md2", and compare that with the same
setting of the member disks. You can experiment with changing it by using
"--setra" as well.

-- 
With respect,
Roman


* Re: raid1 performance
  2010-07-25 15:19 ` Roman Mamedov
@ 2010-07-26  9:37   ` Marco
  2010-07-26 10:24     ` Keld Simonsen
  2010-07-26 11:03     ` Neil Brown
  0 siblings, 2 replies; 26+ messages in thread
From: Marco @ 2010-07-26  9:37 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: linux-raid



>> doing some simple performance tests I obtained some very unexpected results: if
>> I issue hdparm -t /dev/md2 I obtain 61 - 65 MB/s, while issuing the same test
>> directly on the partitions which compose md2 (/dev/sda3 and /dev/sdb3) I
>> obtain 84 - 87 MB/s. I didn't expect such a big difference between md2 and one
>> of its members. What can cause this difference?
>
>Maybe their read-ahead settings are different?
>Check out "blockdev --getra /dev/md2", and compare that with the same
>setting of the member disks. You can experiment with changing it by using
>"--setra" as well.

Hi Roman,
thank you for your hint. I verified the read-ahead settings and they are the
same for all the block devices involved in the test: the value is 256 for all
/dev/sd?? and for all /dev/md?.
There must be something else which is influencing RAID 1 performance.
Has any of you ever had a similar issue?

thank you

Marco



      


* Re: raid1 performance
  2010-07-26  9:37   ` Marco
@ 2010-07-26 10:24     ` Keld Simonsen
  2010-07-26 10:53       ` John Robinson
  2010-07-27 16:10       ` Marco
  2010-07-26 11:03     ` Neil Brown
  1 sibling, 2 replies; 26+ messages in thread
From: Keld Simonsen @ 2010-07-26 10:24 UTC (permalink / raw)
  To: Marco; +Cc: Roman Mamedov, linux-raid

On Mon, Jul 26, 2010 at 09:37:20AM +0000, Marco wrote:
> 
> 
> >> doing some simple performance tests I obtained some very unexpected results: if
> >> I issue hdparm -t /dev/md2 I obtain 61 - 65 MB/s, while issuing the same test
> >> directly on the partitions which compose md2 (/dev/sda3 and /dev/sdb3) I
> >> obtain 84 - 87 MB/s. I didn't expect such a big difference between md2 and one
> >> of its members. What can cause this difference?
> >
> >Maybe their read-ahead settings are different?
> >Check out "blockdev --getra /dev/md2", and compare that with the same
> >setting of the member disks. You can experiment with changing it by using
> >"--setra" as well.
> 
> Hi Roman,
> thank you for your hint. I verified the read-ahead settings and they are the
> same for all the block devices involved in the test: the value is 256 for all
> /dev/sd?? and for all /dev/md?.
> There must be something else which is influencing RAID 1 performance.
> Has any of you ever had a similar issue?

Did you try:

# Set read-ahead.
echo "Setting read-ahead to 64 MiB for /dev/md3"
blockdev --setra 65536 /dev/md3

Best regards
keld


* Re: raid1 performance
  2010-07-26 10:24     ` Keld Simonsen
@ 2010-07-26 10:53       ` John Robinson
  2010-07-26 11:30         ` Keld Simonsen
  2010-07-27 16:10       ` Marco
  1 sibling, 1 reply; 26+ messages in thread
From: John Robinson @ 2010-07-26 10:53 UTC (permalink / raw)
  To: Keld Simonsen; +Cc: Marco, Roman Mamedov, linux-raid

On 26/07/2010 11:24, Keld Simonsen wrote:
[...]
> Did you try:
>
> # Set read-ahead.
> echo "Setting read-ahead to 64 MiB for /dev/md3"
> blockdev --setra 65536 /dev/md3

That'll set read-ahead to 32MiB, because blockdev works in sectors.
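
The arithmetic behind that correction, as a quick sketch: blockdev's
--setra and --getra values count 512-byte sectors. The helper names below
are made up for the illustration.

```python
SECTOR_BYTES = 512   # unit used by blockdev --setra / --getra


def setra_to_mib(sectors):
    """Read-ahead in MiB for a given --setra value."""
    return sectors * SECTOR_BYTES / (1024 * 1024)


def mib_to_setra(mib):
    """--setra value needed for a read-ahead of `mib` MiB."""
    return mib * 1024 * 1024 // SECTOR_BYTES


print(setra_to_mib(65536))   # 32.0   (the value in the script above)
print(mib_to_setra(64))      # 131072 (what 64 MiB would need)
print(setra_to_mib(256))     # 0.125  (the 256 reported earlier, i.e. 128 KiB)
```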

Cheers,

John.


* Re: raid1 performance
  2010-07-26  9:37   ` Marco
  2010-07-26 10:24     ` Keld Simonsen
@ 2010-07-26 11:03     ` Neil Brown
  2010-07-27  1:23       ` Leslie Rhorer
  2010-07-27 16:10       ` Marco
  1 sibling, 2 replies; 26+ messages in thread
From: Neil Brown @ 2010-07-26 11:03 UTC (permalink / raw)
  To: Marco; +Cc: Roman Mamedov, linux-raid

On Mon, 26 Jul 2010 09:37:20 +0000 (GMT)
Marco <jjletho67-diar@yahoo.it> wrote:

> 
> 
> >> doing some simple performance tests I obtained some very unexpected results: if
> >> I issue hdparm -t /dev/md2 I obtain 61 - 65 MB/s, while issuing the same test
> >> directly on the partitions which compose md2 (/dev/sda3 and /dev/sdb3) I
> >> obtain 84 - 87 MB/s. I didn't expect such a big difference between md2 and one
> >> of its members. What can cause this difference?
> >
> >Maybe their read-ahead settings are different?
> >Check out "blockdev --getra /dev/md2", and compare that with the same
> >setting of the member disks. You can experiment with changing it by using
> >"--setra" as well.
> 
> Hi Roman,
> thank you for your hint. I verified the read-ahead settings and they are the
> same for all the block devices involved in the test: the value is 256 for all
> /dev/sd?? and for all /dev/md?.
> There must be something else which is influencing RAID 1 performance.
> Has any of you ever had a similar issue?
> 

Very odd.
I just tested my test hardware and get exactly the same 56 MB/sec both for
the RAID1 and the individual devices.

There is only one way that I can think of that the accesses going via RAID1
would be different from those going direct, and that is that the starting
offset might be different if you are using 1.x metadata.
I guess if you had those new 4K-sector devices that might make a difference,
but I wouldn't really expect it to.

For a sequential read like that, md/raid1 doesn't even do read-balancing, all
the reads go to the same device.

If you look at /proc/diskstats and particularly the 4th and 6th fields for
the device that you are interested in, and then take the differences for each
field between 'before' and 'after' running a test you will get
  - the number of IO requests
  - the number of sectors

that were serviced during that time.  Taking a ratio will get you the number
of sectors per IO.  Normally more is better.
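
A minimal sketch of that calculation follows. It assumes the usual
/proc/diskstats line layout, where the fields are counted from the major
number, so field 4 is "reads completed" and field 6 is "sectors read"
(the kernel's iostats.txt describes the same two counters as Field 1 and
Field 3, because it starts counting after the device name). The function
names and the two sample snapshots are invented for the example.

```python
def read_counters(diskstats_text, device):
    """Return (reads completed, sectors read) for `device`.

    With major, minor and name as fields 1-3, these are fields 4 and 6.
    """
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[2] == device:
            return int(fields[3]), int(fields[5])
    raise KeyError(device)


def sectors_per_request(before, after, device):
    """Average sectors per read request between two snapshots."""
    reads0, sectors0 = read_counters(before, device)
    reads1, sectors1 = read_counters(after, device)
    reads = reads1 - reads0
    if reads == 0:
        return 0.0
    return (sectors1 - sectors0) / reads


# Invented snapshots, chosen only so the ratio is easy to check by hand.
BEFORE = "   8   64 sde 1000 50 8000 1200 0 0 0 0 0 1200 1200\n"
AFTER = "   8   64 sde 1400 50 11300 1700 0 0 0 0 0 1700 1700\n"

ratio = sectors_per_request(BEFORE, AFTER, "sde")
print(ratio)         # 8.25 sectors per request
print(ratio * 512)   # 4224.0 bytes per request, i.e. mostly 4K requests
```

On a live system the two snapshots would come from reading /proc/diskstats
before and after the test run.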

I just tested 'sde' which gave 8.25 sectors per request - so most requests
were 4K.
md2 on the other hand gave 31.02, so many requests were 16K.  That really
surprises me.
Looking at the 'queue' numbers in /sys/block/X/queue - some of which guide the
breaking up of pages into requests - all the md2 numbers are the same as sde
or smaller.  So I'm currently rather confused.

It might be interesting to find out what the data offset is for your RAID1
(mdadm --examine will tell you if there is one), and compare the
request/sector numbers and see if they show anything.

NeilBrown


* Re: raid1 performance
  2010-07-26 10:53       ` John Robinson
@ 2010-07-26 11:30         ` Keld Simonsen
  0 siblings, 0 replies; 26+ messages in thread
From: Keld Simonsen @ 2010-07-26 11:30 UTC (permalink / raw)
  To: John Robinson; +Cc: Marco, Roman Mamedov, linux-raid

On Mon, Jul 26, 2010 at 11:53:10AM +0100, John Robinson wrote:
> On 26/07/2010 11:24, Keld Simonsen wrote:
> [...]
> >Did you try:
> >
> ># Set read-ahead.
> >echo "Setting read-ahead to 64 MiB for /dev/md3"
> >blockdev --setra 65536 /dev/md3
> 
> That'll set read-ahead to 32MiB, because blockdev works in sectors.

Thanks for the correction, updated on wiki.

keld


* RE: raid1 performance
  2010-07-26 11:03     ` Neil Brown
@ 2010-07-27  1:23       ` Leslie Rhorer
  2010-07-27 16:10       ` Marco
  1 sibling, 0 replies; 26+ messages in thread
From: Leslie Rhorer @ 2010-07-27  1:23 UTC (permalink / raw)
  To: 'Neil Brown'; +Cc: linux-raid

> > Hi Roman,
> > thank you for your hint. I verified the read-ahead settings and they are the
> > same for all the block devices involved in the test: the value is 256 for all
> > /dev/sd?? and for all /dev/md?.
> > There must be something else which is influencing RAID 1 performance.
> > Has any of you ever had a similar issue?
> >
> 
> Very odd.
> I just tested my test hardware and get exactly the same 56 MB/sec both for
> the RAID1 and the individual devices.
> 
> There is only one way that I can think of that the accesses going via RAID1
> would be different from those going direct, and that is that the starting
> offset might be different if you are using 1.x metadata.
> I guess if you had those new 4K-sector devices that might make a difference,
> but I wouldn't really expect it to.
>
> For a sequential read like that, md/raid1 doesn't even do read-balancing, all
> the reads go to the same device.
>
> If you look at /proc/diskstats and particularly the 4th and 6th fields for
> the device that you are interested in, and then take the differences for each
> field between 'before' and 'after' running a test you will get
>   - the number of IO requests
>   - the number of sectors
>
> that were serviced during that time.  Taking a ratio will get you the number
> of sectors per IO.  Normally more is better.
>
> I just tested 'sde' which gave 8.25 sectors per request - so most requests
> were 4K.
> md2 on the other hand gave 31.02, so many requests were 16K.  That really
> surprises me.
> Looking at the 'queue' numbers in /sys/block/X/queue - some of which guide the
> breaking up of pages into requests - all the md2 numbers are the same as sde
> or smaller.  So I'm currently rather confused.
> 
> It might be interesting to find out what the data offset is for your RAID1
> (mdadm --examine will tell you if there is one), and compare the
> request/sector numbers and see if they show anything.

	Interesting.  My video server has been up for 11 days since my last
reboot, and the individual members are showing an average of 173
sectors/request, while the RAID6 array is showing 167.  Meanwhile, the
backup server (up 9 days) is showing an average of 309 sectors/request,
while the RAID6 array there is showing 210.  Are these high numbers compared
to yours because most of the accesses are sequential?



* Re: raid1 performance
  2010-07-26 11:03     ` Neil Brown
  2010-07-27  1:23       ` Leslie Rhorer
@ 2010-07-27 16:10       ` Marco
  2010-07-27 22:23         ` Neil Brown
  1 sibling, 1 reply; 26+ messages in thread
From: Marco @ 2010-07-27 16:10 UTC (permalink / raw)
  To: Neil Brown; +Cc: Roman Mamedov, linux-raid





>If you look at /proc/diskstats and particularly the 4th and 6th fields for
>the device that you are interested in, and then take the differences for each
>field between 'before' and 'after' running a test you will get
> - the number of IO requests
>  - the number of sectors
>
>that were serviced during that time.  Taking a ratio will get you the number
>of sectors per IO.  Normally more is better.

Hi Neil,
are you sure the right values are the 4th and the 6th fields? I see strange
values in them (the 4th field is bigger than the 6th, while I was expecting the
opposite).
Looking at the iostats.txt file (kernel documentation), I suspect the fields you
are interested in are the first and the third.
Under this hypothesis (1st and 3rd fields) the ratios I obtain are:

sda  84.02
sdb 113.69
md2   8.01

md2 and its members have very different ratios.

>It might be interesting to find out what the data offset is for your RAID1
>(mdadm --examine will tell you if there is one), and compare the
> request/sector numbers and see if they show anything.

this is the output of mdadm --examine /dev/sda3

/dev/sda3:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : d7ca6fdd:8cf9e3ed:bea46eb0:98c63a97
  Creation Time : Tue Sep 23 15:49:29 2008
     Raid Level : raid1
  Used Dev Size : 237826176 (226.81 GiB 243.53 GB)
     Array Size : 237826176 (226.81 GiB 243.53 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 2

    Update Time : Tue Jul 27 18:07:41 2010
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 8afc34a - correct
         Events : 18156


      Number   Major   Minor   RaidDevice State
this     0       8        3        0      active sync   /dev/sda3

   0     0       8        3        0      active sync   /dev/sda3
   1     1       8       19        1      active sync   /dev/sdb3

Do you have any hypothesis?

Thank you all,

Marco

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: raid1 performance
  2010-07-26 10:24     ` Keld Simonsen
  2010-07-26 10:53       ` John Robinson
@ 2010-07-27 16:10       ` Marco
  1 sibling, 0 replies; 26+ messages in thread
From: Marco @ 2010-07-27 16:10 UTC (permalink / raw)
  To: Keld Simonsen; +Cc: Roman Mamedov, linux-raid

>> 
>> 
>> >> Doing a simple performance test I obtained some very unexpected results:
>> >> if I issue hdparm -t /dev/md2 I obtain 61 - 65 MB/s, while issuing the same
>> >> test directly on the partitions which compose md2 (/dev/sda3 and /dev/sdb3)
>> >> I obtain 84 - 87 MB/s. I didn't expect such a big difference between md2
>> >> and one of its members. What can cause this difference?
>> >
>> >Maybe their read-ahead settings are different?
>> >Check out "blockdev --getra /dev/md2", and compare that with the same
>> >setting of the member disks. You can experiment with changing it by using
>> >"--setra" as well.
>> 
>> Hi Roman,
>> thank you for your hint. I verified the read-ahead settings and they are the
>> same for all the block devices involved in the test: the value is 256 for all
>> /dev/sd?? and for all /dev/md?.
>> There must be something else influencing raid 1 performance.
>> Has any of you ever had a similar issue?
>
>Did you try:
>
># Set read-ahead.
>echo "Setting read-ahead to 32 MiB for /dev/md3"
>blockdev --setra 65536 /dev/md3
>
Hi,
I did some tests modifying the read-ahead value, with these results:

1) Setting the ra value to 65536 only on /dev/md2 has no effect on
performance.

2) Setting the ra value to 65536 on both /dev/md2 and its members (/dev/sda3
and /dev/sdb3) REDUCED the performance of the single members (hdparm -t
/dev/sd?3 produces values between 65 and 75 MB/s) and has no effect on /dev/md2
(60 - 65 MB/s).

Honestly I'm confused... an increased read-ahead value should normally increase
performance on sequential reads, shouldn't it?
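
For reference, the values passed to blockdev --getra and --setra are counted in 512-byte sectors, so the two settings discussed in this thread can be converted to sizes as follows. This is only a unit-conversion sketch; the helper name readahead_kib is invented for illustration and nothing here touches a real device.

```python
# blockdev --getra / --setra express read-ahead in 512-byte sectors.
SECTOR_BYTES = 512


def readahead_kib(ra_sectors: int) -> int:
    """Convert a blockdev read-ahead value (sectors) to KiB."""
    return ra_sectors * SECTOR_BYTES // 1024


for ra in (256, 65536):
    kib = readahead_kib(ra)
    print(f"--setra {ra}: {kib} KiB ({kib / 1024:g} MiB)")
```

By this arithmetic the value 256 corresponds to 128 KiB of read-ahead, and 65536 corresponds to 32 MiB.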


Marco


      

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: raid1 performance
  2010-07-27 16:10       ` Marco
@ 2010-07-27 22:23         ` Neil Brown
  2010-07-28 12:10           ` Marco
  2010-07-31 15:21           ` Marco
  0 siblings, 2 replies; 26+ messages in thread
From: Neil Brown @ 2010-07-27 22:23 UTC (permalink / raw)
  To: Marco; +Cc: Roman Mamedov, linux-raid

On Tue, 27 Jul 2010 16:10:36 +0000 (GMT)
Marco <jjletho67-diar@yahoo.it> wrote:

> 
> 
> 
> 
> >If you look at /proc/diskstats and particularly the 4th and 6th fields for
> >the device that you are interested in, and then take the differences for each
> >field between 'before' and 'after' running a test you will get
> > - the number of IO requests
> >  - the number of sectors
> >
> >that were serviced during that time.  Taking a ratio will get you the number
> >of sectors per IO.  Normally more is better.
> 
> Hi Neil,
> are you sure the right values are the 4th and the 6th fields? I see strange
> values in them (the 4th field is bigger than the 6th, while I was expecting the
> opposite).
> Looking at the iostats.txt file (kernel documentation), I suspect the fields you
> are interested in are the first and the third.

No, the first field is the major device number, and the third is the device
name ....
so I guess we are taking the same ratio, but you start counting at a
different place to me.
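
To make the counting convention concrete, here is a rough sketch of the before/after calculation, counting fields from the start of each /proc/diskstats line (1 = major, 2 = minor, 3 = device name, 4 = reads completed, 6 = sectors read). The function names and the sample counter values are invented for illustration; only the field positions come from the discussion above.

```python
def parse_diskstats(text):
    """Map device name -> (reads completed, sectors read)."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or truncated lines
        # 1-based field 3 is the name, field 4 reads completed, field 6 sectors read.
        stats[fields[2]] = (int(fields[3]), int(fields[5]))
    return stats


def sectors_per_read(before, after, device):
    """Average sectors per read request between two snapshots, or None if idle."""
    reads_before, sectors_before = parse_diskstats(before)[device]
    reads_after, sectors_after = parse_diskstats(after)[device]
    reads = reads_after - reads_before
    if reads == 0:
        return None
    return (sectors_after - sectors_before) / reads


def snapshot(path="/proc/diskstats"):
    """Read the live counters (Linux only); not called in this example."""
    with open(path) as f:
        return f.read()


# Invented counters, chosen only to show the arithmetic.
BEFORE = "   8       0 sda 1000 50 80000 400 0 0 0 0 0 400 400\n"
AFTER = "   8       0 sda 1100 60 88402 450 0 0 0 0 0 450 450\n"

print(sectors_per_read(BEFORE, AFTER, "sda"))
```

On a real system one would call snapshot() before and after the hdparm run and pass the two strings to sectors_per_read().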

> Under this hypothesis (1st and 3rd fields) the ratios I obtain are:
> 
> sda  84.02
> sdb 113.69
> md2   8.01
> 
> md2 and its members have very different ratios.

Strange isn't it.  And the ratios are the other-way-around to what I get.
I don't currently understand why ... but it might not be at all relevant to
the speed difference.

> 
> >It might be interesting to find out what the data offset is for your RAID1
> >(mdadm --examine will tell you if there is one), and compare the
> > request/sector numbers and see if they show anything.
> 
> this is the output of mdadm --examine /dev/sda3
> 
> /dev/sda3:
>           Magic : a92b4efc
>         Version : 0.90.00

OK, so there is no data offset with 0.90, so that rules out differing offsets
being an issue.
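
As an illustration of that check, the sketch below parses "key : value" lines from mdadm --examine output and looks at the metadata version and whether a data offset is reported. The sample text is copied from the output quoted above; the function name is made up, and the "Data Offset" key is an assumption about how newer (1.x) metadata reports the offset, so treat it as a sketch rather than a definitive parser.

```python
def parse_examine(text):
    """Collect 'key : value' pairs from mdadm --examine style output."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            info[key.strip()] = value.strip()
    return info


# Excerpt of the output posted earlier in this thread.
SAMPLE = """\
/dev/sda3:
          Magic : a92b4efc
        Version : 0.90.00
     Raid Level : raid1
  Used Dev Size : 237826176 (226.81 GiB 243.53 GB)
   Raid Devices : 2
"""

info = parse_examine(SAMPLE)
print("metadata version:", info["Version"])
# Assumed key name; absent in the 0.90 output shown above.
print("data offset:", info.get("Data Offset", "none reported"))
```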

> 
> Do you have some hypothesis ?

No.  It might be worth exploring why the request sizes are different - I
don't know if it will lead anywhere useful though.

NeilBrown

> 
> thank you all
> 
> Marco
> 
> 
> 
>       


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: raid1 performance
  2010-07-27 22:23         ` Neil Brown
@ 2010-07-28 12:10           ` Marco
  2010-07-28 12:24             ` Neil Brown
  2010-07-31 15:21           ` Marco
  1 sibling, 1 reply; 26+ messages in thread
From: Marco @ 2010-07-28 12:10 UTC (permalink / raw)
  To: Neil Brown; +Cc: Roman Mamedov, linux-raid


> No, the first field is the major device number, and the third is the device
> name ....
> so I guess we are taking the same ratio, but you start counting at a
> different place to me.

Yes, you are right, I started counting after the device name :-) Sorry for the
confusion.

> No.  It might be worth exploring why the request sizes are different - I
> don't know if it will lead anywhere useful though.

Has anyone else ever had a similar issue?
In the meantime I will do some more tests, trying to play with block device
related parameters...
Do you think that having the root partition mounted over /dev/md2 can influence 
the test results ? 


Marco


      

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: raid1 performance
  2010-07-28 12:10           ` Marco
@ 2010-07-28 12:24             ` Neil Brown
  0 siblings, 0 replies; 26+ messages in thread
From: Neil Brown @ 2010-07-28 12:24 UTC (permalink / raw)
  To: Marco; +Cc: Roman Mamedov, linux-raid

On Wed, 28 Jul 2010 12:10:28 +0000 (GMT)
Marco <jjletho67-diar@yahoo.it> wrote:
> Do you think that having the root partition mounted over /dev/md2 can influence 
> the test results ? 

I wouldn't think so, no (assuming the system is largely idle of course).

NeilBrown

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: raid1 performance
  2010-07-27 22:23         ` Neil Brown
  2010-07-28 12:10           ` Marco
@ 2010-07-31 15:21           ` Marco
  2010-07-31 16:04             ` Keld Simonsen
  1 sibling, 1 reply; 26+ messages in thread
From: Marco @ 2010-07-31 15:21 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

>> sda  84.02
>> sdb 113.69
>> md2   8.01
>> 
>> md2 and its members have very different ratios.
>
>Strange isn't it.  And the ratios are the other-way-around to what I get.
>I don't currently understand why ... but it might not be at all relevant to
>the speed difference.

Hi,
I did some tests on a different machine with CentOS 5.5 and an ICH10 disk
controller. On this machine I have no performance issue: hdparm -t returns the
same value (about 107 MB/s) on both /dev/md0 and its members.

I measured the "ratio" and obtained 8.02 for md0 and 505 for the member disk. It
seems that the "ratio" is not so relevant for this issue. What do you think?
In this case the md device was not mounted.
I also tried several different I/O schedulers without noticing any effect on
sequential read performance.

Marco

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: raid1 performance
  2010-07-31 15:21           ` Marco
@ 2010-07-31 16:04             ` Keld Simonsen
  0 siblings, 0 replies; 26+ messages in thread
From: Keld Simonsen @ 2010-07-31 16:04 UTC (permalink / raw)
  To: Marco; +Cc: Neil Brown, linux-raid

On Sat, Jul 31, 2010 at 03:21:40PM +0000, Marco wrote:
> >> sda  84.02
> >> sdb 113.69
> >> md2   8.01
> >> 
> >> md2 and its members have very different ratios.
> >
> >Strange isn't it.  And the ratios are the other-way-around to what I get.
> >I don't currently understand why ... but it might not be at all relevant to
> >the speed difference.
> 
> Hi,
> I did some tests on a different machine with CentOS 5.5 and an ICH10 disk
> controller. On this machine I have no performance issue: hdparm -t returns the
> same value (about 107 MB/s) on both /dev/md0 and its members.
> 
> I measured the "ratio" and obtained 8.02 for md0 and 505 for the member disk. It
> seems that the "ratio" is not so relevant for this issue. What do you think?
> In this case the md device was not mounted.
> I also tried several different I/O schedulers without noticing any effect on
> sequential read performance.
> 
> Marco

Maybe try out raid10,f2 instead of raid1. You should then get about double
the sequential read performance out of your arrays, e.g. as tested
with hdparm.

best regards
keld

^ permalink raw reply	[flat|nested] 26+ messages in thread
