* raid1 performance
From: Jaime Medrano @ 2002-04-30 12:23 UTC
To: linux-kernel

I have several RAID arrays (level 0 and 1) in my machine and I have
noticed that raid1 is much slower than I expected.

The arrays are built from two identical hard disks (/dev/hde, /dev/hdg).
Some numbers for read performance:

/dev/hde: 29 MB/s
/dev/hdg: 29 MB/s
/dev/md0: 27 MB/s (raid1)
/dev/md1: 56 MB/s (raid0)
/dev/md2: 27 MB/s (raid1)

These numbers come from hdparm -tT. I have noticed very poor
performance when reading a large file sequentially from raid1 (I suppose
this is what hdparm does).

I have taken a look at the read balancing code in raid1.c and found
that when a sequential read happens no balancing is done, so all the
reading is done from only one of the mirrors while the others are idle.

I have tried to modify the balancing algorithm so that it also balances
sequential access, but I got almost the same numbers. I suspect the
reason may be that some layer below is issuing reads larger than the
chunks over which I balance, so the same work is being done twice; but I
don't know how to verify this.

Does anybody know how this works?

Regards,
Jaime Medrano

* Re: raid1 performance
From: Arjan van de Ven @ 2002-04-30 12:38 UTC
To: Jaime Medrano; +Cc: linux-kernel

Jaime Medrano wrote:
>
> I have several RAID arrays (level 0 and 1) in my machine and I have
> noticed that raid1 is much slower than I expected.
>
> The arrays are built from two identical hard disks (/dev/hde, /dev/hdg).
> Some numbers for read performance:
>
> /dev/hde: 29 MB/s
> /dev/hdg: 29 MB/s
> /dev/md0: 27 MB/s (raid1)
> /dev/md1: 56 MB/s (raid0)
> /dev/md2: 27 MB/s (raid1)
>
> These numbers come from hdparm -tT. I have noticed very poor
> performance when reading a large file sequentially from raid1 (I suppose
> this is what hdparm does).
>
> I have taken a look at the read balancing code in raid1.c and found
> that when a sequential read happens no balancing is done, so all the
> reading is done from only one of the mirrors while the others are idle.

Yes, this is expected. Sequential reads from RAID1 with the current
on-disk format are as fast as the fastest disk. The reason for this is
simple:

<ascii art of the on-disk layout, each letter is a "block">

Disk 1: ABCDEFGHIJK
Disk 2: ABCDEFGHIJK

If you read block A from disk 1, then to get more than the speed of just
one disk you would need to read block B from disk 2 *in parallel*; so
far so good. However, you then need to read block C, and to do that in
parallel you need to read it from disk 1. But disk 1's head was at
block A, so you get a head seek. Or, if the drive is trying to be
intelligent, it will read block B into its own cache anyway and then
block C after that (which is the more common case). And so on.

This latter case effectively means that disk 1 will still read ALL
blocks from the platter into the drive's cache, and of course disk 2
will do likewise. In just about all cases you care about, the platter
transfer rate is the limiting factor and not the "disk to host" rate.
So both disk 1 and disk 2 are reading ALL the data at platter speed,
which means the maximum speed at which you can get the data is platter
speed.

Now if the disk wasn't smart and was doing seeks, it would suck much
much more due to the high cost of seeks....

The only way to get the "1 thread sequential read" case faster is by
modifying the disk layout to be

Disk 1: ACEGIKBDFHJ
Disk 2: ACEGIKBDFHJ

where disk 1 again reads block A, and disk 2 reads block B. To read
block C, disk 1 doesn't have to move its head or read a dummy block
away; it can read block C sequentially, and disk 2 can read block D
that way.

That way the disks actually each only read the relevant blocks in a
sequential way and you get (in theory) 2x the performance of 1 disk.

Greetings,
   Arjan van de Ven
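
The layout argument above can be checked with a toy model. This is only a
sketch under simplifying assumptions, not kernel code: every block takes one
"block time" to pass under the head, a drive streams from the first block it
needs to the last without seeking, and the helper name platter_span is made
up for illustration. Disk 1 is asked for blocks A, C, E, ... and disk 2 for
B, D, F, ...

```python
def platter_span(layout, wanted):
    """Block positions that pass under the head while the drive streams
    from the first wanted block to the last one, without seeking."""
    positions = [layout.index(block) for block in wanted]
    return max(positions) - min(positions) + 1


blocks = "ABCDEFGHIJK"
disk1_share = blocks[0::2]   # "ACEGIK": the blocks requested from disk 1
disk2_share = blocks[1::2]   # "BDFHJ":  the blocks requested from disk 2

mirror = "ABCDEFGHIJK"       # current RAID1 on-disk order
interleaved = "ACEGIKBDFHJ"  # the alternative order proposed above

for name, layout in (("mirror", mirror), ("interleaved", interleaved)):
    spans = [platter_span(layout, share) for share in (disk1_share, disk2_share)]
    # Both disks work in parallel, so elapsed time is set by the slower one.
    print(name, spans, "elapsed ~", max(spans), "block times")
```

With the mirror layout the slower disk still sweeps 11 positions, the same as
one disk reading everything alone; with the interleaved layout it sweeps 6,
which is where the "in theory 2x" figure comes from. The model ignores the
one-time seek disk 2 needs to reach its half of the interleaved layout.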

* Re: raid1 performance
From: Kent Borg @ 2002-04-30 14:21 UTC
To: Arjan van de Ven; +Cc: Jaime Medrano, linux-kernel

On Tue, Apr 30, 2002 at 01:38:16PM +0100, Arjan van de Ven wrote, very
roughly:
[that RAID 1 is only as fast in reading as the fastest disk because of
seeking over alternate blocks, and ]

> The only way to get the "1 thread sequential read" case faster is by
> modifying the disk layout to be
>
> Disk 1: ACEGIKBDFHJ
> Disk 2: ACEGIKBDFHJ
>
> where disk 1 again reads block A, and disk 2 reads block B. To read
> block C, disk 1 doesn't have to move its head or read a dummy block
> away; it can read block C sequentially, and disk 2 can read block D
> that way.
>
> That way the disks actually each only read the relevant blocks in a
> sequential way and you get (in theory) 2x the performance of 1 disk.

I am confused.

Assuming a big enough read is requested to allow parallelizing across
two disks, why can't the second disk be told not to read alternate
blocks but to start reading sequential blocks starting halfway up the
request?

Also, why does hdparm give me significantly faster read numbers on
/dev/md<whatever> than it does on /dev/hd<whatever>? I had assumed
there was parallelizing going on. Does this mean I would get a speed
improvement if I ran my single-disk notebook as a single-disk RAID 1
because there is some bigger or better buffering going on in that code
even without parallelizing?

Thanks,

-kb

* Re: raid1 performance
From: Jakob Østergaard @ 2002-05-01 16:35 UTC
To: Kent Borg; +Cc: Arjan van de Ven, Jaime Medrano, linux-kernel

On Tue, Apr 30, 2002 at 10:21:48AM -0400, Kent Borg wrote:
> On Tue, Apr 30, 2002 at 01:38:16PM +0100, Arjan van de Ven wrote, very
> roughly:
> [that RAID 1 is only as fast in reading as the fastest disk because of
> seeking over alternate blocks, and ]
>
> > The only way to get the "1 thread sequential read" case faster is by
> > modifying the disk layout to be
> >
> > Disk 1: ACEGIKBDFHJ
> > Disk 2: ACEGIKBDFHJ
> >
> > where disk 1 again reads block A, and disk 2 reads block B. To read
> > block C, disk 1 doesn't have to move its head or read a dummy block
> > away; it can read block C sequentially, and disk 2 can read block D
> > that way.
> >
> > That way the disks actually each only read the relevant blocks in a
> > sequential way and you get (in theory) 2x the performance of 1 disk.
>
> I am confused.
>
> Assuming a big enough read is requested to allow parallelizing across
> two disks, why can't the second disk be told not to read alternate
> blocks but to start reading sequential blocks starting halfway up the
> request?

This is *not* as simple as it sounds. Believe me, I spent a week trying...

However, with ext2 (and other filesystems as well), a large sequential
file read is *not* sequential on the disk. You should actually see
better performance on RAID-1 than on a single disk for very large reads,
because some of the lookups needed (block indirection or whatever) will
be run by the "best" disk in the given situation.

> Also, why does hdparm give me significantly faster read numbers on
> /dev/md<whatever> than it does on /dev/hd<whatever>? I had assumed
> there was parallelizing going on. Does this mean I would get a speed
> improvement if I ran my single-disk notebook as a single-disk RAID 1
> because there is some bigger or better buffering going on in that code
> even without parallelizing?

hdparm is not a good benchmark for this. Use bonnie, bonnie++, tiotest,
or even 'dd' with *huge* files.

-- 
................................................................
:   jakob@unthought.net   : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:
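
For readers who want to reproduce a dd-style measurement, the idea of timing
one long sequential read can be sketched as follows. This is a rough
illustration, not a replacement for bonnie++ or tiotest: the function names
are made up here, and unless the file is much larger than RAM the result
mostly measures the page cache rather than the disks.

```python
import time


def sequential_read_rate(path, chunk_size=1024 * 1024):
    """Read `path` from start to end in large chunks.

    Returns (bytes_read, elapsed_seconds).
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 skips Python's own buffer; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total, elapsed


def megabytes_per_second(total_bytes, seconds):
    """Convert a byte count and a duration into MB/s (1 MB = 2**20 bytes)."""
    if seconds <= 0:
        return float("inf")
    return total_bytes / (1024 * 1024) / seconds
```

A run would call sequential_read_rate() on one large file stored on the
raid1 array and on a comparable file on a single disk, then compare the two
megabytes_per_second() figures. Which paths to use is left to the reader.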

* Re: raid1 performance
From: Kent Borg @ 2002-05-01 17:01 UTC
To: Jakob Østergaard, Arjan van de Ven, Jaime Medrano, linux-kernel

On Wed, May 01, 2002 at 06:35:53PM +0200, Jakob Østergaard wrote:
> This is *not* as simple as it sounds. Believe me, I spent a week trying...
>
> However, with ext2 (and other filesystems as well), a large sequential
> file read is *not* sequential on the disk. You should actually see
> better performance on RAID-1 than on a single disk for very large reads,
> because some of the lookups needed (block indirection or whatever) will
> be run by the "best" disk in the given situation.

Lemme see if I am getting closer.

When reading the disk there will be head seeks necessary. When there
are two disks, each with its own complete copy of all the data, there
is no reason to keep the two disks' heads in the same place. If their
heads are in different places, a read can be issued to the disk whose
heads are closer to the desired location.

This then brings up two more questions:

1. Does the OS even know where the heads are in a modern IDE disk?

2. Is "closer" any more finely grained than a binary
   positioned/not-positioned?

And I guess another question: How much does RAID 1 help and under what
kinds of usage?

Thanks,

-kb, the Kent who is getting smarter.

* Re: raid1 performance
From: Justin Cormack @ 2002-05-01 17:16 UTC
To: Kent Borg; +Cc: linux-kernel

> Lemme see if I am getting closer.
>
> When reading the disk there will be head seeks necessary. When there
> are two disks, each with its own complete copy of all the data, there
> is no reason to keep the two disks' heads in the same place. If their
> heads are in different places, a read can be issued to the disk whose
> heads are closer to the desired location.

Yes. Look at raid1.c: the code is quite clear. Older versions didn't
do this.

> This then brings up two more questions:
>
> 1. Does the OS even know where the heads are in a modern IDE disk?

Not really. But there is probably a vague correspondence, especially
if you haven't remapped any bad sectors.

> 2. Is "closer" any more finely grained than a binary
>    positioned/not-positioned?

I think so. You can see different performance regions on disks (i.e.
they are faster on the outside, for example). You could of course write
a program to test seek times from different areas and build up a real
locality map. It might not be worth it though.

> And I guess another question: How much does RAID 1 help and under what
> kinds of usage?

The latency is noticeably lower in some cases, as the seeks should be
smaller on average. I have found this useful sometimes.

Justin

* Re: raid1 performance
From: Bernd Eckenfels @ 2002-05-01 21:23 UTC
To: linux-kernel

In article <20020501130127.A10936@borg.org> you wrote:
> 1. Does the OS even know where the heads are in a modern IDE disk?

> 2. Is "closer" any more finely grained than a binary
>    positioned/not-positioned?

> And I guess another question: How much does RAID 1 help and under what
> kinds of usage?

No, you just distribute the reads round robin, which means each disk has
only half the seeks it had before. As long as you do not spread
contiguous blocks (readahead) across the disks, the stats are good and
you actually reduce overall seeks. This actually helps even if no seek
is involved, because you need to wait for the beginning of a track
before you can read it.

Greetings
Bernd

* Re: raid1 performance
From: Jakob Østergaard @ 2002-05-02 16:37 UTC
To: Bernd Eckenfels; +Cc: linux-kernel

On Wed, May 01, 2002 at 11:23:23PM +0200, Bernd Eckenfels wrote:
> In article <20020501130127.A10936@borg.org> you wrote:
> > 1. Does the OS even know where the heads are in a modern IDE disk?
>
> > 2. Is "closer" any more finely grained than a binary
> >    positioned/not-positioned?
>
> > And I guess another question: How much does RAID 1 help and under what
> > kinds of usage?
>
> No, you just distribute the reads round robin, which means each disk has
> only half the seeks it had before.

No, this is the way it was done a long time ago.

It turns out to be an incredibly bad idea. In fact, it is the most
CPU-efficient way of guaranteeing the largest average seek times on
your disks ;)

The RAID-1 code now looks at which disk last worked closest to the
wanted position, and picks that disk for the seek.

> As long as you do not spread contiguous blocks (readahead) across the
> disks, the stats are good and you actually reduce overall seeks. This
> actually helps even if no seek is involved, because you need to wait
> for the beginning of a track before you can read it.

The "new" code (which is not that new anymore) will allow one disk to
stay on a single sequential read for a long time (eventually it will
kick in the idle disk(s), though).

-- 
................................................................
:   jakob@unthought.net   : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:

* Re: raid1 performance
From: Bernd Eckenfels @ 2002-06-29 0:01 UTC
To: linux-kernel

In article <20020502183758.Q31556@unthought.net> you wrote:
>> No, you just distribute the reads round robin, which means each disk has
>> only half the seeks it had before.

> No, this is the way it was done a long time ago.

> It turns out to be an incredibly bad idea. In fact, it is the most
> CPU-efficient way of guaranteeing the largest average seek times on
> your disks ;)

> The RAID-1 code now looks at which disk last worked closest to the
> wanted position, and picks that disk for the seek.

That's right, it is based on the distance in sector numbers. That's a
simple compare; I am not sure one could do better. See

raid1.c:raid1_read_balance()

Greetings
Bernd
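
The difference between the two policies discussed above (round robin versus
"closest head by sector distance") can be illustrated with a small
simulation. This is a sketch and not the code in raid1.c: the head position
is approximated by the last sector each disk served, both heads start at
sector 0, and the request pattern (two readers, each issuing two adjacent
sectors at a time, 1000 sectors apart) is an assumption picked for
illustration.

```python
def total_seek_distance(requests, choose_disk, n_disks=2):
    """Sum of |head - sector| over all requests for a given balancing policy."""
    heads = [0] * n_disks          # assumed last-served sector per disk
    total = 0
    for i, sector in enumerate(requests):
        disk = choose_disk(i, sector, heads)
        total += abs(heads[disk] - sector)
        heads[disk] = sector       # the head now sits at the served sector
    return total


def round_robin(i, sector, heads):
    """Old policy: alternate disks regardless of where their heads are."""
    return i % len(heads)


def closest_head(i, sector, heads):
    """Pick the disk whose last position is nearest in sector numbers."""
    return min(range(len(heads)), key=lambda d: abs(heads[d] - sector))


# Two readers: one near sector 0, one near sector 1000, two sectors per turn.
requests = [0, 1, 1000, 1001, 2, 3, 1002, 1003]

print("round robin :", total_seek_distance(requests, round_robin))
print("closest head:", total_seek_distance(requests, closest_head))
```

On this pattern round robin drags both heads back and forth between the two
regions (total distance 5997), while the closest-head rule lets each disk
settle into one region (total distance 1006). Other request patterns will
give different ratios.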