linux-ide.vger.kernel.org archive mirror
* Readahead with softraid1
@ 2005-07-08 12:00 Erik Slagter
  2005-07-08 12:05 ` Jens Axboe
  2005-07-08 12:16 ` Danny Cox
  0 siblings, 2 replies; 7+ messages in thread
From: Erik Slagter @ 2005-07-08 12:00 UTC (permalink / raw)
  To: linux-ide


Hi,

I am using softraid 1 on two SATA disks and I'm trying to get the best
possible performance. IMHO read actions (if properly addressed) should
be split over the two drives and performed independently. However, I
don't notice anything to back this up: read performance (measured with
the dreaded hdparm) is exactly the same on sda, sdb and md0.
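
The same numbers can be reproduced without hdparm; here is a minimal
sketch of a timed sequential read (device path and sizes are only
examples, and unlike "hdparm -t" it does not flush the page cache, so
it needs a cold cache to be meaningful):

/* timed sequential read of a block device, a rough stand-in for
 * "hdparm -t"; run it on a cold cache since, unlike hdparm, it does
 * not flush the page cache first */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>

int main(int argc, char **argv)
{
	const char *dev = argc > 1 ? argv[1] : "/dev/md0";
	static char buf[1 << 20];		/* 1 MiB per read */
	long long total = 64LL << 20, done = 0;	/* 64 MiB in total */
	struct timeval t0, t1;
	int fd = open(dev, O_RDONLY);

	if (fd < 0) {
		perror(dev);
		return 1;
	}
	gettimeofday(&t0, NULL);
	while (done < total) {
		ssize_t n = read(fd, buf, sizeof(buf));
		if (n <= 0)
			break;
		done += n;
	}
	gettimeofday(&t1, NULL);
	printf("%s: %.1f MB/s\n", dev, done / 1e6 /
	       ((t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6));
	close(fd);
	return 0;
}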

I've been playing a bit with readahead and it does matter somewhat: if
I disable readahead for sda/sdb completely, the read rate for those
drops sharply (to be expected) whilst the read rate on md0 stays the
same (also somewhat to be expected). Other combinations do not show any
significant impact.

I also played with the i/o scheduler and nr_requests (as suggested in
previous messages here).

What am I doing wrong here???



* Re: Readahead with softraid1
  2005-07-08 12:00 Readahead with softraid1 Erik Slagter
@ 2005-07-08 12:05 ` Jens Axboe
  2005-07-08 12:16 ` Danny Cox
  1 sibling, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2005-07-08 12:05 UTC (permalink / raw)
  To: Erik Slagter; +Cc: linux-ide

On Fri, Jul 08 2005, Erik Slagter wrote:
> Hi,
> 
> I am using softraid 1 on two SATA disks and I'm trying to get the best
> possible performance. IMHO read actions (if properly addressed) should
> be split over the two drives and performed independently. However, I
> don't notice anything to back this up: read performance (measured with
> the dreaded hdparm) is exactly the same on sda, sdb and md0.
> 
> I've been playing a bit with readahead and it does matter somewhat: if
> I disable readahead for sda/sdb completely, the read rate for those
> drops sharply (to be expected) whilst the read rate on md0 stays the
> same (also somewhat to be expected). Other combinations do not show any
> significant impact.
> 
> I also played with the i/o scheduler and nr_requests (as suggested in
> previous messages here).
> 
> What am I doing wrong here???

raid1 doesn't split reads; currently you need more than one process
doing I/O to get a performance increase.
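
Something along these lines, for example (device path, offsets and
sizes purely illustrative), gives each mirror its own stream:

/* two concurrent sequential readers on the md device; with a single
 * stream raid1 keeps reusing one mirror, with two well-separated
 * streams both mirrors can be used.  Path/offsets are illustrative. */
#define _XOPEN_SOURCE 500
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

static void reader(const char *dev, off_t start, long long len)
{
	static char buf[1 << 20];
	long long done = 0;
	int fd = open(dev, O_RDONLY);

	if (fd < 0) {
		perror(dev);
		_exit(1);
	}
	while (done < len) {
		ssize_t n = pread(fd, buf, sizeof(buf), start + done);
		if (n <= 0)
			break;
		done += n;
	}
	_exit(0);
}

int main(void)
{
	if (fork() == 0)
		reader("/dev/md0", 0, 64LL << 20);	   /* stream 1 */
	if (fork() == 0)
		reader("/dev/md0", 8LL << 30, 64LL << 20); /* stream 2, 8 GiB in */
	wait(NULL);
	wait(NULL);
	return 0;
}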

-- 
Jens Axboe



* Re: Readahead with softraid1
  2005-07-08 12:00 Readahead with softraid1 Erik Slagter
  2005-07-08 12:05 ` Jens Axboe
@ 2005-07-08 12:16 ` Danny Cox
  2005-07-08 13:16   ` Erik Slagter
  2005-07-08 15:28   ` Greg Freemyer
  1 sibling, 2 replies; 7+ messages in thread
From: Danny Cox @ 2005-07-08 12:16 UTC (permalink / raw)
  To: Erik Slagter; +Cc: Linux IDE List

Erik,

On Fri, 2005-07-08 at 14:00 +0200, Erik Slagter wrote:
> I am using softraid 1 on two SATA disks and I'm trying to get the best
> possible performance. IMHO read actions (if properly addressed) should
> be split over the two drives and performed independently. However, I
> don't notice anything to back this up: read performance (measured with
> the dreaded hdparm) is exactly the same on sda, sdb and md0.
...
> What am I doing wrong here???

	Nothing.  I'll take a shot at answering this one instead of lurking
this time.  Then, I'll crawl back under my rock.

	The raid1 driver keeps a "last visited block" for each drive.  This is
the block number that was most recently read or written by that drive.
When a read request arrives, the driver checks which drive's last
visited block is nearest to the one requested.  Guess what?  If a
sequential read starts on drive sda, then sda will *always* be chosen
to service the following reads, because its last visited block is only
one off.  This would only change if there were multiple processes
performing I/O on the md device; then it may switch to another drive.
In any case, it will *tend* to stick with the same drive.
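
	A toy userspace model of that selection, just the heuristic as
described and not the kernel code, shows the stickiness directly:

/* toy model of the raid1 read balancing described above: pick the
 * mirror whose last visited sector is nearest the request; a single
 * sequential reader therefore keeps winning on the same disk */
#include <stdio.h>
#include <stdlib.h>

#define NDISKS 2

static long long head[NDISKS];	/* last visited sector per mirror */

static int pick_mirror(long long sector)
{
	int i, best = 0;

	for (i = 1; i < NDISKS; i++)
		if (llabs(sector - head[i]) < llabs(sector - head[best]))
			best = i;
	head[best] = sector;	/* that mirror's head is now here */
	return best;
}

int main(void)
{
	long long s;

	/* sequential reads: each request is one past the previous, so
	 * disk 0 is always nearest and is chosen every single time */
	for (s = 0; s < 8; s++)
		printf("sector %lld -> disk %d\n", s, pick_mirror(s));
	return 0;
}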

	Did I explain that well, or only muddy the waters?

-- 
Daniel S. Cox
Internet Commerce Corporation



* Re: Readahead with softraid1
  2005-07-08 12:16 ` Danny Cox
@ 2005-07-08 13:16   ` Erik Slagter
  2005-07-08 13:30     ` Jens Axboe
  2005-07-08 13:42     ` Danny Cox
  2005-07-08 15:28   ` Greg Freemyer
  1 sibling, 2 replies; 7+ messages in thread
From: Erik Slagter @ 2005-07-08 13:16 UTC (permalink / raw)
  To: DCox; +Cc: Linux IDE List


On Fri, 2005-07-08 at 08:16 -0400, Danny Cox wrote:

> > What am I doing wrong here???
> 
> 	Nothing.  I'll take a shot at answering this one instead of lurking
> this time.  Then, I'll crawl back under my rock.
> 
> 	The raid1 driver keeps a "last visited block" for each drive.  This is
> the block number that was most recently read or written by that drive.
> When a read request arrives, the driver checks which drive's last
> visited block is nearest to the one requested.  Guess what?  If a
> sequential read starts on drive sda, then sda will *always* be chosen
> to service the following reads, because its last visited block is only
> one off.  This would only change if there were multiple processes
> performing I/O on the md device; then it may switch to another drive.
> In any case, it will *tend* to stick with the same drive.
> 
> 	Did I explain that well, or only muddy the waters?

perfect explanation, thanks (and also Jens!).

Is this a design decision or is it fundamentally impossible to split the
work amongst several drives?  I guess the md driver could at least do a
prefetch of the next block/chunk on the "other" drive(s)?

Now I am still wondering about the readahead issue. What would be a sane
setting? I guess having both the drives and md doing readahead is not
optimal?

Also I noticed that the default readahead value changed significantly
from 2.4 to 2.6, is there a particular reason for that?
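
For reference, the per-device values I've been tweaking are the ones
blockdev --getra/--setra manipulate; a minimal sketch of reading and
setting them through the BLKRAGET/BLKRASET ioctls (device path as an
example):

/* get/set a block device's readahead in 512-byte sectors, via the
 * same ioctls blockdev --getra/--setra uses; e.g. "./ra /dev/md0" or
 * "./ra /dev/sda 256" */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

int main(int argc, char **argv)
{
	unsigned long ra;
	int fd;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <device> [sectors]\n", argv[0]);
		return 1;
	}
	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror(argv[1]);
		return 1;
	}
	if (argc > 2 && ioctl(fd, BLKRASET, strtoul(argv[2], NULL, 0)) < 0)
		perror("BLKRASET");
	if (ioctl(fd, BLKRAGET, &ra) == 0)
		printf("%s: readahead = %lu sectors\n", argv[1], ra);
	close(fd);
	return 0;
}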



* Re: Readahead with softraid1
  2005-07-08 13:16   ` Erik Slagter
@ 2005-07-08 13:30     ` Jens Axboe
  2005-07-08 13:42     ` Danny Cox
  1 sibling, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2005-07-08 13:30 UTC (permalink / raw)
  To: Erik Slagter; +Cc: DCox, Linux IDE List

On Fri, Jul 08 2005, Erik Slagter wrote:
> On Fri, 2005-07-08 at 08:16 -0400, Danny Cox wrote:
> 
> > > What am I doing wrong here???
> > 
> > 	Nothing.  I'll take a shot at answering this one instead of lurking
> > this time.  Then, I'll crawl back under my rock.
> > 
> > 	The raid1 driver keeps a "last visited block" for each drive.  This is
> > the block number that was most recently read or written by that drive.
> > When a read request arrives, the driver checks which drive's last
> > visited block is nearest to the one requested.  Guess what?  If a
> > sequential read starts on drive sda, then sda will *always* be chosen
> > to service the following reads, because its last visited block is only
> > one off.  This would only change if there were multiple processes
> > performing I/O on the md device; then it may switch to another drive.
> > In any case, it will *tend* to stick with the same drive.
> > 
> > 	Did I explain that well, or only muddy the waters?
> 
> perfect explanation, thanks (and also Jens!).
> 
> Is this a design decision or is it fundamentally impossible to split the
> work amongst several drives?  I guess the md driver could at least do a
> prefetch of the next block/chunk on the "other" drive(s)?

Not sure if it's a design decision or just "this works ok, I'll fix it
later". Clearly there is a lot of room for improvement in the balancing
logic, to get more cases correct/faster. It's quite doable to split a
bio and send bits of it to various drives.
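
As a userspace illustration of the idea, not the md code, splitting
amounts to striping the chunks of one large read across the mirrors
(reading the component devices directly is assumed safe here since a
raid1 with the usual 0.90 superblock keeps identical data at identical
offsets):

/* userspace illustration only: split one large read into chunks and
 * fetch alternate chunks from each mirror (each stream in its own
 * process, so the two disks really work in parallel).  In the kernel
 * this would be done by splitting the bio instead. */
#define _XOPEN_SOURCE 500
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

#define CHUNK	(256 * 1024)		/* 256 KiB chunks, arbitrary */
#define TOTAL	(64LL << 20)		/* read 64 MiB in total */

static void read_half(const char *dev, int parity)
{
	static char buf[CHUNK];
	off_t off;
	int fd = open(dev, O_RDONLY);

	if (fd < 0) {
		perror(dev);
		_exit(1);
	}
	/* this mirror services only every other chunk */
	for (off = parity * CHUNK; off < TOTAL; off += 2 * CHUNK)
		if (pread(fd, buf, CHUNK, off) < 0)
			perror("pread");
	_exit(0);
}

int main(void)
{
	/* component devices of the raid1, illustrative paths */
	if (fork() == 0)
		read_half("/dev/sda", 0);	/* even chunks */
	if (fork() == 0)
		read_half("/dev/sdb", 1);	/* odd chunks */
	wait(NULL);
	wait(NULL);
	return 0;
}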

-- 
Jens Axboe



* Re: Readahead with softraid1
  2005-07-08 13:16   ` Erik Slagter
  2005-07-08 13:30     ` Jens Axboe
@ 2005-07-08 13:42     ` Danny Cox
  1 sibling, 0 replies; 7+ messages in thread
From: Danny Cox @ 2005-07-08 13:42 UTC (permalink / raw)
  To: Erik Slagter; +Cc: Linux IDE List

Erik,

On Fri, 2005-07-08 at 15:16 +0200, Erik Slagter wrote:
> > 	Did I explain that well, or only muddy the waters?
> 
> perfect explanation, thanks (and also Jens!).

	Now, I don't know about perfect.... ;-)

> Is this a design decision or is it fundamentally impossible to split the
> work amongst several drives?  I guess the md driver could at least do a
> prefetch of the next block/chunk on the "other" drive(s)?

	If you're doing random reads on the drive, the raid1 logic can really
help.  I don't know of a benchmark off the top of my head that may
demonstrate that, although something from the Samba suite may.  I seem
to recall a program named something like 'tbench', but I don't remember
if that's the random reader or not.

> Now I am still wondering about the readahead issue. What would be a sane
> setting? I guess having both the drives and md doing readahead is not
> optimal?

	The readahead is implemented at the block layer (I think), and I can't
help you with that.  That's two levels (at least) up from the driver.
As I recall, the chain is: driver->md->block->filesystem->vfs, with
possibly the logical disk manager inserted also.  The only detail I seem
to recall is that the md driver changes a READA to a READ (read ahead to
a regular read), or at least it used to.

	Hmm.  I just took a look at 2.6.9 (yes, I'm out of date ;-), and the
function 'read_balance' actually special-cases sequential reads and
reuses the same drive.  I didn't find the READA to READ change that I
remembered, so scratch that.  Looks like Neil has been busy, and I've
been out of touch for a while.
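
	Paraphrased on top of the toy model from earlier in the thread
(again a sketch of the heuristic, not the kernel source), the special
case is:

/* rough paraphrase of that read_balance special case: a request that
 * continues exactly where some mirror left off sticks to that mirror,
 * and only otherwise does the nearest-head comparison run */
static int read_balance_model(long long sector)
{
	int i;

	for (i = 0; i < NDISKS; i++)
		if (head[i] == sector)		/* strictly sequential */
			return i;
	return pick_mirror(sector);		/* fall back to distance */
}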

-- 
Daniel S. Cox
Internet Commerce Corporation



* Re: Readahead with softraid1
  2005-07-08 12:16 ` Danny Cox
  2005-07-08 13:16   ` Erik Slagter
@ 2005-07-08 15:28   ` Greg Freemyer
  1 sibling, 0 replies; 7+ messages in thread
From: Greg Freemyer @ 2005-07-08 15:28 UTC (permalink / raw)
  To: DCox; +Cc: Erik Slagter, Linux IDE List

On 7/8/05, Danny Cox wrote:
> Erik,
> 
> On Fri, 2005-07-08 at 14:00 +0200, Erik Slagter wrote:
> > I am using softraid 1 on two SATA disks and I'm trying to get the best
> > possible performance. IMHO read actions (if properly addressed) should
> > be split over the two drives and performed independently. However, I
> > don't notice anything to back this up: read performance (measured with
> > the dreaded hdparm) is exactly the same on sda, sdb and md0.
> ...
> > What am I doing wrong here???
> 
>         Nothing.  I'll take a shot at answering this one instead of lurking
> this time.  Then, I'll crawl back under my rock.
> 
>         The raid1 driver keeps a "last visited block" for each drive.  This is
> the block number that was most recently read or written by that drive.
> When a read request arrives, the driver checks which drive's last
> visited block is nearest to the one requested.  Guess what?  If a
> sequential read starts on drive sda, then sda will *always* be chosen
> to service the following reads, because its last visited block is only
> one off.  This would only change if there were multiple processes
> performing I/O on the md device; then it may switch to another drive.
> In any case, it will *tend* to stick with the same drive.
> 
>         Did I explain that well, or only muddy the waters?
> 
> --
> Daniel S. Cox
> Internet Commerce Corporation
> 

Interesting.  Unfortunately, I do a lot of sequential reading with
little or no other computer activity and had wondered about the "slow"
speed of RAID 1 on read.

Does anyone know if that is a common implementation on hardware RAID
controllers too?

I have actually been working mostly with 3ware 7000 series cards, so
the md implementation does not affect me, but if that is a common
design then the 3ware card may have a similar algorithm.

Greg
-- 
Greg Freemyer
The Norcross Group
Forensics for the 21st Century

