linux-raid.vger.kernel.org archive mirror
* Re: Benefits from computing physical IDE disk geometry?
@ 2003-04-15 18:33 Chuck Ebbert
  2003-04-16  1:16 ` Nick Piggin
  0 siblings, 1 reply; 5+ messages in thread
From: Chuck Ebbert @ 2003-04-15 18:33 UTC (permalink / raw)
  To: Nick Piggin; +Cc: linux-kernel, linux-raid

Nick Piggin wrote:


> OK right. As far as I can see, the algorithm in the RAID1 code
> is used to select the best drive to read from? If that is the
> case then I don't think it could make better decisions given
> more knowledge.


  How about if it just asks the elevator whether or not a given read
is a good fit with its current workload?  I saw in 2.5 that the balance
code looks at the number of pending requests, and if it's zero it sends
the read to that device.  Surely something better than that could be
done, anyway.
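(For illustration only -- a rough Python model of the 2.5 heuristic as
described above, not the actual kernel code; the device names are made
up.)

```python
# Toy model of the 2.5-style read balance heuristic: if a mirror has no
# pending requests, send the read straight to it; otherwise fall back to
# the least-loaded mirror.
def pick_mirror(mirrors):
    """mirrors: list of (name, pending_request_count) pairs."""
    for name, pending in mirrors:
        if pending == 0:
            return name  # an idle disk wins immediately
    return min(mirrors, key=lambda m: m[1])[0]
```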


> It seems to me that a better way to layer it would be to have
> the complex (ie deadline/AS/CFQ/etc) scheduler handling all
> requests into the raid block device, then having a raid
> scheduler distributing to the disks, and having the disks
> run no scheduler (fifo).


 That only works if RAID1 is working at the physical disk level (which
it should be, AFAIC, but people want the flexibility to mirror partitions).


> In practice the current scheme probably works OK, though I
> wouldn't know due to lack of resources here :P


 I've been playing with the 2.4 read balance code and have some
improvements, but real gains need a new approach.

(cc'd to linux-raid)

--
 Chuck

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Benefits from computing physical IDE disk geometry?
  2003-04-15 18:33 Benefits from computing physical IDE disk geometry? Chuck Ebbert
@ 2003-04-16  1:16 ` Nick Piggin
  2003-04-16  1:59   ` Nick Piggin
  0 siblings, 1 reply; 5+ messages in thread
From: Nick Piggin @ 2003-04-16  1:16 UTC (permalink / raw)
  To: Chuck Ebbert; +Cc: linux-kernel, linux-raid

Chuck Ebbert wrote:

>Nick Piggin wrote:
>
>
>
>>OK right. As far as I can see, the algorithm in the RAID1 code
>>is used to select the best drive to read from? If that is the
>>case then I don't think it could make better decisions given
>>more knowledge.
>>
>
>
>  How about if it just asks the elevator whether or not a given read
>is a good fit with its current workload?  I saw in 2.5 where the balance
>code is looking at the number of pending requests and if it's zero then
>it sends it to that device.  Somehow I think something better than
>that could be done, anyway.
>
That balance code is probably the IDE or SCSI channel balancing?
In that case, the driver simply wants to know which device it
should service next, which is an appropriate fit (is that what
you were talking about? I don't have the source here, sorry).


We could ask the elevator if a given read is a good fit. It
would probably help.

>
>
>
>>It seems to me that a better way to layer it would be to have
>>the complex (ie deadline/AS/CFQ/etc) scheduler handling all
>>requests into the raid block device, then having a raid
>>scheduler distributing to the disks, and having the disks
>>run no scheduler (fifo).
>>
>
>
> That only works if RAID1 is working at the physical disk level (which
>it should be AFAIC but people want flexibility to mirror partitions.)
>
How so? Basically you want your high level scheduler to run first.
You want it to act on the stream of requests from the system, not
on the stream of requests to the device. If you know what I mean.

I might be wrong here. I haven't done any testing, and only a
little bit of thinking.

>
>
>
>>In practice the current scheme probably works OK, though I
>>wouldn't know due to lack of resources here :P
>>
>
>
> I've been playing with the 2.4 read balance code and have some
>improvements, but real gains need a new approach.
>
The problem I see is that the higher level schedulers (deadline,
for example, as opposed to the RAID scheduler) will find it
difficult to tell whether a request will be "good" for them or
not. For example, say we have 2 devices, with 100 requests in
each scheduler queue.

Device A's head is at sector x and next request is at x+100,
Device B's head is at sector x+10 and next request is at x+200.

RAID wants to know which queue should take a request at sector
x+1000. What do you do?

The way you would do a good "goodness" function, I guess,
would be to search through all requests on the device, and return
the minimum distance from the request you are running the query
on. Do this for both queues, and insert the request into the
queue with the smallest delta. I don't see much else doing any
good.
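To make that concrete, here is a toy Python sketch of such a goodness
function (a model only -- real queues hold struct request, not bare
sector numbers):

```python
def goodness(queue, sector):
    """Minimum seek distance from `sector` to any pending request in
    `queue` (a list of request sector numbers); 0 for an empty queue."""
    return min(abs(r - sector) for r in queue) if queue else 0

def pick_queue(queues, sector):
    # Insert the request into the queue with the smallest delta.
    return min(range(len(queues)), key=lambda i: goodness(queues[i], sector))
```

With the example above (taking x = 0): device A's queue holds a request
at sector 100, device B's at sector 200, and a new request at sector
1000 goes to B (delta 800 versus 900).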

On the other hand, if you simply have a fifo after the RAID
scheduler, the RAID scheduler itself knows where each disk's
head will end up simply by tracking the value of the last
sector it has submitted to the device. It also has the advantage
that it doesn't have "high level" scheduling stuff below it
ie. request deadline handling, elevator scheme, etc.

This gives the RAID scheduler more information, without
taking any away from the high level scheduler AFAIKS.
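That alternative could be modelled roughly like this (again just a toy
Python sketch, not kernel code):

```python
class RaidReadBalancer:
    """Toy model: with plain FIFOs below it, the RAID scheduler can
    predict each disk's head position from the last sector it
    submitted to that disk."""
    def __init__(self, n_disks):
        self.last_sector = [0] * n_disks  # predicted head position per disk

    def submit(self, sector):
        # Send the read to the disk whose head is predicted closest.
        disk = min(range(len(self.last_sector)),
                   key=lambda i: abs(self.last_sector[i] - sector))
        self.last_sector[disk] = sector
        return disk
```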


* Re: Benefits from computing physical IDE disk geometry?
  2003-04-16  1:16 ` Nick Piggin
@ 2003-04-16  1:59   ` Nick Piggin
  0 siblings, 0 replies; 5+ messages in thread
From: Nick Piggin @ 2003-04-16  1:59 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Chuck Ebbert, linux-kernel, linux-raid

Nick Piggin wrote:

> Chuck Ebbert wrote:
>
>> Nick Piggin wrote:
>>
>>
>>
>>> OK right. As far as I can see, the algorithm in the RAID1 code
>>> is used to select the best drive to read from? If that is the
>>> case then I don't think it could make better decisions given
>>> more knowledge.
>>>
>>
>>
>>  How about if it just asks the elevator whether or not a given read
>> is a good fit with its current workload?  I saw in 2.5 where the balance
>> code is looking at the number of pending requests and if it's zero then
>> it sends it to that device.  Somehow I think something better than
>> that could be done, anyway.
>>
> That balance code is probably the IDE or SCSI channel balancing?
> In that case, the driver simply wants to know which device it
> should service next, which is an appropriate fit (is that what
> you were talking about? I don't have source here sorry)
>
>
> We could ask the elevator if a given read is a good fit. It
> would probably help.
>
>>
>>
>>
>>> It seems to me that a better way to layer it would be to have
>>> the complex (ie deadline/AS/CFQ/etc) scheduler handling all
>>> requests into the raid block device, then having a raid
>>> scheduler distributing to the disks, and having the disks
>>> run no scheduler (fifo).
>>>
>>
>>
>> That only works if RAID1 is working at the physical disk level (which
>> it should be AFAIC but people want flexibility to mirror partitions.)
>>
> How so? Basically you want your high level scheduler to run first.
> You want it to act on the stream of requests from the system, not
> on the stream of requests to the device. If you know what I mean.
>
> I might be wrong here. I haven't done any testing, and only a
> little bit of thinking.
>
>>
>>
>>
>>> In practice the current scheme probably works OK, though I
>>> wouldn't know due to lack of resources here :P
>>>
>>
>>
>> I've been playing with the 2.4 read balance code and have some
>> improvements, but real gains need a new approach.
>>
> The problem I see, is the higher level schedulers (deadline for
> example, as opposed to the RAID scheduler) will find it difficult
> to tell if a request will be "good" for them or not. For example
> we have 2 devices, 100 requests in each scheduler queue.
>
> Device A's head is at sector x and next request is at x+100,
> Device B's head is at sector x+10 and next request is at x+200.
>
> RAID wants to know which queue should take a request at sector
> x+1000. What do you do?
>
> The way you would do a good "goodness" function, I guess,
> would be to search through all requests on the device, and return
> the minimum distance from the request you are running the query
> on. Do this for both queues, and insert the request into the
> queue with the smallest delta. I don't see much else doing any
> good. 

Well no, I'm an idiot. You obviously don't have to "search
through all requests", as they are (for AS, DL, CFQ) in an
rbtree. So that might not be too bad an idea to investigate.
But...
It still means you get the high level scheduling below where
you want it. This means the read/write batches for each queue
will not stay in sync (not sure if this is a bad thing), request
deadlines will mean even a good "goodness" calculation will not
always hold, process fairness could be badly impacted for some
loads, and AS has other problems (hopefully not too bad).
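(For the record, the rbtree lookup amounts to a nearest-neighbour
search, which a sorted structure gives you in O(log n). A rough Python
equivalent, using a sorted list and bisect in place of an rbtree:)

```python
import bisect

def min_distance(sorted_sectors, sector):
    """Nearest-neighbour seek distance via binary search, standing in
    for an rbtree lookup; sorted_sectors must be sorted ascending."""
    if not sorted_sectors:
        return 0
    i = bisect.bisect_left(sorted_sectors, sector)
    # Only the neighbours on either side of the insertion point matter.
    candidates = sorted_sectors[max(0, i - 1):i + 1]
    return min(abs(s - sector) for s in candidates)
```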




* Re: Benefits from computing physical IDE disk geometry?
@ 2003-04-16 13:28 Chuck Ebbert
  2003-04-16 23:06 ` Nick Piggin
  0 siblings, 1 reply; 5+ messages in thread
From: Chuck Ebbert @ 2003-04-16 13:28 UTC (permalink / raw)
  To: Nick Piggin; +Cc: linux-kernel, linux-raid


> The way you would do a good "goodness" function, I guess,
> would be to search through all requests on the device, and return
> the minimum distance from the request you are running the query
> on. Do this for both queues, and insert the request into the
> queue with the smallest delta. I don't see much else doing any
> good.


  That would be perfect.  And like you say in a later message, they're
in a tree so it might actually work.  Then the read balance code
wouldn't need to do that calculation at all.

  How hard would this be to add?



> On the other hand, if you simply have a fifo after the RAID
> scheduler, the RAID scheduler itself knows where each disk's
> head will end up simply by tracking the value of the last
> sector it has submitted to the device. It also has the advantage
> that it doesn't have "high level" scheduling stuff below it
> ie. request deadline handling, elevator scheme, etc.
> 
> This gives the RAID scheduler more information, without
> taking any away from the high level scheduler AFAIKS.


 But then wouldn't you have to put all that into the RAID
scheduler?

--
 Chuck


* Re: Benefits from computing physical IDE disk geometry?
  2003-04-16 13:28 Chuck Ebbert
@ 2003-04-16 23:06 ` Nick Piggin
  0 siblings, 0 replies; 5+ messages in thread
From: Nick Piggin @ 2003-04-16 23:06 UTC (permalink / raw)
  To: Chuck Ebbert; +Cc: linux-kernel, linux-raid



Chuck Ebbert wrote:

>>The way you would do a good "goodness" function, I guess,
>>would be to search through all requests on the device, and return
>>the minimum distance from the request you are running the query
>>on. Do this for both queues, and insert the request into the
>>queue with the smallest delta. I don't see much else doing any
>>good.
>>
>
>
>  That would be perfect.  And like you say in a later message, they're
>in a tree so it might actually work.  Then the read balance code
>wouldn't need to do that calculation at all.
>
>  How hard would this be to add?
>
It would be easy to add. Though of course it would have to
be shown to give an improvement somewhere to be included.

>
>
>
>
>>On the other hand, if you simply have a fifo after the RAID
>>scheduler, the RAID scheduler itself knows where each disk's
>>head will end up simply by tracking the value of the last
>>sector it has submitted to the device. It also has the advantage
>>that it doesn't have "high level" scheduling stuff below it
>>ie. request deadline handling, elevator scheme, etc.
>>
>>This gives the RAID scheduler more information, without
>>taking any away from the high level scheduler AFAIKS.
>>
>
>
> But then wouldn't you have to put all that into the RAID
>scheduler?
>
No - as far as I can see, the RAID scheduler already does
this. Having FIFOs between it and the disks would simply
make its assumptions valid.



end of thread, other threads:[~2003-04-16 23:06 UTC | newest]

Thread overview: 5+ messages
2003-04-15 18:33 Benefits from computing physical IDE disk geometry? Chuck Ebbert
2003-04-16  1:16 ` Nick Piggin
2003-04-16  1:59   ` Nick Piggin
2003-04-16 13:28 Chuck Ebbert
2003-04-16 23:06 ` Nick Piggin
