* queue_depth tracking from LLD
@ 2009-04-16 9:36 Christof Schmitt
2009-04-16 14:13 ` James Smart
0 siblings, 1 reply; 8+ messages in thread
From: Christof Schmitt @ 2009-04-16 9:36 UTC (permalink / raw)
To: linux-scsi
I just came across the SCSI midlayer function scsi_track_queue_full.
If a SCSI command is returned with a status of QUEUE_FULL, this is
mapped to ADD_TO_MLQUEUE and "device blocked". So there is already a
mechanism in place. Is an LLD expected to additionally call something
like this to decrease the queue depth?

if (status_byte(scmd->result) == QUEUE_FULL)
        scsi_track_queue_full(sdev, sdev->queue_depth - 1);

If an LLD does this, should it also increase the queue depth again when
no more QUEUE_FULL statuses are seen? To me this looks like a
duplication of the midlayer device blocking, but I assume there is a
reason for having both scsi_track_queue_full and the device blocking.
--
Christof Schmitt
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: queue_depth tracking from LLD
2009-04-16 9:36 queue_depth tracking from LLD Christof Schmitt
@ 2009-04-16 14:13 ` James Smart
2009-04-16 14:27 ` Mike Christie
2009-04-16 14:33 ` Matthew Wilcox
0 siblings, 2 replies; 8+ messages in thread
From: James Smart @ 2009-04-16 14:13 UTC (permalink / raw)
To: Christof Schmitt; +Cc: linux-scsi@vger.kernel.org
The mid-layer queue depth handling is really designed/optimized around
behavior for a JBOD. Thus, if it's a single-LUN device, the LLDD could
largely ignore doing anything with adjusting the queue depth.
However, for arrays with multiple LUNs, the queue depth is usually a
target-level resource, so the midlayer/block-layer's implementation
falls on its face fairly quickly. I brought this up two years ago at
the storage summit. What needs to happen is the creation of queue
ramp-down and ramp-up policies that can be selected on a per-LUN basis,
and have these implemented in the midlayer (why should the LLDD ever
look at SCSI command results?). What will make this difficult is the
ramp-up policies, as they can be very target-device-specific or
configuration/load centric.
In the meantime, if you look at any LLDD worth its salt, you will find
it implementing its own queue ramp-down and ramp-up algorithms
internally. They look for QUEUE_FULLs to ramp down, and select a rate
and methodology for the ramp-up. They use this routine to do the queue
depth changing.
-- james s
Christof Schmitt wrote:
> I just came across the SCSI midlayer function scsi_track_queue_full.
>
> If a SCSI command is returned with a status of QUEUE_FULL, this is
> mapped to ADD_TO_MLQUEUE and "device blocked". So there is already a
> mechanism in place. Is an LLD expected to additionally call something
> like this to decrease the queue depth?
>
> if (status_byte(scmd->result) == QUEUE_FULL)
>         scsi_track_queue_full(sdev, sdev->queue_depth - 1);
>
> If an LLD does this, should it also increase the queue depth again
> when no more QUEUE_FULL statuses are seen? To me this looks like a
> duplication of the midlayer device blocking, but I assume there is a
> reason for having both scsi_track_queue_full and the device blocking.
>
> --
> Christof Schmitt
* Re: queue_depth tracking from LLD
2009-04-16 14:13 ` James Smart
@ 2009-04-16 14:27 ` Mike Christie
2009-04-16 14:38 ` James Smart
2009-04-16 14:33 ` Matthew Wilcox
1 sibling, 1 reply; 8+ messages in thread
From: Mike Christie @ 2009-04-16 14:27 UTC (permalink / raw)
To: James Smart; +Cc: Christof Schmitt, linux-scsi@vger.kernel.org
James Smart wrote:
> The mid-layer queue depth handling is really designed/optimized
> around behavior for a JBOD. Thus, if it's a single-LUN device, the
> LLDD could largely ignore doing anything with adjusting the queue
> depth.
>
> However, for arrays with multiple LUNs, the queue depth is usually a
> target-level resource, so the midlayer/block-layer's implementation
> falls on its face fairly quickly. I brought this up two years ago at
> the storage summit. What needs to happen is the creation of queue
> ramp-down and ramp-up policies that can be selected on a per-LUN
> basis, and have these implemented in the midlayer (why should the
> LLDD ever look at SCSI command results?). What will make this
> difficult is the ramp-up policies, as they can be very
> target-device-specific or configuration/load centric.
For the ramp-up, are you referring to code like
lpfc_rampup_queue_depth? We were just talking about this on the fcoe
list. Why did lpfc and qla2xxx end up implementing their own code? We
started to look into moving this into the SCSI layer. It does not seem
like there was a major reason why it should not have been more common.
Was it just one of those things where it got added in one driver and
then added in another?

If we moved code like that to the SCSI layer, then is all that is
needed an interface to configure this?
* Re: queue_depth tracking from LLD
2009-04-16 14:13 ` James Smart
2009-04-16 14:27 ` Mike Christie
@ 2009-04-16 14:33 ` Matthew Wilcox
2009-04-16 14:40 ` James Smart
1 sibling, 1 reply; 8+ messages in thread
From: Matthew Wilcox @ 2009-04-16 14:33 UTC (permalink / raw)
To: James Smart; +Cc: Christof Schmitt, linux-scsi@vger.kernel.org
On Thu, Apr 16, 2009 at 10:13:42AM -0400, James Smart wrote:
> However, for arrays with multiple LUNs, the queue depth is usually a
> target-level resource, so the midlayer/block-layer's implementation
> falls on its face fairly quickly. I brought this
If the problem were as simple as the resource being target-level instead
of LUN-level, it would be fairly easy to fix (we could do accounting
per-target instead of per-LUN). The problem, AIUI, is multi-initiator
where you can't know whether resources are in use or not.
> up two years ago at the storage summit. What needs to happen is the
> creation of queue ramp-down and ramp-up policies that can be selected
> on a per-LUN basis, and have these implemented in the midlayer (why
> should the LLDD ever look at SCSI command results?). What will make
> this difficult is the ramp-up policies, as they can be very
> target-device-specific or configuration/load centric.
While not disagreeing that it's complex, I don't think putting it in the
driver makes it less complex. I completely agree that LLDDs should not
be snooping scsi commands or scsi command results. It should all be in
the midlayer so we all share the same bugs ;-)
--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: queue_depth tracking from LLD
2009-04-16 14:27 ` Mike Christie
@ 2009-04-16 14:38 ` James Smart
2009-04-16 15:27 ` Christof Schmitt
0 siblings, 1 reply; 8+ messages in thread
From: James Smart @ 2009-04-16 14:38 UTC (permalink / raw)
To: Mike Christie; +Cc: Christof Schmitt, linux-scsi@vger.kernel.org
Mike Christie wrote:
> James Smart wrote:
>
>> The mid-layer queue depth handling is really designed/optimized
>> around behavior for a JBOD. Thus, if it's a single-LUN device, the
>> LLDD could largely ignore doing anything with adjusting the queue
>> depth.
>>
>> However, for arrays with multiple LUNs, the queue depth is usually a
>> target-level resource, so the midlayer/block-layer's implementation
>> falls on its face fairly quickly. I brought this up two years ago at
>> the storage summit. What needs to happen is the creation of queue
>> ramp-down and ramp-up policies that can be selected on a per-LUN
>> basis, and have these implemented in the midlayer (why should the
>> LLDD ever look at SCSI command results?). What will make this
>> difficult is the ramp-up policies, as they can be very
>> target-device-specific or configuration/load centric.
>>
>
> For the ramp-up, are you referring to code like
> lpfc_rampup_queue_depth? We were just talking about this on the fcoe
> list. Why did lpfc and qla2xxx end up implementing their own code? We
> started to look into moving this into the SCSI layer. It does not
> seem like there was a major reason why it should not have been more
> common. Was it just one of those things where it got added in one
> driver and then added in another?
>
No good reason. It should be in the midlayer, and that was the
recommendation I made at the storage summit a couple of years ago. It
hasn't happened because, for the drivers that care, it was already
implemented. It also isn't a relished task, as there will be lots of
discussion on how the ramp-up should be implemented, which may mean
the need for more algorithms.
> If we moved code like that to the SCSI layer, then is all that is
> needed an interface to configure this?
>
Yep. As mentioned, figuring out which algorithm, for which device and
configuration, will be the more interesting thing.
-- james
* Re: queue_depth tracking from LLD
2009-04-16 14:33 ` Matthew Wilcox
@ 2009-04-16 14:40 ` James Smart
0 siblings, 0 replies; 8+ messages in thread
From: James Smart @ 2009-04-16 14:40 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Christof Schmitt, linux-scsi@vger.kernel.org
Completely agree. The multi-initiator point is the one I try to hammer
home. It's what the current algorithm completely misses.

Even though I said it's complex, it's really not that difficult. The
pain is just figuring out what to group and what the rates should be.
-- james s
Matthew Wilcox wrote:
> On Thu, Apr 16, 2009 at 10:13:42AM -0400, James Smart wrote:
>
>> However, for arrays with multiple LUNs, the queue depth is usually a
>> target-level resource, so the midlayer/block-layer's implementation
>> falls on its face fairly quickly. I brought this
>>
>
> If the problem were as simple as the resource being target-level instead
> of LUN-level, it would be fairly easy to fix (we could do accounting
> per-target instead of per-LUN). The problem, AIUI, is multi-initiator
> where you can't know whether resources are in use or not.
>
>
>> up two years ago at the storage summit. What needs to happen is the
>> creation of queue ramp-down and ramp-up policies that can be
>> selected on a per-LUN basis, and have these implemented in the
>> midlayer (why should the LLDD ever look at SCSI command results?).
>> What will make this difficult is the ramp-up policies, as they can
>> be very target-device-specific or configuration/load centric.
>>
>
> While not disagreeing that it's complex, I don't think putting it in the
> driver makes it less complex. I completely agree that LLDDs should not
> be snooping scsi commands or scsi command results. It should all be in
> the midlayer so we all share the same bugs ;-)
>
>
* Re: queue_depth tracking from LLD
2009-04-16 14:38 ` James Smart
@ 2009-04-16 15:27 ` Christof Schmitt
2009-04-16 15:32 ` James Smart
0 siblings, 1 reply; 8+ messages in thread
From: Christof Schmitt @ 2009-04-16 15:27 UTC (permalink / raw)
To: James Smart; +Cc: Mike Christie, linux-scsi@vger.kernel.org
On Thu, Apr 16, 2009 at 10:38:06AM -0400, James Smart wrote:
> Mike Christie wrote:
[...]
> No good reason. It should be in the midlayer, and that was the
> recommendation I made at the storage summit a couple of years ago.
> It hasn't happened because, for the drivers that care, it was already
> implemented. It also isn't a relished task, as there will be lots of
> discussion on how the ramp-up should be implemented, which may mean
> the need for more algorithms.
zfcp is one of the drivers that don't have a ramp-up/down mechanism in
place, and I am trying to understand what is required here. Is there
currently work being done to get queue_depth ramp-up/down into the
midlayer?
>> If we moved code like that to the SCSI layer, then is all that is
>> needed an interface to configure this?
> Yep. As mentioned, figuring out which algorithm, for which device and
> configuration, will be the more interesting thing.
If the LLDs that currently have a private ramp-up/down mechanism in
place use a similar strategy, would it make sense to first move it to
common code that can be activated from an LLD, and later refine it
with device-specific behaviour?
--
Christof Schmitt
* Re: queue_depth tracking from LLD
2009-04-16 15:27 ` Christof Schmitt
@ 2009-04-16 15:32 ` James Smart
0 siblings, 0 replies; 8+ messages in thread
From: James Smart @ 2009-04-16 15:32 UTC (permalink / raw)
To: Christof Schmitt; +Cc: Mike Christie, linux-scsi@vger.kernel.org
Christof Schmitt wrote:
> zfcp is one of the drivers that don't have ramp-up/down mechanism in
> place. And i am trying to understand what is required here. Is there
> currently work being done to get queue_depth ramp-up/down in the
> midlayer?
No work started - but desired.
I'd recommend, rather than implementing it in zfcp, implementing it in
the midlayer for everyone.
>
>>> If we moved code like that to the SCSI layer, then is all that is
>>> needed an interface to configure this?
>
>> Yep. As mentioned, figuring out which algorithm, for which device
>> and configuration, will be the more interesting thing.
>
> If the LLDs that currently have a private ramp-up/down mechanism in
> place use a similar strategy, would it make sense to first move it to
> common code that can be activated from an LLD, and later refine it
> with device-specific behaviour?
Yes.
-- james s
end of thread, other threads:[~2009-04-16 15:33 UTC | newest]
Thread overview: 8+ messages
2009-04-16 9:36 queue_depth tracking from LLD Christof Schmitt
2009-04-16 14:13 ` James Smart
2009-04-16 14:27 ` Mike Christie
2009-04-16 14:38 ` James Smart
2009-04-16 15:27 ` Christof Schmitt
2009-04-16 15:32 ` James Smart
2009-04-16 14:33 ` Matthew Wilcox
2009-04-16 14:40 ` James Smart