public inbox for linux-scsi@vger.kernel.org

* requeuing a Scsi_Cmnd?
@ 2003-05-11 20:25 Jeff Garzik
  2003-05-12  5:43 ` Luben Tuikov
  2003-05-12 14:44 ` James Bottomley
  0 siblings, 2 replies; 7+ messages in thread
From: Jeff Garzik @ 2003-05-11 20:25 UTC (permalink / raw)
  To: linux-scsi

This question applies to 2.4 as well as 2.5 (I believe the strategies 
are different for the two?)

Suppose I am passed several Scsi_Cmnd structures via ->queuecommand. 
TCQ depth is >1.  An event causes the entire queue to be aborted, but I 
know that the majority of the queue was actually ok.  So, my LLD would 
need to requeue and resend most of the recently-aborted Scsi_Cmnds.

How do I tell the SCSI layer to requeue and resend a Scsi_Cmnd [almost] 
immediately?

Thanks,

	Jeff

* Re: requeuing a Scsi_Cmnd?
  2003-05-11 20:25 requeuing a Scsi_Cmnd? Jeff Garzik
@ 2003-05-12  5:43 ` Luben Tuikov
  2003-05-12 14:44 ` James Bottomley
  1 sibling, 0 replies; 7+ messages in thread
From: Luben Tuikov @ 2003-05-12  5:43 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-scsi

Jeff Garzik wrote:
> 
> Suppose I am passed several Scsi_Cmnd structures via ->queuecommand. TCQ 
> depth is >1.  An event causes the entire queue to be aborted, but I know 
> that the majority of the queue was actually ok.

If the event was generated by the application client/SCSI Core,
then the normal logic applies -- either cancel all commands,
or cancel those which were not completed and finish those
which were.  I personally prefer the first option (unconditionally
cancel all commands on abort/clear task set) since this is what
the spec says.

If the event was generated by other means, i.e. NOT by
the application client/SCSI Core, you can error out the
unfinished commands and return ok for the finished ones
(i.e. finish them).

>  So, my LLD would need 
> to requeue and resend most of the recently-aborted Scsi_Cmnds.
> 
> How do I tell the SCSI layer to requeue and resend a Scsi_Cmnd [almost] 
> immediately?

Return BUSY or TASK SET FULL for the task(s) which you want to be
requeued almost immediately.  The two status codes differ in meaning,
as per the spec.

In 2.4, if you return BUSY, the command is tried almost immediately,
and if you return TASK SET FULL, it is also tried but things
get too convoluted... so you're better off with BUSY.

In 2.5, BUSY and TASK SET FULL are treated the same, so you can
either go by the spec and return whichever makes sense, or
just return BUSY and you'll be fine for 2.4 and 2.5.
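
A minimal sketch of the "just return BUSY" advice, under stated
assumptions: struct demo_cmnd and every demo_* / DEMO_* name are
hypothetical stand-ins, not kernel identifiers, and the status byte is
assumed to live in the low 8 bits of the result word.  The numeric
values are the SAM status codes for GOOD, BUSY and TASK SET FULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* SAM status byte values; the names are local to this sketch. */
#define DEMO_STAT_GOOD          0x00
#define DEMO_STAT_BUSY          0x08
#define DEMO_STAT_TASK_SET_FULL 0x28

/* Hypothetical stand-in for the result word of a Scsi_Cmnd. */
struct demo_cmnd {
    uint32_t result;    /* assumed layout: status byte in bits 0-7 */
};

/* Store a status byte, leaving the upper bytes of result untouched. */
void demo_set_status(struct demo_cmnd *cmd, uint8_t status)
{
    assert(cmd != NULL);
    cmd->result = (cmd->result & ~(uint32_t)0xff) | status;
}

/* Read the status byte back out of result. */
uint8_t demo_get_status(const struct demo_cmnd *cmd)
{
    assert(cmd != NULL);
    return (uint8_t)(cmd->result & 0xff);
}

/* Ask for a requeue in the way suggested for both 2.4 and 2.5: BUSY. */
void demo_request_requeue(struct demo_cmnd *cmd)
{
    demo_set_status(cmd, DEMO_STAT_BUSY);
}
```

A driver that prefers to follow the spec on 2.5 could pass
DEMO_STAT_TASK_SET_FULL to demo_set_status() instead.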

On a side note: remember that, as far as SCSI Transports
are concerned, after a task is cancelled (initiator sends
ABORT TASK to the TARGET) successfully, the target should
NOT send any status/response for that task, as the initiator
is NOT expecting ANY responses for a cancelled task
(across a transport, i.e. ``below'' a LLDD).

-- 
Luben

* Re: requeuing a Scsi_Cmnd?
  2003-05-11 20:25 requeuing a Scsi_Cmnd? Jeff Garzik
  2003-05-12  5:43 ` Luben Tuikov
@ 2003-05-12 14:44 ` James Bottomley
  2003-05-12 18:12   ` Luben Tuikov
  1 sibling, 1 reply; 7+ messages in thread
From: James Bottomley @ 2003-05-12 14:44 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: SCSI Mailing List

On Sun, 2003-05-11 at 15:25, Jeff Garzik wrote:
> This question applies to 2.4 as well as 2.5 (I believe the strategies 
> are different for the two?)
> 
> Suppose I am passed several Scsi_Cmnd structures via ->queuecommand. 
> TCQ depth is >1.  An event causes the entire queue to be aborted, but I 
> know that the majority of the queue was actually ok.  So, my LLD would 
> need to requeue and resend most of the recently-aborted Scsi_Cmnds.

You keep finding these unhandled conditions, sigh.  The correct thing to
do (since this is a situation identical to QErr set) is to return a
check condition to the failing command and to return a status of TASK
ABORTED for all the others (SPC3).  Of course, the SCSI-2 behaviour was
just to expect all tasks to be silently aborted on QErr=1.  Neither of
these, of course, is coded into the mid-layer.

> How do I tell the SCSI layer to requeue and resend a Scsi_Cmnd [almost] 
> immediately?

Probably the best thing to do is to return DID_BUS_BUSY which will force
a fast retry.  Note, that's the driver return, *not* the status BUSY,
which will force a requeue and take longer---DID_BUS_BUSY will retry
immediately from the SCSI tasklet.
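
The distinction between the driver (host) return and the status byte
can be sketched as follows.  This is only an illustration: struct
demo_cmnd, demo_count_done() and the DEMO_* names are hypothetical, the
host byte is assumed to sit in bits 16-23 of the result word, and
DEMO_DID_BUS_BUSY is assumed to carry the same numeric value as
DID_BUS_BUSY.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_DID_OK       0x00
#define DEMO_DID_BUS_BUSY 0x02  /* assumed to match DID_BUS_BUSY */

/* Hypothetical stand-in for a Scsi_Cmnd. */
struct demo_cmnd {
    uint32_t result;                        /* host byte in bits 16-23 */
    void (*scsi_done)(struct demo_cmnd *);  /* completion callback */
};

/* Counts completions; stands in for the mid-layer's done handler. */
int demo_done_calls;

void demo_count_done(struct demo_cmnd *cmd)
{
    (void)cmd;
    demo_done_calls++;
}

/* Extract the host byte from result. */
uint8_t demo_host_byte(const struct demo_cmnd *cmd)
{
    assert(cmd != NULL);
    return (uint8_t)((cmd->result >> 16) & 0xff);
}

/*
 * Hand the command back with a host-level "bus busy" code and a zero
 * status byte, i.e. the driver return rather than the status BUSY.
 */
void demo_complete_bus_busy(struct demo_cmnd *cmd)
{
    assert(cmd != NULL && cmd->scsi_done != NULL);
    cmd->result = (uint32_t)DEMO_DID_BUS_BUSY << 16;
    cmd->scsi_done(cmd);
}
```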

James

* Re: requeuing a Scsi_Cmnd?
  2003-05-12 14:44 ` James Bottomley
@ 2003-05-12 18:12   ` Luben Tuikov
  2003-05-12 20:30     ` James Bottomley
  2003-05-13  0:20     ` Jeff Garzik
  0 siblings, 2 replies; 7+ messages in thread
From: Luben Tuikov @ 2003-05-12 18:12 UTC (permalink / raw)
  To: James Bottomley; +Cc: Jeff Garzik, SCSI Mailing List

James Bottomley wrote:
> On Sun, 2003-05-11 at 15:25, Jeff Garzik wrote:
> 
>>This question applies to 2.4 as well as 2.5 (I believe the strategies 
>>are different for the two?)
>>
>>Suppose I am passed several Scsi_Cmnd structures via ->queuecommand. 
>>TCQ depth is >1.  An event causes the entire queue to be aborted, but I 
>>know that the majority of the queue was actually ok.  So, my LLD would 
>>need to requeue and resend most of the recently-aborted Scsi_Cmnds.
> 
> 
> You keep finding these unhandled conditions, sigh.  The correct thing to
> do (since this is a situation identical to QErr set) is to return a
> check condition to the failing command and to return a status of TASK
> ABORTED for all the others (SPC3).  Of course, the SCSI-2 behaviour was
> just to expect all tasks to be silently aborted on QErr=1.  Neither of
> these, of course, is coded into the mid-layer.

Iff TAS is set and if TST is 001, and there is more than one initiator
whose tasks are being nuked, then this is correct.

Jeff didn't give much information, but it sounds like ABORT/CLEAR TASK SET.
Anyway, no point in speculating.

>>How do I tell the SCSI layer to requeue and resend a Scsi_Cmnd [almost] 
>>immediately?
> 
> 
> Probably the best thing to do is to return DID_BUS_BUSY which will force
> a fast retry.  Note, that's the driver return, *not* the status BUSY,
> which will force a requeue and take longer---DID_BUS_BUSY will retry
> immediately from the SCSI tasklet.

Uuuuh... Shouldn't this be fixed?

The problem with this is that on BUSY, the LLDD _may_ get the task
which got this condition out of order...  Furthermore, there's no
such thing as BUS BUSY anymore, it's an SPI leftover.  In case
where the transport is unavailable, a service response of
SERVICE DELIVERY OR TARGET FAILURE should be returned*.
Currently SCSI Core has no facility to return a ``service response''.

* The LLDD should try to reconnect/reestablish before returning
this, of course.

I want newer LLDDs to return the (task) status codes, so
that we have fewer problems later.  They are actually req'd to
return (task) status codes by newer versions of their transport
protocols.

-- 
Luben

* Re: requeuing a Scsi_Cmnd?
  2003-05-12 18:12   ` Luben Tuikov
@ 2003-05-12 20:30     ` James Bottomley
  2003-05-13  0:20     ` Jeff Garzik
  1 sibling, 0 replies; 7+ messages in thread
From: James Bottomley @ 2003-05-12 20:30 UTC (permalink / raw)
  To: Luben Tuikov; +Cc: Jeff Garzik, SCSI Mailing List

On Mon, 2003-05-12 at 13:12, Luben Tuikov wrote:
> Uuuuh... Shouldn't this be fixed?

Not really, it's design behaviour.  A BUSY or QUEUE_FULL return means the
actual device couldn't accept the command, so we push back on the block
queue and suspend until a returning I/O frees the queue.  DID_BUS_BUSY
indicates transient failures on the host side which are eligible for
immediate requeueing.

> The problem with this is that on BUSY, the LLDD _may_ get the task
> which got this condition out of order...  Furthermore, there's no

This would be why we don't support barriers.

> such thing as BUS BUSY anymore, it's an SPI leftover.  In case
> where the transport is unavailable, a service response of
> SERVICE DELIVERY OR TARGET FAILURE should be returned*.
> Currently SCSI Core has no facility to return a ``service response''.

The service responses are what the DID_ attributes are about.  At the end
of the day, though, the mid layer will either

1. fail the command,
2. retry immediately, or
3. retry after another command returns.

That's what the large set of case statements in
scsi_error.c:scsi_decide_disposition() is about.
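
As a rough illustration of those three outcomes (plus plain success),
here is a toy model built only from what this thread states:
DID_BUS_BUSY retries immediately, BUSY / TASK SET FULL wait for another
command to return.  It is not the logic of scsi_decide_disposition();
all demo_* / DEMO_* names are hypothetical, and treating every other
code as a failure is a simplifying assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_DID_OK             0x00
#define DEMO_DID_BUS_BUSY       0x02  /* host byte, bits 16-23 */
#define DEMO_STAT_GOOD          0x00  /* status byte, bits 0-7 */
#define DEMO_STAT_BUSY          0x08
#define DEMO_STAT_TASK_SET_FULL 0x28

enum demo_disposition {
    DEMO_DONE,         /* command succeeded */
    DEMO_FAIL,         /* 1. fail the command */
    DEMO_RETRY_NOW,    /* 2. retry immediately */
    DEMO_RETRY_LATER   /* 3. retry after another command returns */
};

/* Hypothetical stand-in for a Scsi_Cmnd. */
struct demo_cmnd {
    uint32_t result;
};

enum demo_disposition demo_decide(const struct demo_cmnd *cmd)
{
    uint8_t host, status;

    assert(cmd != NULL);
    host = (uint8_t)((cmd->result >> 16) & 0xff);
    status = (uint8_t)(cmd->result & 0xff);

    /* Transient host-side trouble: eligible for immediate requeue. */
    if (host == DEMO_DID_BUS_BUSY)
        return DEMO_RETRY_NOW;
    /* Simplification: any other host-level code fails the command. */
    if (host != DEMO_DID_OK)
        return DEMO_FAIL;

    switch (status) {
    case DEMO_STAT_GOOD:
        return DEMO_DONE;
    case DEMO_STAT_BUSY:
    case DEMO_STAT_TASK_SET_FULL:
        /* The device could not accept the command: push back. */
        return DEMO_RETRY_LATER;
    default:
        /* Simplification: no sense-data handling in this sketch. */
        return DEMO_FAIL;
    }
}
```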

James

* Re: requeuing a Scsi_Cmnd?
  2003-05-12 18:12   ` Luben Tuikov
  2003-05-12 20:30     ` James Bottomley
@ 2003-05-13  0:20     ` Jeff Garzik
  2003-10-30 23:48       ` Andre Hedrick
  1 sibling, 1 reply; 7+ messages in thread
From: Jeff Garzik @ 2003-05-13  0:20 UTC (permalink / raw)
  To: Luben Tuikov; +Cc: James Bottomley, SCSI Mailing List

Luben Tuikov wrote:
> James Bottomley wrote:
> 
>> On Sun, 2003-05-11 at 15:25, Jeff Garzik wrote:
>>
>>> This question applies to 2.4 as well as 2.5 (I believe the strategies 
>>> are different for the two?)
>>>
>>> Suppose I am passed several Scsi_Cmnd structures via ->queuecommand. 
>>> TCQ depth is >1.  An event causes the entire queue to be aborted, but 
>>> I know that the majority of the queue was actually ok.  So, my LLD 
>>> would need to requeue and resend most of the recently-aborted 
>>> Scsi_Cmnds.
>>
>>
>>
>> You keep finding these unhandled conditions, sigh.  The correct thing to
>> do (since this is a situation identical to QErr set) is to return a
>> check condition to the failing command and to return a status of TASK
>> ABORTED for all the others (SPC3).  Of course, the SCSI-2 behaviour was
>> just to expect all tasks to be silently aborted on QErr=1.  Neither of
>> these, of course, is coded into the mid-layer.
> 
> 
> Iff TAS is set and if TST is 001, and there is more than one initiator
> whose tasks are being nuked, then this is correct.
> 
> Jeff didn't give much information, but it sounds like ABORT/CLEAR TASK SET.
> Anyway, no point in speculating.

I'm working on a SCSI low-level driver that drives SATA host 
controllers.  For ATAPI, it's mainly a passthru.  The headache comes in 
the translation of SCSI->ATA, and in the error handling.  The SCSI->ATA 
translator can be effectively considered a SCSI simulator (or at least 
that's how I look at it), so like iSCSI I'm creating a software target, 
and I want my target to be compliant to spec.  (Which spec, you ask? 
Well, initially SCSI-2, but long term James has convinced me SCSI-3)

So for my specific example, I'm passed a bunch of Scsi_Cmnds.  I queue 
them.  And then according to spec, an error on the active command will 
cause the entire queue to abort.  Clearly, I do not want to error-out 
the other probably-valid commands, only the specific one that caused the 
error.  So, the remaining ones need to be retried.
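
A sketch of that scenario, combining it with the DID_BUS_BUSY
suggestion made earlier in the thread: the command that triggered the
abort is completed with CHECK CONDITION, the rest are flagged for a
fast retry.  Everything here is hypothetical (struct demo_cmnd,
demo_flush_queue(), the DEMO_* names); a real LLD would also have to
supply sense data for the failed command and call its completion
callback instead of setting a flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Both happen to be 0x02; they live in different bytes of result. */
#define DEMO_STAT_CHECK_CONDITION 0x02  /* status byte, bits 0-7 */
#define DEMO_DID_BUS_BUSY         0x02  /* host byte, bits 16-23 */

/* Hypothetical stand-in for a queued Scsi_Cmnd. */
struct demo_cmnd {
    uint32_t result;
    int completed;   /* stands in for calling the done callback */
};

/*
 * Complete every outstanding command after a queue-wide abort.
 * 'failed' is the index of the command that caused the error.
 * Returns how many commands were flagged for retry.
 */
size_t demo_flush_queue(struct demo_cmnd *q, size_t n, size_t failed)
{
    size_t i, retried = 0;

    assert(q != NULL && failed < n);
    for (i = 0; i < n; i++) {
        if (i == failed) {
            /* Only the culprit is reported as an error. */
            q[i].result = DEMO_STAT_CHECK_CONDITION;
        } else {
            /* Probably-valid commands: ask for a fast retry. */
            q[i].result = (uint32_t)DEMO_DID_BUS_BUSY << 16;
            retried++;
        }
        q[i].completed = 1;
    }
    return retried;
}
```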

	Jeff

* Re: requeuing a Scsi_Cmnd?
  2003-05-13  0:20     ` Jeff Garzik
@ 2003-10-30 23:48       ` Andre Hedrick
  0 siblings, 0 replies; 7+ messages in thread
From: Andre Hedrick @ 2003-10-30 23:48 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Luben Tuikov, James Bottomley, SCSI Mailing List


Jeff,

If the goal is to add a queue depth to HBAs without a qdma ring, you need a
whole pile of tasklets and thread queues.

If you have an HBA with qdma ring(s) then it is just like standard scsi.

Cheers,

Andre Hedrick
LAD Storage Consulting Group

On Mon, 12 May 2003, Jeff Garzik wrote:

> Luben Tuikov wrote:
> > James Bottomley wrote:
> > 
> >> On Sun, 2003-05-11 at 15:25, Jeff Garzik wrote:
> >>
> >>> This question applies to 2.4 as well as 2.5 (I believe the strategies 
> >>> are different for the two?)
> >>>
> >>> Suppose I am passed several Scsi_Cmnd structures via ->queuecommand. 
> >>> TCQ depth is >1.  An event causes the entire queue to be aborted, but 
> >>> I know that the majority of the queue was actually ok.  So, my LLD 
> >>> would need to requeue and resend most of the recently-aborted 
> >>> Scsi_Cmnds.
> >>
> >>
> >>
> >> You keep finding these unhandled conditions, sigh.  The correct thing to
> >> do (since this is a situation identical to QErr set) is to return a
> >> check condition to the failing command and to return a status of TASK
> >> ABORTED for all the others (SPC3).  Of course, the SCSI-2 behaviour was
> >> just to expect all tasks to be silently aborted on QErr=1.  Neither of
> >> these, of course, is coded into the mid-layer.
> > 
> > 
> > Iff TAS is set and if TST is 001, and there is more than one initiator
> > whose tasks are being nuked, then this is correct.
> > 
> > Jeff didn't give much information, but it sounds like ABORT/CLEAR TASK SET.
> > Anyway, no point in speculating.
> 
> 
> I'm working on a SCSI low-level driver that drives SATA host 
> controllers.  For ATAPI, it's mainly a passthru.  The headache comes in 
> the translation of SCSI->ATA, and in the error handling.  The SCSI->ATA 
> translator can be effectively considered a SCSI simulator (or at least 
> that's how I look at it), so like iSCSI I'm creating a software target, 
> and I want my target to be compliant to spec.  (Which spec, you ask? 
> Well, initially SCSI-2, but long term James has convinced me SCSI-3)
> 
> So for my specific example, I'm passed a bunch of Scsi_Cmnds.  I queue 
> them.  And then according to spec, an error on the active command will 
> cause the entire queue to abort.  Clearly, I do not want to error-out 
> the other probably-valid commands, only the specific one that caused the 
> error.  So, the remaining ones need to be retried.
> 
> 	Jeff
> 

