[RFC] libata new EH document


* [RFC] libata new EH document
@ 2005-08-29  6:11 Tejun Heo
  2005-08-29  6:13 ` Tejun Heo
                   ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Tejun Heo @ 2005-08-29  6:11 UTC (permalink / raw)
  To: Jeff Garzik, albertcc; +Cc: linux-ide

 Hello, Jeff, Albert & ATA developers.

 This is the final one of the recent document series for libata EH -
SCSI EH, ATA exceptions, libata EH and this one, libata new EH.

 This document tries to discuss how to implement new advanced EH.  It
also describes some proposed mechanisms in detail.  I'm aware that
things are vague without actual code, but I still think this document
alone can at least help discussion if nothing else.  As soon as some
consensus is reached regarding the general design, I'll follow up with
patches.

 Jeff, a lot is from my previous new EH/NCQ patchset, but quite a bit
has also changed (for the better, I hope).

 Thanks.

libata new EH
======================================

 As discussed in the previous libata EH doc, the current libata EH
needs some improvements.  This document discusses goals of new libata
EH and how to reach them.  Please read SCSI EH, ATA exceptions and
libata EH documents first.

TABLE OF CONTENTS

[1] Goals & design choices
    [1-1] Use SCSI hostt->eh_strategy_handler()
    [1-2] Unified error path in an EH thread
    [1-3] Synchronization
    [1-4] Clean mechanism to hand off qc's to EH
    [1-5] Separate EH qc
    [1-6] SCSI/libata separation
[2] Designs
    [2-1] Handoff of failed qc's
    [2-2] Timed out scmd's and qc's
    [2-3] Summary of [2-1] and [2-2]
    [2-4] EH processing & completion
[3] Ideas
    [3-1] Using EH for non-error exceptions and dynamic reconfiguration
    [3-2] Using EH for host_set level exclusion
[4] Implementation plan

[1] Goals & design choices

 The final goal is implementing advanced error handling as described
in ATA exceptions document including NCQ EH, dynamic transport
reconfiguration and non-error exception handling for power management
and hot plugging.

 The following are subgoals and design choices to reach the final
goal.

[1-1] Use SCSI hostt->eh_strategy_handler()

    We have two other alternatives here - one is using fine-grained
    SCSI EH callbacks and the other is implementing separate EH for
    libata.

    Using fine-grained SCSI EH callbacks is possible, but it has too
    many SCSI/SPI assumptions in it - ATA error handling can be quite
    different from SCSI error handling.  Also, as described in the
    SCSI EH doc, it issues several SCSI commands for recovery.  They
    can be translated but recovery through translation is a bit
    creepy, IMHO.

    The second option - private EH implementation - is attractive in
    that it will be better integrated into libata.  However,
    implementing a full EH when a generic framework is already in place
    doesn't make a lot of people happy.  And, I think integration
    problems can be worked around without too much trouble.

    The basic semantics of eh_strategy_handler() are

    - Full context EH.

    - After EH is started, all normal command processing is suspended
      until EH is complete.

    - Once EH is determined to be necessary, active commands are
      drained by suppressing all command issuing and waiting for
      in-flight commands.  When EH is finally entered, all active
      commands are failed commands.

    IMO, the above semantics are fairly fundamental to block device
    error handling, so assuming them shouldn't hurt too much even if
    libata migrates to some other framework in the future.

[1-2] Unified error path in an EH thread

    Currently EH is scattered around several places including the
    interrupt handler and polling tasks.  This is problematic for the
    following reasons.

    a. Full EH context is required for error handling.

       Advanced recovery usually involves resetting, command issuing
       and other blocking operations.

    b. Simple errors may trigger complex error handling behavior.

       For example, when an ABRT error occurs, reporting to the upper
       layer is sufficient in most cases; however, repeated ABRT
       errors for known-to-be-supported commands might indicate that
       the transmission speed is too high.  In such cases, full EH
       context is required to perform error handling.

    c. Scattered complex EH is difficult to implement and maintain.

       EH logic can be somewhat complex and scattering won't help
       implementing and maintaining it.  Also, libata low level
       drivers are allowed to override callbacks where part of EH
       logic may reside making matters worse.

[1-3] Synchronization

    A simple & concrete qc synchronization model to make sure that EH
    and any other processing don't occur concurrently is needed.

[1-4] Clean mechanism to hand off qc's to EH

    For EH to handle errors and timeouts, letting EH deal with and
    complete both errored and timed out qc's is good for simplicity
    and consistency.  To achieve this, we need a mechanism to hand off
    a qc to EH.

    Currently, libata EH has a similar mechanism to hand off a failed
    ATAPI qc to EH.  As described in the libata EH doc, such a qc is
    half-completed and used as a placeholder until EH kicks in and
    handles it.

    This half-completion isn't very clean semantically and requires
    calling split internal completion routines directly.  Also, as
    such qc's are not explicitly marked as failed, not-very-intuitive
    stuff has to be done to keep spurious interrupts or other events
    from messing with them after an error has occurred.

[1-5] Separate EH qc

    EH needs to issue qc's for recovery.  There are several ways to
    allocate an EH qc.

    a. reserve one extra qc for internal/EH commands
    b. reserve one of normal qc's
    c. use failed qc
    d. complete failed qc first and reuse it

    The preferred choice is #a for the following reasons.

    - Allowing only one concurrent internal command is okay as long as
      proper allocation mechanism is implemented or only one user is
      guaranteed.

    - EH commands are restricted to non-NCQ commands, so reserving an
      extra qc won't break qc to tag mapping.

    - #b is impossible for non-NCQ devices because only one qc is
      available.

    - #c requires dancing with qc's internals.  No real nerd likes
       dancing.

    - It may be necessary to issue commands to determine whether to
      finish or retry a qc, so #d is out.
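
    To make #a concrete, here is a rough user-space sketch, not actual
    libata code.  It assumes a per-port qc array with one slot appended
    for internal/EH use; MAX_QUEUE, INTERNAL_TAG, qc_new_internal() and
    the struct layouts are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_QUEUE     32          /* normal qc's, tag == index (assumed size) */
#define INTERNAL_TAG  MAX_QUEUE   /* the one extra slot reserved for EH */

struct qc {
	int tag;
	int in_use;
};

struct port {
	struct qc qcmd[MAX_QUEUE + 1];   /* last slot is the internal/EH qc */
};

/* Normal allocation only looks at tags 0..MAX_QUEUE-1. */
static struct qc *qc_new(struct port *ap)
{
	for (int i = 0; i < MAX_QUEUE; i++) {
		if (!ap->qcmd[i].in_use) {
			ap->qcmd[i].in_use = 1;
			ap->qcmd[i].tag = i;
			return &ap->qcmd[i];
		}
	}
	return NULL;
}

/* Only one internal command at a time; a second request fails. */
static struct qc *qc_new_internal(struct port *ap)
{
	struct qc *qc = &ap->qcmd[INTERNAL_TAG];

	if (qc->in_use)
		return NULL;
	qc->in_use = 1;
	qc->tag = INTERNAL_TAG;
	return qc;
}

static void qc_free(struct qc *qc)
{
	assert(qc->in_use);
	qc->in_use = 0;
}
```

    As the reserved slot sits outside the normal tag range, normal
    allocation never hands it out and the qc to tag mapping of normal
    commands is untouched, even when all normal qc's are in use.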

[1-6] SCSI/libata separation

    The internal libata EH logic should be free from SCSI
    considerations.  All glue work should be localized to the EH
    frontend; once in the actual error handling, EH should only deal
    with qc's.

[2] Designs

 This section proposes detailed design of several important mechanisms
to help discussion and verification.

[2-1] Handoff of failed qc's

 As described above, when normal command processing determines that a
qc has failed, the qc has to be handed off to EH without being lost.

 A new qc flag ATA_QCFLAG_ERROR is defined to mark qc's which have
failed, and ata_qc_error() is defined to be used by command processing
to mark a failed qc and schedule EH.  ata_qc_error() has to be called
under the same condition as ata_qc_complete() - under host_lock - and
performs the following.

 1. First check if the command is already marked with
    ATA_QCFLAG_ERROR.  If so, this isn't the first error completion
    attempt, so just return.

 2. Mark the qc with ATA_QCFLAG_ERROR.

 3. As, currently, SCSI command issuing is not atomic with respect to
    SHOST_RECOVERY flag, we need a separate atomic mechanism to plug
    command issuing.  Per-port flag ATA_FLAG_ERROR is set here to
    prevent further command issuing.

 4. The corresponding scmd's result code is set to
    SAM_STAT_CHECK_CONDITION and the qc->scsidone() callback is called
    directly.  As we haven't filled in sense data,
    scsi_decide_disposition() will return FAILED and SCSI EH will be
    scheduled.  Note that as we directly call qc->scsidone(), the qc
    is left intact.

 After the above function completes, the following conditions hold.

 a. The qc has ATA_QCFLAG_ERROR set and no further normal qc
    processing will happen for the command.

 b. No new qc will be issued for the port.

 c. EH is scheduled.

 d. Corresponding scmd and qc are left alone until EH processes them.

 Note that to achieve the above behavior, other places need to be
modified too; e.g. ata_qc_complete() has to ignore failed qc's and the
command issuing path has to refuse to issue if ATA_FLAG_ERROR is set.
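
 The handoff steps can be modeled in user space as follows.  This is
only a sketch under simplifying assumptions: the structs are cut down
to the fields involved, the flag bit values are arbitrary, locking is
left to the caller, and model_scsidone() is a hypothetical stand-in for
the real ->scsidone callback.

```c
#include <assert.h>
#include <stdbool.h>

#define ATA_QCFLAG_ERROR          (1u << 0)   /* proposed qc flag, value assumed */
#define ATA_FLAG_ERROR            (1u << 0)   /* proposed port flag, value assumed */
#define SAM_STAT_CHECK_CONDITION  0x02

struct scmd {
	int result;
	bool done_called;
};

struct port {
	unsigned int flags;
};

struct qc {
	unsigned int flags;
	struct port *ap;
	struct scmd *scmd;
	void (*scsidone)(struct scmd *);
};

/* Stand-in for the midlayer completion callback. */
static void model_scsidone(struct scmd *cmd)
{
	cmd->done_called = true;
}

/* Caller is assumed to hold the same lock as for ata_qc_complete(). */
static void ata_qc_error(struct qc *qc)
{
	if (qc->flags & ATA_QCFLAG_ERROR)              /* step 1: already failed */
		return;
	qc->flags |= ATA_QCFLAG_ERROR;                 /* step 2: mark the qc */
	qc->ap->flags |= ATA_FLAG_ERROR;               /* step 3: plug issuing */
	qc->scmd->result = SAM_STAT_CHECK_CONDITION;   /* step 4: report, no sense */
	qc->scsidone(qc->scmd);                        /* qc itself stays intact */
}

/* Issue path: refuse new commands while the port is plugged. */
static int ata_qc_issue(struct qc *qc)
{
	if (qc->ap->flags & ATA_FLAG_ERROR)
		return -1;
	return 0;
}

/* Normal completion ignores qc's which already belong to EH. */
static bool ata_qc_complete(struct qc *qc)
{
	if (qc->flags & ATA_QCFLAG_ERROR)
		return false;
	qc->scmd->result = 0;
	qc->scsidone(qc->scmd);
	return true;
}
```

 A second ata_qc_error() on the same qc is a no-op, and a late
ata_qc_complete() from e.g. a spurious interrupt doesn't touch the
failed qc, which is what conditions a through d require.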

[2-2] Timed out scmd's and qc's

 Because libata keeps a separate command list in the form of the qc
array, SCSI and libata can disagree about which commands have timed
out.  Consider the following scenario.

 1. A scmd is issued.

 2. Corresponding qc is allocated, initialized & issued.

 3. SCSI timeout occurs & EH is scheduled.

 4. The qc completes successfully.  Because the timer has already
    expired, scsi_done() will return without doing anything.

 5. EH starts.

 In the above case, we have a timed out scmd but the corresponding qc
has already completed and been deallocated.  This is the only case
where a failed or timed out scmd doesn't have its corresponding qc.
Note that if the qc had failed in step #4, ata_qc_error() would have
been called, the qc tagged with ATA_QCFLAG_ERROR and EH would take the
steps in [2-1].

 This can be easily worked around by scanning scmds on shost->eh_cmd_q
and completing scmds which don't have corresponding qc's with success
code.  This way, internal libata EH is insulated from SCSI details and
only has to deal with qc's.

 qc's which are determined to have timed out are marked with
ATA_QCFLAG_ERROR | ATA_QCFLAG_TIMEDOUT.  Note that all of the above
should happen atomically as we don't want to race with the interrupt
handler or polling tasks.
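
 A user-space sketch of that scan might look like the following.  It
is a model only: eh_cmd_q is represented as a plain array, the flag
values and NR_QC are arbitrary, find_qc() and eh_sync_timeouts() are
hypothetical names, and the whole function is assumed to run with the
lock held so that it is atomic with respect to the interrupt handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ATA_QCFLAG_ACTIVE    (1u << 0)   /* values assumed for the model */
#define ATA_QCFLAG_ERROR     (1u << 1)
#define ATA_QCFLAG_TIMEDOUT  (1u << 2)
#define NR_QC 4

struct scmd {
	bool finished_ok;
};

struct qc {
	unsigned int flags;
	struct scmd *scmd;
};

/* Find the active qc which carries the given scmd, if any. */
static struct qc *find_qc(struct qc *qcs, struct scmd *cmd)
{
	for (int i = 0; i < NR_QC; i++)
		if ((qcs[i].flags & ATA_QCFLAG_ACTIVE) && qcs[i].scmd == cmd)
			return &qcs[i];
	return NULL;
}

/*
 * Walk the scmds handed over by SCSI EH.  Returns the number of qc's
 * left for libata EH to deal with.
 */
static int eh_sync_timeouts(struct qc *qcs, struct scmd **eh_cmd_q, int n)
{
	int nr_failed = 0;

	for (int i = 0; i < n; i++) {
		struct qc *qc = find_qc(qcs, eh_cmd_q[i]);

		if (!qc) {
			/* qc completed after the SCSI timer fired: success */
			eh_cmd_q[i]->finished_ok = true;
			continue;
		}
		if (!(qc->flags & ATA_QCFLAG_ERROR))
			/* not failed via ata_qc_error(), so it timed out */
			qc->flags |= ATA_QCFLAG_ERROR | ATA_QCFLAG_TIMEDOUT;
		nr_failed++;
	}
	return nr_failed;
}
```

 After the scan, every scmd still on the EH list has a qc with
ATA_QCFLAG_ERROR set, so the rest of EH never has to look at scmds.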

[2-3] Summary of [2-1] and [2-2]

 - All failed qc's will have ATA_QCFLAG_ERROR set.

 - All timed out qc's will have ATA_QCFLAG_ERROR and
   ATA_QCFLAG_TIMEDOUT set.

 - Whenever ATA_QCFLAG_ERROR bit is set, ATA_FLAG_ERROR should also be
   set.

 - All three of the above should be done while holding host_set lock.

 - ata_qc_complete() and ata_qc_error() should not perform any
   operation on qc's which have ATA_QCFLAG_ERROR set.

 - No non-internal commands should be allowed on ports which have
   ATA_FLAG_ERROR set.

[2-4] EH processing & completion

 Once entered, EH can issue internal qc's for recovery.  Note that we
need to implement separate mechanisms for handling errors and timeouts
of those internal qc's, as we can't call into EH recursively.

 For errors, just reporting failure should be enough and this can be
easily implemented by calling ata_qc_complete() from ata_qc_error()
for internal commands.

 A separate timer should be used for internal commands.  When this
timer expires, the best we can do is to complete the qc with failed
status.

 EH code should be prepared to take appropriate actions to handle both
errors and timeouts such as resetting device on timeout.

 After necessary steps are taken for recovery and disposition is
determined for each failed qc, EH should retry or finish each failed
qc.  As noted in SCSI EH doc, eh_strategy_handler() should call either
scsi_finish_command() or scsi_queue_insert().  Because the failed qc's
are still active, overriding their ->scsidone callbacks appropriately
and performing ata_qc_complete() on those will do the job.  Note that
ATA_QCFLAG_ERROR checking should be bypassed when finishing off failed
qc's from EH.

 After all failed qc's are taken care of, libata EH should make sure
that all integrity constraints described in the SCSI EH doc are met
and clear ATA_FLAG_ERROR on the port.  Returning from
hostt->eh_strategy_handler() will make the SCSI midlayer resume normal
processing.
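
 The finishing step could be sketched like this in user space.  Again
this is only a model with made-up names: eh_done_finish() and
eh_done_retry() stand in for ->scsidone overrides which would end up in
scsi_finish_command() and scsi_queue_insert() respectively, and
eh_qc_complete() stands in for a completion path which bypasses the
ATA_QCFLAG_ERROR check.

```c
#include <assert.h>
#include <stdbool.h>

#define ATA_QCFLAG_ERROR  (1u << 0)   /* values assumed for the model */
#define ATA_FLAG_ERROR    (1u << 0)

enum eh_action  { EH_FINISH, EH_RETRY };
enum scmd_state { SCMD_PENDING, SCMD_FINISHED, SCMD_REQUEUED };

struct scmd {
	enum scmd_state state;
};

struct port {
	unsigned int flags;
};

struct qc {
	unsigned int flags;
	bool active;
	struct scmd *scmd;
	void (*scsidone)(struct scmd *);
};

/* Stand-ins for the two ways a scmd can leave EH. */
static void eh_done_finish(struct scmd *cmd) { cmd->state = SCMD_FINISHED; }
static void eh_done_retry(struct scmd *cmd)  { cmd->state = SCMD_REQUEUED; }

/* Completion from EH: no ATA_QCFLAG_ERROR check, the qc is released. */
static void eh_qc_complete(struct qc *qc)
{
	qc->flags &= ~ATA_QCFLAG_ERROR;
	qc->active = false;
	qc->scsidone(qc->scmd);
}

/* Retry or finish each failed qc, then unplug the port. */
static void eh_finish_port(struct port *ap, struct qc *qcs,
			   const enum eh_action *act, int n)
{
	for (int i = 0; i < n; i++) {
		if (!qcs[i].active || !(qcs[i].flags & ATA_QCFLAG_ERROR))
			continue;
		qcs[i].scsidone = (act[i] == EH_RETRY) ? eh_done_retry
						       : eh_done_finish;
		eh_qc_complete(&qcs[i]);
	}
	ap->flags &= ~ATA_FLAG_ERROR;
}
```

 The port flag is cleared only after every failed qc has been disposed
of, so no new command can be issued while failed qc's are still
around.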

[3] Ideas

[3-1] Using EH for non-error exceptions and dynamic reconfiguration

 Handling non-error exceptions like hot plugging and dynamic
reconfiguration such as lowering the transfer speed is best done
inside EH, but currently there is no way to invoke EH without failed
or timed out scmds.  IMHO, a mechanism to allow EH invocation without
a failed scmd should be simple to implement in the SCSI midlayer and
can solve this issue nicely.

[3-2] Using EH for host_set level exclusion

 Some EH / configuration actions require host_set level exclusion.
This can also be solved with the mechanism described in [3-1].
Before starting such an operation, EH can be invoked on all other
ports.  After all ports are safely parked inside EH, the operation can
be performed.  After the operation is complete, the other ports can be
released from EH.
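
 As a very rough, single-threaded model of the park / operate /
release sequence (all names here are hypothetical, and real code would
have to invoke EH on each port and wait for it instead of just setting
a flag):

```c
#include <assert.h>
#include <stdbool.h>

#define NR_PORTS 4   /* assumed number of ports in the host_set */

struct port {
	bool in_eh;
};

struct host_set {
	struct port ports[NR_PORTS];
	int ops_done;
};

/* True if every port except 'self' is parked inside EH. */
static bool others_parked(const struct host_set *hs, int self)
{
	for (int i = 0; i < NR_PORTS; i++)
		if (i != self && !hs->ports[i].in_eh)
			return false;
	return true;
}

/* Park all other ports, run the operation, then release them. */
static void run_exclusive(struct host_set *hs, int self,
			  void (*op)(struct host_set *))
{
	for (int i = 0; i < NR_PORTS; i++)
		if (i != self)
			hs->ports[i].in_eh = true;    /* invoke EH on the port */

	assert(others_parked(hs, self));
	op(hs);

	for (int i = 0; i < NR_PORTS; i++)
		if (i != self)
			hs->ports[i].in_eh = false;   /* release from EH */
}

/* Example operation which needs the whole host_set to itself. */
static void example_op(struct host_set *hs)
{
	hs->ops_done++;
}
```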

[4] Implementation plan

 Implementation of the new EH can be separated into two stages.  The
first is implementing the EH framework, i.e. qc handoff, EH
invocation, qc completion in EH and resuming normal operation.  The
second is implementing the actual error handling logic according to
the ATA exceptions doc.

 After completing the first stage, the current error handling logic
can be moved onto the new framework.  As this won't change libata's
behavior as seen from controllers and devices, we only have to verify
the framework itself and can continue to use the current logic until
the second stage is complete.

 Getting the error handling logic right will take some time, for
testing if not for development, and some high-on-wishlist features -
NCQ and hot plugging - are delayed due to EH.  So, once the new EH
framework is complete, fitting those in first and implementing unified
EH logic later might be a good idea.  NCQ can be easily integrated
once the framework is in place, but I'm not sure about hot plugging.


* Re: [RFC] libata new EH document
  2005-08-29  6:11 [RFC] libata new EH document Tejun Heo
@ 2005-08-29  6:13 ` Tejun Heo
  2005-08-30  9:10 ` Albert Lee
  2005-09-07  8:25 ` Jeff Garzik
  2 siblings, 0 replies; 33+ messages in thread
From: Tejun Heo @ 2005-08-29  6:13 UTC (permalink / raw)
  To: Jeff Garzik, albertcc; +Cc: linux-ide

 Oops, typos.

 Gotta take a nap.

On Mon, Aug 29, 2005 at 03:11:24PM +0900, Tejun Heo wrote:
>  Hello, Jeff, Albert & ATA developers.
> 
>  This is the final one of recent document series for libata EH - SCSI
> EH, ATA exceptions, libata EH and, this one - libata new EH.
> 
>  This document tries to discuss how to implement new advanced EH.  It
> also describes some proposed mechanisms in detail.  I'm aware that
> things are vague without actual code, but I still think this document
> alone can at least help discussion if nothing else.  As long as some

                                                       As soon as

> consensus is reached regarding general desing, I'll follow up with

                                         design/direction

> patches.
> 
>  Jeff, a lot are from my previous new EH/NCQ patchset but also quite
> a bit has changed (for better, I hope).

 Sorry. :-)

-- 
tejun


* Re: [RFC] libata new EH document
  2005-08-29  6:11 [RFC] libata new EH document Tejun Heo
  2005-08-29  6:13 ` Tejun Heo
@ 2005-08-30  9:10 ` Albert Lee
  2005-08-30 10:26   ` Tejun Heo
  2005-08-30 14:27   ` James Bottomley
  2005-09-07  8:25 ` Jeff Garzik
  2 siblings, 2 replies; 33+ messages in thread
From: Albert Lee @ 2005-08-30  9:10 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jeff Garzik, linux-ide, linux-scsi, Doug Maxey

Tejun Heo wrote:

>[2-1] Handoff of failed qc's
>
> As described above, when normal command processing determines that a
>qc has failed, those qc's have to be handed off to EH without being
>lost.
>
> A new qc flag ATA_QCFLAG_ERROR is defined to mark qc's which have
>failed and ata_qc_error() is defined to be used by command processing
>to mark failed qc and schedule EH.  ata_qc_error() has to be called
>under the same condition as ata_qc_complete() - under host_lock - and
>performs the following.
>
> 1. First check if the command is already marked with
>    ATA_QCFLAG_ERROR.  If so, this isn't the first error completion
>    attempt, just return.
>
> 2. Mark the qc with ATA_QCFLAG_ERROR.
>
> 3. As, currently, SCSI command issuing is not atomic with respect to
>    SHOST_RECOVERY flag, we need a separate atomic mechanism to plug
>    command issuing.  Per-port flag ATA_FLAG_ERROR is set here to
>    prevent further command issuing.
>
> 4. Corresponding scmd's result code is set to
>    SAM_STAT_CHECK_CONDITION and qc->scsidone() callback is called
>    directly.  As we haven't filled sense data,
>    scsi_determine_disposition() will return FAILED and SCSI EH will
>    be scheduled.  Note that as we directly call qc->scsidone(), qc is
>    left intact.
>  
>

Could we get the sense data before calling qc->scsidone()?  (Using the
proposed separate EH qc can keep the original qc intact.)

The issue:
When a DVD drive returns MEDIUM_ERROR in the sense data, libata doesn't
retry the command.

For libata, when scsi_softirq() calls scsi_decide_disposition() and
scsi_check_sense() to determine how to handle the result,
scsi_check_sense() always returns "fail" since the sense data is not
there yet.  The sense data is requested later in the libata error
handler.  But the command has already been considered as an "error".

By having the sense data ready before calling qc->scsidone(), we can
make the NEEDS_RETRY work in scsi_softirq().  So, for things like
MEDIUM_ERROR, the device has a chance to retry/recover the error.  This
seems to be important for devices with built-in defect management
system.

> After above function is complete, the following conditions are true.
>
> a. The qc has ATA_QCFLAG_ERROR set and no further normal qc
>    processing will happen for the command.
>
> b. No new qc will be issued for the port.
>
> c. EH is scheduled.
>
> d. Corresponding scmd and qc are left alone until EH processes them.
>
> Note that to achieve above behavior, we need to modify other places
>too.  e.g. ata_qc_complete() needs to be modified to ignore failed
>qc's and command issuing part to fail issuing if ATA_FLAG_ERROR is
>set.
>
>
>  
>




* Re: [RFC] libata new EH document
  2005-08-30  9:10 ` Albert Lee
@ 2005-08-30 10:26   ` Tejun Heo
  2005-08-30 14:32     ` Luben Tuikov
  2005-09-01  2:22     ` Jeff Garzik
  2005-08-30 14:27   ` James Bottomley
  1 sibling, 2 replies; 33+ messages in thread
From: Tejun Heo @ 2005-08-30 10:26 UTC (permalink / raw)
  To: Albert Lee; +Cc: Jeff Garzik, linux-ide, linux-scsi, Doug Maxey

Albert Lee wrote:
>>
>> 4. Corresponding scmd's result code is set to
>>    SAM_STAT_CHECK_CONDITION and qc->scsidone() callback is called
>>    directly.  As we haven't filled sense data,
>>    scsi_determine_disposition() will return FAILED and SCSI EH will
>>    be scheduled.  Note that as we directly call qc->scsidone(), qc is
>>    left intact.
>>  
>>
> 
> Could we get the sense data before calling qc->scsidone()?  (Using the 
> proposed separate
> EH qc can keep the original qc intact.)
> 
> The issue:
> When a DVD drive returns MEDIUM_ERROR in the sense data, libata doesn't 
> retry the command.
> 
> For libata, when scsi_softirq() calls scsi_decide_disposition() and 
> scsi_check_sense() to determine
> how to handle the result, scsi_check_sense() always returns "fail" since 
> the sense data is not there
> yet. The sense data is requested later in the libata error handler. But 
> the command has already been
> considered as an "error".
> 
> By having the sense data ready before calling qc->scsidone(), we can 
> make the
> NEEDS_RETRY work in scsi_softirq().  So, for things like MEDIUM_ERROR, 
> the device has
> a chance to retry/recover the error. This seems to be important for 
> devices with built-in
> defect management system.

  There are two ways a scmd can leave EH - retry by scsi_queue_insert() 
and finish by scsi_finish_cmd().  I think the problem you described can 
be easily solved by choosing the former method when finishing the qc 
from EH.  Note that other advanced EH stuff like reconfiguring transport 
speed also requires retrying, so we will surely have a mechanism for 
retrying failed qc's from EH.

  Wouldn't that be enough?

  Thanks.

-- 
tejun


* Re: [RFC] libata new EH document
  2005-08-30  9:10 ` Albert Lee
  2005-08-30 10:26   ` Tejun Heo
@ 2005-08-30 14:27   ` James Bottomley
  1 sibling, 0 replies; 33+ messages in thread
From: James Bottomley @ 2005-08-30 14:27 UTC (permalink / raw)
  To: Albert Lee
  Cc: Tejun Heo, Jeff Garzik, linux-ide, SCSI Mailing List, Doug Maxey

On Tue, 2005-08-30 at 17:10 +0800, Albert Lee wrote:
> Could we get the sense data before calling qc->scsidone()?  (Using the 
> proposed separate
> EH qc can keep the original qc intact.)

Well, the way most SCSI drivers do it today is to turn around the
command returning Check Condition from irq context.  This is sort of
like ACA simulation.  The reason it's done is because while a drive
holds a contingent allegiance condition it cannot accept other commands.
This effectively holds everything in abeyance until the sense is taken,
so for fast operation, SCSI drivers take the sense as soon as possible.

Ideally, though this mechanism should be deprecated because it causes us
to keep a complete copy of the old command and parameters around which
is a hit in the command allocation and set up fast path.

> The issue:
> When a DVD drive returns MEDIUM_ERROR in the sense data, libata doesn't 
> retry the command.
> 
> For libata, when scsi_softirq() calls scsi_decide_disposition() and 
> scsi_check_sense() to determine
> how to handle the result, scsi_check_sense() always returns "fail" since 
> the sense data is not there
> yet. The sense data is requested later in the libata error handler. But 
> the command has already been
> considered as an "error".
> 
> By having the sense data ready before calling qc->scsidone(), we can 
> make the
> NEEDS_RETRY work in scsi_softirq().  So, for things like MEDIUM_ERROR, 
> the device has
> a chance to retry/recover the error. This seems to be important for 
> devices with built-in
> defect management system.

It sounds like the ATA error handler is missing the sense interpretation
part.  If you look at the scsi error handler, we will request sense and
then redecide the disposition based on the return.

James


* Re: [RFC] libata new EH document
  2005-08-30 10:26   ` Tejun Heo
@ 2005-08-30 14:32     ` Luben Tuikov
  2005-09-01  1:17       ` Tejun Heo
  2005-09-01  2:22     ` Jeff Garzik
  1 sibling, 1 reply; 33+ messages in thread
From: Luben Tuikov @ 2005-08-30 14:32 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Albert Lee, Jeff Garzik, linux-ide, linux-scsi, Doug Maxey

On 08/30/05 06:26, Tejun Heo wrote:
> Albert Lee wrote:
> 
>>>4. Corresponding scmd's result code is set to
>>>   SAM_STAT_CHECK_CONDITION and qc->scsidone() callback is called
>>>   directly.  As we haven't filled sense data,
>>>   scsi_determine_disposition() will return FAILED and SCSI EH will
>>>   be scheduled.  Note that as we directly call qc->scsidone(), qc is
>>>   left intact.
>>> 
>>>
>>
>>Could we get the sense data before calling qc->scsidone()?  (Using the 
>>proposed separate
>>EH qc can keep the original qc intact.)
>>
>>The issue:
>>When a DVD drive returns MEDIUM_ERROR in the sense data, libata doesn't 
>>retry the command.
>>
>>For libata, when scsi_softirq() calls scsi_decide_disposition() and 
>>scsi_check_sense() to determine
>>how to handle the result, scsi_check_sense() always returns "fail" since 
>>the sense data is not there
>>yet. The sense data is requested later in the libata error handler. But 
>>the command has already been
>>considered as an "error".
>>
>>By having the sense data ready before calling qc->scsidone(), we can 
>>make the
>>NEEDS_RETRY work in scsi_softirq().  So, for things like MEDIUM_ERROR, 
>>the device has
>>a chance to retry/recover the error. This seems to be important for 
>>devices with built-in
>>defect management system.
> 
> 
>   There are two ways a scmd can leave EH - retry by scsi_queue_insert() 
> and finish by scsi_finish_cmd().  I think the problem you described can 
> be easily solved by choosing the former method when finishing the qc 
> from EH.  Note that other advanced EH stuff like reconfiguring transport 
> speed also requires retrying, so we will surely have a mechanism for 
> retrying failed qc's from EH.

What is needed is autosense simulation for ATA, so that SCSI Core doesn't
know that the device doesn't support autosense.

So, before a failed command reaches SCSI Core recovery, it should pass by
ATA layer recovery to get sense.

Note: if you send another command for execution after the failed command
_and_ no autosense is provided, then any sense data is lost -- this is further
subject to more rules set forth in SAM and SPC.

	Luben



* Re: [RFC] libata new EH document
  2005-08-30 14:32     ` Luben Tuikov
@ 2005-09-01  1:17       ` Tejun Heo
  2005-09-01  2:22         ` Jeff Garzik
  2005-09-01  3:30         ` Luben Tuikov
  0 siblings, 2 replies; 33+ messages in thread
From: Tejun Heo @ 2005-09-01  1:17 UTC (permalink / raw)
  To: Luben Tuikov; +Cc: Albert Lee, Jeff Garzik, linux-ide, linux-scsi, Doug Maxey

Luben Tuikov wrote:
> On 08/30/05 06:26, Tejun Heo wrote:
> 
>>Albert Lee wrote:
>>
>>
>>>>4. Corresponding scmd's result code is set to
>>>>  SAM_STAT_CHECK_CONDITION and qc->scsidone() callback is called
>>>>  directly.  As we haven't filled sense data,
>>>>  scsi_decide_disposition() will return FAILED and SCSI EH will
>>>>  be scheduled.  Note that as we directly call qc->scsidone(), qc is
>>>>  left intact.
>>>>
>>>>
>>>
>>>Could we get the sense data before calling qc->scsidone()?  (Using the 
>>>proposed separate
>>>EH qc can keep the original qc intact.)
>>>
>>>The issue:
>>>When a DVD drive returns MEDIUM_ERROR in the sense data, libata doesn't 
>>>retry the command.
>>>
>>>For libata, when scsi_softirq() calls scsi_decide_disposition() and 
>>>scsi_check_sense() to determine
>>>how to handle the result, scsi_check_sense() always returns "fail" since 
>>>the sense data is not there
>>>yet. The sense data is requested later in the libata error handler. But 
>>>the command has already been
>>>considered as an "error".
>>>
>>>By having the sense data ready before calling qc->scsidone(), we can 
>>>make the
>>>NEEDS_RETRY work in scsi_softirq().  So, for things like MEDIUM_ERROR, 
>>>the device has
>>>a chance to retry/recover the error. This seems to be important for 
>>>devices with built-in
>>>defect management system.
>>
>>
>>  There are two ways a scmd can leave EH - retry by scsi_queue_insert() 
>>and finish by scsi_finish_command().  I think the problem you described can 
>>be easily solved by choosing the former method when finishing the qc 
>>from EH.  Note that other advanced EH stuff like reconfiguring transport 
>>speed also requires retrying, so we will surely have a mechanism for 
>>retrying failed qc's from EH.
> 
> 
> What is needed is autosense simulation for ATA, so that SCSI Core doesn't
> know that the device doesn't support autosense.
> 
> So, before a failed command reaches SCSI Core recovery, it should pass by
> ATA layer recovery to get sense.
> 
> Note: if you send another command for execution after the failed command
> _and_ no autosense is provided, then any sense data is lost -- this is further
> subject to more rules set forth in SAM and SPC.
> 

  IMHO, it's a good idea to maintain the one qc to one ATA/ATAPI command 
mapping as long as possible.  And, in the suggested framework, it's 
guaranteed that no other command can come in between CHECK_SENSE and 
REQUEST_SENSE.

  Requesting sense from EH, calling scsi_decide_disposition() on the 
sense and following the verdict should achieve the same effect as 
emulating autosense.  Is there any compelling reason to break the one qc 
to one command mapping?
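
  To make the "decide on the sense, then follow the verdict" part
concrete, here is a minimal userspace sketch.  Everything named toy_* is
made up for illustration, and the policy table is deliberately
simplified; it is not the kernel's scsi_check_sense(), which handles
many more cases.  It only assumes fixed-format sense data (response code
0x70/0x71, sense key in the low nibble of byte 2).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical verdicts, loosely modelled on SUCCESS/NEEDS_RETRY/FAILED. */
enum toy_disposition { TOY_SUCCESS, TOY_NEEDS_RETRY, TOY_FAILED };

/* Sense key values as defined by SPC. */
enum {
	TOY_KEY_NO_SENSE        = 0x0,
	TOY_KEY_RECOVERED_ERROR = 0x1,
	TOY_KEY_MEDIUM_ERROR    = 0x3,
	TOY_KEY_UNIT_ATTENTION  = 0x6,
};

/*
 * Simplified stand-in for a sense check.  With no usable sense data the
 * only possible answer is TOY_FAILED, which is the situation described
 * above when the callback runs before REQUEST SENSE has been issued.
 */
static enum toy_disposition toy_check_sense(const uint8_t *sense, size_t len)
{
	if (sense == NULL || len < 3)
		return TOY_FAILED;

	/* Only fixed-format sense (current or deferred) is understood. */
	if ((sense[0] & 0x7f) != 0x70 && (sense[0] & 0x7f) != 0x71)
		return TOY_FAILED;

	switch (sense[2] & 0x0f) {
	case TOY_KEY_NO_SENSE:
	case TOY_KEY_RECOVERED_ERROR:
		return TOY_SUCCESS;
	case TOY_KEY_MEDIUM_ERROR:	/* illustrative policy: let it retry */
	case TOY_KEY_UNIT_ATTENTION:
		return TOY_NEEDS_RETRY;
	default:
		return TOY_FAILED;
	}
}
```

  The point is only that the same function gives a different answer once
the sense buffer has been filled, regardless of who filled it.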

-- 
tejun


* Re: [RFC] libata new EH document
  2005-09-01  1:17       ` Tejun Heo
@ 2005-09-01  2:22         ` Jeff Garzik
  2005-09-01  2:42           ` Tejun Heo
  2005-09-01  3:33           ` Luben Tuikov
  2005-09-01  3:30         ` Luben Tuikov
  1 sibling, 2 replies; 33+ messages in thread
From: Jeff Garzik @ 2005-09-01  2:22 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Luben Tuikov, Albert Lee, linux-ide, linux-scsi, Doug Maxey

Tejun Heo wrote:
>  IMHO, it's a good idea to maintain one qc to one ATA/ATAPI command 
> mapping as long as possible.  And, in the suggested framework, it's 
> guaranteed that no other command can come inbetween CHECK_SENSE and 
> REQUEST_SENSE.
> 
>  Requesting sense from EH, calling scsi_decide_disposition() on the 
> sense and following the verdict should achieve the same effect as 
> emulating autosense.  Is there any compelling reason to break one qc to 
> one command mapping?

Yes, you should have one qc <-> one ATA/ATAPI command.  That's why, in 
the NCQ scenario, I wanted to make sure that one qc was always reserved 
for error handling:  REQUEST SENSE or READ LOG EXT, most importantly.

For SAT layer MODE SELECT translations, that implies multiple calls to 
qc_new/qc_issue/qc_complete before completing the overall SCSI command. 
  The same for handling sata_sil mod15write:  I am beginning to feel 
like the mod15write workaround might be best implemented in a manner 
that caused libata-scsi (not sata_sil) to create/issue/complete multiple 
ATA commands.

The only problem you run into is that a qc may be active during EH, when 
you need another qc.  So avoiding recursive details becomes an issue.

	Jeff


* Re: [RFC] libata new EH document
  2005-08-30 10:26   ` Tejun Heo
  2005-08-30 14:32     ` Luben Tuikov
@ 2005-09-01  2:22     ` Jeff Garzik
  1 sibling, 0 replies; 33+ messages in thread
From: Jeff Garzik @ 2005-09-01  2:22 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Albert Lee, linux-ide, linux-scsi, Doug Maxey

BTW I still have three of your documents to review and comment on. 
Haven't forgotten about them.

	Jeff





* Re: [RFC] libata new EH document
  2005-09-01  2:22         ` Jeff Garzik
@ 2005-09-01  2:42           ` Tejun Heo
  2005-09-01  3:33           ` Luben Tuikov
  1 sibling, 0 replies; 33+ messages in thread
From: Tejun Heo @ 2005-09-01  2:42 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Luben Tuikov, Albert Lee, linux-ide, linux-scsi, Doug Maxey

 Hello, Jeff.

On Wed, Aug 31, 2005 at 10:22:17PM -0400, Jeff Garzik wrote:
> Tejun Heo wrote:
> > IMHO, it's a good idea to maintain one qc to one ATA/ATAPI command 
> >mapping as long as possible.  And, in the suggested framework, it's 
> >guaranteed that no other command can come inbetween CHECK_SENSE and 
> >REQUEST_SENSE.
> >
> > Requesting sense from EH, calling scsi_decide_disposition() on the 
> >sense and following the verdict should achieve the same effect as 
> >emulating autosense.  Is there any compelling reason to break one qc to 
> >one command mapping?
> 
> 
> Yes, you should have one qc <-> one ATA/ATAPI command.  That's why, in 
> the NCQ scenario, I wanted to make sure that one qc was always reserved 
> for error handling:  REQUEST SENSE or READ LOG EXT, most importantly.

 Having an extra (as opposed to reserved) EH qc doesn't break the one qc
<-> one command mapping.

 a. All EH commands are non-NCQ.
 b. Inside EH, no other command is allowed.

 So, we can allocate a qc which does not have a corresponding NCQ tag.
This qc will never be used for normal commands.  It's used only for
internal commands when no other qc can be active.
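
 A rough userspace sketch of what I mean follows.  All names (toy_port,
toy_qc_new, toy_eh_qc_new, ...) are hypothetical and nothing here is
existing libata code; it only models rules a and b above with one extra
slot that has no NCQ tag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_TAGS 32			/* NCQ tags 0..31 */
#define TOY_EH_SLOT  TOY_MAX_TAGS	/* extra slot, no NCQ tag */

struct toy_qc {
	bool in_use;
	int  tag;		/* NCQ tag, or -1 for the EH qc */
};

struct toy_port {
	struct toy_qc qc[TOY_MAX_TAGS + 1];
	int  queue_depth;	/* 1 for non-NCQ devices */
	bool in_eh;		/* set while EH owns the port */
};

static void toy_port_init(struct toy_port *ap, int queue_depth)
{
	assert(queue_depth >= 1 && queue_depth <= TOY_MAX_TAGS);
	for (int i = 0; i <= TOY_MAX_TAGS; i++) {
		ap->qc[i].in_use = false;
		ap->qc[i].tag = (i == TOY_EH_SLOT) ? -1 : i;
	}
	ap->queue_depth = queue_depth;
	ap->in_eh = false;
}

/* Normal path: never hands out a qc while EH is running (rule b). */
static struct toy_qc *toy_qc_new(struct toy_port *ap)
{
	if (ap->in_eh)
		return NULL;
	for (int i = 0; i < ap->queue_depth; i++) {
		if (!ap->qc[i].in_use) {
			ap->qc[i].in_use = true;
			return &ap->qc[i];
		}
	}
	return NULL;
}

/* EH path: only valid inside EH, returns the untagged extra qc (rule a). */
static struct toy_qc *toy_eh_qc_new(struct toy_port *ap)
{
	struct toy_qc *qc = &ap->qc[TOY_EH_SLOT];

	if (!ap->in_eh || qc->in_use)
		return NULL;
	qc->in_use = true;
	return qc;
}

static void toy_qc_free(struct toy_qc *qc)
{
	qc->in_use = false;
}
```

 With queue_depth == 1, the failed qc stays allocated and untouched
while EH uses the extra slot, so neither rewriting nor early completion
of the failed qc is needed.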

 If we don't have an extra qc for EH, then, as non-NCQ devices have only
one qc, we would have to either

 a. Rewrite the failed qc to issue the recovery command, or
 b. Complete the failed qc and then issue the recovery command.

 Neither is too attractive, IMHO.

 I currently don't quite understand why you don't like the extra qc
approach.  Can you please elaborate?

> 
> For SAT layer MODE SELECT translations, that implies multiple calls to 
> qc_new/qc_issue/qc_complete before completing the overall SCSI command. 
>  The same for handling sata_sil mod15write:  I am beginning to feel 
> like the mod15write workaround might be best implemented in a manner 
> that caused libata-scsi (not sata_sil) to create/issue/complete multiple 
> ATA commands.

 That's what I've done in the multi-qc SCSI cmd translation patch I
posted the other day, and I think it would be really neat to do a
similar thing for m15w.  However, to do so, we'll need some callbacks
at the libata scsi/core layers (say, driver-overridable command
translation callbacks?) at the very least, and I'm not sure about
adding those just for m15w.

> The only problem you run into is that a qc may be active during EH, when 
> you need another qc.  So avoiding recursive details becomes an issue.

 I guess this means the same thing I've described above about non-NCQ
devices, right?

 Thanks.

-- 
tejun


* Re: [RFC] libata new EH document
  2005-09-01  1:17       ` Tejun Heo
  2005-09-01  2:22         ` Jeff Garzik
@ 2005-09-01  3:30         ` Luben Tuikov
  2005-09-01  3:44           ` Tejun Heo
  1 sibling, 1 reply; 33+ messages in thread
From: Luben Tuikov @ 2005-09-01  3:30 UTC (permalink / raw)
  To: Tejun Heo, Luben Tuikov
  Cc: Albert Lee, Jeff Garzik, linux-ide, linux-scsi, Doug Maxey

--- Tejun Heo <htejun@gmail.com> wrote:
>   IMHO, it's a good idea to maintain one qc to one ATA/ATAPI command 
> mapping as long as possible.  And, in the suggested framework, it's 

Yes, that makes sense.

> guaranteed that no other command can come inbetween CHECK_SENSE and 
> REQUEST_SENSE.

That's good.

>   Requesting sense from EH,

Done in an ATA eh handler.

> calling scsi_decide_disposition() on the 
> sense 

Done in SCSI Core.

> and following the verdict should achieve the same effect as
> emulating autosense.

Yes, precisely.

> Is there any compelling reason to break one qc to 
> one command mapping?

?

     Luben



* Re: [RFC] libata new EH document
  2005-09-01  2:22         ` Jeff Garzik
  2005-09-01  2:42           ` Tejun Heo
@ 2005-09-01  3:33           ` Luben Tuikov
  1 sibling, 0 replies; 33+ messages in thread
From: Luben Tuikov @ 2005-09-01  3:33 UTC (permalink / raw)
  To: Jeff Garzik, Tejun Heo
  Cc: Luben Tuikov, Albert Lee, linux-ide, linux-scsi, Doug Maxey

--- Jeff Garzik <jgarzik@pobox.com> wrote:

> Tejun Heo wrote:
> >  IMHO, it's a good idea to maintain one qc to one ATA/ATAPI command 
> > mapping as long as possible.  And, in the suggested framework, it's 
> > guaranteed that no other command can come inbetween CHECK_SENSE and 
> > REQUEST_SENSE.
> > 
> >  Requesting sense from EH, calling scsi_decide_disposition() on the 
> > sense and following the verdict should achieve the same effect as 
> > emulating autosense.  Is there any compelling reason to break one qc to 
> > one command mapping?
> 
> 
> Yes, you should have one qc <-> one ATA/ATAPI command.  That's why, in 

Agree.

> the NCQ scenario, I wanted to make sure that one qc was always reserved 
> for error handling:  REQUEST SENSE or READ LOG EXT, most importantly.

Yes.

> For SAT layer MODE SELECT translations, that implies multiple calls to 
> qc_new/qc_issue/qc_complete before completing the overall SCSI command. 
>   The same for handling sata_sil mod15write:  I am beginning to feel 
> like the mod15write workaround might be best implemented in a manner 
> that caused libata-scsi (not sata_sil) to create/issue/complete multiple 
> ATA commands.
> 
> The only problem you run into is that a qc may be active during EH, when 
> you need another qc.  So avoiding recursive details becomes an issue.

Hmm...

     Luben



* Re: [RFC] libata new EH document
  2005-09-01  3:30         ` Luben Tuikov
@ 2005-09-01  3:44           ` Tejun Heo
  2005-09-01  4:38             ` Luben Tuikov
  0 siblings, 1 reply; 33+ messages in thread
From: Tejun Heo @ 2005-09-01  3:44 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Luben Tuikov, Albert Lee, Jeff Garzik, linux-ide, linux-scsi,
	Doug Maxey

 Hi, Luben.

On Wed, Aug 31, 2005 at 08:30:27PM -0700, Luben Tuikov wrote:
> --- Tejun Heo <htejun@gmail.com> wrote:
> >   IMHO, it's a good idea to maintain one qc to one ATA/ATAPI command 
> > mapping as long as possible.  And, in the suggested framework, it's 
> 
> Yes, that makes sense.
> 
> > guaranteed that no other command can come inbetween CHECK_SENSE and 
> > REQUEST_SENSE.
> 
> That's good.
> 
> >   Requesting sense from EH,
> 
> Done in an ATA eh handler.
> 
> > calling scsi_decide_disposition() on the 
> > sense 
> 
> Done in SCSI Core.
> 
> > and following the verdict should achieve the same effect as
> > emulating autosense.
> 
> Yes, precisely.
> 
> > Is there any compelling reason to break one qc to 
> > one command mapping?
> 
> ?
> 

 I wasn't clear enough.  I'll try again.  :-)

 Implementing autosensing would probably require rewriting the failed
qc into a REQUEST SENSE command, so I'm opposed to it.  My proposal is
to do the following, which, in effect, should be equivalent to
autosensing.

 1. ATAPI CHECK SENSE occurs
 2. libata fails the command
 3. SCSI sees failure code but no sense data, SCSI EH invoked
 4. libata EH invoked
 5. REQUEST SENSE
 6. sense data acquired
 7. scsi_decide_disposition() called (this needs to be exported from SCSI)
 8. libata handles the failed qc according to the verdict.

 This is very similar to what SCSI EH currently does for commands
without sense data.
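
 Steps 5 to 8 can be sketched in plain userspace C as below.  This is
only a toy under stated assumptions: toy_cmd, toy_eh_handle() and the
fake REQUEST SENSE callback are invented for illustration, and the
one-line "retry on MEDIUM ERROR" rule merely stands in for whatever
scsi_decide_disposition() would really say.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum toy_verdict { TOY_RETRY, TOY_FINISH };

struct toy_cmd {
	bool    check_condition;	/* steps 1-2: command failed */
	bool    sense_valid;		/* false when EH is entered (step 3) */
	uint8_t sense[18];
	int     retries_left;
};

/* Hypothetical hook that issues REQUEST SENSE on the EH qc. */
typedef bool (*toy_request_sense_fn)(uint8_t *buf, size_t len);

/* Fake device: always reports fixed-format sense with MEDIUM ERROR. */
static bool toy_fake_request_sense(uint8_t *buf, size_t len)
{
	if (len < 18)
		return false;
	memset(buf, 0, len);
	buf[0] = 0x70;		/* fixed format, current error */
	buf[2] = 0x03;		/* sense key: MEDIUM ERROR */
	buf[7] = 10;		/* additional sense length */
	return true;
}

/* Steps 4-8: runs in EH context, so no other command can intervene. */
static enum toy_verdict toy_eh_handle(struct toy_cmd *cmd,
				      toy_request_sense_fn request_sense)
{
	assert(cmd->check_condition);

	/* Steps 5-6: acquire sense data if we don't have it yet. */
	if (!cmd->sense_valid)
		cmd->sense_valid = request_sense(cmd->sense,
						 sizeof(cmd->sense));

	/* Step 7: stand-in verdict, retry MEDIUM ERROR while allowed. */
	if (cmd->sense_valid && (cmd->sense[2] & 0x0f) == 0x03 &&
	    cmd->retries_left > 0) {
		cmd->retries_left--;
		return TOY_RETRY;	/* step 8: requeue the command */
	}
	return TOY_FINISH;		/* step 8: complete with error */
}
```

 The failed command itself is never rewritten; only its sense buffer
gets filled before the verdict is taken.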

 As an ATAPI device's queue depth is always one (ignoring the SERVICE
cruft everyone seems to hate), I don't think there will be any
noticeable performance penalty of the kind James was describing in the
other mail in this thread.

 Thanks.

-- 
tejun


* Re: [RFC] libata new EH document
  2005-09-01  3:44           ` Tejun Heo
@ 2005-09-01  4:38             ` Luben Tuikov
  2005-09-01  5:44               ` Tejun Heo
  0 siblings, 1 reply; 33+ messages in thread
From: Luben Tuikov @ 2005-09-01  4:38 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Luben Tuikov, Albert Lee, Jeff Garzik, linux-ide, linux-scsi,
	Doug Maxey

--- Tejun Heo <htejun@gmail.com> wrote:
>  As implementing autosensing will probably need rewriting failed qc
> for REQUEST SENSE command, I'm opposing it.  My proposal is to do the
> following, which, in effect, should be equivalent to autosensing.
> 
>  1. ATAPI CHECK SENSE occurs
>  2. libata fails the command
>  3. SCSI sees failure code but no sense data, SCSI EH invoked
>  4. libata EH invoked
>  5. REQUEST SENSE
>  6. sense data acquired
>  7. scsi_decide_disposition() called (this needs to be exported from SCSI)
>  8. libata handles the failed qc according to the verdict.

Hmm, yes.  It sounds good, except can you make it so that step 3
doesn't exist, ever?  This means that you would _reduce_ the
double "bouncing" between eh's _and_ implement autosense.

SCSI Core should never know what happened.  I.e. if the command
has completed with CHECK SENSE, sense data _is_ present => "autosense".

> This is very similar to what SCSI EH currently does for commands
> without sense data.

Yes, you're right -- it is very similar to what SCSI EH currently does.
Unfortunately it isn't quite correct.

>  As ATAPI device's queue depth is always one (ignoring SERVICE cruft
> everyone seems to hate), I don't think there will be any noticeable
> performance penalty as James was describing in the other mail in this
> thread.

What you can do is keep a qc around to request sense immediately
afterwards.  If _that_ qc fails, then you know you need the big hammer.

      Luben



* Re: [RFC] libata new EH document
  2005-09-01  4:38             ` Luben Tuikov
@ 2005-09-01  5:44               ` Tejun Heo
  2005-09-01  5:54                 ` Jeff Garzik
  0 siblings, 1 reply; 33+ messages in thread
From: Tejun Heo @ 2005-09-01  5:44 UTC (permalink / raw)
  To: ltuikov
  Cc: Luben Tuikov, Albert Lee, Jeff Garzik, linux-ide, linux-scsi,
	Doug Maxey

  Hello, Luben.

Luben Tuikov wrote:
> --- Tejun Heo <htejun@gmail.com> wrote:
> 
>> As implementing autosensing will probably need rewriting failed qc
>>for REQUEST SENSE command, I'm opposing it.  My proposal is to do the
>>following, which, in effect, should be equivalent to autosensing.
>>
>> 1. ATAPI CHECK SENSE occurs
>> 2. libata fails the command
>> 3. SCSI sees failure code but no sense data, SCSI EH invoked
>> 4. libata EH invoked
>> 5. REQUEST SENSE
>> 6. sense data acquired
>> 7. scsi_decide_disposition() called (this needs to be exported from SCSI)
>> 8. libata handles the failed qc according to the verdict.
> 
> 
> Hmm, yes.  It sounds good, except can you make it so that step 3
> doesn't exist, ever.  This means that you would _reduce_ the
> double "bouncing" between eh's _and_ implement autosense.
> 

  libata EH is invoked from SCSI EH via hostt->eh_strategy_handler(), so 
they're one - libata EH uses the SCSI EH framework to operate.  I'm having 
a hard time understanding what you mean by 'double bouncing'.

> SCSI Core should never know what happened.  I.e. if the command
> has completed with CHECK SENSE, sense data _is_ present => "autosense".
> 
> 
>>This is very similar to what SCSI EH currently does for commands
>>without sense data.
> 
> 
> Yes, you're right -- it is very similar to what SCSI EH currently does.
> Unfortunately it isn't quite correct.
> 

  Can you please elaborate on why getting sense data from EH is a bad idea 
for ATAPI?  For more advanced SCSI transports, I agree with you that 
autosensing is necessary with queueing, multiple initiators, etc., 
but I don't really see how requesting sense from EH would be bad for ATAPI.

> 
>> As ATAPI device's queue depth is always one (ignoring SERVICE cruft
>>everyone seems to hate), I don't think there will be any noticeable
>>performance penalty as James was describing in the other mail in this
>>thread.
> 
> 
> What you can do is keep a qc around to request sense immediately
> afterwards.  If _that_ qc fails, then you know you need the big hammer.

  Yes, that is also a possibility, but I was opting for REQUEST SENSE 
from EH for the following two reasons.

  a. As we're going to have facilities to issue EH cmds from EH, ATAPI can 
just join the crowd without implementing a separate mechanism to issue 
REQUEST SENSE.

  b. It's not a hot path and I think the performance gain from implementing 
autosense would be negligible.

  Thanks.

-- 
tejun


* Re: [RFC] libata new EH document
  2005-09-01  5:44               ` Tejun Heo
@ 2005-09-01  5:54                 ` Jeff Garzik
  2005-09-01 13:24                   ` James Bottomley
  0 siblings, 1 reply; 33+ messages in thread
From: Jeff Garzik @ 2005-09-01  5:54 UTC (permalink / raw)
  To: Tejun Heo
  Cc: ltuikov, Luben Tuikov, Albert Lee, linux-ide, linux-scsi,
	Doug Maxey

On Thu, Sep 01, 2005 at 02:44:00PM +0900, Tejun Heo wrote:
>  Can you please elaborate why getting sense data from EH is bad idea 
> for ATAPI?  For more advanced SCSI transports, I agree with you that 
> autosensing is necessary with queueing and multiple initiator and etc, 
> but I don't really see how requesting sense from EH would be bad for ATAPI.

The long term direction for the SCSI core seems to be that of
requiring auto-sensing.

libata is simply being lazy:  while the SCSI core continues to support
kicking the EH thread when sense is missing, it's preferred for libata
to reuse that infrastructure.

Auto-sensing (and READ LOG EXT for NCQ errors) requires either an
FSM or a kernel thread, to initiate a secondary qc for REQUEST SENSE.
Since the common infrastructure already exists for this, libata reuses
the existing SCSI EH kernel thread.
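
For what it's worth, the FSM alternative could look roughly like the
userspace sketch below.  The states, events and toy_* names are all
hypothetical, not existing libata code; the idea is only that a CHECK
CONDITION completion moves the command into a second phase that issues
REQUEST SENSE, and a failure of that secondary command is what falls
back to full EH.

```c
#include <assert.h>

enum toy_state  { TOY_CMD_ACTIVE, TOY_SENSE_ACTIVE, TOY_DONE };
enum toy_event  { TOY_EV_OK, TOY_EV_CHECK, TOY_EV_ERROR };
enum toy_result { TOY_PENDING, TOY_GOOD, TOY_CHECK_WITH_SENSE, TOY_NEEDS_EH };

struct toy_fsm {
	enum toy_state  state;
	enum toy_result result;
};

static void toy_fsm_start(struct toy_fsm *f)
{
	f->state = TOY_CMD_ACTIVE;
	f->result = TOY_PENDING;
}

/* Would be driven from the completion interrupt in a real driver. */
static void toy_fsm_step(struct toy_fsm *f, enum toy_event ev)
{
	switch (f->state) {
	case TOY_CMD_ACTIVE:
		if (ev == TOY_EV_OK) {
			f->state = TOY_DONE;
			f->result = TOY_GOOD;
		} else if (ev == TOY_EV_CHECK) {
			/* A secondary qc for REQUEST SENSE is issued here. */
			f->state = TOY_SENSE_ACTIVE;
		} else {
			f->state = TOY_DONE;
			f->result = TOY_NEEDS_EH;
		}
		break;
	case TOY_SENSE_ACTIVE:
		/* Sense arrived: complete with sense; otherwise kick EH. */
		f->state = TOY_DONE;
		f->result = (ev == TOY_EV_OK) ? TOY_CHECK_WITH_SENSE
					      : TOY_NEEDS_EH;
		break;
	case TOY_DONE:
		assert(!"event after completion");
		break;
	}
}
```

The upper layer would only ever see the final result, so from its point
of view a CHECK CONDITION always arrives with sense data attached.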

We should move libata-scsi to auto-sensing, but it's not an urgent priority.

	Jeff


* Re: [RFC] libata new EH document
  2005-09-01  5:54                 ` Jeff Garzik
@ 2005-09-01 13:24                   ` James Bottomley
  2005-09-01 21:40                     ` Luben Tuikov
  0 siblings, 1 reply; 33+ messages in thread
From: James Bottomley @ 2005-09-01 13:24 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Tejun Heo, ltuikov, Luben Tuikov, Albert Lee, linux-ide,
	SCSI Mailing List, Doug Maxey

On Thu, 2005-09-01 at 01:54 -0400, Jeff Garzik wrote:
> The long term direction for the SCSI core seems to be that of
> requiring auto-sensing.

No, I don't see the mid-layer error thread handling of this ever going
away.

> libata is simply being lazy:  while the SCSI core continues to support
> kicking the EH thread when sense is missing, it's preferred for libata
> to reuse that infrastructure.

That makes the most sense ;-)

> Auto-sensing (and READ LOG EXT for NCQ errors) requires either an
> FSM or a kernel thread, to initiate a secondary qc for REQUEST SENSE.
> Since the common infrastructure already exists for this, libata reuses
> the existing SCSI EH kernel thread.

The current SCSI autosense in drivers doesn't require this because we
reuse the existing command that got the contingent allegiance condition.
This is the piece I'd like to get rid of because the extra fields and
extra setup to allow the command to be reused are a critical path hit.
If you look at any driver that does this (53c700.c for instance) you'll
see that the command is turned around and resubmitted in the irq
routine.

James


* Re: [RFC] libata new EH document
  2005-09-01 13:24                   ` James Bottomley
@ 2005-09-01 21:40                     ` Luben Tuikov
  2005-09-01 21:46                       ` Jeff Garzik
  2005-09-01 21:55                       ` James Bottomley
  0 siblings, 2 replies; 33+ messages in thread
From: Luben Tuikov @ 2005-09-01 21:40 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jeff Garzik, Tejun Heo, ltuikov, Albert Lee, linux-ide,
	SCSI Mailing List, Doug Maxey

On 09/01/05 09:24, James Bottomley wrote:
> On Thu, 2005-09-01 at 01:54 -0400, Jeff Garzik wrote:
> 
>>The long term direction for the SCSI core seems to be that of
>>requiring auto-sensing.
> 
> 
> No, I don't see the mid-layer error thread handling of this ever going
> away.
> 
>>libata is simply being lazy:  while the SCSI core continues to support
>>kicking the EH thread when sense is missing, it's preferred for libata
>>to reuse that infrastructure.
> 
> 
> That makes the most sense ;-)

For libata it doesn't really matter, since it is _ATA_.

> The current SCSI autosense in drivers doesn't require this because we
> reuse the existing command that got the contingent allegiance condition.

Care to elaborate what "contingent allegiance condition" is,
how SCSI Core got it, how SCSI Core is using it, and how SCSI Core set
it up with the LU?

> This is the piece I'd like to get rid of because the extra fields and
> extra setup to allow the command to be reused are a critical path hit.

If you _do_ get rid of the extra fields, then you _really_ need
LLDD/protocols to support autosense.

> If you look at any driver that does this (53c700.c for instance) you'll
> see that the command is turned around and resubmitted in the irq
> routine).

That's ok.

	Luben


* Re: [RFC] libata new EH document
  2005-09-01 21:40                     ` Luben Tuikov
@ 2005-09-01 21:46                       ` Jeff Garzik
  2005-09-01 22:09                         ` Luben Tuikov
  2005-09-01 22:22                         ` Luben Tuikov
  2005-09-01 21:55                       ` James Bottomley
  1 sibling, 2 replies; 33+ messages in thread
From: Jeff Garzik @ 2005-09-01 21:46 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: James Bottomley, Tejun Heo, ltuikov, Albert Lee, linux-ide,
	SCSI Mailing List, Doug Maxey

Luben Tuikov wrote:
> On 09/01/05 09:24, James Bottomley wrote:
> 
>>On Thu, 2005-09-01 at 01:54 -0400, Jeff Garzik wrote:
>>
>>
>>>The long term direction for the SCSI core seems to be that of
>>>requiring auto-sensing.
>>
>>
>>No, I don't see the mid-layer error thread handling of this ever going
>>away.
>>
>>
>>>libata is simply being lazy:  while the SCSI core continues to support
>>>kicking the EH thread when sense is missing, it's preferred for libata
>>>to reuse that infrastructure.
>>
>>
>>That makes the most sense ;-)
> 
> 
> For libata it doesn't really matter, since it is _ATA_.


It matters quite a bit.  One of the main reasons libata uses the SCSI 
layer is for its infrastructure.

This is the same reason a couple RAID drivers use the SCSI layer.  It 
has nothing to do with SCSI-as-defined-by-T10, and more to do with the 
fact that SCSI provides a robust queueing/EH/block interface infrastructure.

My long term plans include moving some of this not-SCSI-related 
infrastructure from the SCSI layer to the block layer.

	Jeff




* Re: [RFC] libata new EH document
  2005-09-01 21:40                     ` Luben Tuikov
  2005-09-01 21:46                       ` Jeff Garzik
@ 2005-09-01 21:55                       ` James Bottomley
  2005-09-01 22:07                         ` Luben Tuikov
  1 sibling, 1 reply; 33+ messages in thread
From: James Bottomley @ 2005-09-01 21:55 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Jeff Garzik, Tejun Heo, ltuikov, Albert Lee, linux-ide,
	SCSI Mailing List, Doug Maxey

On Thu, 2005-09-01 at 17:40 -0400, Luben Tuikov wrote:
> > The current SCSI autosense in drivers doesn't require this because we
> > reuse the existing command that got the contingent allegiance condition.
> 
> Care to elaborate what "contingent allegiance condition" is,
> how SCSI Core got it, how SCSI Core is using it, and how SCSI Core set
> it up with the LU?

Well, not really, since it's basic SCSI and the explanation's pretty
long.  However, the standards have several pages about it.  For your
reading pleasure, I suggest SAM-2 section 5.9.1 Contingent allegiance
(CA) and auto contingent allegiance (ACA)

James




* Re: [RFC] libata new EH document
  2005-09-01 21:55                       ` James Bottomley
@ 2005-09-01 22:07                         ` Luben Tuikov
  2005-09-01 22:23                           ` James Bottomley
  0 siblings, 1 reply; 33+ messages in thread
From: Luben Tuikov @ 2005-09-01 22:07 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jeff Garzik, Tejun Heo, ltuikov, Albert Lee, linux-ide,
	SCSI Mailing List, Doug Maxey

On 09/01/05 17:55, James Bottomley wrote:
> On Thu, 2005-09-01 at 17:40 -0400, Luben Tuikov wrote:
> 
>>>The current SCSI autosense in drivers doesn't require this because we
>>>reuse the existing command that got the contingent allegiance condition.
>>
>>Care to elaborate what "contingent allegiance condition" is,
>>how SCSI Core got it, how SCSI Core is using it, and how SCSI Core set
>>it up with the LU?
> 
> 
> Well, not really, since it's basic SCSI and the explanation's pretty
> long.  However, the standards have several pages about it.  For your
> reading pleasure, I suggest SAM-2 section 5.9.1 Contingent allegiance
> (CA) and auto contingent allegiance (ACA)

SCSI Core knows nothing about ACA and/or how to use it.

You should also know that no one actually spells out CA or ACA,
they just use the capitalized abbreviation, plus the fact that
CA is obsolete.

Stop impressing the children!

	Luben



* Re: [RFC] libata new EH document
  2005-09-01 21:46                       ` Jeff Garzik
@ 2005-09-01 22:09                         ` Luben Tuikov
  2005-09-01 22:27                           ` Jeff Garzik
  2005-09-01 22:22                         ` Luben Tuikov
  1 sibling, 1 reply; 33+ messages in thread
From: Luben Tuikov @ 2005-09-01 22:09 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: James Bottomley, Tejun Heo, ltuikov, Albert Lee, linux-ide,
	SCSI Mailing List, Doug Maxey

On 09/01/05 17:46, Jeff Garzik wrote:
>>>>libata is simply being lazy:  while the SCSI core continues to support
>>>>kicking the EH thread when sense is missing, it's preferred for libata
>>>>to reuse that infrastructure.
>>>
>>>
>>>That makes the most sense ;-)
>>
>>
>>For libata it doesn't really matter, since it is _ATA_.
> 
> It matters quite a bit.  One of the main reasons libata uses the SCSI 
> layer is for its infrastructure.

Hmm, maybe I should've been more clear.

> This is the same reason a couple RAID drivers use the SCSI layer.  It 
> has nothing to do with SCSI-as-defined-by-T10, and more to do with the 
> fact that SCSI provides a robust queueing/EH/block interface infrastructure.

You must be kidding! "robust"?  What are you comparing this to?

I think it's only because "it's there" and that it provides
uniform access -- provided by SCSI, _not_ by that particular
SCSI implementation.

> My long term plans include moving some of this not-SCSI-related 
> infrastructure from the SCSI layer to the block layer.

Which is that infrastructure?

	Luben



* Re: [RFC] libata new EH document
  2005-09-01 21:46                       ` Jeff Garzik
  2005-09-01 22:09                         ` Luben Tuikov
@ 2005-09-01 22:22                         ` Luben Tuikov
  2005-09-01 22:31                           ` Jeff Garzik
  1 sibling, 1 reply; 33+ messages in thread
From: Luben Tuikov @ 2005-09-01 22:22 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: James Bottomley, Tejun Heo, ltuikov, Albert Lee, linux-ide,
	SCSI Mailing List, Doug Maxey

On 09/01/05 17:46, Jeff Garzik wrote:
> This is the same reason a couple RAID drivers use the SCSI layer.  It 
> has nothing to do with SCSI-as-defined-by-T10, and more to do with the 

Do you think T10 has anything to offer Linux SCSI Core?

	Luben

> fact that SCSI provides a robust queueing/EH/block interface infrastructure.


* Re: [RFC] libata new EH document
  2005-09-01 22:07                         ` Luben Tuikov
@ 2005-09-01 22:23                           ` James Bottomley
  2005-09-01 22:36                             ` Luben Tuikov
  0 siblings, 1 reply; 33+ messages in thread
From: James Bottomley @ 2005-09-01 22:23 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Jeff Garzik, Tejun Heo, ltuikov, Albert Lee, linux-ide,
	SCSI Mailing List, Doug Maxey

On Thu, 2005-09-01 at 18:07 -0400, Luben Tuikov wrote:
> > Well, not really, since it's basic SCSI and the explanation's pretty
> > long.  However, the standards have several pages about it.  For your
> > reading pleasure, I suggest SAM-2 section 5.9.1 Contingent allegiance
> > (CA) and auto contingent allegiance (ACA)
> 
> SCSI Core knows nothing about ACA and/or how to use it.

I don't recall ever claiming that it did.  The discussion was about how
the error handler clears contingent allegiance conditions.

> You should also know that no one actually spells out CA or ACA,
> they just use the capitalized abbreviation, plus the fact that
> CA is obsolete.

Let's just say I'm TLA averse.

> Stop impressing the children!

Is that what people who constantly refer to standards are trying to do?
I must say I did wonder ...

James




* Re: [RFC] libata new EH document
  2005-09-01 22:09                         ` Luben Tuikov
@ 2005-09-01 22:27                           ` Jeff Garzik
  2005-09-01 23:17                             ` Luben Tuikov
  2005-09-02  7:09                             ` Stefan Richter
  0 siblings, 2 replies; 33+ messages in thread
From: Jeff Garzik @ 2005-09-01 22:27 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: James Bottomley, Tejun Heo, ltuikov, Albert Lee, linux-ide,
	SCSI Mailing List, Doug Maxey

Luben Tuikov wrote:
> On 09/01/05 17:46, Jeff Garzik wrote:
> 
>>>>>libata is simply being lazy:  while the SCSI core continues to support
>>>>>kicking the EH thread when sense is missing, it's preferred for libata
>>>>>to reuse that infrastructure.
>>>>
>>>>
>>>>That makes the most sense ;-)
>>>
>>>
>>>For libata it doesn't really matter, since it is _ATA_.
>>
>>It matters quite a bit.  One of the main reasons libata uses the SCSI 
>>layer is for its infrastructure.
> 
> 
> Hmm, maybe I should've been more clear.
> 
> 
>>This is the same reason a couple RAID drivers use the SCSI layer.  It 
>>has nothing to do with SCSI-as-defined-by-T10, and more to do with the 
>>fact that SCSI provides a robust queueing/EH/block interface infrastructure.
> 
> 
> You must be kidding! "robust"?  What are you comparing this to?
> 
> I think it's only because "it's there" and that it provides
> a uniform access -- provided by SCSI, _not_ by that particular
> SCSI implementation.

You're correct in one sense, but I still don't think you understand 
Linux development at a fundamental level.

Linux is NOT about big designs.  Linus says "do what you must, and no 
more."  Linux is a fluid, organic biological organism that evolves 
through small changes over time.

So, yes, the reason is "it's there".  And that's a really good reason!

The future will bring other "baby steps" that evolve us towards a more 
modular design where each LLD may register themselves with a storage 
system, associate themselves with one or more transport classes, which 
in turn create associations with device classes.


>>My long term plans include moving some of this not-SCSI-related 
>>infrastructure from the SCSI layer to the block layer.
> 
> 
> Which is that infrastructure?

Queueing, EH, transport classes (which should already be independent of 
SCSI), maybe driver API, and other fun stuff.  :)

	Jeff




* Re: [RFC] libata new EH document
  2005-09-01 22:22                         ` Luben Tuikov
@ 2005-09-01 22:31                           ` Jeff Garzik
  0 siblings, 0 replies; 33+ messages in thread
From: Jeff Garzik @ 2005-09-01 22:31 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: James Bottomley, Tejun Heo, ltuikov, Albert Lee, linux-ide,
	SCSI Mailing List, Doug Maxey

Luben Tuikov wrote:
> On 09/01/05 17:46, Jeff Garzik wrote:
> 
>>This is the same reason a couple RAID drivers use the SCSI layer.  It 
>>has nothing to do with SCSI-as-defined-by-T10, and more to do with the 
> 
> 
> Do you think T10 has anything to offer Linux SCSI Core?

I was attempting to differentiate between SCSI, the protocol, and SCSI, 
the Linux implementation.

I have no idea how to answer your question.  We give feedback to T10 all 
the time.

	Jeff





* Re: [RFC] libata new EH document
  2005-09-01 22:23                           ` James Bottomley
@ 2005-09-01 22:36                             ` Luben Tuikov
  2005-09-01 23:01                               ` James Bottomley
  0 siblings, 1 reply; 33+ messages in thread
From: Luben Tuikov @ 2005-09-01 22:36 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jeff Garzik, Tejun Heo, ltuikov, Albert Lee, linux-ide,
	SCSI Mailing List, Doug Maxey

On 09/01/05 18:23, James Bottomley wrote:
> On Thu, 2005-09-01 at 18:07 -0400, Luben Tuikov wrote:
> 
>>>Well, not really, since it's basic SCSI and the explanation's pretty
>>>long.  However, the standards have several pages about it.  For your
>>>reading pleasure, I suggest SAM-2 section 5.9.1 Contingent allegiance
>>>(CA) and auto contingent allegiance (ACA)
>>
>>SCSI Core knows nothing about ACA and/or how to use it.
> 
> 
> I don't recall ever claiming that it did.  The discussion was about how
> the error handler clears contingent allegiance conditions.

:-)

So are you claiming that 
    "the error handler clears contingent allegiance conditions" ?

Please point me to the lines in the source code where it does this
and how it does it.

>>You should also know that no one actually spells out CA or ACA,
>>they just use the capitalized abbreviation, plus the fact that
>>CA is obsolete.
> 
> 
> Let's just say I'm TLA averse.
> 
> 
>>Stop impressing the children!
> 
> 
> Is that what people who constantly refer to standards are trying to do?
> I must say I did wonder ...

Yes, there's a bunch of us here, count DG too.

	Luben


* Re: [RFC] libata new EH document
  2005-09-01 22:36                             ` Luben Tuikov
@ 2005-09-01 23:01                               ` James Bottomley
  2005-09-01 23:03                                 ` Luben Tuikov
  2005-09-01 23:27                                 ` Luben Tuikov
  0 siblings, 2 replies; 33+ messages in thread
From: James Bottomley @ 2005-09-01 23:01 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Jeff Garzik, Tejun Heo, ltuikov, Albert Lee, linux-ide,
	SCSI Mailing List, Doug Maxey

On Thu, 2005-09-01 at 18:36 -0400, Luben Tuikov wrote:
> So are you claiming that 
>     "the error handler clears contingent allegiance conditions" ?
> 
> Please point me to the lines in the source code where it does this
> and how it does it.

That's this bit:

static void scsi_unjam_host(struct Scsi_Host *shost)
{
	unsigned long flags;
	LIST_HEAD(eh_work_q);
	LIST_HEAD(eh_done_q);

	spin_lock_irqsave(shost->host_lock, flags);
	list_splice_init(&shost->eh_cmd_q, &eh_work_q);
	spin_unlock_irqrestore(shost->host_lock, flags);

	SCSI_LOG_ERROR_RECOVERY(1, scsi_eh_prt_fail_stats(shost, &eh_work_q));

	if (!scsi_eh_get_sense(&eh_work_q, &eh_done_q))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
		if (!scsi_eh_abort_cmds(&eh_work_q, &eh_done_q))
			scsi_eh_ready_devs(shost, &eh_work_q, &eh_done_q);

James




* Re: [RFC] libata new EH document
  2005-09-01 23:01                               ` James Bottomley
@ 2005-09-01 23:03                                 ` Luben Tuikov
  2005-09-01 23:27                                 ` Luben Tuikov
  1 sibling, 0 replies; 33+ messages in thread
From: Luben Tuikov @ 2005-09-01 23:03 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jeff Garzik, Tejun Heo, ltuikov, Albert Lee, linux-ide,
	SCSI Mailing List, Doug Maxey

On 09/01/05 19:01, James Bottomley wrote:
> On Thu, 2005-09-01 at 18:36 -0400, Luben Tuikov wrote:
> 
>>So are you claiming that 
>>    "the error handler clears contingent allegiance conditions" ?
>>
>>Please point me to the lines in the source code where it does this
>>and how it does it.
> 
> 
> That's this bit:
> 
> static void scsi_unjam_host(struct Scsi_Host *shost)
> {
> 	unsigned long flags;
> 	LIST_HEAD(eh_work_q);
> 	LIST_HEAD(eh_done_q);
> 
> 	spin_lock_irqsave(shost->host_lock, flags);
> 	list_splice_init(&shost->eh_cmd_q, &eh_work_q);
> 	spin_unlock_irqrestore(shost->host_lock, flags);
> 
> 	SCSI_LOG_ERROR_RECOVERY(1, scsi_eh_prt_fail_stats(shost, &eh_work_q));
> 
> 	if (!scsi_eh_get_sense(&eh_work_q, &eh_done_q))
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Yeah, that ought to do it. ;-)

	Luben


> 		if (!scsi_eh_abort_cmds(&eh_work_q, &eh_done_q))
> 			scsi_eh_ready_devs(shost, &eh_work_q, &eh_done_q);
> 
> James
> 
> 
> 



* Re: [RFC] libata new EH document
  2005-09-01 22:27                           ` Jeff Garzik
@ 2005-09-01 23:17                             ` Luben Tuikov
  2005-09-02  7:09                             ` Stefan Richter
  1 sibling, 0 replies; 33+ messages in thread
From: Luben Tuikov @ 2005-09-01 23:17 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: James Bottomley, Tejun Heo, ltuikov, Albert Lee, linux-ide,
	SCSI Mailing List, Doug Maxey

On 09/01/05 18:27, Jeff Garzik wrote:
>>I think it's only because "it's there" and that it provides
>>a uniform access -- provided by SCSI, _not_ by that particular
>>SCSI implementation.
> 
> 
> You're correct in one sense, but I still don't think you understand 
> Linux development at a fundamental level.
> 
> Linux is NOT about big designs.  Linus says "do what you must, and no 
> more."  Linux is a fluid, organic biological organism that evolves 
> through small changes over time.

Jeff, this is no longer true.

Maybe in the years 1991-1995, but this is _absolutely_ no longer true.
Far more so for SCSI Core.  Read Documentation/ManagementStyle.

Furthermore, take a look at the changes Christoph is about to
do now regarding, about and around struct scsi_target.

Those changes have been talked about _before_ Christoph
moved from XFS to SCSI Core and _before_ you moved from
8139too to SCSI Core, first by Justin Gibbs and then by yours
truly.

Another example: 64-bit LUNs.  First, they are not supported;
second, they are _interpreted_ by SCSI Core in scsilun_to_int().
This is beyond wrong, this is incompetence.

I've been asking for those since 2000, after my work on iSCSI.

So, as you see, _you are right_, but only for *other* Linux
subsystems, where the maintainers operate in the Documentation/ManagementStyle
style.  Here at SCSI Core, the maintainers like to control everything
and, unless _they_ themselves came up with an idea, to reject it
immediately.

So over the years, a lot of "design features", "enhancements", etc
have stacked up, against the antiquated SCSI Core.  It is very sad
and unfortunate.

Had those been listened to and heeded, we'd have the human
framework that you talk about.

The best thing you can have is maintainers who _listen_ to people
and _listen_ to core engineers who _live and breathe_ the industry.
This is the Documentation/ManagementStyle.

The worst thing you can have is maintainers who try to _guess_
and do their own designs and whims, without actually addressing
real problems which exist now and have existed for a long time.

The presence of channel/id in SCSI Core.  It is unthinkable to leave
that in and go and work on a "transport class", whatever that means.

Priorities need to be set straight.  The architecture (future) needs
to be studied or at least the maintainers should listen to vendors
and double check with others and then triple check with a spec.

Parallel SCSI is slowly and surely going away.

FC, USB, FireWire, SATA and SAS are coming our way and will completely
replace SPI.

You can join the future or you can steadfastly hang on to something
which you know will not cut it anywhich way you mess with it.

> So, yes, the reason is "it's there".  And that's a really good reason!
> 
> The future will bring other "baby steps" that evolve us towards a more 
> modular design where each LLD may register themselves with a storage 
> system, associate themselves with one or more transport classes, which 
> in turn create associations with device classes.

I have this working already _and_ it is T10 compliant.
It is modular and _layered_.

> Queueing, EH, transport classes (which should already be independent of 
> SCSI), maybe driver API, and other fun stuff.  :)

You'd better have one good and well thought out infrastructure...!
Of course you cannot do any of this unless a lot of specs have been
read and studied.... It's like writing papers -- you do it after having
read a lot of them.

As to your saying 
	"transport classes (which should already be independent of
	 SCSI)"
You see, this will _never_ happen.

First of all, JB's "transport classes" were written ONLY
to export *attributes*, and _never_ to provide a layer of abstraction
for the specific transport.

Second, as I mentioned in a previous email, you cannot create
a "superclass" of all transports. Because:
	- this is what SAM is,
	- this is what SCSI Core *should be*!

Third, they were never intended to *manage* a transport infrastructure.
Just look at the troubles they have with SAS's infrastructure.

Fourth, they are "hooked" at the _wrong_ level and indirection,
just as you say
	"transport classes (which should already be independent of
	 SCSI)".
They are NOT and will never be -- the mindset of their creators
is that they will control also the transports, instead of
concentrating on SCSI Core.

So all in all, a lot of thinking needs to be done. Add to this
a white sheet of paper, pencil on one side and a spec on the other.

	Luben


* Re: [RFC] libata new EH document
  2005-09-01 23:01                               ` James Bottomley
  2005-09-01 23:03                                 ` Luben Tuikov
@ 2005-09-01 23:27                                 ` Luben Tuikov
  1 sibling, 0 replies; 33+ messages in thread
From: Luben Tuikov @ 2005-09-01 23:27 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jeff Garzik, Tejun Heo, ltuikov, Albert Lee, linux-ide,
	SCSI Mailing List, Doug Maxey

On 09/01/05 19:01, James Bottomley wrote:
> On Thu, 2005-09-01 at 18:36 -0400, Luben Tuikov wrote:
> 
>>So are you claiming that 
>>    "the error handler clears contingent allegiance conditions" ?
>>
>>Please point me to the lines in the source code where it does this
>>and how it does it.
> 
> 
> That's this bit:
> 
> static void scsi_unjam_host(struct Scsi_Host *shost)
> {
> 	unsigned long flags;
> 	LIST_HEAD(eh_work_q);
> 	LIST_HEAD(eh_done_q);
> 
> 	spin_lock_irqsave(shost->host_lock, flags);
> 	list_splice_init(&shost->eh_cmd_q, &eh_work_q);
> 	spin_unlock_irqrestore(shost->host_lock, flags);
> 
> 	SCSI_LOG_ERROR_RECOVERY(1, scsi_eh_prt_fail_stats(shost, &eh_work_q));
> 
> 	if (!scsi_eh_get_sense(&eh_work_q, &eh_done_q))
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

James,

1. I'd suggest you take a look at the QErr and TST bits in the
   Control Mode page.

2. I'd also suggest looking at the ACA Task Attribute.

3. I'd also suggest looking at the NACA bit of the SCSI CDB.
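
For reference, item 3 can be sketched in a few lines of C.  This is
only an illustration, assuming a fixed-length CDB whose CONTROL byte is
the last byte and the SAM-2 layout where NACA is bit 2 of that byte.
The names CDB_CONTROL_NACA and cdb_naca_set() are hypothetical and do
not exist in the SCSI Core; QErr, TST and the ACA task attribute are
not covered here.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: NACA is bit 2 of the CONTROL byte (SAM-2 layout). */
#define CDB_CONTROL_NACA 0x04

/*
 * Hypothetical helper: returns 1 if the NACA bit is set in a
 * fixed-length CDB, where the CONTROL byte is the last byte.
 * Variable-length CDBs place CONTROL elsewhere and are not handled.
 */
static int cdb_naca_set(const unsigned char *cdb, size_t len)
{
	assert(cdb != NULL && len > 0);
	return (cdb[len - 1] & CDB_CONTROL_NACA) != 0;
}
```

With NACA clear, as it is for commands built without setting this bit,
the helper returns 0.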

	Luben



> 		if (!scsi_eh_abort_cmds(&eh_work_q, &eh_done_q))
> 			scsi_eh_ready_devs(shost, &eh_work_q, &eh_done_q);
> 
> James
> 
> 
> 



* Re: [RFC] libata new EH document
  2005-09-01 22:27                           ` Jeff Garzik
  2005-09-01 23:17                             ` Luben Tuikov
@ 2005-09-02  7:09                             ` Stefan Richter
  1 sibling, 0 replies; 33+ messages in thread
From: Stefan Richter @ 2005-09-02  7:09 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Luben Tuikov, James Bottomley, Tejun Heo, ltuikov, Albert Lee,
	linux-ide, SCSI Mailing List, Doug Maxey

Jeff Garzik wrote:
> Linux is NOT about big designs.  Linus says "do what you must, and no 
> more."

The scsi subsystem is used in so many divergent areas, its set of 
requirements is "big". Therefore its design has to be "big".

> The future will bring other "baby steps" that evolve us towards a more 
> modular design where each LLD may register themselves with a storage 
> system, associate themselves with
[...]

...and hopefully a future where a LLD can be written and maintained 
without profound knowledge of many implementation details of scsi mid- 
and high-level.

(Sorry for digressing further from the thread's subject, although it 
still seems related.)
-- 
Stefan Richter
-=====-=-=-= =--= ---=-
http://arcgraph.de/sr/


* Re: [RFC] libata new EH document
  2005-08-29  6:11 [RFC] libata new EH document Tejun Heo
  2005-08-29  6:13 ` Tejun Heo
  2005-08-30  9:10 ` Albert Lee
@ 2005-09-07  8:25 ` Jeff Garzik
  2 siblings, 0 replies; 33+ messages in thread
From: Jeff Garzik @ 2005-09-07  8:25 UTC (permalink / raw)
  To: Tejun Heo; +Cc: albertcc, linux-ide

Tejun Heo wrote:
>  Hello, Jeff, Albert & ATA developers.
> 
>  This is the final one of recent document series for libata EH - SCSI
> EH, ATA exceptions, libata EH and, this one - libata new EH.
> 
>  This document tries to discuss how to implement new advanced EH.  It
> also describes some proposed mechanisms in detail.  I'm aware that
> things are vague without actual code, but I still think this document
> alone can at least help discussion if nothing else.  As long as some
> consensus is reached regarding general design, I'll follow up with
> patches.
> 
>  Jeff, a lot are from my previous new EH/NCQ patchset but also quite
> a bit has changed (for better, I hope).
> 
>  Thanks.
> 
> 
> libata new EH
> ======================================
> 
>  As discussed in the previous libata EH doc, the current libata EH
> needs some improvements.  This document discusses goals of new libata
> EH and how to reach them.  Please read SCSI EH, ATA exceptions and
> libata EH documents first.
> 
> TABLE OF CONTENTS
> 
> [1] Goals & design choices
>     [1-1] Use SCSI hostt->eh_strategy_handler()
>     [1-2] Unified error path in an EH thread
>     [1-3] Synchronization
>     [1-4] Clean mechanism to hand off qc's to EH
>     [1-5] Separate EH qc
>     [1-6] SCSI/libata separation
> [2] Designs
>     [2-1] Handoff of failed qc's
>     [2-2] Timed out scmd's and qc's
>     [2-3] Summary of [2-1] and [2-2]
>     [2-4] EH processing & completion
> [3] Ideas
>     [3-1] Using EH for non-error exceptions and dynamic reconfiguration
>     [3-2] Using EH for host_set level exclusion
> [4] Implementation plan
> 
> 
> [1] Goals & design choices
> 
>  The final goal is implementing advanced error handling as described
> in ATA exceptions document including NCQ EH, dynamic transport
> reconfiguration and non-error exception handling for power management
> and hot plugging.
> 
>  The following are sub-goals and design choices to reach the final
> goal.
> 
> 
> [1-1] Use SCSI hostt->eh_strategy_handler()
> 
>     We have two other alternatives here - one is using fine-grained
>     SCSI EH callbacks and the other is implementing separate EH for
>     libata.
> 
>     Using fine-grained SCSI EH callbacks is possible, but it has too
>     many SCSI/SPI assumptions in it -

Not really.  When you notice an error, and inform the SCSI stack of that, it

- tries to abort the command using abort_handler
- if that failed, tries to reset the device
- if that failed, tries to reset the bus
- if that failed, tries to reset the host

Nothing SPI-specific about that.
Nothing SCSI-specific about that, either :)

That is the ordering that we would like to use, and it maps directly to 
SCSI EH.
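
The escalation order listed above can be sketched roughly as follows.
This is a minimal illustration and not the SCSI midlayer code: the enum,
the callback type and eh_escalate() are all hypothetical names, and each
callback is assumed to return nonzero when its step recovers the
command.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical recovery steps, in the order described above. */
enum eh_step {
	EH_ABORT,
	EH_DEVICE_RESET,
	EH_BUS_RESET,
	EH_HOST_RESET,
	EH_NR_STEPS
};

/* Hypothetical callback type: returns nonzero on success. */
typedef int (*eh_step_fn)(void *ctx);

/*
 * Try each step in order and stop at the first one that succeeds.
 * Returns the step that succeeded, or EH_NR_STEPS if every step failed.
 * A NULL entry means the driver does not implement that step.
 */
static enum eh_step eh_escalate(const eh_step_fn steps[EH_NR_STEPS], void *ctx)
{
	int i;

	assert(steps != NULL);
	for (i = 0; i < EH_NR_STEPS; i++) {
		if (steps[i] && steps[i](ctx))
			return (enum eh_step)i;
	}
	return EH_NR_STEPS;
}

/* Demo callbacks that count how many steps were attempted. */
static int step_fails(void *ctx)    { ++*(int *)ctx; return 0; }
static int step_succeeds(void *ctx) { ++*(int *)ctx; return 1; }
```

If abort and device reset fail but bus reset works, the host reset is
never attempted, which is the point of the ordering: the cheapest
recovery action that works is the last one taken.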


>  Also, as described in the
>     SCSI EH doc, it issues several SCSI commands for recovery.  They
>     can be translated but recovery through translation is a bit
>     creepy, IMHO.

Agreed RE translation

I'll reply to more of this doc when I have some sleep :)

	Jeff




end of thread, other threads:[~2005-09-07  8:25 UTC | newest]

Thread overview: 33+ messages
2005-08-29  6:11 [RFC] libata new EH document Tejun Heo
2005-08-29  6:13 ` Tejun Heo
2005-08-30  9:10 ` Albert Lee
2005-08-30 10:26   ` Tejun Heo
2005-08-30 14:32     ` Luben Tuikov
2005-09-01  1:17       ` Tejun Heo
2005-09-01  2:22         ` Jeff Garzik
2005-09-01  2:42           ` Tejun Heo
2005-09-01  3:33           ` Luben Tuikov
2005-09-01  3:30         ` Luben Tuikov
2005-09-01  3:44           ` Tejun Heo
2005-09-01  4:38             ` Luben Tuikov
2005-09-01  5:44               ` Tejun Heo
2005-09-01  5:54                 ` Jeff Garzik
2005-09-01 13:24                   ` James Bottomley
2005-09-01 21:40                     ` Luben Tuikov
2005-09-01 21:46                       ` Jeff Garzik
2005-09-01 22:09                         ` Luben Tuikov
2005-09-01 22:27                           ` Jeff Garzik
2005-09-01 23:17                             ` Luben Tuikov
2005-09-02  7:09                             ` Stefan Richter
2005-09-01 22:22                         ` Luben Tuikov
2005-09-01 22:31                           ` Jeff Garzik
2005-09-01 21:55                       ` James Bottomley
2005-09-01 22:07                         ` Luben Tuikov
2005-09-01 22:23                           ` James Bottomley
2005-09-01 22:36                             ` Luben Tuikov
2005-09-01 23:01                               ` James Bottomley
2005-09-01 23:03                                 ` Luben Tuikov
2005-09-01 23:27                                 ` Luben Tuikov
2005-09-01  2:22     ` Jeff Garzik
2005-08-30 14:27   ` James Bottomley
2005-09-07  8:25 ` Jeff Garzik
