linux-scsi.vger.kernel.org archive mirror
* If abort request comes in for command not known to LLD?
@ 2012-03-02 15:44 scameron
  2012-03-02 21:10 ` Mike Christie
  0 siblings, 1 reply; 4+ messages in thread
From: scameron @ 2012-03-02 15:44 UTC (permalink / raw)
  To: linux-scsi; +Cc: scameron


What should the LLD do if an abort request comes into the
abort error handler from the midlayer for a command which is
not known to the LLD?

I see aic7xxx_osm.c handles it in this way in ahc_linux_queue_recovery_cmd():

no_cmd:
        /*
         * Our assumption is that if we don't have the command, no
         * recovery action was required, so we return success.  Again,
         * the semantics of the mid-layer recovery engine are not
         * well defined, so this may change in time.
         */
        retval = SUCCESS;

Is that the right thing to do?  Seems a bit weird, but if that's
the right thing to do, I can do that too.

-- steve




* Re: If abort request comes in for command not known to LLD?
  2012-03-02 15:44 If abort request comes in for command not known to LLD? scameron
@ 2012-03-02 21:10 ` Mike Christie
  2012-03-02 23:01   ` scameron
  0 siblings, 1 reply; 4+ messages in thread
From: Mike Christie @ 2012-03-02 21:10 UTC (permalink / raw)
  To: scameron; +Cc: linux-scsi

On 03/02/2012 09:44 AM, scameron@beardog.cce.hp.com wrote:
> 
> What should the LLD do if an abort request comes into the
> abort error handler from the midlayer for a command which is
> not known to the LLD?
> 
> I see aic7xxx_osm.c handles it in this way in ahc_linux_queue_recovery_cmd():
> 
> no_cmd:
>         /*
>          * Our assumption is that if we don't have the command, no
>          * recovery action was required, so we return success.  Again,
>          * the semantics of the mid-layer recovery engine are not
>          * well defined, so this may change in time.
>          */
>         retval = SUCCESS;
> 
> Is that the right thing to do?  Seems a bit weird, but if that's
> the right thing to do, I can do that too.
> 

How do you hit this case?

I think it is ok. The reason I have seen drivers hit this is the race
where the driver is completing a command while the timer code is
starting to go off, or the cmd has timed out and then the driver
completes the command before the abort code is run.

In those cases the driver has cleaned up its internal accounting because
the command has completed. At that point there is not much it can do
even if it wanted to. It does not have a way to look up things like
internal tags/ids for the command.
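
To make the race concrete, here is a minimal user-space sketch of the
bookkeeping described above. It is not taken from any real driver: every
name in it (toy_submit, toy_complete, toy_eh_abort, the TOY_* return
values) is hypothetical, and the "hardware abort" is a stub that always
succeeds. It only shows why an abort for an already-completed command
finds nothing to look up, and how returning success in that case means no
abort was ever issued.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder return values for this sketch only; a real abort handler
 * would return the midlayer's SUCCESS or FAILED. */
#define TOY_SUCCESS 1
#define TOY_FAILED  0

#define TOY_MAX_OUTSTANDING 4

/* Hypothetical per-command accounting kept by the driver. */
struct toy_slot {
    bool in_use;
    unsigned long cmd_id; /* stand-in for a pointer to the SCSI command */
    int hw_tag;           /* internal tag the hardware knows the command by */
};

static struct toy_slot toy_outstanding[TOY_MAX_OUTSTANDING];
static int toy_hw_aborts_issued; /* counts aborts actually sent to "hardware" */

/* Look up the accounting entry for a command, or NULL if unknown. */
static struct toy_slot *toy_find(unsigned long cmd_id)
{
    for (int i = 0; i < TOY_MAX_OUTSTANDING; i++) {
        if (toy_outstanding[i].in_use && toy_outstanding[i].cmd_id == cmd_id)
            return &toy_outstanding[i];
    }
    return NULL;
}

/* Submission path: remember the command. Returns its tag, or -1 if full. */
static int toy_submit(unsigned long cmd_id)
{
    for (int i = 0; i < TOY_MAX_OUTSTANDING; i++) {
        if (!toy_outstanding[i].in_use) {
            toy_outstanding[i].in_use = true;
            toy_outstanding[i].cmd_id = cmd_id;
            toy_outstanding[i].hw_tag = i;
            return i;
        }
    }
    return -1;
}

/* Completion path: the driver cleans up and forgets the command. */
static void toy_complete(unsigned long cmd_id)
{
    struct toy_slot *slot = toy_find(cmd_id);

    if (slot)
        slot->in_use = false;
}

/* Stub for asking the hardware to abort a tag; always succeeds here. */
static bool toy_hw_abort(int hw_tag)
{
    (void)hw_tag;
    toy_hw_aborts_issued++;
    return true;
}

/* Abort handler: mirrors the "no_cmd" policy quoted from aic7xxx_osm.c. */
static int toy_eh_abort(unsigned long cmd_id)
{
    struct toy_slot *slot = toy_find(cmd_id);

    if (!slot) {
        /* Command already completed (or never seen): there is no tag
         * left to abort, so report success without doing anything. */
        return TOY_SUCCESS;
    }
    if (!toy_hw_abort(slot->hw_tag))
        return TOY_FAILED;
    slot->in_use = false;
    return TOY_SUCCESS;
}
```

If a command is submitted, completed, and then aborted, toy_eh_abort()
returns TOY_SUCCESS while toy_hw_aborts_issued stays at zero, which is
exactly the "success without an abort" outcome being discussed.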


* Re: If abort request comes in for command not known to LLD?
  2012-03-02 21:10 ` Mike Christie
@ 2012-03-02 23:01   ` scameron
  2012-03-04 10:25     ` Mike Christie
  0 siblings, 1 reply; 4+ messages in thread
From: scameron @ 2012-03-02 23:01 UTC (permalink / raw)
  To: Mike Christie; +Cc: linux-scsi, scameron

On Fri, Mar 02, 2012 at 03:10:02PM -0600, Mike Christie wrote:
> On 03/02/2012 09:44 AM, scameron@beardog.cce.hp.com wrote:
> > 
> > What should the LLD do if an abort request comes into the
> > abort error handler from the midlayer for a command which is
> > not known to the LLD?
> > 
> > I see aic7xxx_osm.c handles it in this way in ahc_linux_queue_recovery_cmd():
> > 
> > no_cmd:
> >         /*
> >          * Our assumption is that if we don't have the command, no
> >          * recovery action was required, so we return success.  Again,
> >          * the semantics of the mid-layer recovery engine are not
> >          * well defined, so this may change in time.
> >          */
> >         retval = SUCCESS;
> > 
> > Is that the right thing to do?  Seems a bit weird, but if that's
> > the right thing to do, I can do that too.
> > 
> 
> How do you hit this case?

I'm not quite sure.  I haven't hit it, but have a report of it on RHEL5u5
with XFS filesystem under heavy load.  As a guess, I'd say a race between
driver completing the command and a timeout in the mid layer.  In any
case, it'd be nice to know what the kernel expects a driver to do if
it should encounter that situation.

> 
> I think it is ok. The reason I have seen drivers hit this is the race
> where the driver is completing a command while the timer code is
> starting to go off, or the cmd has timed out and then the driver
> completes the command before the abort code is run.
> 
> In those cases the driver has cleaned up its internal accounting because
> the command has completed. At that point there is not much it can do
> even if it wanted to. It does not have a way to look up things like
> internal tags/ids for the command.

Right, but it just seems weird for the driver to effectively say, "Sure,
I aborted that command", when it did no such thing.  If the driver tells
the kernel that a write got aborted when really it was completed, that
seems like it could be kind of bad.

-- steve



* Re: If abort request comes in for command not known to LLD?
  2012-03-02 23:01   ` scameron
@ 2012-03-04 10:25     ` Mike Christie
  0 siblings, 0 replies; 4+ messages in thread
From: Mike Christie @ 2012-03-04 10:25 UTC (permalink / raw)
  To: scameron; +Cc: linux-scsi

On 03/02/2012 05:01 PM, scameron@beardog.cce.hp.com wrote:
> On Fri, Mar 02, 2012 at 03:10:02PM -0600, Mike Christie wrote:
>> On 03/02/2012 09:44 AM, scameron@beardog.cce.hp.com wrote:
>>>
>>> What should the LLD do if an abort request comes into the
>>> abort error handler from the midlayer for a command which is
>>> not known to the LLD?
>>>
>>> I see aic7xxx_osm.c handles it in this way in ahc_linux_queue_recovery_cmd():
>>>
>>> no_cmd:
>>>         /*
>>>          * Our assumption is that if we don't have the command, no
>>>          * recovery action was required, so we return success.  Again,
>>>          * the semantics of the mid-layer recovery engine are not
>>>          * well defined, so this may change in time.
>>>          */
>>>         retval = SUCCESS;
>>>
>>> Is that the right thing to do?  Seems a bit weird, but if that's
>>> the right thing to do, I can do that too.
>>>
>>
>> How do you hit this case?
> 
> I'm not quite sure.  I haven't hit it, but have a report of it on RHEL5u5
> with XFS filesystem under heavy load.  As a guess, I'd say a race between
> driver completing the command and a timeout in the mid layer.  In any
> case, it'd be nice to know what the kernel expects a driver to do if
> it should encounter that situation.
> 
>>
>> I think it is ok. The reason I have seen drivers hit this is the race
>> where the driver is completing a command while the timer code is
>> starting to go off, or the cmd has timed out and then the driver
>> completes the command before the abort code is run.
>>
>> In those cases the driver has cleaned up its internal accounting because
>> the command has completed. At that point there is not much it can do
>> even if it wanted to. It does not have a way to look up things like
>> internal tags/ids for the command.
> 
> Right, but it just seems weird for the driver to effectively say, "Sure,
> I aborted that command", when it did no such thing.  If the driver tells
> the kernel that a write got aborted when really it was completed, that
> seems like it could be kind of bad.

I see what you are saying. Yeah, it would be better if we had a new
error code for this, so we could return it from the abort handler. Then
scsi_error.c could also skip retrying it.
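
As a rough illustration of that idea only, the sketch below uses a
made-up third result for "command not found". Nothing here exists in the
kernel: the enum values, TOY_ABORT_NOT_FOUND, and toy_eh_after_abort()
are all hypothetical, and the mapping to follow-up actions is an
assumption about what an error handler could do with the extra
information, not a description of what scsi_error.c does.

```c
/* Hypothetical abort-handler results. The first two stand in for the
 * existing success/failure outcomes; TOY_ABORT_NOT_FOUND is the new code
 * suggested in this thread and is not a real kernel value. */
enum toy_abort_result {
    TOY_ABORT_SUCCESS,
    TOY_ABORT_FAILED,
    TOY_ABORT_NOT_FOUND,
};

/* What a toy error handler might do with the command afterwards. */
enum toy_eh_action {
    TOY_EH_RETRY,    /* command was really aborted: candidate for retry */
    TOY_EH_ESCALATE, /* abort failed: move on to stronger recovery */
    TOY_EH_LEAVE,    /* driver no longer knows the command: skip the retry */
};

/* Map the abort result to the next step (assumed policy, for illustration). */
static enum toy_eh_action toy_eh_after_abort(enum toy_abort_result result)
{
    switch (result) {
    case TOY_ABORT_SUCCESS:
        return TOY_EH_RETRY;
    case TOY_ABORT_NOT_FOUND:
        /* The command most likely completed on its own, so retrying it
         * could repeat work (for example a write) that already happened. */
        return TOY_EH_LEAVE;
    case TOY_ABORT_FAILED:
    default:
        return TOY_EH_ESCALATE;
    }
}
```

With a distinct result, the driver no longer has to claim it aborted
something it never touched, and the caller can tell "aborted" apart from
"already gone".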

