* If abort request comes in for command not known to LLD?
@ 2012-03-02 15:44 scameron
2012-03-02 21:10 ` Mike Christie
0 siblings, 1 reply; 4+ messages in thread
From: scameron @ 2012-03-02 15:44 UTC (permalink / raw)
To: linux-scsi; +Cc: scameron
What should the LLD do if an abort request comes into the
abort error handler from the midlayer for a command which is
not known to the LLD?
I see aic7xxx_osm.c handles it in this way in ahc_linux_queue_recovery_cmd():
no_cmd:
	/*
	 * Our assumption is that if we don't have the command, no
	 * recovery action was required, so we return success.  Again,
	 * the semantics of the mid-layer recovery engine are not
	 * well defined, so this may change in time.
	 */
	retval = SUCCESS;
Is that the right thing to do? Seems a bit weird, but if that's
the right thing to do, I can do that too.
-- steve
* Re: If abort request comes in for command not known to LLD?
2012-03-02 15:44 If abort request comes in for command not known to LLD? scameron
@ 2012-03-02 21:10 ` Mike Christie
2012-03-02 23:01 ` scameron
0 siblings, 1 reply; 4+ messages in thread
From: Mike Christie @ 2012-03-02 21:10 UTC (permalink / raw)
To: scameron; +Cc: linux-scsi
On 03/02/2012 09:44 AM, scameron@beardog.cce.hp.com wrote:
>
> What should the LLD do if an abort request comes into the
> abort error handler from the midlayer for a command which is
> not known to the LLD?
>
> I see aic7xxx_osm.c handles it in this way in ahc_linux_queue_recovery_cmd():
>
> no_cmd:
> /*
> * Our assumption is that if we don't have the command, no
> * recovery action was required, so we return success. Again,
> * the semantics of the mid-layer recovery engine are not
> * well defined, so this may change in time.
> */
> retval = SUCCESS;
>
> Is that the right thing to do? Seems a bit weird, but if that's
> the right thing to do, I can do that too.
>
How do you hit this case?
I think it is ok. The cases where I have seen drivers hit this are the
race where the driver is completing a command while the timer code is
starting to go off, or where the cmd has timed out and the driver then
completes the command before the abort code is run.
In those cases the driver has cleaned up its internal accounting because
the command has completed. At that point there is not much it can do
even if it wanted to. It does not have a way to look up things like
internal tags/ids for the command.
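The race described above can be sketched in a few lines of user-space C. This is only an illustrative model, not kernel code: `struct cmd`, `struct lld`, `lld_queue()`, `lld_complete()`, `lld_abort()` and the `aborts_sent` counter are all hypothetical names invented for the example. Only the `SUCCESS` and `FAILED` values are taken from the kernel's include/scsi/scsi.h. The point it shows is that once the completion path has dropped the driver's bookkeeping, the abort handler has nothing to look up, so it sends nothing to the hardware and returns SUCCESS, as the aic7xxx `no_cmd` path does.

```c
#include <assert.h>
#include <stddef.h>

/* Values mirror SUCCESS/FAILED in the kernel's include/scsi/scsi.h. */
#define SUCCESS 0x2002
#define FAILED  0x2003

#define MAX_OUTSTANDING 4	/* hypothetical queue depth */

/* Hypothetical stand-in for struct scsi_cmnd. */
struct cmd {
	int id;
};

/* Hypothetical per-adapter state: commands the driver still owns. */
struct lld {
	struct cmd *outstanding[MAX_OUTSTANDING];
	int aborts_sent;	/* aborts actually issued to the "hardware" */
};

/* Accept a command; returns its slot (internal tag), or -1 if full. */
static int lld_queue(struct lld *h, struct cmd *c)
{
	for (int i = 0; i < MAX_OUTSTANDING; i++) {
		if (h->outstanding[i] == NULL) {
			h->outstanding[i] = c;
			return i;
		}
	}
	return -1;
}

/* Completion path: the driver forgets the command and its tag. */
static void lld_complete(struct lld *h, int tag)
{
	assert(tag >= 0 && tag < MAX_OUTSTANDING);
	h->outstanding[tag] = NULL;
}

/* Abort handler, modelled on the no_cmd behaviour quoted in the thread. */
static int lld_abort(struct lld *h, struct cmd *c)
{
	for (int i = 0; i < MAX_OUTSTANDING; i++) {
		if (h->outstanding[i] == c) {
			/* Command is still ours: a real driver would abort it here. */
			h->aborts_sent++;
			h->outstanding[i] = NULL;
			return SUCCESS;
		}
	}
	/* no_cmd: already completed, so there is no tag left to abort. */
	return SUCCESS;
}
```

Calling `lld_queue()`, then `lld_complete()`, then `lld_abort()` on the same command models the timeout racing with completion: the abort returns SUCCESS while `aborts_sent` stays at zero, which is exactly the "said it aborted, but did nothing" situation under discussion.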
* Re: If abort request comes in for command not known to LLD?
2012-03-02 21:10 ` Mike Christie
@ 2012-03-02 23:01 ` scameron
2012-03-04 10:25 ` Mike Christie
0 siblings, 1 reply; 4+ messages in thread
From: scameron @ 2012-03-02 23:01 UTC (permalink / raw)
To: Mike Christie; +Cc: linux-scsi, scameron
On Fri, Mar 02, 2012 at 03:10:02PM -0600, Mike Christie wrote:
> On 03/02/2012 09:44 AM, scameron@beardog.cce.hp.com wrote:
> >
> > What should the LLD do if an abort request comes into the
> > abort error handler from the midlayer for a command which is
> > not known to the LLD?
> >
> > I see aic7xxx_osm.c handles it in this way in ahc_linux_queue_recovery_cmd():
> >
> > no_cmd:
> > /*
> > * Our assumption is that if we don't have the command, no
> > * recovery action was required, so we return success. Again,
> > * the semantics of the mid-layer recovery engine are not
> > * well defined, so this may change in time.
> > */
> > retval = SUCCESS;
> >
> > Is that the right thing to do? Seems a bit weird, but if that's
> > the right thing to do, I can do that too.
> >
>
> How do you hit this case?
I'm not quite sure. I haven't hit it, but have a report of it on RHEL5u5
with an XFS filesystem under heavy load. As a guess, I'd say a race between
the driver completing the command and a timeout in the mid layer. In any
case, it'd be nice to know what the kernel expects a driver to do if
it should encounter that situation.
>
> I think it is ok. The cases where I have seen drivers hit this are the
> race where the driver is completing a command while the timer code is
> starting to go off, or where the cmd has timed out and the driver then
> completes the command before the abort code is run.
>
> In those cases the driver has cleaned up its internal accounting because
> the command has completed. At that point there is not much it can do
> even if it wanted to. It does not have a way to look up things like
> internal tags/ids for the command.
Right, but it just seems weird for the driver to effectively say, "Sure,
I aborted that command", when it did no such thing. If the driver tells
the kernel that a write got aborted when really it was completed, that
seems like it could be kind of bad.
-- steve
* Re: If abort request comes in for command not known to LLD?
2012-03-02 23:01 ` scameron
@ 2012-03-04 10:25 ` Mike Christie
0 siblings, 0 replies; 4+ messages in thread
From: Mike Christie @ 2012-03-04 10:25 UTC (permalink / raw)
To: scameron; +Cc: linux-scsi
On 03/02/2012 05:01 PM, scameron@beardog.cce.hp.com wrote:
> On Fri, Mar 02, 2012 at 03:10:02PM -0600, Mike Christie wrote:
>> On 03/02/2012 09:44 AM, scameron@beardog.cce.hp.com wrote:
>>>
>>> What should the LLD do if an abort request comes into the
>>> abort error handler from the midlayer for a command which is
>>> not known to the LLD?
>>>
>>> I see aic7xxx_osm.c handles it in this way in ahc_linux_queue_recovery_cmd():
>>>
>>> no_cmd:
>>> /*
>>> * Our assumption is that if we don't have the command, no
>>> * recovery action was required, so we return success. Again,
>>> * the semantics of the mid-layer recovery engine are not
>>> * well defined, so this may change in time.
>>> */
>>> retval = SUCCESS;
>>>
>>> Is that the right thing to do? Seems a bit weird, but if that's
>>> the right thing to do, I can do that too.
>>>
>>
>> How do you hit this case?
>
> I'm not quite sure. I haven't hit it, but have a report of it on RHEL5u5
> with an XFS filesystem under heavy load. As a guess, I'd say a race between
> the driver completing the command and a timeout in the mid layer. In any
> case, it'd be nice to know what the kernel expects a driver to do if
> it should encounter that situation.
>
>>
>> I think it is ok. The cases where I have seen drivers hit this are the
>> race where the driver is completing a command while the timer code is
>> starting to go off, or where the cmd has timed out and the driver then
>> completes the command before the abort code is run.
>>
>> In those cases the driver has cleaned up its internal accounting because
>> the command has completed. At that point there is not much it can do
>> even if it wanted to. It does not have a way to look up things like
>> internal tags/ids for the command.
>
> Right, but it just seems weird for the driver to effectively say, "Sure,
> I aborted that command", when it did no such thing. If the driver tells
> the kernel that a write got aborted when really it was completed, that
> seems like it could be kind of bad.
I see what you are saying. Yeah, it would be better if we had a new
error code for this, so we could return it from the abort handler. Then
scsi_error.c could also skip retrying it.
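To make the suggestion of a separate error code concrete, here is a small user-space sketch of how the error handler's decision could change. Everything here is an assumption made for illustration: `ABORT_CMD_NOT_FOUND`, `enum eh_action` and `eh_after_abort()` are hypothetical and do not exist in the kernel, and the three-way decision is a deliberately simplified model of what scsi_error.c does after an abort attempt. Only the `SUCCESS` and `FAILED` values are taken from include/scsi/scsi.h.

```c
#include <assert.h>

/* Values mirror SUCCESS/FAILED in the kernel's include/scsi/scsi.h. */
#define SUCCESS 0x2002
#define FAILED  0x2003

/* HYPOTHETICAL return code: "the LLD no longer knows this command". */
#define ABORT_CMD_NOT_FOUND 0x2fff

static_assert(ABORT_CMD_NOT_FOUND != SUCCESS &&
	      ABORT_CMD_NOT_FOUND != FAILED,
	      "the new code must be distinguishable from the existing ones");

/* Hypothetical, simplified set of follow-up actions for the error handler. */
enum eh_action {
	EH_RETRY_CMD,	/* command was aborted: it may be retried */
	EH_LEAVE_CMD,	/* command already completed: do not retry it */
	EH_ESCALATE	/* abort failed: move on to stronger recovery */
};

/* Map the abort handler's return value to the next recovery step. */
static enum eh_action eh_after_abort(int abort_rtn)
{
	switch (abort_rtn) {
	case SUCCESS:
		return EH_RETRY_CMD;
	case ABORT_CMD_NOT_FOUND:
		return EH_LEAVE_CMD;
	default:
		return EH_ESCALATE;
	}
}
```

With a distinct code, the driver no longer has to claim it aborted a command it never touched, and a write that actually completed would not be issued a second time.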