* [PATCH 0/2] : definition, code, and use of new SCSI ML host status DID_COND_REQUEUE
From: Edward Goggin @ 2007-02-02 22:04 UTC
To: linux-scsi; +Cc: James.Bottomley, eric.moore

Patch Set Summary:

1 Define new SCSI ML host status DID_COND_REQUEUE and add its handling
  code to scsi_decide_disposition. scsi_decide_disposition returns
  ADD_TO_MLQUEUE if and only if REQ_FAILFAST is not set.

2 Return DID_COND_REQUEUE instead of DID_BUS_BUSY host status in the
  MPT fusion driver when the IOC status is SUCCESS and the SCSI status
  is busy.

Signed-off-by: Ed Goggin <egoggin@vmware.com>
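The rule in item 1 of the summary can be sketched roughly as follows. This is a minimal userspace illustration, not the patch itself. Every name prefixed `sketch_` or `SKETCH_` is hypothetical, the enum values are not the kernel's, and the fail-fast branch completing the command (rather than doing something else) is an assumption, since the summary only states when ADD_TO_MLQUEUE is returned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the mid-layer's host status codes. */
enum sketch_host_status {
    SKETCH_DID_OK,
    SKETCH_DID_BUS_BUSY,
    SKETCH_DID_COND_REQUEUE   /* the status proposed in this patch set */
};

/* Hypothetical stand-ins for the dispositions the mid-layer can choose. */
enum sketch_disposition {
    SKETCH_SUCCESS,           /* finish the command, hand the result upward */
    SKETCH_ADD_TO_MLQUEUE     /* requeue the command in the mid-layer */
};

struct sketch_cmd {
    enum sketch_host_status host_status;
    bool failfast;            /* stands in for the REQ_FAILFAST request flag */
};

enum sketch_disposition sketch_decide_disposition(const struct sketch_cmd *cmd)
{
    assert(cmd != NULL);

    if (cmd->host_status == SKETCH_DID_COND_REQUEUE) {
        /* Requeue if and only if the request is not fail-fast.
         * Completing the fail-fast case is an assumption of this sketch. */
        return cmd->failfast ? SKETCH_SUCCESS : SKETCH_ADD_TO_MLQUEUE;
    }

    /* Every other status is out of scope here; assume the command completes. */
    return SKETCH_SUCCESS;
}
```

Under these assumptions, a fail-fast request (the multipath failover case) gets control of the command back immediately, while any other request is requeued.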
* Re: [PATCH 0/2] : definition, code, and use of new SCSI ML host status DID_COND_REQUEUE
From: James Bottomley @ 2007-02-02 22:54 UTC
To: Edward Goggin; +Cc: linux-scsi, eric.moore

On Fri, 2007-02-02 at 17:04 -0500, Edward Goggin wrote:
> Patch Set Summary:
>
> 1 Define new SCSI ML host status DID_COND_REQUEUE and
>   add its handling code to scsi_decide_disposition.
>   Scsi_decide_disposition returns ADD_TO_MLQUEUE IFF
>   not REQ_FAILFAST.
>
> 2 Return DID_COND_REQUEUE instead of DID_BUS_BUSY host status
>   in MPT fusion driver when IOC status is SUCCESS and scsi
>   status is busy.

Please, no.

In the first place, as I already said on the previous thread, I don't
think the driver should be interpreting the BUSY return.

Secondly, the original problem was with fibre devices which seemed to
want FAILFAST on BUSY (which looks very bogus to me), but no-one asked
for this behaviour to be preserved. The original bug report:

        "When a target device responds with BUSY status, the MPT driver
        was sending DID_OK to the SCSI mid layer, which caused the IO to
        be retried indefinitely between the mid layer and the driver. By
        changing the driver return status to DID_BUS_BUSY, the target
        BUSY status can now flow through the mid layer to an upper layer
        Failover driver, which will manage the I/O timeout."

is about behaviour which is now fixed (BUSY is retried for the command's
maximum lifetime but no longer).

Thirdly, the VMware issue was that the fibre fix was causing your
implementation to time out too fast.

The solution, then, as I said previously, should simply be to pass the
BUSY status up unmodified from the fusion driver.

James
* Re: [PATCH 0/2] : definition, code, and use of new SCSI ML host status DID_COND_REQUEUE
From: Edward Goggin @ 2007-02-02 23:11 UTC
To: James Bottomley; +Cc: linux-scsi, eric.moore

On Fri, 2007-02-02 at 16:54 -0600, James Bottomley wrote:
> On Fri, 2007-02-02 at 17:04 -0500, Edward Goggin wrote:
> > Patch Set Summary:
> >
> > 1 Define new SCSI ML host status DID_COND_REQUEUE and
> >   add its handling code to scsi_decide_disposition.
> >   Scsi_decide_disposition returns ADD_TO_MLQUEUE IFF
> >   not REQ_FAILFAST.
> >
> > 2 Return DID_COND_REQUEUE instead of DID_BUS_BUSY host status
> >   in MPT fusion driver when IOC status is SUCCESS and scsi
> >   status is busy.
>
> Please, no.
>
> In the first place, as I already said on the previous thread, I don't
> think the driver should be interpreting the BUSY return.
>
> Secondly, the original problem was with fibre devices which seemed to
> want FAILFAST on BUSY (which looks very bogus to me), but no-one asked
> for this behaviour to be preserved. The original bug report:
>
>         "When a target device responds with BUSY status, the MPT driver
>         was sending DID_OK to the SCSI mid layer, which caused the IO to
>         be retried indefinitely between the mid layer and the driver. By
>         changing the driver return status to DID_BUS_BUSY, the target
>         BUSY status can now flow through the mid layer to an upper layer
>         Failover driver, which will manage the I/O timeout."
>
> is about behaviour which is now fixed (BUSY is retried for the command's
> maximum lifetime but no longer).
>
> Thirdly, the VMware issue was that the fibre fix was causing your
> implementation to time out too fast.
>
> The solution, then, as I said previously should simply be to pass the
> BUSY status up unmodified from the fusion driver.
>

That solution doesn't work for the RDAC/MPP driver, as the BUSY status
handler retries indefinitely. We need a solution which works for both a
bare metal host running RDAC/MPP, which for this use case wants to get
control over the failed command ASAP, and a VMware host, which may need
to retry longer than DID_BUS_BUSY currently allows for.

I'll let the LSI/Engenio people comment further on their needs.

> James
* Re: [PATCH 0/2] : definition, code, and use of new SCSI ML host status DID_COND_REQUEUE
From: James Bottomley @ 2007-02-02 23:18 UTC
To: Edward Goggin; +Cc: linux-scsi, eric.moore

On Fri, 2007-02-02 at 18:11 -0500, Edward Goggin wrote:
> That solution doesn't work for the RDAC/MPP driver as the BUSY status
> handler retries indefinitely. We need a solution which works for both a
> bare metal host running RDAC/MPP which for this use case, wants to get
> control over the failed command ASAP and a VMware host which may need to
> retry longer than DID_BUS_BUSY currently allows for.

No it doesn't, not any longer... the mid-layer retries for the command
up to its timeout before failing. That's the point about questioning
the validity of the original problem.

James
* Re: [PATCH 0/2] : definition, code, and use of new SCSI ML host status DID_COND_REQUEUE
From: Edward Goggin @ 2007-02-02 23:33 UTC
To: James Bottomley; +Cc: linux-scsi, eric.moore

On Fri, 2007-02-02 at 17:18 -0600, James Bottomley wrote:
> On Fri, 2007-02-02 at 18:11 -0500, Edward Goggin wrote:
> > That solution doesn't work for the RDAC/MPP driver as the BUSY status
> > handler retries indefinitely. We need a solution which works for both a
> > bare metal host running RDAC/MPP which for this use case, wants to get
> > control over the failed command ASAP and a VMware host which may need to
> > retry longer than DID_BUS_BUSY currently allows for.
>
> No it doesn't, not any longer... the mid-layer retries for the command
> up to its timeout before failing. That's the point about questioning
> the validity of the original problem.
>
> James

I think I see your argument ... retries for BUSY and all other SCSI/host
statuses are limited by the code in scsi_softirq_done, which filters the
disposition returned by scsi_decide_disposition, so no status will yield
an indefinite retry. It is not clear whether that's soon enough for
RDAC/MPP. For the VMware case, it appears to allow an additional 30
seconds (beyond what DID_BUS_BUSY would allow) for a retry.
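The filtering idea described in this last message (a retry-type disposition is overridden once the command has outlived its allowed lifetime) might look roughly like the sketch below. It is a userspace illustration under stated assumptions, not the code in scsi_softirq_done: all `sketch_`/`SKETCH_` names are hypothetical, and how the lifetime budget is computed is left out because the thread does not specify it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for mid-layer dispositions. */
enum sketch_disposition {
    SKETCH_SUCCESS,           /* finish the command, hand the result upward */
    SKETCH_NEEDS_RETRY,       /* retry the command */
    SKETCH_ADD_TO_MLQUEUE     /* requeue the command in the mid-layer */
};

struct sketch_timed_cmd {
    unsigned long age_seconds;       /* time since the command was first issued */
    unsigned long lifetime_seconds;  /* hypothetical maximum lifetime budget */
};

/*
 * Override any retry/requeue decision once the command is older than its
 * lifetime budget, so that no status can cause an indefinite retry.
 */
enum sketch_disposition
sketch_filter_disposition(const struct sketch_timed_cmd *cmd,
                          enum sketch_disposition decided)
{
    assert(cmd != NULL);

    if (decided != SKETCH_SUCCESS && cmd->age_seconds > cmd->lifetime_seconds)
        return SKETCH_SUCCESS;   /* stop retrying; complete the command */

    return decided;
}
```

With a filter of this shape, whether a BUSY command is retried or requeued matters only until the lifetime budget is spent, which is the point James makes about retries no longer being indefinite.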