* [PATCH V2 0/1] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
@ 2013-04-16 20:26 wenxiong
2013-04-16 20:26 ` [PATCH V2 1/1] " wenxiong
0 siblings, 1 reply; 3+ messages in thread
From: wenxiong @ 2013-04-16 20:26 UTC (permalink / raw)
To: James.Bottomley; +Cc: linux-scsi, brking
After we discussed your comments, We intergated your patch and
generated this updated patch.
Please help us review the new patch. If it is ok, you are free
to add your name in sign-off list.
Thanks for your help!
Wendy
--
^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH V2 1/1] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
2013-04-16 20:26 [PATCH V2 0/1] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd wenxiong
@ 2013-04-16 20:26 ` wenxiong
2013-04-23 14:52 ` Hannes Reinecke
0 siblings, 1 reply; 3+ messages in thread
From: wenxiong @ 2013-04-16 20:26 UTC (permalink / raw)
To: James.Bottomley; +Cc: linux-scsi, brking, Wen Xiong
[-- Attachment #1: scsi_error_v2 --]
[-- Type: text/plain, Size: 2582 bytes --]
We discussed James's concern. We intergated James's patch and generated
this updated patch.
Fix scsi_send_eh_cmnd to check the return code of queuecommand when
sending commands and retry for a bit if the LLDD returns a busy response.
This fixes an issue seen with the ipr driver where an ipr initiated reset
immediately following an eh_host_reset caused EH initiated commands to fail,
resulting in devices being taken offline. This patch resolves the issue.
Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
---
drivers/scsi/scsi_error.c | 34 +++++++++++++++++++++++++---------
1 file changed, 25 insertions(+), 9 deletions(-)
Index: b/drivers/scsi/scsi_error.c
===================================================================
--- a/drivers/scsi/scsi_error.c 2013-04-16 12:56:16.617857960 -0500
+++ b/drivers/scsi/scsi_error.c 2013-04-16 12:57:00.279108838 -0500
@@ -791,18 +791,21 @@ static int scsi_send_eh_cmnd(struct scsi
struct scsi_device *sdev = scmd->device;
struct Scsi_Host *shost = sdev->host;
DECLARE_COMPLETION_ONSTACK(done);
- unsigned long timeleft;
+ unsigned long timeleft = timeout;
struct scsi_eh_save ses;
+ const int stall_for = min(HZ/10, 1);
int rtn;
scsi_eh_prep_cmnd(scmd, &ses, cmnd, cmnd_size, sense_bytes);
+retry:
shost->eh_action = &done;
scsi_log_send(scmd);
scmd->scsi_done = scsi_eh_done;
- shost->hostt->queuecommand(shost, scmd);
+ rtn = shost->hostt->queuecommand(shost, scmd);
- timeleft = wait_for_completion_timeout(&done, timeout);
+ if (!rtn)
+ timeleft = wait_for_completion_timeout(&done, timeout);
shost->eh_action = NULL;
@@ -819,10 +822,19 @@ static int scsi_send_eh_cmnd(struct scsi
* about this command.
*/
if (timeleft) {
- rtn = scsi_eh_completed_normally(scmd);
- SCSI_LOG_ERROR_RECOVERY(3,
- printk("%s: scsi_eh_completed_normally %x\n",
- __func__, rtn));
+ switch (rtn) {
+ case 0:
+ rtn = scsi_eh_completed_normally(scmd);
+ SCSI_LOG_ERROR_RECOVERY(3,
+ printk("%s: scsi_eh_completed_normally %x\n",
+ __func__, rtn));
+ break;
+ case FAILED:
+ break;
+ default:
+ rtn = ADD_TO_MLQUEUE;
+ break;
+ }
switch (rtn) {
case SUCCESS:
@@ -831,8 +843,12 @@ static int scsi_send_eh_cmnd(struct scsi
case TARGET_ERROR:
break;
case ADD_TO_MLQUEUE:
- rtn = NEEDS_RETRY;
- break;
+ if (timeleft > stall_for) {
+ timeout = timeleft - stall_for;
+ msleep(stall_for);
+ goto retry;
+ }
+ /* fall through */
default:
rtn = FAILED;
break;
--
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH V2 1/1] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
2013-04-16 20:26 ` [PATCH V2 1/1] " wenxiong
@ 2013-04-23 14:52 ` Hannes Reinecke
0 siblings, 0 replies; 3+ messages in thread
From: Hannes Reinecke @ 2013-04-23 14:52 UTC (permalink / raw)
To: wenxiong; +Cc: James.Bottomley, linux-scsi, brking
On 04/16/2013 10:26 PM, wenxiong@linux.vnet.ibm.com wrote:
> We discussed James's concern. We intergated James's patch and generated
> this updated patch.
>
> Fix scsi_send_eh_cmnd to check the return code of queuecommand when
> sending commands and retry for a bit if the LLDD returns a busy response.
> This fixes an issue seen with the ipr driver where an ipr initiated reset
> immediately following an eh_host_reset caused EH initiated commands to fail,
> resulting in devices being taken offline. This patch resolves the issue.
>
>
> Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
> Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
> ---
> drivers/scsi/scsi_error.c | 34 +++++++++++++++++++++++++---------
> 1 file changed, 25 insertions(+), 9 deletions(-)
>
> Index: b/drivers/scsi/scsi_error.c
> ===================================================================
> --- a/drivers/scsi/scsi_error.c 2013-04-16 12:56:16.617857960 -0500
> +++ b/drivers/scsi/scsi_error.c 2013-04-16 12:57:00.279108838 -0500
> @@ -791,18 +791,21 @@ static int scsi_send_eh_cmnd(struct scsi
> struct scsi_device *sdev = scmd->device;
> struct Scsi_Host *shost = sdev->host;
> DECLARE_COMPLETION_ONSTACK(done);
> - unsigned long timeleft;
> + unsigned long timeleft = timeout;
> struct scsi_eh_save ses;
> + const int stall_for = min(HZ/10, 1);
> int rtn;
>
> scsi_eh_prep_cmnd(scmd, &ses, cmnd, cmnd_size, sense_bytes);
> +retry:
> shost->eh_action = &done;
>
> scsi_log_send(scmd);
> scmd->scsi_done = scsi_eh_done;
> - shost->hostt->queuecommand(shost, scmd);
> + rtn = shost->hostt->queuecommand(shost, scmd);
>
> - timeleft = wait_for_completion_timeout(&done, timeout);
> + if (!rtn)
> + timeleft = wait_for_completion_timeout(&done, timeout);
>
> shost->eh_action = NULL;
>
Hmm. This seems to be a generic bug fix here; it is perfectly
ok for queuecommand() to return a non-zero value without calling
->scsi_done. At which point ->complete is pointless to try as no
completion ever would be invoked.
Mind separating that out as a separate patch?
> @@ -819,10 +822,19 @@ static int scsi_send_eh_cmnd(struct scsi
> * about this command.
> */
> if (timeleft) {
> - rtn = scsi_eh_completed_normally(scmd);
> - SCSI_LOG_ERROR_RECOVERY(3,
> - printk("%s: scsi_eh_completed_normally %x\n",
> - __func__, rtn));
> + switch (rtn) {
> + case 0:
> + rtn = scsi_eh_completed_normally(scmd);
> + SCSI_LOG_ERROR_RECOVERY(3,
> + printk("%s: scsi_eh_completed_normally %x\n",
> + __func__, rtn));
> + break;
> + case FAILED:
> + break;
> + default:
> + rtn = ADD_TO_MLQUEUE;
> + break;
> + }
>
> switch (rtn) {
> case SUCCESS:
Bzzt.
'FAILED' is a valid response for scsi_eh_completed_normally, but not
for ->queuecommand.
> @@ -831,8 +843,12 @@ static int scsi_send_eh_cmnd(struct scsi
> case TARGET_ERROR:
> break;
> case ADD_TO_MLQUEUE:
> - rtn = NEEDS_RETRY;
> - break;
> + if (timeleft > stall_for) {
> + timeout = timeleft - stall_for;
> + msleep(stall_for);
> + goto retry;
> + }
> + /* fall through */
> default:
> rtn = FAILED;
> break;
>
We're already calling 'wait_for_completion_timeout'.
So normally the 'msleep' wouldn't be necessary.
It's only required for the case when ->queuecommand
returns non-zero.
So I'd rather see to have the msleep into section
where the return command from queuecommand is evaluated;
here we'll actually decrease responsiveness as
we're always waiting for X seconds, even if the command
would've been completed during that time.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2013-04-23 14:52 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-04-16 20:26 [PATCH V2 0/1] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd wenxiong
2013-04-16 20:26 ` [PATCH V2 1/1] " wenxiong
2013-04-23 14:52 ` Hannes Reinecke
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).