* [PATCH] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
From: Hannes Reinecke @ 2013-04-25 6:10 UTC
To: James Bottomley; +Cc: linux-scsi, Hannes Reinecke, Wen Xiong, Brian King

scsi_send_eh_cmnd() is calling queuecommand() directly, so
it needs to check the return value here.
The only valid return codes for queuecommand() are 'busy'
states, so we need to wait for a bit to allow the LLDD
to recover.

Based on an earlier patch from Wen Xiong.

Cc: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index d58db32..6a3c1d2 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -889,22 +889,32 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
 	struct scsi_device *sdev = scmd->device;
 	struct Scsi_Host *shost = sdev->host;
 	DECLARE_COMPLETION_ONSTACK(done);
-	unsigned long timeleft;
+	unsigned long timeleft = timeout;
 	struct scsi_eh_save ses;
+	const int stall_for = min(HZ/10, 1);
 	int rtn;
 
+retry:
 	scsi_eh_prep_cmnd(scmd, &ses, cmnd, cmnd_size, sense_bytes);
 	shost->eh_action = &done;
 
 	scsi_log_send(scmd);
 	scmd->scsi_done = scsi_eh_done;
-	shost->hostt->queuecommand(shost, scmd);
-
-	timeleft = wait_for_completion_timeout(&done, timeout);
+	rtn = shost->hostt->queuecommand(shost, scmd);
+	if (rtn) {
+		if (timeleft) {
+			scsi_eh_restore_cmnd(scmd, &ses);
+			timeleft -= stall_for;
+			msleep(stall_for);
+			goto retry;
+		}
+		rtn = NEEDS_RETRY;
+	} else
+		timeleft = wait_for_completion_timeout(&done, timeout);
 
 	shost->eh_action = NULL;
 
-	scsi_log_completion(scmd, SUCCESS);
+	scsi_log_completion(scmd, rtn);
 
 	SCSI_LOG_ERROR_RECOVERY(3,
 		printk("%s: scmd: %p, timeleft: %ld\n",
@@ -935,7 +945,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
 			rtn = FAILED;
 			break;
 		}
-	} else {
+	} else if (!rtn) {
 		scsi_abort_eh_cmnd(scmd);
 		rtn = FAILED;
 	}

* Re: [PATCH] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
From: wenxiong @ 2013-05-03 14:24 UTC
To: Hannes Reinecke; +Cc: James Bottomley, linux-scsi, Brian King

Quoting Hannes Reinecke <hare@suse.de>:

> scsi_send_eh_cmnd() is calling queuecommand() directly, so
> it needs to check the return value here.
> The only valid return codes for queuecommand() are 'busy'
> states, so we need to wait for a bit to allow the LLDD
> to recover.
>
> Based on an earlier patch from Wen Xiong.
>
> Cc: Wen Xiong <wenxiong@linux.vnet.ibm.com>
> Cc: Brian King <brking@linux.vnet.ibm.com>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
>

Hi James,

I have verified this patch with two new ipr adapters. EEH errors can be
recovered successfully.
Do you have any questions about this new patch?

Thanks,
Wendy

> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index d58db32..6a3c1d2 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -889,22 +889,32 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
>  	struct scsi_device *sdev = scmd->device;
>  	struct Scsi_Host *shost = sdev->host;
>  	DECLARE_COMPLETION_ONSTACK(done);
> -	unsigned long timeleft;
> +	unsigned long timeleft = timeout;
>  	struct scsi_eh_save ses;
> +	const int stall_for = min(HZ/10, 1);
>  	int rtn;
>
> +retry:
>  	scsi_eh_prep_cmnd(scmd, &ses, cmnd, cmnd_size, sense_bytes);
>  	shost->eh_action = &done;
>
>  	scsi_log_send(scmd);
>  	scmd->scsi_done = scsi_eh_done;
> -	shost->hostt->queuecommand(shost, scmd);
> -
> -	timeleft = wait_for_completion_timeout(&done, timeout);
> +	rtn = shost->hostt->queuecommand(shost, scmd);
> +	if (rtn) {
> +		if (timeleft) {
> +			scsi_eh_restore_cmnd(scmd, &ses);
> +			timeleft -= stall_for;
> +			msleep(stall_for);
> +			goto retry;
> +		}
> +		rtn = NEEDS_RETRY;
> +	} else
> +		timeleft = wait_for_completion_timeout(&done, timeout);
>
>  	shost->eh_action = NULL;
>
> -	scsi_log_completion(scmd, SUCCESS);
> +	scsi_log_completion(scmd, rtn);
>
>  	SCSI_LOG_ERROR_RECOVERY(3,
>  		printk("%s: scmd: %p, timeleft: %ld\n",
> @@ -935,7 +945,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
>  			rtn = FAILED;
>  			break;
>  		}
> -	} else {
> +	} else if (!rtn) {
>  		scsi_abort_eh_cmnd(scmd);
>  		rtn = FAILED;
>  	}

* Re: [PATCH] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
From: James Bottomley @ 2013-05-03 18:23 UTC
To: wenxiong@linux.vnet.ibm.com
Cc: Hannes Reinecke, linux-scsi@vger.kernel.org, Brian King

On Fri, 2013-05-03 at 10:24 -0400, wenxiong@linux.vnet.ibm.com wrote:
> Quoting Hannes Reinecke <hare@suse.de>:
>
> > scsi_send_eh_cmnd() is calling queuecommand() directly, so
> > it needs to check the return value here.
> > The only valid return codes for queuecommand() are 'busy'
> > states, so we need to wait for a bit to allow the LLDD
> > to recover.
> >
> > Based on an earlier patch from Wen Xiong.
> >
> > Cc: Wen Xiong <wenxiong@linux.vnet.ibm.com>
> > Cc: Brian King <brking@linux.vnet.ibm.com>
> > Signed-off-by: Hannes Reinecke <hare@suse.de>
> >
> Hi James,
>
> I have verified this patch with two new ipr adapters. EEH errors can be
> recovered successfully.
> Do you have any questions about this new patch?

Yes, it's not correct: stall_for is in jiffies not msec, so
msleep(stall_for) is taking far too long. Plus you don't know that
stall_for is a divisor of timeleft, so timeleft -= stall_for could wrap
and cause the retry loop to go on effectively forever.

I think the below is the correct patch, so I'll commit it unless there
are objections.

James

---
From a74498dda7acf6f5fb99e43df54f2c1f5c6beec9 Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@suse.de>
Date: Thu, 25 Apr 2013 08:10:00 +0200
Subject: [PATCH] [SCSI] Handle MLQUEUE busy response in scsi_send_eh_cmnd

scsi_send_eh_cmnd() is calling queuecommand() directly, so
it needs to check the return value here.
The only valid return codes for queuecommand() are 'busy'
states, so we need to wait for a bit to allow the LLDD
to recover.

Based on an earlier patch from Wen Xiong.

[jejb: fix confusion between msec and jiffies values and other issues]
Cc: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index c1b05a8..cfd1ef2 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -25,6 +25,7 @@
 #include <linux/interrupt.h>
 #include <linux/blkdev.h>
 #include <linux/delay.h>
+#include <linux/jiffies.h>
 
 #include <scsi/scsi.h>
 #include <scsi/scsi_cmnd.h>
@@ -791,22 +792,33 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
 	struct scsi_device *sdev = scmd->device;
 	struct Scsi_Host *shost = sdev->host;
 	DECLARE_COMPLETION_ONSTACK(done);
-	unsigned long timeleft;
+	unsigned long timeleft = timeout;
 	struct scsi_eh_save ses;
+	const unsigned long stall_for = min(msecs_to_jiffies(10), 1UL);
 	int rtn;
 
+retry:
 	scsi_eh_prep_cmnd(scmd, &ses, cmnd, cmnd_size, sense_bytes);
 	shost->eh_action = &done;
 
 	scsi_log_send(scmd);
 	scmd->scsi_done = scsi_eh_done;
-	shost->hostt->queuecommand(shost, scmd);
-
-	timeleft = wait_for_completion_timeout(&done, timeout);
+	rtn = shost->hostt->queuecommand(shost, scmd);
+	if (rtn) {
+		if (timeleft > stall_for) {
+			scsi_eh_restore_cmnd(scmd, &ses);
+			timeleft -= stall_for;
+			msleep(jiffies_to_msecs(stall_for));
+			goto retry;
+		}
+		timeleft = 0;
+		rtn = NEEDS_RETRY;
+	} else
+		timeleft = wait_for_completion_timeout(&done, timeout);
 
 	shost->eh_action = NULL;
 
-	scsi_log_completion(scmd, SUCCESS);
+	scsi_log_completion(scmd, rtn);
 
 	SCSI_LOG_ERROR_RECOVERY(3,
 		printk("%s: scmd: %p, timeleft: %ld\n",
@@ -837,7 +849,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
 			rtn = FAILED;
 			break;
 		}
-	} else {
+	} else if (!rtn) {
 		scsi_abort_eh_cmnd(scmd);
 		rtn = FAILED;
 	}

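The wrap concern raised in the message above can be illustrated outside the kernel. What follows is only a userspace sketch, not kernel code: the function names and the iteration cap are hypothetical and exist purely for this illustration. It compares a retry loop that continues while timeleft is non-zero with one that continues only while timeleft > stall_for.

```c
#include <assert.h>

/* Unsigned subtraction wraps around instead of going negative. */
static unsigned long wrapped_difference(unsigned long timeleft,
					unsigned long stall_for)
{
	return timeleft - stall_for;
}

/*
 * Model of a loop that retries while any budget is left.
 * "cap" is a safety bound for this model only, so that the
 * function returns even when the budget wraps and never hits zero.
 */
static unsigned int retries_nonzero_check(unsigned long timeleft,
					  unsigned long stall_for,
					  unsigned int cap)
{
	unsigned int retries = 0;

	assert(stall_for > 0);
	while (timeleft && retries < cap) {
		timeleft -= stall_for;	/* may wrap past zero */
		retries++;
	}
	return retries;
}

/* Model of a loop that retries only while a full stall still fits. */
static unsigned int retries_with_guard(unsigned long timeleft,
				       unsigned long stall_for)
{
	unsigned int retries = 0;

	assert(stall_for > 0);
	while (timeleft > stall_for) {
		timeleft -= stall_for;	/* cannot wrap: timeleft > stall_for */
		retries++;
	}
	return retries;
}
```

With a budget of 250 and a step of 100, the non-zero check only stops at the artificial cap, because 50 - 100 wraps to a very large value; the guarded loop stops after two retries.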
* Re: [PATCH] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
From: Bart Van Assche @ 2013-05-04 17:02 UTC
To: James Bottomley
Cc: wenxiong@linux.vnet.ibm.com, Hannes Reinecke, linux-scsi@vger.kernel.org, Brian King

On 05/03/13 20:23, James Bottomley wrote:
> +	const unsigned long stall_for = min(msecs_to_jiffies(10), 1UL);

Hello James,

Can you please clarify what the intention of this statement is? Is the
purpose of this statement to prevent stall_for from being zero when
HZ < 100? If that is the case, maybe you meant max() instead of min()?
Also, are you aware that msecs_to_jiffies() already rounds up the result
of the division?

Bart.

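The min()/max() question above can be checked with a little arithmetic. The sketch below is a userspace model under stated assumptions: model_msecs_to_jiffies() is a hypothetical, simplified stand-in that only reproduces the round-up behaviour of msecs_to_jiffies() (the real kernel helper also handles overflow and other special cases), and HZ is passed as a parameter instead of being a build-time constant.

```c
#include <assert.h>

/* Simplified stand-in: convert milliseconds to ticks, rounding up. */
static unsigned long model_msecs_to_jiffies(unsigned long ms,
					    unsigned long hz)
{
	assert(hz > 0);
	return (ms * hz + 999) / 1000;
}

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* min(msecs_to_jiffies(10), 1UL): never more than one tick. */
static unsigned long stall_with_min(unsigned long hz)
{
	return min_ul(model_msecs_to_jiffies(10, hz), 1UL);
}

/* max(..., 1UL): at least one tick, otherwise the converted value. */
static unsigned long stall_with_max(unsigned long hz)
{
	return max_ul(model_msecs_to_jiffies(10, hz), 1UL);
}

/* A plain 100 ms conversion with no clamp at all. */
static unsigned long stall_plain_100ms(unsigned long hz)
{
	return model_msecs_to_jiffies(100, hz);
}
```

In this model the min() form collapses to a single tick at any HZ, and because the conversion already rounds up, a non-zero millisecond value never converts to zero ticks, so no lower clamp is needed either.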
* Re: [PATCH] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
From: James Bottomley @ 2013-05-04 18:20 UTC
To: Bart Van Assche
Cc: wenxiong@linux.vnet.ibm.com, Hannes Reinecke, linux-scsi@vger.kernel.org, Brian King

On Sat, 2013-05-04 at 19:02 +0200, Bart Van Assche wrote:
> On 05/03/13 20:23, James Bottomley wrote:
> > +	const unsigned long stall_for = min(msecs_to_jiffies(10), 1UL);
>
> Hello James,
>
> Can you please clarify what the intention of this statement is? Is the
> purpose of this statement to prevent stall_for from being zero when
> HZ < 100? If that is the case, maybe you meant max() instead of min()?
> Also, are you aware that msecs_to_jiffies() already rounds up the result
> of the division?

Yes, I thought afterwards I should dump the bogus min statement as well.
Plus HZ/10 is actually 100ms, so the value is 10x wrong. I've fixed it
up below (plus a bit of comment rework and some style fixes).

Thanks,

James

---
From 4bd9ef9789ad86656d8e52e8fff5422b741097e1 Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@suse.de>
Date: Thu, 25 Apr 2013 08:10:00 +0200
Subject: [PATCH] [SCSI] Handle MLQUEUE busy response in scsi_send_eh_cmnd

scsi_send_eh_cmnd() is calling queuecommand() directly, so
it needs to check the return value here.
The only valid return codes for queuecommand() are 'busy'
states, so we need to wait for a bit to allow the LLDD
to recover.

Based on an earlier patch from Wen Xiong.

[jejb: fix confusion between msec and jiffies values and other issues]
[bvanassche: correct stall_for interval]
Cc: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index c1b05a8..f43de1e 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -25,6 +25,7 @@
 #include <linux/interrupt.h>
 #include <linux/blkdev.h>
 #include <linux/delay.h>
+#include <linux/jiffies.h>
 
 #include <scsi/scsi.h>
 #include <scsi/scsi_cmnd.h>
@@ -791,32 +792,48 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
 	struct scsi_device *sdev = scmd->device;
 	struct Scsi_Host *shost = sdev->host;
 	DECLARE_COMPLETION_ONSTACK(done);
-	unsigned long timeleft;
+	unsigned long timeleft = timeout;
 	struct scsi_eh_save ses;
+	const unsigned long stall_for = msecs_to_jiffies(100);
 	int rtn;
 
+retry:
 	scsi_eh_prep_cmnd(scmd, &ses, cmnd, cmnd_size, sense_bytes);
 	shost->eh_action = &done;
 
 	scsi_log_send(scmd);
 	scmd->scsi_done = scsi_eh_done;
-	shost->hostt->queuecommand(shost, scmd);
-
-	timeleft = wait_for_completion_timeout(&done, timeout);
+	rtn = shost->hostt->queuecommand(shost, scmd);
+	if (rtn) {
+		if (timeleft > stall_for) {
+			scsi_eh_restore_cmnd(scmd, &ses);
+			timeleft -= stall_for;
+			msleep(jiffies_to_msecs(stall_for));
+			goto retry;
+		}
+		/* signal not to enter either branch of the if () below */
+		timeleft = 0;
+		rtn = NEEDS_RETRY;
+	} else {
+		timeleft = wait_for_completion_timeout(&done, timeout);
+	}
 
 	shost->eh_action = NULL;
 
-	scsi_log_completion(scmd, SUCCESS);
+	scsi_log_completion(scmd, rtn);
 
 	SCSI_LOG_ERROR_RECOVERY(3,
 		printk("%s: scmd: %p, timeleft: %ld\n",
 			__func__, scmd, timeleft));
 
 	/*
-	 * If there is time left scsi_eh_done got called, and we will
-	 * examine the actual status codes to see whether the command
-	 * actually did complete normally, else tell the host to forget
-	 * about this command.
+	 * If there is time left scsi_eh_done got called, and we will examine
+	 * the actual status codes to see whether the command actually did
+	 * complete normally, else if we have a zero return and no time left,
+	 * the command must still be pending, so abort it and return FAILED.
+	 * If we never actually managed to issue the command, because
+	 * ->queuecommand() kept returning non zero, use the rtn = FAILED
+	 * value above (so don't execute either branch of the if)
 	 */
 	if (timeleft) {
 		rtn = scsi_eh_completed_normally(scmd);
@@ -837,7 +854,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
 			rtn = FAILED;
 			break;
 		}
-	} else {
+	} else if (!rtn) {
 		scsi_abort_eh_cmnd(scmd);
 		rtn = FAILED;
 	}

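The three-way outcome described in the reworked comment of the final patch can be summarised as a small decision function. This is a userspace sketch only: the enum and the function name are hypothetical, and the values are not the kernel's SUCCESS, FAILED or NEEDS_RETRY constants. It follows the posted diff, which assigns NEEDS_RETRY on the never-issued path even though the comment text mentions FAILED.

```c
#include <assert.h>

/* Hypothetical result codes for this model only. */
enum model_result { MODEL_SUCCESS, MODEL_FAILED, MODEL_NEEDS_RETRY };

/*
 * timeleft:      remaining budget after waiting (0 means timed out,
 *                or forced to 0 because the command was never queued)
 * queue_rtn:     last return value of the queueing step (0 means queued)
 * status_result: what examining the completed command would yield
 */
static enum model_result model_disposition(unsigned long timeleft,
					   int queue_rtn,
					   enum model_result status_result)
{
	/* The patch forces timeleft to 0 whenever queueing finally failed. */
	assert(!(timeleft && queue_rtn));

	if (timeleft)
		return status_result;	/* completed in time: examine status */
	else if (!queue_rtn)
		return MODEL_FAILED;	/* queued but timed out: abort it */

	return MODEL_NEEDS_RETRY;	/* never queued: neither branch runs */
}
```

The point of setting timeleft to 0 together with a non-zero rtn is visible here: it lets the existing if/else fall through without aborting a command that was never handed to the driver.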