public inbox for linux-kernel@vger.kernel.org
* [PATCH/RFT] libata "DMA timeout" fix
@ 2004-02-28 19:10 Jeff Garzik
  2004-02-28 20:49 ` James Bottomley
From: Jeff Garzik @ 2004-02-28 19:10 UTC
  To: Linux Kernel, linux-ide, SCSI Mailing List



The desired effect of a DMA timeout should be to throw an I/O error, but 
that doesn't appear to be happening.

Those seeing DMA timeout messages, please test this patch.

Kernel hacker note:  James B recommended that I implement my own 
scsi_done() function, which duplicates the real scsi_done() but omits 
the scsi_delete_timer() call.  This is probably the best long term fix, 
but doing so involves exporting several currently-private bits of SCSI 
mid-layer, which I would rather not do.  Probably best to create a 
__scsi_done() inside the SCSI mid-layer, and call that.
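
A rough userspace sketch of that split, for illustration only.  It
assumes the first check in scsi_done() is a test of the
scsi_delete_timer() return value, as the note above implies.  Every
model_* name and the struct are hypothetical stand-ins, not mid-layer
code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; this is not the kernel's struct scsi_cmnd. */
struct model_cmd {
	bool timer_pending;	/* true while the command timer is armed */
	bool completed;		/* set once the command has been finished */
};

/* Stand-in for scsi_delete_timer(): returns true only if the timer
 * was still armed, i.e. the command has not already timed out.
 */
static bool model_delete_timer(struct model_cmd *cmd)
{
	bool was_pending;

	assert(cmd != NULL);
	was_pending = cmd->timer_pending;
	cmd->timer_pending = false;
	return was_pending;
}

/* Hypothetical __scsi_done(): the completion work, with no timer check. */
static void model_scsi_done_nocheck(struct model_cmd *cmd)
{
	cmd->completed = true;
}

/* Stand-in for scsi_done(): gives up if the timer has already fired
 * (assumed behaviour), otherwise does the real completion work.
 */
static void model_scsi_done(struct model_cmd *cmd)
{
	if (!model_delete_timer(cmd))
		return;
	model_scsi_done_nocheck(cmd);
}
```

In this model, a command whose timer has already fired is silently
dropped by model_scsi_done(), while model_scsi_done_nocheck() still
completes it.  That is the behaviour an error handler thread would
want.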

	Jeff, the only user of ->eh_strategy_handler() in any kernel




[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 624 bytes --]

===== drivers/scsi/libata-core.c 1.19 vs edited =====
--- 1.19/drivers/scsi/libata-core.c	Wed Feb 25 22:41:13 2004
+++ edited/drivers/scsi/libata-core.c	Sat Feb 28 14:03:18 2004
@@ -2130,6 +2130,14 @@
 				cmd->result = SAM_STAT_CHECK_CONDITION;
 			else
 				ata_to_sense_error(qc);
+
+			/* hack alert! we need this to get past the
+			 * first check in scsi_done().  libata is the
+			 * -only- user of ->eh_strategy_handler() in
+			 * any kernel tree, which exposes some incorrect
+			 * assumptions in the SCSI layer.
+			 */
+			scsi_add_timer(cmd, 2000 * HZ, NULL);
 		} else {
 			cmd->result = SAM_STAT_GOOD;
 		}


* Re: [PATCH/RFT] libata "DMA timeout" fix
  2004-02-28 19:10 [PATCH/RFT] libata "DMA timeout" fix Jeff Garzik
@ 2004-02-28 20:49 ` James Bottomley
  2004-02-28 21:23   ` Jeff Garzik
From: James Bottomley @ 2004-02-28 20:49 UTC
  To: Jeff Garzik; +Cc: Linux Kernel, linux-ide, SCSI Mailing List

On Sat, 2004-02-28 at 13:10, Jeff Garzik wrote:
> ===== drivers/scsi/libata-core.c 1.19 vs edited =====
> --- 1.19/drivers/scsi/libata-core.c	Wed Feb 25 22:41:13 2004
> +++ edited/drivers/scsi/libata-core.c	Sat Feb 28 14:03:18 2004
> @@ -2130,6 +2130,14 @@
>  				cmd->result = SAM_STAT_CHECK_CONDITION;
>  			else
>  				ata_to_sense_error(qc);
> +
> +			/* hack alert! we need this to get past the
> +			 * first check in scsi_done().  libata is the
> +			 * -only- user of ->eh_strategy_handler() in
> +			 * any kernel tree, which exposes some incorrect
> +			 * assumptions in the SCSI layer.
> +			 */
> +			scsi_add_timer(cmd, 2000 * HZ, NULL);
>  		} else {
>  			cmd->result = SAM_STAT_GOOD;
>  		}

You can't do this.  Suppose the command is delayed, the timer fires,
and then the command returns with a sense error?  The done will go
through, automatically completing the command, but your strategy handler
will still think it has a failed command to handle.
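
The sequence can be sketched as a tiny userspace model.  This is
illustration only: the struct, the model_* names and the eh_owned flag
are all hypothetical, and scsi_done() is assumed to drop any command
whose timer is no longer armed:

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; not the kernel's struct scsi_cmnd. */
struct model_cmd {
	bool timer_pending;	/* command timer is armed */
	bool eh_owned;		/* error handler believes it must handle this */
	bool completed;		/* command was returned to the upper layer */
};

/* Timer expiry: the command is handed to the error handler. */
static void model_timer_fires(struct model_cmd *cmd)
{
	assert(cmd->timer_pending);
	cmd->timer_pending = false;
	cmd->eh_owned = true;
}

/* Stand-in for scsi_done(): ignores commands that already timed out. */
static void model_scsi_done(struct model_cmd *cmd)
{
	if (!cmd->timer_pending)
		return;
	cmd->timer_pending = false;
	cmd->completed = true;
}

/* Late completion with the proposed hack: re-arm the timer (stands in
 * for the scsi_add_timer() call), then call the normal done function.
 */
static void model_complete_with_hack(struct model_cmd *cmd)
{
	cmd->timer_pending = true;
	model_scsi_done(cmd);
}

/* True when the command is finished but the error handler still owns it. */
static bool model_is_inconsistent(const struct model_cmd *cmd)
{
	return cmd->completed && cmd->eh_owned;
}
```

Running model_timer_fires() and then model_complete_with_hack() on the
same command leaves it both completed and owned by the error handler,
which is the inconsistent state described above.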

The correct fix is this, I think (uncompiled, but you get the idea):

===== libata-core.c 1.19 vs edited =====
--- 1.19/drivers/scsi/libata-core.c	Wed Feb 25 21:41:13 2004
+++ edited/libata-core.c	Sat Feb 28 14:46:17 2004
@@ -1972,6 +1972,11 @@
 	/* FIXME */
 }
 
+static void ata_eng_timeout_done(struct scsi_cmnd *cmnd)
+{
+	scsi_finish_command(cmnd);
+}
+
 /**
  *	ata_eng_timeout - Handle timeout of queued command
  *	@ap: Port on which timed-out command is active
@@ -2005,6 +2010,7 @@
 		goto out;
 	}
 
+	qc->scsidone = ata_eng_timeout_done;
 	switch (qc->tf.protocol) {
 	case ATA_PROT_DMA_READ:
 	case ATA_PROT_DMA_WRITE:

James




* Re: [PATCH/RFT] libata "DMA timeout" fix
  2004-02-28 20:49 ` James Bottomley
@ 2004-02-28 21:23   ` Jeff Garzik
  2004-02-28 21:28     ` James Bottomley
  2004-02-29 17:26     ` Justin Cormack
From: Jeff Garzik @ 2004-02-28 21:23 UTC
  To: James Bottomley; +Cc: Linux Kernel, linux-ide, SCSI Mailing List

James Bottomley wrote:
> On Sat, 2004-02-28 at 13:10, Jeff Garzik wrote:
> 
>>===== drivers/scsi/libata-core.c 1.19 vs edited =====
>>--- 1.19/drivers/scsi/libata-core.c	Wed Feb 25 22:41:13 2004
>>+++ edited/drivers/scsi/libata-core.c	Sat Feb 28 14:03:18 2004
>>@@ -2130,6 +2130,14 @@
>> 				cmd->result = SAM_STAT_CHECK_CONDITION;
>> 			else
>> 				ata_to_sense_error(qc);
>>+
>>+			/* hack alert! we need this to get past the
>>+			 * first check in scsi_done().  libata is the
>>+			 * -only- user of ->eh_strategy_handler() in
>>+			 * any kernel tree, which exposes some incorrect
>>+			 * assumptions in the SCSI layer.
>>+			 */
>>+			scsi_add_timer(cmd, 2000 * HZ, NULL);
>> 		} else {
>> 			cmd->result = SAM_STAT_GOOD;
>> 		}
> 
> 
> You can't do this.  Suppose the command is delayed, the timer fires,
> and then the command returns with a sense error?  The done will go
> through, automatically completing the command, but your strategy handler
> will still think it has a failed command to handle.

hmmm, yeah that will be a problem iff we are not already in the strategy 
handler.


> The correct fix is this, I think (uncompiled, but you get the idea):

Yeah, that's much better.  That function is not exported though ;-)

	Jeff





* Re: [PATCH/RFT] libata "DMA timeout" fix
  2004-02-28 21:23   ` Jeff Garzik
@ 2004-02-28 21:28     ` James Bottomley
  2004-02-29 17:26     ` Justin Cormack
From: James Bottomley @ 2004-02-28 21:28 UTC
  To: Jeff Garzik; +Cc: Linux Kernel, linux-ide, SCSI Mailing List

On Sat, 2004-02-28 at 15:23, Jeff Garzik wrote:
> Yeah, that's much better.  That function is not exported though ;-)

I can fix that.  It really is a necessary function for drivers doing
their own strategy handler ... of which yours seems to be the only one.

James




* Re: [PATCH/RFT] libata "DMA timeout" fix
  2004-02-28 21:23   ` Jeff Garzik
  2004-02-28 21:28     ` James Bottomley
@ 2004-02-29 17:26     ` Justin Cormack
  2004-02-29 17:31       ` Jeff Garzik
From: Justin Cormack @ 2004-02-29 17:26 UTC
  To: Jeff Garzik; +Cc: James Bottomley, Linux Kernel

Thanks, testing James's fix on one machine at the moment. Will have
another six or so machines as a libata test farm tomorrow.

Justin


On Sat, 2004-02-28 at 21:23, Jeff Garzik wrote:
> James Bottomley wrote:
> > On Sat, 2004-02-28 at 13:10, Jeff Garzik wrote:
> > 
> >>===== drivers/scsi/libata-core.c 1.19 vs edited =====
> >>--- 1.19/drivers/scsi/libata-core.c	Wed Feb 25 22:41:13 2004
> >>+++ edited/drivers/scsi/libata-core.c	Sat Feb 28 14:03:18 2004
> >>@@ -2130,6 +2130,14 @@
> >> 				cmd->result = SAM_STAT_CHECK_CONDITION;
> >> 			else
> >> 				ata_to_sense_error(qc);
> >>+
> >>+			/* hack alert! we need this to get past the
> >>+			 * first check in scsi_done().  libata is the
> >>+			 * -only- user of ->eh_strategy_handler() in
> >>+			 * any kernel tree, which exposes some incorrect
> >>+			 * assumptions in the SCSI layer.
> >>+			 */
> >>+			scsi_add_timer(cmd, 2000 * HZ, NULL);
> >> 		} else {
> >> 			cmd->result = SAM_STAT_GOOD;
> >> 		}
> > 
> > 
> > You can't do this.  Suppose the command is delayed, the timer fires,
> > and then the command returns with a sense error?  The done will go
> > through, automatically completing the command, but your strategy handler
> > will still think it has a failed command to handle.
> 
> hmmm, yeah that will be a problem iff we are not already in the strategy 
> handler.
> 
> 
> > The correct fix is this, I think (uncompiled, but you get the idea):
> 
> Yeah, that's much better.  That function is not exported though ;-)
> 
> 	Jeff



* Re: [PATCH/RFT] libata "DMA timeout" fix
  2004-02-29 17:26     ` Justin Cormack
@ 2004-02-29 17:31       ` Jeff Garzik
From: Jeff Garzik @ 2004-02-29 17:31 UTC
  To: Justin Cormack; +Cc: James Bottomley, Linux Kernel


Justin Cormack wrote:
> Thanks, testing James's fix on one machine at the moment. Will have
> another six or so machines as a libata test farm tomorrow.


You'll need a few more changes...  here's the final patch.

	Jeff



[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 2895 bytes --]

# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	           ChangeSet	1.1682  -> 1.1683 
#	drivers/scsi/libata-core.c	1.19    -> 1.20   
#	 drivers/scsi/scsi.c	1.136   -> 1.137  
#	drivers/scsi/scsi_priv.h	1.30    -> 1.31   
#	include/scsi/scsi_cmnd.h	1.4     -> 1.5    
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 04/02/28	jgarzik@redhat.com	1.1683
# [libata] Use scsi_finish_command as completion function,
# in our error handling thread callback.
# 
# This also exports scsi_finish_command in the SCSI layer.
# 
# Thanks much to James Bottomley and his patience, as this solution
# was figured out.
# --------------------------------------------
#
diff -Nru a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
--- a/drivers/scsi/libata-core.c	Sun Feb 29 12:30:47 2004
+++ b/drivers/scsi/libata-core.c	Sun Feb 29 12:30:47 2004
@@ -2005,6 +2005,14 @@
 		goto out;
 	}
 
+	/* hack alert!  We cannot use the supplied completion
+	 * function from inside the ->eh_strategy_handler() thread.
+	 * libata is the only user of ->eh_strategy_handler() in
+	 * any kernel, so the default scsi_done() assumes it is
+	 * not being called from the SCSI EH.
+	 */
+	qc->scsidone = scsi_finish_command;
+
 	switch (qc->tf.protocol) {
 	case ATA_PROT_DMA_READ:
 	case ATA_PROT_DMA_WRITE:
diff -Nru a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
--- a/drivers/scsi/scsi.c	Sun Feb 29 12:30:47 2004
+++ b/drivers/scsi/scsi.c	Sun Feb 29 12:30:47 2004
@@ -847,6 +847,7 @@
 
 	cmd->done(cmd);
 }
+EXPORT_SYMBOL(scsi_finish_command);
 
 /*
  * Function:	scsi_adjust_queue_depth()
diff -Nru a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
--- a/drivers/scsi/scsi_priv.h	Sun Feb 29 12:30:47 2004
+++ b/drivers/scsi/scsi_priv.h	Sun Feb 29 12:30:47 2004
@@ -77,7 +77,6 @@
 extern int scsi_setup_command_freelist(struct Scsi_Host *shost);
 extern void scsi_destroy_command_freelist(struct Scsi_Host *shost);
 extern void scsi_done(struct scsi_cmnd *cmd);
-extern void scsi_finish_command(struct scsi_cmnd *cmd);
 extern int scsi_retry_command(struct scsi_cmnd *cmd);
 extern int scsi_insert_special_req(struct scsi_request *sreq, int);
 extern void scsi_init_cmd_from_req(struct scsi_cmnd *cmd,
diff -Nru a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
--- a/include/scsi/scsi_cmnd.h	Sun Feb 29 12:30:47 2004
+++ b/include/scsi/scsi_cmnd.h	Sun Feb 29 12:30:47 2004
@@ -159,5 +159,6 @@
 extern struct scsi_cmnd *scsi_get_command(struct scsi_device *, int);
 extern void scsi_put_command(struct scsi_cmnd *);
 extern void scsi_io_completion(struct scsi_cmnd *, unsigned int, unsigned int);
+extern void scsi_finish_command(struct scsi_cmnd *cmd);
 
 #endif /* _SCSI_SCSI_CMND_H */

