* [PATCH/RFT] libata "DMA timeout" fix
@ 2004-02-28 19:10 Jeff Garzik
2004-02-28 20:49 ` James Bottomley
0 siblings, 1 reply; 6+ messages in thread
From: Jeff Garzik @ 2004-02-28 19:10 UTC (permalink / raw)
To: Linux Kernel, linux-ide, SCSI Mailing List
[-- Attachment #1: Type: text/plain, Size: 645 bytes --]
The desired effect of a DMA timeout should be to throw an I/O error, but
that doesn't appear to be happening.
Those seeing DMA timeout messages, please test this patch.
Kernel hacker note: James B recommended that I implement my own
scsi_done() function, which duplicates the real scsi_done() but omits
the scsi_delete_timer() call. This is probably the best long-term fix,
but doing so involves exporting several currently-private bits of the SCSI
mid-layer, which I would rather not do. Probably best to create a
__scsi_done() inside the SCSI mid-layer, and call that.
Jeff, the only user of ->eh_strategy_handler() in any kernel
[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 624 bytes --]
===== drivers/scsi/libata-core.c 1.19 vs edited =====
--- 1.19/drivers/scsi/libata-core.c Wed Feb 25 22:41:13 2004
+++ edited/drivers/scsi/libata-core.c Sat Feb 28 14:03:18 2004
@@ -2130,6 +2130,14 @@
cmd->result = SAM_STAT_CHECK_CONDITION;
else
ata_to_sense_error(qc);
+
+ /* hack alert! we need this to get past the
+ * first check in scsi_done(). libata is the
+ * -only- user of ->eh_strategy_handler() in
+ * any kernel tree, which exposes some incorrect
+ * assumptions in the SCSI layer.
+ */
+ scsi_add_timer(cmd, 2000 * HZ, NULL);
} else {
cmd->result = SAM_STAT_GOOD;
}
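
For readers following along, here is a minimal user-space sketch of the "first check in scsi_done()" that the patch comment refers to. It is a model, not kernel code: every `model_*` name and struct field is hypothetical. It assumes, based on the discussion in this thread, that scsi_done() returns early when the command's timeout timer can no longer be deleted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct scsi_cmnd; the fields are hypothetical. */
struct model_cmd {
	bool timer_pending;	/* true while the timeout timer is armed */
	int completions;	/* how many times completion actually ran */
};

/* Models scsi_delete_timer(): nonzero only if a pending timer was removed. */
static int model_delete_timer(struct model_cmd *cmd)
{
	int was_pending = cmd->timer_pending;

	cmd->timer_pending = false;
	return was_pending;
}

/* Models scsi_add_timer(): re-arms the timer (timeout value ignored here). */
static void model_add_timer(struct model_cmd *cmd)
{
	cmd->timer_pending = true;
}

/* Models the completion entry point with its early-return check. */
static void model_done(struct model_cmd *cmd)
{
	assert(cmd != NULL);
	if (!model_delete_timer(cmd))
		return;		/* timer already fired: completion is dropped */
	cmd->completions++;
}
```

In this model, calling `model_done()` on a command whose timer has already fired does nothing, which matches the symptom of the I/O error never being delivered. Calling `model_add_timer()` first, as the patch does with scsi_add_timer(), lets the completion through.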
* Re: [PATCH/RFT] libata "DMA timeout" fix
2004-02-28 19:10 [PATCH/RFT] libata "DMA timeout" fix Jeff Garzik
@ 2004-02-28 20:49 ` James Bottomley
2004-02-28 21:23 ` Jeff Garzik
0 siblings, 1 reply; 6+ messages in thread
From: James Bottomley @ 2004-02-28 20:49 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Linux Kernel, linux-ide, SCSI Mailing List
On Sat, 2004-02-28 at 13:10, Jeff Garzik wrote:
> ===== drivers/scsi/libata-core.c 1.19 vs edited =====
> --- 1.19/drivers/scsi/libata-core.c Wed Feb 25 22:41:13 2004
> +++ edited/drivers/scsi/libata-core.c Sat Feb 28 14:03:18 2004
> @@ -2130,6 +2130,14 @@
> cmd->result = SAM_STAT_CHECK_CONDITION;
> else
> ata_to_sense_error(qc);
> +
> + /* hack alert! we need this to get past the
> + * first check in scsi_done(). libata is the
> + * -only- user of ->eh_strategy_handler() in
> + * any kernel tree, which exposes some incorrect
> + * assumptions in the SCSI layer.
> + */
> + scsi_add_timer(cmd, 2000 * HZ, NULL);
> } else {
> cmd->result = SAM_STAT_GOOD;
> }
You can't do this. Supposing the command's delayed, the timer fires
and then the command returns with a sense error? The done will go
through automatically completing the command, but your strategy handler
will still think it has a failed command to handle.
The correct fix is this, I think (uncompiled, but you get the idea):
===== libata-core.c 1.19 vs edited =====
--- 1.19/drivers/scsi/libata-core.c Wed Feb 25 21:41:13 2004
+++ edited/libata-core.c Sat Feb 28 14:46:17 2004
@@ -1972,6 +1972,11 @@
/* FIXME */
}
+static void ata_eng_timeout_done(struct scsi_cmnd *cmnd)
+{
+ scsi_finish_command(cmnd);
+}
+
/**
* ata_eng_timeout - Handle timeout of queued command
* @ap: Port on which timed-out command is active
@@ -2005,6 +2010,7 @@
goto out;
}
+ qc->scsidone = ata_eng_timeout_done;
switch (qc->tf.protocol) {
case ATA_PROT_DMA_READ:
case ATA_PROT_DMA_WRITE:
James
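
The race described in this message can be sketched as a small user-space model. This is only an illustration under stated assumptions: the `model_*` names and the `owned_by_eh` flag are hypothetical, and the real mid-layer tracks error-handler ownership differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy command state; all fields are hypothetical. */
struct model_cmd {
	bool timer_pending;	/* timeout timer is armed */
	bool owned_by_eh;	/* timeout fired, error handler has the command */
	bool completed;		/* normal completion path finished the command */
};

/* The timeout fires: the command is handed to the error handler. */
static void model_timeout_fires(struct model_cmd *cmd)
{
	cmd->timer_pending = false;
	cmd->owned_by_eh = true;
}

/* Completion entry point: drops the completion if the timer is gone. */
static void model_done(struct model_cmd *cmd)
{
	bool was_pending = cmd->timer_pending;

	cmd->timer_pending = false;
	if (!was_pending)
		return;
	cmd->completed = true;
}

/* A late hardware completion that first re-arms the timer (the hack). */
static void model_late_completion_with_hack(struct model_cmd *cmd)
{
	cmd->timer_pending = true;
	model_done(cmd);
}

/* True when the command is both completed and still queued for EH. */
static bool model_inconsistent(const struct model_cmd *cmd)
{
	return cmd->completed && cmd->owned_by_eh;
}
```

If `model_timeout_fires()` runs before `model_late_completion_with_hack()`, the command ends up completed while the error handler still believes it owns a failed command, which is the inconsistency objected to above.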
* Re: [PATCH/RFT] libata "DMA timeout" fix
2004-02-28 20:49 ` James Bottomley
@ 2004-02-28 21:23 ` Jeff Garzik
2004-02-28 21:28 ` James Bottomley
2004-02-29 17:26 ` Justin Cormack
0 siblings, 2 replies; 6+ messages in thread
From: Jeff Garzik @ 2004-02-28 21:23 UTC (permalink / raw)
To: James Bottomley; +Cc: Linux Kernel, linux-ide, SCSI Mailing List
James Bottomley wrote:
> On Sat, 2004-02-28 at 13:10, Jeff Garzik wrote:
>
>>===== drivers/scsi/libata-core.c 1.19 vs edited =====
>>--- 1.19/drivers/scsi/libata-core.c Wed Feb 25 22:41:13 2004
>>+++ edited/drivers/scsi/libata-core.c Sat Feb 28 14:03:18 2004
>>@@ -2130,6 +2130,14 @@
>> cmd->result = SAM_STAT_CHECK_CONDITION;
>> else
>> ata_to_sense_error(qc);
>>+
>>+ /* hack alert! we need this to get past the
>>+ * first check in scsi_done(). libata is the
>>+ * -only- user of ->eh_strategy_handler() in
>>+ * any kernel tree, which exposes some incorrect
>>+ * assumptions in the SCSI layer.
>>+ */
>>+ scsi_add_timer(cmd, 2000 * HZ, NULL);
>> } else {
>> cmd->result = SAM_STAT_GOOD;
>> }
>
>
> You can't do this. Supposing the command's delayed, the timer fires
> and then the command returns with a sense error? The done will go
> through automatically completing the command, but your strategy handler
> will still think it has a failed command to handle.
hmmm, yeah that will be a problem iff we are not already in the strategy
handler.
> The correct fix is this, I think (uncompiled, but you get the idea):
Yeah, that's much better. That function is not exported though ;-)
Jeff
* Re: [PATCH/RFT] libata "DMA timeout" fix
2004-02-28 21:23 ` Jeff Garzik
@ 2004-02-28 21:28 ` James Bottomley
2004-02-29 17:26 ` Justin Cormack
1 sibling, 0 replies; 6+ messages in thread
From: James Bottomley @ 2004-02-28 21:28 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Linux Kernel, linux-ide, SCSI Mailing List
On Sat, 2004-02-28 at 15:23, Jeff Garzik wrote:
> Yeah, that's much better. That function is not exported though ;-)
I can fix that. It really is a necessary function for drivers doing
their own strategy handler ... of which yours seems to be the only one.
James
* Re: [PATCH/RFT] libata "DMA timeout" fix
2004-02-28 21:23 ` Jeff Garzik
2004-02-28 21:28 ` James Bottomley
@ 2004-02-29 17:26 ` Justin Cormack
2004-02-29 17:31 ` Jeff Garzik
1 sibling, 1 reply; 6+ messages in thread
From: Justin Cormack @ 2004-02-29 17:26 UTC (permalink / raw)
To: Jeff Garzik; +Cc: James Bottomley, Linux Kernel
Thanks, testing James's fix on one machine at the moment. Will have
another six or so machines as a libata test farm tomorrow.
Justin
On Sat, 2004-02-28 at 21:23, Jeff Garzik wrote:
> James Bottomley wrote:
> > On Sat, 2004-02-28 at 13:10, Jeff Garzik wrote:
> >
> >>===== drivers/scsi/libata-core.c 1.19 vs edited =====
> >>--- 1.19/drivers/scsi/libata-core.c Wed Feb 25 22:41:13 2004
> >>+++ edited/drivers/scsi/libata-core.c Sat Feb 28 14:03:18 2004
> >>@@ -2130,6 +2130,14 @@
> >> cmd->result = SAM_STAT_CHECK_CONDITION;
> >> else
> >> ata_to_sense_error(qc);
> >>+
> >>+ /* hack alert! we need this to get past the
> >>+ * first check in scsi_done(). libata is the
> >>+ * -only- user of ->eh_strategy_handler() in
> >>+ * any kernel tree, which exposes some incorrect
> >>+ * assumptions in the SCSI layer.
> >>+ */
> >>+ scsi_add_timer(cmd, 2000 * HZ, NULL);
> >> } else {
> >> cmd->result = SAM_STAT_GOOD;
> >> }
> >
> >
> > You can't do this. Supposing the command's delayed, the timer fires
> > and then the command returns with a sense error? The done will go
> > through automatically completing the command, but your strategy handler
> > will still think it has a failed command to handle.
>
> hmmm, yeah that will be a problem iff we are not already in the strategy
> handler.
>
>
> > The correct fix is this, I think (uncompiled, but you get the idea):
>
> Yeah, that's much better. That function is not exported though ;-)
>
> Jeff
* Re: [PATCH/RFT] libata "DMA timeout" fix
2004-02-29 17:26 ` Justin Cormack
@ 2004-02-29 17:31 ` Jeff Garzik
0 siblings, 0 replies; 6+ messages in thread
From: Jeff Garzik @ 2004-02-29 17:31 UTC (permalink / raw)
To: Justin Cormack; +Cc: James Bottomley, Linux Kernel
[-- Attachment #1: Type: text/plain, Size: 223 bytes --]
Justin Cormack wrote:
> Thanks, testing James's fix on one machine at the moment. Will have
> another six or so machines as a libata test farm tomorrow.
You'll need a few more changes... here's the final patch.
Jeff
[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 2895 bytes --]
# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
# ChangeSet 1.1682 -> 1.1683
# drivers/scsi/libata-core.c 1.19 -> 1.20
# drivers/scsi/scsi.c 1.136 -> 1.137
# drivers/scsi/scsi_priv.h 1.30 -> 1.31
# include/scsi/scsi_cmnd.h 1.4 -> 1.5
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 04/02/28 jgarzik@redhat.com 1.1683
# [libata] Use scsi_finish_command as completion function,
# in our error handling thread callback.
#
# This also exports scsi_finish_command in the SCSI layer.
#
# Thanks much to James Bottomley and his patience, as this solution
# was figured out.
# --------------------------------------------
#
diff -Nru a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
--- a/drivers/scsi/libata-core.c Sun Feb 29 12:30:47 2004
+++ b/drivers/scsi/libata-core.c Sun Feb 29 12:30:47 2004
@@ -2005,6 +2005,14 @@
goto out;
}
+ /* hack alert! We cannot use the supplied completion
+ * function from inside the ->eh_strategy_handler() thread.
+ * libata is the only user of ->eh_strategy_handler() in
+ * any kernel, so the default scsi_done() assumes it is
+ * not being called from the SCSI EH.
+ */
+ qc->scsidone = scsi_finish_command;
+
switch (qc->tf.protocol) {
case ATA_PROT_DMA_READ:
case ATA_PROT_DMA_WRITE:
diff -Nru a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
--- a/drivers/scsi/scsi.c Sun Feb 29 12:30:47 2004
+++ b/drivers/scsi/scsi.c Sun Feb 29 12:30:47 2004
@@ -847,6 +847,7 @@
cmd->done(cmd);
}
+EXPORT_SYMBOL(scsi_finish_command);
/*
* Function: scsi_adjust_queue_depth()
diff -Nru a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
--- a/drivers/scsi/scsi_priv.h Sun Feb 29 12:30:47 2004
+++ b/drivers/scsi/scsi_priv.h Sun Feb 29 12:30:47 2004
@@ -77,7 +77,6 @@
extern int scsi_setup_command_freelist(struct Scsi_Host *shost);
extern void scsi_destroy_command_freelist(struct Scsi_Host *shost);
extern void scsi_done(struct scsi_cmnd *cmd);
-extern void scsi_finish_command(struct scsi_cmnd *cmd);
extern int scsi_retry_command(struct scsi_cmnd *cmd);
extern int scsi_insert_special_req(struct scsi_request *sreq, int);
extern void scsi_init_cmd_from_req(struct scsi_cmnd *cmd,
diff -Nru a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
--- a/include/scsi/scsi_cmnd.h Sun Feb 29 12:30:47 2004
+++ b/include/scsi/scsi_cmnd.h Sun Feb 29 12:30:47 2004
@@ -159,5 +159,6 @@
extern struct scsi_cmnd *scsi_get_command(struct scsi_device *, int);
extern void scsi_put_command(struct scsi_cmnd *);
extern void scsi_io_completion(struct scsi_cmnd *, unsigned int, unsigned int);
+extern void scsi_finish_command(struct scsi_cmnd *cmd);
#endif /* _SCSI_SCSI_CMND_H */
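
The effect of the libata-core.c hunk above can be sketched in user space as a completion-callback swap. This is a model under assumptions, not the kernel implementation: all `model_*` names are hypothetical stand-ins for scsi_done(), scsi_finish_command(), ata_eng_timeout() and the `qc->scsidone` pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy command: timer state plus a count of real completions. */
struct model_cmd {
	int timer_pending;
	int finished;
};

typedef void (*model_done_fn)(struct model_cmd *);

/* Toy queued command holding the completion callback, like qc->scsidone. */
struct model_qc {
	struct model_cmd *cmd;
	model_done_fn scsidone;
};

/* Stand-in for scsi_finish_command(): completes unconditionally. */
static void model_finish_command(struct model_cmd *cmd)
{
	cmd->finished++;
}

/* Stand-in for scsi_done(): refuses to complete once the timer is gone. */
static void model_scsi_done(struct model_cmd *cmd)
{
	int was_pending = cmd->timer_pending;

	cmd->timer_pending = 0;
	if (!was_pending)
		return;
	model_finish_command(cmd);
}

/* Stand-in for the timeout path run from the error-handling thread. */
static void model_eng_timeout(struct model_qc *qc)
{
	assert(qc != NULL && qc->cmd != NULL);
	/* Swap in the direct completion before finishing the command. */
	qc->scsidone = model_finish_command;
	qc->scsidone(qc->cmd);
}
```

With the default callback, a timed-out command is never finished. Once the timeout path installs the direct completion function, the command completes exactly once, without re-arming any timer.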