public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] scsi: timeout reset timer before command is queued
@ 2009-12-11  2:41 Min Zhang
  2009-12-13  9:52 ` Boaz Harrosh
  0 siblings, 1 reply; 9+ messages in thread
From: Min Zhang @ 2009-12-11  2:41 UTC (permalink / raw)
  To: linux-scsi

Kernel panic of NULL pointer in scsi_dispatch_cmd during fiber channel 
scsi disk access. It is because scsi command can time out even before 
dispatcher queues it to lower level driver, then scsi_times_out error 
handler could complete the request and free the command memory while 
scsi_dispatch_cmd still uses the command pointer.

This patch adds new SCSI_EH_RESET_TIMER flag per command to indicate the 
command hasn't been queued, so scsi_times_out won't complete and free 
the command.  req->special command pointer is also NULLed when the 
command is freed.

This premature time out is rare, mostly triggered by occasional network 
delay for fibre channel scsi disk. The patch is verified by 
instrumenting a testing delay to force scsi_times_out and then 
monitoring the command pointer integrity.

Signed-off-by: Min Zhang <mzhang@mvista.com>

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index dd098ca..9be78a5 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -743,6 +743,9 @@ int scsi_dispatch_cmd(struct scsi_cmnd *cmd)
      */
     scsi_cmd_get_serial(host, cmd);
 
+    /* Timeout error handler can start processing cmd now */
+    cmd->eh_eflags &= ~SCSI_EH_RESET_TIMER;
+
     if (unlikely(host->shost_state == SHOST_DEL)) {
         cmd->result = (DID_NO_CONNECT << 16);
         scsi_done(cmd);
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 1b0060b..94dca64 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -128,6 +128,14 @@ enum blk_eh_timer_return scsi_times_out(struct 
request *req)
 
     scsi_log_completion(scmd, TIMEOUT_ERROR);
 
+    /*
+     * Extend timeout if cmd has not been queued yet, otherwise error
+     * handler could complete request and free the cmd memory while
+     * the dispatch handler still uses the cmd pointer.
+     */
+    if (scmd->eh_eflags | SCSI_EH_RESET_TIMER)
+        return BLK_EH_RESET_TIMER;
+
     if (scmd->device->host->transportt->eh_timed_out)
         rtn = scmd->device->host->transportt->eh_timed_out(scmd);
     else if (scmd->device->host->hostt->eh_timed_out)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 5987da8..fd1afb6 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -492,11 +492,13 @@ void scsi_next_command(struct scsi_cmnd *cmd)
 {
     struct scsi_device *sdev = cmd->device;
     struct request_queue *q = sdev->request_queue;
+    struct request *req = cmd->request;
 
     /* need to hold a reference on the device before we let go of the 
cmd */
     get_device(&sdev->sdev_gendev);
 
     scsi_put_command(cmd);
+    req->special = NULL;
     scsi_run_queue(q);
 
     /* ok to remove device now */
@@ -1491,12 +1493,21 @@ static void scsi_request_fn(struct request_queue *q)
         /*
          * Remove the request from the request list.
          */
+        cmd = req->special;
         if (!(blk_queue_tagged(q) && !blk_queue_start_tag(q, req)))
+        {
+            /*
+             * Prevent timeout error handler from processing
+             * the cmd until it is queued in scsi_dispatch_cmd,
+             * otherwise cmd pointer might be freed by error handle
+             */
+            cmd->eh_eflags |= SCSI_EH_RESET_TIMER;
+
             blk_start_request(req);
+        }
         sdev->device_busy++;
 
         spin_unlock(q->queue_lock);
-        cmd = req->special;
         if (unlikely(cmd == NULL)) {
             printk(KERN_CRIT "impossible request in %s.\n"
                      "please mail a stack trace to "
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 1fbf7c7..4c71010 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -16,6 +16,7 @@ struct scsi_nl_hdr;
  * Scsi Error Handler Flags
  */
 #define SCSI_EH_CANCEL_CMD    0x0001    /* Cancel this cmd */
+#define SCSI_EH_RESET_TIMER    0x0002    /* Reset timer on the this cmd */
 
 #define SCSI_SENSE_VALID(scmd) \
     (((scmd)->sense_buffer[0] & 0x70) == 0x70)


^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2009-12-15 20:53 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-12-11  2:41 [PATCH] scsi: timeout reset timer before command is queued Min Zhang
2009-12-13  9:52 ` Boaz Harrosh
2009-12-15  1:17   ` Min Zhang
2009-12-15  2:21     ` Matthew Wilcox
2009-12-15  2:40       ` Min Zhang
2009-12-15  2:59         ` Matthew Wilcox
2009-12-15 10:40         ` Matthew Wilcox
2009-12-15 13:16     ` Boaz Harrosh
2009-12-15 20:53       ` Min Zhang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox