Degraded I/O performance, since 2.5.41

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* Degraded I/O performance, since 2.5.41
@ 2002-10-10  1:09 Dave Hansen
  0 siblings, 0 replies; 5+ messages in thread
From: Dave Hansen @ 2002-10-10  1:09 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Rik van Riel

When I run a certain large webserver benchmark, I prefer to warm the 
pagecache up with the file set first, to cheat a little :)  I grep 
through 20 different 500MB file sets in parallel to do this.  It is a 
_lot_ slower in the BK snapshot than in plain 2.5.41.

And, no, these numbers aren't inflated, I have a lot of fast disks. I 
  _can_ do 50MB/sec :)

A little snipped from vmstat (I cut out the boring columns):
good kernel: 2.5.41: vmstat 4
Cached        bi    bo   bi    cs  us s id
389280     53284     7 1625  3235 12 88  0
600580     53489    19 1599  3264 11 89  0
813428     53891     0 1587  3256 12 88  0
1027260    54093     0 1609  3239 12 88  0
1241448    54183     0 1611  3251 11 89  0
1454036    53790     0 1618  3267 12 88  0
doing the entire 10GB grep takes 192 seconds.
a dd produces: ~48000 bi/sec

exact same grep operation on kernel: 2.5.41+yesterday's bk: vmstat 4
Cached       bi    bo   bi    cs us sy id
4855948    9697     1 1408   846 20 80  0
4890464    8745     0 1398   800 18 82  0
4922392    8077    55 1364   676 21 79  0
4959164    9296     1 1399   798 18 82  0
4995936    9315     0 1407   830 19 81  0
5027208    7931     0 1351   638 22 78  0
5066256    9855     9 1416   856 19 81  0
I was too impatient to wait on the greps to complete.
a dd produces: ~37800 bi/sec

So, bi/sec goes from 54,000 in 2.5.41, to ~8700 in yesterday's 
snapshot.  It goes from around 50MB/sec to about 8MB/sec.

Although vmstat shows 0% idle time, the profilers show lots of idle 
time, 98%!  I tried oprofile and readprofile.  Is the 2.0.9 vmstat 
still broken?  I'm using idle=poll if that makes any difference.

-- 
Dave Hansen
haveblue@us.ibm.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Degraded I/O performance, since 2.5.41
       [not found]   ` <20021011004227.GA27073@redhat.com>
@ 2002-10-11  0:53     ` Dave Hansen
  2002-10-11  1:45       ` Dave Hansen
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Hansen @ 2002-10-11  0:53 UTC (permalink / raw)
  To: Doug Ledford; +Cc: Linus Torvalds, Patrick Mansfield, Andrew Morton, lkml

Doug Ledford wrote:
> I  try to keep the drivers working
> at a basic level, but until I'm done, benchmarking is pretty much a waste
> of time I think)

Benchmarking is integral in what we're doing right now.  We need to 
make quick decisions about what is good or bad before the freeze. 
This patch makes my machine unusable for anything that isn't in the 
pagecache.  A simple "make oldconfig" on a cold tree takes minutes to 
complete.  My grep test got an order of magnitude worse.  If we have 
to keep this code, can we just make the default queue HUGE for now? 
Will that work around it?

A bunch of the AIO people use QLogic cards, which I'm sure are broken 
by this as well.  I'm going to back this patch out for all the testing 
trees I do, and I suggest anyone who cares about I/O on SCSI 
(excluding aic7xxx) after 2.5.41 do the same.

-- 
Dave Hansen
haveblue@us.ibm.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Degraded I/O performance, since 2.5.41
  2002-10-11  0:53     ` Dave Hansen
@ 2002-10-11  1:45       ` Dave Hansen
  0 siblings, 0 replies; 5+ messages in thread
From: Dave Hansen @ 2002-10-11  1:45 UTC (permalink / raw)
  Cc: lkml

[-- Attachment #1: Type: text/plain, Size: 971 bytes --]

Dave Hansen wrote:
> Doug Ledford wrote:
> 
>> I  try to keep the drivers working
>> at a basic level, but until I'm done, benchmarking is pretty much a waste
>> of time I think)
> 
> 
> Benchmarking is integral in what we're doing right now.  We need to make 
> quick decisions about what is good or bad before the freeze. This patch 
> makes my machine unusable for anything that isn't in the pagecache.  A 
> simple "make oldconfig" on a cold tree takes minutes to complete.  My 
> grep test got an order of magnitude worse.  If we have to keep this 
> code, can we just make the default queue HUGE for now? Will that work 
> around it?
> 
> A bunch of the AIO people use QLogic cards, which I'm sure are broken by 
> this as well.  I'm going to back this patch out for all the testing 
> trees I do, and I suggest anyone who cares about I/O on SCSI (excluding 
> aic7xxx) after 2.5.41 do the same.

BTW, here's the cset in issue.

-- 
Dave Hansen
haveblue@us.ibm.com

[-- Attachment #2: cset-1.704.1.2.txt --]
[-- Type: text/plain, Size: 39734 bytes --]

# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	           ChangeSet	1.704.1.1 -> 1.704.1.2
#	drivers/scsi/aic7xxx_old/aic7xxx.h	1.4     -> 1.5    
#	 drivers/scsi/scsi.h	1.21    -> 1.22   
#	drivers/scsi/hosts.h	1.15    -> 1.16   
#	drivers/scsi/aic7xxx_old.c	1.24    -> 1.25   
#	 drivers/scsi/scsi.c	1.40    -> 1.41   
#	drivers/scsi/scsi_scan.c	1.22    -> 1.23   
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 02/10/08	dledford@redhat.com	1.704.1.2
# [PATCH] make SCSI queue depth adjustable
# 
# Linus, this has been tested by some people in the field to not break
# things, and it's the start of some other changes I'm making, so please put
# this in your tree so I'm not merging huge patches but instead am merging a
# little as I go.
# --------------------------------------------
#
diff -Nru a/drivers/scsi/aic7xxx_old/aic7xxx.h b/drivers/scsi/aic7xxx_old/aic7xxx.h
--- a/drivers/scsi/aic7xxx_old/aic7xxx.h	Wed Oct  9 05:38:48 2002
+++ b/drivers/scsi/aic7xxx_old/aic7xxx.h	Wed Oct  9 05:38:48 2002
@@ -46,7 +46,9 @@
 	eh_host_reset_handler: NULL,				\
 	abort: aic7xxx_abort,					\
 	reset: aic7xxx_reset,					\
-	slave_attach: NULL,					\
+	select_queue_depths: NULL,				\
+	slave_attach: aic7xxx_slave_attach,			\
+	slave_detach: aic7xxx_slave_detach,			\
 	bios_param: aic7xxx_biosparam,				\
 	can_queue: 255,		/* max simultaneous cmds      */\
 	this_id: -1,		/* scsi id of host adapter    */\
@@ -64,6 +66,8 @@
 extern int aic7xxx_reset(Scsi_Cmnd *, unsigned int);
 extern int aic7xxx_abort(Scsi_Cmnd *);
 extern int aic7xxx_release(struct Scsi_Host *);
+extern int aic7xxx_slave_attach(Scsi_Device *);
+extern void aic7xxx_slave_detach(Scsi_Device *);
 
 extern const char *aic7xxx_info(struct Scsi_Host *);
 
diff -Nru a/drivers/scsi/aic7xxx_old.c b/drivers/scsi/aic7xxx_old.c
--- a/drivers/scsi/aic7xxx_old.c	Wed Oct  9 05:38:48 2002
+++ b/drivers/scsi/aic7xxx_old.c	Wed Oct  9 05:38:48 2002
@@ -976,7 +976,7 @@
 #define  DEVICE_DTR_SCANNED             0x40
   volatile unsigned char   dev_flags[MAX_TARGETS];
   volatile unsigned char   dev_active_cmds[MAX_TARGETS];
-  volatile unsigned char   dev_temp_queue_depth[MAX_TARGETS];
+  volatile unsigned short  dev_temp_queue_depth[MAX_TARGETS];
   unsigned char            dev_commands_sent[MAX_TARGETS];
 
   unsigned int             dev_timer_active; /* Which devs have a timer set */
@@ -988,7 +988,9 @@
 
   unsigned char            dev_last_queue_full[MAX_TARGETS];
   unsigned char            dev_last_queue_full_count[MAX_TARGETS];
-  unsigned char            dev_max_queue_depth[MAX_TARGETS];
+  unsigned char            dev_lun_queue_depth[MAX_TARGETS];
+  unsigned short           dev_scbs_needed[MAX_TARGETS];
+  unsigned short           dev_max_queue_depth[MAX_TARGETS];
 
   volatile scb_queue_type  delayed_scbs[MAX_TARGETS];
 
@@ -1035,6 +1037,7 @@
   ahc_chip                 chip;             /* chip type */
   ahc_bugs                 bugs;
   dma_addr_t		   fifo_dma;	     /* DMA handle for fifo arrays */
+  Scsi_Device		  *Scsi_Dev[MAX_TARGETS][MAX_LUNS];
 
   /*
    * Statistics Kept:
@@ -2820,94 +2823,6 @@
     cmd->result |= (DID_RESET << 16);
   }
 
-  if (!(p->dev_flags[tindex] & DEVICE_PRESENT))
-  {
-    if ( (cmd->cmnd[0] == INQUIRY) && (cmd->result == DID_OK) )
-    {
-    
-      p->dev_flags[tindex] |= DEVICE_PRESENT;
-#define WIDE_INQUIRY_BITS 0x60
-#define SYNC_INQUIRY_BITS 0x10
-#define SCSI_VERSION_BITS 0x07
-#define SCSI_DT_BIT       0x04
-      if(!(p->dev_flags[tindex] & DEVICE_DTR_SCANNED)) {
-        char *buffer;
-
-        if(cmd->use_sg)
-          BUG();
-
-        buffer = (char *)cmd->request_buffer;
-
-        if ( (buffer[7] & WIDE_INQUIRY_BITS) &&
-             (p->features & AHC_WIDE) )
-        {
-          p->needwdtr |= (1<<tindex);
-          p->needwdtr_copy |= (1<<tindex);
-          p->transinfo[tindex].goal_width = p->transinfo[tindex].user_width;
-        }
-        else
-        {
-          p->needwdtr &= ~(1<<tindex);
-          p->needwdtr_copy &= ~(1<<tindex);
-          pause_sequencer(p);
-          aic7xxx_set_width(p, cmd->target, cmd->channel, cmd->lun,
-                            MSG_EXT_WDTR_BUS_8_BIT, (AHC_TRANS_ACTIVE |
-                                                     AHC_TRANS_GOAL |
-                                                     AHC_TRANS_CUR) );
-          unpause_sequencer(p, FALSE);
-        }
-        if ( (buffer[7] & SYNC_INQUIRY_BITS) &&
-              p->transinfo[tindex].user_offset )
-        {
-          p->transinfo[tindex].goal_period = p->transinfo[tindex].user_period;
-          p->transinfo[tindex].goal_options = p->transinfo[tindex].user_options;
-          if (p->features & AHC_ULTRA2)
-            p->transinfo[tindex].goal_offset = MAX_OFFSET_ULTRA2;
-          else if (p->transinfo[tindex].goal_width == MSG_EXT_WDTR_BUS_16_BIT)
-            p->transinfo[tindex].goal_offset = MAX_OFFSET_16BIT;
-          else
-            p->transinfo[tindex].goal_offset = MAX_OFFSET_8BIT;
-          if ( (((buffer[2] & SCSI_VERSION_BITS) >= 3) ||
-                 (buffer[56] & SCSI_DT_BIT) ||
-                 (p->dev_flags[tindex] & DEVICE_SCSI_3) ) &&
-                 (p->transinfo[tindex].user_period <= 9) &&
-                 (p->transinfo[tindex].user_options) )
-          {
-            p->needppr |= (1<<tindex);
-            p->needppr_copy |= (1<<tindex);
-            p->needsdtr &= ~(1<<tindex);
-            p->needsdtr_copy &= ~(1<<tindex);
-            p->needwdtr &= ~(1<<tindex);
-            p->needwdtr_copy &= ~(1<<tindex);
-            p->dev_flags[tindex] |= DEVICE_SCSI_3;
-          }
-          else
-          {
-            p->needsdtr |= (1<<tindex);
-            p->needsdtr_copy |= (1<<tindex);
-            p->transinfo[tindex].goal_period = 
-              MAX(10, p->transinfo[tindex].goal_period);
-            p->transinfo[tindex].goal_options = 0;
-          }
-        }
-        else
-        {
-          p->needsdtr &= ~(1<<tindex);
-          p->needsdtr_copy &= ~(1<<tindex);
-          p->transinfo[tindex].goal_period = 255;
-          p->transinfo[tindex].goal_offset = 0;
-          p->transinfo[tindex].goal_options = 0;
-        }
-        p->dev_flags[tindex] |= DEVICE_DTR_SCANNED;
-        p->dev_flags[tindex] |= DEVICE_PRINT_DTR;
-      }
-#undef WIDE_INQUIRY_BITS
-#undef SYNC_INQUIRY_BITS
-#undef SCSI_VERSION_BITS
-#undef SCSI_DT_BIT
-    }
-  }
-
   if ((scb->flags & SCB_MSGOUT_BITS) != 0)
   {
     unsigned short mask;
@@ -4919,15 +4834,29 @@
                 if ( (p->dev_last_queue_full_count[tindex] > 14) &&
                      (p->dev_active_cmds[tindex] > 4) )
                 {
+		  int diff, lun;
+		  if (p->dev_active_cmds[tindex] > p->dev_lun_queue_depth[tindex])
+		    /* We don't know what to do here, so bail. */
+		    break;
                   if (aic7xxx_verbose & VERBOSE_NEGOTIATION2)
                     printk(INFO_LEAD "Queue depth reduced to %d\n", p->host_no,
                       CTL_OF_SCB(scb), p->dev_active_cmds[tindex]);
-                  p->dev_max_queue_depth[tindex] = 
-                      p->dev_active_cmds[tindex];
+		  diff = p->dev_lun_queue_depth[tindex] -
+			 p->dev_active_cmds[tindex];
+		  p->dev_lun_queue_depth[tindex] -= diff;
+		  for(lun = 0; lun < p->host->max_lun; lun++)
+		  {
+		    if(p->Scsi_Dev[tindex][lun] != NULL)
+		    {
+		      p->dev_max_queue_depth[tindex] -= diff;
+		      scsi_adjust_queue_depth(p->Scsi_Dev[tindex][lun], 1,
+				              p->dev_lun_queue_depth[tindex]);
+		      if(p->dev_temp_queue_depth[tindex] > p->dev_max_queue_depth[tindex])
+		        p->dev_temp_queue_depth[tindex] = p->dev_max_queue_depth[tindex];
+		    }
+		  }
                   p->dev_last_queue_full[tindex] = 0;
                   p->dev_last_queue_full_count[tindex] = 0;
-                  p->dev_temp_queue_depth[tindex] = 
-                    p->dev_active_cmds[tindex];
                 }
                 else if (p->dev_active_cmds[tindex] == 0)
                 {
@@ -7023,10 +6952,10 @@
  *   with queue depths for individual devices.  It also allows tagged
  *   queueing to be [en|dis]abled for a specific adapter.
  *-F*************************************************************************/
-static int
+static void
 aic7xxx_device_queue_depth(struct aic7xxx_host *p, Scsi_Device *device)
 {
-  int default_depth = 3;
+  int default_depth = p->host->hostt->cmd_per_lun;
   unsigned char tindex;
   unsigned short target_mask;
 
@@ -7036,12 +6965,69 @@
   if (p->dev_max_queue_depth[tindex] > 1)
   {
     /*
-     * We've already scanned this device, leave it alone
+     * We've already scanned some lun on this device and enabled tagged
+     * queueing on it.  So, as long as this lun also supports tagged
+     * queueing, enable it here with the same depth.  Call SCSI mid layer
+     * to adjust depth on this device, and add enough to the max_queue_depth
+     * to cover the commands for this lun.
+     *
+     * Note: there is a shortcoming here.  The aic7xxx driver really assumes
+     * that if any lun on a device supports tagged queueing, then they *all*
+     * do.  Our p->tagenable field is on a per target id basis and doesn't
+     * differentiate for different luns.  If we end up with one lun that
+     * doesn't support tagged queueing, it's going to disable tagged queueing
+     * on *all* the luns on that target ID :-(
      */
-    return(p->dev_max_queue_depth[tindex]);
+    if(device->tagged_supported) {
+      if (aic7xxx_verbose & VERBOSE_NEGOTIATION2)
+      {
+        printk(INFO_LEAD "Enabled tagged queuing, queue depth %d.\n",
+               p->host_no, device->channel, device->id,
+               device->lun, device->queue_depth);
+      }
+      p->dev_max_queue_depth[tindex] += p->dev_lun_queue_depth[tindex];
+      p->dev_temp_queue_depth[tindex] += p->dev_lun_queue_depth[tindex];
+      scsi_adjust_queue_depth(device, 1, p->dev_lun_queue_depth[tindex]);
+    }
+    else
+    {
+      int lun;
+      /*
+       * Uh ohh, this is what I was talking about.  All the other devices on
+       * this target ID that support tagged queueing are going to end up
+       * getting tagged queueing turned off because of this device.  Print
+       * out a message to this effect for the user, then disable tagged
+       * queueing on all the devices on this ID.
+       */
+      printk(WARN_LEAD "does not support tagged queuing while other luns on\n"
+             "          the same target ID do!!  Tagged queueing will be disabled for\n"
+             "          all luns on this target ID!!\n", p->host_no,
+	     device->channel, device->id, device->lun);
+      
+      p->dev_lun_queue_depth[tindex] = default_depth;
+      p->dev_scbs_needed[tindex] = 0;
+      p->dev_temp_queue_depth[tindex] = 1;
+      p->dev_max_queue_depth[tindex] = 1;
+      p->tagenable &= ~target_mask;
+
+      for(lun=0; lun < p->host->max_lun; lun++)
+      {
+        if(p->Scsi_Dev[tindex][lun] != NULL)
+	{
+          printk(WARN_LEAD "disabling tagged queuing.\n", p->host_no,
+                 p->Scsi_Dev[tindex][lun]->channel,
+		 p->Scsi_Dev[tindex][lun]->id,
+		 p->Scsi_Dev[tindex][lun]->lun);
+          scsi_adjust_queue_depth(p->Scsi_Dev[tindex][lun], 0, default_depth);
+	  p->dev_scbs_needed[tindex] += default_depth;
+	}
+      }
+    }
+    return;
   }
 
-  device->queue_depth = default_depth;
+  p->dev_lun_queue_depth[tindex] = default_depth;
+  p->dev_scbs_needed[tindex] = default_depth;
   p->dev_temp_queue_depth[tindex] = 1;
   p->dev_max_queue_depth[tindex] = 1;
   p->tagenable &= ~target_mask;
@@ -7051,7 +7037,7 @@
     int tag_enabled = TRUE;
 
     default_depth = AIC7XXX_CMDS_PER_DEVICE;
- 
+
     if (!(p->discenable & target_mask))
     {
       if (aic7xxx_verbose & VERBOSE_NEGOTIATION2)
@@ -7072,7 +7058,7 @@
                            " the aic7xxx.c source file.\n");
           print_warning = FALSE;
         }
-        device->queue_depth = default_depth;
+        p->dev_lun_queue_depth[tindex] = default_depth;
       }
       else
       {
@@ -7080,19 +7066,18 @@
         if (aic7xxx_tag_info[p->instance].tag_commands[tindex] == 255)
         {
           tag_enabled = FALSE;
-          device->queue_depth = 3;  /* Tagged queueing is disabled. */
         }
         else if (aic7xxx_tag_info[p->instance].tag_commands[tindex] == 0)
         {
-          device->queue_depth = default_depth;
+          p->dev_lun_queue_depth[tindex] = default_depth;
         }
         else
         {
-          device->queue_depth =
+          p->dev_lun_queue_depth[tindex] =
             aic7xxx_tag_info[p->instance].tag_commands[tindex];
         }
       }
-      if ((device->tagged_queue == 0) && tag_enabled)
+      if (tag_enabled)
       {
         if (aic7xxx_verbose & VERBOSE_NEGOTIATION2)
         {
@@ -7100,46 +7085,70 @@
                 p->host_no, device->channel, device->id,
                 device->lun, device->queue_depth);
         }
-        p->dev_max_queue_depth[tindex] = device->queue_depth;
-        p->dev_temp_queue_depth[tindex] = device->queue_depth;
+        p->dev_max_queue_depth[tindex] = p->dev_lun_queue_depth[tindex];
+        p->dev_temp_queue_depth[tindex] = p->dev_lun_queue_depth[tindex];
+        p->dev_scbs_needed[tindex] = p->dev_lun_queue_depth[tindex];
         p->tagenable |= target_mask;
         p->orderedtag |= target_mask;
-        device->tagged_queue = 1;
-        device->current_tag = SCB_LIST_NULL;
+	scsi_adjust_queue_depth(device, 1, p->dev_lun_queue_depth[tindex]);
       }
     }
   }
-  return(p->dev_max_queue_depth[tindex]);
+  return;
 }
 
 /*+F*************************************************************************
  * Function:
- *   aic7xxx_select_queue_depth
+ *   aic7xxx_slave_detach
  *
  * Description:
- *   Sets the queue depth for each SCSI device hanging off the input
- *   host adapter.  We use a queue depth of 2 for devices that do not
- *   support tagged queueing.  If AIC7XXX_CMDS_PER_LUN is defined, we
- *   use that for tagged queueing devices; otherwise we use our own
- *   algorithm for determining the queue depth based on the maximum
- *   SCBs for the controller.
+ *   prepare for this device to go away
  *-F*************************************************************************/
-static void
-aic7xxx_select_queue_depth(struct Scsi_Host *host,
-    Scsi_Device *scsi_devs)
+void
+aic7xxx_slave_detach(Scsi_Device *sdpnt)
 {
-  Scsi_Device *device;
-  struct aic7xxx_host *p = (struct aic7xxx_host *) host->hostdata;
-  int scbnum;
+  struct aic7xxx_host *p = (struct aic7xxx_host *) sdpnt->host->hostdata;
+  int lun, tindex;
 
-  scbnum = 0;
-  for (device = scsi_devs; device != NULL; device = device->next)
+  tindex = sdpnt->id | (sdpnt->channel << 3);
+  lun = sdpnt->lun;
+  if(p->Scsi_Dev[tindex][lun] == NULL)
+    return;
+
+  if(p->tagenable & (1 << tindex))
   {
-    if (device->host == host)
-    {
-      scbnum += aic7xxx_device_queue_depth(p, device);
-    }
+    p->dev_max_queue_depth[tindex] -= p->dev_lun_queue_depth[tindex];
+    if(p->dev_temp_queue_depth[tindex] > p->dev_max_queue_depth[tindex])
+      p->dev_temp_queue_depth[tindex] = p->dev_max_queue_depth[tindex];
   }
+  p->dev_scbs_needed[tindex] -= p->dev_lun_queue_depth[tindex];
+  p->Scsi_Dev[tindex][lun] = NULL;
+  return;
+}
+
+/*+F*************************************************************************
+ * Function:
+ *   aic7xxx_slave_attach
+ *
+ * Description:
+ *   Configure the device we are attaching to the controller.  This is
+ *   where we get to do things like scan the INQUIRY data, set queue
+ *   depths, allocate command structs, etc.
+ *-F*************************************************************************/
+int
+aic7xxx_slave_attach(Scsi_Device *sdpnt)
+{
+  struct aic7xxx_host *p = (struct aic7xxx_host *) sdpnt->host->hostdata;
+  int scbnum, tindex, i;
+
+  tindex = sdpnt->id | (sdpnt->channel << 3);
+  p->dev_flags[tindex] |= DEVICE_PRESENT;
+
+  p->Scsi_Dev[tindex][sdpnt->lun] = sdpnt;
+  aic7xxx_device_queue_depth(p, sdpnt);
+
+  for(i = 0, scbnum = 0; i < p->host->max_id; i++)
+    scbnum += p->dev_scbs_needed[i];
   while (scbnum > p->scb_data->numscbs)
   {
     /*
@@ -7148,8 +7157,77 @@
      * the SCB in order to perform a swap operation (possible deadlock)
      */
     if ( aic7xxx_allocate_scb(p) == 0 )
-      return;
+      break;
+  }
+
+  /*
+   * We only need to check INQUIRY data on one lun of multi lun devices
+   * since speed negotiations are not lun specific.  Once we've check this
+   * particular target id once, the DEVICE_PRESENT flag will be set.
+   */
+  if (!(p->dev_flags[tindex] & DEVICE_DTR_SCANNED))
+  {
+    p->dev_flags[tindex] |= DEVICE_DTR_SCANNED;
+
+    if ( sdpnt->wdtr && (p->features & AHC_WIDE) )
+    {
+      p->needwdtr |= (1<<tindex);
+      p->needwdtr_copy |= (1<<tindex);
+      p->transinfo[tindex].goal_width = p->transinfo[tindex].user_width;
+    }
+    else
+    {
+      p->needwdtr &= ~(1<<tindex);
+      p->needwdtr_copy &= ~(1<<tindex);
+      pause_sequencer(p);
+      aic7xxx_set_width(p, sdpnt->id, sdpnt->channel, sdpnt->lun,
+                        MSG_EXT_WDTR_BUS_8_BIT, (AHC_TRANS_ACTIVE |
+                                                 AHC_TRANS_GOAL |
+                                                 AHC_TRANS_CUR) );
+      unpause_sequencer(p, FALSE);
+    }
+    if ( sdpnt->sdtr && p->transinfo[tindex].user_offset )
+    {
+      p->transinfo[tindex].goal_period = p->transinfo[tindex].user_period;
+      p->transinfo[tindex].goal_options = p->transinfo[tindex].user_options;
+      if (p->features & AHC_ULTRA2)
+        p->transinfo[tindex].goal_offset = MAX_OFFSET_ULTRA2;
+      else if (p->transinfo[tindex].goal_width == MSG_EXT_WDTR_BUS_16_BIT)
+        p->transinfo[tindex].goal_offset = MAX_OFFSET_16BIT;
+      else
+        p->transinfo[tindex].goal_offset = MAX_OFFSET_8BIT;
+      if ( sdpnt->ppr && p->transinfo[tindex].user_period <= 9 &&
+             p->transinfo[tindex].user_options )
+      {
+        p->needppr |= (1<<tindex);
+        p->needppr_copy |= (1<<tindex);
+        p->needsdtr &= ~(1<<tindex);
+        p->needsdtr_copy &= ~(1<<tindex);
+        p->needwdtr &= ~(1<<tindex);
+        p->needwdtr_copy &= ~(1<<tindex);
+        p->dev_flags[tindex] |= DEVICE_SCSI_3;
+      }
+      else
+      {
+        p->needsdtr |= (1<<tindex);
+        p->needsdtr_copy |= (1<<tindex);
+        p->transinfo[tindex].goal_period = 
+          MAX(10, p->transinfo[tindex].goal_period);
+        p->transinfo[tindex].goal_options = 0;
+      }
+    }
+    else
+    {
+      p->needsdtr &= ~(1<<tindex);
+      p->needsdtr_copy &= ~(1<<tindex);
+      p->transinfo[tindex].goal_period = 255;
+      p->transinfo[tindex].goal_offset = 0;
+      p->transinfo[tindex].goal_options = 0;
+    }
+    p->dev_flags[tindex] |= DEVICE_PRINT_DTR;
   }
+
+  return(0);
 }
 
 /*+F*************************************************************************
@@ -8246,7 +8324,6 @@
   host->can_queue = AIC7XXX_MAXSCB;
   host->cmd_per_lun = 3;
   host->sg_tablesize = AIC7XXX_MAX_SG;
-  host->select_queue_depths = aic7xxx_select_queue_depth;
   host->this_id = p->scsi_id;
   host->io_port = p->base;
   host->n_io_port = 0xFF;
diff -Nru a/drivers/scsi/hosts.h b/drivers/scsi/hosts.h
--- a/drivers/scsi/hosts.h	Wed Oct  9 05:38:48 2002
+++ b/drivers/scsi/hosts.h	Wed Oct  9 05:38:48 2002
@@ -97,6 +97,10 @@
      */
     int (* detect)(struct SHT *);
 
+    /*
+     * This function is only used by one driver and will be going away
+     * once it switches over to using the slave_detach() function instead.
+     */
     int (*revoke)(Scsi_Device *);
 
     /* Used with loadable modules to unload the host structures.  Note:
@@ -200,11 +204,59 @@
     int (* reset)(Scsi_Cmnd *, unsigned int);
 
     /*
-     * This function is used to select synchronous communications,
-     * which will result in a higher data throughput.  Not implemented
-     * yet.
+     * Once the device has responded to an INQUIRY and we know the device
+     * is online, call into the low level driver with the Scsi_Device *
+     * (so that the low level driver may save it off in a safe location
+     * for later use in calling scsi_adjust_queue_depth() or possibly
+     * other scsi_* functions) and char * to the INQUIRY return data buffer.
+     * This way, low level drivers will no longer have to snoop INQUIRY data
+     * to see if a drive supports PPR message protocol for Ultra160 speed
+     * negotiations or other similar items.  Instead it can simply wait until
+     * the scsi mid layer calls them with the data in hand and then it can
+     * do it's checking of INQUIRY data.  This will happen once for each new
+     * device added on this controller (including once for each lun on
+     * multi-lun devices, so low level drivers should take care to make
+     * sure that if they do tagged queueing on a per physical unit basis
+     * instead of a per logical unit basis that they have the mid layer
+     * allocate tags accordingly).
+     *
+     * Things currently recommended to be handled at this time include:
+     *
+     * 1.  Checking for tagged queueing capability and if able then calling
+     *     scsi_adjust_queue_depth() with the device pointer and the
+     *     suggested new queue depth.
+     * 2.  Checking for things such as SCSI level or DT bit in order to
+     *     determine if PPR message protocols are appropriate on this
+     *     device (or any other scsi INQUIRY data specific things the
+     *     driver wants to know in order to properly handle this device).
+     * 3.  Allocating command structs that the device will need.
+     * 4.  Setting the default timeout on this device (if needed).
+     * 5.  Saving the Scsi_Device pointer so that the low level driver
+     *     will be able to easily call back into scsi_adjust_queue_depth
+     *     again should it be determined that the queue depth for this
+     *     device should be lower or higher than it is initially set to.
+     * 6.  Allocate device data structures as needed that can be attached
+     *     to the Scsi_Device * via SDpnt->host_device_ptr
+     * 7.  Anything else the low level driver might want to do on a device
+     *     specific setup basis...
+     * 8.  Return 0 on success, non-0 on error.  The device will be marked
+     *     as offline on error so that no access will occur.
+     */
+    int (* slave_attach)(Scsi_Device *);
+
+    /*
+     * If we are getting ready to remove a device from the scsi chain then
+     * we call into the low level driver to let them know.  Once a low
+     * level driver has been informed that a drive is going away, the low
+     * level driver *must* remove it's pointer to the Scsi_Device because
+     * it is going to be kfree()'ed shortly.  It is no longer safe to call
+     * any mid layer functions with this Scsi_Device *.  Additionally, the
+     * mid layer will not make any more calls into the low level driver's
+     * queue routine with this device, so it is safe for the device driver
+     * to deallocate all structs/commands/etc that is has allocated
+     * specifically for this device at the time of this call.
      */
-    int (* slave_attach)(int, int);
+    void (* slave_detach)(Scsi_Device *);
 
     /*
      * This function determines the bios parameters for a given
@@ -217,6 +269,8 @@
 
     /*
      * Used to set the queue depth for a specific device.
+     *
+     * Once the slave_attach() function is in full use, this will go away.
      */
     void (*select_queue_depths)(struct Scsi_Host *, Scsi_Device *);
 
diff -Nru a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
--- a/drivers/scsi/scsi.c	Wed Oct  9 05:38:48 2002
+++ b/drivers/scsi/scsi.c	Wed Oct  9 05:38:48 2002
@@ -551,6 +551,7 @@
 {
 	unsigned long flags;
         Scsi_Device * SDpnt;
+	int alloc_cmd = 0;
 
 	spin_lock_irqsave(&device_request_lock, flags);
 
@@ -567,6 +568,25 @@
 				   atomic_read(&SCpnt->host->host_active),
 				   SCpnt->host->host_failed));
 
+	if(SDpnt->queue_depth > SDpnt->new_queue_depth) {
+		Scsi_Cmnd *prev, *next;
+		/*
+		 * Release the command block and decrement the queue
+		 * depth.
+		 */
+		for(prev = NULL, next = SDpnt->device_queue;
+				next != SCpnt;
+				prev = next, next = next->next) ;
+		if(prev == NULL)
+			SDpnt->device_queue = next->next;
+		else
+			prev->next = next->next;
+		kfree((char *)SCpnt);
+		SDpnt->queue_depth--;
+	} else if(SDpnt->queue_depth < SDpnt->new_queue_depth) {
+		alloc_cmd = 1;
+		SDpnt->queue_depth++;
+	}
 	spin_unlock_irqrestore(&device_request_lock, flags);
 
         /*
@@ -575,6 +595,48 @@
          * they wake up.  
          */
 	wake_up(&SDpnt->scpnt_wait);
+
+	/*
+	 * We are happy to release command blocks in the scope of the
+	 * device_request_lock since that's nice and quick, but allocation
+	 * can take more time so do it outside that scope instead.
+	 */
+	if(alloc_cmd) {
+		Scsi_Cmnd *newSCpnt;
+
+		newSCpnt = kmalloc(sizeof(Scsi_Cmnd), GFP_ATOMIC |
+				(SDpnt->host->unchecked_isa_dma ?
+				 GFP_DMA : 0));
+		if(newSCpnt) {
+			memset(newSCpnt, 0, sizeof(Scsi_Cmnd));
+			newSCpnt->host = SDpnt->host;
+			newSCpnt->device = SDpnt;
+			newSCpnt->target = SDpnt->id;
+			newSCpnt->lun = SDpnt->lun;
+			newSCpnt->channel = SDpnt->channel;
+			newSCpnt->request = NULL;
+			newSCpnt->use_sg = 0;
+			newSCpnt->old_use_sg = 0;
+			newSCpnt->old_cmd_len = 0;
+			newSCpnt->underflow = 0;
+			newSCpnt->old_underflow = 0;
+			newSCpnt->transfersize = 0;
+			newSCpnt->resid = 0;
+			newSCpnt->serial_number = 0;
+			newSCpnt->serial_number_at_timeout = 0;
+			newSCpnt->host_scribble = NULL;
+			newSCpnt->state = SCSI_STATE_UNUSED;
+			newSCpnt->owner = SCSI_OWNER_NOBODY;
+			spin_lock_irqsave(&device_request_lock, flags);
+			newSCpnt->next = SDpnt->device_queue;
+			SDpnt->device_queue = newSCpnt;
+			spin_unlock_irqrestore(&device_request_lock, flags);
+		} else {
+			spin_lock_irqsave(&device_request_lock, flags);
+			SDpnt->queue_depth--;
+			spin_unlock_irqrestore(&device_request_lock, flags);
+		}
+	}
 }
 
 /*
@@ -1447,8 +1509,8 @@
 		SDpnt->device_queue = SCnext = SCpnt->next;
 		kfree((char *) SCpnt);
 	}
-	SDpnt->has_cmdblocks = 0;
 	SDpnt->queue_depth = 0;
+	SDpnt->new_queue_depth = 0;
 	spin_unlock_irqrestore(&device_request_lock, flags);
 }
 
@@ -1463,63 +1525,115 @@
  *
  * Lock status: No locking assumed or required.
  *
- * Notes:
+ * Notes:	We really only allocate one command here.  We will allocate
+ *		more commands as needed once the device goes into real use.
  */
 void scsi_build_commandblocks(Scsi_Device * SDpnt)
 {
 	unsigned long flags;
-	struct Scsi_Host *host = SDpnt->host;
-	int j;
 	Scsi_Cmnd *SCpnt;
 
+	if (SDpnt->queue_depth != 0)
+		return;
+		
+	SCpnt = (Scsi_Cmnd *) kmalloc(sizeof(Scsi_Cmnd), GFP_ATOMIC |
+			(SDpnt->host->unchecked_isa_dma ? GFP_DMA : 0));
+	if (NULL == SCpnt) {
+		/*
+		 * Since we don't currently have *any* command blocks on this
+		 * device, go ahead and try an atomic allocation...
+		 */
+		SCpnt = (Scsi_Cmnd *) kmalloc(sizeof(Scsi_Cmnd), GFP_ATOMIC |
+			(SDpnt->host->unchecked_isa_dma ? GFP_DMA : 0));
+		if (NULL == SCpnt)
+			return;	/* Oops, we aren't going anywhere for now */
+	}
+
+	memset(SCpnt, 0, sizeof(Scsi_Cmnd));
+	SCpnt->host = SDpnt->host;
+	SCpnt->device = SDpnt;
+	SCpnt->target = SDpnt->id;
+	SCpnt->lun = SDpnt->lun;
+	SCpnt->channel = SDpnt->channel;
+	SCpnt->request = NULL;
+	SCpnt->use_sg = 0;
+	SCpnt->old_use_sg = 0;
+	SCpnt->old_cmd_len = 0;
+	SCpnt->underflow = 0;
+	SCpnt->old_underflow = 0;
+	SCpnt->transfersize = 0;
+	SCpnt->resid = 0;
+	SCpnt->serial_number = 0;
+	SCpnt->serial_number_at_timeout = 0;
+	SCpnt->host_scribble = NULL;
+	SCpnt->state = SCSI_STATE_UNUSED;
+	SCpnt->owner = SCSI_OWNER_NOBODY;
 	spin_lock_irqsave(&device_request_lock, flags);
+	if(SDpnt->new_queue_depth == 0)
+		SDpnt->new_queue_depth = 1;
+	SDpnt->queue_depth++;
+	SCpnt->next = SDpnt->device_queue;
+	SDpnt->device_queue = SCpnt;
+	spin_unlock_irqrestore(&device_request_lock, flags);
+}
 
-	if (SDpnt->queue_depth == 0)
+/*
+ * Function:	scsi_adjust_queue_depth()
+ *
+ * Purpose:	Allow low level drivers to tell us to change the queue depth
+ * 		on a specific SCSI device
+ *
+ * Arguments:	SDpnt	- SCSI Device in question
+ * 		tagged	- Do we use tagged queueing (non-0) or do we treat
+ * 			  this device as an untagged device (0)
+ * 		tags	- Number of tags allowed if tagged queueing enabled,
+ * 			  or number of commands the low level driver can
+ * 			  queue up in non-tagged mode (as per cmd_per_lun).
+ *
+ * Returns:	Nothing
+ *
+ * Lock Status:	None held on entry
+ *
+ * Notes:	Low level drivers may call this at any time and we will do
+ * 		the right thing depending on whether or not the device is
+ * 		currently active and whether or not it even has the
+ * 		command blocks built yet.
+ *
+ * 		If cmdblocks != 0 then we are a live device.  We just set the
+ * 		new_queue_depth variable and when the scsi completion handler
+ * 		notices that queue_depth != new_queue_depth it will work to
+ *		rectify the situation.  If new_queue_depth is less than current
+ *		queue_depth, then it will free the completed command instead of
+ *		putting it back on the free list and dec queue_depth.  Otherwise
+ *		it will try to allocate a new command block for the device and
+ *		put it on the free list along with the command that is being
+ *		completed.  Obviously, if the device isn't doing anything then
+ *		neither is this code, so it will bring the devices queue depth
+ *		back into line when the device is actually being used.  This
+ *		keeps us from needing to fire off a kernel thread or some such
+ *		nonsense (this routine can be called from interrupt code, so
+ *		handling allocations here would be tricky and risky, making
+ *		a kernel thread a much safer way to go if we wanted to handle
+ *		the work immediately instead of letting it get done a little
+ *		at a time in the completion handler).
+ */
+void scsi_adjust_queue_depth(Scsi_Device *SDpnt, int tagged, int tags)
+{
+	unsigned long flags;
+
+	/*
+	 * refuse to set tagged depth to an unworkable size
+	 */
+	if(tags == 0)
+		return;
+	spin_lock_irqsave(&device_request_lock, flags);
+	SDpnt->new_queue_depth = tags;
+	SDpnt->tagged_queue = tagged;
+	spin_unlock_irqrestore(&device_request_lock, flags);
+	if(SDpnt->queue_depth == 0)
 	{
-		SDpnt->queue_depth = host->cmd_per_lun;
-		if (SDpnt->queue_depth == 0)
-			SDpnt->queue_depth = 1; /* live to fight another day */
-	}
-	SDpnt->device_queue = NULL;
-
-	for (j = 0; j < SDpnt->queue_depth; j++) {
-		SCpnt = (Scsi_Cmnd *)
-		    kmalloc(sizeof(Scsi_Cmnd),
-				     GFP_ATOMIC |
-				(host->unchecked_isa_dma ? GFP_DMA : 0));
-		if (NULL == SCpnt)
-			break;	/* If not, the next line will oops ... */
-		memset(SCpnt, 0, sizeof(Scsi_Cmnd));
-		SCpnt->host = host;
-		SCpnt->device = SDpnt;
-		SCpnt->target = SDpnt->id;
-		SCpnt->lun = SDpnt->lun;
-		SCpnt->channel = SDpnt->channel;
-		SCpnt->request = NULL;
-		SCpnt->use_sg = 0;
-		SCpnt->old_use_sg = 0;
-		SCpnt->old_cmd_len = 0;
-		SCpnt->underflow = 0;
-		SCpnt->old_underflow = 0;
-		SCpnt->transfersize = 0;
-		SCpnt->resid = 0;
-		SCpnt->serial_number = 0;
-		SCpnt->serial_number_at_timeout = 0;
-		SCpnt->host_scribble = NULL;
-		SCpnt->next = SDpnt->device_queue;
-		SDpnt->device_queue = SCpnt;
-		SCpnt->state = SCSI_STATE_UNUSED;
-		SCpnt->owner = SCSI_OWNER_NOBODY;
-	}
-	if (j < SDpnt->queue_depth) {	/* low on space (D.Gilbert 990424) */
-		printk(KERN_WARNING "scsi_build_commandblocks: want=%d, space for=%d blocks\n",
-		       SDpnt->queue_depth, j);
-		SDpnt->queue_depth = j;
-		SDpnt->has_cmdblocks = (0 != j);
-	} else {
-		SDpnt->has_cmdblocks = 1;
+		scsi_build_commandblocks(SDpnt);
 	}
-	spin_unlock_irqrestore(&device_request_lock, flags);
 }
 
 void __init scsi_host_no_insert(char *str, int n)
@@ -1758,13 +1872,6 @@
 			goto out;	/* We do not yet support unplugging */
 
 		scan_scsis(HBA_ptr, 1, channel, id, lun);
-
-		/* FIXME (DB) This assumes that the queue_depth routines can be used
-		   in this context as well, while they were all designed to be
-		   called only once after the detect routine. (DB) */
-		/* queue_depth routine moved to inside scan_scsis(,1,,,) so
-		   it is called before build_commandblocks() */
-
 		err = length;
 		goto out;
 	}
@@ -1826,6 +1933,8 @@
 			 */
                         if (HBA_ptr->hostt->revoke)
                                 HBA_ptr->hostt->revoke(scd);
+			if (HBA_ptr->hostt->slave_detach)
+				(*HBA_ptr->hostt->slave_detach) (scd);
 			devfs_unregister (scd->de);
 			scsi_release_commandblocks(scd);
 
@@ -1963,9 +2072,6 @@
 				/* first register parent with driverfs */
 				device_register(&shpnt->host_driverfs_dev);
 				scan_scsis(shpnt, 0, 0, 0, 0);
-				if (shpnt->select_queue_depths != NULL) {
-					(shpnt->select_queue_depths) (shpnt, shpnt->host_queue);
-				}
 			}
 		}
 
@@ -1985,7 +2091,7 @@
 							(*sdtpnt->attach) (SDpnt);
 					if (SDpnt->attached) {
 						scsi_build_commandblocks(SDpnt);
-						if (0 == SDpnt->has_cmdblocks)
+						if (SDpnt->queue_depth == 0)
 							out_of_space = 1;
 					}
 				}
@@ -2116,6 +2222,8 @@
 				printk(KERN_ERR "Attached usage count = %d\n", SDpnt->attached);
 				goto err_out;
 			}
+			if (shpnt->hostt->slave_detach)
+				(*shpnt->hostt->slave_detach) (SDpnt);
 			devfs_unregister (SDpnt->de);
 			put_device(&SDpnt->sdev_driverfs_dev);
 		}
@@ -2272,10 +2380,10 @@
 			 * If this driver attached to the device, and don't have any
 			 * command blocks for this device, allocate some.
 			 */
-			if (SDpnt->attached && SDpnt->has_cmdblocks == 0) {
+			if (SDpnt->attached && SDpnt->queue_depth == 0) {
 				SDpnt->online = TRUE;
 				scsi_build_commandblocks(SDpnt);
-				if (0 == SDpnt->has_cmdblocks)
+				if (SDpnt->queue_depth == 0)
 					out_of_space = 1;
 			}
 		}
@@ -2325,6 +2433,8 @@
 				 * Nobody is using this device any more.  Free all of the
 				 * command structures.
 				 */
+				if (shpnt->hostt->slave_detach)
+					(*shpnt->hostt->slave_detach) (SDpnt);
 				scsi_release_commandblocks(SDpnt);
 			}
 		}
@@ -2678,9 +2788,13 @@
         SDpnt->host = SHpnt;
         SDpnt->id = SHpnt->this_id;
         SDpnt->type = -1;
-        SDpnt->queue_depth = 1;
+	SDpnt->new_queue_depth = 1;
         
 	scsi_build_commandblocks(SDpnt);
+	if(SDpnt->queue_depth == 0) {
+		kfree(SDpnt);
+		return NULL;
+	}
 
 	scsi_initialize_queue(SDpnt, SHpnt);
 
diff -Nru a/drivers/scsi/scsi.h b/drivers/scsi/scsi.h
--- a/drivers/scsi/scsi.h	Wed Oct  9 05:38:48 2002
+++ b/drivers/scsi/scsi.h	Wed Oct  9 05:38:48 2002
@@ -481,6 +481,7 @@
 extern void scsi_bottom_half_handler(void);
 extern void scsi_release_commandblocks(Scsi_Device * SDpnt);
 extern void scsi_build_commandblocks(Scsi_Device * SDpnt);
+extern void scsi_adjust_queue_depth(Scsi_Device *, int, int);
 extern void scsi_done(Scsi_Cmnd * SCpnt);
 extern void scsi_finish_command(Scsi_Cmnd *);
 extern int scsi_retry_command(Scsi_Cmnd *);
@@ -563,6 +564,8 @@
 	volatile unsigned short device_busy;	/* commands actually active on low-level */
 	Scsi_Cmnd *device_queue;	/* queue of SCSI Command structures */
         Scsi_Cmnd *current_cmnd;	/* currently active command */
+	unsigned short queue_depth;	/* How deep of a queue we have */
+	unsigned short new_queue_depth; /* How deep of a queue we want */
 
 	unsigned int id, lun, channel;
 
@@ -586,24 +589,25 @@
 	unsigned char current_tag;	/* current tag */
 	unsigned char sync_min_period;	/* Not less than this period */
 	unsigned char sync_max_offset;	/* Not greater than this offset */
-	unsigned char queue_depth;	/* How deep a queue to use */
 
 	unsigned online:1;
 	unsigned writeable:1;
 	unsigned removable:1;
 	unsigned random:1;
-	unsigned has_cmdblocks:1;
 	unsigned changed:1;	/* Data invalid due to media change */
 	unsigned busy:1;	/* Used to prevent races */
 	unsigned lockable:1;	/* Able to prevent media removal */
 	unsigned borken:1;	/* Tell the Seagate driver to be 
 				 * painfully slow on this device */
-	unsigned tagged_supported:1;	/* Supports SCSI-II tagged queuing */
-	unsigned tagged_queue:1;	/* SCSI-II tagged queuing enabled */
 	unsigned disconnect:1;	/* can disconnect */
 	unsigned soft_reset:1;	/* Uses soft reset option */
-	unsigned sync:1;	/* Negotiate for sync transfers */
-	unsigned wide:1;	/* Negotiate for WIDE transfers */
+	unsigned sdtr:1;	/* Device supports SDTR messages */
+	unsigned wdtr:1;	/* Device supports WDTR messages */
+	unsigned ppr:1;		/* Device supports PPR messages */
+	unsigned tagged_supported:1;	/* Supports SCSI-II tagged queuing */
+	unsigned tagged_queue:1;	/* SCSI-II tagged queuing enabled */
+	unsigned simple_tags:1;	/* Device supports simple queue tag messages */
+	unsigned ordered_tags:1;/* Device supports ordered queue tag messages */
 	unsigned single_lun:1;	/* Indicates we should only allow I/O to
 				 * one of the luns for the device at a 
 				 * time. */
diff -Nru a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
--- a/drivers/scsi/scsi_scan.c	Wed Oct  9 05:38:48 2002
+++ b/drivers/scsi/scsi_scan.c	Wed Oct  9 05:38:48 2002
@@ -1409,6 +1409,14 @@
 	sdev->lockable = sdev->removable;
 	sdev->soft_reset = (inq_result[7] & 1) && ((inq_result[3] & 7) == 2);
 
+	if (sdev->scsi_level >= SCSI_3 || (sdev->inquiry_len > 56 &&
+		inq_result[56] & 0x04))
+		sdev->ppr = 1;
+	if (inq_result[7] & 0x60)
+		sdev->wdtr = 1;
+	if (inq_result[7] & 0x10)
+		sdev->sdtr = 1;
+
 	/*
 	 * XXX maybe move the identifier and driverfs/devfs setup to a new
 	 * function, and call them after this function is called.
@@ -1513,9 +1521,9 @@
 	 * XXX maybe change scsi_release_commandblocks to not reset
 	 * queue_depth to 0.
 	 */
-	sdevscan->queue_depth = 1;
+	sdevscan->new_queue_depth = 1;
 	scsi_build_commandblocks(sdevscan);
-	if (sdevscan->has_cmdblocks == 0)
+	if (sdevscan->queue_depth == 0)
 		goto alloc_failed;
 
 	sreq = scsi_allocate_request(sdevscan);
@@ -1589,7 +1597,7 @@
 		kfree(scsi_result);
 	if (sreq != NULL)
 		scsi_release_request(sreq);
-	if (sdevscan->has_cmdblocks != 0)
+	if (sdevscan->queue_depth != 0)
 		scsi_release_commandblocks(sdevscan);
 	return SCSI_SCAN_NO_RESPONSE;
 }
@@ -1743,9 +1751,9 @@
 	if (sdevscan->scsi_level < SCSI_3)
 		return 1;
 
-	sdevscan->queue_depth = 1;
+	sdevscan->new_queue_depth = 1;
 	scsi_build_commandblocks(sdevscan);
-	if (sdevscan->has_cmdblocks == 0) {
+	if (sdevscan->queue_depth == 0) {
 		printk(ALLOC_FAILURE_MSG, __FUNCTION__);
 		/*
 		 * We are out of memory, don't try scanning any further.
@@ -2018,6 +2026,17 @@
 		 */
 		if (shost->select_queue_depths != NULL)
 			(shost->select_queue_depths) (shost, shost->host_queue);
+		if (shost->hostt->slave_attach != NULL)
+			if ((shost->hostt->slave_attach) (sdev) != 0) {
+				/*
+				 * Low level driver failed to attach this
+				 * device, we've got to kick it back out
+				 * now as a result :-(
+				 */
+				printk("scsi_scan_selected_lun: slave_attach "
+					"failed, marking device OFFLINE.\n");
+				sdev->online = FALSE;
+			}
 
 		for (sdt = scsi_devicelist; sdt; sdt = sdt->next)
 			if (sdt->init && sdt->dev_noticed)
@@ -2028,7 +2047,7 @@
 				(*sdt->attach) (sdev);
 				if (sdev->attached) {
 					scsi_build_commandblocks(sdev);
-					if (sdev->has_cmdblocks == 0)
+					if (sdev->queue_depth == 0)
 						printk(ALLOC_FAILURE_MSG,
 						       __FUNCTION__);
 				}

[-- Attachment #3: james_bottomley_scsi_fix.patch --]
[-- Type: text/plain, Size: 3643 bytes --]

Received: from localhost (nighthawk [127.0.0.1])
	by nighthawk.sr71.net (8.11.6/8.11.6) with ESMTP id g9B19MU01846
	for <dave@localhost>; Thu, 10 Oct 2002 18:09:22 -0700
Received: from imap.linux.ibm.com [9.27.103.44]
	by localhost with IMAP (fetchmail-5.9.0)
	for dave@localhost (multi-drop); Thu, 10 Oct 2002 18:09:22 -0700 (PDT)
Received: from localhost ([unix socket])
	by imap.linux.ibm.com (Cyrus v2.1.9) with LMTP; Thu, 10 Oct 2002 21:09:26 -0400
X-Sieve: CMU Sieve 2.2
Received: from smtp.linux.ibm.com (linux.ibm.com [9.26.4.197])
	by imap.linux.ibm.com (Postfix) with ESMTP id A15D57C017
	for <haveblue@imap.linux.ibm.com>; Thu, 10 Oct 2002 21:09:25 -0400 (EDT)
Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206])
	by smtp.linux.ibm.com (Postfix) with ESMTP id 04B4F3FE06
	for <haveblue@linux.ibm.com>; Thu, 10 Oct 2002 21:09:25 -0400 (EDT)
Received: from e2.ny.us.ibm.com (e2.esmtp.ibm.com [9.14.6.102])
	by northrelay04.pok.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g9B19IKf137968
	for <haveblue@us.ibm.com>; Thu, 10 Oct 2002 21:09:22 -0400
Received: from pogo.mtv1.steeleye.com (host194.steeleye.com [66.206.164.34])
	by e2.ny.us.ibm.com (8.12.2/8.12.2) with ESMTP id g9B19KHd065956
	for <haveblue@us.ibm.com>; Thu, 10 Oct 2002 21:09:20 -0400
Received: (from root@localhost)
	by pogo.mtv1.steeleye.com (8.9.3/8.9.3) id SAA07896
	for <haveblue@us.ibm.com>; Thu, 10 Oct 2002 18:09:19 -0700
Received: from localhost.localdomain (vpn-233.mtv1.steeleye.com [172.16.1.233])
	by pogo.mtv1.steeleye.com (8.9.3/8.9.3) with ESMTP id SAA07617;
	Thu, 10 Oct 2002 18:09:16 -0700
Received: from mulgrave (jejb@localhost)
	by localhost.localdomain (8.11.6/linuxconf) with ESMTP id g9B19FJ14530;
	Thu, 10 Oct 2002 18:09:16 -0700
Message-Id: <200210110109.g9B19FJ14530@localhost.localdomain>
X-Mailer: exmh version 2.4 06/23/2000 with nmh-1.0.4
To: Dave Hansen <haveblue@us.ibm.com>
Cc: linux-scsi@vger.kernel.org
Subject: Re: Degraded I/O performance, since 2.5.41 
In-Reply-To: Message from Dave Hansen <haveblue@us.ibm.com> 
   of "Thu, 10 Oct 2002 16:48:47 PDT." <3DA611DF.3000206@us.ibm.com> 
Mime-Version: 1.0
Content-Type: multipart/mixed ;
	boundary="==_Exmh_-5647072960"
Date: Thu, 10 Oct 2002 18:09:14 -0700
From: James Bottomley <James.Bottomley@steeleye.com>
X-AntiVirus: scanned for viruses by AMaViS 0.2.1 (http://amavis.org/)
X-Spam-Status: No, hits=-7.3 required=5.0
	tests=IN_REP_TO,DOUBLE_CAPSWORD,UNIFIED_PATCH
	version=2.31
X-Spam-Level: 
X-Fetchmail-Warning: no recipient addresses matched declared local names

This is a multipart MIME message.

--==_Exmh_-5647072960
Content-Type: text/plain; charset=us-ascii

OK, this patch should fix it.  Do your performance numbers for ips improve 
again with this?

James


--==_Exmh_-5647072960
Content-Type: text/plain ; name="tmp.diff"; charset=us-ascii
Content-Description: tmp.diff
Content-Disposition: attachment; filename="tmp.diff"

===== drivers/scsi/scsi_scan.c 1.24 vs edited =====
--- 1.24/drivers/scsi/scsi_scan.c	Tue Oct  8 15:45:57 2002
+++ edited/drivers/scsi/scsi_scan.c	Thu Oct 10 17:40:53 2002
@@ -1477,11 +1477,14 @@
 		if (sdt->detect)
 			sdev->attached += (*sdt->detect) (sdev);
 
-	if (sdev->host->hostt->slave_attach != NULL)
+	if (sdev->host->hostt->slave_attach != NULL) {
 		if (sdev->host->hostt->slave_attach(sdev) != 0) {
 			printk(KERN_INFO "scsi_add_lun: failed low level driver attach, setting device offline");
 			sdev->online = FALSE;
 		}
+	} else if(sdev->host->cmd_per_lun) {
+		scsi_adjust_queue_depth(sdev, 0, sdev->host->cmd_per_lun);
+	}
 
 	if (sdevnew != NULL)
 		*sdevnew = sdev;

--==_Exmh_-5647072960--



^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: Degraded I/O performance, since 2.5.41
@ 2002-10-11 13:49 Jeffery, David
  2002-10-12 17:47 ` Mike Anderson
  0 siblings, 1 reply; 5+ messages in thread
From: Jeffery, David @ 2002-10-11 13:49 UTC (permalink / raw)
  To: 'Dave Hansen'; +Cc: linux-scsi, lkml




> From: Dave Hansen 
> James Bottomley wrote:
> > OK, this patch should fix it.  Do your performance numbers 
> for ips improve 
> > again with this?
> 
> Yes, they are better, but still about 10% below what I was seeing 
> before.  Thank you for getting this out so quickly.  I can do 
> reasonable work with this.
> 
> Are the ServeRAID people aware of this situation?  Do they know that 
> their performance could be in the toliet if they don't 
> implement queue 
> resizing in the driver?
> -- 
> Dave Hansen
> haveblue@us.ibm.com
> 

Dave,

Thank you for for adding me to the CC list.  I didn't realize how the
queueing work would affect scsi drivers.

I'm using an older 2.5 kernel so I hadn't seen the performance drop.
I'll update my kernel and get to work on it next week when I have
time.

David Jeffery

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Degraded I/O performance, since 2.5.41
  2002-10-11 13:49 Jeffery, David
@ 2002-10-12 17:47 ` Mike Anderson
  0 siblings, 0 replies; 5+ messages in thread
From: Mike Anderson @ 2002-10-12 17:47 UTC (permalink / raw)
  To: Jeffery, David; +Cc: 'Dave Hansen', linux-scsi, linux-kernel

Sorry if this is a resend it looked this did not make it onto the list
Friday.

Jeffery, David [David_Jeffery@adaptec.com] wrote:
> > Yes, they are better, but still about 10% below what I was seeing 
> > before.  Thank you for getting this out so quickly.  I can do 
> > reasonable work with this.

Dave H,

The short term patch will not bring your queue_depth value back up all
the way. The ips drivers cmd_per_lun value is 16. The old method would
distribute the queue_depth among the devices on the adapter. I believe
your system has a ServeRAID 4M which means you could have had a
queue_depth as high as 95. This seems high but from your numbers a value
higher than 16 is most likely needed.

You can check your old and new values post booting the kernel if you have
sg configured in with:

cat /proc/scsi/sg/device_hdr /proc/scsi/sg/devices

> 
> I'm using an older 2.5 kernel so I hadn't seen the performance drop.
> I'll update my kernel and get to work on it next week when I have

David J,
Here is a quick patch for you to review. I tested and seems to bring my
system back up to per patch queue depth values. I used a enq value to
simulate past behavior in the attach function. I tested with one logical
and two logical devices.


Here is some sg output at different kernel and logical driver values.

2.5.40 1 logical drive:
host    chan    id      lun     type    opens   qdepth  busy    online
0       0       0       0       0       3       31      0       1

2.5.40 2 logical drives:
host    chan    id      lun     type    opens   qdepth  busy    online
0       0       0       0       0       3       15      0       1
0       0       1       0       0       0       15      0       1

2.5 current 2 logical drives: 
host    chan    id      lun     type    opens   qdepth  busy    online
0       0       0       0       0       3       1       0       1
0       0       1       0       0       0       1       0       1

2.5 current 2 logical drives + patch:
host    chan    id      lun     type    opens   qdepth  busy    online
0       0       0       0       0       3       15      0       1
0       0       1       0       0       0       7       0       1

2.5 current 2 logical drives + patch post some IO:
host    chan    id      lun     type    opens   qdepth  busy    online
0       0       0       0       0       3       15      0       1
0       0       1       0       0       0       15      0       1

-andmike
--
Michael Anderson
andmike@us.ibm.com

 ips.c |   47 ++++++++++++++++-------------------------------
 ips.h |    3 ++-
 2 files changed, 18 insertions(+), 32 deletions(-)
------

--- 1.26/drivers/scsi/ips.c	Tue Oct  8 00:51:39 2002
+++ edited/drivers/scsi/ips.c	Fri Oct 11 13:24:08 2002
@@ -481,7 +481,6 @@
 static void ips_free_flash_copperhead(ips_ha_t *ha);
 static void ips_get_bios_version(ips_ha_t *, int);
 static void ips_identify_controller(ips_ha_t *);
-static void ips_select_queue_depth(struct Scsi_Host *, Scsi_Device *);
 static void ips_chkstatus(ips_ha_t *, IPS_STATUS *);
 static void ips_enable_int_copperhead(ips_ha_t *);
 static void ips_enable_int_copperhead_memio(ips_ha_t *);
@@ -1087,7 +1086,6 @@
          sh->n_io_port = io_addr ? 255 : 0;
          sh->unique_id = (io_addr) ? io_addr : mem_addr;
          sh->irq = irq;
-         sh->select_queue_depths = ips_select_queue_depth;
          sh->sg_tablesize = sh->hostt->sg_tablesize;
          sh->can_queue = sh->hostt->can_queue;
          sh->cmd_per_lun = sh->hostt->cmd_per_lun;
@@ -1820,45 +1818,33 @@
 
 /****************************************************************************/
 /*                                                                          */
-/* Routine Name: ips_select_queue_depth                                     */
+/* Routine Name: ips_slave_attach                                           */
 /*                                                                          */
 /* Routine Description:                                                     */
 /*                                                                          */
-/*   Select queue depths for the devices on the contoller                   */
+/*   Configure the device we are attaching to this controller               */
 /*                                                                          */
 /****************************************************************************/
-static void
-ips_select_queue_depth(struct Scsi_Host *host, Scsi_Device *scsi_devs) {
-   Scsi_Device *device;
+int
+ips_slave_attach(Scsi_Device *scsi_dev) {
    ips_ha_t    *ha;
-   int          count = 0;
-   int          min;
+   int          queue_depth;
+   int          min, per_logical;
+
+   ha = (ips_ha_t *) scsi_dev->host->hostdata;;
 
-   ha = IPS_HA(host);
    min = ha->max_cmds / 4;
+   per_logical = ( ha->max_cmds -1 ) / ha->enq->ucLogDriveCount;
 
-   for (device = scsi_devs; device; device = device->next) {
-     if (device->host == host) {
-        if ((device->channel == 0) && (device->type == 0))
-           count++;
-     }
+   if ((scsi_dev->channel == 0) && (scsi_dev->type == 0)) {
+      queue_depth = max(per_logical, min);
+   } else {
+      queue_depth = 2;
    }
 
-   for (device = scsi_devs; device; device = device->next) {
-      if (device->host == host) {
-         if ((device->channel == 0) && (device->type == 0)) {
-            device->queue_depth = ( ha->max_cmds - 1 ) / count;
-            if (device->queue_depth < min) 
-               device->queue_depth = min;
-         }
-         else {
-            device->queue_depth = 2;
-         }
-
-         if (device->queue_depth < 2)
-            device->queue_depth = 2;
-      }
-   }
+   scsi_adjust_queue_depth(scsi_dev, 0, queue_depth);
+   
+   return 0;
 }
 
 /****************************************************************************/
@@ -7407,7 +7393,6 @@
     sh->n_io_port = io_addr ? 255 : 0;
     sh->unique_id = (io_addr) ? io_addr : mem_addr;
     sh->irq = irq;
-    sh->select_queue_depths = ips_select_queue_depth;
     sh->sg_tablesize = sh->hostt->sg_tablesize;
     sh->can_queue = sh->hostt->can_queue;
     sh->cmd_per_lun = sh->hostt->cmd_per_lun;
--- 1.9/drivers/scsi/ips.h	Tue Oct  8 00:51:39 2002
+++ edited/drivers/scsi/ips.h	Fri Oct 11 11:39:14 2002
@@ -62,6 +62,7 @@
    extern int ips_biosparam(Disk *, struct block_device *, int *);
    extern const char * ips_info(struct Scsi_Host *);
    extern void do_ips(int, void *, struct pt_regs *);
+   extern int ips_slave_attach(Scsi_Device *);
 
    /*
     * Some handy macros
@@ -481,7 +482,7 @@
     eh_host_reset_handler : ips_eh_reset, \
     abort : NULL,                         \
     reset : NULL,                         \
-    slave_attach : NULL,                  \
+    slave_attach : ips_slave_attach,      \
     bios_param : ips_biosparam,           \
     can_queue : 0,                        \
     this_id: -1,                          \



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2002-10-12 17:40 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-10-10  1:09 Degraded I/O performance, since 2.5.41 Dave Hansen
     [not found] <Pine.LNX.4.44.0210092015170.9790-100000@home.transmeta.com>
     [not found] ` <3DA61041.9080808@us.ibm.com>
     [not found]   ` <20021011004227.GA27073@redhat.com>
2002-10-11  0:53     ` Dave Hansen
2002-10-11  1:45       ` Dave Hansen
  -- strict thread matches above, loose matches on Subject: below --
2002-10-11 13:49 Jeffery, David
2002-10-12 17:47 ` Mike Anderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox