* [PATCH] 0/7 per scsi_device queue lock patches
@ 2003-03-25 1:53 Patrick Mansfield
2003-03-25 1:54 ` [PATCH] 1/7 starved changes - use a list_head for starved queue's Patrick Mansfield
2003-03-27 16:14 ` [PATCH] 0/7 per scsi_device queue lock patches James Bottomley
0 siblings, 2 replies; 34+ messages in thread
From: Patrick Mansfield @ 2003-03-25 1:53 UTC (permalink / raw)
To: linux-scsi
The following patches against a recent 2.5.x bk tree (pulled on March 24;
they should apply fine on top of 2.5.66) lead to splitting the scsi
queue_lock into a per-scsi_device lock rather than the current
per-scsi_host lock.
The first 2 patches were already posted and discussed, and include changes
based on that discussion.
Hopefully there are no major issues with the first 4 patches - I would
like to have them pushed for inclusion in 2.5.
The patches are incremental (each applies on top of the previous one).
Patch descriptions:
patch 1: starved changes - use a list_head for starved queue's
patch 2: add missing scsi_queue_next_request calls
patch 3: consolidate single_lun code (this code goes away with patch 7)
patch 4: cleanup/consolidate code in scsi_request_fn
patch 5: alloc a request_queue on each scsi_alloc_sdev call
patch 6: add and use a per-scsi_device queue_lock
patch 7: fix single_lun code for per-scsi_device queue_lock.
If you run with all patches applied, let me know your results.
I've run tests across 20 drives and 2 qla 2300 adapters on an 8 CPU NUMAQ
system, using the feral driver with can_queue set to 50 and queue_depth
set to 16.
I ran with 20 processes, each process just continuously re-reads the first
block of a different disk (using O_DIRECT). Not much of a benchmark.
With all the patches applied (20 processes each reading 20000 times), I got:
1.03user 81.91system 0:28.46elapsed
Without patches:
1.09user 153.36system 0:34.61elapsed
If anyone wants, I can get vmstat or oprofile before/after results.
I ran some write-fsync tests (on file systems mounted across 20 drives),
but hitting can_queue without the starved changes causes variations in
performance numbers, and I'm getting IO hangs (with and without the above
patches, but more so without the patches). I haven't figured out what is
wrong. I'm also getting occasional hangs on boot (with or without patches,
NUMAQ + feral + isp 1020 + qla 2300). I have not had any problems on a
netfinity (more standard x86) box with an aic adapter.
Thanks.
-- Patrick Mansfield
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH] 1/7 starved changes - use a list_head for starved queue's
2003-03-25 1:53 [PATCH] 0/7 per scsi_device queue lock patches Patrick Mansfield
@ 2003-03-25 1:54 ` Patrick Mansfield
2003-03-25 2:02 ` [PATCH] 2/7 add missing scsi_queue_next_request calls Patrick Mansfield
2003-03-25 19:39 ` [PATCH] 1/7 starved changes - use a list_head for starved queue's Luben Tuikov
2003-03-27 16:14 ` [PATCH] 0/7 per scsi_device queue lock patches James Bottomley
1 sibling, 2 replies; 34+ messages in thread
From: Patrick Mansfield @ 2003-03-25 1:54 UTC (permalink / raw)
To: linux-scsi
Use a list_head per scsi_host to store the list of scsi request queues
that were "starved" (unable to send IO because of per-host limitations).
diff -purN -X /home/patman/dontdiff 25-bk-base/drivers/scsi/hosts.c starve-25/drivers/scsi/hosts.c
--- 25-bk-base/drivers/scsi/hosts.c Thu Mar 6 12:34:47 2003
+++ starve-25/drivers/scsi/hosts.c Mon Mar 24 12:14:28 2003
@@ -383,6 +383,7 @@ struct Scsi_Host * scsi_register(Scsi_Ho
scsi_assign_lock(shost, &shost->default_lock);
INIT_LIST_HEAD(&shost->my_devices);
INIT_LIST_HEAD(&shost->eh_cmd_q);
+ INIT_LIST_HEAD(&shost->starved_list);
init_waitqueue_head(&shost->host_wait);
shost->dma_channel = 0xff;
diff -purN -X /home/patman/dontdiff 25-bk-base/drivers/scsi/hosts.h starve-25/drivers/scsi/hosts.h
--- 25-bk-base/drivers/scsi/hosts.h Tue Mar 18 12:53:33 2003
+++ starve-25/drivers/scsi/hosts.h Mon Mar 24 12:14:28 2003
@@ -380,6 +380,7 @@ struct Scsi_Host
struct scsi_host_cmd_pool *cmd_pool;
spinlock_t free_list_lock;
struct list_head free_list; /* backup store of cmd structs */
+ struct list_head starved_list;
spinlock_t default_lock;
spinlock_t *host_lock;
@@ -471,12 +472,6 @@ struct Scsi_Host
unsigned reverse_ordering:1;
/*
- * Indicates that one or more devices on this host were starved, and
- * when the device becomes less busy that we need to feed them.
- */
- unsigned some_device_starved:1;
-
- /*
* Host has rejected a command because it was busy.
*/
unsigned int host_blocked;
diff -purN -X /home/patman/dontdiff 25-bk-base/drivers/scsi/scsi.h starve-25/drivers/scsi/scsi.h
--- 25-bk-base/drivers/scsi/scsi.h Mon Mar 24 11:57:13 2003
+++ starve-25/drivers/scsi/scsi.h Mon Mar 24 12:14:28 2003
@@ -555,6 +555,7 @@ struct scsi_device {
volatile unsigned short device_busy; /* commands actually active on low-level */
spinlock_t list_lock;
struct list_head cmd_list; /* queue of in use SCSI Command structures */
+ struct list_head starved_entry;
Scsi_Cmnd *current_cmnd; /* currently active command */
unsigned short queue_depth; /* How deep of a queue we want */
unsigned short last_queue_full_depth; /* These two are used by */
@@ -615,8 +616,6 @@ struct scsi_device {
* because we did a bus reset. */
unsigned ten:1; /* support ten byte read / write */
unsigned remap:1; /* support remapping */
- unsigned starved:1; /* unable to process commands because
- host busy */
// unsigned sync:1; /* Sync transfer state, managed by host */
// unsigned wide:1; /* WIDE transfer state, managed by host */
diff -purN -X /home/patman/dontdiff 25-bk-base/drivers/scsi/scsi_lib.c starve-25/drivers/scsi/scsi_lib.c
--- 25-bk-base/drivers/scsi/scsi_lib.c Tue Mar 18 12:53:33 2003
+++ starve-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:14:28 2003
@@ -356,7 +356,6 @@ static void scsi_queue_next_request(requ
struct scsi_device *sdev, *sdev2;
struct Scsi_Host *shost;
unsigned long flags;
- int all_clear;
ASSERT_LOCK(q->queue_lock, 0);
@@ -383,11 +382,6 @@ static void scsi_queue_next_request(requ
__elv_add_request(q, cmd->request, 0, 0);
}
- /*
- * Just hit the requeue function for the queue.
- */
- __blk_run_queue(q);
-
sdev = q->queuedata;
shost = sdev->host;
@@ -412,31 +406,24 @@ static void scsi_queue_next_request(requ
}
}
- /*
- * Now see whether there are other devices on the bus which
- * might be starved. If so, hit the request function. If we
- * don't find any, then it is safe to reset the flag. If we
- * find any device that it is starved, it isn't safe to reset the
- * flag as the queue function releases the lock and thus some
- * other device might have become starved along the way.
- */
- all_clear = 1;
- if (shost->some_device_starved) {
- list_for_each_entry(sdev, &shost->my_devices, siblings) {
- if (shost->can_queue > 0 &&
- shost->host_busy >= shost->can_queue)
- break;
- if (shost->host_blocked || shost->host_self_blocked)
- break;
- if (sdev->device_blocked || !sdev->starved)
- continue;
- __blk_run_queue(sdev->request_queue);
- all_clear = 0;
- }
-
- if (sdev == NULL && all_clear)
- shost->some_device_starved = 0;
+ while (!list_empty(&shost->starved_list) &&
+ !shost->host_blocked && !shost->host_self_blocked &&
+ !((shost->can_queue > 0) &&
+ (shost->host_busy >= shost->can_queue))) {
+ /*
+ * As long as shost is accepting commands and we have
+ * starved queues, call __blk_run_queue. scsi_request_fn
+ * drops the queue_lock and can add us back to the
+ * starved_list.
+ */
+ sdev2 = list_entry(shost->starved_list.next,
+ struct scsi_device, starved_entry);
+ list_del_init(&sdev2->starved_entry);
+ __blk_run_queue(sdev2->request_queue);
}
+
+ __blk_run_queue(q);
+
spin_unlock_irqrestore(q->queue_lock, flags);
}
@@ -1115,23 +1102,16 @@ static void scsi_request_fn(request_queu
*/
if (sdev->device_blocked)
break;
+
+ if (!list_empty(&sdev->starved_entry))
+ break;
+
if ((shost->can_queue > 0 && shost->host_busy >= shost->can_queue) ||
shost->host_blocked || shost->host_self_blocked) {
- /*
- * If we are unable to process any commands at all for
- * this device, then we consider it to be starved.
- * What this means is that there are no outstanding
- * commands for this device and hence we need a
- * little help getting it started again
- * once the host isn't quite so busy.
- */
- if (sdev->device_busy == 0) {
- sdev->starved = 1;
- shost->some_device_starved = 1;
- }
+ list_add_tail(&sdev->starved_entry,
+ &shost->starved_list);
break;
- } else
- sdev->starved = 0;
+ }
/*
* If we couldn't find a request that could be queued, then we
diff -purN -X /home/patman/dontdiff 25-bk-base/drivers/scsi/scsi_scan.c starve-25/drivers/scsi/scsi_scan.c
--- 25-bk-base/drivers/scsi/scsi_scan.c Mon Mar 24 11:57:13 2003
+++ starve-25/drivers/scsi/scsi_scan.c Mon Mar 24 12:14:28 2003
@@ -400,6 +400,7 @@ static struct scsi_device *scsi_alloc_sd
INIT_LIST_HEAD(&sdev->siblings);
INIT_LIST_HEAD(&sdev->same_target_siblings);
INIT_LIST_HEAD(&sdev->cmd_list);
+ INIT_LIST_HEAD(&sdev->starved_entry);
spin_lock_init(&sdev->list_lock);
/*
* [PATCH] 2/7 add missing scsi_queue_next_request calls
2003-03-25 1:54 ` [PATCH] 1/7 starved changes - use a list_head for starved queue's Patrick Mansfield
@ 2003-03-25 2:02 ` Patrick Mansfield
2003-03-25 2:02 ` [PATCH] 3/7 consolidate single_lun code Patrick Mansfield
2003-03-25 19:41 ` [PATCH] 2/7 add missing scsi_queue_next_request calls Luben Tuikov
2003-03-25 19:39 ` [PATCH] 1/7 starved changes - use a list_head for starved queue's Luben Tuikov
1 sibling, 2 replies; 34+ messages in thread
From: Patrick Mansfield @ 2003-03-25 2:02 UTC (permalink / raw)
To: linux-scsi
Add missing scsi_queue_next_request calls.
Add missing scsi_put_command and scsi_get_command exports.
diff -purN -X /home/patman/dontdiff starve-25/drivers/scsi/scsi.h put_cmd-25/drivers/scsi/scsi.h
--- starve-25/drivers/scsi/scsi.h Mon Mar 24 12:14:28 2003
+++ put_cmd-25/drivers/scsi/scsi.h Mon Mar 24 12:14:51 2003
@@ -417,6 +417,7 @@ extern void scsi_setup_cmd_retry(Scsi_Cm
extern void scsi_io_completion(Scsi_Cmnd * SCpnt, int good_sectors,
int block_sectors);
extern int scsi_queue_insert(struct scsi_cmnd *cmd, int reason);
+extern void scsi_queue_next_request(request_queue_t *q, struct scsi_cmnd *cmd);
extern request_queue_t *scsi_alloc_queue(struct Scsi_Host *shost);
extern void scsi_free_queue(request_queue_t *q);
extern int scsi_init_queue(void);
diff -purN -X /home/patman/dontdiff starve-25/drivers/scsi/scsi_error.c put_cmd-25/drivers/scsi/scsi_error.c
--- starve-25/drivers/scsi/scsi_error.c Tue Mar 18 12:53:33 2003
+++ put_cmd-25/drivers/scsi/scsi_error.c Mon Mar 24 12:14:51 2003
@@ -1681,6 +1681,7 @@ scsi_reset_provider(struct scsi_device *
struct scsi_cmnd *scmd = scsi_get_command(dev, GFP_KERNEL);
struct request req;
int rtn;
+ struct request_queue *q;
scmd->request = &req;
memset(&scmd->eh_timeout, 0, sizeof(scmd->eh_timeout));
@@ -1735,6 +1736,8 @@ scsi_reset_provider(struct scsi_device *
}
scsi_delete_timer(scmd);
+ q = scmd->device->request_queue;
scsi_put_command(scmd);
+ scsi_queue_next_request(q, NULL);
return rtn;
}
diff -purN -X /home/patman/dontdiff starve-25/drivers/scsi/scsi_lib.c put_cmd-25/drivers/scsi/scsi_lib.c
--- starve-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:14:28 2003
+++ put_cmd-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:14:51 2003
@@ -174,14 +174,18 @@ void scsi_do_req(struct scsi_request *sr
void (*done)(struct scsi_cmnd *),
int timeout, int retries)
{
+ struct request_queue *q;
+
/*
* If the upper level driver is reusing these things, then
* we should release the low-level block now. Another one will
* be allocated later when this request is getting queued.
*/
if (sreq->sr_command) {
+ q = sreq->sr_command->device->request_queue;
scsi_put_command(sreq->sr_command);
sreq->sr_command = NULL;
+ scsi_queue_next_request(q, NULL);
}
/*
@@ -228,6 +232,7 @@ static void scsi_wait_done(struct scsi_c
void scsi_wait_req(struct scsi_request *sreq, const void *cmnd, void *buffer,
unsigned bufflen, int timeout, int retries)
{
+ struct request_queue *q;
DECLARE_COMPLETION(wait);
sreq->sr_request->waiting = &wait;
@@ -239,7 +244,9 @@ void scsi_wait_req(struct scsi_request *
sreq->sr_request->waiting = NULL;
if (sreq->sr_command) {
+ q = sreq->sr_command->device->request_queue;
scsi_put_command(sreq->sr_command);
+ scsi_queue_next_request(q, NULL);
sreq->sr_command = NULL;
}
}
@@ -351,7 +358,7 @@ void scsi_setup_cmd_retry(struct scsi_cm
* permutations grows as 2**N, and if too many more special cases
* get added, we start to get screwed.
*/
-static void scsi_queue_next_request(request_queue_t *q, struct scsi_cmnd *cmd)
+void scsi_queue_next_request(request_queue_t *q, struct scsi_cmnd *cmd)
{
struct scsi_device *sdev, *sdev2;
struct Scsi_Host *shost;
diff -purN -X /home/patman/dontdiff starve-25/drivers/scsi/scsi_syms.c put_cmd-25/drivers/scsi/scsi_syms.c
--- starve-25/drivers/scsi/scsi_syms.c Thu Mar 6 12:34:52 2003
+++ put_cmd-25/drivers/scsi/scsi_syms.c Mon Mar 24 12:14:51 2003
@@ -60,6 +60,8 @@ EXPORT_SYMBOL(scsi_allocate_request);
EXPORT_SYMBOL(scsi_release_request);
EXPORT_SYMBOL(scsi_wait_req);
EXPORT_SYMBOL(scsi_do_req);
+EXPORT_SYMBOL(scsi_get_command);
+EXPORT_SYMBOL(scsi_put_command);
EXPORT_SYMBOL(scsi_report_bus_reset);
EXPORT_SYMBOL(scsi_block_requests);
* [PATCH] 3/7 consolidate single_lun code
2003-03-25 2:02 ` [PATCH] 2/7 add missing scsi_queue_next_request calls Patrick Mansfield
@ 2003-03-25 2:02 ` Patrick Mansfield
2003-03-25 2:03 ` [PATCH] 4/7 cleanup/consolidate code in scsi_request_fn Patrick Mansfield
` (2 more replies)
2003-03-25 19:41 ` [PATCH] 2/7 add missing scsi_queue_next_request calls Luben Tuikov
1 sibling, 3 replies; 34+ messages in thread
From: Patrick Mansfield @ 2003-03-25 2:02 UTC (permalink / raw)
To: linux-scsi
Consolidate scsi single_lun code.
diff -purN -X /home/patman/dontdiff put_cmd-25/drivers/scsi/scsi_lib.c lun-single-25/drivers/scsi/scsi_lib.c
--- put_cmd-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:14:51 2003
+++ lun-single-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:14:55 2003
@@ -323,6 +323,49 @@ void scsi_setup_cmd_retry(struct scsi_cm
}
/*
+ * Called for single_lun devices. The target associated with current_sdev can
+ * only handle one active command at a time. Scan through all of the luns
+ * on the same target as current_sdev, return 1 if any are active.
+ */
+static int scsi_single_lun_check(struct scsi_device *current_sdev)
+{
+ struct scsi_device *sdev;
+
+ list_for_each_entry(sdev, &current_sdev->same_target_siblings,
+ same_target_siblings)
+ if (sdev->device_busy)
+ return 1;
+ return 0;
+}
+
+/*
+ * Called for single_lun devices on IO completion. If no requests
+ * outstanding for current_sdev, call __blk_run_queue for the next
+ * scsi_device on the same target that has requests.
+ *
+ * Called with queue_lock held.
+ */
+static void scsi_single_lun_run(struct scsi_device *current_sdev,
+ struct request_queue *q)
+{
+ struct scsi_device *sdev;
+ struct Scsi_Host *shost;
+
+ shost = current_sdev->host;
+ if (blk_queue_empty(q) && current_sdev->device_busy == 0 &&
+ !shost->host_blocked && !shost->host_self_blocked &&
+ !((shost->can_queue > 0) &&
+ (shost->host_busy >= shost->can_queue)))
+ list_for_each_entry(sdev, &current_sdev->same_target_siblings,
+ same_target_siblings)
+ if (!sdev->device_blocked &&
+ !blk_queue_empty(sdev->request_queue)) {
+ __blk_run_queue(sdev->request_queue);
+ break;
+ }
+}
+
+/*
* Function: scsi_queue_next_request()
*
* Purpose: Handle post-processing of completed commands.
@@ -390,29 +433,11 @@ void scsi_queue_next_request(request_que
}
sdev = q->queuedata;
- shost = sdev->host;
- /*
- * If this is a single-lun device, and we are currently finished
- * with this device, then see if we need to get another device
- * started. FIXME(eric) - if this function gets too cluttered
- * with special case code, then spin off separate versions and
- * use function pointers to pick the right one.
- */
- if (sdev->single_lun && blk_queue_empty(q) && sdev->device_busy ==0 &&
- !shost->host_blocked && !shost->host_self_blocked &&
- !((shost->can_queue > 0) && (shost->host_busy >=
- shost->can_queue))) {
- list_for_each_entry(sdev2, &sdev->same_target_siblings,
- same_target_siblings) {
- if (!sdev2->device_blocked &&
- !blk_queue_empty(sdev2->request_queue)) {
- __blk_run_queue(sdev2->request_queue);
- break;
- }
- }
- }
+ if (sdev->single_lun)
+ scsi_single_lun_run(sdev, q);
+ shost = sdev->host;
while (!list_empty(&shost->starved_list) &&
!shost->host_blocked && !shost->host_self_blocked &&
!((shost->can_queue > 0) &&
@@ -917,22 +942,6 @@ static int scsi_init_io(struct scsi_cmnd
return BLKPREP_KILL;
}
-/*
- * The target associated with myself can only handle one active command at
- * a time. Scan through all of the luns on the same target as myself,
- * return 1 if any are active.
- */
-static int check_all_luns(struct scsi_device *myself)
-{
- struct scsi_device *sdev;
-
- list_for_each_entry(sdev, &myself->same_target_siblings,
- same_target_siblings)
- if (sdev->device_busy)
- return 1;
- return 0;
-}
-
static int scsi_prep_fn(struct request_queue *q, struct request *req)
{
struct Scsi_Device_Template *sdt;
@@ -1077,9 +1086,6 @@ static void scsi_request_fn(request_queu
if (sdev->device_busy >= sdev->queue_depth)
break;
- if (sdev->single_lun && check_all_luns(sdev))
- break;
-
if (shost->host_busy == 0 && shost->host_blocked) {
/* unblock after host_blocked iterates to zero */
if (--shost->host_blocked == 0) {
@@ -1120,6 +1126,9 @@ static void scsi_request_fn(request_queu
break;
}
+ if (sdev->single_lun && scsi_single_lun_check(sdev))
+ break;
+
/*
* If we couldn't find a request that could be queued, then we
* can also quit.
* [PATCH] 4/7 cleanup/consolidate code in scsi_request_fn
2003-03-25 2:02 ` [PATCH] 3/7 consolidate single_lun code Patrick Mansfield
@ 2003-03-25 2:03 ` Patrick Mansfield
2003-03-25 2:03 ` [PATCH] 5/7 alloc a request_queue on each scsi_alloc_sdev call Patrick Mansfield
2003-03-25 21:32 ` [PATCH] 4/7 cleanup/consolidate code in scsi_request_fn Luben Tuikov
2003-03-25 20:36 ` [PATCH] 3/7 consolidate single_lun code Luben Tuikov
2003-03-25 20:50 ` Luben Tuikov
2 siblings, 2 replies; 34+ messages in thread
From: Patrick Mansfield @ 2003-03-25 2:03 UTC (permalink / raw)
To: linux-scsi
Cleanup and consolidate scsi_device and scsi_host checks in
scsi_request_fn.
diff -purN -X /home/patman/dontdiff lun-single-25/drivers/scsi/scsi_lib.c req_fn-cleanup-25/drivers/scsi/scsi_lib.c
--- lun-single-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:14:55 2003
+++ req_fn-cleanup-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:15:01 2003
@@ -1043,6 +1043,79 @@ static int scsi_prep_fn(struct request_q
}
/*
+ * scsi_check_sdev: if we can send requests to sdev, return 0 else return 1.
+ *
+ * Called with the queue_lock held.
+ */
+static inline int scsi_check_sdev(struct request_queue *q,
+ struct scsi_device *sdev)
+{
+ if (sdev->device_busy >= sdev->queue_depth)
+ return 1;
+ if (sdev->device_busy == 0 && sdev->device_blocked) {
+ /*
+ * unblock after device_blocked iterates to zero
+ */
+ if (--sdev->device_blocked == 0) {
+ SCSI_LOG_MLQUEUE(3,
+ printk("scsi%d (%d:%d) unblocking device at"
+ " zero depth\n", sdev->host->host_no,
+ sdev->id, sdev->lun));
+ } else {
+ blk_plug_device(q);
+ return 1;
+ }
+ }
+ if (sdev->device_blocked)
+ return 1;
+
+ return 0;
+}
+
+/*
+ * scsi_check_shost: if we can send requests to shost, return 0 else return 1.
+ *
+ * Called with the queue_lock held.
+ */
+static inline int scsi_check_shost(struct request_queue *q,
+ struct Scsi_Host *shost,
+ struct scsi_device *sdev)
+{
+ if (shost->in_recovery)
+ return 1;
+ if (shost->host_busy == 0 && shost->host_blocked) {
+ /*
+ * unblock after host_blocked iterates to zero
+ */
+ if (--shost->host_blocked == 0) {
+ SCSI_LOG_MLQUEUE(3,
+ printk("scsi%d unblocking host at zero depth\n",
+ shost->host_no));
+ } else {
+ blk_plug_device(q);
+ return 1;
+ }
+ }
+ if (!list_empty(&sdev->starved_entry))
+ return 1;
+ if ((shost->can_queue > 0 && shost->host_busy >= shost->can_queue) ||
+ shost->host_blocked || shost->host_self_blocked) {
+ SCSI_LOG_MLQUEUE(3,
+ printk("add starved dev <%d,%d,%d,%d>; host "
+ "limit %d, busy %d, blocked %d selfblocked %d\n",
+ sdev->host->host_no, sdev->channel,
+ sdev->id, sdev->lun,
+ shost->can_queue, shost->host_busy,
+ shost->host_blocked, shost->host_self_blocked));
+ list_add_tail(&sdev->starved_entry,
+ &shost->starved_list);
+ return 1;
+ }
+
+ return 0;
+}
+
+/*
* Function: scsi_request_fn()
*
* Purpose: Main strategy routine for SCSI.
@@ -1067,13 +1140,8 @@ static void scsi_request_fn(request_queu
* the host is no longer able to accept any more requests.
*/
for (;;) {
- /*
- * Check this again - each time we loop through we will have
- * released the lock and grabbed it again, so each time
- * we need to check to see if the queue is plugged or not.
- */
- if (shost->in_recovery || blk_queue_plugged(q))
- return;
+ if (blk_queue_plugged(q))
+ break;
/*
* get next queueable request. We do this early to make sure
@@ -1083,48 +1151,11 @@ static void scsi_request_fn(request_queu
*/
req = elv_next_request(q);
- if (sdev->device_busy >= sdev->queue_depth)
- break;
-
- if (shost->host_busy == 0 && shost->host_blocked) {
- /* unblock after host_blocked iterates to zero */
- if (--shost->host_blocked == 0) {
- SCSI_LOG_MLQUEUE(3,
- printk("scsi%d unblocking host at zero depth\n",
- shost->host_no));
- } else {
- blk_plug_device(q);
- break;
- }
- }
-
- if (sdev->device_busy == 0 && sdev->device_blocked) {
- /* unblock after device_blocked iterates to zero */
- if (--sdev->device_blocked == 0) {
- SCSI_LOG_MLQUEUE(3,
- printk("scsi%d (%d:%d) unblocking device at zero depth\n",
- shost->host_no, sdev->id, sdev->lun));
- } else {
- blk_plug_device(q);
- break;
- }
- }
-
- /*
- * If the device cannot accept another request, then quit.
- */
- if (sdev->device_blocked)
- break;
-
- if (!list_empty(&sdev->starved_entry))
+ if (scsi_check_sdev(q, sdev))
break;
- if ((shost->can_queue > 0 && shost->host_busy >= shost->can_queue) ||
- shost->host_blocked || shost->host_self_blocked) {
- list_add_tail(&sdev->starved_entry,
- &shost->starved_list);
+ if (scsi_check_shost(q, shost, sdev))
break;
- }
if (sdev->single_lun && scsi_single_lun_check(sdev))
break;
@@ -1140,7 +1171,7 @@ static void scsi_request_fn(request_queu
/* If the device is busy, a returning I/O
* will restart the queue. Otherwise, we have
* to plug the queue */
- if(sdev->device_busy == 0)
+ if (sdev->device_busy == 0)
blk_plug_device(q);
break;
}
* [PATCH] 5/7 alloc a request_queue on each scsi_alloc_sdev call
2003-03-25 2:03 ` [PATCH] 4/7 cleanup/consolidate code in scsi_request_fn Patrick Mansfield
@ 2003-03-25 2:03 ` Patrick Mansfield
2003-03-25 2:03 ` [PATCH] 6/7 add and use a per-scsi_device queue_lock Patrick Mansfield
2003-03-25 7:12 ` [PATCH] 5/7 alloc a request_queue on each scsi_alloc_sdev call Christoph Hellwig
2003-03-25 21:32 ` [PATCH] 4/7 cleanup/consolidate code in scsi_request_fn Luben Tuikov
1 sibling, 2 replies; 34+ messages in thread
From: Patrick Mansfield @ 2003-03-25 2:03 UTC (permalink / raw)
To: linux-scsi
Call scsi_alloc_queue each time we call scsi_alloc_sdev; call
scsi_free_queue each time we do not find a backing LUN.
This code is less optimal, but it is cleaner, and the lock split-up
patch needs this change.
diff -purN -X /home/patman/dontdiff req_fn-cleanup-25/drivers/scsi/scsi_scan.c salloc_queue-25/drivers/scsi/scsi_scan.c
--- req_fn-cleanup-25/drivers/scsi/scsi_scan.c Mon Mar 24 12:14:28 2003
+++ salloc_queue-25/drivers/scsi/scsi_scan.c Mon Mar 24 12:15:05 2003
@@ -380,7 +380,7 @@ static void print_inquiry(unsigned char
* Scsi_Device pointer, or NULL on failure.
**/
static struct scsi_device *scsi_alloc_sdev(struct Scsi_Host *shost,
- struct request_queue **q, uint channel, uint id, uint lun)
+ uint channel, uint id, uint lun)
{
struct scsi_device *sdev, *device;
@@ -415,14 +415,9 @@ static struct scsi_device *scsi_alloc_sd
*/
sdev->borken = 1;
- if (!q || *q == NULL) {
- sdev->request_queue = scsi_alloc_queue(shost);
- if (!sdev->request_queue)
- goto out_free_dev;
- } else {
- sdev->request_queue = *q;
- *q = NULL;
- }
+ sdev->request_queue = scsi_alloc_queue(shost);
+ if (!sdev->request_queue)
+ goto out_free_dev;
sdev->request_queue->queuedata = sdev;
scsi_adjust_queue_depth(sdev, 0, sdev->host->cmd_per_lun);
@@ -462,10 +457,7 @@ static struct scsi_device *scsi_alloc_sd
return sdev;
out_free_queue:
- if (q && sdev->request_queue) {
- *q = sdev->request_queue;
- sdev->request_queue = NULL;
- } else if (sdev->request_queue)
+ if (sdev->request_queue)
scsi_free_queue(sdev->request_queue);
out_free_dev:
@@ -1282,15 +1274,15 @@ static int scsi_add_lun(Scsi_Device *sde
* SCSI_SCAN_LUN_PRESENT: a new Scsi_Device was allocated and initialized
**/
static int scsi_probe_and_add_lun(struct Scsi_Host *host,
- struct request_queue **q, uint channel, uint id, uint lun,
- int *bflagsp, struct scsi_device **sdevp)
+ uint channel, uint id, uint lun, int *bflagsp,
+ struct scsi_device **sdevp)
{
struct scsi_device *sdev;
struct scsi_request *sreq;
unsigned char *result;
int bflags, res = SCSI_SCAN_NO_RESPONSE;
- sdev = scsi_alloc_sdev(host, q, channel, id, lun);
+ sdev = scsi_alloc_sdev(host, channel, id, lun);
if (!sdev)
goto out;
sreq = scsi_allocate_request(sdev);
@@ -1344,13 +1336,8 @@ static int scsi_probe_and_add_lun(struct
if (res == SCSI_SCAN_LUN_PRESENT) {
if (sdevp)
*sdevp = sdev;
- } else {
- if (q) {
- *q = sdev->request_queue;
- sdev->request_queue = NULL;
- }
+ } else
scsi_free_sdev(sdev);
- }
out:
return res;
}
@@ -1368,9 +1355,8 @@ static int scsi_probe_and_add_lun(struct
*
* Modifies sdevscan->lun.
**/
-static void scsi_sequential_lun_scan(struct Scsi_Host *shost,
- struct request_queue **q, uint channel, uint id,
- int bflags, int lun0_res, int scsi_level)
+static void scsi_sequential_lun_scan(struct Scsi_Host *shost, uint channel,
+ uint id, int bflags, int lun0_res, int scsi_level)
{
unsigned int sparse_lun, lun, max_dev_lun;
@@ -1438,7 +1424,7 @@ static void scsi_sequential_lun_scan(str
* sparse_lun.
*/
for (lun = 1; lun < max_dev_lun; ++lun)
- if ((scsi_probe_and_add_lun(shost, q, channel, id, lun,
+ if ((scsi_probe_and_add_lun(shost, channel, id, lun,
NULL, NULL) != SCSI_SCAN_LUN_PRESENT) && !sparse_lun)
return;
}
@@ -1491,8 +1477,7 @@ static int scsilun_to_int(ScsiLun *scsil
* 0: scan completed (or no memory, so further scanning is futile)
* 1: no report lun scan, or not configured
**/
-static int scsi_report_lun_scan(Scsi_Device *sdev, struct request_queue **q,
- int bflags)
+static int scsi_report_lun_scan(Scsi_Device *sdev, int bflags)
{
#ifdef CONFIG_SCSI_REPORT_LUNS
@@ -1653,8 +1638,8 @@ static int scsi_report_lun_scan(Scsi_Dev
} else {
int res;
- res = scsi_probe_and_add_lun(sdev->host, q,
- sdev->channel, sdev->id, lun, NULL, NULL);
+ res = scsi_probe_and_add_lun(sdev->host, sdev->channel,
+ sdev->id, lun, NULL, NULL);
if (res == SCSI_SCAN_NO_RESPONSE) {
/*
* Got some results, but now none, abort.
@@ -1682,8 +1667,7 @@ struct scsi_device *scsi_add_device(stru
struct scsi_device *sdev;
int error = -ENODEV, res;
- res = scsi_probe_and_add_lun(shost, NULL, channel, id, lun,
- NULL, &sdev);
+ res = scsi_probe_and_add_lun(shost, channel, id, lun, NULL, &sdev);
if (res == SCSI_SCAN_LUN_PRESENT)
error = scsi_attach_device(sdev);
@@ -1724,8 +1708,8 @@ int scsi_remove_device(struct scsi_devic
* First try a REPORT LUN scan, if that does not scan the target, do a
* sequential scan of LUNs on the target id.
**/
-static void scsi_scan_target(struct Scsi_Host *shost, struct request_queue **q,
- unsigned int channel, unsigned int id)
+static void scsi_scan_target(struct Scsi_Host *shost, unsigned int channel,
+ unsigned int id)
{
int bflags = 0;
int res;
@@ -1741,14 +1725,14 @@ static void scsi_scan_target(struct Scsi
* Scan LUN 0, if there is some response, scan further. Ideally, we
* would not configure LUN 0 until all LUNs are scanned.
*/
- res = scsi_probe_and_add_lun(shost, q, channel, id, 0, &bflags, &sdev);
+ res = scsi_probe_and_add_lun(shost, channel, id, 0, &bflags, &sdev);
if (res == SCSI_SCAN_LUN_PRESENT) {
- if (scsi_report_lun_scan(sdev, q, bflags) != 0)
+ if (scsi_report_lun_scan(sdev, bflags) != 0)
/*
* The REPORT LUN did not scan the target,
* do a sequential scan.
*/
- scsi_sequential_lun_scan(shost, q, channel, id, bflags,
+ scsi_sequential_lun_scan(shost, channel, id, bflags,
res, sdev->scsi_level);
} else if (res == SCSI_SCAN_TARGET_PRESENT) {
/*
@@ -1757,7 +1741,7 @@ static void scsi_scan_target(struct Scsi
* sequential lun scan with a bflags of SPARSELUN and
* a default scsi level of SCSI_2
*/
- scsi_sequential_lun_scan(shost, q, channel, id, BLIST_SPARSELUN,
+ scsi_sequential_lun_scan(shost, channel, id, BLIST_SPARSELUN,
SCSI_SCAN_TARGET_PRESENT, SCSI_2);
}
}
@@ -1772,7 +1756,6 @@ static void scsi_scan_target(struct Scsi
**/
void scsi_scan_host(struct Scsi_Host *shost)
{
- struct request_queue *q = NULL;
uint channel, id, order_id;
/*
@@ -1797,12 +1780,9 @@ void scsi_scan_host(struct Scsi_Host *sh
order_id = shost->max_id - id - 1;
else
order_id = id;
- scsi_scan_target(shost, &q, channel, order_id);
+ scsi_scan_target(shost, channel, order_id);
}
}
-
- if (q)
- scsi_free_queue(q);
}
void scsi_forget_host(struct Scsi_Host *shost)
@@ -1841,7 +1821,7 @@ struct scsi_device *scsi_get_host_dev(st
{
struct scsi_device *sdev;
- sdev = scsi_alloc_sdev(shost, NULL, 0, shost->this_id, 0);
+ sdev = scsi_alloc_sdev(shost, 0, shost->this_id, 0);
if (sdev) {
sdev->borken = 0;
}
* [PATCH] 6/7 add and use a per-scsi_device queue_lock
2003-03-25 2:03 ` [PATCH] 5/7 alloc a request_queue on each scsi_alloc_sdev call Patrick Mansfield
@ 2003-03-25 2:03 ` Patrick Mansfield
2003-03-25 2:04 ` [PATCH] 7/7 fix single_lun code for " Patrick Mansfield
` (2 more replies)
2003-03-25 7:12 ` [PATCH] 5/7 alloc a request_queue on each scsi_alloc_sdev call Christoph Hellwig
1 sibling, 3 replies; 34+ messages in thread
From: Patrick Mansfield @ 2003-03-25 2:03 UTC (permalink / raw)
To: linux-scsi
Add and use a per-scsi_device queue_lock.
diff -purN -X /home/patman/dontdiff salloc_queue-25/drivers/scsi/hosts.c lksplit-25/drivers/scsi/hosts.c
--- salloc_queue-25/drivers/scsi/hosts.c Mon Mar 24 12:14:28 2003
+++ lksplit-25/drivers/scsi/hosts.c Mon Mar 24 12:15:10 2003
@@ -620,7 +620,6 @@ void scsi_host_busy_dec_and_test(struct
spin_lock_irqsave(shost->host_lock, flags);
shost->host_busy--;
- sdev->device_busy--;
if (shost->in_recovery && shost->host_failed &&
(shost->host_busy == shost->host_failed))
{
diff -purN -X /home/patman/dontdiff salloc_queue-25/drivers/scsi/scsi.c lksplit-25/drivers/scsi/scsi.c
--- salloc_queue-25/drivers/scsi/scsi.c Mon Mar 24 11:57:13 2003
+++ lksplit-25/drivers/scsi/scsi.c Mon Mar 24 12:15:10 2003
@@ -447,8 +447,6 @@ int scsi_dispatch_cmd(Scsi_Cmnd * SCpnt)
host = SCpnt->device->host;
- ASSERT_LOCK(host->host_lock, 0);
-
/* Assign a unique nonzero serial_number. */
if (++serial_number == 0)
serial_number = 1;
@@ -574,8 +572,6 @@ void scsi_init_cmd_from_req(Scsi_Cmnd *
{
struct Scsi_Host *host = SCpnt->device->host;
- ASSERT_LOCK(host->host_lock, 0);
-
SCpnt->owner = SCSI_OWNER_MIDLEVEL;
SRpnt->sr_command = SCpnt;
@@ -819,12 +815,11 @@ void scsi_finish_command(Scsi_Cmnd * SCp
struct Scsi_Host *host;
Scsi_Device *device;
Scsi_Request * SRpnt;
+ unsigned long flags;
host = SCpnt->device->host;
device = SCpnt->device;
- ASSERT_LOCK(host->host_lock, 0);
-
/*
* We need to protect the decrement, as otherwise a race condition
* would exist. Fiddling with SCpnt isn't a problem as the
@@ -833,6 +828,9 @@ void scsi_finish_command(Scsi_Cmnd * SCp
* shared.
*/
scsi_host_busy_dec_and_test(host, device);
+ spin_lock_irqsave(SCpnt->device->request_queue->queue_lock, flags);
+ SCpnt->device->device_busy--;
+ spin_unlock_irqrestore(SCpnt->device->request_queue->queue_lock, flags);
/*
* Clear the flags which say that the device/host is no longer
diff -purN -X /home/patman/dontdiff salloc_queue-25/drivers/scsi/scsi.h lksplit-25/drivers/scsi/scsi.h
--- salloc_queue-25/drivers/scsi/scsi.h Mon Mar 24 12:14:51 2003
+++ lksplit-25/drivers/scsi/scsi.h Mon Mar 24 12:15:10 2003
@@ -418,7 +418,7 @@ extern void scsi_io_completion(Scsi_Cmnd
int block_sectors);
extern int scsi_queue_insert(struct scsi_cmnd *cmd, int reason);
extern void scsi_queue_next_request(request_queue_t *q, struct scsi_cmnd *cmd);
-extern request_queue_t *scsi_alloc_queue(struct Scsi_Host *shost);
+extern request_queue_t *scsi_alloc_queue(struct scsi_device *sdev);
extern void scsi_free_queue(request_queue_t *q);
extern int scsi_init_queue(void);
extern void scsi_exit_queue(void);
@@ -554,6 +554,7 @@ struct scsi_device {
struct Scsi_Host *host;
request_queue_t *request_queue;
volatile unsigned short device_busy; /* commands actually active on low-level */
+ spinlock_t sdev_lock; /* also the request queue_lock */
spinlock_t list_lock;
struct list_head cmd_list; /* queue of in use SCSI Command structures */
struct list_head starved_entry;
diff -purN -X /home/patman/dontdiff salloc_queue-25/drivers/scsi/scsi_error.c lksplit-25/drivers/scsi/scsi_error.c
--- salloc_queue-25/drivers/scsi/scsi_error.c Mon Mar 24 12:14:51 2003
+++ lksplit-25/drivers/scsi/scsi_error.c Mon Mar 24 12:15:10 2003
@@ -431,8 +431,6 @@ static int scsi_send_eh_cmnd(struct scsi
unsigned long flags;
int rtn = SUCCESS;
- ASSERT_LOCK(host->host_lock, 0);
-
/*
* we will use a queued command if possible, otherwise we will
* emulate the queuing and calling of completion function ourselves.
@@ -1405,8 +1403,6 @@ static void scsi_restart_operations(stru
struct scsi_device *sdev;
unsigned long flags;
- ASSERT_LOCK(shost->host_lock, 0);
-
/*
* If the door was locked, we need to insert a door lock request
* onto the head of the SCSI request queue for the device. There
@@ -1434,18 +1430,11 @@ static void scsi_restart_operations(stru
* now that error recovery is done, we will need to ensure that these
* requests are started.
*/
- spin_lock_irqsave(shost->host_lock, flags);
list_for_each_entry(sdev, &shost->my_devices, siblings) {
- if ((shost->can_queue > 0 &&
- (shost->host_busy >= shost->can_queue))
- || (shost->host_blocked)
- || (shost->host_self_blocked)) {
- break;
- }
-
+ spin_lock_irqsave(sdev->request_queue->queue_lock, flags);
__blk_run_queue(sdev->request_queue);
+ spin_unlock_irqrestore(sdev->request_queue->queue_lock, flags);
}
- spin_unlock_irqrestore(shost->host_lock, flags);
}
/**
diff -purN -X /home/patman/dontdiff salloc_queue-25/drivers/scsi/scsi_lib.c lksplit-25/drivers/scsi/scsi_lib.c
--- salloc_queue-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:15:01 2003
+++ lksplit-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:15:10 2003
@@ -92,6 +92,7 @@ int scsi_queue_insert(struct scsi_cmnd *
{
struct Scsi_Host *host = cmd->device->host;
struct scsi_device *device = cmd->device;
+ unsigned long flags;
SCSI_LOG_MLQUEUE(1,
printk("Inserting command %p into mlqueue\n", cmd));
@@ -130,6 +131,9 @@ int scsi_queue_insert(struct scsi_cmnd *
* Decrement the counters, since these commands are no longer
* active on the host/device.
*/
+ spin_lock_irqsave(device->request_queue->queue_lock, flags);
+ device->device_busy--;
+ spin_unlock_irqrestore(device->request_queue->queue_lock, flags);
scsi_host_busy_dec_and_test(host, device);
/*
@@ -343,7 +347,7 @@ static int scsi_single_lun_check(struct
* outstanding for current_sdev, call __blk_run_queue for the next
* scsi_device on the same target that has requests.
*
- * Called with queue_lock held.
+ * Called with *no* scsi locks held.
*/
static void scsi_single_lun_run(struct scsi_device *current_sdev,
struct request_queue *q)
@@ -407,9 +411,6 @@ void scsi_queue_next_request(request_que
struct Scsi_Host *shost;
unsigned long flags;
- ASSERT_LOCK(q->queue_lock, 0);
-
- spin_lock_irqsave(q->queue_lock, flags);
if (cmd != NULL) {
/*
@@ -418,6 +419,7 @@ void scsi_queue_next_request(request_que
* in which case we need to request the blocks that come after
* the bad sector.
*/
+ spin_lock_irqsave(q->queue_lock, flags);
cmd->request->special = cmd;
if (blk_rq_tagged(cmd->request))
blk_queue_end_tag(q, cmd->request);
@@ -430,6 +432,7 @@ void scsi_queue_next_request(request_que
cmd->request->flags |= REQ_SPECIAL;
cmd->request->flags &= ~REQ_DONTPREP;
__elv_add_request(q, cmd->request, 0, 0);
+ spin_unlock_irqrestore(q->queue_lock, flags);
}
sdev = q->queuedata;
@@ -438,6 +441,7 @@ void scsi_queue_next_request(request_que
scsi_single_lun_run(sdev, q);
shost = sdev->host;
+ spin_lock_irqsave(shost->host_lock, flags);
while (!list_empty(&shost->starved_list) &&
!shost->host_blocked && !shost->host_self_blocked &&
!((shost->can_queue > 0) &&
@@ -447,15 +451,26 @@ void scsi_queue_next_request(request_que
* starved queues, call __blk_run_queue. scsi_request_fn
* drops the queue_lock and can add us back to the
* starved_list.
+ *
+ * host_lock protects the starved_list and starved_entry.
+ * scsi_request_fn must get the host_lock before checking
+ * or modifying starved_list or starved_entry.
*/
sdev2 = list_entry(shost->starved_list.next,
struct scsi_device, starved_entry);
list_del_init(&sdev2->starved_entry);
+ spin_unlock_irqrestore(shost->host_lock, flags);
+
+ spin_lock_irqsave(sdev2->request_queue->queue_lock, flags);
__blk_run_queue(sdev2->request_queue);
+ spin_unlock_irqrestore(sdev2->request_queue->queue_lock, flags);
+
+ spin_lock_irqsave(shost->host_lock, flags);
}
+ spin_unlock_irqrestore(shost->host_lock, flags);
+ spin_lock_irqsave(q->queue_lock, flags);
__blk_run_queue(q);
-
spin_unlock_irqrestore(q->queue_lock, flags);
}
@@ -489,8 +504,6 @@ static struct scsi_cmnd *scsi_end_reques
struct request *req = cmd->request;
unsigned long flags;
- ASSERT_LOCK(q->queue_lock, 0);
-
/*
* If there are blocks left over at the end, set up the command
* to queue the remainder of them.
@@ -588,8 +601,6 @@ static void scsi_release_buffers(struct
{
struct request *req = cmd->request;
- ASSERT_LOCK(cmd->device->host->host_lock, 0);
-
/*
* Free up any indirection buffers we allocated for DMA purposes.
*/
@@ -670,8 +681,6 @@ void scsi_io_completion(struct scsi_cmnd
* would be used if we just wanted to retry, for example.
*
*/
- ASSERT_LOCK(q->queue_lock, 0);
-
/*
* Free up any indirection buffers we allocated for DMA purposes.
* For the case of a READ, we need to copy the data out of the
@@ -1075,7 +1084,7 @@ static inline int scsi_check_sdev(struct
/*
* scsi_check_shost: if we can send requests to shost, return 0 else return 1.
*
- * Called with the queue_lock held.
+ * Called with queue_lock and host_lock held.
*/
static inline int scsi_check_shost(struct request_queue *q,
struct Scsi_Host *shost,
@@ -1132,8 +1141,7 @@ static void scsi_request_fn(request_queu
struct Scsi_Host *shost = sdev->host;
struct scsi_cmnd *cmd;
struct request *req;
-
- ASSERT_LOCK(q->queue_lock, 1);
+ unsigned int flags;
/*
* To start with, we keep looping until the queue is empty, or until
@@ -1141,7 +1149,7 @@ static void scsi_request_fn(request_queu
*/
for (;;) {
if (blk_queue_plugged(q))
- break;
+ goto completed;
/*
* get next queueable request. We do this early to make sure
@@ -1152,28 +1160,29 @@ static void scsi_request_fn(request_queu
req = elv_next_request(q);
if (scsi_check_sdev(q, sdev))
- break;
+ goto completed;
+ spin_lock_irqsave(shost->host_lock, flags);
if (scsi_check_shost(q, shost, sdev))
- break;
+ goto after_host_lock;
if (sdev->single_lun && scsi_single_lun_check(sdev))
- break;
+ goto after_host_lock;
/*
* If we couldn't find a request that could be queued, then we
* can also quit.
*/
if (blk_queue_empty(q))
- break;
+ goto after_host_lock;
if (!req) {
/* If the device is busy, a returning I/O
* will restart the queue. Otherwise, we have
* to plug the queue */
- if (sdev->device_busy == 0)
+ if (sdev->device_busy == 1)
blk_plug_device(q);
- break;
+ goto after_host_lock;
}
cmd = req->special;
@@ -1195,11 +1204,9 @@ static void scsi_request_fn(request_queu
if (!(blk_queue_tagged(q) && (blk_queue_start_tag(q, req) == 0)))
blkdev_dequeue_request(req);
- /*
- * Now bump the usage count for both the host and the
- * device.
- */
shost->host_busy++;
+ spin_unlock_irqrestore(shost->host_lock, flags);
+
sdev->device_busy++;
spin_unlock_irq(q->queue_lock);
@@ -1220,6 +1227,11 @@ static void scsi_request_fn(request_queu
*/
spin_lock_irq(q->queue_lock);
}
+completed:
+ return;
+
+after_host_lock:
+ spin_unlock_irqrestore(shost->host_lock, flags);
}
u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
@@ -1241,15 +1253,20 @@ u64 scsi_calculate_bounce_limit(struct S
return BLK_BOUNCE_HIGH;
}
-request_queue_t *scsi_alloc_queue(struct Scsi_Host *shost)
+request_queue_t *scsi_alloc_queue(struct scsi_device *sdev)
{
request_queue_t *q;
+ struct Scsi_Host *shost;
q = kmalloc(sizeof(*q), GFP_ATOMIC);
if (!q)
return NULL;
memset(q, 0, sizeof(*q));
+ /*
+ * XXX move host code to scsi_register
+ */
+ shost = sdev->host;
if (!shost->max_sectors) {
/*
* Driver imposes no hard sector transfer limit.
@@ -1258,7 +1275,7 @@ request_queue_t *scsi_alloc_queue(struct
shost->max_sectors = SCSI_DEFAULT_MAX_SECTORS;
}
- blk_init_queue(q, scsi_request_fn, shost->host_lock);
+ blk_init_queue(q, scsi_request_fn, &sdev->sdev_lock);
blk_queue_prep_rq(q, scsi_prep_fn);
blk_queue_max_hw_segments(q, shost->sg_tablesize);
diff -purN -X /home/patman/dontdiff salloc_queue-25/drivers/scsi/scsi_scan.c lksplit-25/drivers/scsi/scsi_scan.c
--- salloc_queue-25/drivers/scsi/scsi_scan.c Mon Mar 24 12:15:05 2003
+++ lksplit-25/drivers/scsi/scsi_scan.c Mon Mar 24 12:15:10 2003
@@ -415,7 +415,8 @@ static struct scsi_device *scsi_alloc_sd
*/
sdev->borken = 1;
- sdev->request_queue = scsi_alloc_queue(shost);
+ spin_lock_init(&sdev->sdev_lock);
+ sdev->request_queue = scsi_alloc_queue(sdev);
if (!sdev->request_queue)
goto out_free_dev;
^ permalink raw reply [flat|nested] 34+ messages in thread
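[Editorial note: the central locking change in patch 6 is the starved_list drain in scsi_queue_next_request(): host_lock protects the starved list, but it is dropped before taking each starved device's queue_lock to run its queue, then retaken for the next loop check. A minimal userland sketch of that pop/drop/run/retake loop follows; the list helpers mirror the kernel's list.h, and the names and busy accounting are illustrative, not the kernel code.]

```c
#include <assert.h>
#include <stddef.h>

/* Userland model of the starved_list drain loop from patch 6.
 * List helpers mirror the kernel's <linux/list.h>; everything else
 * (struct sdev, the busy accounting) is illustrative only. */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }
static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev; n->next = h;
	h->prev->next = n; h->prev = n;
}
static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next; n->next->prev = n->prev;
	INIT_LIST_HEAD(n);
}

struct sdev {				/* stand-in for struct scsi_device */
	struct list_head starved_entry;
	int ran;			/* counts __blk_run_queue() calls */
};

/*
 * Pop starved devices while the host can still accept commands, the way
 * scsi_queue_next_request() does: unlink the entry "under host_lock",
 * "drop" the lock, run that device's queue "under its own queue_lock",
 * then "retake" host_lock for the next loop check.
 */
static void drain_starved(struct list_head *starved_list, int can_queue,
			  int *host_busy)
{
	while (!list_empty(starved_list) && *host_busy < can_queue) {
		struct sdev *sd = (struct sdev *)((char *)starved_list->next -
				  offsetof(struct sdev, starved_entry));
		list_del_init(&sd->starved_entry);	/* under host_lock */
		/* host_lock dropped; sd's queue_lock taken here */
		sd->ran++;
		(*host_busy)++;		/* pretend the run issued one command */
		/* sd's queue_lock dropped; host_lock retaken */
	}
}
```

The drain stops as soon as host_busy reaches can_queue, leaving the remaining devices queued for the next completion, which is exactly why a returning command must call scsi_queue_next_request().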
* [PATCH] 7/7 fix single_lun code for per-scsi_device queue_lock
2003-03-25 2:03 ` [PATCH] 6/7 add and use a per-scsi_device queue_lock Patrick Mansfield
@ 2003-03-25 2:04 ` Patrick Mansfield
2003-03-25 21:23 ` Luben Tuikov
2003-03-25 21:03 ` [PATCH] 6/7 add and use a " Luben Tuikov
2003-03-25 21:20 ` James Bottomley
2 siblings, 1 reply; 34+ messages in thread
From: Patrick Mansfield @ 2003-03-25 2:04 UTC (permalink / raw)
To: linux-scsi
Fix single_lun code for per-scsi_device queue_lock
diff -purN -X /home/patman/dontdiff lksplit-25/drivers/scsi/scsi.h sl_lksplit-25/drivers/scsi/scsi.h
--- lksplit-25/drivers/scsi/scsi.h Mon Mar 24 12:15:10 2003
+++ sl_lksplit-25/drivers/scsi/scsi.h Mon Mar 24 12:15:26 2003
@@ -531,6 +531,15 @@ extern struct list_head scsi_dev_info_li
extern int scsi_dev_info_list_add_str(char *);
/*
+ * scsi_target: representation of a scsi target, for now, this is only
+ * used for single_lun devices.
+ */
+struct scsi_target {
+ unsigned int starget_busy;
+ unsigned int starget_refcnt;
+};
+
+/*
* The scsi_device struct contains what we know about each given scsi
* device.
*
@@ -588,6 +597,7 @@ struct scsi_device {
unsigned char current_tag; /* current tag */
// unsigned char sync_min_period; /* Not less than this period */
// unsigned char sync_max_offset; /* Not greater than this offset */
+ struct scsi_target *sdev_target; /* used only for single_lun */
unsigned online:1;
unsigned writeable:1;
diff -purN -X /home/patman/dontdiff lksplit-25/drivers/scsi/scsi_lib.c sl_lksplit-25/drivers/scsi/scsi_lib.c
--- lksplit-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:15:10 2003
+++ sl_lksplit-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:15:26 2003
@@ -327,46 +327,36 @@ void scsi_setup_cmd_retry(struct scsi_cm
}
/*
- * Called for single_lun devices. The target associated with current_sdev can
- * only handle one active command at a time. Scan through all of the luns
- * on the same target as current_sdev, return 1 if any are active.
- */
-static int scsi_single_lun_check(struct scsi_device *current_sdev)
-{
- struct scsi_device *sdev;
-
- list_for_each_entry(sdev, ¤t_sdev->same_target_siblings,
- same_target_siblings)
- if (sdev->device_busy)
- return 1;
- return 0;
-}
-
-/*
- * Called for single_lun devices on IO completion. If no requests
- * outstanding for current_sdev, call __blk_run_queue for the next
- * scsi_device on the same target that has requests.
+ * Called for single_lun devices on IO completion. Clear starget_busy, and
+ * Call __blk_run_queue for all the scsi_devices on the target - including
+ * current_sdev first.
*
* Called with *no* scsi locks held.
*/
-static void scsi_single_lun_run(struct scsi_device *current_sdev,
- struct request_queue *q)
+static void scsi_single_lun_run(struct scsi_device *current_sdev)
{
struct scsi_device *sdev;
- struct Scsi_Host *shost;
+ unsigned int flags, flags2;
- shost = current_sdev->host;
- if (blk_queue_empty(q) && current_sdev->device_busy == 0 &&
- !shost->host_blocked && !shost->host_self_blocked &&
- !((shost->can_queue > 0) &&
- (shost->host_busy >= shost->can_queue)))
- list_for_each_entry(sdev, ¤t_sdev->same_target_siblings,
- same_target_siblings)
- if (!sdev->device_blocked &&
- !blk_queue_empty(sdev->request_queue)) {
- __blk_run_queue(sdev->request_queue);
- break;
- }
+ spin_lock_irqsave(current_sdev->request_queue->queue_lock, flags2);
+ spin_lock_irqsave(current_sdev->host->host_lock, flags);
+ WARN_ON(!current_sdev->sdev_target->starget_busy);
+ if (current_sdev->device_busy == 0)
+ current_sdev->sdev_target->starget_busy = 0;
+ spin_unlock_irqrestore(current_sdev->host->host_lock, flags);
+
+ /*
+ * Call __blk_run_queue for all LUNs on the target, starting with
+ * current_sdev.
+ */
+ __blk_run_queue(current_sdev->request_queue);
+ spin_unlock_irqrestore(current_sdev->request_queue->queue_lock, flags2);
+ list_for_each_entry(sdev, ¤t_sdev->same_target_siblings,
+ same_target_siblings) {
+ spin_lock_irqsave(sdev->request_queue->queue_lock, flags2);
+ __blk_run_queue(sdev->request_queue);
+ spin_unlock_irqrestore(sdev->request_queue->queue_lock, flags2);
+ }
}
/*
@@ -438,7 +428,7 @@ void scsi_queue_next_request(request_que
sdev = q->queuedata;
if (sdev->single_lun)
- scsi_single_lun_run(sdev, q);
+ scsi_single_lun_run(sdev);
shost = sdev->host;
spin_lock_irqsave(shost->host_lock, flags);
@@ -1166,7 +1156,8 @@ static void scsi_request_fn(request_queu
if (scsi_check_shost(q, shost, sdev))
goto after_host_lock;
- if (sdev->single_lun && scsi_single_lun_check(sdev))
+ if (sdev->single_lun && !sdev->device_busy &&
+ sdev->sdev_target->starget_busy)
goto after_host_lock;
/*
@@ -1204,6 +1195,9 @@ static void scsi_request_fn(request_queu
if (!(blk_queue_tagged(q) && (blk_queue_start_tag(q, req) == 0)))
blkdev_dequeue_request(req);
+ if (sdev->single_lun)
+ sdev->sdev_target->starget_busy = 1;
+
shost->host_busy++;
spin_unlock_irqrestore(shost->host_lock, flags);
diff -purN -X /home/patman/dontdiff lksplit-25/drivers/scsi/scsi_scan.c sl_lksplit-25/drivers/scsi/scsi_scan.c
--- lksplit-25/drivers/scsi/scsi_scan.c Mon Mar 24 12:15:10 2003
+++ sl_lksplit-25/drivers/scsi/scsi_scan.c Mon Mar 24 12:15:26 2003
@@ -478,6 +478,8 @@ out:
**/
static void scsi_free_sdev(struct scsi_device *sdev)
{
+ unsigned int flags;
+
list_del(&sdev->siblings);
list_del(&sdev->same_target_siblings);
@@ -487,6 +489,14 @@ static void scsi_free_sdev(struct scsi_d
sdev->host->hostt->slave_destroy(sdev);
if (sdev->inquiry)
kfree(sdev->inquiry);
+ if (sdev->single_lun) {
+ spin_lock_irqsave(sdev->host->host_lock, flags);
+ sdev->sdev_target->starget_refcnt--;
+ if (sdev->sdev_target->starget_refcnt == 0)
+ kfree(sdev->sdev_target);
+ spin_unlock_irqrestore(sdev->host->host_lock, flags);
+ }
+
kfree(sdev);
}
@@ -1122,6 +1132,10 @@ static void scsi_probe_lun(Scsi_Request
static int scsi_add_lun(Scsi_Device *sdev, Scsi_Request *sreq,
char *inq_result, int *bflags)
{
+ struct scsi_device *sdev_sibling;
+ struct scsi_target *starget;
+ unsigned int flags;
+
/*
* XXX do not save the inquiry, since it can change underneath us,
* save just vendor/model/rev.
@@ -1243,10 +1257,38 @@ static int scsi_add_lun(Scsi_Device *sde
/*
* If we need to allow I/O to only one of the luns attached to
- * this target id at a time, then we set this flag.
+ * this target id at a time set single_lun, and allocate or modify
+ * sdev_target.
*/
- if (*bflags & BLIST_SINGLELUN)
+ if (*bflags & BLIST_SINGLELUN) {
sdev->single_lun = 1;
+ spin_lock_irqsave(sdev->host->host_lock, flags);
+ starget = NULL;
+ /*
+ * Search for an existing target for this sdev.
+ */
+ list_for_each_entry(sdev_sibling, &sdev->same_target_siblings,
+ same_target_siblings) {
+ if (sdev_sibling->sdev_target != NULL) {
+ starget = sdev_sibling->sdev_target;
+ break;
+ }
+ }
+ if (!starget) {
+ starget = kmalloc(sizeof(*starget), GFP_KERNEL);
+ if (!starget) {
+ printk(ALLOC_FAILURE_MSG, __FUNCTION__);
+ spin_unlock_irqrestore(sdev->host->host_lock,
+ flags);
+ return SCSI_SCAN_NO_RESPONSE;
+ }
+ starget->starget_refcnt = 0;
+ starget->starget_busy = 0;
+ }
+ starget->starget_refcnt++;
+ sdev->sdev_target = starget;
+ spin_unlock_irqrestore(sdev->host->host_lock, flags);
+ }
/* if the device needs this changing, it may do so in the detect
* function */
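[Editorial note: patch 7 above hangs single_lun state off a small, refcounted struct scsi_target shared by all LUNs on a target: scsi_add_lun() reuses a sibling's target or allocates a fresh one, and scsi_free_sdev() frees it when the last reference drops. A userland sketch of that allocate-or-share/refcount lifecycle follows; the sibling array and helper names are invented for illustration, where the kernel walks same_target_siblings under host_lock.]

```c
#include <assert.h>
#include <stdlib.h>

/* Userland sketch of patch 7's shared scsi_target bookkeeping.
 * Names mirror the patch; the code itself is illustrative only. */
struct scsi_target {
	unsigned int starget_busy;
	unsigned int starget_refcnt;
};

struct sdev {				/* stand-in for struct scsi_device */
	struct scsi_target *sdev_target;
};

/*
 * Attach sdev to an existing sibling's target if one exists, otherwise
 * allocate a fresh one; bump the refcount either way. Returns -1 on
 * allocation failure (scsi_add_lun returns SCSI_SCAN_NO_RESPONSE there).
 */
static int attach_target(struct sdev *sdev, struct sdev **siblings, int n)
{
	struct scsi_target *starget = NULL;
	int i;

	for (i = 0; i < n; i++)
		if (siblings[i]->sdev_target) {
			starget = siblings[i]->sdev_target;
			break;
		}
	if (!starget) {
		starget = calloc(1, sizeof(*starget));
		if (!starget)
			return -1;
	}
	starget->starget_refcnt++;
	sdev->sdev_target = starget;
	return 0;
}

/* Drop sdev's reference; free the target when the last LUN goes away. */
static void detach_target(struct sdev *sdev)
{
	if (--sdev->sdev_target->starget_refcnt == 0)
		free(sdev->sdev_target);
	sdev->sdev_target = NULL;
}
```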
* Re: [PATCH] 5/7 alloc a request_queue on each scsi_alloc_sdev call
2003-03-25 2:03 ` [PATCH] 5/7 alloc a request_queue on each scsi_alloc_sdev call Patrick Mansfield
2003-03-25 2:03 ` [PATCH] 6/7 add and use a per-scsi_device queue_lock Patrick Mansfield
@ 2003-03-25 7:12 ` Christoph Hellwig
2003-03-25 7:18 ` Jens Axboe
1 sibling, 1 reply; 34+ messages in thread
From: Christoph Hellwig @ 2003-03-25 7:12 UTC (permalink / raw)
To: Patrick Mansfield; +Cc: linux-scsi
On Mon, Mar 24, 2003 at 06:03:25PM -0800, Patrick Mansfield wrote:
> Call scsi_alloc_queue each time we call scsi_alloc_sdev; call
> scsi_free_queue each time we do not find backing LUN.
>
> This code is less optimal, but leads to cleaner code, and the lock
> split-up patch needs this change.
I think this needs to be deferred until akpm (or someone else) implements
the lazy request allocation.
* Re: [PATCH] 5/7 alloc a request_queue on each scsi_alloc_sdev call
2003-03-25 7:12 ` [PATCH] 5/7 alloc a request_queue on each scsi_alloc_sdev call Christoph Hellwig
@ 2003-03-25 7:18 ` Jens Axboe
0 siblings, 0 replies; 34+ messages in thread
From: Jens Axboe @ 2003-03-25 7:18 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Patrick Mansfield, linux-scsi
On Tue, Mar 25 2003, Christoph Hellwig wrote:
> On Mon, Mar 24, 2003 at 06:03:25PM -0800, Patrick Mansfield wrote:
> > Call scsi_alloc_queue each time we call scsi_alloc_sdev; call
> > scsi_free_queue each time we do not find backing LUN.
> >
> > This code is less optimal, but leads to cleaner code, and the lock
> > split-up patch needs this change.
>
> I think this needs to be deferred until akpm (or someone else) implemented
> the lazy request allocation.
lazy request allocation is coming shortly, hopefully today.
--
Jens Axboe
* Re: [PATCH] 1/7 starved changes - use a list_head for starved queue's
2003-03-25 1:54 ` [PATCH] 1/7 starved changes - use a list_head for starved queue's Patrick Mansfield
2003-03-25 2:02 ` [PATCH] 2/7 add missing scsi_queue_next_request calls Patrick Mansfield
@ 2003-03-25 19:39 ` Luben Tuikov
1 sibling, 0 replies; 34+ messages in thread
From: Luben Tuikov @ 2003-03-25 19:39 UTC (permalink / raw)
To: Patrick Mansfield; +Cc: linux-scsi
Patrick Mansfield wrote:
> Use a list_head per scsi_host to store a list of scsi request queues
> that were "starved" (they were not able to send IO because of per host
> limitations).
This patch has been discussed already!
The only meaningful piece is the addition of the starved_entry
and starved_list, which should probably be incorporated
logically into your later patches.
This will make the maintainers' job a lot easier and
the patch(es) a lot more logically bound.
> diff -purN -X /home/patman/dontdiff 25-bk-base/drivers/scsi/hosts.c starve-25/drivers/scsi/hosts.c
> --- 25-bk-base/drivers/scsi/hosts.c Thu Mar 6 12:34:47 2003
> +++ starve-25/drivers/scsi/hosts.c Mon Mar 24 12:14:28 2003
> @@ -383,6 +383,7 @@ struct Scsi_Host * scsi_register(Scsi_Ho
> scsi_assign_lock(shost, &shost->default_lock);
> INIT_LIST_HEAD(&shost->my_devices);
> INIT_LIST_HEAD(&shost->eh_cmd_q);
> + INIT_LIST_HEAD(&shost->starved_list);
>
> init_waitqueue_head(&shost->host_wait);
> shost->dma_channel = 0xff;
> diff -purN -X /home/patman/dontdiff 25-bk-base/drivers/scsi/hosts.h starve-25/drivers/scsi/hosts.h
> --- 25-bk-base/drivers/scsi/hosts.h Tue Mar 18 12:53:33 2003
> +++ starve-25/drivers/scsi/hosts.h Mon Mar 24 12:14:28 2003
> @@ -380,6 +380,7 @@ struct Scsi_Host
> struct scsi_host_cmd_pool *cmd_pool;
> spinlock_t free_list_lock;
> struct list_head free_list; /* backup store of cmd structs */
> + struct list_head starved_list;
>
> spinlock_t default_lock;
> spinlock_t *host_lock;
> @@ -471,12 +472,6 @@ struct Scsi_Host
> unsigned reverse_ordering:1;
>
> /*
> - * Indicates that one or more devices on this host were starved, and
> - * when the device becomes less busy that we need to feed them.
> - */
> - unsigned some_device_starved:1;
> -
> - /*
> * Host has rejected a command because it was busy.
> */
> unsigned int host_blocked;
> diff -purN -X /home/patman/dontdiff 25-bk-base/drivers/scsi/scsi.h starve-25/drivers/scsi/scsi.h
> --- 25-bk-base/drivers/scsi/scsi.h Mon Mar 24 11:57:13 2003
> +++ starve-25/drivers/scsi/scsi.h Mon Mar 24 12:14:28 2003
> @@ -555,6 +555,7 @@ struct scsi_device {
> volatile unsigned short device_busy; /* commands actually active on low-level */
> spinlock_t list_lock;
> struct list_head cmd_list; /* queue of in use SCSI Command structures */
> + struct list_head starved_entry;
> Scsi_Cmnd *current_cmnd; /* currently active command */
> unsigned short queue_depth; /* How deep of a queue we want */
> unsigned short last_queue_full_depth; /* These two are used by */
> @@ -615,8 +616,6 @@ struct scsi_device {
> * because we did a bus reset. */
> unsigned ten:1; /* support ten byte read / write */
> unsigned remap:1; /* support remapping */
> - unsigned starved:1; /* unable to process commands because
> - host busy */
> // unsigned sync:1; /* Sync transfer state, managed by host */
> // unsigned wide:1; /* WIDE transfer state, managed by host */
>
> diff -purN -X /home/patman/dontdiff 25-bk-base/drivers/scsi/scsi_lib.c starve-25/drivers/scsi/scsi_lib.c
> --- 25-bk-base/drivers/scsi/scsi_lib.c Tue Mar 18 12:53:33 2003
> +++ starve-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:14:28 2003
> @@ -356,7 +356,6 @@ static void scsi_queue_next_request(requ
> struct scsi_device *sdev, *sdev2;
> struct Scsi_Host *shost;
> unsigned long flags;
> - int all_clear;
>
> ASSERT_LOCK(q->queue_lock, 0);
>
> @@ -383,11 +382,6 @@ static void scsi_queue_next_request(requ
> __elv_add_request(q, cmd->request, 0, 0);
> }
>
> - /*
> - * Just hit the requeue function for the queue.
> - */
> - __blk_run_queue(q);
> -
> sdev = q->queuedata;
> shost = sdev->host;
>
> @@ -412,31 +406,24 @@ static void scsi_queue_next_request(requ
> }
> }
>
> - /*
> - * Now see whether there are other devices on the bus which
> - * might be starved. If so, hit the request function. If we
> - * don't find any, then it is safe to reset the flag. If we
> - * find any device that it is starved, it isn't safe to reset the
> - * flag as the queue function releases the lock and thus some
> - * other device might have become starved along the way.
> - */
> - all_clear = 1;
> - if (shost->some_device_starved) {
> - list_for_each_entry(sdev, &shost->my_devices, siblings) {
> - if (shost->can_queue > 0 &&
> - shost->host_busy >= shost->can_queue)
> - break;
> - if (shost->host_blocked || shost->host_self_blocked)
> - break;
> - if (sdev->device_blocked || !sdev->starved)
> - continue;
> - __blk_run_queue(sdev->request_queue);
> - all_clear = 0;
> - }
> -
> - if (sdev == NULL && all_clear)
> - shost->some_device_starved = 0;
> + while (!list_empty(&shost->starved_list) &&
> + !shost->host_blocked && !shost->host_self_blocked &&
> + !((shost->can_queue > 0) &&
> + (shost->host_busy >= shost->can_queue))) {
> + /*
> + * As long as shost is accepting commands and we have
> + * starved queues, call __blk_run_queue. scsi_request_fn
> + * drops the queue_lock and can add us back to the
> + * starved_list.
> + */
> + sdev2 = list_entry(shost->starved_list.next,
> + struct scsi_device, starved_entry);
> + list_del_init(&sdev2->starved_entry);
> + __blk_run_queue(sdev2->request_queue);
> }
> +
> + __blk_run_queue(q);
> +
> spin_unlock_irqrestore(q->queue_lock, flags);
> }
>
> @@ -1115,23 +1102,16 @@ static void scsi_request_fn(request_queu
> */
> if (sdev->device_blocked)
> break;
> +
> + if (!list_empty(&sdev->starved_entry))
> + break;
> +
> if ((shost->can_queue > 0 && shost->host_busy >= shost->can_queue) ||
> shost->host_blocked || shost->host_self_blocked) {
> - /*
> - * If we are unable to process any commands at all for
> - * this device, then we consider it to be starved.
> - * What this means is that there are no outstanding
> - * commands for this device and hence we need a
> - * little help getting it started again
> - * once the host isn't quite so busy.
> - */
> - if (sdev->device_busy == 0) {
> - sdev->starved = 1;
> - shost->some_device_starved = 1;
> - }
> + list_add_tail(&sdev->starved_entry,
> + &shost->starved_list);
> break;
> - } else
> - sdev->starved = 0;
> + }
>
> /*
> * If we couldn't find a request that could be queued, then we
> diff -purN -X /home/patman/dontdiff 25-bk-base/drivers/scsi/scsi_scan.c starve-25/drivers/scsi/scsi_scan.c
> --- 25-bk-base/drivers/scsi/scsi_scan.c Mon Mar 24 11:57:13 2003
> +++ starve-25/drivers/scsi/scsi_scan.c Mon Mar 24 12:14:28 2003
> @@ -400,6 +400,7 @@ static struct scsi_device *scsi_alloc_sd
> INIT_LIST_HEAD(&sdev->siblings);
> INIT_LIST_HEAD(&sdev->same_target_siblings);
> INIT_LIST_HEAD(&sdev->cmd_list);
> + INIT_LIST_HEAD(&sdev->starved_entry);
> spin_lock_init(&sdev->list_lock);
>
> /*
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Luben Tuikov, Senior Software Engineer, Splentec Ltd.
Bus: +1-905-707-1954x112, 9-5 EDT. Fax: +1-905-707-1974.
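[Editorial note: the quoted patch replaces the old `starved` bitfield with list membership. Because entries are always removed with list_del_init(), `list_empty(&sdev->starved_entry)` doubles as an "already queued?" test, which is what the added early `break` in scsi_request_fn relies on. A self-contained sketch of that idiom follows; the list helpers mirror the kernel's list.h, and the names are illustrative.]

```c
#include <assert.h>

/* Minimal intrusive list, modeled on the kernel's <linux/list.h>. */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }
static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev; n->next = h;
	h->prev->next = n; h->prev = n;
}
static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next; n->next->prev = n->prev;
	INIT_LIST_HEAD(n);	/* self-linked again: membership testable */
}

struct sdev { struct list_head starved_entry; };

/*
 * Queue sdev on the host's starved list unless it is already there.
 * Returns 1 if it was added, 0 if the membership test short-circuited
 * (the patch's "if (!list_empty(&sdev->starved_entry)) break;").
 */
static int mark_starved(struct sdev *sd, struct list_head *starved_list)
{
	if (!list_empty(&sd->starved_entry))
		return 0;
	list_add_tail(&sd->starved_entry, starved_list);
	return 1;
}
```

Using list_del() instead of list_del_init() would leave the entry's pointers stale and break this membership test, which is why the drain loop in the patch also uses list_del_init().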
* Re: [PATCH] 2/7 add missing scsi_queue_next_request calls
2003-03-25 2:02 ` [PATCH] 2/7 add missing scsi_queue_next_request calls Patrick Mansfield
2003-03-25 2:02 ` [PATCH] 3/7 consolidate single_lun code Patrick Mansfield
@ 2003-03-25 19:41 ` Luben Tuikov
1 sibling, 0 replies; 34+ messages in thread
From: Luben Tuikov @ 2003-03-25 19:41 UTC (permalink / raw)
To: Patrick Mansfield; +Cc: linux-scsi
Patrick Mansfield wrote:
> Add missing scsi_queue_next_request calls.
>
> Add missing scsi_put_command and scsi_get_command exports.
No logistical problem here, except that in the future those
scattered calls to scsi_queue_next_request() will have to
be centralized and their count will have to be ONE.
(My same opinion as before.)
> diff -purN -X /home/patman/dontdiff starve-25/drivers/scsi/scsi.h put_cmd-25/drivers/scsi/scsi.h
> --- starve-25/drivers/scsi/scsi.h Mon Mar 24 12:14:28 2003
> +++ put_cmd-25/drivers/scsi/scsi.h Mon Mar 24 12:14:51 2003
> @@ -417,6 +417,7 @@ extern void scsi_setup_cmd_retry(Scsi_Cm
> extern void scsi_io_completion(Scsi_Cmnd * SCpnt, int good_sectors,
> int block_sectors);
> extern int scsi_queue_insert(struct scsi_cmnd *cmd, int reason);
> +extern void scsi_queue_next_request(request_queue_t *q, struct scsi_cmnd *cmd);
> extern request_queue_t *scsi_alloc_queue(struct Scsi_Host *shost);
> extern void scsi_free_queue(request_queue_t *q);
> extern int scsi_init_queue(void);
> diff -purN -X /home/patman/dontdiff starve-25/drivers/scsi/scsi_error.c put_cmd-25/drivers/scsi/scsi_error.c
> --- starve-25/drivers/scsi/scsi_error.c Tue Mar 18 12:53:33 2003
> +++ put_cmd-25/drivers/scsi/scsi_error.c Mon Mar 24 12:14:51 2003
> @@ -1681,6 +1681,7 @@ scsi_reset_provider(struct scsi_device *
> struct scsi_cmnd *scmd = scsi_get_command(dev, GFP_KERNEL);
> struct request req;
> int rtn;
> + struct request_queue *q;
>
> scmd->request = &req;
> memset(&scmd->eh_timeout, 0, sizeof(scmd->eh_timeout));
> @@ -1735,6 +1736,8 @@ scsi_reset_provider(struct scsi_device *
> }
>
> scsi_delete_timer(scmd);
> + q = scmd->device->request_queue;
> scsi_put_command(scmd);
> + scsi_queue_next_request(q, NULL);
> return rtn;
> }
> diff -purN -X /home/patman/dontdiff starve-25/drivers/scsi/scsi_lib.c put_cmd-25/drivers/scsi/scsi_lib.c
> --- starve-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:14:28 2003
> +++ put_cmd-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:14:51 2003
> @@ -174,14 +174,18 @@ void scsi_do_req(struct scsi_request *sr
> void (*done)(struct scsi_cmnd *),
> int timeout, int retries)
> {
> + struct request_queue *q;
> +
> /*
> * If the upper level driver is reusing these things, then
> * we should release the low-level block now. Another one will
> * be allocated later when this request is getting queued.
> */
> if (sreq->sr_command) {
> + q = sreq->sr_command->device->request_queue;
> scsi_put_command(sreq->sr_command);
> sreq->sr_command = NULL;
> + scsi_queue_next_request(q, NULL);
> }
>
> /*
> @@ -228,6 +232,7 @@ static void scsi_wait_done(struct scsi_c
> void scsi_wait_req(struct scsi_request *sreq, const void *cmnd, void *buffer,
> unsigned bufflen, int timeout, int retries)
> {
> + struct request_queue *q;
> DECLARE_COMPLETION(wait);
>
> sreq->sr_request->waiting = &wait;
> @@ -239,7 +244,9 @@ void scsi_wait_req(struct scsi_request *
> sreq->sr_request->waiting = NULL;
>
> if (sreq->sr_command) {
> + q = sreq->sr_command->device->request_queue;
> scsi_put_command(sreq->sr_command);
> + scsi_queue_next_request(q, NULL);
> sreq->sr_command = NULL;
> }
> }
> @@ -351,7 +358,7 @@ void scsi_setup_cmd_retry(struct scsi_cm
> * permutations grows as 2**N, and if too many more special cases
> * get added, we start to get screwed.
> */
> -static void scsi_queue_next_request(request_queue_t *q, struct scsi_cmnd *cmd)
> +void scsi_queue_next_request(request_queue_t *q, struct scsi_cmnd *cmd)
> {
> struct scsi_device *sdev, *sdev2;
> struct Scsi_Host *shost;
> diff -purN -X /home/patman/dontdiff starve-25/drivers/scsi/scsi_syms.c put_cmd-25/drivers/scsi/scsi_syms.c
> --- starve-25/drivers/scsi/scsi_syms.c Thu Mar 6 12:34:52 2003
> +++ put_cmd-25/drivers/scsi/scsi_syms.c Mon Mar 24 12:14:51 2003
> @@ -60,6 +60,8 @@ EXPORT_SYMBOL(scsi_allocate_request);
> EXPORT_SYMBOL(scsi_release_request);
> EXPORT_SYMBOL(scsi_wait_req);
> EXPORT_SYMBOL(scsi_do_req);
> +EXPORT_SYMBOL(scsi_get_command);
> +EXPORT_SYMBOL(scsi_put_command);
>
> EXPORT_SYMBOL(scsi_report_bus_reset);
> EXPORT_SYMBOL(scsi_block_requests);
* Re: [PATCH] 3/7 consolidate single_lun code
2003-03-25 2:02 ` [PATCH] 3/7 consolidate single_lun code Patrick Mansfield
2003-03-25 2:03 ` [PATCH] 4/7 cleanup/consolidate code in scsi_request_fn Patrick Mansfield
@ 2003-03-25 20:36 ` Luben Tuikov
2003-03-26 19:11 ` Patrick Mansfield
2003-03-25 20:50 ` Luben Tuikov
2 siblings, 1 reply; 34+ messages in thread
From: Luben Tuikov @ 2003-03-25 20:36 UTC (permalink / raw)
To: Patrick Mansfield; +Cc: linux-scsi
Patrick Mansfield wrote:
> Consolidate scsi single_lun code.
>
> diff -purN -X /home/patman/dontdiff put_cmd-25/drivers/scsi/scsi_lib.c lun-single-25/drivers/scsi/scsi_lib.c
> --- put_cmd-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:14:51 2003
> +++ lun-single-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:14:55 2003
> @@ -323,6 +323,49 @@ void scsi_setup_cmd_retry(struct scsi_cm
> }
>
> /*
> + * Called for single_lun devices. The target associated with current_sdev can
> + * only handle one active command at a time. Scan through all of the luns
> + * on the same target as current_sdev, return 1 if any are active.
> + */
Here is part of your comment:
A single_lun (single_lu:1) marked target can handle a single
active LU at a time.
> +static int scsi_single_lun_check(struct scsi_device *current_sdev)
Here is the name:
static int scsi_single_lun_active(struct scsi_device *dev);
The reason is that this makes sense:
if (scsi_single_lun_active(dev)) {
/* then there is an active lu */
} else {
/* No, there are no active devices, or
the target is NOT a single lun. */
}
or the negation:
if (!scsi_single_lun_active(dev)) {
/* no active lun devices, or target is not a single
LU -- either way we can continue here: */
...
}
And your comment continues:
Returns 1 if the target has an active LU at this time,
else 0.
Ideally you'd do this:
static int scsi_lu_active(struct scsi_target *target);
Then go over the list of LUs of the target.
(You'd have to add things to the scsi_target structure, etc...,
see next emails.)
{
if (!target->single_lun)
return 0;
for all devices
if an LU is busy
return 1;
return 0;
} /* end scsi_lu_active() */
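As a compilable illustration of the pseudocode above -- a user-space sketch only, with the target's LUs held in a plain array instead of a kernel list, and the names (struct scsi_target_sketch, nr_lus) hypothetical:

```c
#include <assert.h>

struct lu {
	int device_busy;		/* commands active on this LU */
};

struct scsi_target_sketch {
	int single_lun;			/* target handles one active LU at a time */
	struct lu *lus;			/* LUs of this target */
	int nr_lus;
};

/* Returns 1 if the target has an active LU at this time, else 0.
 * A target that is not single_lun never reports an active LU, so the
 * negated test reads naturally at the call site. */
static int scsi_lu_active(const struct scsi_target_sketch *target)
{
	int i;

	if (!target->single_lun)
		return 0;
	for (i = 0; i < target->nr_lus; i++)
		if (target->lus[i].device_busy)
			return 1;
	return 0;
}
```

The caller's test then reads `if (!scsi_lu_active(target)) { /* safe to queue */ }`, exactly as argued above.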
> +{
> + struct scsi_device *sdev;
> +
> + list_for_each_entry(sdev, &current_sdev->same_target_siblings,
> + same_target_siblings)
> + if (sdev->device_busy)
> + return 1;
> + return 0;
> +}
> +
> +/*
> + * Called for single_lun devices on IO completion. If no requests
> + * outstanding for current_sdev, call __blk_run_queue for the next
> + * scsi_device on the same target that has requests.
> + *
> + * Called with queue_lock held.
> + */
> +static void scsi_single_lun_run(struct scsi_device *current_sdev,
> + struct request_queue *q)
Now here I've got a problem. This just pulls out the old logic
into a separate fn.
I'd rather you did this: in the request fn put the single_lun device
into the starved devices list, just as you'd do for other devices.
If you want to discern between devices, you can add single_lun
devices to the front and fully capable devices into the tail of the
starved devices list -- this would make the starved devices list
into a priority queue.
Then have a general function to run the starved devices queues
as follows:
/**
* Function: scsi_run_starved_devs()
* Purpose: Run the request queue of starved devices of a host.
* Arguments: shost, pointer to struct Scsi_Host.
* Returns: Nothing.
* Notes: This function may not necessarily empty the starved device
* list. For this reason the following should be observed:
* - One shot prioritization: if the caller prioritizes
* the starved device list, then the caller has to prioritize
* it _before_ a call to this fn.
* - Priority queue: if the starved device list is treated as
* a priority queue, no prior arrangements are necessary. The
* reason is that if any entries are inserted in due time
* of running of this function, the leftovers are added at
* the front of the starved list, thus ordering them by time.
*/
static inline void scsi_run_starved_devs(struct Scsi_Host *shost)
{
unsigned long flags;
LIST_HEAD(starved);
spin_lock_irqsave(&shost->starved_devs_lock, flags);
list_splice_init(&shost->starved_devs, &starved);
spin_unlock_irqrestore(&shost->starved_devs_lock, flags);
while (!list_empty(&starved)) {
struct scsi_device *dev;
if (shost->host_blocked || shost->host_self_blocked ||
(shost->can_queue > 0 &&
shost->host_busy >= shost->can_queue))
break;
dev = list_entry(starved.next, struct scsi_device, starved_entry);
list_del_init(&dev->starved_entry);
spin_lock_irqsave(dev->request_queue->queue_lock, flags);
__blk_run_queue(dev->request_queue);
spin_unlock_irqrestore(dev->request_queue->queue_lock, flags);
}
if (!list_empty(&starved)) {
spin_lock_irqsave(&shost->starved_devs_lock, flags);
list_splice_init(&starved, &shost->starved_devs);
spin_unlock_irqrestore(&shost->starved_devs_lock, flags);
}
}
Now since __blk_run_queue(q) calls the scsi_request_fn() it MAY
add the device back to the starved list, but that's no problem
as the list is already empty. If the host blocks in due time, then
the leftover devices are added to the front and we get
prioritization by time (of submittal).
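The splice-and-requeue pattern above can be exercised in user space; the following is a sketch only (single-threaded, with no locks, a minimal stand-in for the kernel's list_head, and hypothetical names like run_starved and budget standing in for the host_busy/can_queue test):

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void __list_add(struct list_head *n, struct list_head *prev,
		       struct list_head *next)
{
	next->prev = n; n->next = next; n->prev = prev; prev->next = n;
}
static void list_add_tail(struct list_head *n, struct list_head *h)
{
	__list_add(n, h->prev, h);
}
static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next; n->next->prev = n->prev; list_init(n);
}
/* Move everything on 'from' to the FRONT of 'to', leaving 'from' empty. */
static void list_splice_init(struct list_head *from, struct list_head *to)
{
	if (!list_empty(from)) {
		struct list_head *first = from->next, *last = from->prev;
		first->prev = to;
		last->next = to->next;
		to->next->prev = last;
		to->next = first;
		list_init(from);
	}
}

struct sdev { int id; struct list_head starved_entry; };

/* Drain a private copy of the starved list; stop when the "host" has no
 * more room (budget), then re-splice the leftovers at the front so they
 * keep their submission order.  Returns how many devices were run. */
static int run_starved(struct list_head *starved_devs, int budget)
{
	struct list_head local;
	int ran = 0;

	list_init(&local);
	list_splice_init(starved_devs, &local);	/* would be under host_lock */
	while (!list_empty(&local) && budget-- > 0) {
		struct sdev *dev = (struct sdev *)((char *)local.next -
				offsetof(struct sdev, starved_entry));
		list_del_init(&dev->starved_entry);
		ran++;				/* stands in for __blk_run_queue */
	}
	list_splice_init(&local, starved_devs);	/* leftovers back at the front */
	return ran;
}
```

The key property is the one argued above: leftovers go back to the front, so devices keep their ordering by submittal time across calls.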
> +{
> + struct scsi_device *sdev;
> + struct Scsi_Host *shost;
> +
> + shost = current_sdev->host;
> + if (blk_queue_empty(q) && current_sdev->device_busy == 0 &&
> + !shost->host_blocked && !shost->host_self_blocked &&
> + !((shost->can_queue > 0) &&
> + (shost->host_busy >= shost->can_queue)))
> + list_for_each_entry(sdev, &current_sdev->same_target_siblings,
> + same_target_siblings)
> + if (!sdev->device_blocked &&
> + !blk_queue_empty(sdev->request_queue)) {
> + __blk_run_queue(sdev->request_queue);
> + break;
> + }
> +}
> +
> +/*
> * Function: scsi_queue_next_request()
> *
> * Purpose: Handle post-processing of completed commands.
And this is how scsi_queue_next_request() might look like:
(i.e. a more general form)
static void scsi_queue_next_request(request_queue_t *q, struct scsi_cmnd *cmd)
{
struct scsi_device *sdev, *sdev2;
struct Scsi_Host *shost;
unsigned long flags;
ASSERT_LOCK(q->queue_lock, 0);
sdev = q->queuedata;
shost = sdev->host;
scsi_run_starved_devs(shost);
spin_lock_irqsave(q->queue_lock, flags);
if (cmd != NULL) {
/*
* For some reason, we are not done with this request.
* This happens for I/O errors in the middle of the request,
* in which case we need to request the blocks that come after
* the bad sector.
*/
cmd->request->special = cmd;
if (blk_rq_tagged(cmd->request))
blk_queue_end_tag(q, cmd->request);
/*
* set REQ_SPECIAL - we have a command
* clear REQ_DONTPREP - we assume the sg table has been
* nuked so we need to set it up again.
*/
cmd->request->flags |= REQ_SPECIAL;
cmd->request->flags &= ~REQ_DONTPREP;
__elv_add_request(q, cmd->request, 0, 0);
}
__blk_run_queue(q);
spin_unlock_irqrestore(q->queue_lock, flags);
}
And then in scsi_request_fn() you'd generally include
starved devices in the starved list.
So to summarize: overall, you'd get single_lun devices to the
front of the starved list, while others enqueue at the end.
This will give leverage and give way to single_lun target devices,
as they are more finicky.
> @@ -390,29 +433,11 @@ void scsi_queue_next_request(request_que
> }
>
> sdev = q->queuedata;
> - shost = sdev->host;
>
> - /*
> - * If this is a single-lun device, and we are currently finished
> - * with this device, then see if we need to get another device
> - * started. FIXME(eric) - if this function gets too cluttered
> - * with special case code, then spin off separate versions and
> - * use function pointers to pick the right one.
> - */
> - if (sdev->single_lun && blk_queue_empty(q) && sdev->device_busy ==0 &&
> - !shost->host_blocked && !shost->host_self_blocked &&
> - !((shost->can_queue > 0) && (shost->host_busy >=
> - shost->can_queue))) {
> - list_for_each_entry(sdev2, &sdev->same_target_siblings,
> - same_target_siblings) {
> - if (!sdev2->device_blocked &&
> - !blk_queue_empty(sdev2->request_queue)) {
> - __blk_run_queue(sdev2->request_queue);
> - break;
> - }
> - }
> - }
> + if (sdev->single_lun)
> + scsi_single_lun_run(sdev, q);
>
> + shost = sdev->host;
> while (!list_empty(&shost->starved_list) &&
> !shost->host_blocked && !shost->host_self_blocked &&
> !((shost->can_queue > 0) &&
> @@ -917,22 +942,6 @@ static int scsi_init_io(struct scsi_cmnd
> return BLKPREP_KILL;
> }
>
> -/*
> - * The target associated with myself can only handle one active command at
> - * a time. Scan through all of the luns on the same target as myself,
> - * return 1 if any are active.
> - */
> -static int check_all_luns(struct scsi_device *myself)
> -{
> - struct scsi_device *sdev;
> -
> - list_for_each_entry(sdev, &myself->same_target_siblings,
> - same_target_siblings)
> - if (sdev->device_busy)
> - return 1;
> - return 0;
> -}
> -
> static int scsi_prep_fn(struct request_queue *q, struct request *req)
> {
> struct Scsi_Device_Template *sdt;
> @@ -1077,9 +1086,6 @@ static void scsi_request_fn(request_queu
> if (sdev->device_busy >= sdev->queue_depth)
> break;
>
> - if (sdev->single_lun && check_all_luns(sdev))
> - break;
> -
> if (shost->host_busy == 0 && shost->host_blocked) {
> /* unblock after host_blocked iterates to zero */
> if (--shost->host_blocked == 0) {
> @@ -1120,6 +1126,9 @@ static void scsi_request_fn(request_queu
> break;
> }
>
> + if (sdev->single_lun && scsi_single_lun_check(sdev))
> + break;
> +
> /*
> * If we couldn't find a request that could be queued, then we
> * can also quit.
--
Luben Tuikov, Senior Software Engineer, Splentec Ltd.
Bus: +1-905-707-1954x112, 9-5 EDT. Fax: +1-905-707-1974.
* Re: [PATCH] 3/7 consolidate single_lun code
2003-03-25 2:02 ` [PATCH] 3/7 consolidate single_lun code Patrick Mansfield
2003-03-25 2:03 ` [PATCH] 4/7 cleanup/consolidate code in scsi_request_fn Patrick Mansfield
2003-03-25 20:36 ` [PATCH] 3/7 consolidate single_lun code Luben Tuikov
@ 2003-03-25 20:50 ` Luben Tuikov
2 siblings, 0 replies; 34+ messages in thread
From: Luben Tuikov @ 2003-03-25 20:50 UTC (permalink / raw)
To: Patrick Mansfield; +Cc: linux-scsi
Patrick Mansfield wrote:
> Consolidate scsi single_lun code.
>
> diff -purN -X /home/patman/dontdiff put_cmd-25/drivers/scsi/scsi_lib.c lun-single-25/drivers/scsi/scsi_lib.c
> --- put_cmd-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:14:51 2003
> +++ lun-single-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:14:55 2003
> @@ -323,6 +323,49 @@ void scsi_setup_cmd_retry(struct scsi_cm
> }
>
> /*
> + * Called for single_lun devices. The target associated with current_sdev can
> + * only handle one active command at a time. Scan through all of the luns
> + * on the same target as current_sdev, return 1 if any are active.
> + */
> +static int scsi_single_lun_check(struct scsi_device *current_sdev)
This is such a general name that it does not mean anything.
What kind of ``check'' is this?
Why not like this:
/**
* scsi_target_ok2queue: return non-zero if it is ok to queue
* to this target, else 0.
*/
<< but a nicely formatted comment>>
static int scsi_target_ok2queue(struct scsi_target *target)
{
if (!target->single_lu)
return 1;
for all devices of the target
if an LU is active
return 0;
return 1;
}
That is, 0 is returned when there is NO error code.
Since this function does NOT return error codes, but performs
a test of a LOGICAL nature, you'd return 1 and 0, so that this makes sense:
if (scsi_target_ok2queue(target)) {
/* we're ok to queue, go! */
} else {
/* no we're not ok to queue */
}
and the negation:
if (!scsi_target_ok2queue(target)) {
...
I'd rather have it named int scsi_target_canqueue(), but there
might be some confusion with the can_queue struct member.
> +{
> + struct scsi_device *sdev;
> +
> + list_for_each_entry(sdev, &current_sdev->same_target_siblings,
> + same_target_siblings)
> + if (sdev->device_busy)
> + return 1;
> + return 0;
> +}
> +
> +/*
> + * Called for single_lun devices on IO completion. If no requests
> + * outstanding for current_sdev, call __blk_run_queue for the next
> + * scsi_device on the same target that has requests.
> + *
> + * Called with queue_lock held.
> + */
> +static void scsi_single_lun_run(struct scsi_device *current_sdev,
> + struct request_queue *q)
This will be incorporated by default by the starved_devs list
priority queue.
I.e. you should try not to have a particular function for this, but
arrange the infrastructure in such a way that this is handled
by default (via the infrastructure itself).
> +{
> + struct scsi_device *sdev;
> + struct Scsi_Host *shost;
> +
> + shost = current_sdev->host;
> + if (blk_queue_empty(q) && current_sdev->device_busy == 0 &&
> + !shost->host_blocked && !shost->host_self_blocked &&
> + !((shost->can_queue > 0) &&
> + (shost->host_busy >= shost->can_queue)))
> + list_for_each_entry(sdev, &current_sdev->same_target_siblings,
> + same_target_siblings)
> + if (!sdev->device_blocked &&
> + !blk_queue_empty(sdev->request_queue)) {
> + __blk_run_queue(sdev->request_queue);
> + break;
> + }
> +}
> +
> +/*
> * Function: scsi_queue_next_request()
> *
> * Purpose: Handle post-processing of completed commands.
> @@ -390,29 +433,11 @@ void scsi_queue_next_request(request_que
> }
>
> sdev = q->queuedata;
> - shost = sdev->host;
>
> - /*
> - * If this is a single-lun device, and we are currently finished
> - * with this device, then see if we need to get another device
> - * started. FIXME(eric) - if this function gets too cluttered
> - * with special case code, then spin off separate versions and
> - * use function pointers to pick the right one.
> - */
> - if (sdev->single_lun && blk_queue_empty(q) && sdev->device_busy ==0 &&
> - !shost->host_blocked && !shost->host_self_blocked &&
> - !((shost->can_queue > 0) && (shost->host_busy >=
> - shost->can_queue))) {
> - list_for_each_entry(sdev2, &sdev->same_target_siblings,
> - same_target_siblings) {
> - if (!sdev2->device_blocked &&
> - !blk_queue_empty(sdev2->request_queue)) {
> - __blk_run_queue(sdev2->request_queue);
> - break;
> - }
> - }
> - }
> + if (sdev->single_lun)
> + scsi_single_lun_run(sdev, q);
>
> + shost = sdev->host;
> while (!list_empty(&shost->starved_list) &&
> !shost->host_blocked && !shost->host_self_blocked &&
> !((shost->can_queue > 0) &&
> @@ -917,22 +942,6 @@ static int scsi_init_io(struct scsi_cmnd
> return BLKPREP_KILL;
> }
>
> -/*
> - * The target associated with myself can only handle one active command at
> - * a time. Scan through all of the luns on the same target as myself,
> - * return 1 if any are active.
> - */
> -static int check_all_luns(struct scsi_device *myself)
> -{
> - struct scsi_device *sdev;
> -
> - list_for_each_entry(sdev, &myself->same_target_siblings,
> - same_target_siblings)
> - if (sdev->device_busy)
> - return 1;
> - return 0;
> -}
> -
> static int scsi_prep_fn(struct request_queue *q, struct request *req)
> {
> struct Scsi_Device_Template *sdt;
> @@ -1077,9 +1086,6 @@ static void scsi_request_fn(request_queu
> if (sdev->device_busy >= sdev->queue_depth)
> break;
>
> - if (sdev->single_lun && check_all_luns(sdev))
> - break;
> -
> if (shost->host_busy == 0 && shost->host_blocked) {
> /* unblock after host_blocked iterates to zero */
> if (--shost->host_blocked == 0) {
> @@ -1120,6 +1126,9 @@ static void scsi_request_fn(request_queu
> break;
> }
>
> + if (sdev->single_lun && scsi_single_lun_check(sdev))
> + break;
> +
> /*
> * If we couldn't find a request that could be queued, then we
> * can also quit.
> -
--
Luben Tuikov, Senior Software Engineer, Splentec Ltd.
Bus: +1-905-707-1954x112, 9-5 EDT. Fax: +1-905-707-1974.
* Re: [PATCH] 6/7 add and use a per-scsi_device queue_lock
2003-03-25 2:03 ` [PATCH] 6/7 add and use a per-scsi_device queue_lock Patrick Mansfield
2003-03-25 2:04 ` [PATCH] 7/7 fix single_lun code for " Patrick Mansfield
@ 2003-03-25 21:03 ` Luben Tuikov
2003-03-26 21:33 ` Patrick Mansfield
2003-03-25 21:20 ` James Bottomley
2 siblings, 1 reply; 34+ messages in thread
From: Luben Tuikov @ 2003-03-25 21:03 UTC (permalink / raw)
To: Patrick Mansfield; +Cc: linux-scsi
Patrick Mansfield wrote:
> Add and use a per scsi_device queue_lock.
>
> diff -purN -X /home/patman/dontdiff salloc_queue-25/drivers/scsi/hosts.c lksplit-25/drivers/scsi/hosts.c
> --- salloc_queue-25/drivers/scsi/hosts.c Mon Mar 24 12:14:28 2003
> +++ lksplit-25/drivers/scsi/hosts.c Mon Mar 24 12:15:10 2003
> @@ -620,7 +620,6 @@ void scsi_host_busy_dec_and_test(struct
>
> spin_lock_irqsave(shost->host_lock, flags);
> shost->host_busy--;
> - sdev->device_busy--;
> if (shost->in_recovery && shost->host_failed &&
> (shost->host_busy == shost->host_failed))
> {
> diff -purN -X /home/patman/dontdiff salloc_queue-25/drivers/scsi/scsi.c lksplit-25/drivers/scsi/scsi.c
> --- salloc_queue-25/drivers/scsi/scsi.c Mon Mar 24 11:57:13 2003
> +++ lksplit-25/drivers/scsi/scsi.c Mon Mar 24 12:15:10 2003
> @@ -447,8 +447,6 @@ int scsi_dispatch_cmd(Scsi_Cmnd * SCpnt)
>
> host = SCpnt->device->host;
>
> - ASSERT_LOCK(host->host_lock, 0);
> -
> /* Assign a unique nonzero serial_number. */
> if (++serial_number == 0)
> serial_number = 1;
> @@ -574,8 +572,6 @@ void scsi_init_cmd_from_req(Scsi_Cmnd *
> {
> struct Scsi_Host *host = SCpnt->device->host;
>
> - ASSERT_LOCK(host->host_lock, 0);
> -
> SCpnt->owner = SCSI_OWNER_MIDLEVEL;
> SRpnt->sr_command = SCpnt;
>
> @@ -819,12 +815,11 @@ void scsi_finish_command(Scsi_Cmnd * SCp
> struct Scsi_Host *host;
> Scsi_Device *device;
> Scsi_Request * SRpnt;
> + unsigned int flags;
>
> host = SCpnt->device->host;
> device = SCpnt->device;
>
> - ASSERT_LOCK(host->host_lock, 0);
> -
> /*
> * We need to protect the decrement, as otherwise a race condition
> * would exist. Fiddling with SCpnt isn't a problem as the
> @@ -833,6 +828,9 @@ void scsi_finish_command(Scsi_Cmnd * SCp
> * shared.
> */
> scsi_host_busy_dec_and_test(host, device);
> + spin_lock_irqsave(SCpnt->device->request_queue->queue_lock, flags);
> + SCpnt->device->device_busy--;
> + spin_unlock_irqrestore(SCpnt->device->request_queue->queue_lock, flags);
Such acrobatics warrant an atomic variable. I.e. make device_busy an
atomic_t and avoid all this juggling:
atomic_dec(&SCpnt->device->device_busy);
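In user space the suggestion looks roughly like this, using C11 stdatomic as a stand-in for the kernel's atomic_t (the struct and function names here are illustrative, not from the patch):

```c
#include <stdatomic.h>

struct sdev_sketch {
	atomic_int device_busy;	/* commands active on the low-level driver */
};

static void cmd_dispatched(struct sdev_sketch *sdev)
{
	atomic_fetch_add(&sdev->device_busy, 1);
}

/* Replaces the spin_lock_irqsave() / device_busy-- /
 * spin_unlock_irqrestore() sequence with one atomic operation. */
static void cmd_finished(struct sdev_sketch *sdev)
{
	atomic_fetch_sub(&sdev->device_busy, 1);
}
```

One caveat: an atomic counter only removes the lock around the bare increment/decrement; compound tests like `device_busy >= queue_depth` followed by an increment still need a lock (or a compare-and-swap loop) to be race-free.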
> /*
> * Clear the flags which say that the device/host is no longer
> diff -purN -X /home/patman/dontdiff salloc_queue-25/drivers/scsi/scsi.h lksplit-25/drivers/scsi/scsi.h
> --- salloc_queue-25/drivers/scsi/scsi.h Mon Mar 24 12:14:51 2003
> +++ lksplit-25/drivers/scsi/scsi.h Mon Mar 24 12:15:10 2003
> @@ -418,7 +418,7 @@ extern void scsi_io_completion(Scsi_Cmnd
> int block_sectors);
> extern int scsi_queue_insert(struct scsi_cmnd *cmd, int reason);
> extern void scsi_queue_next_request(request_queue_t *q, struct scsi_cmnd *cmd);
> -extern request_queue_t *scsi_alloc_queue(struct Scsi_Host *shost);
> +extern request_queue_t *scsi_alloc_queue(struct scsi_device *sdev);
> extern void scsi_free_queue(request_queue_t *q);
> extern int scsi_init_queue(void);
> extern void scsi_exit_queue(void);
> @@ -554,6 +554,7 @@ struct scsi_device {
> struct Scsi_Host *host;
> request_queue_t *request_queue;
> volatile unsigned short device_busy; /* commands actually active on low-level */
> + spinlock_t sdev_lock; /* also the request queue_lock */
> spinlock_t list_lock;
> struct list_head cmd_list; /* queue of in use SCSI Command structures */
> struct list_head starved_entry;
> diff -purN -X /home/patman/dontdiff salloc_queue-25/drivers/scsi/scsi_error.c lksplit-25/drivers/scsi/scsi_error.c
> --- salloc_queue-25/drivers/scsi/scsi_error.c Mon Mar 24 12:14:51 2003
> +++ lksplit-25/drivers/scsi/scsi_error.c Mon Mar 24 12:15:10 2003
> @@ -431,8 +431,6 @@ static int scsi_send_eh_cmnd(struct scsi
> unsigned long flags;
> int rtn = SUCCESS;
>
> - ASSERT_LOCK(host->host_lock, 0);
> -
> /*
> * we will use a queued command if possible, otherwise we will
> * emulate the queuing and calling of completion function ourselves.
> @@ -1405,8 +1403,6 @@ static void scsi_restart_operations(stru
> struct scsi_device *sdev;
> unsigned long flags;
>
> - ASSERT_LOCK(shost->host_lock, 0);
> -
> /*
> * If the door was locked, we need to insert a door lock request
> * onto the head of the SCSI request queue for the device. There
> @@ -1434,18 +1430,11 @@ static void scsi_restart_operations(stru
> * now that error recovery is done, we will need to ensure that these
> * requests are started.
> */
> - spin_lock_irqsave(shost->host_lock, flags);
> list_for_each_entry(sdev, &shost->my_devices, siblings) {
> - if ((shost->can_queue > 0 &&
> - (shost->host_busy >= shost->can_queue))
> - || (shost->host_blocked)
> - || (shost->host_self_blocked)) {
> - break;
> - }
> -
> + spin_lock_irqsave(sdev->request_queue->queue_lock, flags);
> __blk_run_queue(sdev->request_queue);
> + spin_unlock_irqrestore(sdev->request_queue->queue_lock, flags);
> }
> - spin_unlock_irqrestore(shost->host_lock, flags);
> }
>
> /**
> diff -purN -X /home/patman/dontdiff salloc_queue-25/drivers/scsi/scsi_lib.c lksplit-25/drivers/scsi/scsi_lib.c
> --- salloc_queue-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:15:01 2003
> +++ lksplit-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:15:10 2003
> @@ -92,6 +92,7 @@ int scsi_queue_insert(struct scsi_cmnd *
> {
> struct Scsi_Host *host = cmd->device->host;
> struct scsi_device *device = cmd->device;
> + unsigned long flags;
>
> SCSI_LOG_MLQUEUE(1,
> printk("Inserting command %p into mlqueue\n", cmd));
> @@ -130,6 +131,9 @@ int scsi_queue_insert(struct scsi_cmnd *
> * Decrement the counters, since these commands are no longer
> * active on the host/device.
> */
> + spin_lock_irqsave(device->request_queue->queue_lock, flags);
> + device->device_busy--;
> + spin_unlock_irqrestore(device->request_queue->queue_lock, flags);
> scsi_host_busy_dec_and_test(host, device);
Same thing here.
Discourse: I'd normally have this:
struct scsi_device {
...
spinlock_t pending_q_lock;
atomic_t pending_q_count; /* old: device_busy */
list_head pending_q;
...
}
This gives you a lot more _infrastructure_ (to work with)
when you need it, i.e. in situations such as the one at hand.
Where pending queue stores all commands which have been
submitted to the LLDD via queuecommand() and are pending
execution status.
Please note: references are _always_ from the Initiator's point
of view -- in standards, specs, etc.
For this reason it's called ``pending_q''.
So from the request_q, requests come to the pending_q (as scsi commands
of course, so that it is MORE general -- i.e. it also covers commands
that didn't come via a request).
Then they go to the done_q, and end up back at the request_q (logically).
Or from SCSI's point of view: allocation queues, (incoming_q), pending_q,
done_q, back to allocation queues -- and you get closed loop!
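The closed loop described above can be sketched as a simple state machine (the states and names are illustrative; the real implementation would move commands between actual list_heads):

```c
/* free -> request_q -> pending_q -> done_q -> free: a closed loop */
enum cmd_state {
	CMD_FREE,	/* on an allocation queue */
	CMD_REQUEST_Q,	/* queued as a request (incoming_q) */
	CMD_PENDING_Q,	/* handed to the LLDD via queuecommand() */
	CMD_DONE_Q	/* completed, awaiting post-processing */
};

/* Advance a command one step around the loop. */
static enum cmd_state cmd_advance(enum cmd_state s)
{
	switch (s) {
	case CMD_FREE:      return CMD_REQUEST_Q; /* request allocated  */
	case CMD_REQUEST_Q: return CMD_PENDING_Q; /* dispatched to LLDD */
	case CMD_PENDING_Q: return CMD_DONE_Q;    /* completion arrived */
	case CMD_DONE_Q:    return CMD_FREE;      /* command released   */
	}
	return CMD_FREE;
}

/* Only commands in pending_q count toward pending_q_count. */
static int cmd_is_pending(enum cmd_state s)
{
	return s == CMD_PENDING_Q;
}
```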
> /*
> @@ -343,7 +347,7 @@ static int scsi_single_lun_check(struct
> * outstanding for current_sdev, call __blk_run_queue for the next
> * scsi_device on the same target that has requests.
> *
> - * Called with queue_lock held.
> + * Called with *no* scsi locks held.
> */
> static void scsi_single_lun_run(struct scsi_device *current_sdev,
> struct request_queue *q)
> @@ -407,9 +411,6 @@ void scsi_queue_next_request(request_que
> struct Scsi_Host *shost;
> unsigned long flags;
>
> - ASSERT_LOCK(q->queue_lock, 0);
> -
> - spin_lock_irqsave(q->queue_lock, flags);
> if (cmd != NULL) {
>
> /*
> @@ -418,6 +419,7 @@ void scsi_queue_next_request(request_que
> * in which case we need to request the blocks that come after
> * the bad sector.
> */
> + spin_lock_irqsave(q->queue_lock, flags);
> cmd->request->special = cmd;
> if (blk_rq_tagged(cmd->request))
> blk_queue_end_tag(q, cmd->request);
> @@ -430,6 +432,7 @@ void scsi_queue_next_request(request_que
> cmd->request->flags |= REQ_SPECIAL;
> cmd->request->flags &= ~REQ_DONTPREP;
> __elv_add_request(q, cmd->request, 0, 0);
> + spin_unlock_irqrestore(q->queue_lock, flags);
> }
>
> sdev = q->queuedata;
> @@ -438,6 +441,7 @@ void scsi_queue_next_request(request_que
> scsi_single_lun_run(sdev, q);
>
> shost = sdev->host;
> + spin_lock_irqsave(shost->host_lock, flags);
> while (!list_empty(&shost->starved_list) &&
> !shost->host_blocked && !shost->host_self_blocked &&
> !((shost->can_queue > 0) &&
> @@ -447,15 +451,26 @@ void scsi_queue_next_request(request_que
> * starved queues, call __blk_run_queue. scsi_request_fn
> * drops the queue_lock and can add us back to the
> * starved_list.
> + *
> + * host_lock protects the starved_list and starved_entry.
> + * scsi_request_fn must get the host_lock before checking
> + * or modifying starved_list or starved_entry.
> */
> sdev2 = list_entry(shost->starved_list.next,
> struct scsi_device, starved_entry);
> list_del_init(&sdev2->starved_entry);
> + spin_unlock_irqrestore(shost->host_lock, flags);
> +
> + spin_lock_irqsave(sdev2->request_queue->queue_lock, flags);
> __blk_run_queue(sdev2->request_queue);
> + spin_unlock_irqrestore(sdev2->request_queue->queue_lock, flags);
> +
> + spin_lock_irqsave(shost->host_lock, flags);
> }
> + spin_unlock_irqrestore(shost->host_lock, flags);
>
> + spin_lock_irqsave(q->queue_lock, flags);
> __blk_run_queue(q);
> -
> spin_unlock_irqrestore(q->queue_lock, flags);
> }
>
> @@ -489,8 +504,6 @@ static struct scsi_cmnd *scsi_end_reques
> struct request *req = cmd->request;
> unsigned long flags;
>
> - ASSERT_LOCK(q->queue_lock, 0);
> -
> /*
> * If there are blocks left over at the end, set up the command
> * to queue the remainder of them.
> @@ -588,8 +601,6 @@ static void scsi_release_buffers(struct
> {
> struct request *req = cmd->request;
>
> - ASSERT_LOCK(cmd->device->host->host_lock, 0);
> -
> /*
> * Free up any indirection buffers we allocated for DMA purposes.
> */
> @@ -670,8 +681,6 @@ void scsi_io_completion(struct scsi_cmnd
> * would be used if we just wanted to retry, for example.
> *
> */
> - ASSERT_LOCK(q->queue_lock, 0);
> -
> /*
> * Free up any indirection buffers we allocated for DMA purposes.
> * For the case of a READ, we need to copy the data out of the
> @@ -1075,7 +1084,7 @@ static inline int scsi_check_sdev(struct
> /*
> * scsi_check_shost: if we can send requests to shost, return 0 else return 1.
> *
> - * Called with the queue_lock held.
> + * Called with queue_lock and host_lock held.
> */
> static inline int scsi_check_shost(struct request_queue *q,
> struct Scsi_Host *shost,
> @@ -1132,8 +1141,7 @@ static void scsi_request_fn(request_queu
> struct Scsi_Host *shost = sdev->host;
> struct scsi_cmnd *cmd;
> struct request *req;
> -
> - ASSERT_LOCK(q->queue_lock, 1);
> + unsigned int flags;
>
> /*
> * To start with, we keep looping until the queue is empty, or until
> @@ -1141,7 +1149,7 @@ static void scsi_request_fn(request_queu
> */
> for (;;) {
> if (blk_queue_plugged(q))
> - break;
> + goto completed;
>
> /*
> * get next queueable request. We do this early to make sure
> @@ -1152,28 +1160,29 @@ static void scsi_request_fn(request_queu
> req = elv_next_request(q);
>
> if (scsi_check_sdev(q, sdev))
> - break;
> + goto completed;
>
> + spin_lock_irqsave(shost->host_lock, flags);
> if (scsi_check_shost(q, shost, sdev))
> - break;
> + goto after_host_lock;
>
> if (sdev->single_lun && scsi_single_lun_check(sdev))
> - break;
> + goto after_host_lock;
>
> /*
> * If we couldn't find a request that could be queued, then we
> * can also quit.
> */
> if (blk_queue_empty(q))
> - break;
> + goto after_host_lock;
>
> if (!req) {
> /* If the device is busy, a returning I/O
> * will restart the queue. Otherwise, we have
> * to plug the queue */
> - if (sdev->device_busy == 0)
> + if (sdev->device_busy == 1)
> blk_plug_device(q);
> - break;
> + goto after_host_lock;
> }
>
> cmd = req->special;
> @@ -1195,11 +1204,9 @@ static void scsi_request_fn(request_queu
> if (!(blk_queue_tagged(q) && (blk_queue_start_tag(q, req) == 0)))
> blkdev_dequeue_request(req);
>
> - /*
> - * Now bump the usage count for both the host and the
> - * device.
> - */
> shost->host_busy++;
> + spin_unlock_irqrestore(shost->host_lock, flags);
> +
> sdev->device_busy++;
> spin_unlock_irq(q->queue_lock);
>
> @@ -1220,6 +1227,11 @@ static void scsi_request_fn(request_queu
> */
> spin_lock_irq(q->queue_lock);
> }
> +completed:
> + return;
> +
> +after_host_lock:
> + spin_unlock_irqrestore(shost->host_lock, flags);
> }
>
> u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
> @@ -1241,15 +1253,20 @@ u64 scsi_calculate_bounce_limit(struct S
> return BLK_BOUNCE_HIGH;
> }
>
> -request_queue_t *scsi_alloc_queue(struct Scsi_Host *shost)
> +request_queue_t *scsi_alloc_queue(struct scsi_device *sdev)
> {
> request_queue_t *q;
> + struct Scsi_Host *shost;
>
> q = kmalloc(sizeof(*q), GFP_ATOMIC);
> if (!q)
> return NULL;
> memset(q, 0, sizeof(*q));
>
> + /*
> + * XXX move host code to scsi_register
> + */
> + shost = sdev->host;
> if (!shost->max_sectors) {
> /*
> * Driver imposes no hard sector transfer limit.
> @@ -1258,7 +1275,7 @@ request_queue_t *scsi_alloc_queue(struct
> shost->max_sectors = SCSI_DEFAULT_MAX_SECTORS;
> }
>
> - blk_init_queue(q, scsi_request_fn, shost->host_lock);
> + blk_init_queue(q, scsi_request_fn, &sdev->sdev_lock);
> blk_queue_prep_rq(q, scsi_prep_fn);
>
> blk_queue_max_hw_segments(q, shost->sg_tablesize);
> diff -purN -X /home/patman/dontdiff salloc_queue-25/drivers/scsi/scsi_scan.c lksplit-25/drivers/scsi/scsi_scan.c
> --- salloc_queue-25/drivers/scsi/scsi_scan.c Mon Mar 24 12:15:05 2003
> +++ lksplit-25/drivers/scsi/scsi_scan.c Mon Mar 24 12:15:10 2003
> @@ -415,7 +415,8 @@ static struct scsi_device *scsi_alloc_sd
> */
> sdev->borken = 1;
>
> - sdev->request_queue = scsi_alloc_queue(shost);
> + spin_lock_init(&sdev->sdev_lock);
> + sdev->request_queue = scsi_alloc_queue(sdev);
> if (!sdev->request_queue)
> goto out_free_dev;
>
--
Luben Tuikov, Senior Software Engineer, Splentec Ltd.
Bus: +1-905-707-1954x112, 9-5 EDT. Fax: +1-905-707-1974.
* Re: [PATCH] 6/7 add and use a per-scsi_device queue_lock
2003-03-25 2:03 ` [PATCH] 6/7 add and use a per-scsi_device queue_lock Patrick Mansfield
2003-03-25 2:04 ` [PATCH] 7/7 fix single_lun code for " Patrick Mansfield
2003-03-25 21:03 ` [PATCH] 6/7 add and use a " Luben Tuikov
@ 2003-03-25 21:20 ` James Bottomley
2003-03-26 2:01 ` Patrick Mansfield
2 siblings, 1 reply; 34+ messages in thread
From: James Bottomley @ 2003-03-25 21:20 UTC (permalink / raw)
To: Patrick Mansfield; +Cc: SCSI Mailing List
On Mon, 2003-03-24 at 20:03, Patrick Mansfield wrote:
> Add and use a per scsi_device queue_lock.
In doing this, you're introducing a locking hierarchy, which would need
to be documented. If you need both the host and queue locks, which do
you take first? The code implies that it's queue_lock and then
host_lock, which is a bit counter-intuitive and leads to awful problems
for your starved_list: the host_lock is actually protecting the starved
list, so the way you currently have it, you have to drop the host lock
to acquire the queue_lock to run the block queue. Unfortunately, doing
this could lead to the device being re-added while running the queue.
Thus, the queue_next_request can spin removing and adding the device to
the starved list---probably working from a copy of the starved list will
help with this.
I'd also like to see avoidance of the locking hierarchy where possible.
i.e. in the scsi_request_fn for instance, can we move the actions that
need the host_lock outside the queue_lock?
James
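One way to document (and mechanically check) such a hierarchy is to give each lock a rank and assert that locks are only ever acquired in increasing rank order; below is a tiny user-space sketch of that idea (the ranks and names are hypothetical, and this is not kernel lockdep):

```c
enum { LOCK_QUEUE = 1, LOCK_HOST = 2 };	/* queue_lock before host_lock */

static int held[8];	/* stack of ranks currently held */
static int depth;

/* Returns 0 on success, -1 if taking this lock would violate the
 * documented ordering (and thus risk an AB-BA deadlock). */
static int acquire(int rank)
{
	if (depth > 0 && rank <= held[depth - 1])
		return -1;
	held[depth++] = rank;
	return 0;
}

static void release(void)
{
	if (depth > 0)
		depth--;
}
```

Under this rule, code already holding host_lock that wants a queue_lock (as in the starved_list walk) must drop host_lock first, which is exactly the dance the patch performs.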
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH] 7/7 fix single_lun code for per-scsi_device queue_lock
2003-03-25 2:04 ` [PATCH] 7/7 fix single_lun code for " Patrick Mansfield
@ 2003-03-25 21:23 ` Luben Tuikov
2003-03-26 21:47 ` Patrick Mansfield
0 siblings, 1 reply; 34+ messages in thread
From: Luben Tuikov @ 2003-03-25 21:23 UTC (permalink / raw)
To: Patrick Mansfield; +Cc: linux-scsi
Patrick Mansfield wrote:
> Fix single_lun code for per-scsi_device queue_lock
>
> diff -purN -X /home/patman/dontdiff lksplit-25/drivers/scsi/scsi.h sl_lksplit-25/drivers/scsi/scsi.h
> --- lksplit-25/drivers/scsi/scsi.h Mon Mar 24 12:15:10 2003
> +++ sl_lksplit-25/drivers/scsi/scsi.h Mon Mar 24 12:15:26 2003
> @@ -531,6 +531,15 @@ extern struct list_head scsi_dev_info_li
> extern int scsi_dev_info_list_add_str(char *);
>
> /*
> + * scsi_target: representation of a scsi target, for now, this is only
> + * used for single_lun devices.
> + */
> +struct scsi_target {
> + unsigned int starget_busy;
> + unsigned int starget_refcnt;
> +};
Yes ok, but let's do it right.
First, I abhor all this letter playing to devise
names like: starget_busy and starget_refcnt.
starget_busy could be an atomic_t which is equal
to the sum of all LUs' pending_q_counts (i.e. it
had better be, else we're losing commands), and it
should also have a different, more descriptive name.
Something like active_cmds, or active_cmd_count,
but NOT ``..._cnt''!
Also, move all the necessary variables into scsi_target;
say, single_lun, renamed single_lu.
Also having a list of devices:
struct list_head lu_list;
etc, etc.
> +
> +/*
> * The scsi_device struct contains what we know about each given scsi
> * device.
> *
> @@ -588,6 +597,7 @@ struct scsi_device {
> unsigned char current_tag; /* current tag */
> // unsigned char sync_min_period; /* Not less than this period */
> // unsigned char sync_max_offset; /* Not greater than this offset */
> + struct scsi_target *sdev_target; /* used only for single_lun */
Should probably be used for all devices. Let's do it right.
> unsigned online:1;
> unsigned writeable:1;
> diff -purN -X /home/patman/dontdiff lksplit-25/drivers/scsi/scsi_lib.c sl_lksplit-25/drivers/scsi/scsi_lib.c
> --- lksplit-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:15:10 2003
> +++ sl_lksplit-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:15:26 2003
> @@ -327,46 +327,36 @@ void scsi_setup_cmd_retry(struct scsi_cm
> }
>
> /*
> - * Called for single_lun devices. The target associated with current_sdev can
> - * only handle one active command at a time. Scan through all of the luns
> - * on the same target as current_sdev, return 1 if any are active.
> - */
> -static int scsi_single_lun_check(struct scsi_device *current_sdev)
> -{
> - struct scsi_device *sdev;
> -
> - list_for_each_entry(sdev, &current_sdev->same_target_siblings,
> - same_target_siblings)
> - if (sdev->device_busy)
> - return 1;
> - return 0;
> -}
> -
> -/*
> - * Called for single_lun devices on IO completion. If no requests
> - * outstanding for current_sdev, call __blk_run_queue for the next
> - * scsi_device on the same target that has requests.
> + * Called for single_lun devices on IO completion. Clear starget_busy, and
> + * Call __blk_run_queue for all the scsi_devices on the target - including
> + * current_sdev first.
> *
> * Called with *no* scsi locks held.
> */
> -static void scsi_single_lun_run(struct scsi_device *current_sdev,
> - struct request_queue *q)
> +static void scsi_single_lun_run(struct scsi_device *current_sdev)
> {
> struct scsi_device *sdev;
> - struct Scsi_Host *shost;
> + unsigned int flags, flags2;
>
> - shost = current_sdev->host;
> - if (blk_queue_empty(q) && current_sdev->device_busy == 0 &&
> - !shost->host_blocked && !shost->host_self_blocked &&
> - !((shost->can_queue > 0) &&
> - (shost->host_busy >= shost->can_queue)))
> - list_for_each_entry(sdev, &current_sdev->same_target_siblings,
> - same_target_siblings)
> - if (!sdev->device_blocked &&
> - !blk_queue_empty(sdev->request_queue)) {
> - __blk_run_queue(sdev->request_queue);
> - break;
> - }
> + spin_lock_irqsave(current_sdev->request_queue->queue_lock, flags2);
> + spin_lock_irqsave(current_sdev->host->host_lock, flags);
No, you do NOT need flags and flags2. Look at other kernel code
to see how this is done. scsi_put_command() gives
an example of doing double locks and irq state -- it's a nice one.
So, things to remember:
1. spin_lock_irqsave(...., flags);
spin_lock(...);
critical section;
spin_unlock(...);
spin_unlock_irqrestore(..., flags);
2. If you're obtaining locks sequentially like this,
e.g.:
grab L1
grab L2
critical section
release L2
release L1
you MUST absolutely mention this somewhere, in a boldface
red-letters comment, that the queue lock is always obtained FIRST
and the host lock SECOND (and, trivially, released in the opposite
order).
Else someone trying to obtain both locks in opposite order, L2 and
then L1, will lock up the system.
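[The ordering rule can be illustrated with ordinary pthread mutexes;
queue_lock stands in for L1 and host_lock for L2. This is a user-space
sketch with illustrative names, not the kernel spinlock code:]

```c
#include <pthread.h>

/* Lock hierarchy: queue_lock is ALWAYS taken before host_lock. */
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t host_lock  = PTHREAD_MUTEX_INITIALIZER;
static int host_busy;

/* Take L1 (queue) then L2 (host), release in the opposite order. */
static void queue_io(void)
{
	pthread_mutex_lock(&queue_lock);
	pthread_mutex_lock(&host_lock);
	host_busy++;			/* critical section under both locks */
	pthread_mutex_unlock(&host_lock);
	pthread_mutex_unlock(&queue_lock);
}

/* Completion path honors the same order, so no ABBA deadlock. */
static void complete_io(void)
{
	pthread_mutex_lock(&queue_lock);
	pthread_mutex_lock(&host_lock);
	host_busy--;
	pthread_mutex_unlock(&host_lock);
	pthread_mutex_unlock(&queue_lock);
}
```

Any path that took host_lock first and then queue_lock would deadlock
against these two.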
> + WARN_ON(!current_sdev->sdev_target->starget_busy);
> + if (current_sdev->device_busy == 0)
> + current_sdev->sdev_target->starget_busy = 0;
> + spin_unlock_irqrestore(current_sdev->host->host_lock, flags);
Do not need flags here as mentioned above.
> +
> + /*
> + * Call __blk_run_queue for all LUNs on the target, starting with
> + * current_sdev.
> + */
> + __blk_run_queue(current_sdev->request_queue);
> + spin_unlock_irqrestore(current_sdev->request_queue->queue_lock, flags2);
Now if you make device_busy an atomic_t type, then you do NOT
need these double-locking acrobatics.
> + list_for_each_entry(sdev, &current_sdev->same_target_siblings,
> + same_target_siblings) {
> + spin_lock_irqsave(sdev->request_queue->queue_lock, flags2);
> + __blk_run_queue(sdev->request_queue);
> + spin_unlock_irqrestore(sdev->request_queue->queue_lock, flags2);
> + }
> }
>
> /*
> @@ -438,7 +428,7 @@ void scsi_queue_next_request(request_que
> sdev = q->queuedata;
>
> if (sdev->single_lun)
> - scsi_single_lun_run(sdev, q);
> + scsi_single_lun_run(sdev);
>
> shost = sdev->host;
> spin_lock_irqsave(shost->host_lock, flags);
> @@ -1166,7 +1156,8 @@ static void scsi_request_fn(request_queu
> if (scsi_check_shost(q, shost, sdev))
> goto after_host_lock;
>
> - if (sdev->single_lun && scsi_single_lun_check(sdev))
> + if (sdev->single_lun && !sdev->device_busy &&
> + sdev->sdev_target->starget_busy)
> goto after_host_lock;
>
> /*
> @@ -1204,6 +1195,9 @@ static void scsi_request_fn(request_queu
> if (!(blk_queue_tagged(q) && (blk_queue_start_tag(q, req) == 0)))
> blkdev_dequeue_request(req);
>
> + if (sdev->single_lun)
> + sdev->sdev_target->starget_busy = 1;
> +
> shost->host_busy++;
> spin_unlock_irqrestore(shost->host_lock, flags);
>
> diff -purN -X /home/patman/dontdiff lksplit-25/drivers/scsi/scsi_scan.c sl_lksplit-25/drivers/scsi/scsi_scan.c
> --- lksplit-25/drivers/scsi/scsi_scan.c Mon Mar 24 12:15:10 2003
> +++ sl_lksplit-25/drivers/scsi/scsi_scan.c Mon Mar 24 12:15:26 2003
> @@ -478,6 +478,8 @@ out:
> **/
> static void scsi_free_sdev(struct scsi_device *sdev)
> {
> + unsigned int flags;
> +
> list_del(&sdev->siblings);
> list_del(&sdev->same_target_siblings);
>
> @@ -487,6 +489,14 @@ static void scsi_free_sdev(struct scsi_d
> sdev->host->hostt->slave_destroy(sdev);
> if (sdev->inquiry)
> kfree(sdev->inquiry);
> + if (sdev->single_lun) {
> + spin_lock_irqsave(sdev->host->host_lock, flags);
> + sdev->sdev_target->starget_refcnt--;
> + if (sdev->sdev_target->starget_refcnt == 0)
> + kfree(sdev->sdev_target);
Have you seen ``atomic_dec_and_test()''?
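[`atomic_dec_and_test()` returns true when the decrement reaches zero,
letting the last user free the object without a lock. A user-space
sketch of the refcount drop with C11 atomics; the names are hypothetical
stand-ins for the scsi_target fields:]

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for struct scsi_target with an atomic reference count. */
struct starget {
	atomic_int refcnt;
};

static struct starget *starget_get_new(void)
{
	struct starget *t = malloc(sizeof(*t));

	atomic_init(&t->refcnt, 1);
	return t;
}

/* Drop one reference; free the target when the last one goes away. */
static bool starget_put(struct starget *t)
{
	/*
	 * atomic_fetch_sub returns the *old* value, so old == 1 means we
	 * just dropped the final reference; the kernel's
	 * atomic_dec_and_test() reports the same condition.
	 */
	if (atomic_fetch_sub(&t->refcnt, 1) == 1) {
		free(t);
		return true;
	}
	return false;
}
```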
> + spin_unlock_irqrestore(sdev->host->host_lock, flags);
> + }
> +
> kfree(sdev);
> }
>
> @@ -1122,6 +1132,10 @@ static void scsi_probe_lun(Scsi_Request
> static int scsi_add_lun(Scsi_Device *sdev, Scsi_Request *sreq,
> char *inq_result, int *bflags)
> {
> + struct scsi_device *sdev_sibling;
> + struct scsi_target *starget;
> + unsigned int flags;
> +
> /*
> * XXX do not save the inquiry, since it can change underneath us,
> * save just vendor/model/rev.
> @@ -1243,10 +1257,38 @@ static int scsi_add_lun(Scsi_Device *sde
>
> /*
> * If we need to allow I/O to only one of the luns attached to
> - * this target id at a time, then we set this flag.
> + * this target id at a time set single_lun, and allocate or modify
> + * sdev_target.
> */
> - if (*bflags & BLIST_SINGLELUN)
> + if (*bflags & BLIST_SINGLELUN) {
> sdev->single_lun = 1;
> + spin_lock_irqsave(sdev->host->host_lock, flags);
> + starget = NULL;
> + /*
> + * Search for an existing target for this sdev.
> + */
> + list_for_each_entry(sdev_sibling, &sdev->same_target_siblings,
> + same_target_siblings) {
> + if (sdev_sibling->sdev_target != NULL) {
> + starget = sdev_sibling->sdev_target;
> + break;
> + }
> + }
> + if (!starget) {
> + starget = kmalloc(sizeof(*starget), GFP_KERNEL);
> + if (!starget) {
> + printk(ALLOC_FAILURE_MSG, __FUNCTION__);
> + spin_unlock_irqrestore(sdev->host->host_lock,
> + flags);
> + return SCSI_SCAN_NO_RESPONSE;
> + }
> + starget->starget_refcnt = 0;
> + starget->starget_busy = 0;
> + }
> + starget->starget_refcnt++;
> + sdev->sdev_target = starget;
> + spin_unlock_irqrestore(sdev->host->host_lock, flags);
> + }
>
> /* if the device needs this changing, it may do so in the detect
> * function */
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Luben Tuikov, Senior Software Engineer, Splentec Ltd.
Bus: +1-905-707-1954x112, 9-5 EDT. Fax: +1-905-707-1974.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH] 4/7 cleanup/consolidate code in scsi_request_fn
2003-03-25 2:03 ` [PATCH] 4/7 cleanup/consolidate code in scsi_request_fn Patrick Mansfield
2003-03-25 2:03 ` [PATCH] 5/7 alloc a request_queue on each scsi_alloc_sdev call Patrick Mansfield
@ 2003-03-25 21:32 ` Luben Tuikov
2003-03-26 0:58 ` Patrick Mansfield
1 sibling, 1 reply; 34+ messages in thread
From: Luben Tuikov @ 2003-03-25 21:32 UTC (permalink / raw)
To: Patrick Mansfield; +Cc: linux-scsi
Patrick Mansfield wrote:
> Cleanup and consolidate scsi_device and scsi_host checks in
> scsi_request_fn.
>
> diff -purN -X /home/patman/dontdiff lun-single-25/drivers/scsi/scsi_lib.c req_fn-cleanup-25/drivers/scsi/scsi_lib.c
> --- lun-single-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:14:55 2003
> +++ req_fn-cleanup-25/drivers/scsi/scsi_lib.c Mon Mar 24 12:15:01 2003
> @@ -1043,6 +1043,79 @@ static int scsi_prep_fn(struct request_q
> }
>
> /*
> + * scsi_check_sdev: if we can send requests to sdev, return 0 else return 1.
> + *
> + * Called with the queue_lock held.
> + */
> +static inline int scsi_check_sdev(struct request_queue *q,
> + struct scsi_device *sdev)
scsi_check_sdev: clearly every function in C does a check, or a computation,
or a modification, or some permutation of those. So this name is too generic
and doesn't say what the function does.
Furthermore, since the function's outcome is logical in nature (i.e. it does
NOT return an error code), you can return 1 on success and 0 on failure.
How about this:
/**
* scsi_dev_ok2queue: Return non-zero if we can queue to the
* device, else 0.
*/
static inline int scsi_dev_ok2queue(struct scsi_device *sdev)
{
...
}
So that statements like this are logical:
if (scsi_dev_ok2queue(dev)) {
/* let's queue */
} else {
/* no, will have to defer */
}
and the negation:
if (!scsi_dev_ok2queue(dev)) {
...
}
> +{
> + if (sdev->device_busy >= sdev->queue_depth)
> + return 1;
> + if (sdev->device_busy == 0 && sdev->device_blocked) {
> + /*
> + * unblock after device_blocked iterates to zero
> + */
> + if (--sdev->device_blocked == 0) {
> + SCSI_LOG_MLQUEUE(3,
> + printk("scsi%d (%d:%d) unblocking device at"
> + " zero depth\n", sdev->host->host_no,
> + sdev->id, sdev->lun));
> + } else {
> + blk_plug_device(q);
> + return 1;
> + }
> + }
> + if (sdev->device_blocked)
> + return 1;
> +
> + return 0;
> +}
> +
> +/*
> + * scsi_check_shost: if we can send requests to shost, return 0 else return 1.
> + *
> + * Called with the queue_lock held.
> + */
> +static inline int scsi_check_shost(struct request_queue *q,
> + struct Scsi_Host *shost,
> + struct scsi_device *sdev)
Absolutely the same story here, as above.
scsi_host_ok2queue() -- please do not use ``shost'' in function names,
``host'' I think is descriptive enough.
> +{
> + if (shost->in_recovery)
> + return 1;
> + if (shost->host_busy == 0 && shost->host_blocked) {
> + /*
> + * unblock after host_blocked iterates to zero
> + */
> + if (--shost->host_blocked == 0) {
> + SCSI_LOG_MLQUEUE(3,
> + printk("scsi%d unblocking host at zero depth\n",
> + shost->host_no));
> + } else {
> + blk_plug_device(q);
> + return 1;
> + }
> + }
> + if (!list_empty(&sdev->starved_entry))
> + return 1;
> + if ((shost->can_queue > 0 && shost->host_busy >= shost->can_queue) ||
> + shost->host_blocked || shost->host_self_blocked) {
> + SCSI_LOG_MLQUEUE(3,
> + printk("add starved dev <%d,%d,%d,%d>; host "
> + "limit %d, busy %d, blocked %d selfblocked %d\n",
> + sdev->host->host_no, sdev->channel,
> + sdev->id, sdev->lun,
> + shost->can_queue, shost->host_busy,
> + shost->host_blocked, shost->host_self_blocked));
> + list_add_tail(&sdev->starved_entry,
> + &shost->starved_list);
> + return 1;
> + }
> +
> + return 0;
> +}
> +
> +/*
> * Function: scsi_request_fn()
> *
> * Purpose: Main strategy routine for SCSI.
> @@ -1067,13 +1140,8 @@ static void scsi_request_fn(request_queu
> * the host is no longer able to accept any more requests.
> */
> for (;;) {
> - /*
> - * Check this again - each time we loop through we will have
> - * released the lock and grabbed it again, so each time
> - * we need to check to see if the queue is plugged or not.
> - */
> - if (shost->in_recovery || blk_queue_plugged(q))
> - return;
> + if (blk_queue_plugged(q))
> + break;
>
> /*
> * get next queueable request. We do this early to make sure
> @@ -1083,48 +1151,11 @@ static void scsi_request_fn(request_queu
> */
> req = elv_next_request(q);
>
> - if (sdev->device_busy >= sdev->queue_depth)
> - break;
> -
> - if (shost->host_busy == 0 && shost->host_blocked) {
> - /* unblock after host_blocked iterates to zero */
> - if (--shost->host_blocked == 0) {
> - SCSI_LOG_MLQUEUE(3,
> - printk("scsi%d unblocking host at zero depth\n",
> - shost->host_no));
> - } else {
> - blk_plug_device(q);
> - break;
> - }
> - }
> -
> - if (sdev->device_busy == 0 && sdev->device_blocked) {
> - /* unblock after device_blocked iterates to zero */
> - if (--sdev->device_blocked == 0) {
> - SCSI_LOG_MLQUEUE(3,
> - printk("scsi%d (%d:%d) unblocking device at zero depth\n",
> - shost->host_no, sdev->id, sdev->lun));
> - } else {
> - blk_plug_device(q);
> - break;
> - }
> - }
> -
> - /*
> - * If the device cannot accept another request, then quit.
> - */
> - if (sdev->device_blocked)
> - break;
> -
> - if (!list_empty(&sdev->starved_entry))
> + if (scsi_check_sdev(q, sdev))
> break;
>
> - if ((shost->can_queue > 0 && shost->host_busy >= shost->can_queue) ||
> - shost->host_blocked || shost->host_self_blocked) {
> - list_add_tail(&sdev->starved_entry,
> - &shost->starved_list);
> + if (scsi_check_shost(q, shost, sdev))
> break;
> - }
>
> if (sdev->single_lun && scsi_single_lun_check(sdev))
> break;
> @@ -1140,7 +1171,7 @@ static void scsi_request_fn(request_queu
> /* If the device is busy, a returning I/O
> * will restart the queue. Otherwise, we have
> * to plug the queue */
> - if(sdev->device_busy == 0)
> + if (sdev->device_busy == 0)
> blk_plug_device(q);
> break;
> }
> -
--
Luben Tuikov, Senior Software Engineer, Splentec Ltd.
Bus: +1-905-707-1954x112, 9-5 EDT. Fax: +1-905-707-1974.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH] 4/7 cleanup/consolidate code in scsi_request_fn
2003-03-25 21:32 ` [PATCH] 4/7 cleanup/consolidate code in scsi_request_fn Luben Tuikov
@ 2003-03-26 0:58 ` Patrick Mansfield
2003-03-26 17:07 ` Luben Tuikov
0 siblings, 1 reply; 34+ messages in thread
From: Patrick Mansfield @ 2003-03-26 0:58 UTC (permalink / raw)
To: Luben Tuikov; +Cc: linux-scsi
On Tue, Mar 25, 2003 at 04:32:50PM -0500, Luben Tuikov wrote:
> Patrick Mansfield wrote:
> > +static inline int scsi_check_sdev(struct request_queue *q,
> > + struct scsi_device *sdev)
>
> scsi_check_sdev: clearly every function in C does a check, or a computation,
> or a modification, or some permutation of those. So this name is too generic
> and doesn't say what the function does.
>
> Furthermore, since the function's outcome is logical in nature (i.e. it does
> NOT return an error code), you can return 1 on success and 0 on failure.
>
> How about this:
> /**
> * scsi_dev_ok2queue: Return non-zero if we can queue to the
> * device, else 0.
> */
I don't care much one way or the other, but I don't like ok2queue.
I was using sdev to distinguish the name from struct dev and functions
related to struct dev. We already have the confusing names scsi_device_get
and scsi_get_device - one for a "get" of a scsi_device, the other to get a
struct dev given a scsi_host. And then the shost naming matches sdev.
Anyway, the following are fine with me:
scsi_dev_ready, scsi_host_ready
scsi_sdev_ready, scsi_shost_ready
> > +static inline int scsi_check_shost(struct request_queue *q,
> > + struct Scsi_Host *shost,
> > + struct scsi_device *sdev)
>
> Absolutely the same story here, as above.
>
> scsi_host_ok2queue() -- please do not use ``shost'' in function names,
> ``host'' I think is descriptive enough.
-- Patrick Mansfield
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH] 6/7 add and use a per-scsi_device queue_lock
2003-03-25 21:20 ` James Bottomley
@ 2003-03-26 2:01 ` Patrick Mansfield
2003-03-27 16:09 ` James Bottomley
0 siblings, 1 reply; 34+ messages in thread
From: Patrick Mansfield @ 2003-03-26 2:01 UTC (permalink / raw)
To: James Bottomley; +Cc: SCSI Mailing List
On Tue, Mar 25, 2003 at 03:20:32PM -0600, James Bottomley wrote:
> On Mon, 2003-03-24 at 20:03, Patrick Mansfield wrote:
> > Add and use a per scsi_device queue_lock.
>
> In doing this, you're introducing a locking hierarchy, which would need
> to be documented. If you need both the host and queue locks, which do
> you take first? The code implies that it's queue_lock and then
> host_lock, which is a bit counterintuitive
scsi_request_fn is called with the queue_lock already held, so I decided
to get queue_lock first, rather than unlocking queue_lock followed by
locking it again later.
> and leads to awful problems
> for your starved_list: the host_lock is actually protecting the starved
> list, so the way you currently have it, you have to drop the host lock
> to acquire the queue_lock to run the block queue. Unfortunately, doing
> this could lead to the device being readded when running the queue.
> Thus, the queue_next_request can spin removing and adding the device to
> the starved list---probably working from a copy of the starved list will
> help with this.
The above doesn't happen because of the while() check: if anyone gets put
on or re-added to the starved list, we drop out of the while loop. The
current code also allows multiple IO completions to run starved queues in
parallel (I doubt it helps any, but I have no evidence one way or the
other).
> I'd also like to see avoidance of the locking hierarchy where possible,
> e.g. in scsi_request_fn: can we move the actions that
> need the host_lock outside the queue_lock?
I tried coding to avoid the lock hierarchy but did not quite make it.
Without a lock hierarchy, (in scsi_request_fn) I have to lock, check for
resources, increment host_busy or device_busy, and unlock.
Then, if unable to get a host resource, I have to decrement device_busy.
For the blk_queue_empty(q) or the "if !(req)" cases, I have to decrement
both device_busy and host_busy.
I thought (past tense) moving up blk_queue_empty(q) and the "if (!req)"
checks would cause problems by not decrementing device_blocked and
host_blocked, and that there is no way to handle host_busy being
decremented back to 0 there.
On further thought, those checks can be moved up before checking any of
the sdev or shost values (if blk_queue_empty, we don't care about
device_blocked or host_blocked). Agree? If so that issue goes away.
There is still an issue with the single_lun checks, where we allow
continued IO to a device that has device_busy set, but not for a device
that has device_busy == 0 and target_busy != 0; so we have to get the
lock for device_busy, and the lock for target busy. I could keep the lock
hierarchy only in single_lun cases, or add another target lock that would
be part of a lock hierarchy.
Should I add a target lock?
Excluding the single_lun case, do you think it's worthwhile to avoid the
lock hierarchy?
Related note: the new workq code in blk probably messes up our device_blocked
and host_blocked handling, since there is now a 3 millisecond timeout (see
blk_plug_queue). device_blocked and host_blocked should really use some
type of timeout mechanism. Maybe a new blk_plug_timeout(). This might also
allow us to get rid of the one remaining recursive call into
scsi_request_fn.
-- Patrick Mansfield
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH] 4/7 cleanup/consolidate code in scsi_request_fn
2003-03-26 0:58 ` Patrick Mansfield
@ 2003-03-26 17:07 ` Luben Tuikov
2003-03-26 17:13 ` Patrick Mansfield
0 siblings, 1 reply; 34+ messages in thread
From: Luben Tuikov @ 2003-03-26 17:07 UTC (permalink / raw)
To: Patrick Mansfield; +Cc: linux-scsi
Patrick Mansfield wrote:
> I don't care much one way or the other, but I don't like ok2queue.
>
> I was using sdev to distinguish the name from struct dev and functions
> related to struct dev. We already have the confusing names scsi_device_get
> and scsi_get_device - one for a "get" of a scsi_device one to get a
> struct dev given a scsi_host. And then the shost naming matches sdev.
>
> Anyway, the following are fine with me:
> scsi_dev_ready, scsi_host_ready
> scsi_sdev_ready, scsi_shost_ready
>
The problem here is _again_ the implicitness of your definitions.
``ready'' is not better than ``check''. Why?
result = X_ready(arg) --> whatever result, the question is
``ready (or not) _for__what_????''
--
Luben
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH] 4/7 cleanup/consolidate code in scsi_request_fn
2003-03-26 17:07 ` Luben Tuikov
@ 2003-03-26 17:13 ` Patrick Mansfield
2003-03-26 17:25 ` Luben Tuikov
0 siblings, 1 reply; 34+ messages in thread
From: Patrick Mansfield @ 2003-03-26 17:13 UTC (permalink / raw)
To: Luben Tuikov; +Cc: linux-scsi
On Wed, Mar 26, 2003 at 12:07:25PM -0500, Luben Tuikov wrote:
> Patrick Mansfield wrote:
> > I don't care much one way or the other, but I don't like ok2queue.
> > Anyway, the following are fine with me:
> > scsi_dev_ready, scsi_host_ready
> > scsi_sdev_ready, scsi_shost_ready
> >
>
> The problem here is _again_ the implicitness of your definitions.
> ``ready'' is not better than ``check''. Why?
>
> result = X_ready(arg) --> whatever result, the question is
> ``ready (or not) _for__what_????''
OK -
scsi_sdev_queue_ready
scsi_shost_queue_ready
or
scsi_dev_queue_ready
scsi_host_queue_ready
-- Patrick Mansfield
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH] 4/7 cleanup/consolidate code in scsi_request_fn
2003-03-26 17:13 ` Patrick Mansfield
@ 2003-03-26 17:25 ` Luben Tuikov
0 siblings, 0 replies; 34+ messages in thread
From: Luben Tuikov @ 2003-03-26 17:25 UTC (permalink / raw)
To: Patrick Mansfield; +Cc: linux-scsi
Patrick Mansfield wrote:
>
> OK -
>
> scsi_sdev_queue_ready
> scsi_shost_queue_ready
> or
**********************************
* *
*> scsi_dev_queue_ready *
*> scsi_host_queue_ready *
*> *
**********************************
--
Luben
P.S. Even though ``queue ready?'' doesn't quite mean
the same thing as ``can queue?''. If I were you, I'd
use scsi_{dev,host}_canqueue(), or
scsi_{dev,host}_can_queue(), despite the unfortunate
choice of ``can_queue'' back in the days.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH] 3/7 consolidate single_lun code
2003-03-25 20:36 ` [PATCH] 3/7 consolidate single_lun code Luben Tuikov
@ 2003-03-26 19:11 ` Patrick Mansfield
2003-03-26 22:05 ` Luben Tuikov
0 siblings, 1 reply; 34+ messages in thread
From: Patrick Mansfield @ 2003-03-26 19:11 UTC (permalink / raw)
To: Luben Tuikov; +Cc: linux-scsi
On Tue, Mar 25, 2003 at 03:36:19PM -0500, Luben Tuikov wrote:
> Patrick Mansfield wrote:
> > Consolidate scsi single_lun code.
> >
> > /*
> > + * Called for single_lun devices. The target associated with current_sdev can
> > + * only handle one active command at a time. Scan through all of the luns
> > + * on the same target as current_sdev, return 1 if any are active.
> > + */
>
> Here is part of your comment:
> A single_lun (single_lu:1) marked target can handle a single
> active LU at a time.
OK I'll change "one active command" to "one active LU". Plus it does not
actually check current_sdev->device_busy, I'll clarify that comment.
> > +static int scsi_single_lun_check(struct scsi_device *current_sdev)
>
> Here is the name:
>
> static int scsi_single_lun_active(struct scsi_device *dev);
>
> The reason is that this makes sense:
>
> if (scsi_single_lun_active(dev)) {
> /* then there is an active lu */
> } else {
> /* No, there are no active devices, or
> the target is NOT single_lun. */
> }
>
> or the negation:
>
> if (!scsi_single_lun_active(dev)) {
> /* no active lun devices or target is not a single
> LU, -- either way we can continue here: */
> ...
OK - change the name to scsi_single_lun_active.
I was trying to keep the main path code fast, so we do not call out to any
function in the non-single_lun case. We can change
scsi_queue_next_request to be consistent with this approach: have one
function to handle leftover IO, one for single_lun cases, and one for
starved. And only call these functions when necessary.
> > + */
> > +static void scsi_single_lun_run(struct scsi_device *current_sdev,
> > + struct request_queue *q)
>
> Now here I've got a problem. This just pulls out the old logic
> into a separate fn.
That is the main intention of this particular patch - move the code out
of the way. Then the later patches add a per scsi_device queue_lock, and
then fix the single_lun code to work with the new locking.
> I'd rather you did this: in the request fn put the single_lun device
> into the starved devices list, just as you'd do for other devices.
> If you want to discern between devices, you can add single_lun
> devices to the front and fully capable devices into the tail of the
> starved devices list -- this would make the starved devices list
> into a priority queue.
But the last single_lun LU that ran should have priority over any other
LUs on that same target (well, it should really get some slice of IO time,
rather than being allowed IO until it is no longer busy), and separately, the
first starved device should have priority over all other starved devices;
I can't do that (simply) with one list.
single_lun devices are likely slow (CDROM), and we really don't want to
give them priority over other starved devices.
> static inline void scsi_run_starved_devs(struct Scsi_Host *shost)
> {
> unsigned long flags;
> LIST_HEAD(starved);
>
> spin_lock_irqsave(&shost->starved_devs_lock, flags);
> list_splice_init(&shost->starved_devs, &starved);
> spin_unlock_irqrestore(&shost->starved_devs_lock, flags);
>
> while (!list_empty(&starved)) {
> struct scsi_device *dev;
>
> if (shost->host_blocked || shost->host_self_blocked ||
> (shost->can_queue > 0 &&
> shost->host_busy >= shost->can_queue))
> break;
>
> dev=list_entry(starved.next,struct scsi_device,starved_entry);
> list_del_init(dev->starved_entry);
> spin_lock_irqsave(dev->request_queue->queue_lock, flags);
> __blk_run_queue(dev->request_queue);
> spin_unlock_irqrestore(dev->request_queue->queue_lock, flags);
> }
>
> if (!list_empty(&starved)) {
> spin_lock_irqsave(&shost->starved_devs_lock, flags);
> list_splice_init(&starved, &shost->starved_devs);
> spin_unlock_irqrestore(&shost->starved_devs_lock, flags);
> }
> }
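[Luben's sketch leans on list_splice_init() moving the whole chain in
O(1), which is why starved_devs_lock only needs to be held briefly. A
minimal user-space rendering of the list primitives involved, simplified
from the kernel's include/linux/list.h semantics:]

```c
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/*
 * Move every entry on @list onto the front of @head and leave @list
 * empty: a constant-time pointer swap regardless of list length.
 */
static void list_splice_init(struct list_head *list, struct list_head *head)
{
	if (!list_empty(list)) {
		struct list_head *first = list->next;
		struct list_head *last = list->prev;

		last->next = head->next;
		head->next->prev = last;
		head->next = first;
		first->prev = head;
		INIT_LIST_HEAD(list);
	}
}
```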
I did not take the list_splice approach mainly because I wanted to
allow multiple CPUs to run the starved queues (host_blocked cases, not
can_queue). I don't have any data to back this up, and it probably does
not matter.
For the can_queue limit being hit, with continuous IO across multiple LUs
on a host, we will (likely) send only one more IO (run one starved
queue) and stop - the last few lines of the above code would always be
executed.
Also, the same lock has to be used to protect the scsi_host->starved_list
and scsi_device->starved_entry. In the above code, the
list_splice_init touches shost->starved_devs->next->prev
(corresponding to a scsi_device->starved_entry->prev) while holding
starved_devs_lock, but the following list_del_init is done holding the
queue_lock.
> Now since __blk_run_queue(q) calls the scsi_request_fn() it MAY
> add the device back to the starved list, but that's no problem
> as it is already empty. If the host blocks in due time, then
> the leftover devices are added to the front and we get
> prioritization by time (submittal).
Yes, that happens whether we splice out a list or not.
> And this is how scsi_queue_next_request() might look like:
> (i.e. a more general form)
>
> scsi_run_starved_devs(shost);
>
> spin_lock_irqsave(q->queue_lock, flags);
> if (cmd != NULL) {
>
> /*
> * For some reason, we are not done with this request.
> * This happens for I/O errors in the middle of the request,
> * in which case we need to request the blocks that come after
> * the bad sector.
> */
> cmd->request->special = cmd;
> if (blk_rq_tagged(cmd->request))
> blk_queue_end_tag(q, cmd->request);
>
> /*
> * set REQ_SPECIAL - we have a command
> * clear REQ_DONTPREP - we assume the sg table has been
> * nuked so we need to set it up again.
> */
> cmd->request->flags |= REQ_SPECIAL;
> cmd->request->flags &= ~REQ_DONTPREP;
> __elv_add_request(q, cmd->request, 0, 0);
> }
>
> __blk_run_queue(q);
>
> spin_unlock_irqrestore(q->queue_lock, flags);
I did not move starved or single_lun handling before the incomplete IO
retry, as the retry should have higher priority.
-- Patrick Mansfield
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH] 6/7 add and use a per-scsi_device queue_lock
2003-03-25 21:03 ` [PATCH] 6/7 add and use a " Luben Tuikov
@ 2003-03-26 21:33 ` Patrick Mansfield
0 siblings, 0 replies; 34+ messages in thread
From: Patrick Mansfield @ 2003-03-26 21:33 UTC (permalink / raw)
To: Luben Tuikov; +Cc: linux-scsi
On Tue, Mar 25, 2003 at 04:03:27PM -0500, Luben Tuikov wrote:
> Patrick Mansfield wrote:
> > */
> > scsi_host_busy_dec_and_test(host, device);
> > + spin_lock_irqsave(SCpnt->device->request_queue->queue_lock, flags);
> > + SCpnt->device->device_busy--;
> > + spin_unlock_irqrestore(SCpnt->device->request_queue->queue_lock, flags);
>
>
> Such acrobatics warrant an atomic variable. I.e. make device_busy an
> atomic_t and avoid all this juggling.
>
> atomic_dec(SCpnt->device->device_busy);
If we handled all of the SCSI IO completions the same way (got rid of
scsi_retry_command and more), the device_busy-- might happen in
scsi_queue_next_request or a related function, where we might do more than
just decrement device_busy with the lock held. This is related somewhat to
the current handling of scsi_queue_next_request, in that we should always
call it after IO completion.
Having both an atomic_t for device_busy and a lock protecting other
scsi_device and queue data does not appear to give any overall reduction
in executed code (at least on x86, I don't know about other archs). We get
rid of a spin lock/unlock for the above, but we have to add 3
atomic_reads and one atomic_inc in the main-line code path - on x86 SMP,
four lock-prefixed assembler instructions (counting in my current tree
with all the lock patches applied).
We should leave this as-is for now, and future SCSI IO completion clean up
could determine how best to protect device_busy.
-- Patrick Mansfield
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH] 7/7 fix single_lun code for per-scsi_device queue_lock
2003-03-25 21:23 ` Luben Tuikov
@ 2003-03-26 21:47 ` Patrick Mansfield
2003-03-26 22:12 ` Luben Tuikov
0 siblings, 1 reply; 34+ messages in thread
From: Patrick Mansfield @ 2003-03-26 21:47 UTC (permalink / raw)
To: Luben Tuikov; +Cc: linux-scsi
On Tue, Mar 25, 2003 at 04:23:41PM -0500, Luben Tuikov wrote:
> Patrick Mansfield wrote:
> > Fix single_lun code for per-scsi_device queue_lock
> >
> > diff -purN -X /home/patman/dontdiff lksplit-25/drivers/scsi/scsi.h sl_lksplit-25/drivers/scsi/scsi.h
> > --- lksplit-25/drivers/scsi/scsi.h Mon Mar 24 12:15:10 2003
> > +++ sl_lksplit-25/drivers/scsi/scsi.h Mon Mar 24 12:15:26 2003
> > @@ -531,6 +531,15 @@ extern struct list_head scsi_dev_info_li
> > extern int scsi_dev_info_list_add_str(char *);
> >
> > /*
> > + * scsi_target: representation of a scsi target, for now, this is only
> > + * used for single_lun devices.
> > + */
> > +struct scsi_target {
> > + unsigned int starget_busy;
> > + unsigned int starget_refcnt;
> > +};
>
> Yes, OK, but let's do it right.
> First, I abhor all this letter playing to devise
> names like starget_busy and starget_refcnt.
> starget_busy could be an atomic_t equal to the sum of all
> LUs' pending_q_counts - i.e. it had better be, else we're losing
> commands - and it should also have a different (more descriptive)
> name. Something like active_cmds, or active_cmd_count,
> but NOT ``..._cnt''!
>
> Also, moving all the necessary variables into scsi_target, like
> say single_lun to be renamed single_lu.
>
> Also having a list of devices:
>
> struct list_head lu_list;
>
> etc, etc.
> > + struct scsi_target *sdev_target; /* used only for single_lun */
>
>
> Should probably be used for all devices. Let's do it right.
I would like to have a scsi_target that always shows up, but I don't have
time to code it now. So for now, the kludgey single_lun code stays.
> > + spin_lock_irqsave(current_sdev->request_queue->queue_lock, flags2);
> > + spin_lock_irqsave(current_sdev->host->host_lock, flags);
>
> No, you do NOT need flags and flags2. Look at other kernel code
> to see how this is done. scsi_put_command() gives
> an example of doing double locks and irq state -- it's a nice one.
OK
> you MUST absolutely mention this somewhere with boldface red
> letters comment that the queue lock is always obtained FIRST
> and the host lock SECOND (and, trivially, released in opposite
> order).
OK, will add comments, though I'm undecided on whether to have both a
target lock and host_lock, or just go with host_lock (per comments sent
in reply to James).
> > + __blk_run_queue(current_sdev->request_queue);
> > + spin_unlock_irqrestore(current_sdev->request_queue->queue_lock, flags2);
>
> Now if you make device_busy an atomic_t type, then you DO not
> need these double locking acrobatics.
We should defer such changes until we clean up IO completion, per other
email comments.
> > + if (sdev->single_lun) {
> > + spin_lock_irqsave(sdev->host->host_lock, flags);
> > + sdev->sdev_target->starget_refcnt--;
> > + if (sdev->sdev_target->starget_refcnt == 0)
> > + kfree(sdev->sdev_target);
>
>
> Have you seen ``atomic_dec_and_test()''?
OK
-- Patrick Mansfield
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH] 3/7 consolidate single_lun code
2003-03-26 19:11 ` Patrick Mansfield
@ 2003-03-26 22:05 ` Luben Tuikov
2003-03-27 22:43 ` Patrick Mansfield
0 siblings, 1 reply; 34+ messages in thread
From: Luben Tuikov @ 2003-03-26 22:05 UTC (permalink / raw)
To: Patrick Mansfield; +Cc: linux-scsi
Patrick Mansfield wrote:
>
> But the last single_lun LU that ran should have priority over any other
> LU's on that same target (well it should really get some slice of IO time,
> rather than allowing IO til it is no longer busy), and separately, the
> first starved device should have priority over all other starved devices,
> I can't do that (simply) with one list.
>
> single_lun devices are likely slow (CDROM), and we really don't want to
> give them priority over other starved devices.
So, using the starved_list as a priority queue will not work?
Given: the consumer of starved_list takes from its front.
``The last single_lun LU that ran'' has priority over all others
*IF* added at the front (LIFO), since any later one of the same
sort (single lun) will also be added at the front.
The *first* starved device DOES HAVE priority over all others
*IF* added at the tail (FIFO), since all others of the same
sort (non-single lun) will *also* be added at the tail.
So, for single_lun, our priority queue (starved_list) behaves
as LIFO, and for non-single_lun, our priority queue (starved_list)
behaves as FIFO.
The bottom line is that you get one, universal logic for single
lun and non-single lun devices simply by manipulating where
you insert in the starved list.
The insertion is, of course, done at scsi_request_fn.
(cont'd)
>
>>static inline void scsi_run_starved_devs(struct Scsi_Host *shost)
>>{
>> unsigned long flags;
>> LIST_HEAD(starved);
>>
>> spin_lock_irqsave(&shost->starved_devs_lock, flags);
>> list_splice_init(&shost->starved_devs, &starved);
>> spin_unlock_irqrestore(&shost->starved_devs_lock, flags);
>>
>> while (!list_empty(&starved)) {
>> struct scsi_device *dev;
>>
>> if (shost->host_blocked || shost->host_self_blocked ||
>> (shost->can_queue > 0 &&
>> shost->host_busy >= shost->can_queue))
>> break;
>>
>> dev = list_entry(starved.next, struct scsi_device, starved_entry);
>> list_del_init(&dev->starved_entry);
>> spin_lock_irqsave(dev->request_queue->queue_lock, flags);
>> __blk_run_queue(dev->request_queue);
>> spin_unlock_irqrestore(dev->request_queue->queue_lock, flags);
>> }
>>
>> if (!list_empty(&starved)) {
>> spin_lock_irqsave(&shost->starved_devs_lock, flags);
>> list_splice_init(&starved, &shost->starved_devs);
>> spin_unlock_irqrestore(&shost->starved_devs_lock, flags);
>> }
>>}
>
>
> I did not take the list_splice approach mainly because I wanted to
> allow multiple CPUs to run the starved queues (host_blocked cases, not
> can_queue). I don't have any data to back this up, and it probably does
> not matter.
>
> For the can_queue limit being hit, with continuous IO across multiple
> LUs on a host, we will (likely) send only one more IO (run one starved
> queue) and stop - the last few lines of the above code would always be
> executed.
>
> Also, the same lock has to be used to protect the scsi_host->starved_list
> and scsi_device->starved_entry. In the above code, the first
> list_splice_init touches shost->starved_devs.next->prev,
> corresponding to some scsi_device's starved_entry.prev, while holding
> starved_devs_lock, but the following list_del_init is done holding the
> queue_lock.
We're working with the local list now!
So that starved_entry is in the local starved list now,
and shost->starved_list is EMPTY!
On the ``following list_del_init'' dev->starved_entry is in
the local ``starved'' list!
--
Luben
* Re: [PATCH] 7/7 fix single_lun code for per-scsi_device queue_lock
2003-03-26 21:47 ` Patrick Mansfield
@ 2003-03-26 22:12 ` Luben Tuikov
0 siblings, 0 replies; 34+ messages in thread
From: Luben Tuikov @ 2003-03-26 22:12 UTC (permalink / raw)
To: Patrick Mansfield; +Cc: linux-scsi
Patrick Mansfield wrote:
>
> OK will add comments, though undecided on whether to have both a
> target lock and host_lock, or just go with host_lock (per comments sent
> reply to James).
You must recognize that this is the same concern as James'.
It's a really important one, for all the headaches and lock-ups
it could bring if not well documented.
This is just the nature of ``locking hierarchies''.
--
Luben
* Re: [PATCH] 6/7 add and use a per-scsi_device queue_lock
2003-03-26 2:01 ` Patrick Mansfield
@ 2003-03-27 16:09 ` James Bottomley
2003-03-28 0:30 ` Patrick Mansfield
0 siblings, 1 reply; 34+ messages in thread
From: James Bottomley @ 2003-03-27 16:09 UTC (permalink / raw)
To: Patrick Mansfield; +Cc: SCSI Mailing List
On Tue, 2003-03-25 at 20:01, Patrick Mansfield wrote:
> On Tue, Mar 25, 2003 at 03:20:32PM -0600, James Bottomley wrote:
> > and leads to awful problems
> > for your starved_list: the host_lock is actually protecting the starved
> > list, so the way you currently have it, you have to drop the host lock
> > to acquire the queue_lock to run the block queue. Unfortunately, doing
> > this could lead to the device being readded when running the queue.
> > Thus, the queue_next_request can spin removing and adding the device to
> > the starved list---probably working from a copy of the starved list will
> > help with this.
>
> The above doesn't happen because of the while() check: if anyone gets put
> on or re-added to the starved list, we drop out of the while loop. The
> current code also allows multiple IO completions to run starved queues in
> parallel (I doubt it helps any, but I have no evidence one way or the
> other).
Well, not necessarily. Dropping out of the while loop assumes that
host_busy goes below can_queue. However, if something else is sourcing
requests, it may get in between dropping the host_lock and acquiring the
queue_lock. In this case, running the block queues would simply re-add
the request and keep host_busy at can_queue, thus we could get into a
loop.
> > I'd also like to see avoidance of the locking hierarchy where possible.
> > i.e. in the scsi_request_fn for instance, can we move the actions that
> > need the host_lock outside the queue_lock?
>
> I tried coding to avoid the lock hierarchy but did not quite make it.
OK.
What about making the host processing in the scsi_request_fn exceptional
(i.e. putting it outside the queue lock after the request has been
dequeued). That way we'd have to do a queue push back if it was
unacceptable (and get the counter decrements correct)?
> Without a lock hierarchy, (in scsi_request_fn) I have to lock, check for
> resources, increment host_busy or device_busy, and unlock.
>
> Then, if unable to get a host resource, I have to decrement device_busy.
>
> For the blk_queue_empty(q) or the "if (!req)" cases, I have to decrement
> both device_busy and host_busy.
>
> I thought (past tense) moving up blk_queue_empty(q) and the "if (!req)"
> checks would cause problems by not decrementing device_blocked and
> host_blocked, and that there is no way to handle host_busy being
> decremented back to 0 there.
>
> On further thought, those checks can be moved up before checking any of
> the sdev or shost values (if blk_queue_empty, we don't care about
> device_blocked or host_blocked). Agree? If so that issue goes away.
Hmm, moving them should be OK. The !req case probably wants to be
treated the same as starvation (except you have to beware the
device_busy == 0 && host_busy == 0 case where the queue needs plugging).
> There is still an issue with the single_lun checks, where we allow
> continued IO to a device that has device_busy set, but not for a device
> that has device_busy == 0 and target_busy != 0; so we have to get the
> lock for device_busy, and the lock for target busy. I could keep the lock
> hierarchy only in single_lun cases, or add another target lock that would
> be part of a lock hierarchy.
That's probably OK, since these checks are now exceptional (we only get
into the logic if the flag is set).
> Should I add a target lock?
Not unless you can provide an *extremely* good justification for it.
> Excluding the single_lun case, do you think it's worthwhile to avoid the
> lock hierarchy?
As long as it doesn't contort the code, yes. Lock hierarchies are
accidents waiting to happen, so taking reasonable pains to avoid them is
good.
> Related note: the new workq code in blk probably messes up our device_blocked
> and host_blocked handling, since there is now a 3 millisecond timeout (see
> blk_plug_queue). device_blocked and host_blocked should really use some
> type of timeout mechanism. Maybe a new blk_plug_timeout(). This might also
> allow us to get rid of the one remaining recursive call into
> scsi_request_fn.
Since we never gave any guarantees other than "eventually" about the
restart time, I don't necessarily think it does.
James
* Re: [PATCH] 0/7 per scsi_device queue lock patches
2003-03-25 1:53 [PATCH] 0/7 per scsi_device queue lock patches Patrick Mansfield
2003-03-25 1:54 ` [PATCH] 1/7 starved changes - use a list_head for starved queue's Patrick Mansfield
@ 2003-03-27 16:14 ` James Bottomley
1 sibling, 0 replies; 34+ messages in thread
From: James Bottomley @ 2003-03-27 16:14 UTC (permalink / raw)
To: Patrick Mansfield; +Cc: SCSI Mailing List
On Mon, 2003-03-24 at 19:53, Patrick Mansfield wrote:
> The following patches against recent 2.5.x bk (pulled on march 24, should
> apply fine on top of 2.5.66) lead to splitting the scsi queue_lock into a
> per-scsi_device lock rather than the current per-scsi_host lock.
I've put all of these patches in a BK tree at
http://linux-scsi.bkbits.net/scsi-locking-2.5
for people to play with.
James
* Re: [PATCH] 3/7 consolidate single_lun code
2003-03-26 22:05 ` Luben Tuikov
@ 2003-03-27 22:43 ` Patrick Mansfield
2003-03-28 15:09 ` Luben Tuikov
0 siblings, 1 reply; 34+ messages in thread
From: Patrick Mansfield @ 2003-03-27 22:43 UTC (permalink / raw)
To: Luben Tuikov; +Cc: linux-scsi
On Wed, Mar 26, 2003 at 05:05:09PM -0500, Luben Tuikov wrote:
> Patrick Mansfield wrote:
> >
> > But the last single_lun LU that ran should have priority over any other
> > LU's on that same target (well it should really get some slice of IO time,
> > rather than allowing IO til it is no longer busy), and separately, the
> > first starved device should have priority over all other starved devices,
> > I can't do that (simply) with one list.
> >
> > single_lun devices are likely slow (CDROM), and we really don't want to
> > give them priority over other starved devices.
> So, using the starved_list as a priority queue will not work?
> Given: consumer of starved_list takes from its front.
>
> ``the last single_lun LU that ran'' has priority over all others,
> *IF* added at the front. (LIFO) (since any later one of the same
> sort (single lun) will be added at the front)
Yes if using the list_splice code, and if it is added even for cases when
it is able to send IO. This would differ from the current code - given 3
single_lun devices on a target, one can be prevented from sending IO (in
addition to all cases where one single_lun device can prevent IO to
other single_lun devices).
> The *first* starved device DOES HAVE priority over all others,
> *IF* added at the tail. (FIFO) (since all others of the same
> sort (non-single lun) will *also* be added at the tail)
Then it has priority over other starved devices, but not over single_lun
devices put on the head. So a single_lun device put on the head of the
starved_list could potentially prevent other non-single_lun devices on the
same scsi_host from sending IO.
We could add another list_head for the single_lun, but I don't think it
is worth the extra code, data, and effort.
> > Also the same lock has to be used to protect the scsi_host->starved_list
> > and scsi_device->starved_entry. In the above code the first
> > list_splice_init above touches the shost->starved_devs->next->prev,
> > corresponding to a scsi_device->starved_entry->prev, while holding
> > starved_devs_lock but the following list_del_init is done holding the
> > queue_lock.
>
> We're working with the local list now!
>
> So that starved_entry is in the local starved list now,
> and shost->starved_list is EMPTY!
>
> On the ``following list_del_init'' dev->starved_entry is in
> the local ``starved'' list!
dev->starved_entry is still visible and used in scsi_request_fn, removing
it from the starved list does not prevent scsi_request_fn from being
called and referencing it (if we have two locks and no lock hierarchy in
scsi_request_fn).
I'm not against using the list_splice; it does not clean up the
locking, but it probably fixes James' issue (the possible loop).
-- Patrick Mansfield
* Re: [PATCH] 6/7 add and use a per-scsi_device queue_lock
2003-03-27 16:09 ` James Bottomley
@ 2003-03-28 0:30 ` Patrick Mansfield
0 siblings, 0 replies; 34+ messages in thread
From: Patrick Mansfield @ 2003-03-28 0:30 UTC (permalink / raw)
To: James Bottomley; +Cc: SCSI Mailing List
On Thu, Mar 27, 2003 at 10:09:34AM -0600, James Bottomley wrote:
> On Tue, 2003-03-25 at 20:01, Patrick Mansfield wrote:
> > The above doesn't happen because of the while() check: if anyone gets put
> > on or re-added to the starved list, we drop out of the while loop. The
> > current code also allows multiple IO completions to run starved queues in
> > parallel (I doubt it helps any, but I have no evidence one way or the
> > other).
>
> Well, not necessarily. Dropping out of the while loop assumes that
> host_busy goes below can_queue. However, if something else is sourcing
(I assume you meant hits can_queue.)
> requests, it may get in between dropping the host_lock and acquiring the
> queue_lock. In this case, running the block queues would simply re-add
> the request and keep host_busy at can_queue, thus we could get into a
> loop.
I don't see how new requests can cause a loop there, since a new request
will increment host_busy.
Or do you mean new IO can prevent other IO? It's possible for a new IO
(iff 1 request is sent) to sneak in and prevent a starved queue from
running (even with a list_splice algorithm).
Or do you mean the following case?
An IO completion with highly unlikely synchronization could lead to
a loop, for example:
t1: cpu1: in scsi_queue_next_request:
          pulled sdev off starved list, called __blk_run_queue,
          one IO was sent, sdev was put back on starved_list
t2: cpu2: got host_lock, decremented host_busy
t3: cpu1: waits for host_lock just at the end of the while loop
    cpu2: releases host_lock
    cpu1: goes to t1
    cpu2: calls scsi_queue_next_request and generally does nothing
          since cpu1 ran the queue(s), then goes to t2
It is probably impossible to sustain the above synchronization; I don't
know if the irqs will route to another cpu in such a synchronized fashion
(use cpu1, but then use cpu2 for other interrupts from the same adapter).
Using list_splice (or somehow limiting it to one iteration across all
sdev's) would prevent the above from occurring.
> > > I'd also like to see avoidance of the locking hierarchy where possible.
> > > i.e. in the scsi_request_fn for instance, can we move the actions that
> > > need the host_lock outside the queue_lock?
> >
> > I tried coding to avoid the lock hierarchy but did not quite make it.
>
> OK.
>
> What about making the host processing in the scsi_request_fn exceptional
> (i.e. putting it outside the queue lock after the request has been
> dequeued). That way we'd have to do a queue push back if it was
> unacceptable (and get the counter decrements correct)?
I missed the dequeueing step.
> Hmm, moving them should be OK. The !req case probably wants to be
> treated the same as starvation (except you have to beware the
> device_busy == 0 && host_busy == 0 case where the queue needs plugging).
>
OK, I'll try to move those up, and do host locking/checking after the
blkdev_dequeue_request, and then push req back on the queue if the host
can't queue.
> > There is still an issue with the single_lun checks, where we allow
> > continued IO to a device that has device_busy set, but not for a device
> > that has device_busy == 0 and target_busy != 0; so we have to get the
> > lock for device_busy, and the lock for target busy. I could keep the lock
> > hierarchy only in single_lun cases, or add another target lock that would
> > be part of a lock hierarchy.
>
> That's probably OK, since these checks are now exceptional (we only get
> into the logic if the flag is set).
OK - so I will keep the lock hierarchy for single_lun.
> > Should I add a target lock?
>
> Not unless you can provide an *extremely* good justification for it.
Good, no target lock.
> > Related note: the new workq code in blk probably messes up our device_blocked
> > and host_blocked handling, since there is now a 3 millisecond timeout (see
> > blk_plug_queue). device_blocked and host_blocked should really use some
> > type of timeout mechanism. Maybe a new blk_plug_timeout(). This might also
> > allow us to get rid of the one remaining recursive call into
> > scsi_request_fn.
>
> Since we never gave any guarantees other than "eventually" about the
> restart time, I don't necessarily think it does.
It won't fail but we will quickly retry IO for a host busy or queue full
(when host_busy or device_busy == 0).
-- Patrick Mansfield
* Re: [PATCH] 3/7 consolidate single_lun code
2003-03-27 22:43 ` Patrick Mansfield
@ 2003-03-28 15:09 ` Luben Tuikov
2003-03-28 20:06 ` Patrick Mansfield
0 siblings, 1 reply; 34+ messages in thread
From: Luben Tuikov @ 2003-03-28 15:09 UTC (permalink / raw)
To: Patrick Mansfield; +Cc: linux-scsi
Patrick Mansfield wrote:
> On Wed, Mar 26, 2003 at 05:05:09PM -0500, Luben Tuikov wrote:
>
>>Patrick Mansfield wrote:
>>
[This is your premise:]
>>>But the last single_lun LU that ran should have priority over any other
>>>LU's on that same target (well it should really get some slice of IO time,
>>>rather than allowing IO til it is no longer busy), and separately, the
>>>first starved device should have priority over all other starved devices,
>>>I can't do that (simply) with one list.
>>Given: consumer of starved_list takes from its front.
Here I'm quoting the single lun part of your premise above:
>>``the last single_lun LU that ran'' has priority over all others,
>>*IF* added at the front. (LIFO) (since any later one of the same
>>sort (single lun) will be added at the front)
>
>
> Yes if using the list_splice code, and if it is added even for cases when
> it is able to send IO. This would differ from the current code - given 3
> single_lun devices on a target, one can be prevented from sending IO (in
> addition to all cases where one single_lun device can prevent IO to
> other single_lun devices).
>
Here I'm quoting the second part of your premise above:
>>The *first* starved device DOES HAVE priority over all others,
>>*IF* added at the tail. (FIFO) (since all others of the same
>>sort (non-single lun) will *also* be added at the tail)
>
>
> Then it has priority over other starved devices, but not over single_lun
> devices put on the head. So a single_lun device put on the head of the
> starved_list could potentially prevent other non-single_lun devices on the
> same scsi_host from sending IO.
You see, all I'm doing is trying to come up with a suitable ADT
and algorithm for the premise which _you_ gave.
The problem is that you have no fixed arbitration criterion between single
lun and fully capable targets. If you did, it would be a little thing to
get the single list working as a priority queue.
I.e. what I'm asking is this: _even_ if you had 2 lists, one
for single lun and another for fully capable targets,
how would you arbitrate between the two in scsi_request_fn()?
Because if you tell me how, a single-list traversal of the starved
devices as a priority queue, as I described, will work.
What I'm trying to say is that there exists an ordering, i.e. an
insertion mechanism, for the problem at hand.
I.e. you know that the last finished single target is at the front,
the last finished fully capable target is at the tail, and
you can arbitrate any which way you want just from a single
list traversal:
- last finished lu on a single lu target (front)
- last finished lu on a fully cap. target (tail)
- first finished lu on a single lu target (last single lun F->T)
- first finished lu on a fully cap. target (last f. cap. lun T->F)
- any other starved lu (single lu or fully cap.; arb.)
> We could add another list_head for the single_lun, but I don't think it
> is worth the extra code, data, and effort.
Right. It is *not* worth it, mainly because I think that the task
can be accomplished with a single list.
>>>Also the same lock has to be used to protect the scsi_host->starved_list
>>>and scsi_device->starved_entry. In the above code the first
>>>list_splice_init above touches the shost->starved_devs->next->prev,
>>>corresponding to a scsi_device->starved_entry->prev, while holding
>>>starved_devs_lock but the following list_del_init is done holding the
>>>queue_lock.
>>
>>We're working with the local list now!
>>
>>So that starved_entry is in the local starved list now,
>>and shost->starved_list is EMPTY!
>>
>>On the ``following list_del_init'' dev->starved_entry is in
>>the local ``starved'' list!
>
>
> dev->starved_entry is still visible and used in scsi_request_fn, removing
> it from the starved list does not prevent scsi_request_fn from being
> called and referencing it (if we have two locks and no lock hierarchy in
> scsi_request_fn).
Well, this is just unfortunate. This ``design feature'' is not very
good. It is *still* my conviction that a scsi_cmnd has to have
only *one* list entry and belong to only *one* list at a time,
that list representing its state. I think I'm repeating myself.
> I'm not against using the list_splice, it does not clean up the
> locking, but it probably fixes James issue (possible loop).
--
Luben
* Re: [PATCH] 3/7 consolidate single_lun code
2003-03-28 15:09 ` Luben Tuikov
@ 2003-03-28 20:06 ` Patrick Mansfield
0 siblings, 0 replies; 34+ messages in thread
From: Patrick Mansfield @ 2003-03-28 20:06 UTC (permalink / raw)
To: Luben Tuikov; +Cc: linux-scsi
On Fri, Mar 28, 2003 at 10:09:22AM -0500, Luben Tuikov wrote:
> Patrick Mansfield wrote:
> > On Wed, Mar 26, 2003 at 05:05:09PM -0500, Luben Tuikov wrote:
> You see, all I'm doing is trying to come up with a suitable ADT
> and algorithm for the premise which _you_ gave.
>
> The problem is that you have no fixed arbitration criterion between single
> lun and fully capable targets. If you did, it would be a little thing to
> get the single list working as a priority queue.
>
> I.e. what I'm asking is this: _even_ if you had 2 lists, one
> for single lun and another for fully capable targets,
> how would you arbitrate between the two in scsi_request_fn()?
OK, I see your point, and I don't have an answer.
So we should be able to put single_lun sdev's on the starved_list and
maintain the same priority (current scsi-locking-2.5 with all 7 patches
applied runs single_lun queues, and then runs starved queues; it puts all
devices on the starved_list if can_queue or other host limits are hit;
no device is put on the starved_list unless it is unable to send IO).
I'm working on getting rid of the lock hierarchy now, I would rather not
touch any more of the single_lun code (I would rather such devices go
away). If you or anyone can submit a patch (against scsi-locking-2.5) for
the single_lun code to use the starved_list, that would be nice :)
I tested the single_lun code by booting with scsi_dev_flags=vend:prod:0x10
for some of the disks on my system, or you can modify the scsi_scan.c
table.
I'm not sure if we can ever prioritize the starved_list (holding a list of
sdev's or their request functions) without locking out one or the other of
the single lun or starved devices.
We might need a per-target or per-host queue (of requests, not of sdev's)
to fix it properly. A per-target (or per-host?) queue might also allow an
easier fix to prevent one single_lun device from starving all others. I
would rather leave this as a broken hardware or bad configuration, the fix
is to get rid of the single_lun hardware, or only attach one of them
(target) to an adapter.
In any case the starved_list patch is an improvement over previous
handling for the starved cases, and for single_lun cases we should not be
any worse than before.
-- Patrick Mansfield
end of thread, other threads:[~2003-03-28 20:06 UTC | newest]
Thread overview: 34+ messages (download: mbox.gz, follow: Atom feed)
-- links below jump to the message on this page --
2003-03-25 1:53 [PATCH] 0/7 per scsi_device queue lock patches Patrick Mansfield
2003-03-25 1:54 ` [PATCH] 1/7 starved changes - use a list_head for starved queue's Patrick Mansfield
2003-03-25 2:02 ` [PATCH] 2/7 add missing scsi_queue_next_request calls Patrick Mansfield
2003-03-25 2:02 ` [PATCH] 3/7 consolidate single_lun code Patrick Mansfield
2003-03-25 2:03 ` [PATCH] 4/7 cleanup/consolidate code in scsi_request_fn Patrick Mansfield
2003-03-25 2:03 ` [PATCH] 5/7 alloc a request_queue on each scsi_alloc_sdev call Patrick Mansfield
2003-03-25 2:03 ` [PATCH] 6/7 add and use a per-scsi_device queue_lock Patrick Mansfield
2003-03-25 2:04 ` [PATCH] 7/7 fix single_lun code for " Patrick Mansfield
2003-03-25 21:23 ` Luben Tuikov
2003-03-26 21:47 ` Patrick Mansfield
2003-03-26 22:12 ` Luben Tuikov
2003-03-25 21:03 ` [PATCH] 6/7 add and use a " Luben Tuikov
2003-03-26 21:33 ` Patrick Mansfield
2003-03-25 21:20 ` James Bottomley
2003-03-26 2:01 ` Patrick Mansfield
2003-03-27 16:09 ` James Bottomley
2003-03-28 0:30 ` Patrick Mansfield
2003-03-25 7:12 ` [PATCH] 5/7 alloc a request_queue on each scsi_alloc_sdev call Christoph Hellwig
2003-03-25 7:18 ` Jens Axboe
2003-03-25 21:32 ` [PATCH] 4/7 cleanup/consolidate code in scsi_request_fn Luben Tuikov
2003-03-26 0:58 ` Patrick Mansfield
2003-03-26 17:07 ` Luben Tuikov
2003-03-26 17:13 ` Patrick Mansfield
2003-03-26 17:25 ` Luben Tuikov
2003-03-25 20:36 ` [PATCH] 3/7 consolidate single_lun code Luben Tuikov
2003-03-26 19:11 ` Patrick Mansfield
2003-03-26 22:05 ` Luben Tuikov
2003-03-27 22:43 ` Patrick Mansfield
2003-03-28 15:09 ` Luben Tuikov
2003-03-28 20:06 ` Patrick Mansfield
2003-03-25 20:50 ` Luben Tuikov
2003-03-25 19:41 ` [PATCH] 2/7 add missing scsi_queue_next_request calls Luben Tuikov
2003-03-25 19:39 ` [PATCH] 1/7 starved changes - use a list_head for starved queue's Luben Tuikov
2003-03-27 16:14 ` [PATCH] 0/7 per scsi_device queue lock patches James Bottomley