[PATCH 0/13 v6] More device removal fixes

linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/13 v6] More device removal fixes
@ 2012-11-28 12:39 Bart Van Assche
  2012-11-28 12:42 ` [PATCH v6 01/13] block: Rename queue dead flag Bart Van Assche
                   ` (14 more replies)
  0 siblings, 15 replies; 36+ messages in thread
From: Bart Van Assche @ 2012-11-28 12:39 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Tejun Heo,
	Chanho Min, Hannes Reinecke

Fix a few race conditions that can be triggered by removing a device:
- Avoid that request_fn() can be invoked on a dead queue.
- Avoid that blk_cleanup_queue() can finish while request_fn is still
   running.
- Fix a race between starved list processing and device removal.
- Avoid that a SCSI LLD callback can be invoked after scsi_remove_host()
   finished.
- Speed up device removal by stopping error handling as soon as
   scsi_remove_host() started.

These patches have been tested on top of kernel v3.7-rc7.

Changes compared to v5:
- Avoid that block layer work can be scheduled on a dead queue.
- Do not invoke any SCSI LLD callback after scsi_remove_host() finished.
- Stop error handling as soon as scsi_remove_host() started.
- Remove the unused function bsg_goose_queue().
- Avoid that scsi_device_set_state() triggers a race condition.

Changes compared to v4:
- Moved queue_flag_set(QUEUE_FLAG_DEAD, q) from blk_drain_queue() into
   blk_cleanup_queue().
- Declared the new __blk_run_queue_uncond() function inline. Checked in
   the generated assembler code that this function is really inlined in
   __blk_run_queue().
- Elaborated several patch descriptions.
- Added sparse annotations to scsi_request_fn().
- Split several patches.

Changes compared to v3:
- Fixed a race condition by setting QUEUE_FLAG_DEAD earlier.
- Added a patch for fixing a race between starved list processing
   and device removal to this series.

Changes compared to v2:
- Split second patch into two patches.
- Refined patch descriptions.

Changes compared to v1:
- Included a patch to rename QUEUE_FLAG_DEAD.
- Refined the descriptions of the __blk_run_queue_uncond() and
   blk_cleanup_queue() functions.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v6 01/13] block: Rename queue dead flag
  2012-11-28 12:39 [PATCH 0/13 v6] More device removal fixes Bart Van Assche
@ 2012-11-28 12:42 ` Bart Van Assche
  2012-11-28 12:43 ` [PATCH v6 02/13] block: Let blk_drain_queue() caller obtain the queue lock Bart Van Assche
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 36+ messages in thread
From: Bart Van Assche @ 2012-11-28 12:42 UTC (permalink / raw)
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Tejun Heo,
	Chanho Min, Hannes Reinecke

QUEUE_FLAG_DEAD is used to indicate that queuing new requests must
stop. After this flag has been set queue draining starts. However,
during the queue draining phase it is still safe to invoke the
queue's request_fn, so QUEUE_FLAG_DYING is a better name for this
flag.

This patch has been generated by running the following command
over the kernel source tree:

git grep -lEw 'blk_queue_dead|QUEUE_FLAG_DEAD' |
    xargs sed -i.tmp -e 's/blk_queue_dead/blk_queue_dying/g'      \
        -e 's/QUEUE_FLAG_DEAD/QUEUE_FLAG_DYING/g';                \
sed -i.tmp -e "s/QUEUE_FLAG_DYING$(printf \\t)*5/QUEUE_FLAG_DYING$(printf \\t)5/g" \
    include/linux/blkdev.h;                                       \
sed -i.tmp -e 's/ DEAD/ DYING/g' -e 's/dead queue/a dying queue/' \
    -e 's/Dead queue/A dying queue/' block/blk-core.c

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Chanho Min <chanho.min@lge.com>
---
 block/blk-cgroup.c      |    2 +-
 block/blk-core.c        |   26 +++++++++++++-------------
 block/blk-exec.c        |    2 +-
 block/blk-sysfs.c       |    4 ++--
 block/blk-throttle.c    |    2 +-
 block/blk.h             |    2 +-
 drivers/scsi/scsi_lib.c |    2 +-
 include/linux/blkdev.h  |    4 ++--
 8 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index d0b7703..5dea4e8 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -231,7 +231,7 @@ struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
 	 * we shouldn't allow anything to go through for a bypassing queue.
 	 */
 	if (unlikely(blk_queue_bypass(q)))
-		return ERR_PTR(blk_queue_dead(q) ? -EINVAL : -EBUSY);
+		return ERR_PTR(blk_queue_dying(q) ? -EINVAL : -EBUSY);
 	return __blkg_lookup_create(blkcg, q, NULL);
 }
 EXPORT_SYMBOL_GPL(blkg_lookup_create);
diff --git a/block/blk-core.c b/block/blk-core.c
index 3c95c4d..adbca41 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -473,20 +473,20 @@ EXPORT_SYMBOL_GPL(blk_queue_bypass_end);
  * blk_cleanup_queue - shutdown a request queue
  * @q: request queue to shutdown
  *
- * Mark @q DEAD, drain all pending requests, destroy and put it.  All
+ * Mark @q DYING, drain all pending requests, destroy and put it.  All
  * future requests will be failed immediately with -ENODEV.
  */
 void blk_cleanup_queue(struct request_queue *q)
 {
 	spinlock_t *lock = q->queue_lock;
 
-	/* mark @q DEAD, no new request or merges will be allowed afterwards */
+	/* mark @q DYING, no new request or merges will be allowed afterwards */
 	mutex_lock(&q->sysfs_lock);
-	queue_flag_set_unlocked(QUEUE_FLAG_DEAD, q);
+	queue_flag_set_unlocked(QUEUE_FLAG_DYING, q);
 	spin_lock_irq(lock);
 
 	/*
-	 * Dead queue is permanently in bypass mode till released.  Note
+	 * A dying queue is permanently in bypass mode till released.  Note
 	 * that, unlike blk_queue_bypass_start(), we aren't performing
 	 * synchronize_rcu() after entering bypass mode to avoid the delay
 	 * as some drivers create and destroy a lot of queues while
@@ -499,11 +499,11 @@ void blk_cleanup_queue(struct request_queue *q)
 
 	queue_flag_set(QUEUE_FLAG_NOMERGES, q);
 	queue_flag_set(QUEUE_FLAG_NOXMERGES, q);
-	queue_flag_set(QUEUE_FLAG_DEAD, q);
+	queue_flag_set(QUEUE_FLAG_DYING, q);
 	spin_unlock_irq(lock);
 	mutex_unlock(&q->sysfs_lock);
 
-	/* drain all requests queued before DEAD marking */
+	/* drain all requests queued before DYING marking */
 	blk_drain_queue(q, true);
 
 	/* @q won't process any more request, flush async actions */
@@ -716,7 +716,7 @@ EXPORT_SYMBOL(blk_init_allocated_queue);
 
 bool blk_get_queue(struct request_queue *q)
 {
-	if (likely(!blk_queue_dead(q))) {
+	if (likely(!blk_queue_dying(q))) {
 		__blk_get_queue(q);
 		return true;
 	}
@@ -870,7 +870,7 @@ static struct request *__get_request(struct request_list *rl, int rw_flags,
 	const bool is_sync = rw_is_sync(rw_flags) != 0;
 	int may_queue;
 
-	if (unlikely(blk_queue_dead(q)))
+	if (unlikely(blk_queue_dying(q)))
 		return NULL;
 
 	may_queue = elv_may_queue(q, rw_flags);
@@ -1050,7 +1050,7 @@ retry:
 	if (rq)
 		return rq;
 
-	if (!(gfp_mask & __GFP_WAIT) || unlikely(blk_queue_dead(q))) {
+	if (!(gfp_mask & __GFP_WAIT) || unlikely(blk_queue_dying(q))) {
 		blk_put_rl(rl);
 		return NULL;
 	}
@@ -1910,7 +1910,7 @@ int blk_insert_cloned_request(struct request_queue *q, struct request *rq)
 		return -EIO;
 
 	spin_lock_irqsave(q->queue_lock, flags);
-	if (unlikely(blk_queue_dead(q))) {
+	if (unlikely(blk_queue_dying(q))) {
 		spin_unlock_irqrestore(q->queue_lock, flags);
 		return -ENODEV;
 	}
@@ -2885,9 +2885,9 @@ static void queue_unplugged(struct request_queue *q, unsigned int depth,
 	trace_block_unplug(q, depth, !from_schedule);
 
 	/*
-	 * Don't mess with dead queue.
+	 * Don't mess with a dying queue.
 	 */
-	if (unlikely(blk_queue_dead(q))) {
+	if (unlikely(blk_queue_dying(q))) {
 		spin_unlock(q->queue_lock);
 		return;
 	}
@@ -2996,7 +2996,7 @@ void blk_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 		/*
 		 * Short-circuit if @q is dead
 		 */
-		if (unlikely(blk_queue_dead(q))) {
+		if (unlikely(blk_queue_dying(q))) {
 			__blk_end_request_all(rq, -ENODEV);
 			continue;
 		}
diff --git a/block/blk-exec.c b/block/blk-exec.c
index f71eac3..59b96eb 100644
--- a/block/blk-exec.c
+++ b/block/blk-exec.c
@@ -66,7 +66,7 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
 
 	spin_lock_irq(q->queue_lock);
 
-	if (unlikely(blk_queue_dead(q))) {
+	if (unlikely(blk_queue_dying(q))) {
 		rq->errors = -ENXIO;
 		if (rq->end_io)
 			rq->end_io(rq, rq->errors);
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index ce62046..7881477 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -466,7 +466,7 @@ queue_attr_show(struct kobject *kobj, struct attribute *attr, char *page)
 	if (!entry->show)
 		return -EIO;
 	mutex_lock(&q->sysfs_lock);
-	if (blk_queue_dead(q)) {
+	if (blk_queue_dying(q)) {
 		mutex_unlock(&q->sysfs_lock);
 		return -ENOENT;
 	}
@@ -488,7 +488,7 @@ queue_attr_store(struct kobject *kobj, struct attribute *attr,
 
 	q = container_of(kobj, struct request_queue, kobj);
 	mutex_lock(&q->sysfs_lock);
-	if (blk_queue_dead(q)) {
+	if (blk_queue_dying(q)) {
 		mutex_unlock(&q->sysfs_lock);
 		return -ENOENT;
 	}
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index a9664fa..3114622 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -302,7 +302,7 @@ static struct throtl_grp *throtl_lookup_create_tg(struct throtl_data *td,
 		/* if %NULL and @q is alive, fall back to root_tg */
 		if (!IS_ERR(blkg))
 			tg = blkg_to_tg(blkg);
-		else if (!blk_queue_dead(q))
+		else if (!blk_queue_dying(q))
 			tg = td_root_tg(td);
 	}
 
diff --git a/block/blk.h b/block/blk.h
index ca51543..2218a8a 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -96,7 +96,7 @@ static inline struct request *__elv_next_request(struct request_queue *q)
 			q->flush_queue_delayed = 1;
 			return NULL;
 		}
-		if (unlikely(blk_queue_dead(q)) ||
+		if (unlikely(blk_queue_dying(q)) ||
 		    !q->elevator->type->ops.elevator_dispatch_fn(q, 0))
 			return NULL;
 	}
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 9032e91..f1bf5af 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1418,7 +1418,7 @@ static int scsi_lld_busy(struct request_queue *q)
 	struct scsi_device *sdev = q->queuedata;
 	struct Scsi_Host *shost;
 
-	if (blk_queue_dead(q))
+	if (blk_queue_dying(q))
 		return 0;
 
 	shost = sdev->host;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 1756001..aba8246 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -437,7 +437,7 @@ struct request_queue {
 #define QUEUE_FLAG_STOPPED	2	/* queue is stopped */
 #define	QUEUE_FLAG_SYNCFULL	3	/* read queue has been filled */
 #define QUEUE_FLAG_ASYNCFULL	4	/* write queue has been filled */
-#define QUEUE_FLAG_DEAD		5	/* queue being torn down */
+#define QUEUE_FLAG_DYING	5	/* queue being torn down */
 #define QUEUE_FLAG_BYPASS	6	/* act as dumb FIFO queue */
 #define QUEUE_FLAG_BIDI		7	/* queue supports bidi requests */
 #define QUEUE_FLAG_NOMERGES     8	/* disable merge attempts */
@@ -521,7 +521,7 @@ static inline void queue_flag_clear(unsigned int flag, struct request_queue *q)
 
 #define blk_queue_tagged(q)	test_bit(QUEUE_FLAG_QUEUED, &(q)->queue_flags)
 #define blk_queue_stopped(q)	test_bit(QUEUE_FLAG_STOPPED, &(q)->queue_flags)
-#define blk_queue_dead(q)	test_bit(QUEUE_FLAG_DEAD, &(q)->queue_flags)
+#define blk_queue_dying(q)	test_bit(QUEUE_FLAG_DYING, &(q)->queue_flags)
 #define blk_queue_bypass(q)	test_bit(QUEUE_FLAG_BYPASS, &(q)->queue_flags)
 #define blk_queue_nomerges(q)	test_bit(QUEUE_FLAG_NOMERGES, &(q)->queue_flags)
 #define blk_queue_noxmerges(q)	\
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v6 02/13] block: Let blk_drain_queue() caller obtain the queue lock
  2012-11-28 12:39 [PATCH 0/13 v6] More device removal fixes Bart Van Assche
  2012-11-28 12:42 ` [PATCH v6 01/13] block: Rename queue dead flag Bart Van Assche
@ 2012-11-28 12:43 ` Bart Van Assche
  2012-11-28 12:44 ` [PATCH v6 03/13] block: Avoid that request_fn is invoked on a dead queue Bart Van Assche
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 36+ messages in thread
From: Bart Van Assche @ 2012-11-28 12:43 UTC (permalink / raw)
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Tejun Heo,
	Chanho Min, Hannes Reinecke

Let the caller of blk_drain_queue() obtain the queue lock to improve
readability of the patch called "Avoid that request_fn is invoked on
a dead queue".

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Chanho Min <chanho.min@lge.com>
---
 block/blk-core.c |   30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index adbca41..9bcf21b 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -349,7 +349,7 @@ void blk_put_queue(struct request_queue *q)
 EXPORT_SYMBOL(blk_put_queue);
 
 /**
- * blk_drain_queue - drain requests from request_queue
+ * __blk_drain_queue - drain requests from request_queue
  * @q: queue to drain
  * @drain_all: whether to drain all requests or only the ones w/ ELVPRIV
  *
@@ -357,15 +357,17 @@ EXPORT_SYMBOL(blk_put_queue);
  * If not, only ELVPRIV requests are drained.  The caller is responsible
  * for ensuring that no new requests which need to be drained are queued.
  */
-void blk_drain_queue(struct request_queue *q, bool drain_all)
+static void __blk_drain_queue(struct request_queue *q, bool drain_all)
+	__releases(q->queue_lock)
+	__acquires(q->queue_lock)
 {
 	int i;
 
+	lockdep_assert_held(q->queue_lock);
+
 	while (true) {
 		bool drain = false;
 
-		spin_lock_irq(q->queue_lock);
-
 		/*
 		 * The caller might be trying to drain @q before its
 		 * elevator is initialized.
@@ -401,11 +403,14 @@ void blk_drain_queue(struct request_queue *q, bool drain_all)
 			}
 		}
 
-		spin_unlock_irq(q->queue_lock);
-
 		if (!drain)
 			break;
+
+		spin_unlock_irq(q->queue_lock);
+
 		msleep(10);
+
+		spin_lock_irq(q->queue_lock);
 	}
 
 	/*
@@ -416,13 +421,9 @@ void blk_drain_queue(struct request_queue *q, bool drain_all)
 	if (q->request_fn) {
 		struct request_list *rl;
 
-		spin_lock_irq(q->queue_lock);
-
 		blk_queue_for_each_rl(rl, q)
 			for (i = 0; i < ARRAY_SIZE(rl->wait); i++)
 				wake_up_all(&rl->wait[i]);
-
-		spin_unlock_irq(q->queue_lock);
 	}
 }
 
@@ -446,7 +447,10 @@ void blk_queue_bypass_start(struct request_queue *q)
 	spin_unlock_irq(q->queue_lock);
 
 	if (drain) {
-		blk_drain_queue(q, false);
+		spin_lock_irq(q->queue_lock);
+		__blk_drain_queue(q, false);
+		spin_unlock_irq(q->queue_lock);
+
 		/* ensure blk_queue_bypass() is %true inside RCU read lock */
 		synchronize_rcu();
 	}
@@ -504,7 +508,9 @@ void blk_cleanup_queue(struct request_queue *q)
 	mutex_unlock(&q->sysfs_lock);
 
 	/* drain all requests queued before DYING marking */
-	blk_drain_queue(q, true);
+	spin_lock_irq(lock);
+	__blk_drain_queue(q, true);
+	spin_unlock_irq(lock);
 
 	/* @q won't process any more request, flush async actions */
 	del_timer_sync(&q->backing_dev_info.laptop_mode_wb_timer);
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v6 03/13] block: Avoid that request_fn is invoked on a dead queue
  2012-11-28 12:39 [PATCH 0/13 v6] More device removal fixes Bart Van Assche
  2012-11-28 12:42 ` [PATCH v6 01/13] block: Rename queue dead flag Bart Van Assche
  2012-11-28 12:43 ` [PATCH v6 02/13] block: Let blk_drain_queue() caller obtain the queue lock Bart Van Assche
@ 2012-11-28 12:44 ` Bart Van Assche
  2012-12-02 13:23   ` Tejun Heo
  2012-11-28 12:45 ` [PATCH v6 04/13] block: Avoid scheduling delayed work " Bart Van Assche
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Bart Van Assche @ 2012-11-28 12:44 UTC (permalink / raw)
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Tejun Heo,
	Chanho Min, Hannes Reinecke

A block driver may start cleaning up resources needed by its
request_fn as soon as blk_cleanup_queue() finished, so request_fn
must not be invoked after draining finished. This is important
when blk_run_queue() is invoked without any requests in progress.
As an example, if blk_drain_queue() and scsi_run_queue() run in
parallel, blk_drain_queue() may have finished all requests after
scsi_run_queue() has taken a SCSI device off the starved list but
before that last function has had a chance to run the queue.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Chanho Min <chanho.min@lge.com>
---
 block/blk-core.c       |   31 +++++++++++++++++++++++++++----
 block/blk-exec.c       |    2 +-
 block/blk.h            |    2 ++
 include/linux/blkdev.h |    2 ++
 4 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 9bcf21b..b8ba9fb 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -293,6 +293,25 @@ void blk_sync_queue(struct request_queue *q)
 EXPORT_SYMBOL(blk_sync_queue);
 
 /**
+ * __blk_run_queue_uncond - run a queue whether or not it has been stopped
+ * @q:	The queue to run
+ *
+ * Description:
+ *    Invoke request handling on a queue if there are any pending requests.
+ *    May be used to restart request handling after a request has completed.
+ *    This variant runs the queue whether or not the queue has been
+ *    stopped. Must be called with the queue lock held and interrupts
+ *    disabled. See also @blk_run_queue.
+ */
+inline void __blk_run_queue_uncond(struct request_queue *q)
+{
+	if (unlikely(blk_queue_dead(q)))
+		return;
+
+	q->request_fn(q);
+}
+
+/**
  * __blk_run_queue - run a single device queue
  * @q:	The queue to run
  *
@@ -305,7 +324,7 @@ void __blk_run_queue(struct request_queue *q)
 	if (unlikely(blk_queue_stopped(q)))
 		return;
 
-	q->request_fn(q);
+	__blk_run_queue_uncond(q);
 }
 EXPORT_SYMBOL(__blk_run_queue);
 
@@ -477,8 +496,8 @@ EXPORT_SYMBOL_GPL(blk_queue_bypass_end);
  * blk_cleanup_queue - shutdown a request queue
  * @q: request queue to shutdown
  *
- * Mark @q DYING, drain all pending requests, destroy and put it.  All
- * future requests will be failed immediately with -ENODEV.
+ * Mark @q DYING, drain all pending requests, mark @q DEAD, destroy and
+ * put it.  All future requests will be failed immediately with -ENODEV.
  */
 void blk_cleanup_queue(struct request_queue *q)
 {
@@ -507,9 +526,13 @@ void blk_cleanup_queue(struct request_queue *q)
 	spin_unlock_irq(lock);
 	mutex_unlock(&q->sysfs_lock);
 
-	/* drain all requests queued before DYING marking */
+	/*
+	 * Drain all requests queued before DYING marking. Set DEAD flag to
+	 * prevent that q->request_fn() gets invoked after draining finished.
+	 */
 	spin_lock_irq(lock);
 	__blk_drain_queue(q, true);
+	queue_flag_set(QUEUE_FLAG_DEAD, q);
 	spin_unlock_irq(lock);
 
 	/* @q won't process any more request, flush async actions */
diff --git a/block/blk-exec.c b/block/blk-exec.c
index 59b96eb..74638ec 100644
--- a/block/blk-exec.c
+++ b/block/blk-exec.c
@@ -78,7 +78,7 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
 	__blk_run_queue(q);
 	/* the queue is stopped so it won't be run */
 	if (is_pm_resume)
-		q->request_fn(q);
+		__blk_run_queue_uncond(q);
 	spin_unlock_irq(q->queue_lock);
 }
 EXPORT_SYMBOL_GPL(blk_execute_rq_nowait);
diff --git a/block/blk.h b/block/blk.h
index 2218a8a..47fdfdd 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -145,6 +145,8 @@ int blk_try_merge(struct request *rq, struct bio *bio);
 
 void blk_queue_congestion_threshold(struct request_queue *q);
 
+void __blk_run_queue_uncond(struct request_queue *q);
+
 int blk_dev_init(void);
 
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index aba8246..8bc46c2 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -452,6 +452,7 @@ struct request_queue {
 #define QUEUE_FLAG_ADD_RANDOM  16	/* Contributes to random pool */
 #define QUEUE_FLAG_SECDISCARD  17	/* supports SECDISCARD */
 #define QUEUE_FLAG_SAME_FORCE  18	/* force complete on same CPU */
+#define QUEUE_FLAG_DEAD        19	/* queue tear-down finished */
 
 #define QUEUE_FLAG_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
 				 (1 << QUEUE_FLAG_STACKABLE)	|	\
@@ -522,6 +523,7 @@ static inline void queue_flag_clear(unsigned int flag, struct request_queue *q)
 #define blk_queue_tagged(q)	test_bit(QUEUE_FLAG_QUEUED, &(q)->queue_flags)
 #define blk_queue_stopped(q)	test_bit(QUEUE_FLAG_STOPPED, &(q)->queue_flags)
 #define blk_queue_dying(q)	test_bit(QUEUE_FLAG_DYING, &(q)->queue_flags)
+#define blk_queue_dead(q)	test_bit(QUEUE_FLAG_DEAD, &(q)->queue_flags)
 #define blk_queue_bypass(q)	test_bit(QUEUE_FLAG_BYPASS, &(q)->queue_flags)
 #define blk_queue_nomerges(q)	test_bit(QUEUE_FLAG_NOMERGES, &(q)->queue_flags)
 #define blk_queue_noxmerges(q)	\
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v6 04/13] block: Avoid scheduling delayed work on a dead queue
  2012-11-28 12:39 [PATCH 0/13 v6] More device removal fixes Bart Van Assche
                   ` (2 preceding siblings ...)
  2012-11-28 12:44 ` [PATCH v6 03/13] block: Avoid that request_fn is invoked on a dead queue Bart Van Assche
@ 2012-11-28 12:45 ` Bart Van Assche
  2012-12-02 13:26   ` Tejun Heo
  2012-11-28 12:46 ` [PATCH v6 05/13] block: Make blk_cleanup_queue() wait until request_fn finished Bart Van Assche
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Bart Van Assche @ 2012-11-28 12:45 UTC (permalink / raw)
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Tejun Heo,
	Chanho Min, Hannes Reinecke

Running a queue must continue after it has been marked dying until
it has been marked dead. So the function blk_run_queue_async() must
not schedule delayed work after blk_cleanup_queue() has marked a queue
dead. Hence add a test for that queue state in blk_run_queue_async()
and make sure that queue_unplugged() invokes that function with the
queue lock held. This avoids that the queue state can change after
it has been tested and before mod_delayed_work() is invoked. Drop
the queue dying test in queue_unplugged() since it is now
superfluous: __blk_run_queue() already tests whether or not the
queue is dead.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Jens Axboe <axboe@kernel.dk>
---
 block/blk-core.c |   33 +++++++++------------------------
 1 file changed, 9 insertions(+), 24 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index b8ba9fb..12deca9 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -219,12 +219,13 @@ static void blk_delay_work(struct work_struct *work)
  * Description:
  *   Sometimes queueing needs to be postponed for a little while, to allow
  *   resources to come back. This function will make sure that queueing is
- *   restarted around the specified time.
+ *   restarted around the specified time. Queue lock must be held.
  */
 void blk_delay_queue(struct request_queue *q, unsigned long msecs)
 {
-	queue_delayed_work(kblockd_workqueue, &q->delay_work,
-				msecs_to_jiffies(msecs));
+	if (likely(!blk_queue_dead(q)))
+		queue_delayed_work(kblockd_workqueue, &q->delay_work,
+				   msecs_to_jiffies(msecs));
 }
 EXPORT_SYMBOL(blk_delay_queue);
 
@@ -334,11 +335,11 @@ EXPORT_SYMBOL(__blk_run_queue);
  *
  * Description:
  *    Tells kblockd to perform the equivalent of @blk_run_queue on behalf
- *    of us.
+ *    of us. The caller must hold the queue lock.
  */
 void blk_run_queue_async(struct request_queue *q)
 {
-	if (likely(!blk_queue_stopped(q)))
+	if (likely(!blk_queue_stopped(q) && !blk_queue_dead(q)))
 		mod_delayed_work(kblockd_workqueue, &q->delay_work, 0);
 }
 EXPORT_SYMBOL(blk_run_queue_async);
@@ -2913,27 +2914,11 @@ static void queue_unplugged(struct request_queue *q, unsigned int depth,
 {
 	trace_block_unplug(q, depth, !from_schedule);
 
-	/*
-	 * Don't mess with a dying queue.
-	 */
-	if (unlikely(blk_queue_dying(q))) {
-		spin_unlock(q->queue_lock);
-		return;
-	}
-
-	/*
-	 * If we are punting this to kblockd, then we can safely drop
-	 * the queue_lock before waking kblockd (which needs to take
-	 * this lock).
-	 */
-	if (from_schedule) {
-		spin_unlock(q->queue_lock);
+	if (from_schedule)
 		blk_run_queue_async(q);
-	} else {
+	else
 		__blk_run_queue(q);
-		spin_unlock(q->queue_lock);
-	}
-
+	spin_unlock(q->queue_lock);
 }
 
 static void flush_plug_callbacks(struct blk_plug *plug, bool from_schedule)
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v6 05/13] block: Make blk_cleanup_queue() wait until request_fn finished
  2012-11-28 12:39 [PATCH 0/13 v6] More device removal fixes Bart Van Assche
                   ` (3 preceding siblings ...)
  2012-11-28 12:45 ` [PATCH v6 04/13] block: Avoid scheduling delayed work " Bart Van Assche
@ 2012-11-28 12:46 ` Bart Van Assche
  2012-12-02 13:28   ` Tejun Heo
  2012-11-28 12:47 ` [PATCH v6 06/13] bsg: Remove unused function bsg_goose_queue() Bart Van Assche
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Bart Van Assche @ 2012-11-28 12:46 UTC (permalink / raw)
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Tejun Heo,
	Chanho Min, Hannes Reinecke

Some request_fn implementations, e.g. scsi_request_fn(), unlock
the queue lock internally. This may result in multiple threads
executing request_fn for the same queue simultaneously. Keep
track of the number of active request_fn calls and make sure that
blk_cleanup_queue() waits until all active request_fn invocations
have finished. A block driver may start cleaning up resources
needed by its request_fn as soon as blk_cleanup_queue() finished,
so blk_cleanup_queue() must wait for all outstanding request_fn
invocations to finish.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reported-by: Chanho Min <chanho.min@lge.com>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
---
 block/blk-core.c       |   10 ++++++++++
 include/linux/blkdev.h |    6 ++++++
 2 files changed, 16 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index 12deca9..4d09af4 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -309,7 +309,16 @@ inline void __blk_run_queue_uncond(struct request_queue *q)
 	if (unlikely(blk_queue_dead(q)))
 		return;
 
+	/*
+	 * Some request_fn implementations, e.g. scsi_request_fn(), unlock
+	 * the queue lock internally. As a result multiple threads may be
+	 * running such a request function concurrently. Keep track of the
+	 * number of active request_fn invocations such that blk_drain_queue()
+	 * can wait until all these request_fn calls have finished.
+	 */
+	q->request_fn_active++;
 	q->request_fn(q);
+	q->request_fn_active--;
 }
 
 /**
@@ -408,6 +417,7 @@ static void __blk_drain_queue(struct request_queue *q, bool drain_all)
 			__blk_run_queue(q);
 
 		drain |= q->nr_rqs_elvpriv;
+		drain |= q->request_fn_active;
 
 		/*
 		 * Unfortunately, requests are queued at and tracked from
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 8bc46c2..c9d233e 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -378,6 +378,12 @@ struct request_queue {
 
 	unsigned int		nr_sorted;
 	unsigned int		in_flight[2];
+	/*
+	 * Number of active block driver functions for which blk_drain_queue()
+	 * must wait. Must be incremented around functions that unlock the
+	 * queue_lock internally, e.g. scsi_request_fn().
+	 */
+	unsigned int		request_fn_active;
 
 	unsigned int		rq_timeout;
 	struct timer_list	timeout;
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v6 06/13] bsg: Remove unused function bsg_goose_queue()
  2012-11-28 12:39 [PATCH 0/13 v6] More device removal fixes Bart Van Assche
                   ` (4 preceding siblings ...)
  2012-11-28 12:46 ` [PATCH v6 05/13] block: Make blk_cleanup_queue() wait until request_fn finished Bart Van Assche
@ 2012-11-28 12:47 ` Bart Van Assche
  2012-12-02 13:29   ` Tejun Heo
  2012-11-28 12:48 ` [PATCH v6 07/13] Fix race between starved list processing and device removal Bart Van Assche
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Bart Van Assche @ 2012-11-28 12:47 UTC (permalink / raw)
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Tejun Heo,
	Chanho Min, Hannes Reinecke

The function bsg_goose_queue() does not have any in-tree callers,
so let's remove it.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
---
 block/bsg-lib.c         |   13 -------------
 include/linux/bsg-lib.h |    1 -
 2 files changed, 14 deletions(-)

diff --git a/block/bsg-lib.c b/block/bsg-lib.c
index deee61f..650f427 100644
--- a/block/bsg-lib.c
+++ b/block/bsg-lib.c
@@ -151,19 +151,6 @@ failjob_rls_job:
 	return -ENOMEM;
 }
 
-/*
- * bsg_goose_queue - restart queue in case it was stopped
- * @q: request q to be restarted
- */
-void bsg_goose_queue(struct request_queue *q)
-{
-	if (!q)
-		return;
-
-	blk_run_queue_async(q);
-}
-EXPORT_SYMBOL_GPL(bsg_goose_queue);
-
 /**
  * bsg_request_fn - generic handler for bsg requests
  * @q: request queue to manage
diff --git a/include/linux/bsg-lib.h b/include/linux/bsg-lib.h
index 4d0fb3d..a226652 100644
--- a/include/linux/bsg-lib.h
+++ b/include/linux/bsg-lib.h
@@ -67,6 +67,5 @@ void bsg_job_done(struct bsg_job *job, int result,
 int bsg_setup_queue(struct device *dev, struct request_queue *q, char *name,
 		    bsg_job_fn *job_fn, int dd_job_size);
 void bsg_request_fn(struct request_queue *q);
-void bsg_goose_queue(struct request_queue *q);
 
 #endif
-- 
1.7.10.4



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v6 07/13] Fix race between starved list processing and device removal
  2012-11-28 12:39 [PATCH 0/13 v6] More device removal fixes Bart Van Assche
                   ` (5 preceding siblings ...)
  2012-11-28 12:47 ` [PATCH v6 06/13] bsg: Remove unused function bsg_goose_queue() Bart Van Assche
@ 2012-11-28 12:48 ` Bart Van Assche
  2012-12-02 13:32   ` Tejun Heo
  2012-11-28 12:48 ` [PATCH v6 08/13] Remove get_device() / put_device() pair from scsi_request_fn() Bart Van Assche
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Bart Van Assche @ 2012-11-28 12:48 UTC (permalink / raw)
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Tejun Heo,
	Chanho Min, Hannes Reinecke

Avoid that the sdev reference count can drop to zero before
the queue is run by scsi_run_queue(). Also avoid that the sdev
reference count can drop to zero in the same function by invoking
__blk_run_queue().

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reported-and-tested-by: Chanho Min <chanho.min@lge.com>
Reference: http://lkml.org/lkml/2012/8/2/96
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: <stable@vger.kernel.org>
---
 drivers/scsi/scsi_lib.c   |   12 +++++++-----
 drivers/scsi/scsi_sysfs.c |    7 ++++++-
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f1bf5af..8d772df 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -452,11 +452,13 @@ static void scsi_run_queue(struct request_queue *q)
 			continue;
 		}
 
-		spin_unlock(shost->host_lock);
-		spin_lock(sdev->request_queue->queue_lock);
-		__blk_run_queue(sdev->request_queue);
-		spin_unlock(sdev->request_queue->queue_lock);
-		spin_lock(shost->host_lock);
+		get_device(&sdev->sdev_gendev);
+		spin_unlock_irqrestore(shost->host_lock, flags);
+
+		blk_run_queue(sdev->request_queue);
+		put_device(&sdev->sdev_gendev);
+
+		spin_lock_irqsave(shost->host_lock, flags);
 	}
 	/* put any unprocessed entries back */
 	list_splice(&starved_list, &shost->starved_list);
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index ce5224c..2661a957 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -348,7 +348,6 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	starget->reap_ref++;
 	list_del(&sdev->siblings);
 	list_del(&sdev->same_target_siblings);
-	list_del(&sdev->starved_entry);
 	spin_unlock_irqrestore(sdev->host->host_lock, flags);
 
 	cancel_work_sync(&sdev->event_work);
@@ -956,6 +955,8 @@ int scsi_sysfs_add_sdev(struct scsi_device *sdev)
 void __scsi_remove_device(struct scsi_device *sdev)
 {
 	struct device *dev = &sdev->sdev_gendev;
+	struct Scsi_Host *shost = sdev->host;
+	unsigned long flags;
 
 	if (sdev->is_visible) {
 		if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0)
@@ -977,6 +978,10 @@ void __scsi_remove_device(struct scsi_device *sdev)
 	blk_cleanup_queue(sdev->request_queue);
 	cancel_work_sync(&sdev->requeue_work);
 
+	spin_lock_irqsave(shost->host_lock, flags);
+	list_del(&sdev->starved_entry);
+	spin_unlock_irqrestore(shost->host_lock, flags);
+
 	if (sdev->host->hostt->slave_destroy)
 		sdev->host->hostt->slave_destroy(sdev);
 	transport_destroy_device(dev);
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v6 08/13] Remove get_device() / put_device() pair from scsi_request_fn()
  2012-11-28 12:39 [PATCH 0/13 v6] More device removal fixes Bart Van Assche
                   ` (6 preceding siblings ...)
  2012-11-28 12:48 ` [PATCH v6 07/13] Fix race between starved list processing and device removal Bart Van Assche
@ 2012-11-28 12:48 ` Bart Van Assche
  2012-12-02 13:34   ` Tejun Heo
  2012-11-28 12:50 ` [PATCH v6 09/13] Avoid saving/restoring interrupt state inside scsi_remove_host() Bart Van Assche
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Bart Van Assche @ 2012-11-28 12:48 UTC (permalink / raw)
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Tejun Heo,
	Chanho Min, Hannes Reinecke

Now that all scsi_request_fn() callers hold a reference on the
SCSI device that function is invoked on and since
blk_cleanup_queue() waits until scsi_request_fn() has finished
it is safe to remove the get_device() / put_device() pair from
scsi_request_fn().

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Jens Axboe <axboe@kernel.dk>
---
 drivers/scsi/scsi_lib.c |   12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 8d772df..cfc420b 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1525,16 +1525,14 @@ static void scsi_softirq_done(struct request *rq)
  * Lock status: IO request lock assumed to be held when called.
  */
 static void scsi_request_fn(struct request_queue *q)
+	__releases(q->queue_lock)
+	__acquires(q->queue_lock)
 {
 	struct scsi_device *sdev = q->queuedata;
 	struct Scsi_Host *shost;
 	struct scsi_cmnd *cmd;
 	struct request *req;
 
-	if(!get_device(&sdev->sdev_gendev))
-		/* We must be tearing the block queue down already */
-		return;
-
 	/*
 	 * To start with, we keep looping until the queue is empty, or until
 	 * the host is no longer able to accept any more requests.
@@ -1643,11 +1641,7 @@ out_delay:
 	if (sdev->device_busy == 0)
 		blk_delay_queue(q, SCSI_QUEUE_DELAY);
 out:
-	/* must be careful here...if we trigger the ->remove() function
-	 * we cannot be holding the q lock */
-	spin_unlock_irq(q->queue_lock);
-	put_device(&sdev->sdev_gendev);
-	spin_lock_irq(q->queue_lock);
+	;
 }
 
 u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v6 09/13] Avoid saving/restoring interrupt state inside scsi_remove_host()
  2012-11-28 12:39 [PATCH 0/13 v6] More device removal fixes Bart Van Assche
                   ` (7 preceding siblings ...)
  2012-11-28 12:48 ` [PATCH v6 08/13] Remove get_device() / put_device() pair from scsi_request_fn() Bart Van Assche
@ 2012-11-28 12:50 ` Bart Van Assche
  2012-12-02 13:35   ` Tejun Heo
  2012-11-28 12:51 ` [PATCH v6 10/13] Make scsi_remove_host() wait for device removal Bart Van Assche
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Bart Van Assche @ 2012-11-28 12:50 UTC (permalink / raw)
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Tejun Heo,
	Chanho Min, Hannes Reinecke

Since it is not allowed to invoke scsi_remove_host() with interrupts
disabled, avoid saving and restoring the interrupt state inside
scsi_remove_host().

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/hosts.c |   12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 593085a..6ae16cd 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -156,27 +156,25 @@ EXPORT_SYMBOL(scsi_host_set_state);
  **/
 void scsi_remove_host(struct Scsi_Host *shost)
 {
-	unsigned long flags;
-
 	mutex_lock(&shost->scan_mutex);
-	spin_lock_irqsave(shost->host_lock, flags);
+	spin_lock_irq(shost->host_lock);
 	if (scsi_host_set_state(shost, SHOST_CANCEL))
 		if (scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY)) {
-			spin_unlock_irqrestore(shost->host_lock, flags);
+			spin_unlock_irq(shost->host_lock);
 			mutex_unlock(&shost->scan_mutex);
 			return;
 		}
-	spin_unlock_irqrestore(shost->host_lock, flags);
+	spin_unlock_irq(shost->host_lock);
 
 	scsi_autopm_get_host(shost);
 	scsi_forget_host(shost);
 	mutex_unlock(&shost->scan_mutex);
 	scsi_proc_host_rm(shost);
 
-	spin_lock_irqsave(shost->host_lock, flags);
+	spin_lock_irq(shost->host_lock);
 	if (scsi_host_set_state(shost, SHOST_DEL))
 		BUG_ON(scsi_host_set_state(shost, SHOST_DEL_RECOVERY));
-	spin_unlock_irqrestore(shost->host_lock, flags);
+	spin_unlock_irq(shost->host_lock);
 
 	transport_unregister_device(&shost->shost_gendev);
 	device_unregister(&shost->shost_dev);
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v6 10/13] Make scsi_remove_host() wait for device removal
  2012-11-28 12:39 [PATCH 0/13 v6] More device removal fixes Bart Van Assche
                   ` (8 preceding siblings ...)
  2012-11-28 12:50 ` [PATCH v6 09/13] Avoid saving/restoring interrupt state inside scsi_remove_host() Bart Van Assche
@ 2012-11-28 12:51 ` Bart Van Assche
  2012-12-02 13:45   ` Tejun Heo
  2012-11-28 12:52 ` [PATCH v6 11/13] Make scsi_remove_host() wait until error handling finished Bart Van Assche
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Bart Van Assche @ 2012-11-28 12:51 UTC (permalink / raw)
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Tejun Heo,
	Chanho Min, Hannes Reinecke

SCSI LLDs may start cleaning up host resources needed by their
queuecommand() callback as soon as scsi_remove_host() finished.
Hence scsi_remove_host() must wait until blk_cleanup_queue() for
all devices associated with the host has finished. That avoids
that queuecommand() gets invoked after scsi_remove_host()
finished. Also, avoid adding new SCSI devices after
scsi_remove_host() started.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/hosts.c      |   30 ++++++++++++++++++++++++++++++
 drivers/scsi/scsi_priv.h  |    1 +
 drivers/scsi/scsi_sysfs.c |    1 +
 include/scsi/scsi_host.h  |    1 +
 4 files changed, 33 insertions(+)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 6ae16cd..7bd944e 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -150,12 +150,31 @@ int scsi_host_set_state(struct Scsi_Host *shost, enum scsi_host_state state)
 }
 EXPORT_SYMBOL(scsi_host_set_state);
 
+/* Return true if and only if scsi_remove_host() is allowed to finish. */
+static bool __scsi_remove_host_done(struct Scsi_Host *shost)
+{
+	lockdep_assert_held(shost->host_lock);
+
+	return list_empty(&shost->__devices);
+}
+
+/* Test whether scsi_remove_host() may finish, and if so, wake it up. */
+void __scsi_check_remove_host_done(struct Scsi_Host *shost)
+{
+	lockdep_assert_held(shost->host_lock);
+
+	if (__scsi_remove_host_done(shost))
+		wake_up(&shost->remove_host);
+}
+
 /**
  * scsi_remove_host - remove a scsi host
  * @shost:	a pointer to a scsi host to remove
  **/
 void scsi_remove_host(struct Scsi_Host *shost)
 {
+	DEFINE_WAIT(wait);
+
 	mutex_lock(&shost->scan_mutex);
 	spin_lock_irq(shost->host_lock);
 	if (scsi_host_set_state(shost, SHOST_CANCEL))
@@ -174,6 +193,16 @@ void scsi_remove_host(struct Scsi_Host *shost)
 	spin_lock_irq(shost->host_lock);
 	if (scsi_host_set_state(shost, SHOST_DEL))
 		BUG_ON(scsi_host_set_state(shost, SHOST_DEL_RECOVERY));
+	while (!__scsi_remove_host_done(shost)) {
+		prepare_to_wait_exclusive(&shost->remove_host, &wait,
+					  TASK_INTERRUPTIBLE);
+		if (__scsi_remove_host_done(shost))
+			break;
+		spin_unlock_irq(shost->host_lock);
+		schedule();
+		spin_lock_irq(shost->host_lock);
+	}
+	finish_wait(&shost->remove_host, &wait);
 	spin_unlock_irq(shost->host_lock);
 
 	transport_unregister_device(&shost->shost_gendev);
@@ -349,6 +378,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	shost->shost_state = SHOST_CREATED;
 	INIT_LIST_HEAD(&shost->__devices);
 	INIT_LIST_HEAD(&shost->__targets);
+	init_waitqueue_head(&shost->remove_host);
 	INIT_LIST_HEAD(&shost->eh_cmd_q);
 	INIT_LIST_HEAD(&shost->starved_list);
 	init_waitqueue_head(&shost->host_wait);
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 8f9a0ca..143517c 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -26,6 +26,7 @@ struct scsi_nl_hdr;
 /* hosts.c */
 extern int scsi_init_hosts(void);
 extern void scsi_exit_hosts(void);
+extern void __scsi_check_remove_host_done(struct Scsi_Host *shost);
 
 /* scsi.c */
 extern int scsi_dispatch_cmd(struct scsi_cmnd *cmd);
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 2661a957..89195e2 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -348,6 +348,7 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	starget->reap_ref++;
 	list_del(&sdev->siblings);
 	list_del(&sdev->same_target_siblings);
+	__scsi_check_remove_host_done(sdev->host);
 	spin_unlock_irqrestore(sdev->host->host_lock, flags);
 
 	cancel_work_sync(&sdev->event_work);
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 4908480..1b7fd89 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -577,6 +577,7 @@ struct Scsi_Host {
 	struct completion     * eh_action; /* Wait for specific actions on the
 					      host. */
 	wait_queue_head_t       host_wait;
+	wait_queue_head_t	remove_host;
 	struct scsi_host_template *hostt;
 	struct scsi_transport_template *transportt;
 
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v6 11/13] Make scsi_remove_host() wait until error handling finished
  2012-11-28 12:39 [PATCH 0/13 v6] More device removal fixes Bart Van Assche
                   ` (9 preceding siblings ...)
  2012-11-28 12:51 ` [PATCH v6 10/13] Make scsi_remove_host() wait for device removal Bart Van Assche
@ 2012-11-28 12:52 ` Bart Van Assche
  2012-12-02 13:51   ` Tejun Heo
  2012-11-28 12:53 ` [PATCH v6 12/13] Avoid that scsi_device_set_state() triggers a race Bart Van Assche
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Bart Van Assche @ 2012-11-28 12:52 UTC (permalink / raw)
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Tejun Heo,
	Chanho Min, Hannes Reinecke

A SCSI LLD may start cleaning up host resources as soon as
scsi_remove_host() returns. These host resources may be needed by
the LLD in an implementation of one of the eh_* functions. So if
one of the eh_* functions is in progress when scsi_remove_host()
is invoked, wait until the eh_* function has finished. Also, do
not invoke any of the eh_* functions after scsi_remove_host() has
started.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Tejun Heo <tj@kernel.org>
---
 drivers/scsi/hosts.c      |    2 +-
 drivers/scsi/scsi_error.c |  114 ++++++++++++++++++++++++++++++++++-----------
 include/scsi/scsi_host.h  |    1 +
 3 files changed, 89 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 7bd944e..477b8d6 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -155,7 +155,7 @@ static bool __scsi_remove_host_done(struct Scsi_Host *shost)
 {
 	lockdep_assert_held(shost->host_lock);
 
-	return list_empty(&shost->__devices);
+	return list_empty(&shost->__devices) && !shost->eh_active;
 }
 
 /* Test whether scsi_remove_host() may finish, and if so, wake it up. */
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index c1b05a8..cab2ac3 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -536,8 +536,52 @@ static void scsi_eh_done(struct scsi_cmnd *scmd)
 }
 
 /**
+ * scsi_begin_eh - start host-related error handling
+ *
+ * Must be called before invoking any of the scsi_host_template.eh_* functions
+ * to avoid that scsi_remove_host() returns while one of these callback
+ * functions is in progress.
+ *
+ * Returns true if invoking the eh_* function is allowed and false if not.
+ * If this function returns true then scsi_end_eh() must be called eventually.
+ *
+ * Note: scsi_send_eh_cmnd() calls do not have to be protected by a
+ * scsi_begin_eh() / scsi_end_eh() pair since these operate on an unfinished
+ * block layer request. Since scsi_remove_host() waits until all SCSI devices
+ * have been removed and since blk_cleanup_queue() is invoked during SCSI
+ * device removal scsi_remove_host() won't finish while a scsi_send_eh_cmnd()
+ * call is in progress.
+ */
+static bool scsi_begin_eh(struct Scsi_Host *host)
+{
+	bool res;
+
+	spin_lock_irq(host->host_lock);
+	res = scsi_host_scan_allowed(host);
+	if (res) {
+		WARN_ON_ONCE(host->eh_active < 0 || host->eh_active > 1);
+		host->eh_active++;
+	}
+	spin_unlock_irq(host->host_lock);
+
+	return res;
+}
+
+/**
+ * scsi_end_eh - finish host-related error handling
+ */
+static void scsi_end_eh(struct Scsi_Host *host)
+{
+	spin_lock_irq(host->host_lock);
+	host->eh_active--;
+	WARN_ON_ONCE(host->eh_active < 0 || host->eh_active > 1);
+	__scsi_check_remove_host_done(host);
+	spin_unlock_irq(host->host_lock);
+}
+
+/**
  * scsi_try_host_reset - ask host adapter to reset itself
- * @scmd:	SCSI cmd to send hsot reset.
+ * @scmd:	SCSI cmd to send host reset.
  */
 static int scsi_try_host_reset(struct scsi_cmnd *scmd)
 {
@@ -552,14 +596,17 @@ static int scsi_try_host_reset(struct scsi_cmnd *scmd)
 	if (!hostt->eh_host_reset_handler)
 		return FAILED;
 
-	rtn = hostt->eh_host_reset_handler(scmd);
-
-	if (rtn == SUCCESS) {
-		if (!hostt->skip_settle_delay)
-			ssleep(HOST_RESET_SETTLE_TIME);
-		spin_lock_irqsave(host->host_lock, flags);
-		scsi_report_bus_reset(host, scmd_channel(scmd));
-		spin_unlock_irqrestore(host->host_lock, flags);
+	rtn = FAST_IO_FAIL;
+	if (scsi_begin_eh(host)) {
+		rtn = hostt->eh_host_reset_handler(scmd);
+		if (rtn == SUCCESS) {
+			if (!hostt->skip_settle_delay)
+				ssleep(HOST_RESET_SETTLE_TIME);
+			spin_lock_irqsave(host->host_lock, flags);
+			scsi_report_bus_reset(host, scmd_channel(scmd));
+			spin_unlock_irqrestore(host->host_lock, flags);
+		}
+		scsi_end_eh(host);
 	}
 
 	return rtn;
@@ -582,14 +629,17 @@ static int scsi_try_bus_reset(struct scsi_cmnd *scmd)
 	if (!hostt->eh_bus_reset_handler)
 		return FAILED;
 
-	rtn = hostt->eh_bus_reset_handler(scmd);
-
-	if (rtn == SUCCESS) {
-		if (!hostt->skip_settle_delay)
-			ssleep(BUS_RESET_SETTLE_TIME);
-		spin_lock_irqsave(host->host_lock, flags);
-		scsi_report_bus_reset(host, scmd_channel(scmd));
-		spin_unlock_irqrestore(host->host_lock, flags);
+	rtn = FAST_IO_FAIL;
+	if (scsi_begin_eh(host)) {
+		rtn = hostt->eh_bus_reset_handler(scmd);
+		if (rtn == SUCCESS) {
+			if (!hostt->skip_settle_delay)
+				ssleep(BUS_RESET_SETTLE_TIME);
+			spin_lock_irqsave(host->host_lock, flags);
+			scsi_report_bus_reset(host, scmd_channel(scmd));
+			spin_unlock_irqrestore(host->host_lock, flags);
+		}
+		scsi_end_eh(host);
 	}
 
 	return rtn;
@@ -621,12 +671,17 @@ static int scsi_try_target_reset(struct scsi_cmnd *scmd)
 	if (!hostt->eh_target_reset_handler)
 		return FAILED;
 
-	rtn = hostt->eh_target_reset_handler(scmd);
-	if (rtn == SUCCESS) {
-		spin_lock_irqsave(host->host_lock, flags);
-		__starget_for_each_device(scsi_target(scmd->device), NULL,
-					  __scsi_report_device_reset);
-		spin_unlock_irqrestore(host->host_lock, flags);
+	rtn = FAST_IO_FAIL;
+	if (scsi_begin_eh(host)) {
+		rtn = hostt->eh_target_reset_handler(scmd);
+		if (rtn == SUCCESS) {
+			spin_lock_irqsave(host->host_lock, flags);
+			__starget_for_each_device(scsi_target(scmd->device),
+						  NULL,
+						  __scsi_report_device_reset);
+			spin_unlock_irqrestore(host->host_lock, flags);
+		}
+		scsi_end_eh(host);
 	}
 
 	return rtn;
@@ -645,14 +700,19 @@ static int scsi_try_target_reset(struct scsi_cmnd *scmd)
 static int scsi_try_bus_device_reset(struct scsi_cmnd *scmd)
 {
 	int rtn;
-	struct scsi_host_template *hostt = scmd->device->host->hostt;
+	struct Scsi_Host *host = scmd->device->host;
+	struct scsi_host_template *hostt = host->hostt;
 
 	if (!hostt->eh_device_reset_handler)
 		return FAILED;
 
-	rtn = hostt->eh_device_reset_handler(scmd);
-	if (rtn == SUCCESS)
-		__scsi_report_device_reset(scmd->device, NULL);
+	rtn = FAST_IO_FAIL;
+	if (scsi_begin_eh(host)) {
+		rtn = hostt->eh_device_reset_handler(scmd);
+		if (rtn == SUCCESS)
+			__scsi_report_device_reset(scmd->device, NULL);
+		scsi_end_eh(host);
+	}
 	return rtn;
 }
 
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 1b7fd89..5e2fcd2 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -576,6 +576,7 @@ struct Scsi_Host {
 	struct task_struct    * ehandler;  /* Error recovery thread. */
 	struct completion     * eh_action; /* Wait for specific actions on the
 					      host. */
+	int			eh_active;
 	wait_queue_head_t       host_wait;
 	wait_queue_head_t	remove_host;
 	struct scsi_host_template *hostt;
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v6 12/13] Avoid that scsi_device_set_state() triggers a race
  2012-11-28 12:39 [PATCH 0/13 v6] More device removal fixes Bart Van Assche
                   ` (10 preceding siblings ...)
  2012-11-28 12:52 ` [PATCH v6 11/13] Make scsi_remove_host() wait until error handling finished Bart Van Assche
@ 2012-11-28 12:53 ` Bart Van Assche
  2012-12-02 13:53   ` Tejun Heo
  2012-11-28 12:53 ` [PATCH v6 13/13] Do not queue new I/O after scsi_remove_host() started Bart Van Assche
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Bart Van Assche @ 2012-11-28 12:53 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Tejun Heo,
	Chanho Min, Hannes Reinecke

Make concurrent invocations of scsi_device_set_state() safe.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Mike Christie <michaelc@cs.wisc.edu>
---
 drivers/scsi/scsi_lib.c   |   24 ++++++++++++------------
 drivers/scsi/scsi_sysfs.c |    9 ++++++++-
 2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index cfc420b..f3d6e0d 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2072,7 +2072,9 @@ EXPORT_SYMBOL(scsi_test_unit_ready);
  *	@state:	state to change to.
  *
  *	Returns zero if unsuccessful or an error if the requested 
- *	transition is illegal.
+ *	transition is illegal. It is the responsibility of the caller to make
+ *      sure that a call of this function does not race with other code that
+ *      accesses the device state, e.g. by holding the host lock.
  */
 int
 scsi_device_set_state(struct scsi_device *sdev, enum scsi_device_state state)
@@ -2433,21 +2435,19 @@ scsi_internal_device_block(struct scsi_device *sdev)
 	unsigned long flags;
 	int err = 0;
 
+	spin_lock_irqsave(q->queue_lock, flags);
 	err = scsi_device_set_state(sdev, SDEV_BLOCK);
-	if (err) {
+	if (err)
 		err = scsi_device_set_state(sdev, SDEV_CREATED_BLOCK);
 
-		if (err)
-			return err;
+	if (err == 0) {
+		/*
+		 * The device has transitioned to SDEV_BLOCK.  Stop the
+		 * block layer from calling the midlayer with this device's
+		 * request queue.
+		 */
+		blk_stop_queue(q);
 	}
-
-	/* 
-	 * The device has transitioned to SDEV_BLOCK.  Stop the
-	 * block layer from calling the midlayer with this device's
-	 * request queue. 
-	 */
-	spin_lock_irqsave(q->queue_lock, flags);
-	blk_stop_queue(q);
 	spin_unlock_irqrestore(q->queue_lock, flags);
 
 	return 0;
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 89195e2..3f5197e 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -960,8 +960,12 @@ void __scsi_remove_device(struct scsi_device *sdev)
 	unsigned long flags;
 
 	if (sdev->is_visible) {
-		if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0)
+		spin_lock_irqsave(shost->host_lock, flags);
+		if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0) {
+			spin_unlock_irqrestore(shost->host_lock, flags);
 			return;
+		}
+		spin_unlock_irqrestore(shost->host_lock, flags);
 
 		bsg_unregister_queue(sdev->request_queue);
 		device_unregister(&sdev->sdev_dev);
@@ -975,7 +979,10 @@ void __scsi_remove_device(struct scsi_device *sdev)
 	 * scsi_run_queue() invocations have finished before tearing down the
 	 * device.
 	 */
+	spin_lock_irqsave(shost->host_lock, flags);
 	scsi_device_set_state(sdev, SDEV_DEL);
+	spin_unlock_irqrestore(shost->host_lock, flags);
+
 	blk_cleanup_queue(sdev->request_queue);
 	cancel_work_sync(&sdev->requeue_work);
 
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v6 13/13] Do not queue new I/O after scsi_remove_host() started
  2012-11-28 12:39 [PATCH 0/13 v6] More device removal fixes Bart Van Assche
                   ` (11 preceding siblings ...)
  2012-11-28 12:53 ` [PATCH v6 12/13] Avoid that scsi_device_set_state() triggers a race Bart Van Assche
@ 2012-11-28 12:53 ` Bart Van Assche
  2012-12-02 13:58   ` Tejun Heo
  2012-12-02 14:02 ` [PATCH 0/13 v6] More device removal fixes Tejun Heo
  2012-12-06 13:33 ` Jens Axboe
  14 siblings, 1 reply; 36+ messages in thread
From: Bart Van Assche @ 2012-11-28 12:53 UTC (permalink / raw)
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Tejun Heo,
	Chanho Min, Hannes Reinecke

The function scsi_remove_host() may get invoked concurrently with
scsi_request_fn(). Kill those I/O requests for which processing
starts after scsi_remove_host() has been invoked. This makes
device removal a little quicker by avoiding that such SCSI
commands time out.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Tejun Heo <tj@kernel.org>
---
 drivers/scsi/scsi_lib.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f3d6e0d..5fe25b3 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1549,7 +1549,8 @@ static void scsi_request_fn(struct request_queue *q)
 		if (!req || !scsi_dev_queue_ready(q, sdev))
 			break;
 
-		if (unlikely(!scsi_device_online(sdev))) {
+		if (unlikely(!scsi_device_online(sdev) ||
+			     !scsi_host_scan_allowed(shost))) {
 			sdev_printk(KERN_ERR, sdev,
 				    "rejecting I/O to offline device\n");
 			scsi_kill_request(req, q);
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 03/13] block: Avoid that request_fn is invoked on a dead queue
  2012-11-28 12:44 ` [PATCH v6 03/13] block: Avoid that request_fn is invoked on a dead queue Bart Van Assche
@ 2012-12-02 13:23   ` Tejun Heo
  2012-12-02 13:35     ` Bart Van Assche
  0 siblings, 1 reply; 36+ messages in thread
From: Tejun Heo @ 2012-12-02 13:23 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

On Wed, Nov 28, 2012 at 01:44:53PM +0100, Bart Van Assche wrote:
> A block driver may start cleaning up resources needed by its
> request_fn as soon as blk_cleanup_queue() finished, so request_fn
> must not be invoked after draining finished. This is important
> when blk_run_queue() is invoked without any requests in progress.
> As an example, if blk_drain_queue() and scsi_run_queue() run in
> parallel, blk_drain_queue() may have finished all requests after
> scsi_run_queue() has taken a SCSI device off the starved list but
> before that last function has had a chance to run the queue.
> 
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> Cc: James Bottomley <JBottomley@Parallels.com>
> Cc: Mike Christie <michaelc@cs.wisc.edu>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Chanho Min <chanho.min@lge.com>

Acked-by: Tejun Heo <tj@kernel.org>

> +inline void __blk_run_queue_uncond(struct request_queue *q)

Does this inline actually make a difference in the generated code?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 04/13] block: Avoid scheduling delayed work on a dead queue
  2012-11-28 12:45 ` [PATCH v6 04/13] block: Avoid scheduling delayed work " Bart Van Assche
@ 2012-12-02 13:26   ` Tejun Heo
  2012-12-02 13:41     ` Bart Van Assche
  0 siblings, 1 reply; 36+ messages in thread
From: Tejun Heo @ 2012-12-02 13:26 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

On Wed, Nov 28, 2012 at 01:45:56PM +0100, Bart Van Assche wrote:
> Running a queue must continue after it has been marked dying until
> it has been marked dead. So the function blk_run_queue_async() must
> not schedule delayed work after blk_cleanup_queue() has marked a queue
> dead. Hence add a test for that queue state in blk_run_queue_async()
> and make sure that queue_unplugged() invokes that function with the
> queue lock held. This avoids that the queue state can change after
> it has been tested and before mod_delayed_work() is invoked. Drop
> the queue dying test in queue_unplugged() since it is now
> superfluous: __blk_run_queue() already tests whether or not the
> queue is dead.
> 
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Mike Christie <michaelc@cs.wisc.edu>
> Cc: Jens Axboe <axboe@kernel.dk>

Acked-by: Tejun Heo <tj@kernel.org>

But, some nits.

> @@ -219,12 +219,13 @@ static void blk_delay_work(struct work_struct *work)
>   * Description:
>   *   Sometimes queueing needs to be postponed for a little while, to allow
>   *   resources to come back. This function will make sure that queueing is
> - *   restarted around the specified time.
> + *   restarted around the specified time. Queue lock must be held.
>   */

Comment is fine but lockdep_assert_held() is way more effective.

>  void blk_delay_queue(struct request_queue *q, unsigned long msecs)
>  {
> -	queue_delayed_work(kblockd_workqueue, &q->delay_work,
> -				msecs_to_jiffies(msecs));
> +	if (likely(!blk_queue_dead(q)))
> +		queue_delayed_work(kblockd_workqueue, &q->delay_work,
> +				   msecs_to_jiffies(msecs));
>  }
>  EXPORT_SYMBOL(blk_delay_queue);
>  
> @@ -334,11 +335,11 @@ EXPORT_SYMBOL(__blk_run_queue);
>   *
>   * Description:
>   *    Tells kblockd to perform the equivalent of @blk_run_queue on behalf
> - *    of us.
> + *    of us. The caller must hold the queue lock.
>   */

Ditto.

>  void blk_run_queue_async(struct request_queue *q)
>  {
> -	if (likely(!blk_queue_stopped(q)))
> +	if (likely(!blk_queue_stopped(q) && !blk_queue_dead(q)))
>  		mod_delayed_work(kblockd_workqueue, &q->delay_work, 0);
>  }
>  EXPORT_SYMBOL(blk_run_queue_async);

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 05/13] block: Make blk_cleanup_queue() wait until request_fn finished
  2012-11-28 12:46 ` [PATCH v6 05/13] block: Make blk_cleanup_queue() wait until request_fn finished Bart Van Assche
@ 2012-12-02 13:28   ` Tejun Heo
  0 siblings, 0 replies; 36+ messages in thread
From: Tejun Heo @ 2012-12-02 13:28 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

On Wed, Nov 28, 2012 at 01:46:45PM +0100, Bart Van Assche wrote:
> Some request_fn implementations, e.g. scsi_request_fn(), unlock
> the queue lock internally. This may result in multiple threads
> executing request_fn for the same queue simultaneously. Keep
> track of the number of active request_fn calls and make sure that
> blk_cleanup_queue() waits until all active request_fn invocations
> have finished. A block driver may start cleaning up resources
> needed by its request_fn as soon as blk_cleanup_queue() finished,
> so blk_cleanup_queue() must wait for all outstanding request_fn
> invocations to finish.
> 
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> Reported-by: Chanho Min <chanho.min@lge.com>
> Cc: James Bottomley <JBottomley@Parallels.com>
> Cc: Mike Christie <michaelc@cs.wisc.edu>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Tejun Heo <tj@kernel.org>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 06/13] bsg: Remove unused function bsg_goose_queue()
  2012-11-28 12:47 ` [PATCH v6 06/13] bsg: Remove unused function bsg_goose_queue() Bart Van Assche
@ 2012-12-02 13:29   ` Tejun Heo
  0 siblings, 0 replies; 36+ messages in thread
From: Tejun Heo @ 2012-12-02 13:29 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

On Wed, Nov 28, 2012 at 01:47:36PM +0100, Bart Van Assche wrote:
> The function bsg_goose_queue() does not have any in-tree callers,
> so let's remove it.
> 
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Tejun Heo <tj@kernel.org>

Acked-by: Tejun Heo <tj@kernel.org>

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 07/13] Fix race between starved list processing and device removal
  2012-11-28 12:48 ` [PATCH v6 07/13] Fix race between starved list processing and device removal Bart Van Assche
@ 2012-12-02 13:32   ` Tejun Heo
  0 siblings, 0 replies; 36+ messages in thread
From: Tejun Heo @ 2012-12-02 13:32 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

Hello, Bart.

On Wed, Nov 28, 2012 at 01:48:24PM +0100, Bart Van Assche wrote:
> @@ -452,11 +452,13 @@ static void scsi_run_queue(struct request_queue *q)
>  			continue;
>  		}
>  
> -		spin_unlock(shost->host_lock);
> -		spin_lock(sdev->request_queue->queue_lock);
> -		__blk_run_queue(sdev->request_queue);
> -		spin_unlock(sdev->request_queue->queue_lock);
> -		spin_lock(shost->host_lock);
> +		get_device(&sdev->sdev_gendev);
> +		spin_unlock_irqrestore(shost->host_lock, flags);
> +
> +		blk_run_queue(sdev->request_queue);
> +		put_device(&sdev->sdev_gendev);
> +
> +		spin_lock_irqsave(shost->host_lock, flags);

Some comment wouldn't hurt here.

> @@ -977,6 +978,10 @@ void __scsi_remove_device(struct scsi_device *sdev)
>  	blk_cleanup_queue(sdev->request_queue);
>  	cancel_work_sync(&sdev->requeue_work);
>  
> +	spin_lock_irqsave(shost->host_lock, flags);
> +	list_del(&sdev->starved_entry);
> +	spin_unlock_irqrestore(shost->host_lock, flags);
> +

And please add a comment explaining what's going on around
->starved_entry.  It's rather different from anything else.

Thanks!

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 08/13] Remove get_device() / put_device() pair from scsi_request_fn()
  2012-11-28 12:48 ` [PATCH v6 08/13] Remove get_device() / put_device() pair from scsi_request_fn() Bart Van Assche
@ 2012-12-02 13:34   ` Tejun Heo
  0 siblings, 0 replies; 36+ messages in thread
From: Tejun Heo @ 2012-12-02 13:34 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

Hello,

On Wed, Nov 28, 2012 at 01:48:59PM +0100, Bart Van Assche wrote:
> Now that all scsi_request_fn() callers hold a reference on the
> SCSI device that function is invoked on and since
> blk_cleanup_queue() waits until scsi_request_fn() has finished
> it is safe to remove the get_device() / put_device() pair from
> scsi_request_fn().
> 
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> Acked-by: Tejun Heo <tj@kernel.org>
> Cc: James Bottomley <JBottomley@Parallels.com>
> Cc: Mike Christie <michaelc@cs.wisc.edu>
> Cc: Jens Axboe <axboe@kernel.dk>

Acked-by: Tejun Heo <tj@kernel.org>

with one nit.

> @@ -1643,11 +1641,7 @@ out_delay:
>  	if (sdev->device_busy == 0)
>  		blk_delay_queue(q, SCSI_QUEUE_DELAY);
>  out:
> -	/* must be careful here...if we trigger the ->remove() function
> -	 * we cannot be holding the q lock */
> -	spin_unlock_irq(q->queue_lock);
> -	put_device(&sdev->sdev_gendev);
> -	spin_lock_irq(q->queue_lock);
> +	;

There's one "goto out" at the end of the usual control flow.  Please
just return from there.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 09/13] Avoid saving/restoring interrupt state inside scsi_remove_host()
  2012-11-28 12:50 ` [PATCH v6 09/13] Avoid saving/restoring interrupt state inside scsi_remove_host() Bart Van Assche
@ 2012-12-02 13:35   ` Tejun Heo
  0 siblings, 0 replies; 36+ messages in thread
From: Tejun Heo @ 2012-12-02 13:35 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

On Wed, Nov 28, 2012 at 01:50:08PM +0100, Bart Van Assche wrote:
> Since it is not allowed to invoke scsi_remove_host() with interrupts
> disabled, avoid saving and restoring the interrupt state inside
> scsi_remove_host().
> 
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> Cc: James Bottomley <JBottomley@Parallels.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Mike Christie <michaelc@cs.wisc.edu>
> Cc: Hannes Reinecke <hare@suse.de>

Acked-by: Tejun Heo <tj@kernel.org>

It would be nice to note that this doesn't make any functional
difference in the commit message.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 03/13] block: Avoid that request_fn is invoked on a dead queue
  2012-12-02 13:23   ` Tejun Heo
@ 2012-12-02 13:35     ` Bart Van Assche
  0 siblings, 0 replies; 36+ messages in thread
From: Bart Van Assche @ 2012-12-02 13:35 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

On 12/02/12 14:23, Tejun Heo wrote:
> On Wed, Nov 28, 2012 at 01:44:53PM +0100, Bart Van Assche wrote:
>> A block driver may start cleaning up resources needed by its
>> request_fn as soon as blk_cleanup_queue() finished, so request_fn
>> must not be invoked after draining finished. This is important
>> when blk_run_queue() is invoked without any requests in progress.
>> As an example, if blk_drain_queue() and scsi_run_queue() run in
>> parallel, blk_drain_queue() may have finished all requests after
>> scsi_run_queue() has taken a SCSI device off the starved list but
>> before that last function has had a chance to run the queue.
>>
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>> Cc: James Bottomley <JBottomley@Parallels.com>
>> Cc: Mike Christie <michaelc@cs.wisc.edu>
>> Cc: Jens Axboe <axboe@kernel.dk>
>> Cc: Tejun Heo <tj@kernel.org>
>> Cc: Chanho Min <chanho.min@lge.com>
>
> Acked-by: Tejun Heo <tj@kernel.org>
>
>> +inline void __blk_run_queue_uncond(struct request_queue *q)
>
> Does this inline actually make a difference in the generated code?

Yes. Without that "inline" keyword that function is not inlined in 
__blk_run_queue() but with that "inline" keyword it is inlined. At least 
with gcc 4.7.1 - I have not checked any other gcc versions.

Bart.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 04/13] block: Avoid scheduling delayed work on a dead queue
  2012-12-02 13:26   ` Tejun Heo
@ 2012-12-02 13:41     ` Bart Van Assche
  2012-12-02 13:59       ` Tejun Heo
  0 siblings, 1 reply; 36+ messages in thread
From: Bart Van Assche @ 2012-12-02 13:41 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

On 12/02/12 14:26, Tejun Heo wrote:
> On Wed, Nov 28, 2012 at 01:45:56PM +0100, Bart Van Assche wrote:
>> Running a queue must continue after it has been marked dying until
>> it has been marked dead. So the function blk_run_queue_async() must
>> not schedule delayed work after blk_cleanup_queue() has marked a queue
>> dead. Hence add a test for that queue state in blk_run_queue_async()
>> and make sure that queue_unplugged() invokes that function with the
>> queue lock held. This avoids that the queue state can change after
>> it has been tested and before mod_delayed_work() is invoked. Drop
>> the queue dying test in queue_unplugged() since it is now
>> superfluous: __blk_run_queue() already tests whether or not the
>> queue is dead.
>>
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>> Cc: Tejun Heo <tj@kernel.org>
>> Cc: Mike Christie <michaelc@cs.wisc.edu>
>> Cc: Jens Axboe <axboe@kernel.dk>
>
> Acked-by: Tejun Heo <tj@kernel.org>
>
> But, some nits.
>
>> @@ -219,12 +219,13 @@ static void blk_delay_work(struct work_struct *work)
>>    * Description:
>>    *   Sometimes queueing needs to be postponed for a little while, to allow
>>    *   resources to come back. This function will make sure that queueing is
>> - *   restarted around the specified time.
>> + *   restarted around the specified time. Queue lock must be held.
>>    */
>
> Comment is fine but lockdep_assert_held() is way more effective.
>
>>   void blk_delay_queue(struct request_queue *q, unsigned long msecs)
>>   {
>> -	queue_delayed_work(kblockd_workqueue, &q->delay_work,
>> -				msecs_to_jiffies(msecs));
>> +	if (likely(!blk_queue_dead(q)))
>> +		queue_delayed_work(kblockd_workqueue, &q->delay_work,
>> +				   msecs_to_jiffies(msecs));
>>   }
>>   EXPORT_SYMBOL(blk_delay_queue);
>>
>> @@ -334,11 +335,11 @@ EXPORT_SYMBOL(__blk_run_queue);
>>    *
>>    * Description:
>>    *    Tells kblockd to perform the equivalent of @blk_run_queue on behalf
>> - *    of us.
>> + *    of us. The caller must hold the queue lock.
>>    */
>
> Ditto.

I agree. There is a reason though I have not yet added a 
lockdep_assert_held() statement in these functions: there are still 
unprotected calls of blk_run_queue_async() in drivers/md/dm.c and 
drivers/scsi/scsi_transport_fc.c.

Bart.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 10/13] Make scsi_remove_host() wait for device removal
  2012-11-28 12:51 ` [PATCH v6 10/13] Make scsi_remove_host() wait for device removal Bart Van Assche
@ 2012-12-02 13:45   ` Tejun Heo
  2012-12-02 13:48     ` Tejun Heo
  2012-12-03  8:23     ` Bart Van Assche
  0 siblings, 2 replies; 36+ messages in thread
From: Tejun Heo @ 2012-12-02 13:45 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

Hello, Bart.

On Wed, Nov 28, 2012 at 01:51:13PM +0100, Bart Van Assche wrote:
> SCSI LLDs may start cleaning up host resources needed by their
> queuecommand() callback as soon as scsi_remove_host() finished.
> Hence scsi_remove_host() must wait until blk_cleanup_queue() for
> all devices associated with the host has finished. That avoids
> that queuecommand() gets invoked after scsi_remove_host()
> finished. Also, avoid adding new SCSI devices after
> scsi_remove_host() started.
> 
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: James Bottomley <JBottomley@Parallels.com>
> Cc: Mike Christie <michaelc@cs.wisc.edu>
> Cc: Hannes Reinecke <hare@suse.de>
> ---
>  drivers/scsi/hosts.c      |   30 ++++++++++++++++++++++++++++++
>  drivers/scsi/scsi_priv.h  |    1 +
>  drivers/scsi/scsi_sysfs.c |    1 +
>  include/scsi/scsi_host.h  |    1 +
>  4 files changed, 33 insertions(+)
> 
> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> index 6ae16cd..7bd944e 100644
> --- a/drivers/scsi/hosts.c
> +++ b/drivers/scsi/hosts.c
> @@ -150,12 +150,31 @@ int scsi_host_set_state(struct Scsi_Host *shost, enum scsi_host_state state)
>  }
>  EXPORT_SYMBOL(scsi_host_set_state);
>  
> +/* Return true if and only if scsi_remove_host() is allowed to finish. */
> +static bool __scsi_remove_host_done(struct Scsi_Host *shost)
> +{
> +	lockdep_assert_held(shost->host_lock);
> +
> +	return list_empty(&shost->__devices);
> +}

This is a preference thing but I usually find this type of trivial
wrappers more obfuscating than actually helpful.  Dunno what James's
preference is tho.

> +/* Test whether scsi_remove_host() may finish, and if so, wake it up. */
> +void __scsi_check_remove_host_done(struct Scsi_Host *shost)
> +{
> +	lockdep_assert_held(shost->host_lock);
> +
> +	if (__scsi_remove_host_done(shost))
> +		wake_up(&shost->remove_host);
> +}

Why the __ prefix?  It's not like they have different
external/internal versions.

This being an one-time thing.  Using completion could be simpler.  e.g.

scsi_check_no_device_left()
{
	if (list_empty(__devices))
		complete(host->no_device_left);
}

scsi_remove_host()
{
	blah blah;
	scsi_check_remove_host_done();
	wait_for_completion(&host->no_device_left);
	blah blah;
}

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 10/13] Make scsi_remove_host() wait for device removal
  2012-12-02 13:45   ` Tejun Heo
@ 2012-12-02 13:48     ` Tejun Heo
  2012-12-03  8:23     ` Bart Van Assche
  1 sibling, 0 replies; 36+ messages in thread
From: Tejun Heo @ 2012-12-02 13:48 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

Hey, again.

On Sun, Dec 02, 2012 at 05:45:15AM -0800, Tejun Heo wrote:
> > +/* Return true if and only if scsi_remove_host() is allowed to finish. */
> > +static bool __scsi_remove_host_done(struct Scsi_Host *shost)
> > +{
> > +	lockdep_assert_held(shost->host_lock);
> > +
> > +	return list_empty(&shost->__devices);
> > +}
> 
> This is a preference thing but I usually find this type of trivial
> wrappers more obfuscating than actually helpful.  Dunno what James's
> preference is tho.

Please disregard this one.  You're adding more conditions afterwards.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 11/13] Make scsi_remove_host() wait until error handling finished
  2012-11-28 12:52 ` [PATCH v6 11/13] Make scsi_remove_host() wait until error handling finished Bart Van Assche
@ 2012-12-02 13:51   ` Tejun Heo
  0 siblings, 0 replies; 36+ messages in thread
From: Tejun Heo @ 2012-12-02 13:51 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

Hello, Bart.

On Wed, Nov 28, 2012 at 01:52:51PM +0100, Bart Van Assche wrote:
> A SCSI LLD may start cleaning up host resources as soon as
> scsi_remove_host() returns. These host resources may be needed by
> the LLD in an implementation of one of the eh_* functions. So if
> one of the eh_* functions is in progress when scsi_remove_host()
> is invoked, wait until the eh_* function has finished. Also, do
> not invoke any of the eh_* functions after scsi_remove_host() has
> started.

Nice catch.

>  /**
> + * scsi_begin_eh - start host-related error handling
> + *
> + * Must be called before invoking any of the scsi_host_template.eh_* functions
> + * to avoid that scsi_remove_host() returns while one of these callback
> + * functions is in progress.
> + *
> + * Returns true if invoking the eh_* function is allowed and false if not.
> + * If this function returns true then scsi_end_eh() must be called eventually.
> + *
> + * Note: scsi_send_eh_cmnd() calls do not have to be protected by a
> + * scsi_begin_eh() / scsi_end_eh() pair since these operate on an unfinished
> + * block layer request. Since scsi_remove_host() waits until all SCSI devices
> + * have been removed and since blk_cleanup_queue() is invoked during SCSI
> + * device removal scsi_remove_host() won't finish while a scsi_send_eh_cmnd()
> + * call is in progress.
> + */
> +static bool scsi_begin_eh(struct Scsi_Host *host)
> +{
> +	bool res;
> +
> +	spin_lock_irq(host->host_lock);
> +	res = scsi_host_scan_allowed(host);
> +	if (res) {
> +		WARN_ON_ONCE(host->eh_active < 0 || host->eh_active > 1);
> +		host->eh_active++;
> +	}
> +	spin_unlock_irq(host->host_lock);
> +
> +	return res;
> +}

No biggie but I find it somewhat unusual to use bool success/failure
return for a function named scsi_begin_eh().  0/-errno would be more
conventional.  People do use bool return for get/acquite type
operations but it's a bit unusual to do that for begin().  Plus, the
code would actually be simpler with -errno failure, right?

	if (scsi_begin_eh(host))
		return FAST_IO_FAIL;

	/* no change */

	scsi_end_eh(host);
	return rtn;

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 12/13] Avoid that scsi_device_set_state() triggers a race
  2012-11-28 12:53 ` [PATCH v6 12/13] Avoid that scsi_device_set_state() triggers a race Bart Van Assche
@ 2012-12-02 13:53   ` Tejun Heo
  0 siblings, 0 replies; 36+ messages in thread
From: Tejun Heo @ 2012-12-02 13:53 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

On Wed, Nov 28, 2012 at 01:53:24PM +0100, Bart Van Assche wrote:
> Make concurrent invocations of scsi_device_set_state() safe.
> 
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> Cc: James Bottomley <JBottomley@Parallels.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: Mike Christie <michaelc@cs.wisc.edu>

Acked-by: Tejun Heo <tj@kernel.org>

Yeah, this is an old one.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 13/13] Do not queue new I/O after scsi_remove_host() started
  2012-11-28 12:53 ` [PATCH v6 13/13] Do not queue new I/O after scsi_remove_host() started Bart Van Assche
@ 2012-12-02 13:58   ` Tejun Heo
  0 siblings, 0 replies; 36+ messages in thread
From: Tejun Heo @ 2012-12-02 13:58 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

Hello,

On Wed, Nov 28, 2012 at 01:53:58PM +0100, Bart Van Assche wrote:
> The function scsi_remove_host() may get invoked concurrently with
> scsi_request_fn(). Kill those I/O requests for which processing
> starts after scsi_remove_host() has been invoked. This makes
> device removal a little quicker by avoiding that such SCSI
> commands time out.
...
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index f3d6e0d..5fe25b3 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1549,7 +1549,8 @@ static void scsi_request_fn(struct request_queue *q)
>  		if (!req || !scsi_dev_queue_ready(q, sdev))
>  			break;
>  
> -		if (unlikely(!scsi_device_online(sdev))) {
> +		if (unlikely(!scsi_device_online(sdev) ||
> +			     !scsi_host_scan_allowed(shost))) {
>  			sdev_printk(KERN_ERR, sdev,
>  				    "rejecting I/O to offline device\n");
>  			scsi_kill_request(req, q);

I don't know.  This feels a bit weird to me.  Shouldn't we instead
make sure that scsi_device_onilne() test fails once host removal is
initiated?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 04/13] block: Avoid scheduling delayed work on a dead queue
  2012-12-02 13:41     ` Bart Van Assche
@ 2012-12-02 13:59       ` Tejun Heo
  0 siblings, 0 replies; 36+ messages in thread
From: Tejun Heo @ 2012-12-02 13:59 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

Hello, Bart.

On Sun, Dec 02, 2012 at 02:41:20PM +0100, Bart Van Assche wrote:
> I agree. There is a reason though I have not yet added a
> lockdep_assert_held() statement in these functions: there are still
> unprotected calls of blk_run_queue_async() in drivers/md/dm.c and
> drivers/scsi/scsi_transport_fc.c.

Hmmm... that's nasty.  Any chance you can fix them up too?

Thanks!

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/13 v6] More device removal fixes
  2012-11-28 12:39 [PATCH 0/13 v6] More device removal fixes Bart Van Assche
                   ` (12 preceding siblings ...)
  2012-11-28 12:53 ` [PATCH v6 13/13] Do not queue new I/O after scsi_remove_host() started Bart Van Assche
@ 2012-12-02 14:02 ` Tejun Heo
  2012-12-06 13:33 ` Jens Axboe
  14 siblings, 0 replies; 36+ messages in thread
From: Tejun Heo @ 2012-12-02 14:02 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

Hello, Bart.

On Wed, Nov 28, 2012 at 01:39:53PM +0100, Bart Van Assche wrote:
> Fix a few race conditions that can be triggered by removing a device:
> - Avoid that request_fn() can be invoked on a dead queue.
> - Avoid that blk_cleanup_queue() can finish while request_fn is still
>   running.
> - Fix a race between starved list processing and device removal.
> - Avoid that a SCSI LLD callback can be invoked after scsi_remove_host()
>   finished.
> - Speed up device removal by stopping error handling as soon as
>   scsi_remove_host() started.
> 
> These patches have been tested on top of kernel v3.7-rc7.

Awesome work.  I had some nits but would be happy if the whole
patchset gets merged right now.  It's embarrassing that we are *still*
this broken and, nits aside, all the patches seem correct and to the
point.

Jens, James?  Except for the last few SCSI patches, this patchset has
now been around for quite a while.  Any chance to fast-track these?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 10/13] Make scsi_remove_host() wait for device removal
  2012-12-02 13:45   ` Tejun Heo
  2012-12-02 13:48     ` Tejun Heo
@ 2012-12-03  8:23     ` Bart Van Assche
  2012-12-03 16:15       ` Tejun Heo
  1 sibling, 1 reply; 36+ messages in thread
From: Bart Van Assche @ 2012-12-03  8:23 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

On 12/02/12 14:45, Tejun Heo wrote:
> On Wed, Nov 28, 2012 at 01:51:13PM +0100, Bart Van Assche wrote:
>> +/* Test whether scsi_remove_host() may finish, and if so, wake it up. */
>> +void __scsi_check_remove_host_done(struct Scsi_Host *shost)
>> +{
>> +	lockdep_assert_held(shost->host_lock);
>> +
>> +	if (__scsi_remove_host_done(shost))
>> +		wake_up(&shost->remove_host);
>> +}
>
> This being an one-time thing.  Using completion could be simpler.  e.g.

Sorry but I'm not sure that would work here. A user can e.g. delete all 
SCSI devices associated with a SCSI host (echo 1 
 >/sys/class/scsi_host/host<n>/device/target<m>/<lun>/delete) and then 
issue a rescan to re-add LUNs. When using a completion the completion 
would be set as soon as the last SCSI device has been deleted instead of 
only after scsi_remove_host() finished removing the re-added SCSI devices.

Bart.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 10/13] Make scsi_remove_host() wait for device removal
  2012-12-03  8:23     ` Bart Van Assche
@ 2012-12-03 16:15       ` Tejun Heo
  2012-12-03 16:38         ` Bart Van Assche
  0 siblings, 1 reply; 36+ messages in thread
From: Tejun Heo @ 2012-12-03 16:15 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

Hello, Bart.

On Mon, Dec 03, 2012 at 09:23:59AM +0100, Bart Van Assche wrote:
> On 12/02/12 14:45, Tejun Heo wrote:
> >On Wed, Nov 28, 2012 at 01:51:13PM +0100, Bart Van Assche wrote:
> >>+/* Test whether scsi_remove_host() may finish, and if so, wake it up. */
> >>+void __scsi_check_remove_host_done(struct Scsi_Host *shost)
> >>+{
> >>+	lockdep_assert_held(shost->host_lock);
> >>+
> >>+	if (__scsi_remove_host_done(shost))
> >>+		wake_up(&shost->remove_host);
> >>+}
> >
> >This being an one-time thing.  Using completion could be simpler.  e.g.
> 
> Sorry but I'm not sure that would work here. A user can e.g. delete
> all SCSI devices associated with a SCSI host (echo 1
> >/sys/class/scsi_host/host<n>/device/target<m>/<lun>/delete) and
> then issue a rescan to re-add LUNs. When using a completion the
> completion would be set as soon as the last SCSI device has been
> deleted instead of only after scsi_remove_host() finished removing
> the re-added SCSI devices.

You can gate the whole thing with test on whether the host is being
removed.  Once removal begins, no device should be allowed to be added
and the completion triggers iff host removal is in progress.  That
should work, right?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 10/13] Make scsi_remove_host() wait for device removal
  2012-12-03 16:15       ` Tejun Heo
@ 2012-12-03 16:38         ` Bart Van Assche
  2012-12-03 16:42           ` Tejun Heo
  0 siblings, 1 reply; 36+ messages in thread
From: Bart Van Assche @ 2012-12-03 16:38 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

On 12/03/12 17:15, Tejun Heo wrote:
> On Mon, Dec 03, 2012 at 09:23:59AM +0100, Bart Van Assche wrote:
>> On 12/02/12 14:45, Tejun Heo wrote:
>>> On Wed, Nov 28, 2012 at 01:51:13PM +0100, Bart Van Assche wrote:
>>>> +/* Test whether scsi_remove_host() may finish, and if so, wake it up. */
>>>> +void __scsi_check_remove_host_done(struct Scsi_Host *shost)
>>>> +{
>>>> +	lockdep_assert_held(shost->host_lock);
>>>> +
>>>> +	if (__scsi_remove_host_done(shost))
>>>> +		wake_up(&shost->remove_host);
>>>> +}
>>>
>>> This being an one-time thing.  Using completion could be simpler.  e.g.
>>
>> Sorry but I'm not sure that would work here. A user can e.g. delete
>> all SCSI devices associated with a SCSI host (echo 1
>>> /sys/class/scsi_host/host<n>/device/target<m>/<lun>/delete) and
>> then issue a rescan to re-add LUNs. When using a completion the
>> completion would be set as soon as the last SCSI device has been
>> deleted instead of only after scsi_remove_host() finished removing
>> the re-added SCSI devices.
>
> You can gate the whole thing with test on whether the host is being
> removed.  Once removal begins, no device should be allowed to be added
> and the completion triggers iff host removal is in progress.  That
> should work, right?

It is indeed possible to invoke complete() only if the device list 
became empty with the host state equal to SHOST_CANCEL, 
SHOST_CANCEL_RECOVERY, SHOST_DEL or SHOST_DEL_RECOVERY and in 
scsi_remove_host() to wait for that completion only if the device list 
was not empty before the host state was changed into one of the four 
mentioned states. Do you really prefer this approach over the approach 
in the patch at the start of this thread ?

Thanks,

Bart.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 10/13] Make scsi_remove_host() wait for device removal
  2012-12-03 16:38         ` Bart Van Assche
@ 2012-12-03 16:42           ` Tejun Heo
  2012-12-07  7:41             ` Bart Van Assche
  0 siblings, 1 reply; 36+ messages in thread
From: Tejun Heo @ 2012-12-03 16:42 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

Hello, Bart.

On Mon, Dec 03, 2012 at 05:38:52PM +0100, Bart Van Assche wrote:
> It is indeed possible to invoke complete() only if the device list
> became empty with the host state equal to SHOST_CANCEL,
> SHOST_CANCEL_RECOVERY, SHOST_DEL or SHOST_DEL_RECOVERY and in

We can test this with !scsi_host_scan_allowed(), right?

> scsi_remove_host() to wait for that completion only if the device
> list was not empty before the host state was changed into one of the
> four mentioned states. Do you really prefer this approach over the
> approach in the patch at the start of this thread ?

Maybe I'm missing something but if possible completion tends to be
much simpler than using waitqueue directly.  I *think* that's the case
here.  Am I missing something?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/13 v6] More device removal fixes
  2012-11-28 12:39 [PATCH 0/13 v6] More device removal fixes Bart Van Assche
                   ` (13 preceding siblings ...)
  2012-12-02 14:02 ` [PATCH 0/13 v6] More device removal fixes Tejun Heo
@ 2012-12-06 13:33 ` Jens Axboe
  14 siblings, 0 replies; 36+ messages in thread
From: Jens Axboe @ 2012-12-06 13:33 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Tejun Heo, Chanho Min,
	Hannes Reinecke

On 2012-11-28 13:39, Bart Van Assche wrote:
> Fix a few race conditions that can be triggered by removing a device:
> - Avoid that request_fn() can be invoked on a dead queue.
> - Avoid that blk_cleanup_queue() can finish while request_fn is still
>   running.
> - Fix a race between starved list processing and device removal.
> - Avoid that a SCSI LLD callback can be invoked after scsi_remove_host()
>   finished.
> - Speed up device removal by stopping error handling as soon as
>   scsi_remove_host() started.

Bart, I've applied 1-6 to the block 3.8 tree. I still am not super happy
with 5/6, but I don't have any better ideas at the moment.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 10/13] Make scsi_remove_host() wait for device removal
  2012-12-03 16:42           ` Tejun Heo
@ 2012-12-07  7:41             ` Bart Van Assche
  0 siblings, 0 replies; 36+ messages in thread
From: Bart Van Assche @ 2012-12-07  7:41 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Chanho Min, Hannes Reinecke

On 12/03/12 17:42, Tejun Heo wrote:
> On Mon, Dec 03, 2012 at 05:38:52PM +0100, Bart Van Assche wrote:
>> It is indeed possible to invoke complete() only if the device list
>> became empty with the host state equal to SHOST_CANCEL,
>> SHOST_CANCEL_RECOVERY, SHOST_DEL or SHOST_DEL_RECOVERY and in
>
> We can test this with !scsi_host_scan_allowed(), right?
>
>> scsi_remove_host() to wait for that completion only if the device
>> list was not empty before the host state was changed into one of the
>> four mentioned states. Do you really prefer this approach over the
>> approach in the patch at the start of this thread ?
>
> Maybe I'm missing something but if possible completion tends to be
> much simpler than using waitqueue directly.  I *think* that's the case
> here.  Am I missing something?

Hello Tejun,

You are right that it should be possible to use a completion in this 
context instead of a wait queue. However, I'm not enthusiast about using 
a completion for this patch because it would mean introducing several 
additional if-statements. Such if-statements would make this patch even 
harder to test then it already is now.

Note: all your other review comments should have been addressed in v7 of 
this patch series.

Bart.


^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2012-12-07  7:41 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-11-28 12:39 [PATCH 0/13 v6] More device removal fixes Bart Van Assche
2012-11-28 12:42 ` [PATCH v6 01/13] block: Rename queue dead flag Bart Van Assche
2012-11-28 12:43 ` [PATCH v6 02/13] block: Let blk_drain_queue() caller obtain the queue lock Bart Van Assche
2012-11-28 12:44 ` [PATCH v6 03/13] block: Avoid that request_fn is invoked on a dead queue Bart Van Assche
2012-12-02 13:23   ` Tejun Heo
2012-12-02 13:35     ` Bart Van Assche
2012-11-28 12:45 ` [PATCH v6 04/13] block: Avoid scheduling delayed work " Bart Van Assche
2012-12-02 13:26   ` Tejun Heo
2012-12-02 13:41     ` Bart Van Assche
2012-12-02 13:59       ` Tejun Heo
2012-11-28 12:46 ` [PATCH v6 05/13] block: Make blk_cleanup_queue() wait until request_fn finished Bart Van Assche
2012-12-02 13:28   ` Tejun Heo
2012-11-28 12:47 ` [PATCH v6 06/13] bsg: Remove unused function bsg_goose_queue() Bart Van Assche
2012-12-02 13:29   ` Tejun Heo
2012-11-28 12:48 ` [PATCH v6 07/13] Fix race between starved list processing and device removal Bart Van Assche
2012-12-02 13:32   ` Tejun Heo
2012-11-28 12:48 ` [PATCH v6 08/13] Remove get_device() / put_device() pair from scsi_request_fn() Bart Van Assche
2012-12-02 13:34   ` Tejun Heo
2012-11-28 12:50 ` [PATCH v6 09/13] Avoid saving/restoring interrupt state inside scsi_remove_host() Bart Van Assche
2012-12-02 13:35   ` Tejun Heo
2012-11-28 12:51 ` [PATCH v6 10/13] Make scsi_remove_host() wait for device removal Bart Van Assche
2012-12-02 13:45   ` Tejun Heo
2012-12-02 13:48     ` Tejun Heo
2012-12-03  8:23     ` Bart Van Assche
2012-12-03 16:15       ` Tejun Heo
2012-12-03 16:38         ` Bart Van Assche
2012-12-03 16:42           ` Tejun Heo
2012-12-07  7:41             ` Bart Van Assche
2012-11-28 12:52 ` [PATCH v6 11/13] Make scsi_remove_host() wait until error handling finished Bart Van Assche
2012-12-02 13:51   ` Tejun Heo
2012-11-28 12:53 ` [PATCH v6 12/13] Avoid that scsi_device_set_state() triggers a race Bart Van Assche
2012-12-02 13:53   ` Tejun Heo
2012-11-28 12:53 ` [PATCH v6 13/13] Do not queue new I/O after scsi_remove_host() started Bart Van Assche
2012-12-02 13:58   ` Tejun Heo
2012-12-02 14:02 ` [PATCH 0/13 v6] More device removal fixes Tejun Heo
2012-12-06 13:33 ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).