public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
From: Bart Van Assche <Bart.VanAssche@sandisk.com>
To: "James.Bottomley@HansenPartnership.com"
	<James.Bottomley@HansenPartnership.com>,
	"bblock@linux.vnet.ibm.com" <bblock@linux.vnet.ibm.com>
Cc: "linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"maxg@mellanox.com" <maxg@mellanox.com>,
	"israelr@mellanox.com" <israelr@mellanox.com>,
	"hare@suse.de" <hare@suse.de>,
	"martin.petersen@oracle.com" <martin.petersen@oracle.com>
Subject: Re: [PATCH v3 3/4] sd: Make synchronize cache upon shutdown asynchronous
Date: Tue, 18 Apr 2017 23:47:16 +0000	[thread overview]
Message-ID: <1492559235.2689.27.camel@sandisk.com> (raw)
In-Reply-To: <1492530984.3306.25.camel@HansenPartnership.com>

[-- Attachment #1: Type: text/plain, Size: 563 bytes --]

On Tue, 2017-04-18 at 08:56 -0700, James Bottomley wrote:
> How about this approach.  It goes straight to DEL if the device is
> blocked (skipping CANCEL).  This means that all the commands issued in 
> ->shutdown will error in the mid-layer, thus making the removal proceed
> without being stopped.

Hello James,

The three attached patches pass my tests. Please let me know how you would
like to proceed with patch 1/3. Would you like to submit it yourself or is
it OK for you if I mention you as author and add your Signed-off-by?

Thanks,

Bart.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Make-__scsi_remove_device-go-straight-from-BLOCKED-t.patch --]
[-- Type: text/x-patch; name="0001-Make-__scsi_remove_device-go-straight-from-BLOCKED-t.patch", Size: 2470 bytes --]

From 9482fdc8b322f15ced6f64d57f45026367604a23 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche@sandisk.com>
Date: Tue, 18 Apr 2017 10:11:02 -0700
Subject: [PATCH 1/3] Make __scsi_remove_device go straight from BLOCKED to DEL

If a device is blocked, make __scsi_remove_device() cause it to
transition to the DEL state. This means that all the commands
issued in .shutdown() will error in the mid-layer, thus making
the removal proceed without being stopped.

This patch is a slightly modified version of a patch from James
Bottomley.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Israel Rukshin <israelr@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Benjamin Block <bblock@linux.vnet.ibm.com>
---
 drivers/scsi/scsi_lib.c   |  2 +-
 drivers/scsi/scsi_sysfs.c | 12 +++++++++++-
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index eecc005099b2..277c8b3ae7b0 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2611,7 +2611,6 @@ scsi_device_set_state(struct scsi_device *sdev, enum scsi_device_state state)
 		case SDEV_QUIESCE:
 		case SDEV_OFFLINE:
 		case SDEV_TRANSPORT_OFFLINE:
-		case SDEV_BLOCK:
 			break;
 		default:
 			goto illegal;
@@ -2625,6 +2624,7 @@ scsi_device_set_state(struct scsi_device *sdev, enum scsi_device_state state)
 		case SDEV_OFFLINE:
 		case SDEV_TRANSPORT_OFFLINE:
 		case SDEV_CANCEL:
+		case SDEV_BLOCK:
 		case SDEV_CREATED_BLOCK:
 			break;
 		default:
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 82dfe07b1d47..f95d191ec809 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1282,9 +1282,19 @@ void __scsi_remove_device(struct scsi_device *sdev)
 		return;
 
 	if (sdev->is_visible) {
-		if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0)
+		/*
+		 * If blocked, we go straight to DEL so any commands
+		 * issued during the driver shutdown (like sync cache)
+		 * are errored.
+		 */
+		if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0 &&
+		    scsi_device_set_state(sdev, SDEV_DEL) != 0)
 			return;
 
+		if (sdev->sdev_state == SDEV_DEL)
+			sdev_printk(KERN_DEBUG, sdev,
+				    "Changed state from BLOCKED to DEL\n");
+
 		bsg_unregister_queue(sdev->request_queue);
 		device_unregister(&sdev->sdev_dev);
 		transport_remove_device(dev);
-- 
2.12.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0002-Introduce-scsi_start_queue.patch --]
[-- Type: text/x-patch; name="0002-Introduce-scsi_start_queue.patch", Size: 2756 bytes --]

From c3f85b714fcfb12d43669b7f295a09f4718c2704 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche@sandisk.com>
Date: Tue, 28 Mar 2017 14:00:17 -0700
Subject: [PATCH 2/3] Introduce scsi_start_queue()

This patch does not change any functionality.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Israel Rukshin <israelr@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Benjamin Block <bblock@linux.vnet.ibm.com>
---
 drivers/scsi/scsi_lib.c  | 25 +++++++++++++++----------
 drivers/scsi/scsi_priv.h |  1 +
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 277c8b3ae7b0..376cd1da102c 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2987,6 +2987,20 @@ scsi_internal_device_block(struct scsi_device *sdev, bool wait)
 }
 EXPORT_SYMBOL_GPL(scsi_internal_device_block);
  
+void scsi_start_queue(struct scsi_device *sdev)
+{
+	struct request_queue *q = sdev->request_queue;
+	unsigned long flags;
+
+	if (q->mq_ops) {
+		blk_mq_start_stopped_hw_queues(q, false);
+	} else {
+		spin_lock_irqsave(q->queue_lock, flags);
+		blk_start_queue(q);
+		spin_unlock_irqrestore(q->queue_lock, flags);
+	}
+}
+
 /**
  * scsi_internal_device_unblock - resume a device after a block request
  * @sdev:	device to resume
@@ -3007,9 +3021,6 @@ int
 scsi_internal_device_unblock(struct scsi_device *sdev,
 			     enum scsi_device_state new_state)
 {
-	struct request_queue *q = sdev->request_queue; 
-	unsigned long flags;
-
 	/*
 	 * Try to transition the scsi device to SDEV_RUNNING or one of the
 	 * offlined states and goose the device queue if successful.
@@ -3027,13 +3038,7 @@ scsi_internal_device_unblock(struct scsi_device *sdev,
 		 sdev->sdev_state != SDEV_OFFLINE)
 		return -EINVAL;
 
-	if (q->mq_ops) {
-		blk_mq_start_stopped_hw_queues(q, false);
-	} else {
-		spin_lock_irqsave(q->queue_lock, flags);
-		blk_start_queue(q);
-		spin_unlock_irqrestore(q->queue_lock, flags);
-	}
+	scsi_start_queue(sdev);
 
 	return 0;
 }
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index f11bd102d6d5..c7629e31a75b 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -89,6 +89,7 @@ extern void scsi_run_host_queues(struct Scsi_Host *shost);
 extern void scsi_requeue_run_queue(struct work_struct *work);
 extern struct request_queue *scsi_alloc_queue(struct scsi_device *sdev);
 extern struct request_queue *scsi_mq_alloc_queue(struct scsi_device *sdev);
+extern void scsi_start_queue(struct scsi_device *sdev);
 extern int scsi_mq_setup_tags(struct Scsi_Host *shost);
 extern void scsi_mq_destroy_tags(struct Scsi_Host *shost);
 extern int scsi_init_queue(void);
-- 
2.12.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: 0003-Avoid-that-__scsi_remove_device-hangs.patch --]
[-- Type: text/x-patch; name="0003-Avoid-that-__scsi_remove_device-hangs.patch", Size: 2422 bytes --]

From c383551a721d30d897d45244acd331ff0af94656 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche@sandisk.com>
Date: Tue, 28 Mar 2017 14:00:25 -0700
Subject: [PATCH 3/3] Avoid that __scsi_remove_device() hangs

Since scsi_target_unblock() uses starget_for_each_device(), since
starget_for_each_device() uses scsi_device_get(), since
scsi_device_get() fails after unloading of the LLD kernel module
has been started scsi_target_unblock() may skip devices that were
affected by scsi_target_block(). Ensure that __scsi_remove_device()
does not hang for blocked SCSI devices. This patch avoids that
unloading the ib_srp kernel module can trigger the following hang:

Call Trace:
 schedule+0x35/0x80
 schedule_timeout+0x237/0x2d0
 io_schedule_timeout+0xa6/0x110
 wait_for_completion_io+0xa3/0x110
 blk_execute_rq+0xdf/0x120
 scsi_execute+0xce/0x150 [scsi_mod]
 scsi_execute_req_flags+0x8f/0xf0 [scsi_mod]
 sd_sync_cache+0xa9/0x190 [sd_mod]
 sd_shutdown+0x6a/0x100 [sd_mod]
 sd_remove+0x64/0xc0 [sd_mod]
 __device_release_driver+0x8d/0x120
 device_release_driver+0x1e/0x30
 bus_remove_device+0xf9/0x170
 device_del+0x127/0x240
 __scsi_remove_device+0xc1/0xd0 [scsi_mod]
 scsi_forget_host+0x57/0x60 [scsi_mod]
 scsi_remove_host+0x72/0x110 [scsi_mod]
 srp_remove_work+0x8b/0x200 [ib_srp]

Reported-by: Israel Rukshin <israelr@mellanox.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Benjamin Block <bblock@linux.vnet.ibm.com>
---
 drivers/scsi/scsi_sysfs.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index f95d191ec809..e9e80241ab5e 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1309,6 +1309,15 @@ void __scsi_remove_device(struct scsi_device *sdev)
 	 * device.
 	 */
 	scsi_device_set_state(sdev, SDEV_DEL);
+	/*
+	 * Since scsi_target_unblock() is a no-op after unloading of the SCSI
+	 * LLD has started, explicitly restart the queue. Do this after the
+	 * device state has been changed into SDEV_DEL because
+	 * scsi_prep_state_check() returns BLKPREP_KILL for the SDEV_DEL state
+	 * Do this before calling blk_cleanup_queue() to avoid that that
+	 * function encounters a stopped queue.
+	 */
+	scsi_start_queue(sdev);
 	blk_cleanup_queue(sdev->request_queue);
 	cancel_work_sync(&sdev->requeue_work);
 
-- 
2.12.2


  parent reply	other threads:[~2017-04-18 23:47 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-17 17:34 [PATCH v3 0/4] Avoid that __scsi_remove_device() hangs Bart Van Assche
2017-04-17 17:34 ` [PATCH v3 1/4] Introduce scsi_start_queue() Bart Van Assche
2017-04-17 17:34 ` [PATCH v3 2/4] Introduce scsi_execute_async() Bart Van Assche
2017-04-17 17:34 ` [PATCH v3 3/4] sd: Make synchronize cache upon shutdown asynchronous Bart Van Assche
2017-04-18 14:44   ` Benjamin Block
2017-04-18 15:34     ` Bart Van Assche
2017-04-18 15:56     ` James Bottomley
2017-04-18 16:06       ` Bart Van Assche
2017-04-18 23:47       ` Bart Van Assche [this message]
2017-04-18 23:56         ` James Bottomley
2017-04-19  0:02           ` Bart Van Assche
2017-04-19  0:05             ` James Bottomley
2017-04-19 18:42           ` Bart Van Assche
2017-04-20 21:59           ` Bart Van Assche
2017-04-20 22:13             ` James Bottomley
2017-04-20 22:27               ` Bart Van Assche
2017-04-20 22:52               ` Bart Van Assche
2017-04-23 17:28                 ` James Bottomley
2017-04-24 21:46                   ` Bart Van Assche
2017-04-26 18:53       ` Bart Van Assche
2017-04-28 18:45         ` James Bottomley
2017-04-24  7:14   ` [lkp-robot] [sd] ab1218235c: INFO:possible_recursive_locking_detected kernel test robot
2017-04-17 17:34 ` [PATCH v3 4/4] Avoid that __scsi_remove_device() hangs Bart Van Assche
2017-04-18 11:58 ` [PATCH v3 0/4] " Israel Rukshin
2017-04-18 15:40   ` Bart Van Assche

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1492559235.2689.27.camel@sandisk.com \
    --to=bart.vanassche@sandisk.com \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=bblock@linux.vnet.ibm.com \
    --cc=hare@suse.de \
    --cc=israelr@mellanox.com \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=maxg@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox