From: Mike Christie <michaelc@cs.wisc.edu>
To: Christof Schmitt <christof.schmitt@de.ibm.com>
Cc: linux-scsi@vger.kernel.org
Subject: Re: Deleting SCSI device on blocked rport
Date: Sun, 05 Sep 2010 13:07:44 -0500
Message-ID: <4C83DC70.5030304@cs.wisc.edu>
In-Reply-To: <20100902110540.GB4097@schmichrtp.mainz.de.ibm.com>

[-- Attachment #1: Type: text/plain, Size: 2696 bytes --]

On 09/02/2010 06:05 AM, Christof Schmitt wrote:
> Deleting a SCSI device on a rport in the state FC_PORTSTATE_BLOCKED,
> but before the fast_io_fail_tmo expires results in a hanging kernel
> thread:
>
> STACK TRACE FOR TASK: 0x2a368b38 (sysfsd)
>
>   STACK:
>   0 schedule+1108 [0x5cac48]
>   1 schedule_timeout+528 [0x5cb7fc]
>   2 wait_for_common+266 [0x5ca6be]
>   3 blk_execute_rq+160 [0x354054]
>   4 scsi_execute+324 [0x3b7ef4]
>   5 scsi_execute_req+162 [0x3b80ca]
>   6 sd_sync_cache+138 [0x3cf662]
>   7 sd_shutdown+138 [0x3cf91a]
>   8 sd_remove+112 [0x3cfe4c]
>   9 __device_release_driver+124 [0x3a08b8]
>  10 device_release_driver+60 [0x3a0a5c]
>  11 bus_remove_device+266 [0x39fa76]
>  12 device_del+340 [0x39d818]
>  13 __scsi_remove_device+204 [0x3bcc48]
>  14 scsi_remove_device+66 [0x3bcc8e]
>  15 sysfs_schedule_callback_work+50 [0x260d66]
>  16 worker_thread+622 [0x162326]
>  17 kthread+160 [0x1680b0]
>  18 kernel_thread_starter+6 [0x10aaea]
>
> When the fast_io_fail_tmo or dev_loss_tmo expire, this does not
> change, so this has the potential of blocking the entire system.

Are you saying that if you delete the device and then one of those timers
fires, nothing restarts the queue? Is that because scsi_target_unblock no
longer sees the device, since scsi_remove_device has already removed it
from the device lists?

>
> The request queue seems to be STOPPED at the moment.
>          queue_flags = 0xa805
>

What causes the delete? Is it userspace or a scsi_remove_device by an LLD?

It looks like if the driver does fc_remove_host it will call 
fc_rport_final_delete->fc_terminate_rport_io->scsi_target_unblock..-> 
blk_start_queue which clears the queue_flags stopped bit and avoids the 
problem.

And it looks like if fc_terminate_rport_io is called by the fast_io_fail or
dev_loss_tmo handlers, that will call scsi_target_unblock too.
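
To make the stopped-bit argument concrete, here is a minimal user-space
sketch. It is not kernel code: the model_* names, the struct and the flag
bit position are all made up for illustration. It only shows why a request
stays pending while the stopped bit is set and is dispatched once something
clears the bit and runs the queue.

```c
#include <assert.h>

/* Hypothetical flag bit for this model; not the kernel's real bit layout. */
#define MODEL_QUEUE_STOPPED (1u << 0)

/* Toy stand-in for a request queue: only the fields the argument needs. */
struct model_queue {
	unsigned int flags;
	int pending;     /* requests queued but not yet handed to the driver */
	int dispatched;  /* requests handed to the driver */
};

/* Models stopping the queue: set the stopped bit. */
static void model_stop_queue(struct model_queue *q)
{
	q->flags |= MODEL_QUEUE_STOPPED;
}

/* Models the effect the mail attributes to blk_start_queue: clear the bit. */
static void model_start_queue(struct model_queue *q)
{
	q->flags &= ~MODEL_QUEUE_STOPPED;
}

/* Dispatch everything pending unless the queue is stopped; returns the count. */
static int model_run_queue(struct model_queue *q)
{
	int n = 0;

	assert(q->pending >= 0);
	if (q->flags & MODEL_QUEUE_STOPPED)
		return 0;	/* stopped: pending requests stay where they are */
	while (q->pending > 0) {
		q->pending--;
		q->dispatched++;
		n++;
	}
	return n;
}
```

In this model, a queue with one pending request that is stopped and then
run dispatches nothing; after model_start_queue() the same run dispatches
the request. A waiter sleeping on that request would only be woken in the
second case.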

We hit something similar in iSCSI, because it used to loop over devices
in userspace and call each device's delete sysfs attr.


> I am not sure how to approach this. One idea would be that the unblock
> in fc_terminate_rport_io should also trigger the release of the
> pending command, but it does not seem to happen.
>

I did the attached patch for iscsi. It starts the queue and runs it. You 
still have to wait for the transport class to move from blocked to 
online or dead/not-present, so the queuecommand chkready functions can 
fail or finish the IO.
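
The dependency on the transport state can be sketched in user space like
this. Again this is only a model under my own assumptions: the enums and
function names are hypothetical and do not mirror the real chkready return
values. It shows that starting the queue is necessary but not sufficient;
the request only finishes once the port has left the blocked state.

```c
#include <assert.h>

/* Hypothetical port states, loosely named after the ones in the mail. */
enum model_port_state {
	MODEL_PORT_BLOCKED,
	MODEL_PORT_ONLINE,
	MODEL_PORT_NOT_PRESENT
};

/* Hypothetical outcomes of a readiness check in queuecommand. */
enum model_verdict {
	MODEL_RETRY_LATER,	/* port blocked: command is put back */
	MODEL_SEND,		/* port online: command goes to the hardware */
	MODEL_FAIL		/* port gone: command is failed */
};

/* Models a chkready-style helper that maps port state to a verdict. */
static enum model_verdict model_chkready(enum model_port_state state)
{
	switch (state) {
	case MODEL_PORT_ONLINE:
		return MODEL_SEND;
	case MODEL_PORT_BLOCKED:
		return MODEL_RETRY_LATER;
	default:
		return MODEL_FAIL;
	}
}

/*
 * Returns 1 if the request reaches a final outcome (sent or failed), so a
 * waiter could be woken; returns 0 if it stays pending.
 */
static int model_request_finishes(int queue_started,
				  enum model_port_state state)
{
	assert(queue_started == 0 || queue_started == 1);
	if (!queue_started)
		return 0;	/* stopped queue: nothing is even attempted */
	return model_chkready(state) != MODEL_RETRY_LATER;
}
```

With the queue started and the port still blocked, the model keeps the
request pending; it finishes only once the state is online or not-present.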

One thing I was worried about is the case where something stopped the
queues and really did not want IO sent, but does not have checks like the
ones FC and iSCSI have. Maybe we want to set some state on the scsi device
so that scsi_request_fn can check it and just fail IO immediately?
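
A rough user-space sketch of that idea, with everything in it hypothetical
(the state names, the struct and the error code are placeholders, not a
proposal for the actual scsi_device states): the request function looks at
a per-device state first and fails the IO instead of handing it to the
driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical device states for this model only. */
enum model_sdev_state {
	MODEL_SDEV_RUNNING,
	MODEL_SDEV_FAIL_IO	/* set when IO must not be sent any more */
};

/* Toy stand-in for a scsi device. */
struct model_sdev {
	enum model_sdev_state state;
	int sent_to_driver;	/* commands that reached the driver */
};

/*
 * Models a request function that checks the device state before
 * dispatching. Returns 0 if the command was sent to the driver, or
 * -EIO if it was failed immediately without touching the driver.
 */
static int model_request_fn(struct model_sdev *sdev)
{
	assert(sdev != NULL);
	if (sdev->state == MODEL_SDEV_FAIL_IO)
		return -EIO;	/* fail right away, nothing is sent */
	sdev->sent_to_driver++;
	return 0;
}
```

In this model the queue can be started unconditionally on removal, because
a device marked MODEL_SDEV_FAIL_IO never sends anything to the driver and
any waiter gets an error back instead of hanging.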

[-- Attachment #2: flush-scsi-dev.patch --]
[-- Type: text/plain, Size: 1731 bytes --]

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 9ade720..41c2625 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -67,8 +67,6 @@ static struct scsi_host_sg_pool scsi_sg_pools[] = {
 
 struct kmem_cache *scsi_sdb_cache;
 
-static void scsi_run_queue(struct request_queue *q);
-
 /*
  * Function:	scsi_unprep_request()
  *
@@ -397,7 +395,7 @@ static inline int scsi_host_is_busy(struct Scsi_Host *shost)
  * Notes:	The previous command was completely finished, start
  *		a new one if possible.
  */
-static void scsi_run_queue(struct request_queue *q)
+void scsi_run_queue(struct request_queue *q)
 {
 	struct scsi_device *sdev = q->queuedata;
 	struct Scsi_Host *shost = sdev->host;
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index b4056d1..d041cdb 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -26,6 +26,7 @@ extern int scsi_init_hosts(void);
 extern void scsi_exit_hosts(void);
 
 /* scsi.c */
+extern void scsi_run_queue(struct request_queue *q);
 extern int scsi_dispatch_cmd(struct scsi_cmnd *cmd);
 extern int scsi_setup_command_freelist(struct Scsi_Host *shost);
 extern void scsi_destroy_command_freelist(struct Scsi_Host *shost);
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index c3f6737..b829ffc 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -923,6 +923,9 @@ void __scsi_remove_device(struct scsi_device *sdev)
 		if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0)
 			return;
 
+		blk_start_queue(sdev->request_queue);
+		scsi_run_queue(sdev->request_queue);
+
 		bsg_unregister_queue(sdev->request_queue);
 		device_unregister(&sdev->sdev_dev);
 		transport_remove_device(dev);

Thread overview: 6+ messages
2010-09-02 11:05 Deleting SCSI device on blocked rport Christof Schmitt
2010-09-05 18:07 ` Mike Christie [this message]
2010-09-06 10:47   ` Christof Schmitt
2010-09-06 10:59     ` [PATCH] scsi: Unblock devices in state SDEV_CANCEL Christof Schmitt
2010-10-06  8:10       ` Mike Christie
2010-10-06 11:45         ` [PATCH v2] scsi: Unblock devices in SDEV_CANCEL Christof Schmitt