public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
* mpt3sas sleep from atomic context on v4.10
@ 2017-03-01  0:25 Omar Sandoval
  2017-03-01  1:07 ` Bart Van Assche
  0 siblings, 1 reply; 3+ messages in thread
From: Omar Sandoval @ 2017-03-01  0:25 UTC (permalink / raw)
  To: Bart Van Assche, linux-scsi
  Cc: Sathya Prakash, Chaitra P B, Suganath Prabu Subramani,
	Sreekanth Reddy, James E.J. Bottomley, Martin K. Petersen,
	kernel-team

I'm seeing this while testing on Linus' current master:

[  427.814466] WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:149 __handle_irq_event_percpu+0x187/0x190
[  427.832552] irq 116 handler _base_interrupt+0x0/0x9e0 [mpt3sas] enabled interrupts

I tracked it down to commit 669f044170d8 ("scsi: srp_transport: Move
queuecommand() wait code to SCSI core"). That commit made it so
scsi_internal_device_block() can sleep, but mpt3sas calls this from an
interrupt handler:

_base_interrupt
-> _base_async_event
   -> mpt3sas_scsih_event_callback
      -> _scsih_check_topo_delete_events
         -> _scsih_block_io_to_children_attached_directly
	    -> _scsih_block_io_device
	       -> _scsih_internal_device_block
	          -> scsi_internal_device_block

This change was made in 4.10. Bart, can you take a look?

Thanks.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: mpt3sas sleep from atomic context on v4.10
  2017-03-01  0:25 mpt3sas sleep from atomic context on v4.10 Omar Sandoval
@ 2017-03-01  1:07 ` Bart Van Assche
  2017-03-01  6:20   ` Omar Sandoval
  0 siblings, 1 reply; 3+ messages in thread
From: Bart Van Assche @ 2017-03-01  1:07 UTC (permalink / raw)
  To: linux-scsi@vger.kernel.org, osandov@osandov.com
  Cc: chaitra.basappa@broadcom.com, sathya.prakash@broadcom.com,
	suganath-prabu.subramani@broadcom.com,
	Sreekanth.Reddy@broadcom.com, martin.petersen@oracle.com,
	jejb@linux.vnet.ibm.com, kernel-team@fb.com

On Tue, 2017-02-28 at 16:25 -0800, Omar Sandoval wrote:
> I'm seeing this while testing on Linus' current master:
> 
> [  427.814466] WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:149 __handle_irq_event_percpu+0x187/0x190
> [  427.832552] irq 116 handler _base_interrupt+0x0/0x9e0 [mpt3sas] enabled interrupts
> 
> I tracked it down to commit 669f044170d8 ("scsi: srp_transport: Move
> queuecommand() wait code to SCSI core"). That commit made it so
> scsi_internal_device_block() can sleep, but mpt3sas calls this from an
> interrupt handler:
> 
> _base_interrupt
> -> _base_async_event
>    -> mpt3sas_scsih_event_callback
>       -> _scsih_check_topo_delete_events
>          -> _scsih_block_io_to_children_attached_directly
> 	    -> _scsih_block_io_device
> 	       -> _scsih_internal_device_block
> 	          -> scsi_internal_device_block
> 
> This change was made in 4.10. Bart, can you take a look?

How about the (entirely untested) patch below?

---
 drivers/scsi/mpt3sas/mpt3sas_base.h  |  3 ---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |  4 ++--
 drivers/scsi/scsi_lib.c              | 32 ++++++++++++++++++--------------
 drivers/scsi/scsi_priv.h             |  3 ---
 include/scsi/scsi_device.h           |  4 ++++
 5 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 4ab634fc27df..1aa7f97613ab 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -1444,9 +1444,6 @@ void mpt3sas_transport_update_links(struct MPT3SAS_ADAPTER *ioc,
 	u64 sas_address, u16 handle, u8 phy_number, u8 link_rate);
 extern struct sas_function_template mpt3sas_transport_functions;
 extern struct scsi_transport_template *mpt3sas_transport_template;
-extern int scsi_internal_device_block(struct scsi_device *sdev);
-extern int scsi_internal_device_unblock(struct scsi_device *sdev,
-				enum scsi_device_state new_state);
 /* trigger data externs */
 void mpt3sas_send_trigger_data_event(struct MPT3SAS_ADAPTER *ioc,
 	struct SL_WH_TRIGGERS_EVENT_DATA_T *event_data);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 46e866c36c8a..16d34a4bdc2e 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -2859,7 +2859,7 @@ _scsih_internal_device_block(struct scsi_device *sdev,
 	    sas_device_priv_data->sas_target->handle);
 	sas_device_priv_data->block = 1;
 
-	r = scsi_internal_device_block(sdev);
+	r = scsi_internal_device_block(sdev, false);
 	if (r == -EINVAL)
 		sdev_printk(KERN_WARNING, sdev,
 		    "device_block failed with return(%d) for handle(0x%04x)\n",
@@ -2895,7 +2895,7 @@ _scsih_internal_device_unblock(struct scsi_device *sdev,
 		    "performing a block followed by an unblock\n",
 		    r, sas_device_priv_data->sas_target->handle);
 		sas_device_priv_data->block = 1;
-		r = scsi_internal_device_block(sdev);
+		r = scsi_internal_device_block(sdev, false);
 		if (r)
 			sdev_printk(KERN_WARNING, sdev, "retried device_block "
 			    "failed with return(%d) for handle(0x%04x)\n",
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 3e32dc954c3c..77851697f130 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2945,6 +2945,8 @@ EXPORT_SYMBOL(scsi_target_resume);
 /**
  * scsi_internal_device_block - internal function to put a device temporarily into the SDEV_BLOCK state
  * @sdev:	device to block
+ * @wait:       Whether or not to wait until ongoing .queuecommand() /
+ *		.queue_rq() calls have finished.
  *
  * Block request made by scsi lld's to temporarily stop all
  * scsi commands on the specified device. May sleep.
@@ -2962,7 +2964,7 @@ EXPORT_SYMBOL(scsi_target_resume);
  * remove the rport mutex lock and unlock calls from srp_queuecommand().
  */
 int
-scsi_internal_device_block(struct scsi_device *sdev)
+scsi_internal_device_block(struct scsi_device *sdev, bool wait)
 {
 	struct request_queue *q = sdev->request_queue;
 	unsigned long flags;
@@ -2976,18 +2978,20 @@ scsi_internal_device_block(struct scsi_device *sdev)
 			return err;
 	}
 
-	/* 
-	 * The device has transitioned to SDEV_BLOCK.  Stop the
-	 * block layer from calling the midlayer with this device's
-	 * request queue. 
-	 */
-	if (q->mq_ops) {
-		blk_mq_quiesce_queue(q);
-	} else {
-		spin_lock_irqsave(q->queue_lock, flags);
-		blk_stop_queue(q);
-		spin_unlock_irqrestore(q->queue_lock, flags);
-		scsi_wait_for_queuecommand(sdev);
+	if (wait) {
+		/*
+		 * The device has transitioned to SDEV_BLOCK.  Stop the
+		 * block layer from calling the midlayer with this device's
+		 * request queue.
+		 */
+		if (q->mq_ops) {
+			blk_mq_quiesce_queue(q);
+		} else {
+			spin_lock_irqsave(q->queue_lock, flags);
+			blk_stop_queue(q);
+			spin_unlock_irqrestore(q->queue_lock, flags);
+			scsi_wait_for_queuecommand(sdev);
+		}
 	}
 
 	return 0;
@@ -3049,7 +3053,7 @@ EXPORT_SYMBOL_GPL(scsi_internal_device_unblock);
 static void
 device_block(struct scsi_device *sdev, void *data)
 {
-	scsi_internal_device_block(sdev);
+	scsi_internal_device_block(sdev, true);
 }
 
 static int
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 99bfc985e190..f11bd102d6d5 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -188,8 +188,5 @@ static inline void scsi_dh_remove_device(struct scsi_device *sdev) { }
  */
 
 #define SCSI_DEVICE_BLOCK_MAX_TIMEOUT	600	/* units in seconds */
-extern int scsi_internal_device_block(struct scsi_device *sdev);
-extern int scsi_internal_device_unblock(struct scsi_device *sdev,
-					enum scsi_device_state new_state);
 
 #endif /* _SCSI_PRIV_H */
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 8990e580b278..b5018df7ad54 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -474,6 +474,10 @@ static inline int scsi_device_created(struct scsi_device *sdev)
 		sdev->sdev_state == SDEV_CREATED_BLOCK;
 }
 
+int scsi_internal_device_block(struct scsi_device *sdev, bool wait);
+int scsi_internal_device_unblock(struct scsi_device *sdev,
+				 enum scsi_device_state new_state);
+
 /* accessor functions for the SCSI parameters */
 static inline int scsi_device_sync(struct scsi_device *sdev)
 {
-- 
2.12.0

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: mpt3sas sleep from atomic context on v4.10
  2017-03-01  1:07 ` Bart Van Assche
@ 2017-03-01  6:20   ` Omar Sandoval
  0 siblings, 0 replies; 3+ messages in thread
From: Omar Sandoval @ 2017-03-01  6:20 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi@vger.kernel.org, chaitra.basappa@broadcom.com,
	sathya.prakash@broadcom.com,
	suganath-prabu.subramani@broadcom.com,
	Sreekanth.Reddy@broadcom.com, martin.petersen@oracle.com,
	jejb@linux.vnet.ibm.com, kernel-team@fb.com

On Wed, Mar 01, 2017 at 01:07:12AM +0000, Bart Van Assche wrote:
> On Tue, 2017-02-28 at 16:25 -0800, Omar Sandoval wrote:
> > I'm seeing this while testing on Linus' current master:
> > 
> > [  427.814466] WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:149 __handle_irq_event_percpu+0x187/0x190
> > [  427.832552] irq 116 handler _base_interrupt+0x0/0x9e0 [mpt3sas] enabled interrupts
> > 
> > I tracked it down to commit 669f044170d8 ("scsi: srp_transport: Move
> > queuecommand() wait code to SCSI core"). That commit made it so
> > scsi_internal_device_block() can sleep, but mpt3sas calls this from an
> > interrupt handler:
> > 
> > _base_interrupt
> > -> _base_async_event
> >    -> mpt3sas_scsih_event_callback
> >       -> _scsih_check_topo_delete_events
> >          -> _scsih_block_io_to_children_attached_directly
> > 	    -> _scsih_block_io_device
> > 	       -> _scsih_internal_device_block
> > 	          -> scsi_internal_device_block
> > 
> > This change was made in 4.10. Bart, can you take a look?
> 
> How about the (entirely untested) patch below?
> 
> ---
>  drivers/scsi/mpt3sas/mpt3sas_base.h  |  3 ---
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c |  4 ++--
>  drivers/scsi/scsi_lib.c              | 32 ++++++++++++++++++--------------
>  drivers/scsi/scsi_priv.h             |  3 ---
>  include/scsi/scsi_device.h           |  4 ++++
>  5 files changed, 24 insertions(+), 22 deletions(-)

[snip]

> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 3e32dc954c3c..77851697f130 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2945,6 +2945,8 @@ EXPORT_SYMBOL(scsi_target_resume);
>  /**
>   * scsi_internal_device_block - internal function to put a device temporarily into the SDEV_BLOCK state
>   * @sdev:	device to block
> + * @wait:       Whether or not to wait until ongoing .queuecommand() /
> + *		.queue_rq() calls have finished.
>   *
>   * Block request made by scsi lld's to temporarily stop all
>   * scsi commands on the specified device. May sleep.
> @@ -2962,7 +2964,7 @@ EXPORT_SYMBOL(scsi_target_resume);
>   * remove the rport mutex lock and unlock calls from srp_queuecommand().
>   */
>  int
> -scsi_internal_device_block(struct scsi_device *sdev)
> +scsi_internal_device_block(struct scsi_device *sdev, bool wait)
>  {
>  	struct request_queue *q = sdev->request_queue;
>  	unsigned long flags;
> @@ -2976,18 +2978,20 @@ scsi_internal_device_block(struct scsi_device *sdev)
>  			return err;
>  	}
>  
> -	/* 
> -	 * The device has transitioned to SDEV_BLOCK.  Stop the
> -	 * block layer from calling the midlayer with this device's
> -	 * request queue. 
> -	 */
> -	if (q->mq_ops) {
> -		blk_mq_quiesce_queue(q);
> -	} else {
> -		spin_lock_irqsave(q->queue_lock, flags);
> -		blk_stop_queue(q);
> -		spin_unlock_irqrestore(q->queue_lock, flags);
> -		scsi_wait_for_queuecommand(sdev);
> +	if (wait) {
> +		/*
> +		 * The device has transitioned to SDEV_BLOCK.  Stop the
> +		 * block layer from calling the midlayer with this device's
> +		 * request queue.
> +		 */
> +		if (q->mq_ops) {
> +			blk_mq_quiesce_queue(q);
> +		} else {
> +			spin_lock_irqsave(q->queue_lock, flags);
> +			blk_stop_queue(q);
> +			spin_unlock_irqrestore(q->queue_lock, flags);
> +			scsi_wait_for_queuecommand(sdev);
> +		}
>  	}

I think here, we want this instead:

@@ -2987,7 +2989,8 @@ scsi_internal_device_block(struct scsi_device *sdev)
                spin_lock_irqsave(q->queue_lock, flags);
                blk_stop_queue(q);
                spin_unlock_irqrestore(q->queue_lock, flags);
-               scsi_wait_for_queuecommand(sdev);
+               if (wait)
+                       scsi_wait_for_queuecommand(sdev);
        }

        return 0;

That fixes the warnings for me.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2017-03-01  6:30 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-03-01  0:25 mpt3sas sleep from atomic context on v4.10 Omar Sandoval
2017-03-01  1:07 ` Bart Van Assche
2017-03-01  6:20   ` Omar Sandoval

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox