All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/2] scsi: qla2xxx: fixes for driver unloading
@ 2020-04-21 20:46 mwilck
  2020-04-21 20:46 ` [PATCH v4 1/2] scsi: qla2xxx: set UNLOADING before waiting for session deletion mwilck
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: mwilck @ 2020-04-21 20:46 UTC (permalink / raw)
  To: Martin K. Petersen, Arun Easi, Quinn Tran
  Cc: Himanshu Madhani, Roman Bolshakov, Daniel Wagner, Bart Van Assche,
	James Bottomley, linux-scsi, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

Hello Martin, Arun, Himanshu, all,

here is v4 of the little series I first submitted on Nov 29, 2019.
I dropped the more controversial part, hoping to get the actual fix
for the issue merged.

Reviews welcome.
Martin

Changes since v3:
 - In patch 2, moved check from qla2x00_post_async_XYZ_work() to
   qla2x00_alloc_work() as suggested by Arun
 - Dropped patch 3-5. With 1 and 2 applied, I haven't been able to reproduce
   shutdown hangs any more.

Changes since v2:
 - Removed "scsi: qla2xxx: avoid sending mailbox commands if firmware is
   stopped", because the first hunk is obsoleted by the (new) 1/3, and Arun
   suggested to use a different approach (which is now in 4/3) for the second
   hunk.
 - Removed "scsi: qla2xxx: don't shut down firmware before closing sessions"
   (nak'd by Arun).
 - Former 3/3 is now 1/5
 - Added "scsi: qla2xxx: check UNLOADING before posting async work". This one
   is key for avoiding lags when qla2xxx is unloaded.
 - Added revert of "scsi: qla2xxx: Fix unbound sleep in fcport delete path.",
   as I believe it's now obsolete.
   If we ever encounter unbound sleep there again, we should rather figure
   out the reason than simply abort waiting.
 - Added patch 4 and 5, a new attempt at avoiding mailbox and HW request queue
   access at low level. 4/5 was motivated by Arun's comments on my v2 series.
   5/5 is obviously similar in spirit to 77ddb94a4853 ("scsi: qla2xxx: Only
   allow operational MBX to proceed during RESET."), but I found that the
   rom_cmds list contains commands that would hang when the FW is stopped,
   so I created a new list. Perhaps some day the two can be consolidated.

Changes since v1:
 - Added patch 3 to set the UNLOADING flag before waiting for sessions
   to end (Roman)
 - Use test_and_set_bit() for UNLOADING (Hannes)

Martin Wilck (2):
  scsi: qla2xxx: set UNLOADING before waiting for session deletion
  scsi: qla2xxx: check UNLOADING before posting async work

 drivers/scsi/qla2xxx/qla_os.c | 35 +++++++++++++++++------------------
 1 file changed, 17 insertions(+), 18 deletions(-)

-- 
2.26.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v4 1/2] scsi: qla2xxx: set UNLOADING before waiting for session deletion
  2020-04-21 20:46 [PATCH v4 0/2] scsi: qla2xxx: fixes for driver unloading mwilck
@ 2020-04-21 20:46 ` mwilck
  2020-04-21 21:18   ` himanshu.madhani
  2020-04-21 20:46 ` [PATCH v4 2/2] scsi: qla2xxx: check UNLOADING before posting async work mwilck
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: mwilck @ 2020-04-21 20:46 UTC (permalink / raw)
  To: Martin K. Petersen, Arun Easi, Quinn Tran
  Cc: Himanshu Madhani, Roman Bolshakov, Daniel Wagner, Bart Van Assche,
	James Bottomley, linux-scsi, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

The purpose of the UNLOADING flag is to avoid port login procedures
to continue when a controller is in the process of shutting down.
It makes sense to set this flag before starting session teardown.

Furthermore, use atomic test_and_set_bit() to avoid the shutdown
being run multiple times in parallel. In qla2x00_disable_board_on_pci_error(),
the test for UNLOADING is postponed until after the check for an already
disabled PCI board.

Fixes: 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting down chip")
Reviewed-by: Arun Easi <aeasi@marvell.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 drivers/scsi/qla2xxx/qla_os.c | 32 ++++++++++++++------------------
 1 file changed, 14 insertions(+), 18 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index d9072ea..ce0dabb 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -3732,6 +3732,13 @@ qla2x00_remove_one(struct pci_dev *pdev)
 	}
 	qla2x00_wait_for_hba_ready(base_vha);
 
+	/*
+	 * if UNLOADING flag is already set, then continue unload,
+	 * where it was set first.
+	 */
+	if (test_and_set_bit(UNLOADING, &base_vha->dpc_flags))
+		return;
+
 	if (IS_QLA25XX(ha) || IS_QLA2031(ha) || IS_QLA27XX(ha) ||
 	    IS_QLA28XX(ha)) {
 		if (ha->flags.fw_started)
@@ -3750,15 +3757,6 @@ qla2x00_remove_one(struct pci_dev *pdev)
 
 	qla2x00_wait_for_sess_deletion(base_vha);
 
-	/*
-	 * if UNLOAD flag is already set, then continue unload,
-	 * where it was set first.
-	 */
-	if (test_bit(UNLOADING, &base_vha->dpc_flags))
-		return;
-
-	set_bit(UNLOADING, &base_vha->dpc_flags);
-
 	qla_nvme_delete(base_vha);
 
 	dma_free_coherent(&ha->pdev->dev,
@@ -6628,13 +6626,6 @@ qla2x00_disable_board_on_pci_error(struct work_struct *work)
 	struct pci_dev *pdev = ha->pdev;
 	scsi_qla_host_t *base_vha = pci_get_drvdata(ha->pdev);
 
-	/*
-	 * if UNLOAD flag is already set, then continue unload,
-	 * where it was set first.
-	 */
-	if (test_bit(UNLOADING, &base_vha->dpc_flags))
-		return;
-
 	ql_log(ql_log_warn, base_vha, 0x015b,
 	    "Disabling adapter.\n");
 
@@ -6645,9 +6636,14 @@ qla2x00_disable_board_on_pci_error(struct work_struct *work)
 		return;
 	}
 
-	qla2x00_wait_for_sess_deletion(base_vha);
+	/*
+	 * if UNLOADING flag is already set, then continue unload,
+	 * where it was set first.
+	 */
+	if (test_and_set_bit(UNLOADING, &base_vha->dpc_flags))
+		return;
 
-	set_bit(UNLOADING, &base_vha->dpc_flags);
+	qla2x00_wait_for_sess_deletion(base_vha);
 
 	qla2x00_delete_all_vps(ha, base_vha);
 
-- 
2.26.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v4 2/2] scsi: qla2xxx: check UNLOADING before posting async work
  2020-04-21 20:46 [PATCH v4 0/2] scsi: qla2xxx: fixes for driver unloading mwilck
  2020-04-21 20:46 ` [PATCH v4 1/2] scsi: qla2xxx: set UNLOADING before waiting for session deletion mwilck
@ 2020-04-21 20:46 ` mwilck
  2020-04-21 21:18   ` himanshu.madhani
  2020-04-21 21:25 ` [EXT] [PATCH v4 0/2] scsi: qla2xxx: fixes for driver unloading Arun Easi
  2020-04-22  3:54 ` Martin K. Petersen
  3 siblings, 1 reply; 7+ messages in thread
From: mwilck @ 2020-04-21 20:46 UTC (permalink / raw)
  To: Martin K. Petersen, Arun Easi, Quinn Tran
  Cc: Himanshu Madhani, Roman Bolshakov, Daniel Wagner, Bart Van Assche,
	James Bottomley, linux-scsi, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

qlt_free_session_done() tries to post async PRLO / LOGO, and
waits for the completion of these async commands. If UNLOADING
is set, this is doomed to timeout, because the async logout
command will never complete.

The only way to avoid waiting pointlessly is to fail posting
these commands in the first place if the driver is in UNLOADING state.
In general, posting any command should be avoided when the driver
is UNLOADING.

With this patch, "rmmod qla2xxx" completes without noticeable delay.

Fixes: 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting down chip")
Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 drivers/scsi/qla2xxx/qla_os.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index ce0dabb..8cce3e2 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -4862,6 +4862,9 @@ qla2x00_alloc_work(struct scsi_qla_host *vha, enum qla_work_type type)
 	struct qla_work_evt *e;
 	uint8_t bail;
 
+	if (test_bit(UNLOADING, &vha->dpc_flags))
+		return NULL;
+
 	QLA_VHA_MARK_BUSY(vha, bail);
 	if (bail)
 		return NULL;
-- 
2.26.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 1/2] scsi: qla2xxx: set UNLOADING before waiting for session deletion
  2020-04-21 20:46 ` [PATCH v4 1/2] scsi: qla2xxx: set UNLOADING before waiting for session deletion mwilck
@ 2020-04-21 21:18   ` himanshu.madhani
  0 siblings, 0 replies; 7+ messages in thread
From: himanshu.madhani @ 2020-04-21 21:18 UTC (permalink / raw)
  To: mwilck, Martin K. Petersen, Arun Easi, Quinn Tran
  Cc: Roman Bolshakov, Daniel Wagner, Bart Van Assche, James Bottomley,
	linux-scsi



On 4/21/20 3:46 PM, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> The purpose of the UNLOADING flag is to avoid port login procedures
> to continue when a controller is in the process of shutting down.
> It makes sense to set this flag before starting session teardown.
> 
> Furthermore, use atomic test_and_set_bit() to avoid the shutdown
> being run multiple times in parallel. In qla2x00_disable_board_on_pci_error(),
> the test for UNLOADING is postponed until after the check for an already
> disabled PCI board.
> 
> Fixes: 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting down chip")
> Reviewed-by: Arun Easi <aeasi@marvell.com>
> Reviewed-by: Daniel Wagner <dwagner@suse.de>
> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>   drivers/scsi/qla2xxx/qla_os.c | 32 ++++++++++++++------------------
>   1 file changed, 14 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> index d9072ea..ce0dabb 100644
> --- a/drivers/scsi/qla2xxx/qla_os.c
> +++ b/drivers/scsi/qla2xxx/qla_os.c
> @@ -3732,6 +3732,13 @@ qla2x00_remove_one(struct pci_dev *pdev)
>   	}
>   	qla2x00_wait_for_hba_ready(base_vha);
>   
> +	/*
> +	 * if UNLOADING flag is already set, then continue unload,
> +	 * where it was set first.
> +	 */
> +	if (test_and_set_bit(UNLOADING, &base_vha->dpc_flags))
> +		return;
> +
>   	if (IS_QLA25XX(ha) || IS_QLA2031(ha) || IS_QLA27XX(ha) ||
>   	    IS_QLA28XX(ha)) {
>   		if (ha->flags.fw_started)
> @@ -3750,15 +3757,6 @@ qla2x00_remove_one(struct pci_dev *pdev)
>   
>   	qla2x00_wait_for_sess_deletion(base_vha);
>   
> -	/*
> -	 * if UNLOAD flag is already set, then continue unload,
> -	 * where it was set first.
> -	 */
> -	if (test_bit(UNLOADING, &base_vha->dpc_flags))
> -		return;
> -
> -	set_bit(UNLOADING, &base_vha->dpc_flags);
> -
>   	qla_nvme_delete(base_vha);
>   
>   	dma_free_coherent(&ha->pdev->dev,
> @@ -6628,13 +6626,6 @@ qla2x00_disable_board_on_pci_error(struct work_struct *work)
>   	struct pci_dev *pdev = ha->pdev;
>   	scsi_qla_host_t *base_vha = pci_get_drvdata(ha->pdev);
>   
> -	/*
> -	 * if UNLOAD flag is already set, then continue unload,
> -	 * where it was set first.
> -	 */
> -	if (test_bit(UNLOADING, &base_vha->dpc_flags))
> -		return;
> -
>   	ql_log(ql_log_warn, base_vha, 0x015b,
>   	    "Disabling adapter.\n");
>   
> @@ -6645,9 +6636,14 @@ qla2x00_disable_board_on_pci_error(struct work_struct *work)
>   		return;
>   	}
>   
> -	qla2x00_wait_for_sess_deletion(base_vha);
> +	/*
> +	 * if UNLOADING flag is already set, then continue unload,
> +	 * where it was set first.
> +	 */
> +	if (test_and_set_bit(UNLOADING, &base_vha->dpc_flags))
> +		return;
>   
> -	set_bit(UNLOADING, &base_vha->dpc_flags);
> +	qla2x00_wait_for_sess_deletion(base_vha);
>   
>   	qla2x00_delete_all_vps(ha, base_vha);
>   
> 

Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>

-- 
Himanshu Madhani
Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 2/2] scsi: qla2xxx: check UNLOADING before posting async work
  2020-04-21 20:46 ` [PATCH v4 2/2] scsi: qla2xxx: check UNLOADING before posting async work mwilck
@ 2020-04-21 21:18   ` himanshu.madhani
  0 siblings, 0 replies; 7+ messages in thread
From: himanshu.madhani @ 2020-04-21 21:18 UTC (permalink / raw)
  To: mwilck, Martin K. Petersen, Arun Easi, Quinn Tran
  Cc: Roman Bolshakov, Daniel Wagner, Bart Van Assche, James Bottomley,
	linux-scsi



On 4/21/20 3:46 PM, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> qlt_free_session_done() tries to post async PRLO / LOGO, and
> waits for the completion of these async commands. If UNLOADING
> is set, this is doomed to timeout, because the async logout
> command will never complete.
> 
> The only way to avoid waiting pointlessly is to fail posting
> these commands in the first place if the driver is in UNLOADING state.
> In general, posting any command should be avoided when the driver
> is UNLOADING.
> 
> With this patch, "rmmod qla2xxx" completes without noticeable delay.
> 
> Fixes: 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting down chip")
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>   drivers/scsi/qla2xxx/qla_os.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> index ce0dabb..8cce3e2 100644
> --- a/drivers/scsi/qla2xxx/qla_os.c
> +++ b/drivers/scsi/qla2xxx/qla_os.c
> @@ -4862,6 +4862,9 @@ qla2x00_alloc_work(struct scsi_qla_host *vha, enum qla_work_type type)
>   	struct qla_work_evt *e;
>   	uint8_t bail;
>   
> +	if (test_bit(UNLOADING, &vha->dpc_flags))
> +		return NULL;
> +
>   	QLA_VHA_MARK_BUSY(vha, bail);
>   	if (bail)
>   		return NULL;
> 

Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>

-- 
Himanshu Madhani
Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [EXT] [PATCH v4 0/2] scsi: qla2xxx: fixes for driver unloading
  2020-04-21 20:46 [PATCH v4 0/2] scsi: qla2xxx: fixes for driver unloading mwilck
  2020-04-21 20:46 ` [PATCH v4 1/2] scsi: qla2xxx: set UNLOADING before waiting for session deletion mwilck
  2020-04-21 20:46 ` [PATCH v4 2/2] scsi: qla2xxx: check UNLOADING before posting async work mwilck
@ 2020-04-21 21:25 ` Arun Easi
  2020-04-22  3:54 ` Martin K. Petersen
  3 siblings, 0 replies; 7+ messages in thread
From: Arun Easi @ 2020-04-21 21:25 UTC (permalink / raw)
  To: Martin Wilck
  Cc: Martin K. Petersen, Quinn Tran, Himanshu Madhani, Roman Bolshakov,
	Daniel Wagner, Bart Van Assche, James Bottomley, linux-scsi

Series look good.

Thank you for the patches and testing, Martin.

Regards,
-Arun

On Tue, 21 Apr 2020, 1:46pm, mwilck@suse.com wrote:

> External Email
> 
> ----------------------------------------------------------------------
> From: Martin Wilck <mwilck@suse.com>
> 
> Hello Martin, Arun, Himanshu, all,
> 
> here is v4 of the little series I first submitted on Nov 29, 2019.
> I dropped the more controversial part, hoping to get the actual fix
> for the issue merged.
> 
> Reviews welcome.
> Martin
> 
> Changes since v3:
>  - In patch 2, moved check from qla2x00_post_async_XYZ_work() to
>    qla2x00_alloc_work() as suggested by Arun
>  - Dropped patch 3-5. With 1 and 2 applied, I haven't been able to reproduce
>    shutdown hangs any more.
> 
> Changes since v2:
>  - Removed "scsi: qla2xxx: avoid sending mailbox commands if firmware is
>    stopped", because the first hunk is obsoleted by the (new) 1/3, and Arun
>    suggested to use a different approach (which is now in 4/3) for the second
>    hunk.
>  - Removed "scsi: qla2xxx: don't shut down firmware before closing sessions"
>    (nak'd by Arun).
>  - Former 3/3 is now 1/5
>  - Added "scsi: qla2xxx: check UNLOADING before posting async work". This one
>    is key for avoiding lags when qla2xxx is unloaded.
>  - Added revert of "scsi: qla2xxx: Fix unbound sleep in fcport delete path.",
>    as I believe it's now obsolete.
>    If we ever encounter unbound sleep there again, we should rather figure
>    out the reason than simply abort waiting.
>  - Added patch 4 and 5, a new attempt at avoiding mailbox and HW request queue
>    access at low level. 4/5 was motivated by Arun's comments on my v2 series.
>    5/5 is obviously similar in spirit to 77ddb94a4853 ("scsi: qla2xxx: Only
>    allow operational MBX to proceed during RESET."), but I found that the
>    rom_cmds list contains commands that would hang when the FW is stopped,
>    so I created a new list. Perhaps some day the two can be consolidated.
> 
> Changes since v1:
>  - Added patch 3 to set the UNLOADING flag before waiting for sessions
>    to end (Roman)
>  - Use test_and_set_bit() for UNLOADING (Hannes)
> 
> Martin Wilck (2):
>   scsi: qla2xxx: set UNLOADING before waiting for session deletion
>   scsi: qla2xxx: check UNLOADING before posting async work
> 
>  drivers/scsi/qla2xxx/qla_os.c | 35 +++++++++++++++++------------------
>  1 file changed, 17 insertions(+), 18 deletions(-)
> 
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 0/2] scsi: qla2xxx: fixes for driver unloading
  2020-04-21 20:46 [PATCH v4 0/2] scsi: qla2xxx: fixes for driver unloading mwilck
                   ` (2 preceding siblings ...)
  2020-04-21 21:25 ` [EXT] [PATCH v4 0/2] scsi: qla2xxx: fixes for driver unloading Arun Easi
@ 2020-04-22  3:54 ` Martin K. Petersen
  3 siblings, 0 replies; 7+ messages in thread
From: Martin K. Petersen @ 2020-04-22  3:54 UTC (permalink / raw)
  To: mwilck
  Cc: Martin K. Petersen, Arun Easi, Quinn Tran, Himanshu Madhani,
	Roman Bolshakov, Daniel Wagner, Bart Van Assche, James Bottomley,
	linux-scsi


Martin,

> here is v4 of the little series I first submitted on Nov 29, 2019.  I
> dropped the more controversial part, hoping to get the actual fix for
> the issue merged.

Applied to 5.7/scsi-fixes, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-04-22  3:54 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-04-21 20:46 [PATCH v4 0/2] scsi: qla2xxx: fixes for driver unloading mwilck
2020-04-21 20:46 ` [PATCH v4 1/2] scsi: qla2xxx: set UNLOADING before waiting for session deletion mwilck
2020-04-21 21:18   ` himanshu.madhani
2020-04-21 20:46 ` [PATCH v4 2/2] scsi: qla2xxx: check UNLOADING before posting async work mwilck
2020-04-21 21:18   ` himanshu.madhani
2020-04-21 21:25 ` [EXT] [PATCH v4 0/2] scsi: qla2xxx: fixes for driver unloading Arun Easi
2020-04-22  3:54 ` Martin K. Petersen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.