linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH v2 26/30] cxlflash: Fix to prevent EEH recovery failure
@ 2015-09-16 17:02 Matthew R. Ochs
  0 siblings, 0 replies; 3+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 17:02 UTC (permalink / raw)
  To: linux-scsi, James.Bottomley, nab, brking, imunsie, dja,
	andrew.donnellan
  Cc: mikey, linuxppc-dev, Manoj N. Kumar

The process_sense() routine can perform a read capacity which
can take some time to complete. If an EEH occurs while waiting
on the read capacity, the EEH handler is unable to obtain the
context's mutex in order to put the context in an error state.
The EEH handler will sit and wait until the context is free,
but this wait can last longer than the EEH handler tolerates,
leading to a failed recovery.

To address this issue, make the context unavailable to new,
non-system-owned threads and release the context mutex while
calling into process_sense(). After process_sense() returns, the
context mutex is reacquired and the context is made available
again. The context can safely be moved to the error state if
needed during the unavailable window, as no other threads will
hold a reference to it.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/superpipe.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
index fb79b79fe..1c5e9ac 100644
--- a/drivers/scsi/cxlflash/superpipe.c
+++ b/drivers/scsi/cxlflash/superpipe.c
@@ -1790,12 +1790,21 @@ static int cxlflash_disk_verify(struct scsi_device *sdev,
 	 * inquiry (i.e. the Unit attention is due to the WWN changing).
 	 */
 	if (verify->hint & DK_CXLFLASH_VERIFY_HINT_SENSE) {
+		/* Can't hold mutex across process_sense/read_cap16,
+		 * since we could have an intervening EEH event.
+		 */
+		ctxi->unavail = true;
+		mutex_unlock(&ctxi->mutex);
 		rc = process_sense(sdev, verify);
 		if (unlikely(rc)) {
 			dev_err(dev, "%s: Failed to validate sense data (%d)\n",
 				__func__, rc);
+			mutex_lock(&ctxi->mutex);
+			ctxi->unavail = false;
 			goto out;
 		}
+		mutex_lock(&ctxi->mutex);
+		ctxi->unavail = false;
 	}
 
 	switch (gli->mode) {
-- 
2.1.0
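
The locking pattern described in the commit message can be sketched in user space with pthreads. This is only an illustration under stated assumptions, not the driver's code: struct ctx, slow_operation(), recovery_handler(), get_context(), verify_with_window() and run_demo() are hypothetical names, with slow_operation() standing in for process_sense() and recovery_handler() standing in for the EEH handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the driver's context structure. */
struct ctx {
	pthread_mutex_t mutex;
	bool unavail;   /* true while the owner has dropped the mutex */
	bool err_state; /* set by the recovery handler */
};

/* Stand-in for a slow call such as process_sense(); sleeps 100 ms. */
static int slow_operation(void)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 100 * 1000 * 1000 };

	nanosleep(&ts, NULL);
	return 0;
}

/* Stand-in for the EEH handler: needs the mutex to flag the error state. */
static void *recovery_handler(void *arg)
{
	struct ctx *c = arg;

	pthread_mutex_lock(&c->mutex);
	c->err_state = true;
	pthread_mutex_unlock(&c->mutex);
	return NULL;
}

/* New user threads back off while the context is marked unavailable.
 * Returns 0 with the mutex held, or -1 with the mutex released.
 */
static int get_context(struct ctx *c)
{
	pthread_mutex_lock(&c->mutex);
	if (c->unavail) {
		pthread_mutex_unlock(&c->mutex);
		return -1;
	}
	return 0;
}

/* Caller holds c->mutex on entry and on return. */
static int verify_with_window(struct ctx *c)
{
	int rc;

	c->unavail = true;               /* fence off new users */
	pthread_mutex_unlock(&c->mutex); /* let the handler in */
	rc = slow_operation();
	pthread_mutex_lock(&c->mutex);   /* reacquire before touching state */
	c->unavail = false;
	return rc;
}

/* Starts the handler while the mutex is held, then opens the window. */
static int run_demo(struct ctx *c)
{
	pthread_t handler;
	int rc;

	assert(c != NULL);
	pthread_mutex_lock(&c->mutex);
	if (pthread_create(&handler, NULL, recovery_handler, c) != 0) {
		pthread_mutex_unlock(&c->mutex);
		return -1;
	}
	rc = verify_with_window(c);
	pthread_mutex_unlock(&c->mutex);
	pthread_join(handler, NULL);
	return rc;
}
```

Unlike the patch, which reacquires the mutex separately on the error and success paths, this sketch reacquires it once, immediately after the slow call and before rc is inspected. A real caller would presumably also need to recheck the context state after reacquiring the mutex, since the handler may have run during the window; the patch shown here does not include such a check.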


* [PATCH v2 26/30] cxlflash: Fix to prevent EEH recovery failure
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
@ 2015-09-16 21:32 ` Matthew R. Ochs
  2015-09-23 19:09   ` Brian King
  0 siblings, 1 reply; 3+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:32 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar


* Re: [PATCH v2 26/30] cxlflash: Fix to prevent EEH recovery failure
  2015-09-16 21:32 ` [PATCH v2 26/30] cxlflash: Fix to prevent EEH recovery failure Matthew R. Ochs
@ 2015-09-23 19:09   ` Brian King
  0 siblings, 0 replies; 3+ messages in thread
From: Brian King @ 2015-09-23 19:09 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

