linuxppc-dev.lists.ozlabs.org archive mirror
From: "Matthew R. Ochs" <mrochs@linux.vnet.ibm.com>
To: linux-scsi@vger.kernel.org,
	James Bottomley <James.Bottomley@HansenPartnership.com>,
	"Nicholas A. Bellinger" <nab@linux-iscsi.org>,
	Brian King <brking@linux.vnet.ibm.com>,
	Ian Munsie <imunsie@au1.ibm.com>,
	Daniel Axtens <dja@ozlabs.au.ibm.com>,
	Andrew Donnellan <andrew.donnellan@au1.ibm.com>,
	Tomas Henzl <thenzl@redhat.com>,
	David Laight <David.Laight@ACULAB.COM>
Cc: Michael Neuling <mikey@neuling.org>,
	linuxppc-dev@lists.ozlabs.org,
	"Manoj N. Kumar" <manoj@linux.vnet.ibm.com>
Subject: [PATCH v4 32/32] cxlflash: Fix to avoid potential deadlock on EEH
Date: Fri, 25 Sep 2015 18:19:57 -0500
Message-ID: <1443223197-10153-1-git-send-email-mrochs@linux.vnet.ibm.com>
In-Reply-To: <1443222593-8828-1-git-send-email-mrochs@linux.vnet.ibm.com>

Ioctl threads that use scsi_execute() can run for a long time because
the call has lengthy timeouts and built-in retry logic. Under normal
operation this is not an issue. However, once EEH enters the picture,
a long execution time, coupled with the possibility that a timeout can
trigger entry to the driver via registered reset callbacks, becomes a
liability.

In particular, a deadlock can occur when an EEH event is encountered
while running in scsi_execute(). As part of the recovery, the EEH
handler drains all currently running ioctls, waiting until they have
completed before proceeding with a reset. As the scsi_execute() calls
are situated on the ioctl path, the EEH handler will wait until they
(and the remainder of the ioctl handler they're associated with) have
completed. Normally this would not be much of an issue aside from the
longer recovery period. Unfortunately, scsi_execute() triggers a reset
when it times out. The reset handler will see that the device is
already being reset and wait until that reset has completed. This
creates a condition where the EEH handler becomes stuck, waiting
indefinitely for the ioctl thread to complete.

To avoid this behavior, temporarily unmark the scsi_execute() thread
as an ioctl thread by releasing the ioctl read semaphore. This allows
the EEH handler to proceed with a recovery while the thread is still
running. Once scsi_execute() returns, the ioctl read semaphore is
reacquired and the adapter state is rechecked in case it changed while
inside of scsi_execute(). The state check will wait if the adapter is
still being recovered or return a failure if the recovery failed. In
the event that the adapter reset failed, the failure is simply returned
as the ioctl would be unable to continue.

Reported-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/superpipe.c | 30 +++++++++++++++++++++++++++++-
 drivers/scsi/cxlflash/superpipe.h |  2 ++
 drivers/scsi/cxlflash/vlun.c      | 29 +++++++++++++++++++++++++++++
 3 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
index f625e07..8af7cdc 100644
--- a/drivers/scsi/cxlflash/superpipe.c
+++ b/drivers/scsi/cxlflash/superpipe.c
@@ -283,6 +283,24 @@ out:
  * @sdev:	SCSI device associated with LUN.
  * @lli:	LUN destined for capacity request.
  *
+ * The READ_CAP16 can take quite a while to complete. Should an EEH occur while
+ * in scsi_execute(), the EEH handler will attempt to recover. As part of the
+ * recovery, the handler drains all currently running ioctls, waiting until they
+ * have completed before proceeding with a reset. As this routine is used on the
+ * ioctl path, this can create a condition where the EEH handler becomes stuck,
+ * infinitely waiting for this ioctl thread. To avoid this behavior, temporarily
+ * unmark this thread as an ioctl thread by releasing the ioctl read semaphore.
+ * This will allow the EEH handler to proceed with a recovery while this thread
+ * is still running. Once the scsi_execute() returns, reacquire the ioctl read
+ * semaphore and check the adapter state in case it changed while inside of
+ * scsi_execute(). The state check will wait if the adapter is still being
+ * recovered or return a failure if the recovery failed. In the event that the
+ * adapter reset failed, simply return the failure as the ioctl would be unable
+ * to continue.
+ *
+ * Note that the above puts a requirement on this routine to only be called on
+ * an ioctl thread.
+ *
  * Return: 0 on success, -errno on failure
  */
 static int read_cap16(struct scsi_device *sdev, struct llun_info *lli)
@@ -314,8 +332,18 @@ retry:
 	dev_dbg(dev, "%s: %ssending cmd(0x%x)\n", __func__,
 		retry_cnt ? "re" : "", scsi_cmd[0]);
 
+	/* Drop the ioctl read semaphore across lengthy call */
+	up_read(&cfg->ioctl_rwsem);
 	result = scsi_execute(sdev, scsi_cmd, DMA_FROM_DEVICE, cmd_buf,
 			      CMD_BUFSIZE, sense_buf, to, CMD_RETRIES, 0, NULL);
+	down_read(&cfg->ioctl_rwsem);
+	rc = check_state(cfg);
+	if (rc) {
+		dev_err(dev, "%s: Failed state! result=0x%08X\n",
+			__func__, result);
+		rc = -ENODEV;
+		goto out;
+	}
 
 	if (driver_byte(result) == DRIVER_SENSE) {
 		result &= ~(0xFF<<24); /* DRIVER_SENSE is not an error */
@@ -1221,7 +1249,7 @@ static const struct file_operations null_fops = {
  *
  * Return: 0 on success, -errno on failure
  */
-static int check_state(struct cxlflash_cfg *cfg)
+int check_state(struct cxlflash_cfg *cfg)
 {
 	struct device *dev = &cfg->dev->dev;
 	int rc = 0;
diff --git a/drivers/scsi/cxlflash/superpipe.h b/drivers/scsi/cxlflash/superpipe.h
index 7df88ee..06a805a 100644
--- a/drivers/scsi/cxlflash/superpipe.h
+++ b/drivers/scsi/cxlflash/superpipe.h
@@ -147,4 +147,6 @@ void cxlflash_ba_terminate(struct ba_lun *);
 
 int cxlflash_manage_lun(struct scsi_device *, struct dk_cxlflash_manage_lun *);
 
+int check_state(struct cxlflash_cfg *);
+
 #endif /* ifndef _CXLFLASH_SUPERPIPE_H */
diff --git a/drivers/scsi/cxlflash/vlun.c b/drivers/scsi/cxlflash/vlun.c
index b0eaf55..a53f583 100644
--- a/drivers/scsi/cxlflash/vlun.c
+++ b/drivers/scsi/cxlflash/vlun.c
@@ -400,6 +400,24 @@ static int init_vlun(struct llun_info *lli)
  * @lba:	Logical block address to start write same.
  * @nblks:	Number of logical blocks to write same.
  *
+ * The SCSI WRITE_SAME16 can take quite a while to complete. Should an EEH occur
+ * while in scsi_execute(), the EEH handler will attempt to recover. As part of
+ * the recovery, the handler drains all currently running ioctls, waiting until
+ * they have completed before proceeding with a reset. As this routine is used
+ * on the ioctl path, this can create a condition where the EEH handler becomes
+ * stuck, infinitely waiting for this ioctl thread. To avoid this behavior,
+ * temporarily unmark this thread as an ioctl thread by releasing the ioctl read
+ * semaphore. This will allow the EEH handler to proceed with a recovery while
+ * this thread is still running. Once the scsi_execute() returns, reacquire the
+ * ioctl read semaphore and check the adapter state in case it changed while
+ * inside of scsi_execute(). The state check will wait if the adapter is still
+ * being recovered or return a failure if the recovery failed. In the event that
+ * the adapter reset failed, simply return the failure as the ioctl would be
+ * unable to continue.
+ *
+ * Note that the above puts a requirement on this routine to only be called on
+ * an ioctl thread.
+ *
  * Return: 0 on success, -errno on failure
  */
 static int write_same16(struct scsi_device *sdev,
@@ -433,9 +451,20 @@ static int write_same16(struct scsi_device *sdev,
 		put_unaligned_be32(ws_limit < left ? ws_limit : left,
 				   &scsi_cmd[10]);
 
+		/* Drop the ioctl read semaphore across lengthy call */
+		up_read(&cfg->ioctl_rwsem);
 		result = scsi_execute(sdev, scsi_cmd, DMA_TO_DEVICE, cmd_buf,
 				      CMD_BUFSIZE, sense_buf, to, CMD_RETRIES,
 				      0, NULL);
+		down_read(&cfg->ioctl_rwsem);
+		rc = check_state(cfg);
+		if (rc) {
+			dev_err(dev, "%s: Failed state! result=0x%08X\n",
+				__func__, result);
+			rc = -ENODEV;
+			goto out;
+		}
+
 		if (result) {
 			dev_err_ratelimited(dev, "%s: command failed for "
 					    "offset %lld result=0x%x\n",
-- 
2.1.0
