From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	"Matthew R. Ochs" <mrochs@linux.vnet.ibm.com>,
	Uma Krishnan <ukrishn@linux.vnet.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Sumit Semwal <sumit.semwal@linaro.org>
Subject: [PATCH 4.4 09/20] scsi: cxlflash: Fix to avoid EEH and host reset collisions
Date: Fri,  5 May 2017 11:32:59 -0700
Message-ID: <20170505183231.312396030@linuxfoundation.org>
In-Reply-To: <20170505183230.937615081@linuxfoundation.org>

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>

commit 1d3324c382b1a617eb567e3650dcb51f22dfec9a upstream.

The EEH reset handler is unaware of the current state of the driver
when processing a frozen event and initiating a device reset. This can
be an issue if an EEH event occurs while a user- or stack-initiated reset
is executing. More specifically, if an EEH occurs while the SCSI host
reset handler is active, the reset initiated by the EEH thread will
likely collide with the host reset thread. This can leave the device in
an inconsistent state, or worse, cause a system crash.

As a remedy, the EEH handler is updated to evaluate the device state and
take appropriate action (proceed, wait, or disconnect host). The host
reset handler is also updated to handle situations where an EEH occurred
during a host reset. In such situations, the host reset handler will
delay reporting back a success to give the EEH reset an opportunity to
complete.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Acked-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/scsi/cxlflash/main.c |   15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)
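
Reviewer note: the state handling described in the changelog can be
sketched as a small user-space model. This is only an illustration and not
driver code. The three STATE_* names are taken from the diff below; the
enum eeh_action values and both function names are hypothetical. The
success rule in host_reset_succeeds() is an assumption drawn from the
updated kernel-doc comment, since the hunk ends before the rest of the
switch is shown.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: only the three state names come from the diff. */
enum cfg_state { STATE_NORMAL, STATE_RESET, STATE_FAILTERM };

/* Illustrative names for "proceed, wait, or disconnect host". */
enum eeh_action { EEH_PROCEED, EEH_WAIT, EEH_DISCONNECT };

/* What the frozen-channel path does for a given device state. */
enum eeh_action eeh_frozen_action(enum cfg_state state)
{
	switch (state) {
	case STATE_RESET:
		return EEH_WAIT;	/* a reset is active: block until it ends */
	case STATE_FAILTERM:
		return EEH_DISCONNECT;	/* device already failed: disconnect */
	default:
		return EEH_PROCEED;	/* safe to start the EEH reset */
	}
}

/*
 * Assumption: the host reset reports SUCCESS only if the state seen
 * after waiting out any EEH recovery is STATE_NORMAL.
 */
bool host_reset_succeeds(enum cfg_state state_after_wait)
{
	return state_after_wait == STATE_NORMAL;
}

/* Small self-check that exercises the model. */
void model_self_check(void)
{
	assert(eeh_frozen_action(STATE_RESET) == EEH_WAIT);
	assert(host_reset_succeeds(STATE_NORMAL));
}
```

In the driver itself the "wait" case is a wait_event() on reset_waitq, after
which the state is evaluated again, so EEH_WAIT in this model is always
followed by a second call with the new state.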

--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -1962,6 +1962,11 @@ retry:
  * cxlflash_eh_host_reset_handler() - reset the host adapter
  * @scp:	SCSI command from stack identifying host.
  *
+ * Following a reset, the state is evaluated again in case an EEH occurred
+ * during the reset. In such a scenario, the host reset will either yield
+ * until the EEH recovery is complete or return success or failure based
+ * upon the current device state.
+ *
  * Return:
  *	SUCCESS as defined in scsi/scsi.h
  *	FAILED as defined in scsi/scsi.h
@@ -1993,7 +1998,8 @@ static int cxlflash_eh_host_reset_handle
 		} else
 			cfg->state = STATE_NORMAL;
 		wake_up_all(&cfg->reset_waitq);
-		break;
+		ssleep(1);
+		/* fall through */
 	case STATE_RESET:
 		wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
 		if (cfg->state == STATE_NORMAL)
@@ -2534,6 +2540,9 @@ static void drain_ioctls(struct cxlflash
  * @pdev:	PCI device struct.
  * @state:	PCI channel state.
  *
+ * When an EEH occurs during an active reset, wait until the reset is
+ * complete and then take action based upon the device state.
+ *
  * Return: PCI_ERS_RESULT_NEED_RESET or PCI_ERS_RESULT_DISCONNECT
  */
 static pci_ers_result_t cxlflash_pci_error_detected(struct pci_dev *pdev,
@@ -2547,6 +2556,10 @@ static pci_ers_result_t cxlflash_pci_err
 
 	switch (state) {
 	case pci_channel_io_frozen:
+		wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
+		if (cfg->state == STATE_FAILTERM)
+			return PCI_ERS_RESULT_DISCONNECT;
+
 		cfg->state = STATE_RESET;
 		scsi_block_requests(cfg->host);
 		drain_ioctls(cfg);

