From: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
To: linux-scsi@vger.kernel.org,
James Bottomley <jejb@linux.vnet.ibm.com>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
"Matthew R. Ochs" <mrochs@linux.vnet.ibm.com>,
"Manoj N. Kumar" <manoj@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>,
linuxppc-dev@lists.ozlabs.org, Ian Munsie <imunsie@au1.ibm.com>,
Andrew Donnellan <andrew.donnellan@au1.ibm.com>,
Frederic Barrat <fbarrat@linux.vnet.ibm.com>,
Christophe Lombard <clombard@linux.vnet.ibm.com>
Subject: [PATCH 3/6] cxlflash: Fix to avoid EEH and host reset collisions
Date: Fri, 2 Sep 2016 15:39:30 -0500
Message-ID: <1472848770-65034-1-git-send-email-ukrishn@linux.vnet.ibm.com>
In-Reply-To: <1472848612-64888-1-git-send-email-ukrishn@linux.vnet.ibm.com>
From: "Matthew R. Ochs" <mrochs@linux.vnet.ibm.com>
The EEH reset handler is ignorant of the current state of the
driver when processing a frozen event and initiating a device
reset. This can be an issue if an EEH event occurs while a user-
or stack-initiated reset is executing. More specifically, if an
EEH event occurs while the SCSI host reset handler is active, the
reset initiated by the EEH thread will likely collide with the host
reset thread. This can leave the device in an inconsistent state or,
worse, cause a system crash.
As a remedy, the EEH handler is updated to evaluate the device state
and take appropriate action (proceed, wait, or disconnect host). The
host reset handler is also updated to handle situations where an EEH
occurred during a host reset. In such situations, the host reset handler
will delay reporting back a success to give the EEH reset an opportunity
to complete.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
---
drivers/scsi/cxlflash/main.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
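
For reviewers, the decision logic described in the commit message can be
sketched in user space as below. This is only an illustrative model, not
driver code: every sketch_* and SK_* name is hypothetical, and the blocking
wait_event() on reset_waitq is modeled as a precondition (the state is
assumed to have already left the reset state). The mapping of a non-normal
state to a failed host reset is an assumption drawn from the updated
function comment, and the one-second delay before the host reset handler
re-evaluates the state is not modeled.

```c
#include <assert.h>

/* Hypothetical stand-ins for the driver's device states. */
enum sketch_state { SK_NORMAL, SK_RESET, SK_FAILTERM };

/* Hypothetical stand-ins for the PCI error recovery results. */
enum sketch_ers { SK_ERS_NEED_RESET, SK_ERS_DISCONNECT };

/* Hypothetical stand-ins for the SCSI error handler results. */
enum sketch_eh { SK_SUCCESS, SK_FAILED };

struct sketch_cfg {
	enum sketch_state state;
};

/*
 * Models the frozen-channel path of the EEH handler after its wait on
 * the reset wait queue has returned, i.e. no reset is in flight.
 */
enum sketch_ers sketch_error_detected_frozen(struct sketch_cfg *cfg)
{
	/* Precondition standing in for the blocking wait. */
	assert(cfg->state != SK_RESET);

	/* A terminally failed device is disconnected, not reset again. */
	if (cfg->state == SK_FAILTERM)
		return SK_ERS_DISCONNECT;

	/* Otherwise claim the reset state and request a slot reset. */
	cfg->state = SK_RESET;
	return SK_ERS_NEED_RESET;
}

/*
 * Models the tail of the host reset handler: once any EEH recovery that
 * started during the host reset has finished, the result is derived from
 * the state observed at that point (assumed mapping, see above).
 */
enum sketch_eh sketch_host_reset_result(enum sketch_state after_wait)
{
	return after_wait == SK_NORMAL ? SK_SUCCESS : SK_FAILED;
}
```

In this model, an EEH event that finds the device in the normal state takes
ownership of the reset, while one that finds it terminally failed asks for
a disconnect without touching the state.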
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 4c2559a..4ef5235 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -2042,6 +2042,11 @@ retry:
* cxlflash_eh_host_reset_handler() - reset the host adapter
* @scp: SCSI command from stack identifying host.
*
+ * Following a reset, the state is evaluated again in case an EEH occurred
+ * during the reset. In such a scenario, the host reset will either yield
+ * until the EEH recovery is complete or return success or failure based
+ * upon the current device state.
+ *
* Return:
* SUCCESS as defined in scsi/scsi.h
* FAILED as defined in scsi/scsi.h
@@ -2074,7 +2079,8 @@ static int cxlflash_eh_host_reset_handler(struct scsi_cmnd *scp)
} else
cfg->state = STATE_NORMAL;
wake_up_all(&cfg->reset_waitq);
- break;
+ ssleep(1);
+ /* fall through */
case STATE_RESET:
wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
if (cfg->state == STATE_NORMAL)
@@ -2590,6 +2596,9 @@ out_remove:
* @pdev: PCI device struct.
* @state: PCI channel state.
*
+ * When an EEH occurs during an active reset, wait until the reset is
+ * complete and then take action based upon the device state.
+ *
* Return: PCI_ERS_RESULT_NEED_RESET or PCI_ERS_RESULT_DISCONNECT
*/
static pci_ers_result_t cxlflash_pci_error_detected(struct pci_dev *pdev,
@@ -2603,6 +2612,10 @@ static pci_ers_result_t cxlflash_pci_error_detected(struct pci_dev *pdev,
switch (state) {
case pci_channel_io_frozen:
+ wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
+ if (cfg->state == STATE_FAILTERM)
+ return PCI_ERS_RESULT_DISCONNECT;
+
cfg->state = STATE_RESET;
scsi_block_requests(cfg->host);
drain_ioctls(cfg);
--
2.1.0