* [PATCH] ipr: fix eeh recovery for 64-bit adapters
@ 2012-01-16 21:30 Kleber Sacilotto de Souza
2012-01-16 22:04 ` Brian King
From: Kleber Sacilotto de Souza @ 2012-01-16 21:30 UTC
To: linux-scsi
Cc: James E.J. Bottomley, Brian King, Wayne Boyer, Wendy Xiong,
Kleber Sacilotto de Souza
In some scenarios, an EEH error can take a long time to be detected, since the
driver issues an MMIO read only after a device reset command times out and we
try to reset the adapter. This patch adds a hardware register read to
ipr_cancel_op() so that the error is detected earlier when the op is being
aborted because of a timeout caused by a frozen adapter slot.
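To illustrate the idea, here is a minimal user-space sketch, not driver code. It assumes the usual EEH behaviour that an MMIO read from a frozen slot returns all ones, which is what prompts the platform to check for an error. The struct model_slot, model_readl() and model_cancel_op() names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an adapter slot; not a kernel structure. */
struct model_slot {
	bool frozen;        /* slot isolated by the platform after a PCI error */
	bool eeh_detected;  /* set once a read has exposed the error */
	uint32_t reg;       /* value the register holds while the slot is healthy */
};

/*
 * Models an MMIO read. A frozen slot returns all ones, and seeing that
 * value is the cue to go and check for an EEH event.
 */
static uint32_t model_readl(struct model_slot *slot)
{
	assert(slot);
	if (slot->frozen) {
		slot->eeh_detected = true;
		return 0xFFFFFFFFu;
	}
	return slot->reg;
}

/*
 * Models the abort path: touch a register first, so a frozen slot is
 * noticed here rather than after a later reset attempt times out.
 * Returns true if the read exposed an error.
 */
static bool model_cancel_op(struct model_slot *slot)
{
	(void)model_readl(slot);
	return slot->eeh_detected;
}
```

In this model the value read is discarded, as in the patch; only the side effect of the read matters.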
Another problem in such scenarios is that __ipr_eh_host_reset() changes the
dump state flag from WAIT_FOR_DUMP to GET_DUMP, and
ipr_reset_restore_cfg_space() later changes it from GET_DUMP to READ_DUMP.
However, if the PCI EEH code has already called ipr_reset_restore_cfg_space()
by the time the SCSI error handler calls __ipr_eh_host_reset(), the flag is
left in an inconsistent state. This patch also prevents this problem by
updating the flag only when no reset is already in progress.
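The ordering problem with the dump state flag can be sketched in user space as below. This is a simplified model under stated assumptions, not the driver's logic: the MODEL_* states, struct model_ioa_cfg and both helpers are hypothetical, and setting in_reset_reload inside model_eeh_reset() is a shortcut for "a reset started by EEH recovery is already running".

```c
#include <assert.h>
#include <stdbool.h>

/* Dump states named in the description; the numeric values are illustrative. */
enum model_sdt_state {
	MODEL_INACTIVE,
	MODEL_WAIT_FOR_DUMP,
	MODEL_GET_DUMP,
	MODEL_READ_DUMP
};

/* Hypothetical stand-in for the two fields of interest. */
struct model_ioa_cfg {
	bool in_reset_reload;
	enum model_sdt_state sdt_state;
};

/*
 * Models a reset started by EEH recovery, including the step where the
 * config space is restored and GET_DUMP advances to READ_DUMP.
 */
static void model_eeh_reset(struct model_ioa_cfg *cfg)
{
	assert(cfg);
	cfg->in_reset_reload = true;
	if (cfg->sdt_state == MODEL_GET_DUMP)
		cfg->sdt_state = MODEL_READ_DUMP;
}

/*
 * Models the guarded host reset: if a reset is already running, leave the
 * dump state alone, because nothing would advance GET_DUMP afterwards.
 */
static void model_eh_host_reset(struct model_ioa_cfg *cfg)
{
	assert(cfg);
	if (!cfg->in_reset_reload) {
		if (cfg->sdt_state == MODEL_WAIT_FOR_DUMP)
			cfg->sdt_state = MODEL_GET_DUMP;
	}
}
```

With the guard, calling model_eeh_reset() before model_eh_host_reset() leaves the state at WAIT_FOR_DUMP instead of stranding it at GET_DUMP.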
Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
---
drivers/scsi/ipr.c | 24 ++++++++++++++++++------
1 files changed, 18 insertions(+), 6 deletions(-)
diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c
index fd860d9..7969b5a 100644
--- a/drivers/scsi/ipr.c
+++ b/drivers/scsi/ipr.c
@@ -4613,11 +4613,13 @@ static int __ipr_eh_host_reset(struct scsi_cmnd * scsi_cmd)
 	ENTER;
 	ioa_cfg = (struct ipr_ioa_cfg *) scsi_cmd->device->host->hostdata;
 
-	dev_err(&ioa_cfg->pdev->dev,
-		"Adapter being reset as a result of error recovery.\n");
+	if (!ioa_cfg->in_reset_reload) {
+		dev_err(&ioa_cfg->pdev->dev,
+			"Adapter being reset as a result of error recovery.\n");
 
-	if (WAIT_FOR_DUMP == ioa_cfg->sdt_state)
-		ioa_cfg->sdt_state = GET_DUMP;
+		if (WAIT_FOR_DUMP == ioa_cfg->sdt_state)
+			ioa_cfg->sdt_state = GET_DUMP;
+	}
 
 	rc = ipr_reset_reload(ioa_cfg, IPR_SHUTDOWN_ABBREV);
 
@@ -4907,7 +4909,7 @@ static int ipr_cancel_op(struct scsi_cmnd * scsi_cmd)
 	struct ipr_ioa_cfg *ioa_cfg;
 	struct ipr_resource_entry *res;
 	struct ipr_cmd_pkt *cmd_pkt;
-	u32 ioasc;
+	u32 ioasc, int_reg;
 	int op_found = 0;
 
 	ENTER;
@@ -4920,7 +4922,17 @@ static int ipr_cancel_op(struct scsi_cmnd * scsi_cmd)
 	 */
 	if (ioa_cfg->in_reset_reload || ioa_cfg->ioa_is_dead)
 		return FAILED;
-	if (!res || !ipr_is_gscsi(res))
+	if (!res)
+		return FAILED;
+
+	/*
+	 * If we are aborting a timed out op, chances are that the timeout was caused
+	 * by a still not detected EEH error. In such cases, reading a register will
+	 * trigger the EEH recovery infrastructure.
+	 */
+	int_reg = readl(ioa_cfg->regs.sense_interrupt_reg);
+
+	if (!ipr_is_gscsi(res))
 		return FAILED;
 
 	list_for_each_entry(ipr_cmd, &ioa_cfg->pending_q, queue) {
--
1.7.1
end of thread, other threads:[~2012-02-14 17:41 UTC | newest]
Thread overview: 3+ messages
2012-01-16 21:30 [PATCH] ipr: fix eeh recovery for 64-bit adapters Kleber Sacilotto de Souza
2012-01-16 22:04 ` Brian King
2012-02-14 17:41 ` Kleber Sacilotto de Souza