linux-s390.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Steffen Maier <maier@linux.vnet.ibm.com>
To: "James E . J . Bottomley" <jejb@linux.vnet.ibm.com>,
	"Martin K . Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org, linux-s390@vger.kernel.org,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Steffen Maier <maier@linux.vnet.ibm.com>,
	"#2 . 6 . 32+" <stable@vger.kernel.org>
Subject: [PATCH 03/10] zfcp: close window with unblocked rport during rport gone
Date: Wed, 10 Aug 2016 18:30:46 +0200	[thread overview]
Message-ID: <1470846653-90691-4-git-send-email-maier@linux.vnet.ibm.com> (raw)
In-Reply-To: <1470846653-90691-1-git-send-email-maier@linux.vnet.ibm.com>

On a successful end of reopen port forced,
zfcp_erp_strategy_followup_success() re-uses the port erp_action
and the subsequent zfcp_erp_action_cleanup() now
sees ZFCP_ERP_SUCCEEDED with
erp_action->action==ZFCP_ERP_ACTION_REOPEN_PORT
instead of ZFCP_ERP_ACTION_REOPEN_PORT_FORCED
but must not perform zfcp_scsi_schedule_rport_register().

We can detect this because the fresh port reopen erp_action
is in its very first step ZFCP_ERP_STEP_UNINITIALIZED.

Otherwise this opens a time window with unblocked rport
(until the followup port reopen recovery would block it again).
If a scsi_cmnd timeout occurs during this time window
fc_timed_out() cannot work as desired and such command
would indeed time out and trigger scsi_eh. This prevents
a clean and timely path failover.
This should not happen if the path issue can be recovered
on FC transport layer such as path issues involving RSCNs.

Also, unnecessary and repeated DID_IMM_RETRY for pending and
undesired new requests occur because internally zfcp still
has its zfcp_port blocked.

As follow-on errors with scsi_eh, it can cause,
in the worst case, permanently lost paths due to one of:
sd <scsidev>: [<scsidisk>] Medium access timeout failure. Offlining disk!
sd <scsidev>: Device offlined - not ready after error recovery

For fix validation and to aid future debugging with other recoveries
we now also trace (un)blocking of rports.

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 5767620c383a ("[SCSI] zfcp: Do not unblock rport from REOPEN_PORT_FORCED")
Fixes: a2fa0aede07c ("[SCSI] zfcp: Block FC transport rports early on errors")
Fixes: 5f852be9e11d ("[SCSI] zfcp: Fix deadlock between zfcp ERP and SCSI")
Fixes: 338151e06608 ("[SCSI] zfcp: make use of fc_remote_port_delete when target port is unavailable")
Fixes: 3859f6a248cb ("[PATCH] zfcp: add rports to enable scsi_add_device to work again")
Cc: <stable@vger.kernel.org> #2.6.32+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
---
 drivers/s390/scsi/zfcp_dbf.h  |  7 ++++++-
 drivers/s390/scsi/zfcp_erp.c  | 12 +++++++++---
 drivers/s390/scsi/zfcp_scsi.c |  8 +++++++-
 3 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_dbf.h b/drivers/s390/scsi/zfcp_dbf.h
index 0be3d48681ae..7901deb4ba89 100644
--- a/drivers/s390/scsi/zfcp_dbf.h
+++ b/drivers/s390/scsi/zfcp_dbf.h
@@ -2,7 +2,7 @@
  * zfcp device driver
  * debug feature declarations
  *
- * Copyright IBM Corp. 2008, 2010
+ * Copyright IBM Corp. 2008, 2015
  */
 
 #ifndef ZFCP_DBF_H
@@ -17,6 +17,11 @@
 
 #define ZFCP_DBF_INVALID_LUN	0xFFFFFFFFFFFFFFFFull
 
+enum zfcp_dbf_pseudo_erp_act_type {
+	ZFCP_PSEUDO_ERP_ACTION_RPORT_ADD = 0xff,
+	ZFCP_PSEUDO_ERP_ACTION_RPORT_DEL = 0xfe,
+};
+
 /**
  * struct zfcp_dbf_rec_trigger - trace record for triggered recovery action
  * @ready: number of ready recovery actions
diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index 3fb410977014..a59d678125bd 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -3,7 +3,7 @@
  *
  * Error Recovery Procedures (ERP).
  *
- * Copyright IBM Corp. 2002, 2010
+ * Copyright IBM Corp. 2002, 2015
  */
 
 #define KMSG_COMPONENT "zfcp"
@@ -1217,8 +1217,14 @@ static void zfcp_erp_action_cleanup(struct zfcp_erp_action *act, int result)
 		break;
 
 	case ZFCP_ERP_ACTION_REOPEN_PORT:
-		if (result == ZFCP_ERP_SUCCEEDED)
-			zfcp_scsi_schedule_rport_register(port);
+		/* This switch case might also happen after a forced reopen
+		 * was successfully done and thus overwritten with a new
+		 * non-forced reopen at `ersfs_2'. In this case, we must not
+		 * do the clean-up of the non-forced version.
+		 */
+		if (act->step != ZFCP_ERP_STEP_UNINITIALIZED)
+			if (result == ZFCP_ERP_SUCCEEDED)
+				zfcp_scsi_schedule_rport_register(port);
 		/* fall through */
 	case ZFCP_ERP_ACTION_REOPEN_PORT_FORCED:
 		put_device(&port->dev);
diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index b3c6ff49103b..9069f98a1817 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -3,7 +3,7 @@
  *
  * Interface to Linux SCSI midlayer.
  *
- * Copyright IBM Corp. 2002, 2013
+ * Copyright IBM Corp. 2002, 2015
  */
 
 #define KMSG_COMPONENT "zfcp"
@@ -556,6 +556,9 @@ static void zfcp_scsi_rport_register(struct zfcp_port *port)
 	ids.port_id = port->d_id;
 	ids.roles = FC_RPORT_ROLE_FCP_TARGET;
 
+	zfcp_dbf_rec_trig("scpaddy", port->adapter, port, NULL,
+			  ZFCP_PSEUDO_ERP_ACTION_RPORT_ADD,
+			  ZFCP_PSEUDO_ERP_ACTION_RPORT_ADD);
 	rport = fc_remote_port_add(port->adapter->scsi_host, 0, &ids);
 	if (!rport) {
 		dev_err(&port->adapter->ccw_device->dev,
@@ -577,6 +580,9 @@ static void zfcp_scsi_rport_block(struct zfcp_port *port)
 	struct fc_rport *rport = port->rport;
 
 	if (rport) {
+		zfcp_dbf_rec_trig("scpdely", port->adapter, port, NULL,
+				  ZFCP_PSEUDO_ERP_ACTION_RPORT_DEL,
+				  ZFCP_PSEUDO_ERP_ACTION_RPORT_DEL);
 		fc_remote_port_delete(rport);
 		port->rport = NULL;
 	}
-- 
2.6.6

  parent reply	other threads:[~2016-08-10 16:30 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-10 16:30 [PATCH 00/10] zfcp fixes Steffen Maier
2016-08-10 16:30 ` [PATCH 01/10] zfcp: fix fc_host port_type with NPIV Steffen Maier
2016-08-11 11:19   ` Hannes Reinecke
2016-08-10 16:30 ` [PATCH 02/10] zfcp: fix ELS/GS request&response length for hardware data router Steffen Maier
2016-08-11 11:20   ` Hannes Reinecke
2016-08-10 16:30 ` Steffen Maier [this message]
2016-08-11 11:21   ` [PATCH 03/10] zfcp: close window with unblocked rport during rport gone Hannes Reinecke
2016-08-10 16:30 ` [PATCH 04/10] zfcp: retain trace level for SCSI and HBA FSF response records Steffen Maier
2016-08-11 11:21   ` Hannes Reinecke
2016-08-10 16:30 ` [PATCH 05/10] zfcp: restore: Dont use 0 to indicate invalid LUN in rec trace Steffen Maier
2016-08-11 11:22   ` Hannes Reinecke
2016-08-10 16:30 ` [PATCH 06/10] zfcp: trace on request for open and close of WKA port Steffen Maier
2016-08-11 11:22   ` Hannes Reinecke
2016-08-10 16:30 ` [PATCH 07/10] zfcp: restore tracing of handle for port and LUN with HBA records Steffen Maier
2016-08-11 11:23   ` Hannes Reinecke
2016-08-10 16:30 ` [PATCH 08/10] zfcp: fix D_ID field with actual value on tracing SAN responses Steffen Maier
2016-08-11 11:24   ` Hannes Reinecke
2016-08-10 16:30 ` [PATCH 09/10] zfcp: fix payload trace length for SAN request&response Steffen Maier
2016-08-11 11:24   ` Hannes Reinecke
2016-08-10 16:30 ` [PATCH 10/10] zfcp: trace full payload of all SAN records (req,resp,iels) Steffen Maier
2016-08-11 11:25   ` Hannes Reinecke
2016-08-12 20:18 ` [PATCH 00/10] zfcp fixes Martin K. Petersen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1470846653-90691-4-git-send-email-maier@linux.vnet.ibm.com \
    --to=maier@linux.vnet.ibm.com \
    --cc=heiko.carstens@de.ibm.com \
    --cc=jejb@linux.vnet.ibm.com \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=schwidefsky@de.ibm.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).