[PATCH 0/3] zfcp fixes

linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/3] zfcp fixes
@ 2016-12-09 16:16 Steffen Maier
  2016-12-09 16:16 ` [PATCH 1/3] zfcp: fix use-after-"free" in FC ingress path after TMF Steffen Maier
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Steffen Maier @ 2016-12-09 16:16 UTC (permalink / raw)
  To: James E . J . Bottomley, Martin K . Petersen
  Cc: linux-scsi, linux-s390, Martin Schwidefsky, Heiko Carstens,
	Steffen Maier

Hi all,

here is a series of important zfcp fixes all marked for stable.
They apply to the fixes branch of James' scsi.git or to v4.9-rc8.

Patch 1 is the most important one since it fixes a panic
that can occur under certain circumstances.

Benjamin Block (1):
  zfcp: fix use-after-"free" in FC ingress path after TMF

Steffen Maier (2):
  zfcp: do not trace pure benign residual HBA responses at default level
  zfcp: fix rport unblock race with LUN recovery

 drivers/s390/scsi/zfcp_dbf.c     | 17 +++++++++--
 drivers/s390/scsi/zfcp_dbf.h     | 41 +++++++++++++++++++++++++--
 drivers/s390/scsi/zfcp_erp.c     | 61 ++++++++++++++++++++++++++++++++++++++--
 drivers/s390/scsi/zfcp_ext.h     |  4 ++-
 drivers/s390/scsi/zfcp_fsf.h     |  3 +-
 drivers/s390/scsi/zfcp_reqlist.h | 30 +++++++++++++++++++-
 drivers/s390/scsi/zfcp_scsi.c    | 61 ++++++++++++++++++++++++++++++++++++----
 7 files changed, 202 insertions(+), 15 deletions(-)

-- 
2.8.4


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/3] zfcp: fix use-after-"free" in FC ingress path after TMF
  2016-12-09 16:16 [PATCH 0/3] zfcp fixes Steffen Maier
@ 2016-12-09 16:16 ` Steffen Maier
  2016-12-09 16:16 ` [PATCH 2/3] zfcp: do not trace pure benign residual HBA responses at default level Steffen Maier
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Steffen Maier @ 2016-12-09 16:16 UTC (permalink / raw)
  To: James E . J . Bottomley, Martin K . Petersen
  Cc: linux-scsi, linux-s390, Martin Schwidefsky, Heiko Carstens,
	Steffen Maier, Benjamin Block, #2 . 6 . 32+

From: Benjamin Block <bblock@linux.vnet.ibm.com>

When SCSI EH invokes zFCP's callbacks for eh_device_reset_handler() and
eh_target_reset_handler(), it expects us to relent the ownership over the
given scsi_cmnd and all other scsi_cmnds within the same scope - LUN or
target - when returning with SUCCESS from the callback ('release' them).
SCSI EH can then reuse those commands.

We did not follow this rule to release commands upon SUCCESS; and if
later a reply arrived for one of those supposed to be released commands,
we would still make use of the scsi_cmnd in our ingress tasklet. This
will at least result in undefined behavior or a kernel panic because of
a wrong kernel pointer dereference.

To fix this, we NULLify all pointers to scsi_cmnds
(struct zfcp_fsf_req *)->data in the matching scope if a TMF was
successful. This is done under the locks
(struct zfcp_adapter *)->abort_lock and (struct zfcp_reqlist *)->lock
to prevent the requests from being removed from the request-hashtable,
and the ingress tasklet from making use of the scsi_cmnd-pointer in
zfcp_fsf_fcp_cmnd_handler().

For cases where a reply arrives during SCSI EH, but before we get a
chance to NULLify the pointer - but before we return from the callback
-, we assume that the code is protected from races via the CAS operation
in blk_complete_request() that is called in scsi_done().

The following stacktrace shows an example for a crash resulting from the
previous behavior:

Unable to handle kernel pointer dereference at virtual kernel address fffffee17a672000
Oops: 0038 [#1] SMP
CPU: 2 PID: 0 Comm: swapper/2 Not tainted
task: 00000003f7ff5be0 ti: 00000003f3d38000 task.ti: 00000003f3d38000
Krnl PSW : 0404d00180000000 00000000001156b0 (smp_vcpu_scheduled+0x18/0x40)
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 EA:3
Krnl GPRS: 000000200000007e 0000000000000000 fffffee17a671fd8 0000000300000015
           ffffffff80000000 00000000005dfde8 07000003f7f80e00 000000004fa4e800
           000000036ce8d8f8 000000036ce8d9c0 00000003ece8fe00 ffffffff969c9e93
           00000003fffffffd 000000036ce8da10 00000000003bf134 00000003f3b07918
Krnl Code: 00000000001156a2: a7190000        lghi    %r1,0
           00000000001156a6: a7380015        lhi    %r3,21
          #00000000001156aa: e32050000008    ag    %r2,0(%r5)
          >00000000001156b0: 482022b0        lh    %r2,688(%r2)
           00000000001156b4: ae123000        sigp    %r1,%r2,0(%r3)
           00000000001156b8: b2220020        ipm    %r2
           00000000001156bc: 8820001c        srl    %r2,28
           00000000001156c0: c02700000001    xilf    %r2,1
Call Trace:
([<0000000000000000>] 0x0)
 [<000003ff807bdb8e>] zfcp_fsf_fcp_cmnd_handler+0x3de/0x490 [zfcp]
 [<000003ff807be30a>] zfcp_fsf_req_complete+0x252/0x800 [zfcp]
 [<000003ff807c0a48>] zfcp_fsf_reqid_check+0xe8/0x190 [zfcp]
 [<000003ff807c194e>] zfcp_qdio_int_resp+0x66/0x188 [zfcp]
 [<000003ff80440c64>] qdio_kick_handler+0xdc/0x310 [qdio]
 [<000003ff804463d0>] __tiqdio_inbound_processing+0xf8/0xcd8 [qdio]
 [<0000000000141fd4>] tasklet_action+0x9c/0x170
 [<0000000000141550>] __do_softirq+0xe8/0x258
 [<000000000010ce0a>] do_softirq+0xba/0xc0
 [<000000000014187c>] irq_exit+0xc4/0xe8
 [<000000000046b526>] do_IRQ+0x146/0x1d8
 [<00000000005d6a3c>] io_return+0x0/0x8
 [<00000000005d6422>] vtime_stop_cpu+0x4a/0xa0
([<0000000000000000>] 0x0)
 [<0000000000103d8a>] arch_cpu_idle+0xa2/0xb0
 [<0000000000197f94>] cpu_startup_entry+0x13c/0x1f8
 [<0000000000114782>] smp_start_secondary+0xda/0xe8
 [<00000000005d6efe>] restart_int_handler+0x56/0x6c
 [<0000000000000000>] 0x0
Last Breaking-Event-Address:
 [<00000000003bf12e>] arch_spin_lock_wait+0x56/0xb0

Suggested-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Fixes: ea127f9754 ("[PATCH] s390 (7/7): zfcp host adapter.") (tglx/history.git)
Cc: <stable@vger.kernel.org> #2.6.32+
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
---
 drivers/s390/scsi/zfcp_dbf.h     | 11 ++++++++
 drivers/s390/scsi/zfcp_reqlist.h | 30 ++++++++++++++++++++-
 drivers/s390/scsi/zfcp_scsi.c    | 57 ++++++++++++++++++++++++++++++++++++++--
 3 files changed, 95 insertions(+), 3 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_dbf.h b/drivers/s390/scsi/zfcp_dbf.h
index 36d07584271d..2d06b5d2c05b 100644
--- a/drivers/s390/scsi/zfcp_dbf.h
+++ b/drivers/s390/scsi/zfcp_dbf.h
@@ -388,4 +388,15 @@ void zfcp_dbf_scsi_devreset(char *tag, struct scsi_cmnd *scmnd, u8 flag)
 	_zfcp_dbf_scsi(tmp_tag, 1, scmnd, NULL);
 }
 
+/**
+ * zfcp_dbf_scsi_nullcmnd() - trace NULLify of SCSI command in dev/tgt-reset.
+ * @scmnd: SCSI command that was NULLified.
+ * @fsf_req: request that owned @scmnd.
+ */
+static inline void zfcp_dbf_scsi_nullcmnd(struct scsi_cmnd *scmnd,
+					  struct zfcp_fsf_req *fsf_req)
+{
+	_zfcp_dbf_scsi("scfc__1", 3, scmnd, fsf_req);
+}
+
 #endif /* ZFCP_DBF_H */
diff --git a/drivers/s390/scsi/zfcp_reqlist.h b/drivers/s390/scsi/zfcp_reqlist.h
index 7c2c6194dfca..703fce59befe 100644
--- a/drivers/s390/scsi/zfcp_reqlist.h
+++ b/drivers/s390/scsi/zfcp_reqlist.h
@@ -4,7 +4,7 @@
  * Data structure and helper functions for tracking pending FSF
  * requests.
  *
- * Copyright IBM Corp. 2009
+ * Copyright IBM Corp. 2009, 2016
  */
 
 #ifndef ZFCP_REQLIST_H
@@ -180,4 +180,32 @@ static inline void zfcp_reqlist_move(struct zfcp_reqlist *rl,
 	spin_unlock_irqrestore(&rl->lock, flags);
 }
 
+/**
+ * zfcp_reqlist_apply_for_all() - apply a function to every request.
+ * @rl: the requestlist that contains the target requests.
+ * @f: the function to apply to each request; the first parameter of the
+ *     function will be the target-request; the second parameter is the same
+ *     pointer as given with the argument @data.
+ * @data: freely chosen argument; passed through to @f as second parameter.
+ *
+ * Uses :c:macro:`list_for_each_entry` to iterate over the lists in the hash-
+ * table (not a 'safe' variant, so don't modify the list).
+ *
+ * Holds @rl->lock over the entire request-iteration.
+ */
+static inline void
+zfcp_reqlist_apply_for_all(struct zfcp_reqlist *rl,
+			   void (*f)(struct zfcp_fsf_req *, void *), void *data)
+{
+	struct zfcp_fsf_req *req;
+	unsigned long flags;
+	unsigned int i;
+
+	spin_lock_irqsave(&rl->lock, flags);
+	for (i = 0; i < ZFCP_REQ_LIST_BUCKETS; i++)
+		list_for_each_entry(req, &rl->buckets[i], list)
+			f(req, data);
+	spin_unlock_irqrestore(&rl->lock, flags);
+}
+
 #endif /* ZFCP_REQLIST_H */
diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index 9069f98a1817..11cd18c134c1 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -3,7 +3,7 @@
  *
  * Interface to Linux SCSI midlayer.
  *
- * Copyright IBM Corp. 2002, 2015
+ * Copyright IBM Corp. 2002, 2016
  */
 
 #define KMSG_COMPONENT "zfcp"
@@ -209,6 +209,57 @@ static int zfcp_scsi_eh_abort_handler(struct scsi_cmnd *scpnt)
 	return retval;
 }
 
+struct zfcp_scsi_req_filter {
+	u8 tmf_scope;
+	u32 lun_handle;
+	u32 port_handle;
+};
+
+static void zfcp_scsi_forget_cmnd(struct zfcp_fsf_req *old_req, void *data)
+{
+	struct zfcp_scsi_req_filter *filter =
+		(struct zfcp_scsi_req_filter *)data;
+
+	/* already aborted - prevent side-effects - or not a SCSI command */
+	if (old_req->data == NULL || old_req->fsf_command != FSF_QTCB_FCP_CMND)
+		return;
+
+	/* (tmf_scope == FCP_TMF_TGT_RESET || tmf_scope == FCP_TMF_LUN_RESET) */
+	if (old_req->qtcb->header.port_handle != filter->port_handle)
+		return;
+
+	if (filter->tmf_scope == FCP_TMF_LUN_RESET &&
+	    old_req->qtcb->header.lun_handle != filter->lun_handle)
+		return;
+
+	zfcp_dbf_scsi_nullcmnd((struct scsi_cmnd *)old_req->data, old_req);
+	old_req->data = NULL;
+}
+
+static void zfcp_scsi_forget_cmnds(struct zfcp_scsi_dev *zsdev, u8 tm_flags)
+{
+	struct zfcp_adapter *adapter = zsdev->port->adapter;
+	struct zfcp_scsi_req_filter filter = {
+		.tmf_scope = FCP_TMF_TGT_RESET,
+		.port_handle = zsdev->port->handle,
+	};
+	unsigned long flags;
+
+	if (tm_flags == FCP_TMF_LUN_RESET) {
+		filter.tmf_scope = FCP_TMF_LUN_RESET;
+		filter.lun_handle = zsdev->lun_handle;
+	}
+
+	/*
+	 * abort_lock secures against other processings - in the abort-function
+	 * and normal cmnd-handler - of (struct zfcp_fsf_req *)->data
+	 */
+	write_lock_irqsave(&adapter->abort_lock, flags);
+	zfcp_reqlist_apply_for_all(adapter->req_list, zfcp_scsi_forget_cmnd,
+				   &filter);
+	write_unlock_irqrestore(&adapter->abort_lock, flags);
+}
+
 static int zfcp_task_mgmt_function(struct scsi_cmnd *scpnt, u8 tm_flags)
 {
 	struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device);
@@ -241,8 +292,10 @@ static int zfcp_task_mgmt_function(struct scsi_cmnd *scpnt, u8 tm_flags)
 	if (fsf_req->status & ZFCP_STATUS_FSFREQ_TMFUNCFAILED) {
 		zfcp_dbf_scsi_devreset("fail", scpnt, tm_flags);
 		retval = FAILED;
-	} else
+	} else {
 		zfcp_dbf_scsi_devreset("okay", scpnt, tm_flags);
+		zfcp_scsi_forget_cmnds(zfcp_sdev, tm_flags);
+	}
 
 	zfcp_fsf_req_free(fsf_req);
 	return retval;
-- 
2.8.4

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/3] zfcp: do not trace pure benign residual HBA responses at default level
  2016-12-09 16:16 [PATCH 0/3] zfcp fixes Steffen Maier
  2016-12-09 16:16 ` [PATCH 1/3] zfcp: fix use-after-"free" in FC ingress path after TMF Steffen Maier
@ 2016-12-09 16:16 ` Steffen Maier
  2016-12-09 16:16 ` [PATCH 3/3] zfcp: fix rport unblock race with LUN recovery Steffen Maier
  2016-12-14 20:18 ` [PATCH 0/3] zfcp fixes Martin K. Petersen
  3 siblings, 0 replies; 5+ messages in thread
From: Steffen Maier @ 2016-12-09 16:16 UTC (permalink / raw)
  To: James E . J . Bottomley, Martin K . Petersen
  Cc: linux-scsi, linux-s390, Martin Schwidefsky, Heiko Carstens,
	Steffen Maier, #2 . 6 . 37+

Since quite a while, Linux issues enough SCSI commands per scsi_device
which successfully return with FCP_RESID_UNDER, FSF_FCP_RSP_AVAILABLE,
and SAM_STAT_GOOD.
This floods the HBA trace area and we cannot see other and
important HBA trace records long enough.

Therefore, do not trace HBA response errors for pure
benign residual under counts at the default trace level.

This excludes benign residual under count combined with other
validity bits set in FCP_RSP_IU, such as FCP_SNS_LEN_VAL.
For all those other cases, we still do want to see
both the HBA record and the corresponding SCSI record by default.

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: a54ca0f62f95 ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
Cc: <stable@vger.kernel.org> #2.6.37+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
---
 drivers/s390/scsi/zfcp_dbf.h | 30 ++++++++++++++++++++++++++++--
 drivers/s390/scsi/zfcp_fsf.h |  3 ++-
 2 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_dbf.h b/drivers/s390/scsi/zfcp_dbf.h
index 2d06b5d2c05b..db186d44cfaf 100644
--- a/drivers/s390/scsi/zfcp_dbf.h
+++ b/drivers/s390/scsi/zfcp_dbf.h
@@ -2,7 +2,7 @@
  * zfcp device driver
  * debug feature declarations
  *
- * Copyright IBM Corp. 2008, 2015
+ * Copyright IBM Corp. 2008, 2016
  */
 
 #ifndef ZFCP_DBF_H
@@ -283,6 +283,30 @@ struct zfcp_dbf {
 	struct zfcp_dbf_scsi		scsi_buf;
 };
 
+/**
+ * zfcp_dbf_hba_fsf_resp_suppress - true if we should not trace by default
+ * @req: request that has been completed
+ *
+ * Returns true if FCP response with only benign residual under count.
+ */
+static inline
+bool zfcp_dbf_hba_fsf_resp_suppress(struct zfcp_fsf_req *req)
+{
+	struct fsf_qtcb *qtcb = req->qtcb;
+	u32 fsf_stat = qtcb->header.fsf_status;
+	struct fcp_resp *fcp_rsp;
+	u8 rsp_flags, fr_status;
+
+	if (qtcb->prefix.qtcb_type != FSF_IO_COMMAND)
+		return false; /* not an FCP response */
+	fcp_rsp = (struct fcp_resp *)&qtcb->bottom.io.fcp_rsp;
+	rsp_flags = fcp_rsp->fr_flags;
+	fr_status = fcp_rsp->fr_status;
+	return (fsf_stat == FSF_FCP_RSP_AVAILABLE) &&
+		(rsp_flags == FCP_RESID_UNDER) &&
+		(fr_status == SAM_STAT_GOOD);
+}
+
 static inline
 void zfcp_dbf_hba_fsf_resp(char *tag, int level, struct zfcp_fsf_req *req)
 {
@@ -304,7 +328,9 @@ void zfcp_dbf_hba_fsf_response(struct zfcp_fsf_req *req)
 		zfcp_dbf_hba_fsf_resp("fs_perr", 1, req);
 
 	} else if (qtcb->header.fsf_status != FSF_GOOD) {
-		zfcp_dbf_hba_fsf_resp("fs_ferr", 1, req);
+		zfcp_dbf_hba_fsf_resp("fs_ferr",
+				      zfcp_dbf_hba_fsf_resp_suppress(req)
+				      ? 5 : 1, req);
 
 	} else if ((req->fsf_command == FSF_QTCB_OPEN_PORT_WITH_DID) ||
 		   (req->fsf_command == FSF_QTCB_OPEN_LUN)) {
diff --git a/drivers/s390/scsi/zfcp_fsf.h b/drivers/s390/scsi/zfcp_fsf.h
index be1c04b334c5..ea3c76ac0de1 100644
--- a/drivers/s390/scsi/zfcp_fsf.h
+++ b/drivers/s390/scsi/zfcp_fsf.h
@@ -3,7 +3,7 @@
  *
  * Interface to the FSF support functions.
  *
- * Copyright IBM Corp. 2002, 2015
+ * Copyright IBM Corp. 2002, 2016
  */
 
 #ifndef FSF_H
@@ -78,6 +78,7 @@
 #define FSF_APP_TAG_CHECK_FAILURE		0x00000082
 #define FSF_REF_TAG_CHECK_FAILURE		0x00000083
 #define FSF_ADAPTER_STATUS_AVAILABLE		0x000000AD
+#define FSF_FCP_RSP_AVAILABLE			0x000000AF
 #define FSF_UNKNOWN_COMMAND			0x000000E2
 #define FSF_UNKNOWN_OP_SUBTYPE                  0x000000E3
 #define FSF_INVALID_COMMAND_OPTION              0x000000E5
-- 
2.8.4

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/3] zfcp: fix rport unblock race with LUN recovery
  2016-12-09 16:16 [PATCH 0/3] zfcp fixes Steffen Maier
  2016-12-09 16:16 ` [PATCH 1/3] zfcp: fix use-after-"free" in FC ingress path after TMF Steffen Maier
  2016-12-09 16:16 ` [PATCH 2/3] zfcp: do not trace pure benign residual HBA responses at default level Steffen Maier
@ 2016-12-09 16:16 ` Steffen Maier
  2016-12-14 20:18 ` [PATCH 0/3] zfcp fixes Martin K. Petersen
  3 siblings, 0 replies; 5+ messages in thread
From: Steffen Maier @ 2016-12-09 16:16 UTC (permalink / raw)
  To: James E . J . Bottomley, Martin K . Petersen
  Cc: linux-scsi, linux-s390, Martin Schwidefsky, Heiko Carstens,
	Steffen Maier, #2 . 6 . 32+

It is unavoidable that zfcp_scsi_queuecommand() has to finish
requests with DID_IMM_RETRY (like fc_remote_port_chkready())
during the time window when zfcp detected an unavailable rport but
fc_remote_port_delete(), which is asynchronous via
zfcp_scsi_schedule_rport_block(), has not yet blocked the rport.

However, for the case when the rport becomes available again,
we should prevent unblocking the rport too early.
In contrast to other FCP LLDDs, zfcp has to open each LUN with the
FCP channel hardware before it can send I/O to a LUN.
So if a port already has LUNs attached and we unblock the rport just
after port recovery, recoveries of LUNs behind this port can still be
pending which in turn force zfcp_scsi_queuecommand() to unnecessarily
finish requests with DID_IMM_RETRY.

This also opens a time window with unblocked rport
(until the followup LUN reopen recovery has finished).
If a scsi_cmnd timeout occurs during this time window
fc_timed_out() cannot work as desired and such command
would indeed time out and trigger scsi_eh. This prevents
a clean and timely path failover.
This should not happen if the path issue can be recovered
on FC transport layer such as path issues involving RSCNs.

Fix this by only calling zfcp_scsi_schedule_rport_register(),
to asynchronously trigger fc_remote_port_add(),
after all LUN recoveries as children of the rport have finished
and no new recoveries of equal or higher order were triggered meanwhile.
Finished intentionally includes any recovery result no matter if
successful or failed (still unblock rport so other successful LUNs work).
For simplicity, we check after each finished LUN recovery if there
is another LUN recovery pending on the same port and then do nothing.
We handle the special case of a successful recovery of a port
without LUN children the same way without changing this case's semantics.

For debugging we introduce 2 new trace records written if the rport
unblock attempt was aborted due to still unfinished or freshly triggered
recovery. The records are only written above the default trace level.

Benjamin noticed the important special case of new recovery that can be
triggered between having given up the erp_lock and before calling
zfcp_erp_action_cleanup() within zfcp_erp_strategy().
We must avoid the following sequence:

ERP thread                 rport_work      other context
-------------------------  --------------  --------------------------------
port is unblocked, rport still blocked,
 due to pending/running ERP action,
 so ((port->status & ...UNBLOCK) != 0)
 and (port->rport == NULL)
unlock ERP
zfcp_erp_action_cleanup()
case ZFCP_ERP_ACTION_REOPEN_LUN:
zfcp_erp_try_rport_unblock()
((status & ...UNBLOCK) != 0) [OLD!]
                                           zfcp_erp_port_reopen()
                                           lock ERP
                                           zfcp_erp_port_block()
                                           port->status clear ...UNBLOCK
                                           unlock ERP
                                           zfcp_scsi_schedule_rport_block()
                                           port->rport_task = RPORT_DEL
                                           queue_work(rport_work)
                           zfcp_scsi_rport_work()
                           (port->rport_task != RPORT_ADD)
                           port->rport_task = RPORT_NONE
                           zfcp_scsi_rport_block()
                           if (!port->rport) return
zfcp_scsi_schedule_rport_register()
port->rport_task = RPORT_ADD
queue_work(rport_work)
                           zfcp_scsi_rport_work()
                           (port->rport_task == RPORT_ADD)
                           port->rport_task = RPORT_NONE
                           zfcp_scsi_rport_register()
                           (port->rport == NULL)
                           rport = fc_remote_port_add()
                           port->rport = rport;

Now the rport was erroneously unblocked while the zfcp_port is blocked.
This is another situation we want to avoid due to scsi_eh potential. This
state would at least remain until the new recovery from the other context
finished successfully, or potentially forever if it failed.
In order to close this race, we take the erp_lock inside
zfcp_erp_try_rport_unblock() when checking the status of zfcp_port or LUN.
With that, the possible corresponding rport state sequences would be:
(unblock[ERP thread],block[other context]) if the ERP thread gets erp_lock
first and still sees ((port->status & ...UNBLOCK) != 0),
(block[other context],NOP[ERP thread]) if the ERP thread gets erp_lock
after the other context has already cleard ...UNBLOCK from port->status.

Since checking fields of struct erp_action is unsafe because they could
have been overwritten (re-used for new recovery) meanwhile, we only check
status of zfcp_port and LUN since these are only changed under erp_lock
elsewhere. Regarding the check of the proper status flags
(port or port_forced are similar to the shown adapter recovery):

[zfcp_erp_adapter_shutdown()]
zfcp_erp_adapter_reopen()
 zfcp_erp_adapter_block()
  * clear UNBLOCK ---------------------------------------+
 zfcp_scsi_schedule_rports_block()                       |
 write_lock_irqsave(&adapter->erp_lock, flags);-------+  |
 zfcp_erp_action_enqueue()                            |  |
  zfcp_erp_setup_act()                                |  |
   * set ERP_INUSE -----------------------------------|--|--+
 write_unlock_irqrestore(&adapter->erp_lock, flags);--+  |  |
.context-switch.                                         |  |
zfcp_erp_thread()                                        |  |
 zfcp_erp_strategy()                                     |  |
  write_lock_irqsave(&adapter->erp_lock, flags);------+  |  |
  ...                                                 |  |  |
  zfcp_erp_strategy_check_target()                    |  |  |
   zfcp_erp_strategy_check_adapter()                  |  |  |
    zfcp_erp_adapter_unblock()                        |  |  |
     * set UNBLOCK -----------------------------------|--+  |
  zfcp_erp_action_dequeue()                           |     |
   * clear ERP_INUSE ---------------------------------|-----+
  ...                                                 |
  write_unlock_irqrestore(&adapter->erp_lock, flags);-+

Hence, we should check for both UNBLOCK and ERP_INUSE because they are
interleaved.
Also we need to explicitly check ERP_FAILED for the link down case
which currently does not clear the UNBLOCK flag in
zfcp_fsf_link_down_info_eval().

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 8830271c4819 ("[SCSI] zfcp: Dont fail SCSI commands when transitioning to blocked fc_rport")
Fixes: a2fa0aede07c ("[SCSI] zfcp: Block FC transport rports early on errors")
Fixes: 5f852be9e11d ("[SCSI] zfcp: Fix deadlock between zfcp ERP and SCSI")
Fixes: 338151e06608 ("[SCSI] zfcp: make use of fc_remote_port_delete when target port is unavailable")
Fixes: 3859f6a248cb ("[PATCH] zfcp: add rports to enable scsi_add_device to work again")
Cc: <stable@vger.kernel.org> #2.6.32+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
---
 drivers/s390/scsi/zfcp_dbf.c  | 17 +++++++++---
 drivers/s390/scsi/zfcp_erp.c  | 61 +++++++++++++++++++++++++++++++++++++++++--
 drivers/s390/scsi/zfcp_ext.h  |  4 ++-
 drivers/s390/scsi/zfcp_scsi.c |  4 +--
 4 files changed, 77 insertions(+), 9 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_dbf.c b/drivers/s390/scsi/zfcp_dbf.c
index 581001989937..d5bf36ec8a75 100644
--- a/drivers/s390/scsi/zfcp_dbf.c
+++ b/drivers/s390/scsi/zfcp_dbf.c
@@ -289,11 +289,12 @@ void zfcp_dbf_rec_trig(char *tag, struct zfcp_adapter *adapter,
 
 
 /**
- * zfcp_dbf_rec_run - trace event related to running recovery
+ * zfcp_dbf_rec_run_lvl - trace event related to running recovery
+ * @level: trace level to be used for event
  * @tag: identifier for event
  * @erp: erp_action running
  */
-void zfcp_dbf_rec_run(char *tag, struct zfcp_erp_action *erp)
+void zfcp_dbf_rec_run_lvl(int level, char *tag, struct zfcp_erp_action *erp)
 {
 	struct zfcp_dbf *dbf = erp->adapter->dbf;
 	struct zfcp_dbf_rec *rec = &dbf->rec_buf;
@@ -319,11 +320,21 @@ void zfcp_dbf_rec_run(char *tag, struct zfcp_erp_action *erp)
 	else
 		rec->u.run.rec_count = atomic_read(&erp->adapter->erp_counter);
 
-	debug_event(dbf->rec, 1, rec, sizeof(*rec));
+	debug_event(dbf->rec, level, rec, sizeof(*rec));
 	spin_unlock_irqrestore(&dbf->rec_lock, flags);
 }
 
 /**
+ * zfcp_dbf_rec_run - trace event related to running recovery
+ * @tag: identifier for event
+ * @erp: erp_action running
+ */
+void zfcp_dbf_rec_run(char *tag, struct zfcp_erp_action *erp)
+{
+	zfcp_dbf_rec_run_lvl(1, tag, erp);
+}
+
+/**
  * zfcp_dbf_rec_run_wka - trace wka port event with info like running recovery
  * @tag: identifier for event
  * @wka_port: well known address port
diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index a59d678125bd..7ccfce559034 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -3,7 +3,7 @@
  *
  * Error Recovery Procedures (ERP).
  *
- * Copyright IBM Corp. 2002, 2015
+ * Copyright IBM Corp. 2002, 2016
  */
 
 #define KMSG_COMPONENT "zfcp"
@@ -1204,6 +1204,62 @@ static void zfcp_erp_action_dequeue(struct zfcp_erp_action *erp_action)
 	}
 }
 
+/**
+ * zfcp_erp_try_rport_unblock - unblock rport if no more/new recovery
+ * @port: zfcp_port whose fc_rport we should try to unblock
+ */
+static void zfcp_erp_try_rport_unblock(struct zfcp_port *port)
+{
+	unsigned long flags;
+	struct zfcp_adapter *adapter = port->adapter;
+	int port_status;
+	struct Scsi_Host *shost = adapter->scsi_host;
+	struct scsi_device *sdev;
+
+	write_lock_irqsave(&adapter->erp_lock, flags);
+	port_status = atomic_read(&port->status);
+	if ((port_status & ZFCP_STATUS_COMMON_UNBLOCKED)    == 0 ||
+	    (port_status & (ZFCP_STATUS_COMMON_ERP_INUSE |
+			    ZFCP_STATUS_COMMON_ERP_FAILED)) != 0) {
+		/* new ERP of severity >= port triggered elsewhere meanwhile or
+		 * local link down (adapter erp_failed but not clear unblock)
+		 */
+		zfcp_dbf_rec_run_lvl(4, "ertru_p", &port->erp_action);
+		write_unlock_irqrestore(&adapter->erp_lock, flags);
+		return;
+	}
+	spin_lock(shost->host_lock);
+	__shost_for_each_device(sdev, shost) {
+		struct zfcp_scsi_dev *zsdev = sdev_to_zfcp(sdev);
+		int lun_status;
+
+		if (zsdev->port != port)
+			continue;
+		/* LUN under port of interest */
+		lun_status = atomic_read(&zsdev->status);
+		if ((lun_status & ZFCP_STATUS_COMMON_ERP_FAILED) != 0)
+			continue; /* unblock rport despite failed LUNs */
+		/* LUN recovery not given up yet [maybe follow-up pending] */
+		if ((lun_status & ZFCP_STATUS_COMMON_UNBLOCKED) == 0 ||
+		    (lun_status & ZFCP_STATUS_COMMON_ERP_INUSE) != 0) {
+			/* LUN blocked:
+			 * not yet unblocked [LUN recovery pending]
+			 * or meanwhile blocked [new LUN recovery triggered]
+			 */
+			zfcp_dbf_rec_run_lvl(4, "ertru_l", &zsdev->erp_action);
+			spin_unlock(shost->host_lock);
+			write_unlock_irqrestore(&adapter->erp_lock, flags);
+			return;
+		}
+	}
+	/* now port has no child or all children have completed recovery,
+	 * and no ERP of severity >= port was meanwhile triggered elsewhere
+	 */
+	zfcp_scsi_schedule_rport_register(port);
+	spin_unlock(shost->host_lock);
+	write_unlock_irqrestore(&adapter->erp_lock, flags);
+}
+
 static void zfcp_erp_action_cleanup(struct zfcp_erp_action *act, int result)
 {
 	struct zfcp_adapter *adapter = act->adapter;
@@ -1214,6 +1270,7 @@ static void zfcp_erp_action_cleanup(struct zfcp_erp_action *act, int result)
 	case ZFCP_ERP_ACTION_REOPEN_LUN:
 		if (!(act->status & ZFCP_STATUS_ERP_NO_REF))
 			scsi_device_put(sdev);
+		zfcp_erp_try_rport_unblock(port);
 		break;
 
 	case ZFCP_ERP_ACTION_REOPEN_PORT:
@@ -1224,7 +1281,7 @@ static void zfcp_erp_action_cleanup(struct zfcp_erp_action *act, int result)
 		 */
 		if (act->step != ZFCP_ERP_STEP_UNINITIALIZED)
 			if (result == ZFCP_ERP_SUCCEEDED)
-				zfcp_scsi_schedule_rport_register(port);
+				zfcp_erp_try_rport_unblock(port);
 		/* fall through */
 	case ZFCP_ERP_ACTION_REOPEN_PORT_FORCED:
 		put_device(&port->dev);
diff --git a/drivers/s390/scsi/zfcp_ext.h b/drivers/s390/scsi/zfcp_ext.h
index c8fed9fa1cca..21c8c689b02b 100644
--- a/drivers/s390/scsi/zfcp_ext.h
+++ b/drivers/s390/scsi/zfcp_ext.h
@@ -3,7 +3,7 @@
  *
  * External function declarations.
  *
- * Copyright IBM Corp. 2002, 2015
+ * Copyright IBM Corp. 2002, 2016
  */
 
 #ifndef ZFCP_EXT_H
@@ -35,6 +35,8 @@ extern void zfcp_dbf_adapter_unregister(struct zfcp_adapter *);
 extern void zfcp_dbf_rec_trig(char *, struct zfcp_adapter *,
 			      struct zfcp_port *, struct scsi_device *, u8, u8);
 extern void zfcp_dbf_rec_run(char *, struct zfcp_erp_action *);
+extern void zfcp_dbf_rec_run_lvl(int level, char *tag,
+				 struct zfcp_erp_action *erp);
 extern void zfcp_dbf_rec_run_wka(char *, struct zfcp_fc_wka_port *, u64);
 extern void zfcp_dbf_hba_fsf_uss(char *, struct zfcp_fsf_req *);
 extern void zfcp_dbf_hba_fsf_res(char *, int, struct zfcp_fsf_req *);
diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index 11cd18c134c1..07ffdbb5107f 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -88,9 +88,7 @@ int zfcp_scsi_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scpnt)
 	}
 
 	if (unlikely(!(status & ZFCP_STATUS_COMMON_UNBLOCKED))) {
-		/* This could be either
-		 * open LUN pending: this is temporary, will result in
-		 *	open LUN or ERP_FAILED, so retry command
+		/* This could be
 		 * call to rport_delete pending: mimic retry from
 		 * 	fc_remote_port_chkready until rport is BLOCKED
 		 */
-- 
2.8.4

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/3] zfcp fixes
  2016-12-09 16:16 [PATCH 0/3] zfcp fixes Steffen Maier
                   ` (2 preceding siblings ...)
  2016-12-09 16:16 ` [PATCH 3/3] zfcp: fix rport unblock race with LUN recovery Steffen Maier
@ 2016-12-14 20:18 ` Martin K. Petersen
  3 siblings, 0 replies; 5+ messages in thread
From: Martin K. Petersen @ 2016-12-14 20:18 UTC (permalink / raw)
  To: Steffen Maier
  Cc: James E . J . Bottomley, Martin K . Petersen, linux-scsi,
	linux-s390, Martin Schwidefsky, Heiko Carstens

>>>>> "Steffen" == Steffen Maier <maier@linux.vnet.ibm.com> writes:

Steffen,

Steffen> here is a series of important zfcp fixes all marked for stable.
Steffen> They apply to the fixes branch of James' scsi.git or to
Steffen> v4.9-rc8.

Applied to 4.10/scsi-fixes.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-12-14 20:19 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-12-09 16:16 [PATCH 0/3] zfcp fixes Steffen Maier
2016-12-09 16:16 ` [PATCH 1/3] zfcp: fix use-after-"free" in FC ingress path after TMF Steffen Maier
2016-12-09 16:16 ` [PATCH 2/3] zfcp: do not trace pure benign residual HBA responses at default level Steffen Maier
2016-12-09 16:16 ` [PATCH 3/3] zfcp: fix rport unblock race with LUN recovery Steffen Maier
2016-12-14 20:18 ` [PATCH 0/3] zfcp fixes Martin K. Petersen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).