[patch 00/13] zfcp fixes for 2.6.31-rc2

linux-s390.vger.kernel.org archive mirror

* [patch 00/13] zfcp fixes for 2.6.31-rc2
@ 2009-07-13 13:06 Christof Schmitt
  2009-07-13 13:06 ` [patch 01/13] zfcp: Fix invalid command order Christof Schmitt
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: Christof Schmitt @ 2009-07-13 13:06 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi, linux-s390, schwidefsky, heiko.carstens

James,

here is another series of zfcp fixes. About half of them result from
a test case where zfcp has to detect and recover from stalled
communication with the FCP channel; the other half are assorted fixes
across the driver.

The patches apply on top of the current git tree (2.6.31-rc2).

--
Christof

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [patch 01/13] zfcp: Fix invalid command order
  2009-07-13 13:06 [patch 00/13] zfcp fixes for 2.6.31-rc2 Christof Schmitt
@ 2009-07-13 13:06 ` Christof Schmitt
  2009-07-13 13:06 ` [patch 02/13] zfcp: Acquire qdio_stat_lock when reading the queue utilization Christof Schmitt
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Christof Schmitt @ 2009-07-13 13:06 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-scsi, linux-s390, schwidefsky, heiko.carstens,
	Swen Schillig, Christof Schmitt

[-- Attachment #1: 702-zfcp-invalid-order.diff --]
[-- Type: text/plain, Size: 1581 bytes --]

From: Swen Schillig <swen@vnet.ibm.com>

We should not modify the port status after triggering an ERP action
for the port. It is not guaranteed which status is finally active
when the ERP action is performed. This can lead to situations which
are unwanted and hard to debug in case of a failure.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
---

 drivers/s390/scsi/zfcp_fsf.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff -urpN linux-2.6/drivers/s390/scsi/zfcp_fsf.c linux-2.6-patched/drivers/s390/scsi/zfcp_fsf.c
--- linux-2.6/drivers/s390/scsi/zfcp_fsf.c	2009-07-12 21:08:41.000000000 +0200
+++ linux-2.6-patched/drivers/s390/scsi/zfcp_fsf.c	2009-07-12 21:08:56.000000000 +0200
@@ -1731,15 +1731,16 @@ static void zfcp_fsf_close_physical_port
 		zfcp_fsf_access_denied_port(req, port);
 		break;
 	case FSF_PORT_BOXED:
-		zfcp_erp_port_boxed(port, "fscpph2", req);
-		req->status |= ZFCP_STATUS_FSFREQ_ERROR |
-			       ZFCP_STATUS_FSFREQ_RETRY;
 		/* can't use generic zfcp_erp_modify_port_status because
 		 * ZFCP_STATUS_COMMON_OPEN must not be reset for the port */
 		atomic_clear_mask(ZFCP_STATUS_PORT_PHYS_OPEN, &port->status);
 		list_for_each_entry(unit, &port->unit_list_head, list)
 			atomic_clear_mask(ZFCP_STATUS_COMMON_OPEN,
 					  &unit->status);
+		zfcp_erp_port_boxed(port, "fscpph2", req);
+		req->status |= ZFCP_STATUS_FSFREQ_ERROR |
+			       ZFCP_STATUS_FSFREQ_RETRY;
+
 		break;
 	case FSF_ADAPTER_STATUS_AVAILABLE:
 		switch (header->fsf_status_qual.word[0]) {

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [patch 02/13] zfcp: Acquire qdio_stat_lock when reading the queue utilization
  2009-07-13 13:06 [patch 00/13] zfcp fixes for 2.6.31-rc2 Christof Schmitt
  2009-07-13 13:06 ` [patch 01/13] zfcp: Fix invalid command order Christof Schmitt
@ 2009-07-13 13:06 ` Christof Schmitt
  2009-07-13 13:06 ` [patch 03/13] zfcp: Return -ENOMEM for allocation failures in zfcp_fsf Christof Schmitt
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Christof Schmitt @ 2009-07-13 13:06 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-scsi, linux-s390, schwidefsky, heiko.carstens,
	Christof Schmitt

[-- Attachment #1: 707-zfcp-qdio_stat_lock.diff --]
[-- Type: text/plain, Size: 1204 bytes --]

From: Christof Schmitt <christof.schmitt@de.ibm.com>

req_q_util is not atomic, so the qdio_stat_lock must be held when
reading this variable.
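
Not part of the patch: a minimal userspace sketch of the same
copy-under-lock pattern. The struct, the helper names and the toy
spinlock built on C11 atomic_flag are assumptions made for
illustration; the driver itself uses spin_lock_bh() on
qdio_stat_lock.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the adapter statistics. */
struct adapter_stats {
	atomic_flag stat_lock;	/* plays the role of qdio_stat_lock */
	uint64_t req_q_util;	/* plain 64-bit counter, not atomic */
};

static void stats_init(struct adapter_stats *s)
{
	atomic_flag_clear(&s->stat_lock);
	s->req_q_util = 0;
}

static void stats_lock(struct adapter_stats *s)
{
	/* spin until the flag was previously clear */
	while (atomic_flag_test_and_set_explicit(&s->stat_lock,
						 memory_order_acquire))
		;
}

static void stats_unlock(struct adapter_stats *s)
{
	atomic_flag_clear_explicit(&s->stat_lock, memory_order_release);
}

/* Writer: update the counter with the lock held. */
static void stats_add_util(struct adapter_stats *s, uint64_t delta)
{
	stats_lock(s);
	s->req_q_util += delta;
	stats_unlock(s);
}

/* Reader: copy under the lock, then work on the private copy. */
static uint64_t stats_read_util(struct adapter_stats *s)
{
	uint64_t util;

	stats_lock(s);
	util = s->req_q_util;
	stats_unlock(s);
	return util;
}
```

The reader holds the lock only for the copy, so formatting the value
afterwards does not extend the critical section.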

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
---

 drivers/s390/scsi/zfcp_sysfs.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff -urpN linux-2.6/drivers/s390/scsi/zfcp_sysfs.c linux-2.6-patched/drivers/s390/scsi/zfcp_sysfs.c
--- linux-2.6/drivers/s390/scsi/zfcp_sysfs.c	2009-06-10 05:05:27.000000000 +0200
+++ linux-2.6-patched/drivers/s390/scsi/zfcp_sysfs.c	2009-07-12 21:08:58.000000000 +0200
@@ -494,9 +494,14 @@ static ssize_t zfcp_sysfs_adapter_q_full
 	struct Scsi_Host *scsi_host = class_to_shost(dev);
 	struct zfcp_adapter *adapter =
 		(struct zfcp_adapter *) scsi_host->hostdata[0];
+	u64 util;
+
+	spin_lock_bh(&adapter->qdio_stat_lock);
+	util = adapter->req_q_util;
+	spin_unlock_bh(&adapter->qdio_stat_lock);
 
 	return sprintf(buf, "%d %llu\n", atomic_read(&adapter->qdio_outb_full),
-		       (unsigned long long)adapter->req_q_util);
+		       (unsigned long long)util);
 }
 static DEVICE_ATTR(queue_full, S_IRUGO, zfcp_sysfs_adapter_q_full_show, NULL);
 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [patch 03/13] zfcp: Return -ENOMEM for allocation failures in zfcp_fsf
  2009-07-13 13:06 [patch 00/13] zfcp fixes for 2.6.31-rc2 Christof Schmitt
  2009-07-13 13:06 ` [patch 01/13] zfcp: Fix invalid command order Christof Schmitt
  2009-07-13 13:06 ` [patch 02/13] zfcp: Acquire qdio_stat_lock when reading the queue utilization Christof Schmitt
@ 2009-07-13 13:06 ` Christof Schmitt
  2009-07-13 13:06 ` [patch 04/13] zfcp: Use correct flags for zfcp_erp_notify Christof Schmitt
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Christof Schmitt @ 2009-07-13 13:06 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-scsi, linux-s390, schwidefsky, heiko.carstens,
	Christof Schmitt

[-- Attachment #1: 708-zfcp-enomem.diff --]
[-- Type: text/plain, Size: 702 bytes --]

From: Christof Schmitt <christof.schmitt@de.ibm.com>

When an fsf_req or a qtcb cannot be allocated, return -ENOMEM instead
of -EIO.

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
---

 drivers/s390/scsi/zfcp_fsf.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/s390/scsi/zfcp_fsf.c	2009-07-13 13:17:47.000000000 +0200
+++ b/drivers/s390/scsi/zfcp_fsf.c	2009-07-13 13:17:49.000000000 +0200
@@ -722,7 +722,7 @@ static struct zfcp_fsf_req *zfcp_fsf_req
 		req = zfcp_fsf_alloc_qtcb(pool);
 
 	if (unlikely(!req))
-		return ERR_PTR(-EIO);
+		return ERR_PTR(-ENOMEM);
 
 	if (adapter->req_no == 0)
 		adapter->req_no++;

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [patch 04/13] zfcp: Use correct flags for zfcp_erp_notify
  2009-07-13 13:06 [patch 00/13] zfcp fixes for 2.6.31-rc2 Christof Schmitt
                   ` (2 preceding siblings ...)
  2009-07-13 13:06 ` [patch 03/13] zfcp: Return -ENOMEM for allocation failures in zfcp_fsf Christof Schmitt
@ 2009-07-13 13:06 ` Christof Schmitt
  2009-07-13 13:06 ` [patch 05/13] zfcp: Use unchained mode for small ct and els requests Christof Schmitt
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Christof Schmitt @ 2009-07-13 13:06 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-scsi, linux-s390, schwidefsky, heiko.carstens,
	Christof Schmitt

[-- Attachment #1: 709-zfcp-erp-notify-flags.diff --]
[-- Type: text/plain, Size: 1206 bytes --]

From: Christof Schmitt <christof.schmitt@de.ibm.com>

zfcp_erp_notify uses the ZFCP_STATUS_ERP_* flags, so the correct flag
is ZFCP_STATUS_ERP_LOWMEM instead of ZFCP_ERP_NOMEM. Signalling
ZFCP_ERP_FAILED is not necessary; the missing d_id will show that the
nameserver did not return the d_id.

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
---

 drivers/s390/scsi/zfcp_erp.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -urpN linux-2.6/drivers/s390/scsi/zfcp_erp.c linux-2.6-patched/drivers/s390/scsi/zfcp_erp.c
--- linux-2.6/drivers/s390/scsi/zfcp_erp.c	2009-07-12 21:08:41.000000000 +0200
+++ linux-2.6-patched/drivers/s390/scsi/zfcp_erp.c	2009-07-12 21:08:59.000000000 +0200
@@ -854,10 +854,10 @@ void zfcp_erp_port_strategy_open_lookup(
 
 	retval = zfcp_fc_ns_gid_pn(&port->erp_action);
 	if (retval == -ENOMEM)
-		zfcp_erp_notify(&port->erp_action, ZFCP_ERP_NOMEM);
+		zfcp_erp_notify(&port->erp_action, ZFCP_STATUS_ERP_LOWMEM);
 	port->erp_action.step = ZFCP_ERP_STEP_NAMESERVER_LOOKUP;
 	if (retval)
-		zfcp_erp_notify(&port->erp_action, ZFCP_ERP_FAILED);
+		zfcp_erp_notify(&port->erp_action, 0);
 	zfcp_port_put(port);
 }
 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [patch 05/13] zfcp: Use unchained mode for small ct and els requests
  2009-07-13 13:06 [patch 00/13] zfcp fixes for 2.6.31-rc2 Christof Schmitt
                   ` (3 preceding siblings ...)
  2009-07-13 13:06 ` [patch 04/13] zfcp: Use correct flags for zfcp_erp_notify Christof Schmitt
@ 2009-07-13 13:06 ` Christof Schmitt
  2009-07-13 13:06 ` [patch 06/13] zfcp: Use -EIO for SBAL allocation failures Christof Schmitt
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Christof Schmitt @ 2009-07-13 13:06 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-scsi, linux-s390, schwidefsky, heiko.carstens,
	Christof Schmitt

[-- Attachment #1: 710-zfcp-unchained-mode.diff --]
[-- Type: text/plain, Size: 2370 bytes --]

From: Christof Schmitt <christof.schmitt@de.ibm.com>

The ELS ADISC and the GID_PN requests sent from zfcp fit into
unchained FSF requests. Change the FSF allocation logic to use
unchained requests whenever possible where everything fits in one
SBAL. This avoids acquiring more SBALs than necessary, especially
during zfcp recovery when things might be stalled.
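
Not part of the patch: a small userspace sketch of the decision
described above. The types, names and the 4096 byte page size are
assumptions for illustration only; the driver works on struct
scatterlist and PAGE_SIZE.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */

/* Hypothetical, simplified scatterlist entry. */
struct sg_entry {
	size_t length;
	bool is_last;
};

enum sbal_mode { SBAL_UNCHAINED, SBAL_CHAINED, SBAL_UNSUPPORTED };

/* A buffer fits one SBAL if it is a single entry of at most a page. */
static bool fits_one_sbal(const struct sg_entry *sg)
{
	return sg->is_last && sg->length <= SKETCH_PAGE_SIZE;
}

/*
 * Prefer the unchained form whenever request and response both fit;
 * fall back to chaining only if the adapter supports it.
 */
static enum sbal_mode choose_sbal_mode(const struct sg_entry *req,
				       const struct sg_entry *resp,
				       bool chained_supported)
{
	if (fits_one_sbal(req) && fits_one_sbal(resp))
		return SBAL_UNCHAINED;
	return chained_supported ? SBAL_CHAINED : SBAL_UNSUPPORTED;
}
```

With this ordering a small request never asks for more than one SBAL,
whether or not the adapter advertises chained SBALs.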

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
---

 drivers/s390/scsi/zfcp_fsf.c |   33 +++++++++++++++++++++++++--------
 1 file changed, 25 insertions(+), 8 deletions(-)

--- a/drivers/s390/scsi/zfcp_fsf.c	2009-07-13 13:17:49.000000000 +0200
+++ b/drivers/s390/scsi/zfcp_fsf.c	2009-07-13 13:17:55.000000000 +0200
@@ -1010,6 +1010,23 @@ skip_fsfstatus:
 		send_ct->handler(send_ct->handler_data);
 }
 
+static void zfcp_fsf_setup_ct_els_unchained(struct qdio_buffer_element *sbale,
+					    struct scatterlist *sg_req,
+					    struct scatterlist *sg_resp)
+{
+	sbale[0].flags |= SBAL_FLAGS0_TYPE_WRITE_READ;
+	sbale[2].addr   = sg_virt(sg_req);
+	sbale[2].length = sg_req->length;
+	sbale[3].addr   = sg_virt(sg_resp);
+	sbale[3].length = sg_resp->length;
+	sbale[3].flags |= SBAL_FLAGS_LAST_ENTRY;
+}
+
+static int zfcp_fsf_one_sbal(struct scatterlist *sg)
+{
+	return sg_is_last(sg) && sg->length <= PAGE_SIZE;
+}
+
 static int zfcp_fsf_setup_ct_els_sbals(struct zfcp_fsf_req *req,
 				       struct scatterlist *sg_req,
 				       struct scatterlist *sg_resp,
@@ -1020,16 +1037,16 @@ static int zfcp_fsf_setup_ct_els_sbals(s
 	int bytes;
 
 	if (!(feat & FSF_FEATURE_ELS_CT_CHAINED_SBALS)) {
-		if (sg_req->length > PAGE_SIZE || sg_resp->length > PAGE_SIZE ||
-		    !sg_is_last(sg_req) || !sg_is_last(sg_resp))
+		if (!zfcp_fsf_one_sbal(sg_req) || !zfcp_fsf_one_sbal(sg_resp))
 			return -EOPNOTSUPP;
 
-		sbale[0].flags |= SBAL_FLAGS0_TYPE_WRITE_READ;
-		sbale[2].addr   = sg_virt(sg_req);
-		sbale[2].length = sg_req->length;
-		sbale[3].addr   = sg_virt(sg_resp);
-		sbale[3].length = sg_resp->length;
-		sbale[3].flags |= SBAL_FLAGS_LAST_ENTRY;
+		zfcp_fsf_setup_ct_els_unchained(sbale, sg_req, sg_resp);
+		return 0;
+	}
+
+	/* use single, unchained SBAL if it can hold the request */
+	if (zfcp_fsf_one_sbal(sg_req) && zfcp_fsf_one_sbal(sg_resp)) {
+		zfcp_fsf_setup_ct_els_unchained(sbale, sg_req, sg_resp);
 		return 0;
 	}
 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [patch 06/13] zfcp: Use -EIO for SBAL allocation failures
  2009-07-13 13:06 [patch 00/13] zfcp fixes for 2.6.31-rc2 Christof Schmitt
                   ` (4 preceding siblings ...)
  2009-07-13 13:06 ` [patch 05/13] zfcp: Use unchained mode for small ct and els requests Christof Schmitt
@ 2009-07-13 13:06 ` Christof Schmitt
  2009-07-13 13:06 ` [patch 07/13] zfcp: Fix logic for physical port close Christof Schmitt
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Christof Schmitt @ 2009-07-13 13:06 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-scsi, linux-s390, schwidefsky, heiko.carstens,
	Christof Schmitt

[-- Attachment #1: 711-zfcp-sbal-alloc-failuer.diff --]
[-- Type: text/plain, Size: 1274 bytes --]

From: Christof Schmitt <christof.schmitt@de.ibm.com>

-ENOMEM is for memory allocation problems, -EIO for queue/SBAL
allocation problems.

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
---

 drivers/s390/scsi/zfcp_fsf.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/drivers/s390/scsi/zfcp_fsf.c	2009-07-13 13:46:05.000000000 +0200
+++ b/drivers/s390/scsi/zfcp_fsf.c	2009-07-13 14:03:14.000000000 +0200
@@ -1053,14 +1053,14 @@ static int zfcp_fsf_setup_ct_els_sbals(s
 	bytes = zfcp_qdio_sbals_from_sg(req, SBAL_FLAGS0_TYPE_WRITE_READ,
 					sg_req, max_sbals);
 	if (bytes <= 0)
-		return -ENOMEM;
+		return -EIO;
 	req->qtcb->bottom.support.req_buf_length = bytes;
 	req->sbale_curr = ZFCP_LAST_SBALE_PER_SBAL;
 
 	bytes = zfcp_qdio_sbals_from_sg(req, SBAL_FLAGS0_TYPE_WRITE_READ,
 					sg_resp, max_sbals);
 	if (bytes <= 0)
-		return -ENOMEM;
+		return -EIO;
 	req->qtcb->bottom.support.resp_buf_length = bytes;
 
 	return 0;
@@ -2559,7 +2559,6 @@ struct zfcp_fsf_req *zfcp_fsf_control_fi
 	bytes = zfcp_qdio_sbals_from_sg(req, direction, fsf_cfdc->sg,
 					FSF_MAX_SBALS_PER_REQ);
 	if (bytes != ZFCP_CFDC_MAX_SIZE) {
-		retval = -ENOMEM;
 		zfcp_fsf_req_free(req);
 		goto out;
 	}

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [patch 07/13] zfcp: Fix logic for physical port close
  2009-07-13 13:06 [patch 00/13] zfcp fixes for 2.6.31-rc2 Christof Schmitt
                   ` (5 preceding siblings ...)
  2009-07-13 13:06 ` [patch 06/13] zfcp: Use -EIO for SBAL allocation failures Christof Schmitt
@ 2009-07-13 13:06 ` Christof Schmitt
  2009-07-13 13:06 ` [patch 08/13] zfcp: Fix erp escalation procedure Christof Schmitt
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Christof Schmitt @ 2009-07-13 13:06 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-scsi, linux-s390, schwidefsky, heiko.carstens,
	Christof Schmitt

[-- Attachment #1: 713-zfcp-physical-port-close.diff --]
[-- Type: text/plain, Size: 896 bytes --]

From: Christof Schmitt <christof.schmitt@de.ibm.com>

After closing the port, we want it to be "not open" to consider the
action to be successful.

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
---

 drivers/s390/scsi/zfcp_erp.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -urpN linux-2.6/drivers/s390/scsi/zfcp_erp.c linux-2.6-patched/drivers/s390/scsi/zfcp_erp.c
--- linux-2.6/drivers/s390/scsi/zfcp_erp.c	2009-07-12 21:08:59.000000000 +0200
+++ linux-2.6-patched/drivers/s390/scsi/zfcp_erp.c	2009-07-12 21:09:01.000000000 +0200
@@ -801,7 +801,7 @@ static int zfcp_erp_port_forced_strategy
 			return ZFCP_ERP_FAILED;
 
 	case ZFCP_ERP_STEP_PHYS_PORT_CLOSING:
-		if (status & ZFCP_STATUS_PORT_PHYS_OPEN)
+		if (!(status & ZFCP_STATUS_PORT_PHYS_OPEN))
 			return ZFCP_ERP_SUCCEEDED;
 	}
 	return ZFCP_ERP_FAILED;

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [patch 08/13] zfcp: Fix erp escalation procedure
  2009-07-13 13:06 [patch 00/13] zfcp fixes for 2.6.31-rc2 Christof Schmitt
                   ` (6 preceding siblings ...)
  2009-07-13 13:06 ` [patch 07/13] zfcp: Fix logic for physical port close Christof Schmitt
@ 2009-07-13 13:06 ` Christof Schmitt
  2009-07-13 13:06 ` [patch 09/13] zfcp: Recover from stalled outbound queue Christof Schmitt
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Christof Schmitt @ 2009-07-13 13:06 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-scsi, linux-s390, schwidefsky, heiko.carstens,
	Christof Schmitt

[-- Attachment #1: 714-zfcp-erp-escalation.diff --]
[-- Type: text/plain, Size: 3350 bytes --]

From: Christof Schmitt <christof.schmitt@de.ibm.com>

If an action fails, retry it until the erp count exceeds the
threshold. If there is something fundamentally wrong, the FSF layer
will trigger a more appropriate action depending on the FSF status
codes.

The followup for successful actions differs from the retry of failed
actions, so split the code into two functions to make this clear.
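
Not part of the patch: the two followup tables written as pure
functions, as a sketch. The enum names are made up for illustration;
the mapping follows the diff below, where a failed action is retried
at the same level and a successful action triggers the next lower
level.

```c
/* Hypothetical names for the four ERP actions. */
enum erp_action {
	ACT_REOPEN_ADAPTER,
	ACT_REOPEN_PORT_FORCED,
	ACT_REOPEN_PORT,
	ACT_REOPEN_UNIT,
};

/* Hypothetical names for the followup that gets triggered. */
enum erp_followup {
	FU_NONE,
	FU_ADAPTER_REOPEN,
	FU_PORT_FORCED_REOPEN,
	FU_PORT_REOPEN,
	FU_PORT_REOPEN_ALL,
	FU_UNIT_REOPEN,
	FU_UNIT_REOPEN_ALL,
};

/* Failed action: retry the same action, no escalation. */
static enum erp_followup followup_failed(enum erp_action act)
{
	switch (act) {
	case ACT_REOPEN_ADAPTER:
		return FU_ADAPTER_REOPEN;
	case ACT_REOPEN_PORT_FORCED:
		return FU_PORT_FORCED_REOPEN;
	case ACT_REOPEN_PORT:
		return FU_PORT_REOPEN;
	case ACT_REOPEN_UNIT:
		return FU_UNIT_REOPEN;
	}
	return FU_NONE;
}

/* Successful action: continue with the objects below it. */
static enum erp_followup followup_success(enum erp_action act)
{
	switch (act) {
	case ACT_REOPEN_ADAPTER:
		return FU_PORT_REOPEN_ALL;
	case ACT_REOPEN_PORT_FORCED:
		return FU_PORT_REOPEN;
	case ACT_REOPEN_PORT:
		return FU_UNIT_REOPEN_ALL;
	case ACT_REOPEN_UNIT:
		break;	/* nothing below a unit */
	}
	return FU_NONE;
}
```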

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
---

 drivers/s390/scsi/zfcp_erp.c |   50 ++++++++++++++++++++-----------------------
 1 file changed, 24 insertions(+), 26 deletions(-)

diff -urpN linux-2.6/drivers/s390/scsi/zfcp_erp.c linux-2.6-patched/drivers/s390/scsi/zfcp_erp.c
--- linux-2.6/drivers/s390/scsi/zfcp_erp.c	2009-07-12 21:09:01.000000000 +0200
+++ linux-2.6-patched/drivers/s390/scsi/zfcp_erp.c	2009-07-12 21:09:01.000000000 +0200
@@ -553,40 +553,35 @@ static void _zfcp_erp_unit_reopen_all(st
 		_zfcp_erp_unit_reopen(unit, clear, id, ref);
 }
 
-static void zfcp_erp_strategy_followup_actions(struct zfcp_erp_action *act)
+static void zfcp_erp_strategy_followup_failed(struct zfcp_erp_action *act)
 {
-	struct zfcp_adapter *adapter = act->adapter;
-	struct zfcp_port *port = act->port;
-	struct zfcp_unit *unit = act->unit;
-	u32 status = act->status;
-
-	/* initiate follow-up actions depending on success of finished action */
 	switch (act->action) {
-
 	case ZFCP_ERP_ACTION_REOPEN_ADAPTER:
-		if (status == ZFCP_ERP_SUCCEEDED)
-			_zfcp_erp_port_reopen_all(adapter, 0, "ersfa_1", NULL);
-		else
-			_zfcp_erp_adapter_reopen(adapter, 0, "ersfa_2", NULL);
+		_zfcp_erp_adapter_reopen(act->adapter, 0, "ersff_1", NULL);
 		break;
-
 	case ZFCP_ERP_ACTION_REOPEN_PORT_FORCED:
-		if (status == ZFCP_ERP_SUCCEEDED)
-			_zfcp_erp_port_reopen(port, 0, "ersfa_3", NULL);
-		else
-			_zfcp_erp_adapter_reopen(adapter, 0, "ersfa_4", NULL);
+		_zfcp_erp_port_forced_reopen(act->port, 0, "ersff_2", NULL);
 		break;
-
 	case ZFCP_ERP_ACTION_REOPEN_PORT:
-		if (status == ZFCP_ERP_SUCCEEDED)
-			_zfcp_erp_unit_reopen_all(port, 0, "ersfa_5", NULL);
-		else
-			_zfcp_erp_port_forced_reopen(port, 0, "ersfa_6", NULL);
+		_zfcp_erp_port_reopen(act->port, 0, "ersff_3", NULL);
 		break;
-
 	case ZFCP_ERP_ACTION_REOPEN_UNIT:
-		if (status != ZFCP_ERP_SUCCEEDED)
-			_zfcp_erp_port_reopen(unit->port, 0, "ersfa_7", NULL);
+		_zfcp_erp_unit_reopen(act->unit, 0, "ersff_4", NULL);
+		break;
+	}
+}
+
+static void zfcp_erp_strategy_followup_success(struct zfcp_erp_action *act)
+{
+	switch (act->action) {
+	case ZFCP_ERP_ACTION_REOPEN_ADAPTER:
+		_zfcp_erp_port_reopen_all(act->adapter, 0, "ersfs_1", NULL);
+		break;
+	case ZFCP_ERP_ACTION_REOPEN_PORT_FORCED:
+		_zfcp_erp_port_reopen(act->port, 0, "ersfs_2", NULL);
+		break;
+	case ZFCP_ERP_ACTION_REOPEN_PORT:
+		_zfcp_erp_unit_reopen_all(act->port, 0, "ersfs_3", NULL);
 		break;
 	}
 }
@@ -1289,7 +1284,10 @@ static int zfcp_erp_strategy(struct zfcp
 	retval = zfcp_erp_strategy_statechange(erp_action, retval);
 	if (retval == ZFCP_ERP_EXIT)
 		goto unlock;
-	zfcp_erp_strategy_followup_actions(erp_action);
+	if (retval == ZFCP_ERP_SUCCEEDED)
+		zfcp_erp_strategy_followup_success(erp_action);
+	if (retval == ZFCP_ERP_FAILED)
+		zfcp_erp_strategy_followup_failed(erp_action);
 
  unlock:
 	write_unlock(&adapter->erp_lock);

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [patch 09/13] zfcp: Recover from stalled outbound queue
  2009-07-13 13:06 [patch 00/13] zfcp fixes for 2.6.31-rc2 Christof Schmitt
                   ` (7 preceding siblings ...)
  2009-07-13 13:06 ` [patch 08/13] zfcp: Fix erp escalation procedure Christof Schmitt
@ 2009-07-13 13:06 ` Christof Schmitt
  2009-07-13 13:06 ` [patch 10/13] zfcp: Add port only once to FC transport class Christof Schmitt
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Christof Schmitt @ 2009-07-13 13:06 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-scsi, linux-s390, schwidefsky, heiko.carstens,
	Christof Schmitt

[-- Attachment #1: 715-zfcp-outbound-queue.diff --]
[-- Type: text/plain, Size: 1435 bytes --]

From: Christof Schmitt <christof.schmitt@de.ibm.com>

Depending on interruptions on some storage systems, the complete
channel can stall, which looks like an outbound queue stall to Linux.
When trying to acquire a free SBAL for a non-SCSI command, zfcp waits
up to 5 seconds for a free slot to appear. This is the right place to
detect a queue stall: if the wait times out, we assume a stalled queue
and try to recover it.

The overall strategy should be to trigger the erp from specific
events, and not try an overall escalation from one failed port to a
full-blown queue recovery. If we manage to send a command, the status
codes for this command or a timeout will trigger the right follow-on
actions.
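
Not part of the patch: a userspace sketch of how the result of the
timed wait is interpreted. The struct and function names are
hypothetical, and the return convention (positive: a slot appeared,
zero: timeout, negative: interrupted) is an assumption modelled on
the ret checks in the diff below.

```c
#include <errno.h>

/* Hypothetical counters standing in for the adapter state. */
struct adapter_model {
	int outb_full;		/* how often the queue was seen full */
	int reopen_requests;	/* how often adapter recovery was triggered */
};

static int sbal_wait_result(struct adapter_model *a, long wait_ret)
{
	if (wait_ret > 0)
		return 0;		/* a free SBAL showed up in time */
	if (wait_ret == 0) {
		/* timeout: assume a hanging queue and request recovery */
		a->outb_full++;
		a->reopen_requests++;
	}
	return -EIO;			/* timeout or interrupted wait */
}
```

Only the timeout case triggers recovery; an interrupted wait fails
the request without touching the adapter.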

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
---

 drivers/s390/scsi/zfcp_fsf.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/drivers/s390/scsi/zfcp_fsf.c	2009-07-13 13:18:08.000000000 +0200
+++ b/drivers/s390/scsi/zfcp_fsf.c	2009-07-13 13:18:10.000000000 +0200
@@ -670,8 +670,11 @@ static int zfcp_fsf_req_sbal_get(struct 
 			       zfcp_fsf_sbal_check(adapter), 5 * HZ);
 	if (ret > 0)
 		return 0;
-	if (!ret)
+	if (!ret) {
 		atomic_inc(&adapter->qdio_outb_full);
+		/* assume hanging outbound queue, try queue recovery */
+		zfcp_erp_adapter_reopen(adapter, 0, "fsrsg_1", NULL);
+	}
 
 	spin_lock_bh(&adapter->req_q_lock);
 	return -EIO;

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [patch 10/13] zfcp: Add port only once to FC transport class
  2009-07-13 13:06 [patch 00/13] zfcp fixes for 2.6.31-rc2 Christof Schmitt
                   ` (8 preceding siblings ...)
  2009-07-13 13:06 ` [patch 09/13] zfcp: Recover from stalled outbound queue Christof Schmitt
@ 2009-07-13 13:06 ` Christof Schmitt
  2009-07-13 13:06 ` [patch 11/13] zfcp: avoid double notify in lowmem scenario Christof Schmitt
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Christof Schmitt @ 2009-07-13 13:06 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-scsi, linux-s390, schwidefsky, heiko.carstens,
	Christof Schmitt

[-- Attachment #1: 716-zfcp-add-port-once.diff --]
[-- Type: text/plain, Size: 1119 bytes --]

From: Christof Schmitt <christof.schmitt@de.ibm.com>

After calling fc_remote_port_add, make sure not to call it again
before fc_remote_port_delete has been called. In other words, always
create a new fc_rport, then delete it, and only then create a new one.
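
Not part of the patch: a userspace sketch of the add/delete pairing.
All names are hypothetical; the counters only stand in for the calls
to fc_remote_port_add and fc_remote_port_delete.

```c
#include <stddef.h>

/* Hypothetical remote port object. */
struct rport_model {
	int unused;
};

/* Hypothetical port that remembers whether an rport is registered. */
struct port_model {
	struct rport_model *rport;	/* NULL while not registered */
	struct rport_model storage;
	int add_calls;
	int delete_calls;
};

static void port_register(struct port_model *p)
{
	if (p->rport)
		return;			/* already added, do not add twice */
	p->add_calls++;
	p->rport = &p->storage;
}

static void port_block(struct port_model *p)
{
	if (p->rport) {
		p->delete_calls++;
		p->rport = NULL;	/* allow a later re-registration */
	}
}
```

Clearing the pointer on delete is what lets the guard in the register
path tell "still registered" apart from "needs a new rport".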

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
---

 drivers/s390/scsi/zfcp_scsi.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/drivers/s390/scsi/zfcp_scsi.c	2009-07-13 13:18:07.000000000 +0200
+++ b/drivers/s390/scsi/zfcp_scsi.c	2009-07-13 13:18:14.000000000 +0200
@@ -534,6 +534,9 @@ static void zfcp_scsi_rport_register(str
 	struct fc_rport_identifiers ids;
 	struct fc_rport *rport;
 
+	if (port->rport)
+		return;
+
 	ids.node_name = port->wwnn;
 	ids.port_name = port->wwpn;
 	ids.port_id = port->d_id;
@@ -557,8 +560,10 @@ static void zfcp_scsi_rport_block(struct
 {
 	struct fc_rport *rport = port->rport;
 
-	if (rport)
+	if (rport) {
 		fc_remote_port_delete(rport);
+		port->rport = NULL;
+	}
 }
 
 void zfcp_scsi_schedule_rport_register(struct zfcp_port *port)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [patch 11/13] zfcp: avoid double notify in lowmem scenario
  2009-07-13 13:06 [patch 00/13] zfcp fixes for 2.6.31-rc2 Christof Schmitt
                   ` (9 preceding siblings ...)
  2009-07-13 13:06 ` [patch 10/13] zfcp: Add port only once to FC transport class Christof Schmitt
@ 2009-07-13 13:06 ` Christof Schmitt
  2009-07-13 13:06 ` [patch 12/13] zfcp: Fix wka port processing Christof Schmitt
  2009-07-13 13:06 ` [patch 13/13] zfcp: Fix tracing of request id for abort requests Christof Schmitt
  12 siblings, 0 replies; 14+ messages in thread
From: Christof Schmitt @ 2009-07-13 13:06 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-scsi, linux-s390, schwidefsky, heiko.carstens,
	Swen Schillig, Christof Schmitt

[-- Attachment #1: 718-zfcp-double-notify.diff --]
[-- Type: text/plain, Size: 1230 bytes --]

From: Swen Schillig <swen@vnet.ibm.com>

In a LOWMEM condition an ERP notification was sent twice, causing
unpredictable behaviour of the ERP.
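
Not part of the patch: a sketch that counts how many notifications
the old and the new control flow send for a given return value of the
nameserver lookup. The function names are made up for illustration;
the branches follow the diff below.

```c
#include <errno.h>

/* Old flow: the -ENOMEM branch is followed by the generic error branch. */
static int notify_count_old(int retval)
{
	int count = 0;

	if (retval == -ENOMEM)
		count++;	/* notify with the lowmem flag */
	if (retval)
		count++;	/* notify again for any error */
	return count;
}

/* New flow: every outcome leaves through exactly one branch. */
static int notify_count_new(int retval)
{
	if (!retval)
		return 0;	/* request sent, no notification needed */
	if (retval == -ENOMEM)
		return 1;	/* notify with the lowmem flag only */
	return 1;		/* all other error conditions */
}
```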

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
---

 drivers/s390/scsi/zfcp_erp.c |   14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff -urpN linux-2.6/drivers/s390/scsi/zfcp_erp.c linux-2.6-patched/drivers/s390/scsi/zfcp_erp.c
--- linux-2.6/drivers/s390/scsi/zfcp_erp.c	2009-07-12 21:09:01.000000000 +0200
+++ linux-2.6-patched/drivers/s390/scsi/zfcp_erp.c	2009-07-12 21:09:03.000000000 +0200
@@ -848,11 +848,17 @@ void zfcp_erp_port_strategy_open_lookup(
 					      gid_pn_work);
 
 	retval = zfcp_fc_ns_gid_pn(&port->erp_action);
-	if (retval == -ENOMEM)
+	if (!retval) {
+		port->erp_action.step = ZFCP_ERP_STEP_NAMESERVER_LOOKUP;
+		goto out;
+	}
+	if (retval == -ENOMEM) {
 		zfcp_erp_notify(&port->erp_action, ZFCP_STATUS_ERP_LOWMEM);
-	port->erp_action.step = ZFCP_ERP_STEP_NAMESERVER_LOOKUP;
-	if (retval)
-		zfcp_erp_notify(&port->erp_action, 0);
+		goto out;
+	}
+	/* all other error conditions */
+	zfcp_erp_notify(&port->erp_action, 0);
+out:
 	zfcp_port_put(port);
 }
 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [patch 12/13] zfcp: Fix wka port processing
  2009-07-13 13:06 [patch 00/13] zfcp fixes for 2.6.31-rc2 Christof Schmitt
                   ` (10 preceding siblings ...)
  2009-07-13 13:06 ` [patch 11/13] zfcp: avoid double notify in lowmem scenario Christof Schmitt
@ 2009-07-13 13:06 ` Christof Schmitt
  2009-07-13 13:06 ` [patch 13/13] zfcp: Fix tracing of request id for abort requests Christof Schmitt
  12 siblings, 0 replies; 14+ messages in thread
From: Christof Schmitt @ 2009-07-13 13:06 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-scsi, linux-s390, schwidefsky, heiko.carstens,
	Swen Schillig, Christof Schmitt

[-- Attachment #1: 719-zfcp-racy-wka.diff --]
[-- Type: text/plain, Size: 1905 bytes --]

From: Swen Schillig <swen@vnet.ibm.com>

Under certain conditions it is possible that a WKA port is not opened
within the expected timeframe of half a second. In this situation
the WKA port remains in the state OPENING, preventing any succeeding
request to open the port. This led to unrecoverable remote ports.
Fix this by always setting an appropriate WKA port status before
leaving the function and by removing the timeout, which is not needed
here because the general timeout processing deals with it if
required.
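
Not part of the patch: a userspace sketch of the status handling in
the open handler. The enum and struct names are hypothetical and only
the three status codes visible in the diff below are modelled.

```c
/* Hypothetical subset of the FSF status codes. */
enum fsf_status_model {
	ST_GOOD,
	ST_PORT_ALREADY_OPEN,
	ST_ACCESS_DENIED,
};

enum wka_status_model { WKA_OPENING, WKA_ONLINE, WKA_OFFLINE };

struct wka_port_model {
	enum wka_status_model status;
	unsigned int handle;
};

/* Every modelled status leaves the port in a final state. */
static void wka_open_handler(struct wka_port_model *w,
			     enum fsf_status_model st,
			     unsigned int port_handle)
{
	switch (st) {
	case ST_ACCESS_DENIED:
		w->status = WKA_OFFLINE;
		break;
	case ST_GOOD:
		w->handle = port_handle;
		/* fall through */
	case ST_PORT_ALREADY_OPEN:
		w->status = WKA_ONLINE;
		break;
	}
}
```

Because no branch leaves the status at OPENING, a waiter that has no
timeout of its own is still woken up with a definite result.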

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
---

 drivers/s390/scsi/zfcp_fc.c  |    8 +++-----
 drivers/s390/scsi/zfcp_fsf.c |    4 ++--
 2 files changed, 5 insertions(+), 7 deletions(-)

--- a/drivers/s390/scsi/zfcp_fc.c	2009-07-13 13:17:43.000000000 +0200
+++ b/drivers/s390/scsi/zfcp_fc.c	2009-07-13 13:18:20.000000000 +0200
@@ -79,11 +79,9 @@ static int zfcp_wka_port_get(struct zfcp
 
 	mutex_unlock(&wka_port->mutex);
 
-	wait_event_timeout(
-		wka_port->completion_wq,
-		wka_port->status == ZFCP_WKA_PORT_ONLINE ||
-		wka_port->status == ZFCP_WKA_PORT_OFFLINE,
-		HZ >> 1);
+	wait_event(wka_port->completion_wq,
+		   wka_port->status == ZFCP_WKA_PORT_ONLINE ||
+		   wka_port->status == ZFCP_WKA_PORT_OFFLINE);
 
 	if (wka_port->status == ZFCP_WKA_PORT_ONLINE) {
 		atomic_inc(&wka_port->refcount);
--- a/drivers/s390/scsi/zfcp_fsf.c	2009-07-13 13:18:10.000000000 +0200
+++ b/drivers/s390/scsi/zfcp_fsf.c	2009-07-13 13:18:20.000000000 +0200
@@ -1627,10 +1627,10 @@ static void zfcp_fsf_open_wka_port_handl
 	case FSF_ACCESS_DENIED:
 		wka_port->status = ZFCP_WKA_PORT_OFFLINE;
 		break;
-	case FSF_PORT_ALREADY_OPEN:
-		break;
 	case FSF_GOOD:
 		wka_port->handle = header->port_handle;
+		/* fall through */
+	case FSF_PORT_ALREADY_OPEN:
 		wka_port->status = ZFCP_WKA_PORT_ONLINE;
 	}
 out:

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [patch 13/13] zfcp: Fix tracing of request id for abort requests
  2009-07-13 13:06 [patch 00/13] zfcp fixes for 2.6.31-rc2 Christof Schmitt
                   ` (11 preceding siblings ...)
  2009-07-13 13:06 ` [patch 12/13] zfcp: Fix wka port processing Christof Schmitt
@ 2009-07-13 13:06 ` Christof Schmitt
  12 siblings, 0 replies; 14+ messages in thread
From: Christof Schmitt @ 2009-07-13 13:06 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-scsi, linux-s390, schwidefsky, heiko.carstens,
	Christof Schmitt

[-- Attachment #1: 720-zfcp-trace-records.diff --]
[-- Type: text/plain, Size: 2732 bytes --]

From: Christof Schmitt <christof.schmitt@de.ibm.com>

The trace record for SCSI abort requests has a field for the request
id of the request to be aborted. Put the real request id instead of
zero.
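
Not part of the patch: a userspace sketch of choosing the trace tag
first and then writing one record that carries the real request id.
The flag values, struct and function names are assumptions for
illustration; the tags are the ones used in the diff below.

```c
/* Hypothetical status bits of the finished abort request. */
#define ABORT_SUCCEEDED		0x1u
#define ABORT_NOT_NEEDED	0x2u

/* Hypothetical trace record for an abort. */
struct abort_record {
	const char *tag;
	unsigned long old_reqid;	/* id of the request to be aborted */
	int failed;
};

static struct abort_record make_abort_record(unsigned int status,
					     unsigned long old_reqid)
{
	struct abort_record rec = { "fail", old_reqid, 1 };

	if (status & ABORT_SUCCEEDED) {
		rec.tag = "okay";
		rec.failed = 0;
	} else if (status & ABORT_NOT_NEEDED) {
		rec.tag = "lte2";
		rec.failed = 0;
	}
	/* one record for all outcomes, always with the real id */
	return rec;
}
```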

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
---

 drivers/s390/scsi/zfcp_scsi.c |   18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

--- a/drivers/s390/scsi/zfcp_scsi.c	2009-07-13 10:27:03.000000000 +0200
+++ b/drivers/s390/scsi/zfcp_scsi.c	2009-07-13 10:36:07.000000000 +0200
@@ -167,20 +167,21 @@ static int zfcp_scsi_eh_abort_handler(st
 	struct zfcp_unit *unit = scpnt->device->hostdata;
 	struct zfcp_fsf_req *old_req, *abrt_req;
 	unsigned long flags;
-	unsigned long old_req_id = (unsigned long) scpnt->host_scribble;
+	unsigned long old_reqid = (unsigned long) scpnt->host_scribble;
 	int retval = SUCCESS;
 	int retry = 3;
+	char *dbf_tag;
 
 	/* avoid race condition between late normal completion and abort */
 	write_lock_irqsave(&adapter->abort_lock, flags);
 
 	spin_lock(&adapter->req_list_lock);
-	old_req = zfcp_reqlist_find(adapter, old_req_id);
+	old_req = zfcp_reqlist_find(adapter, old_reqid);
 	spin_unlock(&adapter->req_list_lock);
 	if (!old_req) {
 		write_unlock_irqrestore(&adapter->abort_lock, flags);
 		zfcp_scsi_dbf_event_abort("lte1", adapter, scpnt, NULL,
-					  old_req_id);
+					  old_reqid);
 		return FAILED; /* completion could be in progress */
 	}
 	old_req->data = NULL;
@@ -189,7 +190,7 @@ static int zfcp_scsi_eh_abort_handler(st
 	write_unlock_irqrestore(&adapter->abort_lock, flags);
 
 	while (retry--) {
-		abrt_req = zfcp_fsf_abort_fcp_command(old_req_id, unit);
+		abrt_req = zfcp_fsf_abort_fcp_command(old_reqid, unit);
 		if (abrt_req)
 			break;
 
@@ -197,7 +198,7 @@ static int zfcp_scsi_eh_abort_handler(st
 		if (!(atomic_read(&adapter->status) &
 		      ZFCP_STATUS_COMMON_RUNNING)) {
 			zfcp_scsi_dbf_event_abort("nres", adapter, scpnt, NULL,
-						  old_req_id);
+						  old_reqid);
 			return SUCCESS;
 		}
 	}
@@ -208,13 +209,14 @@ static int zfcp_scsi_eh_abort_handler(st
 		   abrt_req->status & ZFCP_STATUS_FSFREQ_COMPLETED);
 
 	if (abrt_req->status & ZFCP_STATUS_FSFREQ_ABORTSUCCEEDED)
-		zfcp_scsi_dbf_event_abort("okay", adapter, scpnt, abrt_req, 0);
+		dbf_tag = "okay";
 	else if (abrt_req->status & ZFCP_STATUS_FSFREQ_ABORTNOTNEEDED)
-		zfcp_scsi_dbf_event_abort("lte2", adapter, scpnt, abrt_req, 0);
+		dbf_tag = "lte2";
 	else {
-		zfcp_scsi_dbf_event_abort("fail", adapter, scpnt, abrt_req, 0);
+		dbf_tag = "fail";
 		retval = FAILED;
 	}
+	zfcp_scsi_dbf_event_abort(dbf_tag, adapter, scpnt, abrt_req, old_reqid);
 	zfcp_fsf_req_free(abrt_req);
 	return retval;
 }

^ permalink raw reply	[flat|nested] 14+ messages in thread
