linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* RFC: add transport host byte values and common failure behavior
@ 2007-03-14 19:52 michaelc
  2007-03-14 19:52 ` [PATCH 1/4] iscsi class, qla4xxx and libiscsi: export iscsi session state in sysfs michaelc
  0 siblings, 1 reply; 15+ messages in thread
From: michaelc @ 2007-03-14 19:52 UTC (permalink / raw)
  To: linux-scsi

iSCSI and FC classes and drivers have several similarities in how they
handle transport problems. But between them and among the drivers
there are also several differences, all of which make configuring
them a little more fun they they need to be. The following patches
attempt to begin to implement some common behavior. The iscsi parts
are done in these patches since there are only three drivers and we all
did sometihng similar already. FC is more fun and I did not modify all
the drivers but I did modify the transport class
([PATCH 2/4] scsi: add transport host byte errors has more info).

The first patch can be ignored by most people. It is just some iscsi
reworking to bring hw and sw iscsi closer together.

The second patch adds the new transport errors

The last two patches covert the iscsi class and drivers and fcp class.



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/4] iscsi class, qla4xxx and libiscsi: export iscsi session state in sysfs
  2007-03-14 19:52 RFC: add transport host byte values and common failure behavior michaelc
@ 2007-03-14 19:52 ` michaelc
  2007-03-14 19:52   ` [PATCH 2/4] scsi: add transport host byte errors michaelc
  2007-03-15 12:58   ` [PATCH 1/4] iscsi class, qla4xxx and libiscsi: export iscsi session state in sysfs James Smart
  0 siblings, 2 replies; 15+ messages in thread
From: michaelc @ 2007-03-14 19:52 UTC (permalink / raw)
  To: linux-scsi; +Cc: Mike Christie

From: Mike Christie <michaelc@cs.wisc.edu>

This patch adds some common iscsi state info to the iscsi class,
so we can view it in sysfs. It also adds a functions to check
the state before we queue IO (like the fc class). libiscsi does
this already internally, so this is moving some libiscsi functionality
to the class so hw iscsi can use it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
---
 drivers/scsi/libiscsi.c             |   40 +++++++++----
 drivers/scsi/qla4xxx/ql4_os.c       |   13 ++++
 drivers/scsi/scsi_transport_iscsi.c |  111 ++++++++++++++++++++++++++++++++++-
 include/scsi/libiscsi.h             |   18 ++++++
 include/scsi/scsi_transport_iscsi.h |   27 ++++-----
 5 files changed, 180 insertions(+), 29 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 3f5b9b4..f4e1a1b 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -740,6 +740,7 @@ enum {
 	FAILURE_SESSION_TERMINATE,
 	FAILURE_SESSION_IN_RECOVERY,
 	FAILURE_SESSION_RECOVERY_TIMEOUT,
+	FAILURE_SESSION_NOT_READY,
 };
 
 int iscsi_queuecommand(struct scsi_cmnd *sc, void (*done)(struct scsi_cmnd *))
@@ -757,8 +758,14 @@ int iscsi_queuecommand(struct scsi_cmnd 
 	host = sc->device->host;
 	session = iscsi_hostdata(host->hostdata);
 
-	spin_lock(&session->lock);
+	reason = iscsi_session_chkready(session_to_cls(session));
+	if (reason) {
+		sc->result = reason;
+		reason = FAILURE_SESSION_NOT_READY;
+		goto fault;
+	}
 
+	spin_lock(&session->lock);
 	/*
 	 * ISCSI_STATE_FAILED is a temp. state. The recovery
 	 * code will decide what is best to do with command queued
@@ -777,12 +784,16 @@ int iscsi_queuecommand(struct scsi_cmnd 
 			goto reject;
 		}
 
-		if (session->state == ISCSI_STATE_RECOVERY_FAILED)
+		if (session->state == ISCSI_STATE_RECOVERY_FAILED) {
 			reason = FAILURE_SESSION_RECOVERY_TIMEOUT;
-		else if (session->state == ISCSI_STATE_TERMINATE)
+			sc->result = DID_IMM_RETRY << 16;
+		} else if (session->state == ISCSI_STATE_TERMINATE) {
 			reason = FAILURE_SESSION_TERMINATE;
-		else
+			sc->result = DID_NO_CONNECT << 16;
+		} else {
 			reason = FAILURE_SESSION_FREED;
+			sc->result = DID_NO_CONNECT << 16;
+		}
 		goto fault;
 	}
 
@@ -797,6 +808,7 @@ int iscsi_queuecommand(struct scsi_cmnd 
 	conn = session->leadconn;
 	if (!conn) {
 		reason = FAILURE_SESSION_FREED;
+		sc->result = DID_NO_CONNECT << 16;
 		goto fault;
 	}
 
@@ -840,7 +852,6 @@ fault:
 	spin_unlock(&session->lock);
 	printk(KERN_ERR "iscsi: cmd 0x%x is not queued (%d)\n",
 	       sc->cmnd[0], reason);
-	sc->result = (DID_NO_CONNECT << 16);
 	sc->resid = sc->request_bufflen;
 	sc->scsi_done(sc);
 	return 0;
@@ -1425,6 +1436,7 @@ iscsi_session_setup(struct iscsi_transpo
 	cls_session = iscsi_create_session(shost, iscsit, 0);
 	if (!cls_session)
 		goto module_put;
+
 	*(unsigned long*)shost->hostdata = (unsigned long)cls_session;
 
 	return cls_session;
@@ -1638,6 +1650,7 @@ int iscsi_conn_start(struct iscsi_cls_co
 	spin_lock_bh(&session->lock);
 	conn->c_stage = ISCSI_CONN_STARTED;
 	session->state = ISCSI_STATE_LOGGED_IN;
+	iscsi_set_session_logged_in(session_to_cls(session));
 
 	switch(conn->stop_stage) {
 	case STOP_CONN_RECOVER:
@@ -1695,22 +1708,22 @@ flush_control_queues(struct iscsi_sessio
 }
 
 /* Fail commands. Mutex and session lock held and recv side suspended */
-static void fail_all_commands(struct iscsi_conn *conn)
+static void fail_all_commands(struct iscsi_conn *conn, int error)
 {
 	struct iscsi_cmd_task *ctask, *tmp;
 
 	/* flush pending */
 	list_for_each_entry_safe(ctask, tmp, &conn->xmitqueue, running) {
-		debug_scsi("failing pending sc %p itt 0x%x\n", ctask->sc,
-			   ctask->itt);
-		fail_command(conn, ctask, DID_BUS_BUSY << 16);
+		debug_scsi("failing pending sc %p itt 0x%x err %d\n", ctask->sc,
+			   ctask->itt, error);
+		fail_command(conn, ctask, error << 16);
 	}
 
 	/* fail all other running */
 	list_for_each_entry_safe(ctask, tmp, &conn->run_list, running) {
-		debug_scsi("failing in progress sc %p itt 0x%x\n",
-			   ctask->sc, ctask->itt);
-		fail_command(conn, ctask, DID_BUS_BUSY << 16);
+		debug_scsi("failing in progress sc %p itt 0x%x err %d\n",
+			   ctask->sc, ctask->itt, error);
+		fail_command(conn, ctask, error << 16);
 	}
 
 	conn->ctask = NULL;
@@ -1768,7 +1781,8 @@ static void iscsi_start_session_recovery
 	 * flush queues.
 	 */
 	spin_lock_bh(&session->lock);
-	fail_all_commands(conn);
+	fail_all_commands(conn,
+			STOP_CONN_RECOVER ? DID_BUS_BUSY : DID_ERROR);
 	flush_control_queues(session, conn);
 	spin_unlock_bh(&session->lock);
 
diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
index 0bfddf8..053533d 100644
--- a/drivers/scsi/qla4xxx/ql4_os.c
+++ b/drivers/scsi/qla4xxx/ql4_os.c
@@ -283,6 +283,7 @@ int qla4xxx_add_sess(struct ddb_entry *d
 	}
 
 	ddb_entry->sess->recovery_tmo = ddb_entry->ha->port_down_retry_count;
+	iscsi_set_session_logged_in(ddb_entry->sess);
 	iscsi_if_create_session_done(ddb_entry->conn);
 	return 0;
 }
@@ -413,9 +414,21 @@ static int qla4xxx_queuecommand(struct s
 {
 	struct scsi_qla_host *ha = to_qla_host(cmd->device->host);
 	struct ddb_entry *ddb_entry = cmd->device->hostdata;
+	struct iscsi_cls_session *sess = ddb_entry->sess;
 	struct srb *srb;
 	int rval;
 
+	if (!sess) {
+		cmd->result = DID_IMM_RETRY << 16;
+		goto qc_fail_command;
+	}
+
+	rval = iscsi_session_chkready(sess);
+	if (rval) {
+		cmd->result = rval;
+		goto qc_fail_command;
+	}
+
 	if (atomic_read(&ddb_entry->state) != DDB_STATE_ONLINE) {
 		if (atomic_read(&ddb_entry->state) == DDB_STATE_DEAD) {
 			cmd->result = DID_NO_CONNECT << 16;
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index ff05c84..373167b 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -30,7 +30,7 @@ #include <scsi/scsi_transport.h>
 #include <scsi/scsi_transport_iscsi.h>
 #include <scsi/iscsi_if.h>
 
-#define ISCSI_SESSION_ATTRS 11
+#define ISCSI_SESSION_ATTRS 12
 #define ISCSI_CONN_ATTRS 11
 #define ISCSI_HOST_ATTRS 0
 #define ISCSI_TRANSPORT_VERSION "2.0-724"
@@ -201,6 +201,51 @@ static struct iscsi_cls_conn *iscsi_conn
  * The following functions can be used by LLDs that allocate
  * their own scsi_hosts or by software iscsi LLDs
  */
+static struct {
+	int value;
+	char *name;
+} iscsi_session_state_names[] = {
+	{ ISCSI_SESSION_LOGGED_IN,	"Logged In" },
+	{ ISCSI_SESSION_FAILED,		"Failed" },
+	{ ISCSI_SESSION_FREE,		"Free" },
+};
+
+const char *iscsi_session_state_name(int state)
+{
+	int i;
+	char *name = NULL;
+
+	for (i = 0; i < ARRAY_SIZE(iscsi_session_state_names); i++) {
+		if (iscsi_session_state_names[i].value == state) {
+			name = iscsi_session_state_names[i].name;
+			break;
+		}
+	}
+	return name;
+}
+
+int iscsi_session_chkready(struct iscsi_cls_session *session)
+{
+	int err;
+
+	switch (session->state) {
+	case ISCSI_SESSION_LOGGED_IN:
+		err = 0;
+		break;
+	case ISCSI_SESSION_FAILED:
+		err = DID_IMM_RETRY << 16;
+		break;
+	case ISCSI_SESSION_FREE:
+		err = DID_NO_CONNECT  << 16;
+		break;
+	default:
+		err = DID_NO_CONNECT << 16;
+		break;
+	}
+	return err;
+}
+EXPORT_SYMBOL_GPL(iscsi_session_chkready);
+
 static void iscsi_session_release(struct device *dev)
 {
 	struct iscsi_cls_session *session = iscsi_dev_to_session(dev);
@@ -239,18 +284,53 @@ static void session_recovery_timedout(st
 	struct iscsi_cls_session *session =
 		container_of(work, struct iscsi_cls_session,
 			     recovery_work.work);
+	unsigned long flags;
 
 	dev_printk(KERN_INFO, &session->dev, "iscsi: session recovery timed "
 		  "out after %d secs\n", session->recovery_tmo);
 
+	spin_lock_irqsave(&session->lock, flags);
+	if (session->state == ISCSI_SESSION_FAILED)
+		session->state = ISCSI_SESSION_FREE;
+	else if (session->state == ISCSI_SESSION_LOGGED_IN) {
+		spin_unlock_irqrestore(&session->lock, flags);
+		return;
+	}
+	spin_unlock_irqrestore(&session->lock, flags);
+
 	if (session->transport->session_recovery_timedout)
 		session->transport->session_recovery_timedout(session);
-
 	scsi_target_unblock(&session->dev);
 }
 
+/*
+ * iscsi_set_session_logged_in: set set to logged in
+ * @session: iscsi class session
+ *
+ * This should be called when the LLD has logged in and is ready to
+ * start IO.
+ */
+void iscsi_set_session_logged_in(struct iscsi_cls_session *session)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&session->lock, flags);
+	session->state = ISCSI_SESSION_LOGGED_IN;
+	spin_unlock_irqrestore(&session->lock, flags);
+}
+EXPORT_SYMBOL_GPL(iscsi_set_session_logged_in);
+
+/*
+ * iscsi_unblock_session: run io
+ * @session: iscsi class session
+ *
+ * LLDs should call this when ready to execute IO. It needs to be called
+ * when in response to a blocked call.
+ */
 void iscsi_unblock_session(struct iscsi_cls_session *session)
 {
+	iscsi_set_session_logged_in(session);
+
 	if (!cancel_delayed_work(&session->recovery_work))
 		flush_scheduled_work();
 	scsi_target_unblock(&session->dev);
@@ -259,6 +339,12 @@ EXPORT_SYMBOL_GPL(iscsi_unblock_session)
 
 void iscsi_block_session(struct iscsi_cls_session *session)
 {
+	unsigned long flags;
+
+	spin_lock_irqsave(&session->lock, flags);
+	session->state = ISCSI_SESSION_FAILED;
+	spin_unlock_irqrestore(&session->lock, flags);
+
 	scsi_target_block(&session->dev);
 	schedule_delayed_work(&session->recovery_work,
 			     session->recovery_tmo * HZ);
@@ -278,9 +364,11 @@ iscsi_alloc_session(struct Scsi_Host *sh
 
 	session->transport = transport;
 	session->recovery_tmo = 120;
+	session->state = ISCSI_SESSION_FREE;
 	INIT_DELAYED_WORK(&session->recovery_work, session_recovery_timedout);
 	INIT_LIST_HEAD(&session->host_list);
 	INIT_LIST_HEAD(&session->sess_list);
+	spin_lock_init(&session->lock);
 
 	/* this is released in the dev's release function */
 	scsi_host_get(shost);
@@ -330,6 +418,8 @@ EXPORT_SYMBOL_GPL(iscsi_add_session);
  * @transport: iscsi transport
  *
  * This can be called from a LLD or iscsi_transport.
+ * It will leave the session in the free state. When ready
+ * to put it in the logged in state run iscsi_set_session_logged_in.
  **/
 struct iscsi_cls_session *
 iscsi_create_session(struct Scsi_Host *shost,
@@ -566,16 +656,23 @@ EXPORT_SYMBOL_GPL(iscsi_recv_pdu);
 
 void iscsi_conn_error(struct iscsi_cls_conn *conn, enum iscsi_err error)
 {
+	struct iscsi_cls_session *session = iscsi_conn_to_session(conn);
 	struct nlmsghdr	*nlh;
 	struct sk_buff	*skb;
 	struct iscsi_uevent *ev;
 	struct iscsi_internal *priv;
 	int len = NLMSG_SPACE(sizeof(*ev));
+	unsigned long flags;
 
 	priv = iscsi_if_transport_lookup(conn->transport);
 	if (!priv)
 		return;
 
+	spin_lock_irqsave(&session->lock, flags);
+	if (session->state == ISCSI_SESSION_LOGGED_IN)
+		session->state = ISCSI_SESSION_FAILED;
+	spin_unlock_irqrestore(&session->lock, flags);
+
 	skb = alloc_skb(len, GFP_ATOMIC);
 	if (!skb) {
 		dev_printk(KERN_ERR, &conn->dev, "iscsi: gracefully ignored "
@@ -1185,6 +1282,15 @@ iscsi_session_attr(data_seq_in_order, IS
 iscsi_session_attr(erl, ISCSI_PARAM_ERL);
 iscsi_session_attr(tpgt, ISCSI_PARAM_TPGT);
 
+static ssize_t
+show_priv_session_state(struct class_device *cdev, char *buf)
+{
+	struct iscsi_cls_session *session = iscsi_cdev_to_session(cdev);
+	return sprintf(buf, "%s\n", iscsi_session_state_name(session->state));
+}
+static ISCSI_CLASS_ATTR(priv_sess, state, S_IRUGO, show_priv_session_state,
+			NULL);
+
 #define iscsi_priv_session_attr_show(field, format)			\
 static ssize_t								\
 show_priv_session_##field(struct class_device *cdev, char *buf)		\
@@ -1365,6 +1471,7 @@ iscsi_register_transport(struct iscsi_tr
 	SETUP_SESSION_RD_ATTR(targetname, ISCSI_TARGET_NAME);
 	SETUP_SESSION_RD_ATTR(tpgt, ISCSI_TPGT);
 	SETUP_PRIV_SESSION_RD_ATTR(recovery_tmo);
+	SETUP_PRIV_SESSION_RD_ATTR(state);
 
 	BUG_ON(count > ISCSI_SESSION_ATTRS);
 	priv->session_attrs[count] = NULL;
diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h
index ea0816d..1e5c00b 100644
--- a/include/scsi/libiscsi.h
+++ b/include/scsi/libiscsi.h
@@ -119,6 +119,14 @@ struct iscsi_cmd_task {
 	void			*dd_data;	/* driver/transport data */
 };
 
+/* Connection's states */
+enum {
+	ISCSI_CONN_INITIAL_STAGE,
+	ISCSI_CONN_STARTED,
+	ISCSI_CONN_STOPPED,
+	ISCSI_CONN_CLEANUP_WAIT,
+};
+
 struct iscsi_conn {
 	struct iscsi_cls_conn	*cls_conn;	/* ptr to class connection */
 	void			*dd_data;	/* iscsi_transport data */
@@ -205,6 +213,16 @@ struct iscsi_queue {
 	int			max;		/* Max number of elements */
 };
 
+/* Session's states */
+enum {
+	ISCSI_STATE_FREE = 1,
+	ISCSI_STATE_LOGGED_IN,
+	ISCSI_STATE_FAILED,
+	ISCSI_STATE_TERMINATE,
+	ISCSI_STATE_IN_RECOVERY,
+	ISCSI_STATE_RECOVERY_FAILED,
+};
+
 struct iscsi_session {
 	/* iSCSI session-wide sequencing */
 	uint32_t		cmdsn;
diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h
index d5c218d..99067b9 100644
--- a/include/scsi/scsi_transport_iscsi.h
+++ b/include/scsi/scsi_transport_iscsi.h
@@ -141,13 +141,6 @@ extern void iscsi_conn_error(struct iscs
 extern int iscsi_recv_pdu(struct iscsi_cls_conn *conn, struct iscsi_hdr *hdr,
 			  char *data, uint32_t data_size);
 
-
-/* Connection's states */
-#define ISCSI_CONN_INITIAL_STAGE	0
-#define ISCSI_CONN_STARTED		1
-#define ISCSI_CONN_STOPPED		2
-#define ISCSI_CONN_CLEANUP_WAIT		3
-
 struct iscsi_cls_conn {
 	struct list_head conn_list;	/* item in connlist */
 	void *dd_data;			/* LLD private data */
@@ -161,18 +154,21 @@ struct iscsi_cls_conn {
 #define iscsi_dev_to_conn(_dev) \
 	container_of(_dev, struct iscsi_cls_conn, dev)
 
-/* Session's states */
-#define ISCSI_STATE_FREE		1
-#define ISCSI_STATE_LOGGED_IN		2
-#define ISCSI_STATE_FAILED		3
-#define ISCSI_STATE_TERMINATE		4
-#define ISCSI_STATE_IN_RECOVERY		5
-#define ISCSI_STATE_RECOVERY_FAILED	6
+#define iscsi_conn_to_session(_conn) \
+	iscsi_dev_to_session(_conn->dev.parent)
+
+/* iscsi class session state */
+enum {
+	ISCSI_SESSION_LOGGED_IN,
+	ISCSI_SESSION_FAILED,
+	ISCSI_SESSION_FREE,
+};
 
 struct iscsi_cls_session {
 	struct list_head sess_list;		/* item in session_list */
 	struct list_head host_list;
 	struct iscsi_transport *transport;
+	spinlock_t lock;			/* protects state */
 
 	/* recovery fields */
 	int recovery_tmo;
@@ -180,6 +176,7 @@ struct iscsi_cls_session {
 
 	int target_id;
 
+	int state;
 	int sid;				/* session id */
 	void *dd_data;				/* LLD private data */
 	struct device dev;	/* sysfs transport/container device */
@@ -202,6 +199,8 @@ struct iscsi_host {
 /*
  * session and connection functions that can be used by HW iSCSI LLDs
  */
+extern void iscsi_set_session_logged_in(struct iscsi_cls_session *session);
+extern int iscsi_session_chkready(struct iscsi_cls_session *session);
 extern struct iscsi_cls_session *iscsi_alloc_session(struct Scsi_Host *shost,
 					struct iscsi_transport *transport);
 extern int iscsi_add_session(struct iscsi_cls_session *session,
-- 
1.4.1.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/4] scsi: add transport host byte errors
  2007-03-14 19:52 ` [PATCH 1/4] iscsi class, qla4xxx and libiscsi: export iscsi session state in sysfs michaelc
@ 2007-03-14 19:52   ` michaelc
  2007-03-14 19:52     ` [PATCH 3/4] iscsi class, libiscsi and qla4xxx: convert to new transport host byte values michaelc
                       ` (3 more replies)
  2007-03-15 12:58   ` [PATCH 1/4] iscsi class, qla4xxx and libiscsi: export iscsi session state in sysfs James Smart
  1 sibling, 4 replies; 15+ messages in thread
From: michaelc @ 2007-03-14 19:52 UTC (permalink / raw)
  To: linux-scsi; +Cc: Mike Christie

From: Mike Christie <michaelc@cs.wisc.edu>

Small guide to differences in iscsi and fc:

I_T nexus or port we are connected to at some other end point
iscsi: iscsi_session
fc: rport

fast_io_fail_tmo
iscsi: session recovery_tmo
fc: rport fast_io_fail_tmo

The difference is that when the timer fires, for iscsi we unblock the
queue and fail commands in the blocked queue. FC just fails IO running
in the driver/fw/hw. The IO in the blocked queue sits there until dev_loss_tmo.

dev_loss_tmo
iscsi: none yet (we are working on it :))
fc: dev_loss_tmo

Currently, if there is a transport problem the iscsi drivers will return
outstanding commands (commands being exeucted by the driver/fw/hw) with
DID_BUS_BUSY and block the session so no new commands can be queued.
Commands that are caught between the failure handling and blocking are
failed with DID_IMM_RETRY or one of the scsi ml queuecommand return values.
When the recovery_timeout fires, the iscsi drivers then fail IO with
DID_NO_CONNECT.

For fcp, some drivers will fail some outstanding IO (disk but possibly not
tape) with DID_BUS_BUSY or some other value that causes a retry and hits
the scsi_error.c failfast check, block the rport, and commands caught in the
race are failed with DID_IMM_RETRY. Other drivers, will hold onto all IO
and wait for the terminate_rport_io or dev_loss_tmo_callbk to be called.
In this case lpfc, could return the IO with DID_ERROR.

The following patches attempt unify what upper layers will see drivers
like multipath can make a good guess. This relies on drivers being
hooked into their transport class and implementing the terminate_rport_io
callback.

This first patch just defines two new host byte errors so drivers can
return the same value for when a rport/session is blocked and for
when the fast_io_fail_tmo fires.

The idea is that if the LLD/class detects a problem and is going to block
a rport/session, then if the LLD wants or must return the command to scsi-ml,
then it can return it with DID_TRANSPORT_BLOCKED. This will requeue
the IO into the same scsi queue it came from.

When using multipath and the fast_io_fail_tmo fires then the class
can fail commands with DID_TRANSPORT_FAILFAST or drivers can use
DID_TRANSPORT_FAILFAST in their terminate_rport_io callbacks or
the equivlent in iscsi if we ever implement more advanced recovery methods.
A LLD, like lpfc, could continue to return DID_ERROR and then it will hit
the normal failfast path. The point of the patches is that upper layers will
not see a failure that could be recovered from while the rport/session is
blocked until fast_io_fail_tmo/recovery_timeout fires.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
---
 drivers/scsi/constants.c  |    3 ++-
 drivers/scsi/scsi_error.c |   15 ++++++++++++++-
 include/scsi/scsi.h       |    2 ++
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/constants.c b/drivers/scsi/constants.c
index 2a458d6..9f0b284 100644
--- a/drivers/scsi/constants.c
+++ b/drivers/scsi/constants.c
@@ -1355,7 +1355,8 @@ #ifdef CONFIG_SCSI_CONSTANTS
 static const char * const hostbyte_table[]={
 "DID_OK", "DID_NO_CONNECT", "DID_BUS_BUSY", "DID_TIME_OUT", "DID_BAD_TARGET",
 "DID_ABORT", "DID_PARITY", "DID_ERROR", "DID_RESET", "DID_BAD_INTR",
-"DID_PASSTHROUGH", "DID_SOFT_ERROR", "DID_IMM_RETRY"};
+"DID_PASSTHROUGH", "DID_SOFT_ERROR", "DID_IMM_RETRY", "DID_TRANSPORT_BLOCKED",
+"DID_TRANSPORT_FAILFAST" };
 #define NUM_HOSTBYTE_STRS ARRAY_SIZE(hostbyte_table)
 
 static const char * const driverbyte_table[]={
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index b8edcf5..7dbe0f4 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1231,7 +1231,20 @@ int scsi_decide_disposition(struct scsi_
 
 	case DID_REQUEUE:
 		return ADD_TO_MLQUEUE;
-
+	case DID_TRANSPORT_BLOCKED:
+		/*
+		 * LLD/transport was disrupted during processing the IO.
+		 * The transport is now blocked and attempting to recover,
+		 * and the transport will decide what to do with the IO
+		 * based on its timers and recovery capablilities.
+		 */
+		return NEEDS_RETRY;
+	case DID_TRANSPORT_FAILFAST:
+		/*
+		 * The transport decided to failfast the IO (most likely
+		 * the fast io fail tmo fired), so send IO directly upwards.
+		 */
+		return SUCCESS;
 	case DID_ERROR:
 		if (msg_byte(scmd->result) == COMMAND_COMPLETE &&
 		    status_byte(scmd->result) == RESERVATION_CONFLICT)
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index 5c0e979..fa94d65 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -309,6 +309,8 @@ #define DID_SOFT_ERROR  0x0b	/* The low 
 #define DID_IMM_RETRY   0x0c	/* Retry without decrementing retry count  */
 #define DID_REQUEUE	0x0d	/* Requeue command (no immediate retry) also
 				 * without decrementing the retry count	   */
+#define DID_TRANSPORT_BLOCKED	0x0e /* Transport class will handle io */
+#define DID_TRANSPORT_FAILFAST	0x0f /* Transport class fastfailed the io */
 #define DRIVER_OK       0x00	/* Driver status                           */
 
 /*
-- 
1.4.1.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/4] iscsi class, libiscsi and qla4xxx: convert to new transport host byte values
  2007-03-14 19:52   ` [PATCH 2/4] scsi: add transport host byte errors michaelc
@ 2007-03-14 19:52     ` michaelc
  2007-03-14 19:52       ` [PATCH 4/4] fc class: use " michaelc
  2007-03-14 20:11     ` [PATCH 2/4] scsi: add transport host byte errors Mike Christie
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: michaelc @ 2007-03-14 19:52 UTC (permalink / raw)
  To: linux-scsi; +Cc: Mike Christie

From: Mike Christie <michaelc@cs.wisc.edu>

This patch converts the iscsi drivers to the new host byte values
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
---
 drivers/scsi/libiscsi.c             |   21 ++++++++++++---------
 drivers/scsi/qla4xxx/ql4_isr.c      |    2 +-
 drivers/scsi/scsi_transport_iscsi.c |    4 ++--
 3 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index f4e1a1b..455bbfd 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -779,20 +779,23 @@ int iscsi_queuecommand(struct scsi_cmnd 
 		 * be entering our queuecommand while a block is starting
 		 * up because the block code is not locked)
 		 */
-		if (session->state == ISCSI_STATE_IN_RECOVERY) {
+		switch (session->state) {
+		case ISCSI_STATE_IN_RECOVERY:
 			reason = FAILURE_SESSION_IN_RECOVERY;
-			goto reject;
-		}
-
-		if (session->state == ISCSI_STATE_RECOVERY_FAILED) {
+			sc->result = DID_TRANSPORT_BLOCKED << 16;
+			break;
+		case ISCSI_STATE_RECOVERY_FAILED:
 			reason = FAILURE_SESSION_RECOVERY_TIMEOUT;
-			sc->result = DID_IMM_RETRY << 16;
-		} else if (session->state == ISCSI_STATE_TERMINATE) {
+			sc->result = DID_TRANSPORT_FAILFAST << 16;
+			break;
+		case ISCSI_STATE_TERMINATE:
 			reason = FAILURE_SESSION_TERMINATE;
 			sc->result = DID_NO_CONNECT << 16;
-		} else {
+			break;
+		default:
 			reason = FAILURE_SESSION_FREED;
 			sc->result = DID_NO_CONNECT << 16;
+			break;
 		}
 		goto fault;
 	}
@@ -1782,7 +1785,7 @@ static void iscsi_start_session_recovery
 	 */
 	spin_lock_bh(&session->lock);
 	fail_all_commands(conn,
-			STOP_CONN_RECOVER ? DID_BUS_BUSY : DID_ERROR);
+			STOP_CONN_RECOVER ? DID_TRANSPORT_FAILFAST : DID_ERROR);
 	flush_control_queues(session, conn);
 	spin_unlock_bh(&session->lock);
 
diff --git a/drivers/scsi/qla4xxx/ql4_isr.c b/drivers/scsi/qla4xxx/ql4_isr.c
index 35b9e36..31a786a 100644
--- a/drivers/scsi/qla4xxx/ql4_isr.c
+++ b/drivers/scsi/qla4xxx/ql4_isr.c
@@ -267,7 +267,7 @@ static void qla4xxx_status_entry(struct 
 		if (atomic_read(&ddb_entry->state) == DDB_STATE_ONLINE)
 			qla4xxx_mark_device_missing(ha, ddb_entry);
 
-		cmd->result = DID_BUS_BUSY << 16;
+		cmd->result = DID_BLOCKED_TRANSPORT << 16;
 		break;
 
 	case SCS_QUEUE_FULL:
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 373167b..87e6ffe 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -233,10 +233,10 @@ int iscsi_session_chkready(struct iscsi_
 		err = 0;
 		break;
 	case ISCSI_SESSION_FAILED:
-		err = DID_IMM_RETRY << 16;
+		err = DID_TRANSPORT_BLOCKED << 16;
 		break;
 	case ISCSI_SESSION_FREE:
-		err = DID_NO_CONNECT  << 16;
+		err = DID_TRANSPORT_FAILFAST << 16;
 		break;
 	default:
 		err = DID_NO_CONNECT << 16;
-- 
1.4.1.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/4] fc class: use transport host byte values
  2007-03-14 19:52     ` [PATCH 3/4] iscsi class, libiscsi and qla4xxx: convert to new transport host byte values michaelc
@ 2007-03-14 19:52       ` michaelc
  0 siblings, 0 replies; 15+ messages in thread
From: michaelc @ 2007-03-14 19:52 UTC (permalink / raw)
  To: linux-scsi; +Cc: Mike Christie

From: Mike Christie <michaelc@cs.wisc.edu>

This patch is just an example. It is not tested and it is probably
better to not use the fast_io_fail_tmo variable and instead modify
the state machine.

This patch also does not convert any FC drivers to use the new codes.

The patch does modify the port ready function to return the new
host byte values and modifies the rport fast io fail tmo to
do the same thing as iscsi (fail queued and running IO).
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
---
 drivers/scsi/scsi_transport_fc.c |    4 ++++
 include/scsi/scsi_transport_fc.h |    8 ++++++--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 58afdb4..af1ff02 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -1970,6 +1970,7 @@ fc_remote_port_add(struct Scsi_Host *sho
 					sizeof(rport->port_name));
 				rport->port_id = ids->port_id;
 
+				rport->fast_io_fail_timeout = 0;
 				rport->port_state = FC_PORTSTATE_ONLINE;
 				rport->roles = ids->roles;
 
@@ -2059,6 +2060,7 @@ fc_remote_port_add(struct Scsi_Host *sho
 				sizeof(rport->port_name));
 			rport->port_id = ids->port_id;
 			rport->roles = ids->roles;
+			rport->fast_io_fail_timeout = 0;
 			rport->port_state = FC_PORTSTATE_ONLINE;
 
 			if (fci->f->dd_fcrport_size)
@@ -2381,7 +2383,9 @@ fc_timeout_fail_rport_io(struct work_str
 	if (rport->port_state != FC_PORTSTATE_BLOCKED)
 		return;
 
+	rport->fast_io_fail_timeout = 1;
 	i->f->terminate_rport_io(rport);
+	scsi_target_unblock(&rport->dev);
 }
 
 /**
diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
index 798f7c7..b7f7e51 100644
--- a/include/scsi/scsi_transport_fc.h
+++ b/include/scsi/scsi_transport_fc.h
@@ -196,6 +196,7 @@ struct fc_rport {	/* aka fc_starget_attr
 	enum fc_port_state port_state;	/* Will only be ONLINE or UNKNOWN */
 	u32 scsi_target_id;
 	u32 fast_io_fail_tmo;
+	u8 fast_io_fail_timeout;
 
 	/* exported data */
 	void *dd_data;			/* Used for driver-specific storage */
@@ -508,17 +509,20 @@ fc_remote_port_chkready(struct fc_rport 
 {
 	int result;
 
+	if (rport->fastfail_timeout)
+		return DID_TRANSPORT_FAILFAST;
+
 	switch (rport->port_state) {
 	case FC_PORTSTATE_ONLINE:
 		if (rport->roles & FC_RPORT_ROLE_FCP_TARGET)
 			result = 0;
 		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
-			result = DID_IMM_RETRY << 16;
+			result = DID_TRANSPORT_BLOCKED << 16;
 		else
 			result = DID_NO_CONNECT << 16;
 		break;
 	case FC_PORTSTATE_BLOCKED:
-		result = DID_IMM_RETRY << 16;
+		result = DID_TRANSPORT_BLOCKED << 16;
 		break;
 	default:
 		result = DID_NO_CONNECT << 16;
-- 
1.4.1.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/4] scsi: add transport host byte errors
  2007-03-14 19:52   ` [PATCH 2/4] scsi: add transport host byte errors michaelc
  2007-03-14 19:52     ` [PATCH 3/4] iscsi class, libiscsi and qla4xxx: convert to new transport host byte values michaelc
@ 2007-03-14 20:11     ` Mike Christie
  2007-03-14 21:14     ` Mike Christie
  2007-03-15 13:20     ` [PATCH 2/4] scsi: add transport host byte errors James Smart
  3 siblings, 0 replies; 15+ messages in thread
From: Mike Christie @ 2007-03-14 20:11 UTC (permalink / raw)
  To: linux-scsi

michaelc@cs.wisc.edu wrote:
> The following patches attempt unify what upper layers will see drivers
> like multipath can make a good guess. This relies on drivers being

Oops, I mean to say with the patches upper layers like multipath will
see common behavior from LLDs hooked into transport classes. If the
class/LLD can block a rport/session and recover within
fast_io_fail_tmo/replacement_timeout seconds then multipath would not
see the problem. We will not have some drivers returning errors upwards
and some not and some drivers holding onto errors for until the device
is removed.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/4] scsi: add transport host byte errors
  2007-03-14 19:52   ` [PATCH 2/4] scsi: add transport host byte errors michaelc
  2007-03-14 19:52     ` [PATCH 3/4] iscsi class, libiscsi and qla4xxx: convert to new transport host byte values michaelc
  2007-03-14 20:11     ` [PATCH 2/4] scsi: add transport host byte errors Mike Christie
@ 2007-03-14 21:14     ` Mike Christie
  2007-03-15 13:41       ` [PATCH 4/4] fc class: use transport host byte values James Smart
  2007-03-15 13:20     ` [PATCH 2/4] scsi: add transport host byte errors James Smart
  3 siblings, 1 reply; 15+ messages in thread
From: Mike Christie @ 2007-03-14 21:14 UTC (permalink / raw)
  To: linux-scsi

michaelc@cs.wisc.edu wrote:
> When using multipath and the fast_io_fail_tmo fires then the class
> can fail commands with DID_TRANSPORT_FAILFAST or drivers can use

Bah, that is going to be wrong or ugly for tape commands.

We would want to add another new value, DID_TRANSPORT_ABORTED which
would check the retries or device type. Drivers would use this when
failing IO that was possibly running on the device.

> -
> +	case DID_TRANSPORT_BLOCKED:
> +		/*
> +		 * LLD/transport was disrupted during processing the IO.
> +		 * The transport is now blocked and attempting to recover,
> +		 * and the transport will decide what to do with the IO
> +		 * based on its timers and recovery capablilities.
> +		 */
> +		return NEEDS_RETRY;
> +	case DID_TRANSPORT_FAILFAST:
> +		/*
> +		 * The transport decided to failfast the IO (most likely
> +		 * the fast io fail tmo fired), so send IO directly upwards.
> +		 */
> +		return SUCCESS;
>  	case DID_ERROR:
>  		if (msg_byte(scmd->result) == COMMAND_COMPLETE &&
>  		    status_byte(scmd->result) == RESERVATION_CONFLICT)


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/4] iscsi class, qla4xxx and libiscsi: export iscsi session state in sysfs
  2007-03-14 19:52 ` [PATCH 1/4] iscsi class, qla4xxx and libiscsi: export iscsi session state in sysfs michaelc
  2007-03-14 19:52   ` [PATCH 2/4] scsi: add transport host byte errors michaelc
@ 2007-03-15 12:58   ` James Smart
  2007-03-15 14:20     ` Mike Christie
  1 sibling, 1 reply; 15+ messages in thread
From: James Smart @ 2007-03-15 12:58 UTC (permalink / raw)
  To: michaelc; +Cc: linux-scsi





> @@ -1768,7 +1781,8 @@ static void iscsi_start_session_recovery
>  	 * flush queues.
>  	 */
>  	spin_lock_bh(&session->lock);
> -	fail_all_commands(conn);
> +	fail_all_commands(conn,
> +			STOP_CONN_RECOVER ? DID_BUS_BUSY : DID_ERROR);
>  	flush_control_queues(session, conn);
>  	spin_unlock_bh(&session->lock);
>  

I'm assuming you do not support error recovery level 2, right ?

> +int iscsi_session_chkready(struct iscsi_cls_session *session)
...
> +EXPORT_SYMBOL_GPL(iscsi_session_chkready);

Any reason this isn't an inline ?



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/4] scsi: add transport host byte errors
  2007-03-14 19:52   ` [PATCH 2/4] scsi: add transport host byte errors michaelc
                       ` (2 preceding siblings ...)
  2007-03-14 21:14     ` Mike Christie
@ 2007-03-15 13:20     ` James Smart
  2007-03-15 14:52       ` Mike Christie
  3 siblings, 1 reply; 15+ messages in thread
From: James Smart @ 2007-03-15 13:20 UTC (permalink / raw)
  To: michaelc; +Cc: linux-scsi



michaelc@cs.wisc.edu wrote:
> fast_io_fail_tmo
> iscsi: session recovery_tmo
> fc: rport fast_io_fail_tmo
> 
> The difference is that when the timer fires, for iscsi we unblock the
> queue and fail commands in the blocked queue. FC just fails IO running
> in the driver/fw/hw. The IO in the blocked queue sits there until dev_loss_tmo.

True - FC contacted the LLDD to terminate i/o, who has no notion of any io
that has yet to be sent to it via queuecommand(). Blocked i/o sits until
dev_loss_tmo, as that is when the sdev gets torn down. Perhaps a block layer call
should be created, that the FC transport can call, to terminate the blocked
queue.  Thoughts ?

> dev_loss_tmo
> iscsi: none yet (we are working on it :))
> fc: dev_loss_tmo
> 
> Currently, if there is a transport problem the iscsi drivers will return
> outstanding commands (commands being exeucted by the driver/fw/hw) with
> DID_BUS_BUSY and block the session so no new commands can be queued.
> Commands that are caught between the failure handling and blocking are
> failed with DID_IMM_RETRY or one of the scsi ml queuecommand return values.
> When the recovery_timeout fires, the iscsi drivers then fail IO with
> DID_NO_CONNECT.
> 
> For fcp, some drivers will fail some outstanding IO (disk but possibly not
> tape) with DID_BUS_BUSY or some other value that causes a retry and hits
> the scsi_error.c failfast check, block the rport, and commands caught in the
> race are failed with DID_IMM_RETRY. Other drivers, will hold onto all IO
> and wait for the terminate_rport_io or dev_loss_tmo_callbk to be called.
> In this case lpfc, could return the IO with DID_ERROR.

Note: Variability in behavior has to be allowed as both implementations are
within FC specification. Also, the "everything killed" scenario is a valid
worst case behavior that can always occur. The "it's not killed immediately"
scenario is an optimization towards best-case behavior (with better FC-MI-2
compliance).

Lpfc returns DID_ERROR as the io requests had been queued to the adapter,
may have gone out on the wire, and may have changed media. They were terminated
early based on the respective timeout. Thus, a BUSY status, which implies
no media change, is deemed inappropriate.  Based on the conversation, you
are implying that the layer above, which asked for the fastfail may want to
distinguish between an io terminated due to the fastfail timeout vs an io
that failed due to a real error. Easy enough to do - we just need a new
return status.  And, I see, that's what the patch below does.

So far, so good....

-- james s


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/4] fc class: use transport host byte values
  2007-03-14 21:14     ` Mike Christie
@ 2007-03-15 13:41       ` James Smart
  2007-03-15 14:26         ` Mike Christie
  0 siblings, 1 reply; 15+ messages in thread
From: James Smart @ 2007-03-15 13:41 UTC (permalink / raw)
  To: Mike Christie; +Cc: linux-scsi

Background:
   The states, in the transport are:
      event               state
     --------------------------------
       n/a                running
     lose connectivity    blocked
     fastfail timeout     still blocked, but with fastfail indicator
     dev_loss timeout     unblocked/removed

Question:

The chkready helper in the transport catches race conditions where the
block is being put in place, but queuecommand is running on another cpu,
thus returning DID_IMM_RETRY.

Your patch is updating chkready to look for the case when fastfail has
fired, and returning DID_TRANSPORT_FASTFAIL  ?  Why ?  The request
queue would be blocked so there should be no call to queuecommand that
would fall into this category - unless you are assuming that fastfail
timeout is so fast that it could fall into the same race condition as
blocked.

What I think is right is - DID_TRANSPORT_FASTFAIL only is returned by
the drivers on the i/o they kill as part of the terminate_io function
invoked by fastfail (or on the new block queue kill request I mentioned
an email ago).  There should be no reason to add it to chkready unless
you are trying to be absolutely protective for the above race conditions.
(and hey - it makes sense that if you involve the request queue to kill
all io at fast fail, there won't be any io to call queuecommand for, and
even if it did get through, it'll be marked for requeuing to the request
queue who will note the fastfail kill has been called and can immediately
give it back to the upper layers with the proper status).

-- james s



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/4] iscsi class, qla4xxx and libiscsi: export iscsi session state in sysfs
  2007-03-15 12:58   ` [PATCH 1/4] iscsi class, qla4xxx and libiscsi: export iscsi session state in sysfs James Smart
@ 2007-03-15 14:20     ` Mike Christie
  0 siblings, 0 replies; 15+ messages in thread
From: Mike Christie @ 2007-03-15 14:20 UTC (permalink / raw)
  To: James.Smart; +Cc: linux-scsi

James Smart wrote:
> 
> 
> 
> 
>> @@ -1768,7 +1781,8 @@ static void iscsi_start_session_recovery
>>       * flush queues.
>>       */
>>      spin_lock_bh(&session->lock);
>> -    fail_all_commands(conn);
>> +    fail_all_commands(conn,
>> +            STOP_CONN_RECOVER ? DID_BUS_BUSY : DID_ERROR);
>>      flush_control_queues(session, conn);
>>      spin_unlock_bh(&session->lock);
>>  
> 
> I'm assuming you do not support error recovery level 2, right ?
> 

Yeah, we only do erl0 right now.

>> +int iscsi_session_chkready(struct iscsi_cls_session *session)
> ...
>> +EXPORT_SYMBOL_GPL(iscsi_session_chkready);
> 
> Any reason this isn't an inline ?
> 

No, will make it inline.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/4] fc class: use transport host byte values
  2007-03-15 13:41       ` [PATCH 4/4] fc class: use transport host byte values James Smart
@ 2007-03-15 14:26         ` Mike Christie
  2007-03-15 14:45           ` James Smart
  0 siblings, 1 reply; 15+ messages in thread
From: Mike Christie @ 2007-03-15 14:26 UTC (permalink / raw)
  To: James.Smart; +Cc: linux-scsi

James Smart wrote:
> Background:
>   The states, in the transport are:
>      event               state
>     --------------------------------
>       n/a                running
>     lose connectivity    blocked
>     fastfail timeout     still blocked, but with fastfail indicator

My patch changed this one. When this times out, I unblock the queue.

>     dev_loss timeout     unblocked/removed
> 
> Question:
> 
> The chkready helper in the transport catches race conditions where the
> block is being put in place, but queuecommand is running on another cpu,
> thus returning DID_IMM_RETRY.
> 
> Your patch is updating chkready to look for the case when fastfail has
> fired, and returning DID_TRANSPORT_FASTFAIL  ?  Why ?  The request
> queue would be blocked so there should be no call to queuecommand that
> would fall into this category - unless you are assuming that fastfail
> timeout is so fast that it could fall into the same race condition as
> blocked.

In the patch, when the fast IO fail timer fires, I unblock the queues.

+       rport->fast_io_fail_timeout = 1;
        i->f->terminate_rport_io(rport);
+       scsi_target_unblock(&rport->dev);

At that time IO would be queued on drivers and would hit that test and
be failed. New IO would also be failed upwards by that check.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/4] fc class: use transport host byte values
  2007-03-15 14:26         ` Mike Christie
@ 2007-03-15 14:45           ` James Smart
  0 siblings, 0 replies; 15+ messages in thread
From: James Smart @ 2007-03-15 14:45 UTC (permalink / raw)
  To: Mike Christie; +Cc: linux-scsi

ah....  OK.

At first glance, I didn't like simple addition of another flag. Although simple,
it grows data structures quickly. It seems more representative of a state flag,
or a iftest based on the state checking (e.g. we know we're blocked and whether
we have failfast pending). But this is largely stylistic. Preferences ?  I can
post an alternative.

-- james

Mike Christie wrote:
> James Smart wrote:
>> Background:
>>   The states, in the transport are:
>>      event               state
>>     --------------------------------
>>       n/a                running
>>     lose connectivity    blocked
>>     fastfail timeout     still blocked, but with fastfail indicator
> 
> My patch changed this one. When this times out, I unblock the queue.
> 
>>     dev_loss timeout     unblocked/removed
>>
>> Question:
>>
>> The chkready helper in the transport catches race conditions where the
>> block is being put in place, but queuecommand is running on another cpu,
>> thus returning DID_IMM_RETRY.
>>
>> Your patch is updating chkready to look for the case when fastfail has
>> fired, and returning DID_TRANSPORT_FASTFAIL  ?  Why ?  The request
>> queue would be blocked so there should be no call to queuecommand that
>> would fall into this category - unless you are assuming that fastfail
>> timeout is so fast that it could fall into the same race condition as
>> blocked.
> 
> In the patch, when the fast IO fail timer fires, I unblock the queues.
> 
> +       rport->fast_io_fail_timeout = 1;
>         i->f->terminate_rport_io(rport);
> +       scsi_target_unblock(&rport->dev);
> 
> At that time IO would be queued on drivers and would hit that test and
> be failed. New IO would also be failed upwards by that check.
> 
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/4] scsi: add transport host byte errors
  2007-03-15 13:20     ` [PATCH 2/4] scsi: add transport host byte errors James Smart
@ 2007-03-15 14:52       ` Mike Christie
  2007-03-15 16:33         ` James Smart
  0 siblings, 1 reply; 15+ messages in thread
From: Mike Christie @ 2007-03-15 14:52 UTC (permalink / raw)
  To: James.Smart; +Cc: linux-scsi

James Smart wrote:
> 
> 
> michaelc@cs.wisc.edu wrote:
>> fast_io_fail_tmo
>> iscsi: session recovery_tmo
>> fc: rport fast_io_fail_tmo
>>
>> The difference is that when the timer fires, for iscsi we unblock the
>> queue and fail commands in the blocked queue. FC just fails IO running
>> in the driver/fw/hw. The IO in the blocked queue sits there until
>> dev_loss_tmo.
> 
> True - FC contacted the LLDD to terminate i/o, who has no notion of any io
> that has yet to be sent to it via queuecommand(). Blocked i/o sits until
> dev_loss_tmo, as that is when the sdev gets torn down. Perhaps a block
> layer call
> should be created, that the FC transport can call, to terminate the blocked
> queue.  Thoughts ?

Do we want to fail IO that was sitting in the queue _and_ all new
incoming IO or just what was sitting in the queue?

The patches I sent, unplug the queues when failfast timer expires so
that is where the chk ready test for failfast comes in. When failfast
fires, the queue will be unplugged and we will hit the failfast test and
anything coming through will be failed.

Alternatively:

1. What about making the transport check ready test standard and adding
a transportt->check_ready callout which gets called before
scsi_dispatch_cmd calls the queuecommand?

2. Another option could be do add some code which does it a layer higher
at the scsi device level. The function would set the scsi_device state
to some value that would indicate the device is not ready and wants to
fail IO, then it would unplug the queue. The scsi_prep_fn would then
check for that state and fail IO. Or we could just set the state to an
existing value like offline and we would not have to modify and existing
state checks.

3. Or we could go one layer higher than that and add and set some block
layer bits. Unblock the queue and before the scsi_prep_fn is called the
block layer would check the state bit and fail IO.

4. Those would work if we want to fail IO that was queued and new
incoming IO. If we want to just fail IO that was queued, and queue IO
new incoming IO, then the block layer could offer a function which grabs
the queue lock, dequeued what was there and then fail each IO. scsi-ml
would then call that function as a helper. SCSI-ml again would not see
the IO here.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/4] scsi: add transport host byte errors
  2007-03-15 14:52       ` Mike Christie
@ 2007-03-15 16:33         ` James Smart
  0 siblings, 0 replies; 15+ messages in thread
From: James Smart @ 2007-03-15 16:33 UTC (permalink / raw)
  To: Mike Christie; +Cc: linux-scsi


So my position changes a little now that you pointed out that you
release the request queue when fastfail kicks in.

Mike Christie wrote:
> Do we want to fail IO that was sitting in the queue _and_ all new
> incoming IO or just what was sitting in the queue?

I believe all i/o - so that the upper layer sees everything occurring
on each state change and can choose accordingly.

> 
> The patches I sent, unplug the queues when failfast timer expires so
> that is where the chk ready test for failfast comes in. When failfast
> fires, the queue will be unplugged and we will hit the failfast test and
> anything coming through will be failed.
> 
> Alternatively:
> 
> 1. What about making the transport check ready test standard and adding
> a transportt->check_ready callout which gets called before
> scsi_dispatch_cmd calls the queuecommand?

I like the idea, as I hated having to add this snippet to each driver.
The other place we had to use it was in slave_alloc(), so doing the same
thing there would be great.

> 2. Another option could be do add some code which does it a layer higher
> at the scsi device level. The function would set the scsi_device state
> to some value that would indicate the device is not ready and wants to
> fail IO, then it would unplug the queue. The scsi_prep_fn would then
> check for that state and fail IO. Or we could just set the state to an
> existing value like offline and we would not have to modify and existing
> state checks.
 >
 > 3. Or we could go one layer higher than that and add and set some block
 > layer bits. Unblock the queue and before the scsi_prep_fn is called the
 > block layer would check the state bit and fail IO.
 >

Yep, and likely a better decision long term (though more work) as it's
storage transport agnostic. The "other layer" guys can answer better on
this one. I would think we still have to keep the chkready()s as there will
always be race conditions.

> 4. Those would work if we want to fail IO that was queued and new
> incoming IO. If we want to just fail IO that was queued, and queue IO
> new incoming IO, then the block layer could offer a function which grabs
> the queue lock, dequeued what was there and then fail each IO. scsi-ml
> would then call that function as a helper. SCSI-ml again would not see
> the IO here.

True - but this is a new and different abort routine from the one today
that expects to kick in on timeout. We purposely stopped the eh handler
from aborting a command while you had connectivity lost. So, it would be
a bad idea to go down this path.

-- james s

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2007-03-15 16:33 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-03-14 19:52 RFC: add transport host byte values and common failure behavior michaelc
2007-03-14 19:52 ` [PATCH 1/4] iscsi class, qla4xxx and libiscsi: export iscsi session state in sysfs michaelc
2007-03-14 19:52   ` [PATCH 2/4] scsi: add transport host byte errors michaelc
2007-03-14 19:52     ` [PATCH 3/4] iscsi class, libiscsi and qla4xxx: convert to new transport host byte values michaelc
2007-03-14 19:52       ` [PATCH 4/4] fc class: use " michaelc
2007-03-14 20:11     ` [PATCH 2/4] scsi: add transport host byte errors Mike Christie
2007-03-14 21:14     ` Mike Christie
2007-03-15 13:41       ` [PATCH 4/4] fc class: use transport host byte values James Smart
2007-03-15 14:26         ` Mike Christie
2007-03-15 14:45           ` James Smart
2007-03-15 13:20     ` [PATCH 2/4] scsi: add transport host byte errors James Smart
2007-03-15 14:52       ` Mike Christie
2007-03-15 16:33         ` James Smart
2007-03-15 12:58   ` [PATCH 1/4] iscsi class, qla4xxx and libiscsi: export iscsi session state in sysfs James Smart
2007-03-15 14:20     ` Mike Christie

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).