* [PATCH 0/8] nvme_fc: add dev_loss_tmo support
@ 2017-05-13 19:07 James Smart
  2017-05-13 19:07 ` [PATCH 1/8] nvme core: allow controller RESETTING to RECONNECTING transition James Smart
                   ` (9 more replies)
  0 siblings, 10 replies; 12+ messages in thread
From: James Smart @ 2017-05-13 19:07 UTC (permalink / raw)


FC, on the SCSI side, has long had a device loss timeout which governed
how long it would hide connectivity loss to remote targets. The timeout
value is maintained in the SCSI FC transport and admins are used to
going there to maintain it.

Eventually, the SCSI FC transport will be moved into something
independent from and above SCSI so that SCSI and NVME protocols can
be peers. That's fairly distant. In the meantime, to add the
functionality now, and sync with the SCSI FC transport, the LLDD will
be used as the conduit. The initial value for the timeout can be set
by the LLDD when it creates the remoteport via
nvme_fc_register_remoteport(). Later, if the value is updated via the
SCSI transport, the LLDD can call a new nvme_fc routine to update
the remoteport's dev_loss_tmo value, which can then be propagated to
the controllers on the remoteport.

The fabrics implementation already has a similar mechanism on rdma:
ctrl_loss_tmo, which may be set on a per-controller basis (finer
granularity than the FC port used for the connection). The nvme_fc
transport will therefore mediate and choose the lesser of the
controller's value and the remoteport's value.
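
For illustration, a minimal LLDD-side sketch of the conduit described
above. Only the nvme_fc entry points are from this series; the
localport, WWNs, N_Port_ID, and timeout variables are placeholders:

    struct nvme_fc_port_info pinfo = {
            .node_name    = wwnn,       /* placeholder: LLDD-known WWNN */
            .port_name    = wwpn,       /* placeholder: LLDD-known WWPN */
            .port_role    = FC_PORT_ROLE_NVME_TARGET,
            .port_id      = nport_id,   /* placeholder N_Port_ID */
            .dev_loss_tmo = 60,         /* seconds; 0 selects the default */
    };
    struct nvme_fc_remote_port *rport;
    int ret;

    /* seed the initial value when the remoteport is created */
    ret = nvme_fc_register_remoteport(lldd_localport, &pinfo, &rport);

    /* later: admin changed dev_loss_tmo via the SCSI FC transport */
    if (!ret)
            ret = nvme_fc_set_remoteport_devloss(rport, new_dev_loss_tmo);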

The patches were cut on nvme-4.12 after applying the patches in
http://lists.infradead.org/pipermail/linux-nvme/2017-May/010156.html
and
http://lists.infradead.org/pipermail/linux-nvme/2017-May/010166.html

James Smart (8):
  nvme core: allow controller RESETTING to RECONNECTING transition
  nvme_fc: change ctlr state assignments during reset/reconnect
  nvme_fc: add a dev_loss_tmo field to the remoteport
  nvme_fc: add dev_loss_tmo to controller
  nvme_fc: check connectivity before initiating reconnects
  nvme_fc: change failure code on remoteport connectivity loss
  nvme_fc: move remote port get/put/free location
  nvme_fc: add dev_loss_tmo timeout and remoteport resume support

 drivers/nvme/host/core.c       |   1 +
 drivers/nvme/host/fc.c         | 465 +++++++++++++++++++++++++++++++++++------
 include/linux/nvme-fc-driver.h |  11 +-
 3 files changed, 411 insertions(+), 66 deletions(-)

-- 
2.11.0

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/8] nvme core: allow controller RESETTING to RECONNECTING transition
  2017-05-13 19:07 [PATCH 0/8] nvme_fc: add dev_loss_tmo support James Smart
@ 2017-05-13 19:07 ` James Smart
  2017-05-13 19:07 ` [PATCH 2/8] nvme_fc: change ctlr state assignments during reset/reconnect James Smart
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: James Smart @ 2017-05-13 19:07 UTC (permalink / raw)


Allow controller state transition: RESETTING to RECONNECTING

The nvme_fc transport will set the state to RESETTING when
tearing down the current association (error or controller reset),
then transition to RECONNECTING when attempting to establish a new
association.

-- james

Signed-off-by: James Smart <james.smart@broadcom.com>
---
originally posted standalone
recut on nvme-4.12

 drivers/nvme/host/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index d5e0906262ea..ae977ef886a3 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -175,6 +175,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 	case NVME_CTRL_RECONNECTING:
 		switch (old_state) {
 		case NVME_CTRL_LIVE:
+		case NVME_CTRL_RESETTING:
 			changed = true;
 			/* FALLTHRU */
 		default:
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/8] nvme_fc: change ctlr state assignments during reset/reconnect
  2017-05-13 19:07 [PATCH 0/8] nvme_fc: add dev_loss_tmo support James Smart
  2017-05-13 19:07 ` [PATCH 1/8] nvme core: allow controller RESETTING to RECONNECTING transition James Smart
@ 2017-05-13 19:07 ` James Smart
  2017-05-13 19:07 ` [PATCH 3/8] nvme_fc: add a dev_loss_tmo field to the remoteport James Smart
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: James Smart @ 2017-05-13 19:07 UTC (permalink / raw)


Existing code:
Set NVME_CTRL_RESETTING upon entry from the core reset_ctrl
callback and left it set that way until reconnected.
Set NVME_CTRL_RECONNECTING after a transport-detected error
and left it set that way until reconnected.

Revise the code so that NVME_CTRL_RESETTING is always set when
tearing down the association regardless of why/how, and after
the association is torn down, transition to NVME_CTRL_RECONNECTING
while it attempts to establish a new association with the target.

The RESETTING->RECONNECTING state transition is dependent upon the
patch that enables the transition.

Signed-off-by: James Smart <james.smart@broadcom.com>
---
 drivers/nvme/host/fc.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 9a594ea323e8..55d0d8c3e92f 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -1799,7 +1799,7 @@ nvme_fc_error_recovery(struct nvme_fc_ctrl *ctrl, char *errmsg)
 	if (ctrl->queue_count > 1)
 		nvme_stop_queues(&ctrl->ctrl);
 
-	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RECONNECTING)) {
+	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING)) {
 		dev_err(ctrl->ctrl.device,
 			"NVME-FC{%d}: error_recovery: Couldn't change state "
 			"to RECONNECTING\n", ctrl->cnum);
@@ -2662,6 +2662,13 @@ nvme_fc_reset_ctrl_work(struct work_struct *work)
 	/* will block will waiting for io to terminate */
 	nvme_fc_delete_association(ctrl);
 
+	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RECONNECTING)) {
+		dev_err(ctrl->ctrl.device,
+			"NVME-FC{%d}: controller reset: Couldn't change "
+			"state to RECONNECTING\n", ctrl->cnum);
+		return;
+	}
+
 	ret = nvme_fc_create_association(ctrl);
 	if (ret)
 		nvme_fc_reconnect_or_delete(ctrl, ret);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/8] nvme_fc: add a dev_loss_tmo field to the remoteport
  2017-05-13 19:07 [PATCH 0/8] nvme_fc: add dev_loss_tmo support James Smart
  2017-05-13 19:07 ` [PATCH 1/8] nvme core: allow controller RESETTING to RECONNECTING transition James Smart
  2017-05-13 19:07 ` [PATCH 2/8] nvme_fc: change ctlr state assignments during reset/reconnect James Smart
@ 2017-05-13 19:07 ` James Smart
  2017-05-13 19:07 ` [PATCH 4/8] nvme_fc: add dev_loss_tmo to controller James Smart
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: James Smart @ 2017-05-13 19:07 UTC (permalink / raw)


Add a dev_loss_tmo value, paralleling the SCSI FC transport, for device
connectivity loss. The LLDD may supply an initial value in the
nvme_fc_register_remoteport() call; otherwise the transport falls back
on a default of 60 seconds.
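
For illustration, how an LLDD-supplied value is treated at registration
time, based on the constants and checks added below (pinfo is the
nvme_fc_port_info passed to nvme_fc_register_remoteport(); the values
are examples only):

    pinfo.dev_loss_tmo = 0;   /* remoteport gets the 60 second default */
    pinfo.dev_loss_tmo = 3;   /* rejected with -EINVAL: below the 4 second minimum */
    pinfo.dev_loss_tmo = 30;  /* accepted: remoteport.dev_loss_tmo = 30 */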

Signed-off-by: James Smart <james.smart@broadcom.com>
---
 drivers/nvme/host/fc.c         | 14 ++++++++++++++
 include/linux/nvme-fc-driver.h |  9 +++++++--
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 55d0d8c3e92f..6eaf70eaedf4 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -45,6 +45,10 @@ enum nvme_fc_queue_flags {
 
 #define NVMEFC_QUEUE_DELAY	3		/* ms units */
 
+#define NVME_FC_DEFAULT_DEV_LOSS_TMO	60	/* seconds */
+#define NVME_FC_EXPECTED_RECONNECT_TM	2	/* seconds - E_D_TOV */
+#define NVME_FC_MIN_DEV_LOSS_TMO	(2 * NVME_FC_EXPECTED_RECONNECT_TM)
+
 struct nvme_fc_queue {
 	struct nvme_fc_ctrl	*ctrl;
 	struct device		*dev;
@@ -438,6 +442,12 @@ nvme_fc_register_remoteport(struct nvme_fc_local_port *localport,
 	unsigned long flags;
 	int ret, idx;
 
+	if (pinfo->dev_loss_tmo &&
+			pinfo->dev_loss_tmo < NVME_FC_MIN_DEV_LOSS_TMO) {
+		ret = -EINVAL;
+		goto out_reghost_failed;
+	}
+
 	newrec = kmalloc((sizeof(*newrec) + lport->ops->remote_priv_sz),
 			 GFP_KERNEL);
 	if (!newrec) {
@@ -471,6 +481,10 @@ nvme_fc_register_remoteport(struct nvme_fc_local_port *localport,
 	newrec->remoteport.port_id = pinfo->port_id;
 	newrec->remoteport.port_state = FC_OBJSTATE_ONLINE;
 	newrec->remoteport.port_num = idx;
+	if (pinfo->dev_loss_tmo)
+		newrec->remoteport.dev_loss_tmo = pinfo->dev_loss_tmo;
+	else
+		newrec->remoteport.dev_loss_tmo = NVME_FC_DEFAULT_DEV_LOSS_TMO;
 
 	spin_lock_irqsave(&nvme_fc_lock, flags);
 	list_add_tail(&newrec->endp_list, &lport->endp_list);
diff --git a/include/linux/nvme-fc-driver.h b/include/linux/nvme-fc-driver.h
index c42313f9a451..61f3eae534ee 100644
--- a/include/linux/nvme-fc-driver.h
+++ b/include/linux/nvme-fc-driver.h
@@ -40,6 +40,8 @@
  * @node_name: FC WWNN for the port
  * @port_name: FC WWPN for the port
  * @port_role: What NVME roles are supported (see FC_PORT_ROLE_xxx)
+ * @dev_loss_tmo: maximum delay for reconnects to an association on
+ *             this device. Used only on a remoteport.
  *
  * Initialization values for dynamic port fields:
  * @port_id:      FC N_Port_ID currently assigned the port. Upper 8 bits must
@@ -50,6 +52,7 @@ struct nvme_fc_port_info {
 	u64			port_name;
 	u32			port_role;
 	u32			port_id;
+	u32			dev_loss_tmo;
 };
 
 
@@ -244,6 +247,9 @@ struct nvme_fc_local_port {
  *             The length of the buffer corresponds to the remote_priv_sz
  *             value specified in the nvme_fc_port_template supplied by
  *             the LLDD.
+ * @dev_loss_tmo: maximum delay for reconnects to an association on
+ *             this device. To modify, lldd must call
+ *             nvme_fc_set_remoteport_devloss().
  *
  * Fields with dynamic values. Values may change base on link or login
  * state. LLDD may reference fields directly to change them. Initialized by
@@ -259,10 +265,9 @@ struct nvme_fc_remote_port {
 	u32 port_role;
 	u64 node_name;
 	u64 port_name;
-
 	struct nvme_fc_local_port *localport;
-
 	void *private;
+	u32 dev_loss_tmo;
 
 	/* dynamic fields */
 	u32 port_id;
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/8] nvme_fc: add dev_loss_tmo to controller
  2017-05-13 19:07 [PATCH 0/8] nvme_fc: add dev_loss_tmo support James Smart
                   ` (2 preceding siblings ...)
  2017-05-13 19:07 ` [PATCH 3/8] nvme_fc: add a dev_loss_tmo field to the remoteport James Smart
@ 2017-05-13 19:07 ` James Smart
  2017-05-13 19:07 ` [PATCH 5/8] nvme_fc: check connectivity before initiating reconnects James Smart
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: James Smart @ 2017-05-13 19:07 UTC (permalink / raw)


This patch adds a dev_loss_tmo value to the controller. The value
is initialized from the remoteport.

The patch also adds an LLDD-callable routine,
nvme_fc_set_remoteport_devloss(), to change the value on the remoteport
and apply the new value to the controllers on the remoteport.

The dev_loss_tmo value set on the controller will ultimately be the
maximum window allowed for reconnection, whether it was started due
to controller reset, transport error, or loss of connectivity to the
target.

The dev_loss_tmo value set on the controller will be the smaller of
the controller's (max_reconnects * reconnect_delay) window set at
creation and the remoteport's dev_loss_tmo value.  If the remoteport's
dev_loss_tmo supersedes the controller's reconnect window, the
reconnect parameters are adjusted to fit within the new dev_loss_tmo
value.
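
A worked example of the merge described above (illustrative numbers
only): with reconnect_delay=10 and max_reconnects=6, the
connect-request window is 60 seconds. If the remoteport's dev_loss_tmo
is 30, the controller's dev_loss_tmo becomes 30 and the reconnect
parameters are rescaled to roughly 3 attempts, 10 seconds apart. If the
remoteport's dev_loss_tmo were instead 120, the controller would keep
the 60 second window from the connect request.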

Signed-off-by: James Smart <james.smart@broadcom.com>

---
This is a revised version of what was in the RFC:
removed all references to kato, which was a red herring.

 drivers/nvme/host/fc.c         | 93 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/nvme-fc-driver.h |  2 +
 2 files changed, 95 insertions(+)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 6eaf70eaedf4..13c18966fadc 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -167,6 +167,7 @@ struct nvme_fc_ctrl {
 	struct work_struct	delete_work;
 	struct work_struct	reset_work;
 	struct delayed_work	connect_work;
+	u32			dev_loss_tmo;
 
 	struct kref		ref;
 	u32			flags;
@@ -2772,6 +2773,96 @@ __nvme_fc_options_match(struct nvmf_ctrl_options *opts,
 	return true;
 }
 
+static void
+nvme_fc_set_ctrl_devloss(struct nvme_fc_ctrl *ctrl,
+		struct nvmf_ctrl_options *opts)
+{
+	u32 dev_loss_tmo;
+
+	/*
+	 * dev_loss_tmo will be the max amount of time after an association
+	 * failure that will be allowed for a new association to be
+	 * established. It doesn't matter why the original association
+	 * failed (FC connectivity loss, transport error, admin-request).
+	 * The new association must be established before dev_loss_tmo
+	 * expires or the controller will be torn down.
+	 *
+	 * If the connect parameters are less than the FC port dev_loss_tmo
+	 * parameter, scale dev_loss_tmo to the connect parameters.
+	 *
+	 * If the connect parameters are larger than the FC port
+	 * dev_loss_tmo parameter, adjust the connect parameters so that
+	 * there is at least one reconnect attempt before failing.
+	 * Note: reconnects will be attempted only if there is FC connectivity.
+	 */
+
+	if (opts->max_reconnects < 1)
+		opts->max_reconnects = 1;
+	dev_loss_tmo = opts->reconnect_delay * opts->max_reconnects;
+
+	ctrl->dev_loss_tmo =
+		min_t(u32, ctrl->rport->remoteport.dev_loss_tmo, dev_loss_tmo);
+	if (ctrl->dev_loss_tmo < ctrl->rport->remoteport.dev_loss_tmo)
+		dev_warn(ctrl->ctrl.device,
+			"NVME-FC{%d}: scaling dev_loss_tmo to reconnect "
+			"window (%d)\n",
+			ctrl->cnum, ctrl->dev_loss_tmo);
+
+	/* resync dev_loss_tmo with the reconnect window */
+	if (ctrl->dev_loss_tmo < opts->reconnect_delay * opts->max_reconnects) {
+		if (!ctrl->dev_loss_tmo)
+			opts->max_reconnects = 0;
+		else {
+			opts->reconnect_delay =
+				min_t(u32, opts->reconnect_delay,
+					ctrl->dev_loss_tmo -
+						NVME_FC_EXPECTED_RECONNECT_TM);
+			opts->max_reconnects = DIV_ROUND_UP(ctrl->dev_loss_tmo,
+						opts->reconnect_delay);
+			dev_warn(ctrl->ctrl.device,
+				"NVME-FC{%d}: dev_loss_tmo %d: scaling "
+				"reconnect delay %d max reconnects %d\n",
+				ctrl->cnum, ctrl->dev_loss_tmo,
+				opts->reconnect_delay, opts->max_reconnects);
+		}
+	}
+}
+
+int
+nvme_fc_set_remoteport_devloss(struct nvme_fc_remote_port *portptr,
+			u32 dev_loss_tmo)
+{
+	struct nvme_fc_rport *rport = remoteport_to_rport(portptr);
+	struct nvme_fc_ctrl *ctrl;
+	unsigned long flags;
+
+	/*
+	 * Allow dev_loss_tmo set to 0. This will allow
+	 * nvme_fc_unregister_remoteport() to immediately delete
+	 * controllers without waiting a dev_loss_tmo timeout.
+	 */
+	if (dev_loss_tmo && dev_loss_tmo < NVME_FC_MIN_DEV_LOSS_TMO)
+		return -ERANGE;
+
+	spin_lock_irqsave(&rport->lock, flags);
+
+	if (portptr->port_state != FC_OBJSTATE_ONLINE) {
+		spin_unlock_irqrestore(&rport->lock, flags);
+		return -EINVAL;
+	}
+
+	rport->remoteport.dev_loss_tmo = dev_loss_tmo;
+
+	list_for_each_entry(ctrl, &rport->ctrl_list, ctrl_list)
+		/* Apply values for use in next reconnect cycle */
+		nvme_fc_set_ctrl_devloss(ctrl, ctrl->ctrl.opts);
+
+	spin_unlock_irqrestore(&rport->lock, flags);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nvme_fc_set_remoteport_devloss);
+
 static struct nvme_ctrl *
 nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 	struct nvme_fc_lport *lport, struct nvme_fc_rport *rport)
@@ -2884,6 +2975,8 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 	list_add_tail(&ctrl->ctrl_list, &rport->ctrl_list);
 	spin_unlock_irqrestore(&rport->lock, flags);
 
+	nvme_fc_set_ctrl_devloss(ctrl, opts);
+
 	ret = nvme_fc_create_association(ctrl);
 	if (ret) {
 		ctrl->ctrl.opts = NULL;
diff --git a/include/linux/nvme-fc-driver.h b/include/linux/nvme-fc-driver.h
index 61f3eae534ee..e96a073861b6 100644
--- a/include/linux/nvme-fc-driver.h
+++ b/include/linux/nvme-fc-driver.h
@@ -453,6 +453,8 @@ int nvme_fc_unregister_remoteport(struct nvme_fc_remote_port *remoteport);
 
 void nvme_fc_rescan_remoteport(struct nvme_fc_remote_port *remoteport);
 
+int nvme_fc_set_remoteport_devloss(struct nvme_fc_remote_port *remoteport,
+			u32 dev_loss_tmo);
 
 
 /*
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/8] nvme_fc: check connectivity before initiating reconnects
  2017-05-13 19:07 [PATCH 0/8] nvme_fc: add dev_loss_tmo support James Smart
                   ` (3 preceding siblings ...)
  2017-05-13 19:07 ` [PATCH 4/8] nvme_fc: add dev_loss_tmo to controller James Smart
@ 2017-05-13 19:07 ` James Smart
  2017-05-13 19:07 ` [PATCH 6/8] nvme_fc: change failure code on remoteport connectivity loss James Smart
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: James Smart @ 2017-05-13 19:07 UTC (permalink / raw)


Check remoteport connectivity before initiating reconnects. A reconnect
is only scheduled, and a new association only attempted, when the
remoteport is online.

Signed-off-by: James Smart <james.smart@broadcom.com>
---
 drivers/nvme/host/fc.c | 37 +++++++++++++++++++++++++++++++------
 1 file changed, 31 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 13c18966fadc..930c9bfa0aca 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -628,6 +628,19 @@ nvme_fc_rescan_remoteport(struct nvme_fc_remote_port *remoteport)
 }
 EXPORT_SYMBOL_GPL(nvme_fc_rescan_remoteport);
 
+static inline bool
+nvme_fc_rport_is_online(struct nvme_fc_rport *rport)
+{
+	unsigned long flags;
+	bool online;
+
+	spin_lock_irqsave(&rport->lock, flags);
+	online = (rport->remoteport.port_state == FC_OBJSTATE_ONLINE);
+	spin_unlock_irqrestore(&rport->lock, flags);
+
+	return online;
+}
+
 
 /* *********************** FC-NVME DMA Handling **************************** */
 
@@ -2358,6 +2371,9 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
 
 	++ctrl->ctrl.opts->nr_reconnects;
 
+	if (!nvme_fc_rport_is_online(ctrl->rport))
+		return -ENODEV;
+
 	/*
 	 * Create the admin queue
 	 */
@@ -2653,6 +2669,12 @@ nvme_fc_reconnect_or_delete(struct nvme_fc_ctrl *ctrl, int status)
 		ctrl->cnum, status);
 
 	if (nvmf_should_reconnect(&ctrl->ctrl)) {
+		/*
+		 * Only schedule the reconnect if the remote port is online
+		 */
+		if (!nvme_fc_rport_is_online(ctrl->rport))
+			return;
+
 		dev_info(ctrl->ctrl.device,
 			"NVME-FC{%d}: Reconnect attempt in %d seconds.\n",
 			ctrl->cnum, ctrl->ctrl.opts->reconnect_delay);
@@ -2684,12 +2706,15 @@ nvme_fc_reset_ctrl_work(struct work_struct *work)
 		return;
 	}
 
-	ret = nvme_fc_create_association(ctrl);
-	if (ret)
-		nvme_fc_reconnect_or_delete(ctrl, ret);
-	else
-		dev_info(ctrl->ctrl.device,
-			"NVME-FC{%d}: controller reset complete\n", ctrl->cnum);
+	if (nvme_fc_rport_is_online(ctrl->rport)) {
+		ret = nvme_fc_create_association(ctrl);
+		if (ret)
+			nvme_fc_reconnect_or_delete(ctrl, ret);
+		else
+			dev_info(ctrl->ctrl.device,
+				"NVME-FC{%d}: controller reset complete\n",
+				ctrl->cnum);
+	}
 }
 
 /*
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 6/8] nvme_fc: change failure code on remoteport connectivity loss
  2017-05-13 19:07 [PATCH 0/8] nvme_fc: add dev_loss_tmo support James Smart
                   ` (4 preceding siblings ...)
  2017-05-13 19:07 ` [PATCH 5/8] nvme_fc: check connectivity before initiating reconnects James Smart
@ 2017-05-13 19:07 ` James Smart
  2017-05-13 19:07 ` [PATCH 7/8] nvme_fc: move remote port get/put/free location James Smart
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: James Smart @ 2017-05-13 19:07 UTC (permalink / raw)


Rather than return BLK_MQ_RQ_QUEUE_ERROR when connectivity to the
remoteport has been lost, which gets reflected all the way back to the
user, move the failure check so that the LLDD bounces the request and
it falls back into the busy logic, which requeues the io.

This addresses io failures on ios issued right at the time of
connectivity loss.

Note: the connectivity check is not done under a lock, to avoid a
fast-path performance penalty. The expectation is that the LLDD will
validate connectivity as well.
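
For illustration, a hypothetical LLDD-side check in its ->fcp_io()
handler. This is a sketch only; the lldd_* type and helpers are
placeholders, not part of this series:

    static int
    lldd_fcp_io(struct nvme_fc_local_port *localport,
                struct nvme_fc_remote_port *remoteport,
                void *hw_queue_handle,
                struct nvmefc_fcp_req *fcpreq)
    {
            struct lldd_rport *rp = remoteport->private;  /* LLDD private data */

            /* remote port gone: bounce the io so the transport requeues it */
            if (!lldd_rport_is_logged_in(rp))
                    return -EBUSY;

            return lldd_issue_fcp(rp, hw_queue_handle, fcpreq);
    }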

Signed-off-by: James Smart <james.smart@broadcom.com>
---
 drivers/nvme/host/fc.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 930c9bfa0aca..8fb3845f7faa 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -1956,13 +1956,6 @@ nvme_fc_start_fcp_op(struct nvme_fc_ctrl *ctrl, struct nvme_fc_queue *queue,
 	u32 csn;
 	int ret;
 
-	/*
-	 * before attempting to send the io, check to see if we believe
-	 * the target device is present
-	 */
-	if (ctrl->rport->remoteport.port_state != FC_OBJSTATE_ONLINE)
-		return BLK_MQ_RQ_QUEUE_ERROR;
-
 	if (!nvme_fc_ctrl_get(ctrl))
 		return BLK_MQ_RQ_QUEUE_ERROR;
 
@@ -2038,7 +2031,8 @@ nvme_fc_start_fcp_op(struct nvme_fc_ctrl *ctrl, struct nvme_fc_queue *queue,
 
 		nvme_fc_ctrl_put(ctrl);
 
-		if (ret != -EBUSY)
+		if (ctrl->rport->remoteport.port_state == FC_OBJSTATE_ONLINE &&
+				ret != -EBUSY)
 			return BLK_MQ_RQ_QUEUE_ERROR;
 
 		if (op->rq) {
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 7/8] nvme_fc: move remote port get/put/free location
  2017-05-13 19:07 [PATCH 0/8] nvme_fc: add dev_loss_tmo support James Smart
                   ` (5 preceding siblings ...)
  2017-05-13 19:07 ` [PATCH 6/8] nvme_fc: change failure code on remoteport connectivity loss James Smart
@ 2017-05-13 19:07 ` James Smart
  2017-05-13 19:07 ` [PATCH 8/8] nvme_fc: add dev_loss_tmo timeout and remoteport resume support James Smart
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: James Smart @ 2017-05-13 19:07 UTC (permalink / raw)


Move nvme_fc_rport_get/put and the rport free routine higher in the
file to avoid adding prototypes to resolve references in upcoming code
additions.

Signed-off-by: James Smart <james.smart@broadcom.com>
---
 drivers/nvme/host/fc.c | 78 +++++++++++++++++++++++++-------------------------
 1 file changed, 39 insertions(+), 39 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 8fb3845f7faa..8150e5d937cf 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -417,6 +417,45 @@ nvme_fc_signal_discovery_scan(struct nvme_fc_lport *lport,
 	kobject_uevent_env(&nvmefc_device->kobj, KOBJ_CHANGE, envp);
 }
 
+static void
+nvme_fc_free_rport(struct kref *ref)
+{
+	struct nvme_fc_rport *rport =
+		container_of(ref, struct nvme_fc_rport, ref);
+	struct nvme_fc_lport *lport =
+			localport_to_lport(rport->remoteport.localport);
+	unsigned long flags;
+
+	WARN_ON(rport->remoteport.port_state != FC_OBJSTATE_DELETED);
+	WARN_ON(!list_empty(&rport->ctrl_list));
+
+	/* remove from lport list */
+	spin_lock_irqsave(&nvme_fc_lock, flags);
+	list_del(&rport->endp_list);
+	spin_unlock_irqrestore(&nvme_fc_lock, flags);
+
+	/* let the LLDD know we've finished tearing it down */
+	lport->ops->remoteport_delete(&rport->remoteport);
+
+	ida_simple_remove(&lport->endp_cnt, rport->remoteport.port_num);
+
+	kfree(rport);
+
+	nvme_fc_lport_put(lport);
+}
+
+static void
+nvme_fc_rport_put(struct nvme_fc_rport *rport)
+{
+	kref_put(&rport->ref, nvme_fc_free_rport);
+}
+
+static int
+nvme_fc_rport_get(struct nvme_fc_rport *rport)
+{
+	return kref_get_unless_zero(&rport->ref);
+}
+
 /**
  * nvme_fc_register_remoteport - transport entry point called by an
  *                              LLDD to register the existence of a NVME
@@ -506,45 +545,6 @@ nvme_fc_register_remoteport(struct nvme_fc_local_port *localport,
 }
 EXPORT_SYMBOL_GPL(nvme_fc_register_remoteport);
 
-static void
-nvme_fc_free_rport(struct kref *ref)
-{
-	struct nvme_fc_rport *rport =
-		container_of(ref, struct nvme_fc_rport, ref);
-	struct nvme_fc_lport *lport =
-			localport_to_lport(rport->remoteport.localport);
-	unsigned long flags;
-
-	WARN_ON(rport->remoteport.port_state != FC_OBJSTATE_DELETED);
-	WARN_ON(!list_empty(&rport->ctrl_list));
-
-	/* remove from lport list */
-	spin_lock_irqsave(&nvme_fc_lock, flags);
-	list_del(&rport->endp_list);
-	spin_unlock_irqrestore(&nvme_fc_lock, flags);
-
-	/* let the LLDD know we've finished tearing it down */
-	lport->ops->remoteport_delete(&rport->remoteport);
-
-	ida_simple_remove(&lport->endp_cnt, rport->remoteport.port_num);
-
-	kfree(rport);
-
-	nvme_fc_lport_put(lport);
-}
-
-static void
-nvme_fc_rport_put(struct nvme_fc_rport *rport)
-{
-	kref_put(&rport->ref, nvme_fc_free_rport);
-}
-
-static int
-nvme_fc_rport_get(struct nvme_fc_rport *rport)
-{
-	return kref_get_unless_zero(&rport->ref);
-}
-
 static int
 nvme_fc_abort_lsops(struct nvme_fc_rport *rport)
 {
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 8/8] nvme_fc: add dev_loss_tmo timeout and remoteport resume support
  2017-05-13 19:07 [PATCH 0/8] nvme_fc: add dev_loss_tmo support James Smart
                   ` (6 preceding siblings ...)
  2017-05-13 19:07 ` [PATCH 7/8] nvme_fc: move remote port get/put/free location James Smart
@ 2017-05-13 19:07 ` James Smart
  2017-05-22 23:57 ` [PATCH 0/8] nvme_fc: add dev_loss_tmo support James Smart
  2017-05-23  7:20 ` Christoph Hellwig
  9 siblings, 0 replies; 12+ messages in thread
From: James Smart @ 2017-05-13 19:07 UTC (permalink / raw)


This patch adds the dev_loss_tmo functionality to the transport.

When a remoteport is unregistered (connectivity lost), it is marked
DELETED and the following is performed on all the controllers on the
remoteport: each controller is reset to delete its current association.
Once the association is terminated, the dev_loss_tmo timer is started.
A reconnect is not scheduled as there is no connectivity. Note: the
start of the dev_loss_tmo timer is in the generic
delete-association/create-new-association path. Thus it will be started
regardless of whether the reset was due to remote port connectivity
loss, a controller reset, or a transport run-time error.

When a remoteport is registered (connectivity established), the
transport searches the list of remoteport structures that have pending
deletions (controllers waiting to have dev_loss_tmo fire, thus
preventing remoteport deletion). The transport looks for a matching
wwnn/wwpn. If one is found, the remoteport is transitioned back to
ONLINE, and the following occurs on all controllers on the remoteport:
any controllers in a RECONNECTING state have reconnection attempts kicked
off. If the controller was RESETTING, its natural transition to
RECONNECTING will start a reconnect attempt.

Once a controller successfully reconnects to a new association, any
dev_loss_tmo timer for it is terminated.

If a dev_loss_tmo timer for a controller fires, the controller is
unconditionally deleted.
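
For illustration, the LLDD-visible sequence around a link bounce. This
is a sketch only; lldd_localport, pinfo, rport, and ret are
placeholders:

    /* connectivity lost: the transport starts dev_loss_tmo handling */
    nvme_fc_unregister_remoteport(rport);

    /* ... link comes back before dev_loss_tmo expires ... */

    /*
     * Re-registering with the same node_name/port_name resumes the
     * suspended remoteport and kicks off reconnects on its controllers
     * rather than creating a new remoteport.
     */
    ret = nvme_fc_register_remoteport(lldd_localport, &pinfo, &rport);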

Signed-off-by: James Smart <james.smart@broadcom.com>
---
 drivers/nvme/host/fc.c | 226 ++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 215 insertions(+), 11 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 8150e5d937cf..989d50c7133c 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -167,6 +167,7 @@ struct nvme_fc_ctrl {
 	struct work_struct	delete_work;
 	struct work_struct	reset_work;
 	struct delayed_work	connect_work;
+	struct delayed_work	dev_loss_work;
 	u32			dev_loss_tmo;
 
 	struct kref		ref;
@@ -456,6 +457,86 @@ nvme_fc_rport_get(struct nvme_fc_rport *rport)
 	return kref_get_unless_zero(&rport->ref);
 }
 
+static void
+nvme_fc_resume_controller(struct nvme_fc_ctrl *ctrl)
+{
+	switch (ctrl->ctrl.state) {
+	case NVME_CTRL_RECONNECTING:
+		/*
+		 * As all reconnects were suppressed, schedule a
+		 * connect.
+		 */
+		queue_delayed_work(nvme_fc_wq, &ctrl->connect_work, 0);
+		break;
+
+	case NVME_CTRL_RESETTING:
+		/*
+		 * Controller is already in the process of terminating the
+		 * association. No need to do anything further. The reconnect
+		 * step will naturally occur after the reset completes.
+		 */
+		break;
+
+	default:
+		/* no action to take - let it delete */
+		break;
+	}
+}
+
+static struct nvme_fc_rport *
+nvme_fc_attach_to_suspended_rport(struct nvme_fc_lport *lport,
+				struct nvme_fc_port_info *pinfo)
+{
+	struct nvme_fc_rport *rport;
+	struct nvme_fc_ctrl *ctrl;
+	unsigned long flags;
+
+	spin_lock_irqsave(&nvme_fc_lock, flags);
+
+	list_for_each_entry(rport, &lport->endp_list, endp_list) {
+		if (rport->remoteport.node_name != pinfo->node_name ||
+		    rport->remoteport.port_name != pinfo->port_name)
+			continue;
+
+		if (!nvme_fc_rport_get(rport)) {
+			rport = ERR_PTR(-ENOLCK);
+			goto out_done;
+		}
+
+		spin_unlock_irqrestore(&nvme_fc_lock, flags);
+
+		spin_lock_irqsave(&rport->lock, flags);
+
+		/* has it been unregistered */
+		if (rport->remoteport.port_state != FC_OBJSTATE_DELETED) {
+			/* means lldd called us twice */
+			spin_unlock_irqrestore(&rport->lock, flags);
+			nvme_fc_rport_put(rport);
+			return ERR_PTR(-ESTALE);
+		}
+
+		rport->remoteport.port_state = FC_OBJSTATE_ONLINE;
+
+		/*
+		 * kick off a reconnect attempt on all associations to the
+	 * remote port. A successful reconnect will resume i/o.
+		 */
+		list_for_each_entry(ctrl, &rport->ctrl_list, ctrl_list)
+			nvme_fc_resume_controller(ctrl);
+
+		spin_unlock_irqrestore(&rport->lock, flags);
+
+		return rport;
+	}
+
+	rport = NULL;
+
+out_done:
+	spin_unlock_irqrestore(&nvme_fc_lock, flags);
+
+	return rport;
+}
+
 /**
  * nvme_fc_register_remoteport - transport entry point called by an
  *                              LLDD to register the existence of a NVME
@@ -488,22 +569,46 @@ nvme_fc_register_remoteport(struct nvme_fc_local_port *localport,
 		goto out_reghost_failed;
 	}
 
+	if (!nvme_fc_lport_get(lport)) {
+		ret = -ESHUTDOWN;
+		goto out_reghost_failed;
+	}
+
+	/*
+	 * look to see if there is already a remoteport that is waiting
+	 * for a reconnect (within dev_loss_tmo) with the same WWN's.
+	 * If so, transition to it and reconnect.
+	 */
+	newrec = nvme_fc_attach_to_suspended_rport(lport, pinfo);
+
+	/* found an rport, but something about its state is bad */
+	if (IS_ERR(newrec)) {
+		ret = PTR_ERR(newrec);
+		goto out_lport_put;
+
+	/* found existing rport, which was resumed */
+	} else if (newrec) {
+		/* Ignore pinfo->dev_loss_tmo. Leave rport and ctlr's as is */
+
+		nvme_fc_lport_put(lport);
+		nvme_fc_signal_discovery_scan(lport, newrec);
+		*portptr = &newrec->remoteport;
+		return 0;
+	}
+
+	/* nothing found - allocate a new remoteport struct */
+
 	newrec = kmalloc((sizeof(*newrec) + lport->ops->remote_priv_sz),
 			 GFP_KERNEL);
 	if (!newrec) {
 		ret = -ENOMEM;
-		goto out_reghost_failed;
-	}
-
-	if (!nvme_fc_lport_get(lport)) {
-		ret = -ESHUTDOWN;
-		goto out_kfree_rport;
+		goto out_lport_put;
 	}
 
 	idx = ida_simple_get(&lport->endp_cnt, 0, 0, GFP_KERNEL);
 	if (idx < 0) {
 		ret = -ENOSPC;
-		goto out_lport_put;
+		goto out_kfree_rport;
 	}
 
 	INIT_LIST_HEAD(&newrec->endp_list);
@@ -535,10 +640,10 @@ nvme_fc_register_remoteport(struct nvme_fc_local_port *localport,
 	*portptr = &newrec->remoteport;
 	return 0;
 
-out_lport_put:
-	nvme_fc_lport_put(lport);
 out_kfree_rport:
 	kfree(newrec);
+out_lport_put:
+	nvme_fc_lport_put(lport);
 out_reghost_failed:
 	*portptr = NULL;
 	return ret;
@@ -569,6 +674,74 @@ nvme_fc_abort_lsops(struct nvme_fc_rport *rport)
 	return 0;
 }
 
+static void
+nvmet_fc_start_dev_loss_tmo(struct nvme_fc_ctrl *ctrl, u32 dev_loss_tmo)
+{
+	/* if dev_loss_tmo==0, dev loss is immediate */
+	if (!dev_loss_tmo) {
+		dev_info(ctrl->ctrl.device,
+			"NVME-FC{%d}: controller connectivity lost. "
+			"Deleting controller.\n",
+			ctrl->cnum);
+		__nvme_fc_del_ctrl(ctrl);
+		return;
+	}
+
+	dev_info(ctrl->ctrl.device,
+		"NVME-FC{%d}: controller connectivity lost. Awaiting reconnect\n",
+		ctrl->cnum);
+
+	switch (ctrl->ctrl.state) {
+	case NVME_CTRL_LIVE:
+		/*
+		 * Schedule a controller reset. The reset will terminate
+		 * the association and schedule the dev_loss_tmo timer.
+		 * The reconnect after terminating the association will
+		 * note the rport state and will not be scheduled.
+		 * The controller will sit in that state, with io
+		 * suspended at the block layer, until either dev_loss_tmo
+		 * expires or the remoteport is re-registered. If
+		 * re-registered, an immediate connect attempt will be
+		 * made.
+		 */
+		if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING) ||
+		    !queue_work(nvme_fc_wq, &ctrl->reset_work))
+			__nvme_fc_del_ctrl(ctrl);
+		break;
+
+	case NVME_CTRL_RECONNECTING:
+		/*
+		 * The association has already been terminated and
+		 * dev_loss_tmo scheduled. The controller is either in
+		 * the process of connecting or has scheduled a
+		 * reconnect attempt.
+		 * If in the process of connecting, it will fail due
+		 * to loss of connectivity to the remoteport, and the
+		 * reconnect will not be scheduled as there is no
+		 * connectivity.
+		 * If awaiting the reconnect, terminate it as it'll only
+		 * fail.
+		 */
+		cancel_delayed_work_sync(&ctrl->connect_work);
+		break;
+
+	case NVME_CTRL_RESETTING:
+		/*
+		 * Controller is already in the process of terminating the
+		 * association. No need to do anything further. The reconnect
+		 * step will kick in naturally after the association is
+		 * terminated, detecting the lack of connectivity, and not
+		 * attempt a reconnect or schedule one.
+		 */
+		break;
+
+	case NVME_CTRL_DELETING:
+	default:
+		/* no action to take - let it delete */
+		break;
+	}
+}
+
 /**
  * nvme_fc_unregister_remoteport - transport entry point called by an
  *                              LLDD to deregister/remove a previously
@@ -598,15 +771,20 @@ nvme_fc_unregister_remoteport(struct nvme_fc_remote_port *portptr)
 	}
 	portptr->port_state = FC_OBJSTATE_DELETED;
 
-	/* tear down all associations to the remote port */
 	list_for_each_entry(ctrl, &rport->ctrl_list, ctrl_list)
-		__nvme_fc_del_ctrl(ctrl);
+		nvmet_fc_start_dev_loss_tmo(ctrl, portptr->dev_loss_tmo);
 
 	spin_unlock_irqrestore(&rport->lock, flags);
 
 	nvme_fc_abort_lsops(rport);
 
+	/*
+	 * Release the reference. This allows the rport to be torn down
+	 * once all controllers go away, which should only occur after
+	 * dev_loss_tmo expires.
+	 */
 	nvme_fc_rport_put(rport);
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(nvme_fc_unregister_remoteport);
@@ -2471,6 +2649,8 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
 		nvme_queue_async_events(&ctrl->ctrl);
 	}
 
+	cancel_delayed_work_sync(&ctrl->dev_loss_work);
+
 	return 0;	/* Success */
 
 out_term_aen_ops:
@@ -2589,6 +2769,7 @@ nvme_fc_delete_ctrl_work(struct work_struct *work)
 
 	cancel_work_sync(&ctrl->reset_work);
 	cancel_delayed_work_sync(&ctrl->connect_work);
+	cancel_delayed_work_sync(&ctrl->dev_loss_work);
 
 	/*
 	 * kill the association on the link side.  this will block
@@ -2700,6 +2881,9 @@ nvme_fc_reset_ctrl_work(struct work_struct *work)
 		return;
 	}
 
+	queue_delayed_work(nvme_fc_wq, &ctrl->dev_loss_work,
+			ctrl->dev_loss_tmo * HZ);
+
 	if (nvme_fc_rport_is_online(ctrl->rport)) {
 		ret = nvme_fc_create_association(ctrl);
 		if (ret)
@@ -2767,6 +2951,25 @@ nvme_fc_connect_ctrl_work(struct work_struct *work)
 			ctrl->cnum);
 }
 
+static void
+nvme_fc_dev_loss_ctrl_work(struct work_struct *work)
+{
+	struct nvme_fc_ctrl *ctrl =
+			container_of(to_delayed_work(work),
+				struct nvme_fc_ctrl, dev_loss_work);
+
+	if (ctrl->ctrl.state != NVME_CTRL_DELETING) {
+		dev_warn(ctrl->ctrl.device,
+			"NVME-FC{%d}: Device failed to reconnect within "
+			"dev_loss_tmo (%d seconds). Deleting controller\n",
+			ctrl->cnum, ctrl->dev_loss_tmo);
+		if (__nvme_fc_del_ctrl(ctrl))
+			dev_warn(ctrl->ctrl.device,
+				"NVME-FC{%d}: delete request failed\n",
+				ctrl->cnum);
+	}
+}
+
 
 static const struct blk_mq_ops nvme_fc_admin_mq_ops = {
 	.queue_rq	= nvme_fc_queue_rq,
@@ -2936,6 +3139,7 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 	INIT_WORK(&ctrl->delete_work, nvme_fc_delete_ctrl_work);
 	INIT_WORK(&ctrl->reset_work, nvme_fc_reset_ctrl_work);
 	INIT_DELAYED_WORK(&ctrl->connect_work, nvme_fc_connect_ctrl_work);
+	INIT_DELAYED_WORK(&ctrl->dev_loss_work, nvme_fc_dev_loss_ctrl_work);
 	spin_lock_init(&ctrl->lock);
 
 	/* io queue count */
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 0/8] nvme_fc: add dev_loss_tmo support
  2017-05-13 19:07 [PATCH 0/8] nvme_fc: add dev_loss_tmo support James Smart
                   ` (7 preceding siblings ...)
  2017-05-13 19:07 ` [PATCH 8/8] nvme_fc: add dev_loss_tmo timeout and remoteport resume support James Smart
@ 2017-05-22 23:57 ` James Smart
  2017-05-23  7:20 ` Christoph Hellwig
  9 siblings, 0 replies; 12+ messages in thread
From: James Smart @ 2017-05-22 23:57 UTC (permalink / raw)


On 5/13/2017 12:07 PM, James Smart wrote:
> FC, on the SCSI side, has long had a device loss timeout which governed
> how long it would hide connectivity loss to remote targets. The timeout
> value is maintained in the SCSI FC transport and admins are used to
> going there to maintain it.
>
> Eventually, the SCSI FC transport will be moved into something
> independent from and above SCSI so that SCSI and NVME protocols can
> be peers. That's fairly distant. In the meantime, to add the
> functionality now, and sync with the SCSI FC transport, the LLDD will
> be used as the conduit. The initial value for the timeout can be set
> by the LLDD when it creates the remoteport via
> nvme_fc_register_remoteport(). Later, if the value is updated via the
> SCSI transport, the LLDD can call a new nvme_fc routine to update
> the remoteport's dev_loss_tmo value, which can then be propagated to
> the controllers on the remoteport.
>
> The fabrics implementation already has a similar mechanism on rdma:
> ctrl_loss_tmo, which may be set on a per-controller basis (finer
> granularity than the FC port used for the connection). The nvme_fc
> transport will therefore mediate and choose the lesser of the
> controller's value and the remoteport's value.
>
> The patches were cut on nvme-4.12 after applying the patches in
> http://lists.infradead.org/pipermail/linux-nvme/2017-May/010156.html
> and
> http://lists.infradead.org/pipermail/linux-nvme/2017-May/010166.html
>
> James Smart (8):
>   nvme core: allow controller RESETTING to RECONNECTING transition
>   nvme_fc: change ctlr state assignments during reset/reconnect
>   nvme_fc: add a dev_loss_tmo field to the remoteport
>   nvme_fc: add dev_loss_tmo to controller
>   nvme_fc: check connectivity before initiating reconnects
>   nvme_fc: change failure code on remoteport connectivity loss
>   nvme_fc: move remote port get/put/free location
>   nvme_fc: add dev_loss_tmo timeout and remoteport resume support
>
>  drivers/nvme/host/core.c       |   1 +
>  drivers/nvme/host/fc.c         | 465 +++++++++++++++++++++++++++++++++++------
>  include/linux/nvme-fc-driver.h |  11 +-
>  3 files changed, 411 insertions(+), 66 deletions(-)
>


Any review or comments on this series ?

-- james

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 0/8] nvme_fc: add dev_loss_tmo support
  2017-05-13 19:07 [PATCH 0/8] nvme_fc: add dev_loss_tmo support James Smart
                   ` (8 preceding siblings ...)
  2017-05-22 23:57 ` [PATCH 0/8] nvme_fc: add dev_loss_tmo support James Smart
@ 2017-05-23  7:20 ` Christoph Hellwig
  2017-05-23 17:01   ` James Smart
  9 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2017-05-23  7:20 UTC (permalink / raw)


On Sat, May 13, 2017 at 12:07:14PM -0700, James Smart wrote:
> The fabrics implementation already has a similar mechanism on rdma:
> ctrl_loss_tmo, which may be set on a per-controller basis (finer
> granularity than the FC port used for the connection). The nvme_fc
> transport will therefore mediate and choose the lesser of the
> controller's value and the remoteport's value.

I would much prefer if nvme-fc could stick to the same controller
concept as rdma.  Especially as it needs to be synchronized with
the reconnect delay and the keep alive timeout (and we need to
do a better job of synchronizing the latter to start with, I think).

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 0/8] nvme_fc: add dev_loss_tmo support
  2017-05-23  7:20 ` Christoph Hellwig
@ 2017-05-23 17:01   ` James Smart
  0 siblings, 0 replies; 12+ messages in thread
From: James Smart @ 2017-05-23 17:01 UTC (permalink / raw)


On 5/23/2017 12:20 AM, Christoph Hellwig wrote:
> On Sat, May 13, 2017 at 12:07:14PM -0700, James Smart wrote:
>> The fabrics implementation already has a similar mechanism on rdma:
>> ctrl_loss_tmo, which may be set on a per-controller basis (finer
>> granularity than the FC port used for the connection). The nvme_fc
>> transport will therefore mediate and choose the lesser of the
>> controller's value and the remoteport's value.
>
> I would much prefer if nvme-fc could stick to the same controller
> concept as rdma.  Especially as it needs to be synchronized with
> the reconnect delay and the keep alive timeout (and we need to
> do a better job on synchronize the latter to start with I think).

Not sure which controller concept you are thinking it isn't staying in 
sync with. ctrl_loss_tmo is continued, but it is augmented by having to 
deal with a node-level timeout that exists on FC, which rdma doesn't 
have. reconnect_delay is still used, but disabled if there's no 
connectivity as there's no point in re-trying a connect if there's no 
connectivity.  The only other semantic nvme-fc does differently (and it's
somewhat a different topic) is: on controller resets, don't tear down if
the 1st reconnect attempt fails, at least wait the duration of 
ctrl_loss_tmo and use reconnect_delay for other attempts if connected.

If the main issue is: you don't want to have a 2nd timeout value merged 
with the connect-request's ctrl_loss_tmo value, then we need to work out 
a solution. I don't want to make admins learn a new way to set per-node 
timeout values, which should apply to SCSI and NVME equally, and can be 
dynamic. Right now, they come from the SCSI fc transport, and there's a 
lot of admin and infrastructure around management from that area. I 
don't believe you can just ignore it.  So the simple choice, which was 
proposed, was to simply merge them in the transport. The result is still 
ctrl_loss_tmo with reconnect delay, but nvme-fc: a) changes the value
from what the connect request specified if the node-level value is
smaller; and b) the value can change dynamically if the node-level
value changes.

One thing I can propose, if we're using uevents to trigger connect
requests, is to have the uevents specify the ctrl_loss_tmo values to use 
for the connect request, which would be based on the node-level devloss 
value. This would keep all the timeout values coming in via the connect 
request.  I dislike pushing too much information like this through udev 
to a cli and back, but it would work. It still won't deal with dynamic 
updates, so some thoughts are needed to address that aspect.

Thoughts ?

-- james

^ permalink raw reply	[flat|nested] 12+ messages in thread
