linux-rdma.vger.kernel.org archive mirror
* [PATCH 0/8] IB SRP initiator patches for kernel 3.12
From: Bart Van Assche @ 2013-08-20 12:41 UTC (permalink / raw)
  To: Roland Dreier
  Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma, linux-scsi

The purpose of this InfiniBand SRP initiator patch series is as follows:
- Make the SRP initiator driver better suited for use in an H.A. setup.
   Add fast_io_fail_tmo, dev_loss_tmo and reconnect_delay parameters
   (their mutual constraints are sketched below). These can be used
   either to speed up failover or to avoid device removal when using,
   e.g., initiator-side mirroring.
- Make the SRP initiator better suited for use on NUMA systems by
   making the HCA completion vector configurable.
- Improve performance by making the queue size configurable.
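
For reference, the mutual constraints on the three new tunables, as
enforced by srp_tmo_valid() in patch 2 of this series (a negative value
means "off"); this is only a sketch, the patch itself is authoritative:

	if (reconnect_delay < 0 && fast_io_fail_tmo < 0 && dev_loss_tmo < 0)
		return -EINVAL;	/* all three mechanisms disabled */
	if (reconnect_delay == 0)
		return -EINVAL;	/* a reconnect delay of zero is rejected */
	if (fast_io_fail_tmo > SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
		return -EINVAL;	/* would block SCSI commands for too long */
	if (fast_io_fail_tmo >= 0 && dev_loss_tmo >= 0 &&
	    fast_io_fail_tmo >= dev_loss_tmo)
		return -EINVAL;	/* fast_io_fail must fire before dev_loss */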

Changes since the previous patch series are:
- Rewrote srp_tmo_valid() to improve readability (requested by Dave
   Dillow).
- The combination (reconnect_delay < 0 && fast_io_fail_tmo < 0 &&
   dev_loss_tmo < 0) is now rejected as requested by Dave Dillow.
- Fixed a race between transport layer failure handling and device
   removal. This issue was reported by Vu Pham.

The previous patch series can be found here:
http://thread.gmane.org/gmane.linux.drivers.rdma/16389

The individual patches in this series are:
0001-IB-srp-Keep-rport-as-long-as-the-IB-transport-layer.patch
0002-scsi_transport_srp-Add-transport-layer-error-handlin.patch
0003-IB-srp-Add-srp_terminate_io.patch
0004-IB-srp-Use-SRP-transport-layer-error-recovery.patch
0005-IB-srp-Start-timers-if-a-transport-layer-error-occur.patch
0006-IB-srp-Make-transport-layer-retry-count-configurable.patch
0007-IB-srp-Introduce-srp_alloc_req_data.patch
0008-IB-srp-Make-queue-size-configurable.patch

* [PATCH 1/8] IB/srp: Keep rport as long as the IB transport layer
From: Bart Van Assche @ 2013-08-20 12:43 UTC (permalink / raw)
  To: Roland Dreier
  Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma, linux-scsi,
	James Bottomley

Keep the rport data structure around after srp_remove_host() has
finished, until cleanup of the IB transport layer has completed.
This is necessary because later patches use the rport pointer inside
the queuecommand callback. Without this patch, accessing the rport
from inside a queuecommand callback is racy because srp_remove_host()
must be invoked before scsi_remove_host() and because the queuecommand
callback may be invoked after srp_remove_host() has finished. In other
words, without this patch the queuecommand callback may be invoked
after the rport has been removed.
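
To illustrate, a minimal sketch (not part of the patch) of how a
queuecommand implementation can rely on this guarantee; host_to_target()
and target->rport are the names used in this series, example_queuecommand
is a placeholder:

	static int example_queuecommand(struct Scsi_Host *shost,
					struct scsi_cmnd *scmnd)
	{
		struct srp_target_port *target = host_to_target(shost);
		/*
		 * Safe: the reference taken in srp_remove_target() keeps
		 * the rport alive until IB transport layer cleanup has
		 * finished, even if this callback races with
		 * srp_remove_host().
		 */
		struct srp_rport *rport = target->rport;

		/* ... build and post the SRP command, or fail it ... */
		return 0;
	}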

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: James Bottomley <JBottomley-MU7nAjRaF3makBO8gow8eQ@public.gmane.org>
Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |    3 +++
 drivers/infiniband/ulp/srp/ib_srp.h |    1 +
 drivers/scsi/scsi_transport_srp.c   |   18 ++++++++++++++++++
 include/scsi/scsi_transport_srp.h   |    2 ++
 4 files changed, 24 insertions(+)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index f93baf8..de49088 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -528,11 +528,13 @@ static void srp_remove_target(struct srp_target_port *target)
 	WARN_ON_ONCE(target->state != SRP_TARGET_REMOVED);
 
 	srp_del_scsi_host_attr(target->scsi_host);
+	srp_rport_get(target->rport);
 	srp_remove_host(target->scsi_host);
 	scsi_remove_host(target->scsi_host);
 	srp_disconnect_target(target);
 	ib_destroy_cm_id(target->cm_id);
 	srp_free_target_ib(target);
+	srp_rport_put(target->rport);
 	srp_free_req_data(target);
 	scsi_host_put(target->scsi_host);
 }
@@ -1994,6 +1996,7 @@ static int srp_add_target(struct srp_host *host, struct srp_target_port *target)
 	}
 
 	rport->lld_data = target;
+	target->rport = rport;
 
 	spin_lock(&host->target_lock);
 	list_add_tail(&target->list, &host->target_list);
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index e641088..02392f5 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -153,6 +153,7 @@ struct srp_target_port {
 	u16			io_class;
 	struct srp_host	       *srp_host;
 	struct Scsi_Host       *scsi_host;
+	struct srp_rport       *rport;
 	char			target_name[32];
 	unsigned int		scsi_id;
 	unsigned int		sg_tablesize;
diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
index f379c7f..f7ba94a 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -185,6 +185,24 @@ static int srp_host_match(struct attribute_container *cont, struct device *dev)
 }
 
 /**
+ * srp_rport_get() - increment rport reference count
+ */
+void srp_rport_get(struct srp_rport *rport)
+{
+	get_device(&rport->dev);
+}
+EXPORT_SYMBOL(srp_rport_get);
+
+/**
+ * srp_rport_put() - decrement rport reference count
+ */
+void srp_rport_put(struct srp_rport *rport)
+{
+	put_device(&rport->dev);
+}
+EXPORT_SYMBOL(srp_rport_put);
+
+/**
  * srp_rport_add - add a SRP remote port to the device hierarchy
  * @shost:	scsi host the remote port is connected to.
  * @ids:	The port id for the remote port.
diff --git a/include/scsi/scsi_transport_srp.h b/include/scsi/scsi_transport_srp.h
index ff0f04a..5a2d2d1 100644
--- a/include/scsi/scsi_transport_srp.h
+++ b/include/scsi/scsi_transport_srp.h
@@ -38,6 +38,8 @@ extern struct scsi_transport_template *
 srp_attach_transport(struct srp_function_template *);
 extern void srp_release_transport(struct scsi_transport_template *);
 
+extern void srp_rport_get(struct srp_rport *rport);
+extern void srp_rport_put(struct srp_rport *rport);
 extern struct srp_rport *srp_rport_add(struct Scsi_Host *,
 				       struct srp_rport_identifiers *);
 extern void srp_rport_del(struct srp_rport *);
-- 
1.7.10.4


* [PATCH 2/8] scsi_transport_srp: Add transport layer error handling
From: Bart Van Assche @ 2013-08-20 12:44 UTC (permalink / raw)
  To: Roland Dreier
  Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma, linux-scsi,
	James Bottomley

Add the necessary functions to the SRP transport module to allow
an SRP initiator driver to implement transport layer error handling
similar to the functionality already provided by the FC transport
layer (a usage sketch follows the list below). This includes:
- Support for implementing fast_io_fail_tmo, the time that should
  elapse after having detected a transport layer problem and
  before failing I/O.
- Support for implementing dev_loss_tmo, the time that should
  elapse after having detected a transport layer problem and
  before removing a remote port.
- Support for periodically trying to reconnect to an SRP target
  after connection to a target has been lost.
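
A sketch of how an initiator driver opts into this functionality
(patch 4 wires this up for ib_srp; the example_* names below are
placeholders):

	static int example_reconnect(struct srp_rport *rport)
	{
		/* Re-establish the transport connection; 0 means success. */
		return 0;
	}

	static void example_terminate_io(struct srp_rport *rport)
	{
		/* Finish all outstanding I/O, e.g. with DID_TRANSPORT_FAILFAST. */
	}

	static struct srp_function_template example_srp_functions = {
		.has_rport_state	= true,
		.reconnect		= example_reconnect,
		.terminate_rport_io	= example_terminate_io,
		/* Optionally point reconnect_delay / fast_io_fail_tmo /
		 * dev_loss_tmo at driver-wide defaults; see srp_rport_add(). */
	};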

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: James Bottomley <JBottomley-MU7nAjRaF3makBO8gow8eQ@public.gmane.org>
Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 Documentation/ABI/stable/sysfs-transport-srp |   39 ++
 drivers/scsi/scsi_transport_srp.c            |  504 +++++++++++++++++++++++++-
 include/scsi/scsi_transport_srp.h            |   65 +++-
 3 files changed, 605 insertions(+), 3 deletions(-)

diff --git a/Documentation/ABI/stable/sysfs-transport-srp b/Documentation/ABI/stable/sysfs-transport-srp
index b36fb0d..21bd480 100644
--- a/Documentation/ABI/stable/sysfs-transport-srp
+++ b/Documentation/ABI/stable/sysfs-transport-srp
@@ -5,6 +5,24 @@ Contact:	linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
 Description:	Instructs an SRP initiator to disconnect from a target and to
 		remove all LUNs imported from that target.
 
+What:		/sys/class/srp_remote_ports/port-<h>:<n>/dev_loss_tmo
+Date:		December 1, 2013
+KernelVersion:	3.12
+Contact:	linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
+Description:	Number of seconds the SCSI layer will wait after a transport
+		layer error has been observed before removing a target port.
+		Zero means immediate removal. Setting this attribute to "off"
+		will disable the dev_loss timer.
+
+What:		/sys/class/srp_remote_ports/port-<h>:<n>/fast_io_fail_tmo
+Date:		December 1, 2013
+KernelVersion:	3.12
+Contact:	linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
+Description:	Number of seconds the SCSI layer will wait after a transport
+		layer error has been observed before failing I/O. Zero means
+		failing I/O immediately. Setting this attribute to "off" will
+		disable the fast_io_fail timer.
+
 What:		/sys/class/srp_remote_ports/port-<h>:<n>/port_id
 Date:		June 27, 2007
 KernelVersion:	2.6.24
@@ -12,8 +30,29 @@ Contact:	linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
 Description:	16-byte local SRP port identifier in hexadecimal format. An
 		example: 4c:49:4e:55:58:20:56:49:4f:00:00:00:00:00:00:00.
 
+What:		/sys/class/srp_remote_ports/port-<h>:<n>/reconnect_delay
+Date:		December 1, 2013
+KernelVersion:	3.12
+Contact:	linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
+Description:	Number of seconds the SCSI layer will wait after a reconnect
+		attempt failed before retrying. Setting this attribute to
+		"off" will disable time-based reconnecting.
+
 What:		/sys/class/srp_remote_ports/port-<h>:<n>/roles
 Date:		June 27, 2007
 KernelVersion:	2.6.24
 Contact:	linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
 Description:	Role of the remote port. Either "SRP Initiator" or "SRP Target".
+
+What:		/sys/class/srp_remote_ports/port-<h>:<n>/state
+Date:		December 1, 2013
+KernelVersion:	3.12
+Contact:	linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
+Description:	State of the transport layer used for communication with the
+		remote port. "running" if the transport layer is operational;
+		"blocked" if a transport layer error has been encountered but
+		the fast_io_fail_tmo timer has not yet fired; "fail-fast"
+		after the fast_io_fail_tmo timer has fired and before the
+		"dev_loss_tmo" timer has fired; "lost" after the
+		"dev_loss_tmo" timer has fired and before the port is finally
+		removed.
diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
index f7ba94a..ff1baa8 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -24,12 +24,15 @@
 #include <linux/err.h>
 #include <linux/slab.h>
 #include <linux/string.h>
+#include <linux/delay.h>
 
 #include <scsi/scsi.h>
+#include <scsi/scsi_cmnd.h>
 #include <scsi/scsi_device.h>
 #include <scsi/scsi_host.h>
 #include <scsi/scsi_transport.h>
 #include <scsi/scsi_transport_srp.h>
+#include "scsi_priv.h"
 #include "scsi_transport_srp_internal.h"
 
 struct srp_host_attrs {
@@ -38,7 +41,7 @@ struct srp_host_attrs {
 #define to_srp_host_attrs(host)	((struct srp_host_attrs *)(host)->shost_data)
 
 #define SRP_HOST_ATTRS 0
-#define SRP_RPORT_ATTRS 3
+#define SRP_RPORT_ATTRS 8
 
 struct srp_internal {
 	struct scsi_transport_template t;
@@ -54,6 +57,36 @@ struct srp_internal {
 
 #define	dev_to_rport(d)	container_of(d, struct srp_rport, dev)
 #define transport_class_to_srp_rport(dev) dev_to_rport((dev)->parent)
+static inline struct Scsi_Host *rport_to_shost(struct srp_rport *r)
+{
+	return dev_to_shost(r->dev.parent);
+}
+
+/**
+ * srp_tmo_valid() - check timeout combination validity
+ *
+ * The combination of the three timeout parameters must be such that SCSI
+ * commands are finished in a reasonable time. Hence do not allow the fast I/O
+ * fail timeout to exceed SCSI_DEVICE_BLOCK_MAX_TIMEOUT. Furthermore, these
+ * parameters must be such that multipath can detect failed paths timely. Hence
+ * do not allow all three parameters to be disabled simultaneously.
+ */
+int srp_tmo_valid(int reconnect_delay, int fast_io_fail_tmo, int dev_loss_tmo)
+{
+	if (reconnect_delay < 0 && fast_io_fail_tmo < 0 && dev_loss_tmo < 0)
+		return -EINVAL;
+	if (reconnect_delay == 0)
+		return -EINVAL;
+	if (fast_io_fail_tmo > SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
+		return -EINVAL;
+	if (dev_loss_tmo >= LONG_MAX / HZ)
+		return -EINVAL;
+	if (fast_io_fail_tmo >= 0 && dev_loss_tmo >= 0 &&
+	    fast_io_fail_tmo >= dev_loss_tmo)
+		return -EINVAL;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(srp_tmo_valid);
 
 static int srp_host_setup(struct transport_container *tc, struct device *dev,
 			  struct device *cdev)
@@ -134,10 +167,447 @@ static ssize_t store_srp_rport_delete(struct device *dev,
 
 static DEVICE_ATTR(delete, S_IWUSR, NULL, store_srp_rport_delete);
 
+static ssize_t show_srp_rport_state(struct device *dev,
+				    struct device_attribute *attr,
+				    char *buf)
+{
+	static const char *const state_name[] = {
+		[SRP_RPORT_RUNNING]	= "running",
+		[SRP_RPORT_BLOCKED]	= "blocked",
+		[SRP_RPORT_FAIL_FAST]	= "fail-fast",
+		[SRP_RPORT_LOST]	= "lost",
+	};
+	struct srp_rport *rport = transport_class_to_srp_rport(dev);
+	enum srp_rport_state state = rport->state;
+
+	return sprintf(buf, "%s\n",
+		       (unsigned)state < ARRAY_SIZE(state_name) ?
+		       state_name[state] : "???");
+}
+
+static DEVICE_ATTR(state, S_IRUGO, show_srp_rport_state, NULL);
+
+static ssize_t show_reconnect_delay(struct device *dev,
+				    struct device_attribute *attr, char *buf)
+{
+	struct srp_rport *rport = transport_class_to_srp_rport(dev);
+
+	if (rport->reconnect_delay >= 0)
+		return sprintf(buf, "%d\n", rport->reconnect_delay);
+	else
+		return sprintf(buf, "off\n");
+}
+
+static ssize_t store_reconnect_delay(struct device *dev,
+				     struct device_attribute *attr,
+				     const char *buf, const size_t count)
+{
+	struct srp_rport *rport = transport_class_to_srp_rport(dev);
+	int res, delay;
+
+	if (strncmp(buf, "off", 3) != 0) {
+		res = kstrtoint(buf, 0, &delay);
+		if (res)
+			goto out;
+	} else {
+		delay = -1;
+	}
+	res = srp_tmo_valid(delay, rport->fast_io_fail_tmo,
+			    rport->dev_loss_tmo);
+	if (res)
+		goto out;
+
+	if (rport->reconnect_delay <= 0 && delay > 0 &&
+	    rport->state != SRP_RPORT_RUNNING) {
+		queue_delayed_work(system_long_wq, &rport->reconnect_work,
+				   delay * HZ);
+	} else if (delay <= 0) {
+		cancel_delayed_work(&rport->reconnect_work);
+	}
+	rport->reconnect_delay = delay;
+	res = count;
+
+out:
+	return res;
+}
+
+static DEVICE_ATTR(reconnect_delay, S_IRUGO | S_IWUSR, show_reconnect_delay,
+		   store_reconnect_delay);
+
+static ssize_t show_failed_reconnects(struct device *dev,
+				      struct device_attribute *attr, char *buf)
+{
+	struct srp_rport *rport = transport_class_to_srp_rport(dev);
+
+	return sprintf(buf, "%d\n", rport->failed_reconnects);
+}
+
+static DEVICE_ATTR(failed_reconnects, S_IRUGO, show_failed_reconnects, NULL);
+
+static ssize_t show_srp_rport_fast_io_fail_tmo(struct device *dev,
+					       struct device_attribute *attr,
+					       char *buf)
+{
+	struct srp_rport *rport = transport_class_to_srp_rport(dev);
+
+	if (rport->fast_io_fail_tmo >= 0)
+		return sprintf(buf, "%d\n", rport->fast_io_fail_tmo);
+	else
+		return sprintf(buf, "off\n");
+}
+
+static ssize_t store_srp_rport_fast_io_fail_tmo(struct device *dev,
+						struct device_attribute *attr,
+						const char *buf, size_t count)
+{
+	struct srp_rport *rport = transport_class_to_srp_rport(dev);
+	int res;
+	int fast_io_fail_tmo;
+
+	if (strncmp(buf, "off", 3) != 0) {
+		res = kstrtoint(buf, 0, &fast_io_fail_tmo);
+		if (res)
+			goto out;
+	} else {
+		fast_io_fail_tmo = -1;
+	}
+	res = srp_tmo_valid(rport->reconnect_delay, fast_io_fail_tmo,
+			    rport->dev_loss_tmo);
+	if (res)
+		goto out;
+	rport->fast_io_fail_tmo = fast_io_fail_tmo;
+	res = count;
+
+out:
+	return res;
+}
+
+static DEVICE_ATTR(fast_io_fail_tmo, S_IRUGO | S_IWUSR,
+		   show_srp_rport_fast_io_fail_tmo,
+		   store_srp_rport_fast_io_fail_tmo);
+
+static ssize_t show_srp_rport_dev_loss_tmo(struct device *dev,
+					   struct device_attribute *attr,
+					   char *buf)
+{
+	struct srp_rport *rport = transport_class_to_srp_rport(dev);
+
+	if (rport->dev_loss_tmo >= 0)
+		return sprintf(buf, "%d\n", rport->dev_loss_tmo);
+	else
+		return sprintf(buf, "off\n");
+}
+
+static ssize_t store_srp_rport_dev_loss_tmo(struct device *dev,
+					    struct device_attribute *attr,
+					    const char *buf, size_t count)
+{
+	struct srp_rport *rport = transport_class_to_srp_rport(dev);
+	int res;
+	int dev_loss_tmo;
+
+	if (strncmp(buf, "off", 3) != 0) {
+		res = kstrtoint(buf, 0, &dev_loss_tmo);
+		if (res)
+			goto out;
+	} else {
+		dev_loss_tmo = -1;
+	}
+	res = srp_tmo_valid(rport->reconnect_delay, rport->fast_io_fail_tmo,
+			    dev_loss_tmo);
+	if (res)
+		goto out;
+	rport->dev_loss_tmo = dev_loss_tmo;
+	res = count;
+
+out:
+	return res;
+}
+
+static DEVICE_ATTR(dev_loss_tmo, S_IRUGO | S_IWUSR,
+		   show_srp_rport_dev_loss_tmo,
+		   store_srp_rport_dev_loss_tmo);
+
+static int srp_rport_set_state(struct srp_rport *rport,
+			       enum srp_rport_state new_state)
+{
+	enum srp_rport_state old_state = rport->state;
+
+	lockdep_assert_held(&rport->mutex);
+
+	switch (new_state) {
+	case SRP_RPORT_RUNNING:
+		switch (old_state) {
+		case SRP_RPORT_LOST:
+			goto invalid;
+		default:
+			break;
+		}
+		break;
+	case SRP_RPORT_BLOCKED:
+		switch (old_state) {
+		case SRP_RPORT_RUNNING:
+			break;
+		default:
+			goto invalid;
+		}
+		break;
+	case SRP_RPORT_FAIL_FAST:
+		switch (old_state) {
+		case SRP_RPORT_LOST:
+			goto invalid;
+		default:
+			break;
+		}
+		break;
+	case SRP_RPORT_LOST:
+		break;
+	}
+	rport->state = new_state;
+	return 0;
+
+invalid:
+	return -EINVAL;
+}
+
+/**
+ * scsi_request_fn_active() - number of kernel threads inside scsi_request_fn()
+ */
+static int scsi_request_fn_active(struct Scsi_Host *shost)
+{
+	struct scsi_device *sdev;
+	struct request_queue *q;
+	int request_fn_active = 0;
+
+	shost_for_each_device(sdev, shost) {
+		q = sdev->request_queue;
+
+		spin_lock_irq(q->queue_lock);
+		request_fn_active += q->request_fn_active;
+		spin_unlock_irq(q->queue_lock);
+	}
+
+	return request_fn_active;
+}
+
+/**
+ * srp_reconnect_rport() - reconnect to an SRP target port
+ *
+ * Blocks SCSI command queueing before invoking reconnect() such that
+ * queuecommand() won't be invoked concurrently with reconnect(). This is
+ * important since a reconnect() implementation may reallocate resources
+ * needed by queuecommand(). Please note that this function neither waits
+ * until outstanding requests have finished nor tries to abort these. It is
+ * the responsibility of the reconnect() function to finish outstanding
+ * commands before reconnecting to the target port.
+ */
+int srp_reconnect_rport(struct srp_rport *rport)
+{
+	struct Scsi_Host *shost = rport_to_shost(rport);
+	struct srp_internal *i = to_srp_internal(shost->transportt);
+	struct scsi_device *sdev;
+	int res;
+
+	pr_debug("SCSI host %s\n", dev_name(&shost->shost_gendev));
+
+	res = mutex_lock_interruptible(&rport->mutex);
+	if (res)
+		goto out;
+	scsi_target_block(&shost->shost_gendev);
+	while (scsi_request_fn_active(shost))
+		msleep(20);
+	res = i->f->reconnect(rport);
+	pr_debug("%s (state %d): transport.reconnect() returned %d\n",
+		 dev_name(&shost->shost_gendev), rport->state, res);
+	if (res == 0) {
+		mutex_unlock(&rport->mutex);
+
+		cancel_delayed_work(&rport->fast_io_fail_work);
+		cancel_delayed_work(&rport->dev_loss_work);
+
+		mutex_lock(&rport->mutex);
+		rport->failed_reconnects = 0;
+		srp_rport_set_state(rport, SRP_RPORT_RUNNING);
+		scsi_target_unblock(&shost->shost_gendev, SDEV_RUNNING);
+		/*
+		 * It can occur that after fast_io_fail_tmo expired and before
+		 * dev_loss_tmo expired that the SCSI error handler has
+		 * offlined one or more devices. scsi_target_unblock() doesn't
+		 * change the state of these devices into running, so do that
+		 * explicitly.
+		 */
+		spin_lock_irq(shost->host_lock);
+		__shost_for_each_device(sdev, shost)
+			if (sdev->sdev_state == SDEV_OFFLINE)
+				sdev->sdev_state = SDEV_RUNNING;
+		spin_unlock_irq(shost->host_lock);
+	} else if (rport->state == SRP_RPORT_RUNNING) {
+		srp_rport_set_state(rport, SRP_RPORT_FAIL_FAST);
+		scsi_target_unblock(&shost->shost_gendev,
+				    SDEV_TRANSPORT_OFFLINE);
+	}
+	mutex_unlock(&rport->mutex);
+
+out:
+	return res;
+}
+EXPORT_SYMBOL(srp_reconnect_rport);
+
+/**
+ * srp_reconnect_work() - reconnect and schedule a new attempt if necessary
+ */
+static void srp_reconnect_work(struct work_struct *work)
+{
+	struct srp_rport *rport = container_of(to_delayed_work(work),
+					struct srp_rport, reconnect_work);
+	struct Scsi_Host *shost = rport_to_shost(rport);
+	int delay, res;
+
+	res = srp_reconnect_rport(rport);
+	if (res != 0) {
+		shost_printk(KERN_ERR, shost,
+			     "reconnect attempt %d failed (%d)\n",
+			     ++rport->failed_reconnects, res);
+		delay = rport->reconnect_delay *
+			min(100, max(1, rport->failed_reconnects - 10));
+		if (delay > 0)
+			queue_delayed_work(system_long_wq,
+					   &rport->reconnect_work, delay * HZ);
+	}
+}
+
+static void __rport_fail_io_fast(struct srp_rport *rport)
+{
+	struct Scsi_Host *shost = rport_to_shost(rport);
+	struct srp_internal *i;
+
+	lockdep_assert_held(&rport->mutex);
+
+	if (srp_rport_set_state(rport, SRP_RPORT_FAIL_FAST))
+		return;
+	scsi_target_unblock(rport->dev.parent, SDEV_TRANSPORT_OFFLINE);
+
+	/* Involve the LLD if possible to terminate all I/O on the rport. */
+	i = to_srp_internal(shost->transportt);
+	if (i->f->terminate_rport_io)
+		i->f->terminate_rport_io(rport);
+}
+
+/**
+ * rport_fast_io_fail_timedout() - fast I/O failure timeout handler
+ */
+static void rport_fast_io_fail_timedout(struct work_struct *work)
+{
+	struct srp_rport *rport = container_of(to_delayed_work(work),
+					struct srp_rport, fast_io_fail_work);
+	struct Scsi_Host *shost = rport_to_shost(rport);
+
+	pr_debug("fast_io_fail_tmo expired for %s.\n",
+		 dev_name(&shost->shost_gendev));
+
+	mutex_lock(&rport->mutex);
+	__rport_fail_io_fast(rport);
+	mutex_unlock(&rport->mutex);
+}
+
+/**
+ * rport_dev_loss_timedout() - device loss timeout handler
+ */
+static void rport_dev_loss_timedout(struct work_struct *work)
+{
+	struct srp_rport *rport = container_of(to_delayed_work(work),
+					struct srp_rport, dev_loss_work);
+	struct Scsi_Host *shost = rport_to_shost(rport);
+	struct srp_internal *i = to_srp_internal(shost->transportt);
+
+	pr_err("dev_loss_tmo expired for %s.\n",
+	       dev_name(&shost->shost_gendev));
+
+	mutex_lock(&rport->mutex);
+	WARN_ON(srp_rport_set_state(rport, SRP_RPORT_LOST) != 0);
+	scsi_target_unblock(rport->dev.parent, SDEV_TRANSPORT_OFFLINE);
+	mutex_unlock(&rport->mutex);
+
+	i->f->rport_delete(rport);
+}
+
+/**
+ * srp_start_tl_fail_timers() - start the transport layer failure timers
+ *
+ * Start the transport layer fast I/O failure and device loss timers. Do not
+ * modify a timer that was already started.
+ */
+void srp_start_tl_fail_timers(struct srp_rport *rport)
+{
+	struct Scsi_Host *shost = rport_to_shost(rport);
+	int fast_io_fail_tmo, dev_loss_tmo, delay;
+
+	mutex_lock(&rport->mutex);
+	if (!rport->deleted) {
+		delay = rport->reconnect_delay;
+		fast_io_fail_tmo = rport->fast_io_fail_tmo;
+		dev_loss_tmo = rport->dev_loss_tmo;
+		pr_debug("%s current state: %d\n",
+			 dev_name(&shost->shost_gendev), rport->state);
+
+		if (delay > 0)
+			queue_delayed_work(system_long_wq,
+					   &rport->reconnect_work,
+					   1UL * delay * HZ);
+		if (fast_io_fail_tmo >= 0 &&
+		    srp_rport_set_state(rport, SRP_RPORT_BLOCKED) == 0) {
+			pr_debug("%s new state: %d\n",
+				 dev_name(&shost->shost_gendev),
+				 rport->state);
+			scsi_target_block(&shost->shost_gendev);
+			queue_delayed_work(system_long_wq,
+					   &rport->fast_io_fail_work,
+					   1UL * fast_io_fail_tmo * HZ);
+		}
+		if (dev_loss_tmo >= 0)
+			queue_delayed_work(system_long_wq,
+					   &rport->dev_loss_work,
+					   1UL * dev_loss_tmo * HZ);
+	} else {
+		pr_debug("%s has already been deleted\n",
+			 dev_name(&shost->shost_gendev));
+		srp_rport_set_state(rport, SRP_RPORT_FAIL_FAST);
+		scsi_target_unblock(&shost->shost_gendev,
+				    SDEV_TRANSPORT_OFFLINE);
+	}
+	mutex_unlock(&rport->mutex);
+}
+EXPORT_SYMBOL(srp_start_tl_fail_timers);
+
+/**
+ * srp_timed_out() - SRP transport intercept of the SCSI timeout EH
+ *
+ * If a timeout occurs while an rport is in the blocked state, ask the SCSI
+ * EH to continue waiting (BLK_EH_RESET_TIMER). Otherwise let the SCSI core
+ * handle the timeout (BLK_EH_NOT_HANDLED).
+ *
+ * Note: This function is called from soft-IRQ context and with the request
+ * queue lock held.
+ */
+static enum blk_eh_timer_return srp_timed_out(struct scsi_cmnd *scmd)
+{
+	struct scsi_device *sdev = scmd->device;
+	struct Scsi_Host *shost = sdev->host;
+	struct srp_internal *i = to_srp_internal(shost->transportt);
+
+	pr_debug("timeout for sdev %s\n", dev_name(&sdev->sdev_gendev));
+	return i->f->reset_timer_if_blocked && scsi_device_blocked(sdev) ?
+		BLK_EH_RESET_TIMER : BLK_EH_NOT_HANDLED;
+}
+
 static void srp_rport_release(struct device *dev)
 {
 	struct srp_rport *rport = dev_to_rport(dev);
 
+	cancel_delayed_work_sync(&rport->reconnect_work);
+	cancel_delayed_work_sync(&rport->fast_io_fail_work);
+	cancel_delayed_work_sync(&rport->dev_loss_work);
+
 	put_device(dev->parent);
 	kfree(rport);
 }
@@ -214,12 +684,15 @@ struct srp_rport *srp_rport_add(struct Scsi_Host *shost,
 {
 	struct srp_rport *rport;
 	struct device *parent = &shost->shost_gendev;
+	struct srp_internal *i = to_srp_internal(shost->transportt);
 	int id, ret;
 
 	rport = kzalloc(sizeof(*rport), GFP_KERNEL);
 	if (!rport)
 		return ERR_PTR(-ENOMEM);
 
+	mutex_init(&rport->mutex);
+
 	device_initialize(&rport->dev);
 
 	rport->dev.parent = get_device(parent);
@@ -228,6 +701,17 @@ struct srp_rport *srp_rport_add(struct Scsi_Host *shost,
 	memcpy(rport->port_id, ids->port_id, sizeof(rport->port_id));
 	rport->roles = ids->roles;
 
+	if (i->f->reconnect)
+		rport->reconnect_delay = i->f->reconnect_delay ?
+			*i->f->reconnect_delay : 10;
+	INIT_DELAYED_WORK(&rport->reconnect_work, srp_reconnect_work);
+	rport->fast_io_fail_tmo = i->f->fast_io_fail_tmo ?
+		*i->f->fast_io_fail_tmo : 15;
+	rport->dev_loss_tmo = i->f->dev_loss_tmo ? *i->f->dev_loss_tmo : 600;
+	INIT_DELAYED_WORK(&rport->fast_io_fail_work,
+			  rport_fast_io_fail_timedout);
+	INIT_DELAYED_WORK(&rport->dev_loss_work, rport_dev_loss_timedout);
+
 	id = atomic_inc_return(&to_srp_host_attrs(shost)->next_port_id);
 	dev_set_name(&rport->dev, "port-%d:%d", shost->host_no, id);
 
@@ -277,6 +761,13 @@ void srp_rport_del(struct srp_rport *rport)
 	transport_remove_device(dev);
 	device_del(dev);
 	transport_destroy_device(dev);
+
+	mutex_lock(&rport->mutex);
+	if (rport->state == SRP_RPORT_BLOCKED)
+		__rport_fail_io_fast(rport);
+	rport->deleted = true;
+	mutex_unlock(&rport->mutex);
+
 	put_device(dev);
 }
 EXPORT_SYMBOL_GPL(srp_rport_del);
@@ -328,6 +819,8 @@ srp_attach_transport(struct srp_function_template *ft)
 	if (!i)
 		return NULL;
 
+	i->t.eh_timed_out = srp_timed_out;
+
 	i->t.tsk_mgmt_response = srp_tsk_mgmt_response;
 	i->t.it_nexus_response = srp_it_nexus_response;
 
@@ -345,6 +838,15 @@ srp_attach_transport(struct srp_function_template *ft)
 	count = 0;
 	i->rport_attrs[count++] = &dev_attr_port_id;
 	i->rport_attrs[count++] = &dev_attr_roles;
+	if (ft->has_rport_state) {
+		i->rport_attrs[count++] = &dev_attr_state;
+		i->rport_attrs[count++] = &dev_attr_fast_io_fail_tmo;
+		i->rport_attrs[count++] = &dev_attr_dev_loss_tmo;
+	}
+	if (ft->reconnect) {
+		i->rport_attrs[count++] = &dev_attr_reconnect_delay;
+		i->rport_attrs[count++] = &dev_attr_failed_reconnects;
+	}
 	if (ft->rport_delete)
 		i->rport_attrs[count++] = &dev_attr_delete;
 	i->rport_attrs[count++] = NULL;
diff --git a/include/scsi/scsi_transport_srp.h b/include/scsi/scsi_transport_srp.h
index 5a2d2d1..4795dec 100644
--- a/include/scsi/scsi_transport_srp.h
+++ b/include/scsi/scsi_transport_srp.h
@@ -13,6 +13,26 @@ struct srp_rport_identifiers {
 	u8 roles;
 };
 
+/**
+ * enum srp_rport_state - SRP transport layer state
+ * @SRP_RPORT_RUNNING:   Transport layer operational.
+ * @SRP_RPORT_BLOCKED:   Transport layer not operational; fast I/O fail timer
+ *                       is running and I/O has been blocked.
+ * @SRP_RPORT_FAIL_FAST: Fast I/O fail timer has expired; fail I/O fast.
+ * @SRP_RPORT_LOST:      Device loss timer has expired; port is being removed.
+ */
+enum srp_rport_state {
+	SRP_RPORT_RUNNING,
+	SRP_RPORT_BLOCKED,
+	SRP_RPORT_FAIL_FAST,
+	SRP_RPORT_LOST,
+};
+
+/**
+ * struct srp_rport
+ * @mutex:   Protects against concurrent rport fast_io_fail / dev_loss_tmo /
+ *   reconnect activity.
+ */
 struct srp_rport {
 	/* for initiator and target drivers */
 
@@ -23,11 +43,29 @@ struct srp_rport {
 
 	/* for initiator drivers */
 
-	void *lld_data;	/* LLD private data */
+	void			*lld_data;	/* LLD private data */
+
+	struct mutex		mutex;
+	enum srp_rport_state	state;
+	bool			deleted;
+	int			reconnect_delay;
+	int			failed_reconnects;
+	struct delayed_work	reconnect_work;
+	int			fast_io_fail_tmo;
+	int			dev_loss_tmo;
+	struct delayed_work	fast_io_fail_work;
+	struct delayed_work	dev_loss_work;
 };
 
 struct srp_function_template {
 	/* for initiator drivers */
+	bool has_rport_state;
+	bool reset_timer_if_blocked;
+	int *reconnect_delay;
+	int *fast_io_fail_tmo;
+	int *dev_loss_tmo;
+	int (*reconnect)(struct srp_rport *rport);
+	void (*terminate_rport_io)(struct srp_rport *rport);
 	void (*rport_delete)(struct srp_rport *rport);
 	/* for target drivers */
 	int (* tsk_mgmt_response)(struct Scsi_Host *, u64, u64, int);
@@ -43,7 +81,30 @@ extern void srp_rport_put(struct srp_rport *rport);
 extern struct srp_rport *srp_rport_add(struct Scsi_Host *,
 				       struct srp_rport_identifiers *);
 extern void srp_rport_del(struct srp_rport *);
-
+extern int srp_tmo_valid(int reconnect_delay, int fast_io_fail_tmo,
+			 int dev_loss_tmo);
+extern int srp_reconnect_rport(struct srp_rport *rport);
+extern void srp_start_tl_fail_timers(struct srp_rport *rport);
 extern void srp_remove_host(struct Scsi_Host *);
 
+/**
+ * srp_chkready() - evaluate the transport layer state before I/O
+ *
+ * Returns a SCSI result code that can be returned by the LLD. The role of
+ * this function is similar to that of fc_remote_port_chkready().
+ */
+static inline int srp_chkready(struct srp_rport *rport)
+{
+	switch (rport->state) {
+	case SRP_RPORT_RUNNING:
+	case SRP_RPORT_BLOCKED:
+	default:
+		return 0;
+	case SRP_RPORT_FAIL_FAST:
+		return DID_TRANSPORT_FAILFAST << 16;
+	case SRP_RPORT_LOST:
+		return DID_NO_CONNECT << 16;
+	}
+}
+
 #endif
-- 
1.7.10.4


* [PATCH 3/8] IB/srp: Add srp_terminate_io()
From: Bart Van Assche @ 2013-08-20 12:45 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma

Finish all outstanding I/O requests after fast_io_fail_tmo has
expired, which speeds up failover in a multipath setup. This patch is
a reworked version of a patch from Sebastian Riemer.
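
For context, the resulting call chain once the fast I/O fail timer
fires (the upper part was added by patch 2; srp_terminate_io() below is
installed as the terminate_rport_io callback):

	rport_fast_io_fail_timedout()
	    __rport_fail_io_fast()
	        srp_rport_set_state(rport, SRP_RPORT_FAIL_FAST)
	        scsi_target_unblock(..., SDEV_TRANSPORT_OFFLINE)
	        i->f->terminate_rport_io(rport)	/* -> srp_terminate_io() */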

Reported-by: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Acked-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |   22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index de49088..37dd3fb 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -688,17 +688,29 @@ static void srp_free_req(struct srp_target_port *target,
 	spin_unlock_irqrestore(&target->lock, flags);
 }
 
-static void srp_reset_req(struct srp_target_port *target, struct srp_request *req)
+static void srp_finish_req(struct srp_target_port *target,
+			   struct srp_request *req, int result)
 {
 	struct scsi_cmnd *scmnd = srp_claim_req(target, req, NULL);
 
 	if (scmnd) {
 		srp_free_req(target, req, scmnd, 0);
-		scmnd->result = DID_RESET << 16;
+		scmnd->result = result;
 		scmnd->scsi_done(scmnd);
 	}
 }
 
+static void srp_terminate_io(struct srp_rport *rport)
+{
+	struct srp_target_port *target = rport->lld_data;
+	int i;
+
+	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
+		struct srp_request *req = &target->req_ring[i];
+		srp_finish_req(target, req, DID_TRANSPORT_FAILFAST << 16);
+	}
+}
+
 static int srp_reconnect_target(struct srp_target_port *target)
 {
 	struct Scsi_Host *shost = target->scsi_host;
@@ -725,8 +737,7 @@ static int srp_reconnect_target(struct srp_target_port *target)
 
 	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
 		struct srp_request *req = &target->req_ring[i];
-		if (req->scmnd)
-			srp_reset_req(target, req);
+		srp_finish_req(target, req, DID_RESET << 16);
 	}
 
 	INIT_LIST_HEAD(&target->free_tx);
@@ -1784,7 +1795,7 @@ static int srp_reset_device(struct scsi_cmnd *scmnd)
 	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
 		struct srp_request *req = &target->req_ring[i];
 		if (req->scmnd && req->scmnd->device == scmnd->device)
-			srp_reset_req(target, req);
+			srp_finish_req(target, req, DID_RESET << 16);
 	}
 
 	return SUCCESS;
@@ -2616,6 +2627,7 @@ static void srp_remove_one(struct ib_device *device)
 
 static struct srp_function_template ib_srp_transport_functions = {
 	.rport_delete		 = srp_rport_delete,
+	.terminate_rport_io	 = srp_terminate_io,
 };
 
 static int __init srp_init_module(void)
-- 
1.7.10.4


* [PATCH 4/8] IB/srp: Use SRP transport layer error recovery
From: Bart Van Assche @ 2013-08-20 12:46 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma

Enable reconnect_delay, fast_io_fail_tmo and dev_loss_tmo
functionality for the IB SRP initiator. Add kernel module
parameters that make it possible to specify default values for
these three parameters.
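
How the module parameter defaults propagate to a newly created rport
(this mirrors srp_rport_add() in patch 2, which dereferences the
template pointers installed below):

	if (i->f->reconnect)
		rport->reconnect_delay = i->f->reconnect_delay ?
			*i->f->reconnect_delay : 10;
	rport->fast_io_fail_tmo = i->f->fast_io_fail_tmo ?
		*i->f->fast_io_fail_tmo : 15;
	rport->dev_loss_tmo = i->f->dev_loss_tmo ? *i->f->dev_loss_tmo : 600;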

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Acked-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |  129 +++++++++++++++++++++++++----------
 drivers/infiniband/ulp/srp/ib_srp.h |    1 -
 2 files changed, 94 insertions(+), 36 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 37dd3fb..a7fa7ed 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -86,6 +86,32 @@ module_param(topspin_workarounds, int, 0444);
 MODULE_PARM_DESC(topspin_workarounds,
 		 "Enable workarounds for Topspin/Cisco SRP target bugs if != 0");
 
+static struct kernel_param_ops srp_tmo_ops;
+
+static int srp_reconnect_delay = 10;
+module_param_cb(reconnect_delay, &srp_tmo_ops, &srp_reconnect_delay,
+		S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(reconnect_delay, "Time between successive reconnect attempts");
+
+static int srp_fast_io_fail_tmo = 15;
+module_param_cb(fast_io_fail_tmo, &srp_tmo_ops, &srp_fast_io_fail_tmo,
+		S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(fast_io_fail_tmo,
+		 "Number of seconds between the observation of a transport"
+		 " layer error and failing all I/O. \"off\" means that this"
+		 " functionality is disabled.");
+
+static int srp_dev_loss_tmo = 600;
+module_param_cb(dev_loss_tmo, &srp_tmo_ops, &srp_dev_loss_tmo,
+		S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(dev_loss_tmo,
+		 "Maximum number of seconds that the SRP transport should"
+		 " insulate transport layer errors. After this time has been"
+		 " exceeded the SCSI target is removed. Should be"
+		 " between 1 and " __stringify(SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
+		 " if fast_io_fail_tmo has not been set. \"off\" means that"
+		 " this functionality is disabled.");
+
 static void srp_add_one(struct ib_device *device);
 static void srp_remove_one(struct ib_device *device);
 static void srp_recv_completion(struct ib_cq *cq, void *target_ptr);
@@ -102,6 +128,48 @@ static struct ib_client srp_client = {
 
 static struct ib_sa_client srp_sa_client;
 
+static int srp_tmo_get(char *buffer, const struct kernel_param *kp)
+{
+	int tmo = *(int *)kp->arg;
+
+	if (tmo >= 0)
+		return sprintf(buffer, "%d", tmo);
+	else
+		return sprintf(buffer, "off");
+}
+
+static int srp_tmo_set(const char *val, const struct kernel_param *kp)
+{
+	int tmo, res;
+
+	if (strncmp(val, "off", 3) != 0) {
+		res = kstrtoint(val, 0, &tmo);
+		if (res)
+			goto out;
+	} else {
+		tmo = -1;
+	}
+	if (kp->arg == &srp_reconnect_delay)
+		res = srp_tmo_valid(tmo, srp_fast_io_fail_tmo,
+				    srp_dev_loss_tmo);
+	else if (kp->arg == &srp_fast_io_fail_tmo)
+		res = srp_tmo_valid(srp_reconnect_delay, tmo, srp_dev_loss_tmo);
+	else
+		res = srp_tmo_valid(srp_reconnect_delay, srp_fast_io_fail_tmo,
+				    tmo);
+	if (res)
+		goto out;
+	*(int *)kp->arg = tmo;
+
+out:
+	return res;
+}
+
+static struct kernel_param_ops srp_tmo_ops = {
+	.get = srp_tmo_get,
+	.set = srp_tmo_set,
+};
+
 static inline struct srp_target_port *host_to_target(struct Scsi_Host *host)
 {
 	return (struct srp_target_port *) host->hostdata;
@@ -711,13 +779,20 @@ static void srp_terminate_io(struct srp_rport *rport)
 	}
 }
 
-static int srp_reconnect_target(struct srp_target_port *target)
+/*
+ * It is up to the caller to ensure that srp_rport_reconnect() calls are
+ * serialized and that no concurrent srp_queuecommand(), srp_abort(),
+ * srp_reset_device() or srp_reset_host() calls will occur while this function
+ * is in progress. One way to realize that is not to call this function
+ * directly but to call srp_reconnect_rport() instead since that last function
+ * serializes calls of this function via rport->mutex and also blocks
+ * srp_queuecommand() calls before invoking this function.
+ */
+static int srp_rport_reconnect(struct srp_rport *rport)
 {
-	struct Scsi_Host *shost = target->scsi_host;
+	struct srp_target_port *target = rport->lld_data;
 	int i, ret;
 
-	scsi_target_block(&shost->shost_gendev);
-
 	srp_disconnect_target(target);
 	/*
 	 * Now get a new local CM ID so that we avoid confusing the target in
@@ -747,28 +822,9 @@ static int srp_reconnect_target(struct srp_target_port *target)
 	if (ret == 0)
 		ret = srp_connect_target(target);
 
-	scsi_target_unblock(&shost->shost_gendev, ret == 0 ? SDEV_RUNNING :
-			    SDEV_TRANSPORT_OFFLINE);
-	target->transport_offline = !!ret;
-
-	if (ret)
-		goto err;
-
-	shost_printk(KERN_INFO, target->scsi_host, PFX "reconnect succeeded\n");
-
-	return ret;
-
-err:
-	shost_printk(KERN_ERR, target->scsi_host,
-		     PFX "reconnect failed (%d), removing target port.\n", ret);
-
-	/*
-	 * We couldn't reconnect, so kill our target port off.
-	 * However, we have to defer the real removal because we
-	 * are in the context of the SCSI error handler now, which
-	 * will deadlock if we call scsi_remove_host().
-	 */
-	srp_queue_remove_work(target);
+	if (ret == 0)
+		shost_printk(KERN_INFO, target->scsi_host,
+			     PFX "reconnect succeeded\n");
 
 	return ret;
 }
@@ -1367,10 +1423,11 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 	struct srp_cmd *cmd;
 	struct ib_device *dev;
 	unsigned long flags;
-	int len;
+	int len, result;
 
-	if (unlikely(target->transport_offline)) {
-		scmnd->result = DID_NO_CONNECT << 16;
+	result = srp_chkready(target->rport);
+	if (unlikely(result)) {
+		scmnd->result = result;
 		scmnd->scsi_done(scmnd);
 		return 0;
 	}
@@ -1768,7 +1825,7 @@ static int srp_abort(struct scsi_cmnd *scmnd)
 	if (srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
 			      SRP_TSK_ABORT_TASK) == 0)
 		ret = SUCCESS;
-	else if (target->transport_offline)
+	else if (target->rport->state == SRP_RPORT_LOST)
 		ret = FAST_IO_FAIL;
 	else
 		ret = FAILED;
@@ -1804,14 +1861,10 @@ static int srp_reset_device(struct scsi_cmnd *scmnd)
 static int srp_reset_host(struct scsi_cmnd *scmnd)
 {
 	struct srp_target_port *target = host_to_target(scmnd->device->host);
-	int ret = FAILED;
 
 	shost_printk(KERN_ERR, target->scsi_host, PFX "SRP reset_host called\n");
 
-	if (!srp_reconnect_target(target))
-		ret = SUCCESS;
-
-	return ret;
+	return srp_reconnect_rport(target->rport) == 0 ? SUCCESS : FAILED;
 }
 
 static int srp_slave_configure(struct scsi_device *sdev)
@@ -2626,6 +2679,12 @@ static void srp_remove_one(struct ib_device *device)
 }
 
 static struct srp_function_template ib_srp_transport_functions = {
+	.has_rport_state	 = true,
+	.reset_timer_if_blocked	 = true,
+	.reconnect_delay	 = &srp_reconnect_delay,
+	.fast_io_fail_tmo	 = &srp_fast_io_fail_tmo,
+	.dev_loss_tmo		 = &srp_dev_loss_tmo,
+	.reconnect		 = srp_rport_reconnect,
 	.rport_delete		 = srp_rport_delete,
 	.terminate_rport_io	 = srp_terminate_io,
 };
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index 02392f5..b62a943 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -140,7 +140,6 @@ struct srp_target_port {
 	unsigned int		cmd_sg_cnt;
 	unsigned int		indirect_size;
 	bool			allow_ext_sg;
-	bool			transport_offline;
 
 	/* Everything above this point is used in the hot path of
 	 * command processing. Try to keep them packed into cachelines.
-- 
1.7.10.4


* [PATCH 5/8] IB/srp: Start timers if a transport layer error occurs
From: Bart Van Assche @ 2013-08-20 12:46 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma

Start the reconnect, fast_io_fail and dev_loss timers if a transport
layer error occurs.
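
The paths that lead to the timers being started, per this patch:

	srp_handle_qp_err()                   /* send/receive completion error */
	srp_cm_handler(), IB_CM_DREQ_RECEIVED /* target-initiated disconnect  */
	    -> queue_work(system_long_wq, &target->tl_err_work)
	        -> srp_tl_err_work()
	            -> srp_start_tl_fail_timers(target->rport)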

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Acked-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |   19 +++++++++++++++++++
 drivers/infiniband/ulp/srp/ib_srp.h |    1 +
 2 files changed, 20 insertions(+)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index a7fa7ed..2b7ef6b 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -602,6 +602,7 @@ static void srp_remove_target(struct srp_target_port *target)
 	srp_disconnect_target(target);
 	ib_destroy_cm_id(target->cm_id);
 	srp_free_target_ib(target);
+	cancel_work_sync(&target->tl_err_work);
 	srp_rport_put(target->rport);
 	srp_free_req_data(target);
 	scsi_host_put(target->scsi_host);
@@ -1371,6 +1372,21 @@ static void srp_handle_recv(struct srp_target_port *target, struct ib_wc *wc)
 			     PFX "Recv failed with error code %d\n", res);
 }
 
+/**
+ * srp_tl_err_work() - handle a transport layer error
+ *
+ * Note: This function may get invoked before the rport has been created,
+ * hence the target->rport test.
+ */
+static void srp_tl_err_work(struct work_struct *work)
+{
+	struct srp_target_port *target;
+
+	target = container_of(work, struct srp_target_port, tl_err_work);
+	if (target->rport)
+		srp_start_tl_fail_timers(target->rport);
+}
+
 static void srp_handle_qp_err(enum ib_wc_status wc_status,
 			      enum ib_wc_opcode wc_opcode,
 			      struct srp_target_port *target)
@@ -1380,6 +1396,7 @@ static void srp_handle_qp_err(enum ib_wc_status wc_status,
 			     PFX "failed %s status %d\n",
 			     wc_opcode & IB_WC_RECV ? "receive" : "send",
 			     wc_status);
+		queue_work(system_long_wq, &target->tl_err_work);
 	}
 	target->qp_in_error = true;
 }
@@ -1742,6 +1759,7 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event)
 		if (ib_send_cm_drep(cm_id, NULL, 0))
 			shost_printk(KERN_ERR, target->scsi_host,
 				     PFX "Sending CM DREP failed\n");
+		queue_work(system_long_wq, &target->tl_err_work);
 		break;
 
 	case IB_CM_TIMEWAIT_EXIT:
@@ -2406,6 +2424,7 @@ static ssize_t srp_create_target(struct device *dev,
 			     sizeof (struct srp_indirect_buf) +
 			     target->cmd_sg_cnt * sizeof (struct srp_direct_buf);
 
+	INIT_WORK(&target->tl_err_work, srp_tl_err_work);
 	INIT_WORK(&target->remove_work, srp_remove_work);
 	spin_lock_init(&target->lock);
 	INIT_LIST_HEAD(&target->free_tx);
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index b62a943..cbc0b14 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -176,6 +176,7 @@ struct srp_target_port {
 	struct srp_iu	       *rx_ring[SRP_RQ_SIZE];
 	struct srp_request	req_ring[SRP_CMD_SQ_SIZE];
 
+	struct work_struct	tl_err_work;
 	struct work_struct	remove_work;
 
 	struct list_head	list;
-- 
1.7.10.4


* [PATCH 6/8] IB/srp: Make transport layer retry count configurable
From: Bart Van Assche @ 2013-08-20 12:47 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma

Allow the InfiniBand RC retry count to be configured by the user
as an option in the target login string (e.g. tl_retry_count=2).
Reducing this retry count shortens path failover time.

[bvanassche: Rewrote patch description / changed default retry count]
Signed-off-by: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Acked-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 Documentation/ABI/stable/sysfs-driver-ib_srp |    2 ++
 drivers/infiniband/ulp/srp/ib_srp.c          |   24 +++++++++++++++++++++++-
 drivers/infiniband/ulp/srp/ib_srp.h          |    1 +
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/stable/sysfs-driver-ib_srp b/Documentation/ABI/stable/sysfs-driver-ib_srp
index 5c53d28..18e9b27 100644
--- a/Documentation/ABI/stable/sysfs-driver-ib_srp
+++ b/Documentation/ABI/stable/sysfs-driver-ib_srp
@@ -61,6 +61,8 @@ Description:	Interface for making ib_srp connect to a new target.
 		  interrupt is handled by a different CPU then the comp_vector
 		  parameter can be used to spread the SRP completion workload
 		  over multiple CPU's.
+		* tl_retry_count, a number in the range 2..7 specifying the
+		  IB RC retry count.
 
 What:		/sys/class/infiniband_srp/srp-<hca>-<port_number>/ibdev
 Date:		January 2, 2006
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 2b7ef6b..de4c3b7 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -458,7 +458,7 @@ static int srp_send_req(struct srp_target_port *target)
 	req->param.responder_resources	      = 4;
 	req->param.remote_cm_response_timeout = 20;
 	req->param.local_cm_response_timeout  = 20;
-	req->param.retry_count 		      = 7;
+	req->param.retry_count                = target->tl_retry_count;
 	req->param.rnr_retry_count 	      = 7;
 	req->param.max_cm_retries 	      = 15;
 
@@ -1991,6 +1991,14 @@ static ssize_t show_comp_vector(struct device *dev,
 	return sprintf(buf, "%d\n", target->comp_vector);
 }
 
+static ssize_t show_tl_retry_count(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	struct srp_target_port *target = host_to_target(class_to_shost(dev));
+
+	return sprintf(buf, "%d\n", target->tl_retry_count);
+}
+
 static ssize_t show_cmd_sg_entries(struct device *dev,
 				   struct device_attribute *attr, char *buf)
 {
@@ -2018,6 +2026,7 @@ static DEVICE_ATTR(zero_req_lim,    S_IRUGO, show_zero_req_lim,	   NULL);
 static DEVICE_ATTR(local_ib_port,   S_IRUGO, show_local_ib_port,   NULL);
 static DEVICE_ATTR(local_ib_device, S_IRUGO, show_local_ib_device, NULL);
 static DEVICE_ATTR(comp_vector,     S_IRUGO, show_comp_vector,     NULL);
+static DEVICE_ATTR(tl_retry_count,  S_IRUGO, show_tl_retry_count,  NULL);
 static DEVICE_ATTR(cmd_sg_entries,  S_IRUGO, show_cmd_sg_entries,  NULL);
 static DEVICE_ATTR(allow_ext_sg,    S_IRUGO, show_allow_ext_sg,    NULL);
 
@@ -2033,6 +2042,7 @@ static struct device_attribute *srp_host_attrs[] = {
 	&dev_attr_local_ib_port,
 	&dev_attr_local_ib_device,
 	&dev_attr_comp_vector,
+	&dev_attr_tl_retry_count,
 	&dev_attr_cmd_sg_entries,
 	&dev_attr_allow_ext_sg,
 	NULL
@@ -2158,6 +2168,7 @@ enum {
 	SRP_OPT_ALLOW_EXT_SG	= 1 << 10,
 	SRP_OPT_SG_TABLESIZE	= 1 << 11,
 	SRP_OPT_COMP_VECTOR	= 1 << 12,
+	SRP_OPT_TL_RETRY_COUNT	= 1 << 13,
 	SRP_OPT_ALL		= (SRP_OPT_ID_EXT	|
 				   SRP_OPT_IOC_GUID	|
 				   SRP_OPT_DGID		|
@@ -2179,6 +2190,7 @@ static const match_table_t srp_opt_tokens = {
 	{ SRP_OPT_ALLOW_EXT_SG,		"allow_ext_sg=%u"	},
 	{ SRP_OPT_SG_TABLESIZE,		"sg_tablesize=%u"	},
 	{ SRP_OPT_COMP_VECTOR,		"comp_vector=%u"	},
+	{ SRP_OPT_TL_RETRY_COUNT,	"tl_retry_count=%u"	},
 	{ SRP_OPT_ERR,			NULL 			}
 };
 
@@ -2342,6 +2354,15 @@ static int srp_parse_options(const char *buf, struct srp_target_port *target)
 			target->comp_vector = token;
 			break;
 
+		case SRP_OPT_TL_RETRY_COUNT:
+			if (match_int(args, &token) || token < 2 || token > 7) {
+				pr_warn("bad tl_retry_count parameter '%s' (must be a number between 2 and 7)\n",
+					p);
+				goto out;
+			}
+			target->tl_retry_count = token;
+			break;
+
 		default:
 			pr_warn("unknown parameter or missing value '%s' in target creation request\n",
 				p);
@@ -2396,6 +2417,7 @@ static ssize_t srp_create_target(struct device *dev,
 	target->cmd_sg_cnt	= cmd_sg_entries;
 	target->sg_tablesize	= indirect_sg_entries ? : cmd_sg_entries;
 	target->allow_ext_sg	= allow_ext_sg;
+	target->tl_retry_count	= 7;
 
 	ret = srp_parse_options(buf, target);
 	if (ret)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index cbc0b14..446b045 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -157,6 +157,7 @@ struct srp_target_port {
 	unsigned int		scsi_id;
 	unsigned int		sg_tablesize;
 	int			comp_vector;
+	int			tl_retry_count;
 
 	struct ib_sa_path_rec	path;
 	__be16			orig_dgid[8];
-- 
1.7.10.4


* [PATCH 7/8] IB/srp: Introduce srp_alloc_req_data()
From: Bart Van Assche @ 2013-08-20 12:48 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma

Factor the request ring allocation out of srp_create_target() into a
new function, srp_alloc_req_data(). This patch does not change any
functionality.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |   64 ++++++++++++++++++++++-------------
 1 file changed, 40 insertions(+), 24 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index de4c3b7..ece1f2d 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -576,6 +576,42 @@ static void srp_free_req_data(struct srp_target_port *target)
 	}
 }
 
+static int srp_alloc_req_data(struct srp_target_port *target)
+{
+	struct srp_device *srp_dev = target->srp_host->srp_dev;
+	struct ib_device *ibdev = srp_dev->dev;
+	struct srp_request *req;
+	dma_addr_t dma_addr;
+	int i, ret = -ENOMEM;
+
+	INIT_LIST_HEAD(&target->free_reqs);
+
+	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
+		req = &target->req_ring[i];
+		req->fmr_list = kmalloc(target->cmd_sg_cnt * sizeof(void *),
+					GFP_KERNEL);
+		req->map_page = kmalloc(SRP_FMR_SIZE * sizeof(void *),
+					GFP_KERNEL);
+		req->indirect_desc = kmalloc(target->indirect_size, GFP_KERNEL);
+		if (!req->fmr_list || !req->map_page || !req->indirect_desc)
+			goto out;
+
+		dma_addr = ib_dma_map_single(ibdev, req->indirect_desc,
+					     target->indirect_size,
+					     DMA_TO_DEVICE);
+		if (ib_dma_mapping_error(ibdev, dma_addr))
+			goto out;
+
+		req->indirect_dma_addr = dma_addr;
+		req->index = i;
+		list_add_tail(&req->list, &target->free_reqs);
+	}
+	ret = 0;
+
+out:
+	return ret;
+}
+
 /**
  * srp_del_scsi_host_attr() - Remove attributes defined in the host template.
  * @shost: SCSI host whose attributes to remove from sysfs.
@@ -2393,8 +2429,7 @@ static ssize_t srp_create_target(struct device *dev,
 	struct Scsi_Host *target_host;
 	struct srp_target_port *target;
 	struct ib_device *ibdev = host->srp_dev->dev;
-	dma_addr_t dma_addr;
-	int i, ret;
+	int ret;
 
 	target_host = scsi_host_alloc(&srp_template,
 				      sizeof (struct srp_target_port));
@@ -2450,28 +2485,9 @@ static ssize_t srp_create_target(struct device *dev,
 	INIT_WORK(&target->remove_work, srp_remove_work);
 	spin_lock_init(&target->lock);
 	INIT_LIST_HEAD(&target->free_tx);
-	INIT_LIST_HEAD(&target->free_reqs);
-	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
-		struct srp_request *req = &target->req_ring[i];
-
-		req->fmr_list = kmalloc(target->cmd_sg_cnt * sizeof (void *),
-					GFP_KERNEL);
-		req->map_page = kmalloc(SRP_FMR_SIZE * sizeof (void *),
-					GFP_KERNEL);
-		req->indirect_desc = kmalloc(target->indirect_size, GFP_KERNEL);
-		if (!req->fmr_list || !req->map_page || !req->indirect_desc)
-			goto err_free_mem;
-
-		dma_addr = ib_dma_map_single(ibdev, req->indirect_desc,
-					     target->indirect_size,
-					     DMA_TO_DEVICE);
-		if (ib_dma_mapping_error(ibdev, dma_addr))
-			goto err_free_mem;
-
-		req->indirect_dma_addr = dma_addr;
-		req->index = i;
-		list_add_tail(&req->list, &target->free_reqs);
-	}
+	ret = srp_alloc_req_data(target);
+	if (ret)
+		goto err_free_mem;
 
 	ib_query_gid(ibdev, host->port, 0, &target->path.sgid);
 
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 8/8] IB/srp: Make queue size configurable
       [not found] ` <521363EA.8080906-HInyCGIudOg@public.gmane.org>
                     ` (6 preceding siblings ...)
  2013-08-20 12:48   ` [PATCH 7/8] IB/srp: Introduce srp_alloc_req_data() Bart Van Assche
@ 2013-08-20 12:50   ` Bart Van Assche
       [not found]     ` <52136609.3090406-HInyCGIudOg@public.gmane.org>
  2013-09-10  2:53   ` [PATCH 0/8] IB SRP initiator patches for kernel 3.12 David Dillow
  8 siblings, 1 reply; 31+ messages in thread
From: Bart Van Assche @ 2013-08-20 12:50 UTC (permalink / raw)
  To: Roland Dreier
  Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma,
	Konrad Grzybowski

Certain storage configurations, e.g. a sufficiently large array of
hard disks in a RAID configuration, need a queue depth above 64 to
achieve optimal performance. Hence make the queue depth configurable.
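
For example, a login that requests a queue depth of 128 could look as
follows (the GUIDs, P_Key and HCA name below are only placeholders):

  echo -n "id_ext=0002c90300000001,ioc_guid=0002c90300000001,dgid=fe800000000000000002c90300000001,pkey=ffff,service_id=0002c90300000001,can_queue=128" > /sys/class/infiniband_srp/srp-mlx4_0-1/add_target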

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
Cc: Konrad Grzybowski <konradg-XLFqaoR9dsI@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |  125 ++++++++++++++++++++++++++---------
 drivers/infiniband/ulp/srp/ib_srp.h |   17 +++--
 2 files changed, 103 insertions(+), 39 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index ece1f2d..6de2323 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -299,16 +299,16 @@ static int srp_create_target_ib(struct srp_target_port *target)
 		return -ENOMEM;
 
 	recv_cq = ib_create_cq(target->srp_host->srp_dev->dev,
-			       srp_recv_completion, NULL, target, SRP_RQ_SIZE,
-			       target->comp_vector);
+			       srp_recv_completion, NULL, target,
+			       target->queue_size, target->comp_vector);
 	if (IS_ERR(recv_cq)) {
 		ret = PTR_ERR(recv_cq);
 		goto err;
 	}
 
 	send_cq = ib_create_cq(target->srp_host->srp_dev->dev,
-			       srp_send_completion, NULL, target, SRP_SQ_SIZE,
-			       target->comp_vector);
+			       srp_send_completion, NULL, target,
+			       target->queue_size, target->comp_vector);
 	if (IS_ERR(send_cq)) {
 		ret = PTR_ERR(send_cq);
 		goto err_recv_cq;
@@ -317,8 +317,8 @@ static int srp_create_target_ib(struct srp_target_port *target)
 	ib_req_notify_cq(recv_cq, IB_CQ_NEXT_COMP);
 
 	init_attr->event_handler       = srp_qp_event;
-	init_attr->cap.max_send_wr     = SRP_SQ_SIZE;
-	init_attr->cap.max_recv_wr     = SRP_RQ_SIZE;
+	init_attr->cap.max_send_wr     = target->queue_size;
+	init_attr->cap.max_recv_wr     = target->queue_size;
 	init_attr->cap.max_recv_sge    = 1;
 	init_attr->cap.max_send_sge    = 1;
 	init_attr->sq_sig_type         = IB_SIGNAL_ALL_WR;
@@ -364,6 +364,10 @@ err:
 	return ret;
 }
 
+/*
+ * Note: this function may be called without srp_alloc_iu_bufs() having been
+ * invoked. Hence the target->[rt]x_ring checks.
+ */
 static void srp_free_target_ib(struct srp_target_port *target)
 {
 	int i;
@@ -375,10 +379,18 @@ static void srp_free_target_ib(struct srp_target_port *target)
 	target->qp = NULL;
 	target->send_cq = target->recv_cq = NULL;
 
-	for (i = 0; i < SRP_RQ_SIZE; ++i)
-		srp_free_iu(target->srp_host, target->rx_ring[i]);
-	for (i = 0; i < SRP_SQ_SIZE; ++i)
-		srp_free_iu(target->srp_host, target->tx_ring[i]);
+	if (target->rx_ring) {
+		for (i = 0; i < target->queue_size; ++i)
+			srp_free_iu(target->srp_host, target->rx_ring[i]);
+		kfree(target->rx_ring);
+		target->rx_ring = NULL;
+	}
+	if (target->tx_ring) {
+		for (i = 0; i < target->queue_size; ++i)
+			srp_free_iu(target->srp_host, target->tx_ring[i]);
+		kfree(target->tx_ring);
+		target->tx_ring = NULL;
+	}
 }
 
 static void srp_path_rec_completion(int status,
@@ -564,7 +576,11 @@ static void srp_free_req_data(struct srp_target_port *target)
 	struct srp_request *req;
 	int i;
 
-	for (i = 0, req = target->req_ring; i < SRP_CMD_SQ_SIZE; ++i, ++req) {
+	if (!target->req_ring)
+		return;
+
+	for (i = 0; i < target->req_ring_size; ++i) {
+		req = &target->req_ring[i];
 		kfree(req->fmr_list);
 		kfree(req->map_page);
 		if (req->indirect_dma_addr) {
@@ -574,6 +590,9 @@ static void srp_free_req_data(struct srp_target_port *target)
 		}
 		kfree(req->indirect_desc);
 	}
+
+	kfree(target->req_ring);
+	target->req_ring = NULL;
 }
 
 static int srp_alloc_req_data(struct srp_target_port *target)
@@ -586,7 +605,12 @@ static int srp_alloc_req_data(struct srp_target_port *target)
 
 	INIT_LIST_HEAD(&target->free_reqs);
 
-	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
+	target->req_ring = kzalloc(target->req_ring_size *
+				   sizeof(*target->req_ring), GFP_KERNEL);
+	if (!target->req_ring)
+		goto out;
+
+	for (i = 0; i < target->req_ring_size; ++i) {
 		req = &target->req_ring[i];
 		req->fmr_list = kmalloc(target->cmd_sg_cnt * sizeof(void *),
 					GFP_KERNEL);
@@ -810,7 +834,7 @@ static void srp_terminate_io(struct srp_rport *rport)
 	struct srp_target_port *target = rport->lld_data;
 	int i;
 
-	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
+	for (i = 0; i < target->req_ring_size; ++i) {
 		struct srp_request *req = &target->req_ring[i];
 		srp_finish_req(target, req, DID_TRANSPORT_FAILFAST << 16);
 	}
@@ -847,13 +871,13 @@ static int srp_rport_reconnect(struct srp_rport *rport)
 	else
 		srp_create_target_ib(target);
 
-	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
+	for (i = 0; i < target->req_ring_size; ++i) {
 		struct srp_request *req = &target->req_ring[i];
 		srp_finish_req(target, req, DID_RESET << 16);
 	}
 
 	INIT_LIST_HEAD(&target->free_tx);
-	for (i = 0; i < SRP_SQ_SIZE; ++i)
+	for (i = 0; i < target->queue_size; ++i)
 		list_add(&target->tx_ring[i]->list, &target->free_tx);
 
 	if (ret == 0)
@@ -1544,11 +1568,24 @@ err_unlock:
 	return SCSI_MLQUEUE_HOST_BUSY;
 }
 
+/*
+ * Note: the resources allocated in this function are freed in
+ * srp_free_target_ib().
+ */
 static int srp_alloc_iu_bufs(struct srp_target_port *target)
 {
 	int i;
 
-	for (i = 0; i < SRP_RQ_SIZE; ++i) {
+	target->rx_ring = kzalloc(target->queue_size * sizeof(*target->rx_ring),
+				  GFP_KERNEL);
+	if (!target->rx_ring)
+		goto err_no_ring;
+	target->tx_ring = kzalloc(target->queue_size * sizeof(*target->tx_ring),
+				  GFP_KERNEL);
+	if (!target->tx_ring)
+		goto err_no_ring;
+
+	for (i = 0; i < target->queue_size; ++i) {
 		target->rx_ring[i] = srp_alloc_iu(target->srp_host,
 						  target->max_ti_iu_len,
 						  GFP_KERNEL, DMA_FROM_DEVICE);
@@ -1556,7 +1593,7 @@ static int srp_alloc_iu_bufs(struct srp_target_port *target)
 			goto err;
 	}
 
-	for (i = 0; i < SRP_SQ_SIZE; ++i) {
+	for (i = 0; i < target->queue_size; ++i) {
 		target->tx_ring[i] = srp_alloc_iu(target->srp_host,
 						  target->max_iu_len,
 						  GFP_KERNEL, DMA_TO_DEVICE);
@@ -1569,16 +1606,18 @@ static int srp_alloc_iu_bufs(struct srp_target_port *target)
 	return 0;
 
 err:
-	for (i = 0; i < SRP_RQ_SIZE; ++i) {
+	for (i = 0; i < target->queue_size; ++i) {
 		srp_free_iu(target->srp_host, target->rx_ring[i]);
-		target->rx_ring[i] = NULL;
-	}
-
-	for (i = 0; i < SRP_SQ_SIZE; ++i) {
 		srp_free_iu(target->srp_host, target->tx_ring[i]);
-		target->tx_ring[i] = NULL;
 	}
 
+
+err_no_ring:
+	kfree(target->tx_ring);
+	target->tx_ring = NULL;
+	kfree(target->rx_ring);
+	target->rx_ring = NULL;
+
 	return -ENOMEM;
 }
 
@@ -1629,6 +1668,9 @@ static void srp_cm_rep_handler(struct ib_cm_id *cm_id,
 		target->scsi_host->can_queue
 			= min(target->req_lim - SRP_TSK_MGMT_SQ_SIZE,
 			      target->scsi_host->can_queue);
+		target->scsi_host->cmd_per_lun
+			= min_t(int, target->scsi_host->can_queue,
+				target->scsi_host->cmd_per_lun);
 	} else {
 		shost_printk(KERN_WARNING, target->scsi_host,
 			     PFX "Unhandled RSP opcode %#x\n", lrsp->opcode);
@@ -1636,7 +1678,7 @@ static void srp_cm_rep_handler(struct ib_cm_id *cm_id,
 		goto error;
 	}
 
-	if (!target->rx_ring[0]) {
+	if (!target->rx_ring) {
 		ret = srp_alloc_iu_bufs(target);
 		if (ret)
 			goto error;
@@ -1656,7 +1698,7 @@ static void srp_cm_rep_handler(struct ib_cm_id *cm_id,
 	if (ret)
 		goto error_free;
 
-	for (i = 0; i < SRP_RQ_SIZE; i++) {
+	for (i = 0; i < target->queue_size; i++) {
 		struct srp_iu *iu = target->rx_ring[i];
 		ret = srp_post_recv(target, iu);
 		if (ret)
@@ -1903,7 +1945,7 @@ static int srp_reset_device(struct scsi_cmnd *scmnd)
 	if (target->tsk_mgmt_status)
 		return FAILED;
 
-	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
+	for (i = 0; i < target->req_ring_size; ++i) {
 		struct srp_request *req = &target->req_ring[i];
 		if (req->scmnd && req->scmnd->device == scmnd->device)
 			srp_finish_req(target, req, DID_RESET << 16);
@@ -2096,9 +2138,9 @@ static struct scsi_host_template srp_template = {
 	.eh_host_reset_handler		= srp_reset_host,
 	.skip_settle_delay		= true,
 	.sg_tablesize			= SRP_DEF_SG_TABLESIZE,
-	.can_queue			= SRP_CMD_SQ_SIZE,
+	.can_queue			= SRP_DEFAULT_CMD_SQ_SIZE,
 	.this_id			= -1,
-	.cmd_per_lun			= SRP_CMD_SQ_SIZE,
+	.cmd_per_lun			= SRP_DEFAULT_CMD_SQ_SIZE,
 	.use_clustering			= ENABLE_CLUSTERING,
 	.shost_attrs			= srp_host_attrs
 };
@@ -2205,6 +2247,7 @@ enum {
 	SRP_OPT_SG_TABLESIZE	= 1 << 11,
 	SRP_OPT_COMP_VECTOR	= 1 << 12,
 	SRP_OPT_TL_RETRY_COUNT	= 1 << 13,
+	SRP_OPT_CAN_QUEUE	= 1 << 14,
 	SRP_OPT_ALL		= (SRP_OPT_ID_EXT	|
 				   SRP_OPT_IOC_GUID	|
 				   SRP_OPT_DGID		|
@@ -2227,6 +2270,7 @@ static const match_table_t srp_opt_tokens = {
 	{ SRP_OPT_SG_TABLESIZE,		"sg_tablesize=%u"	},
 	{ SRP_OPT_COMP_VECTOR,		"comp_vector=%u"	},
 	{ SRP_OPT_TL_RETRY_COUNT,	"tl_retry_count=%u"	},
+	{ SRP_OPT_CAN_QUEUE,		"can_queue=%d"		},
 	{ SRP_OPT_ERR,			NULL 			}
 };
 
@@ -2321,13 +2365,25 @@ static int srp_parse_options(const char *buf, struct srp_target_port *target)
 			target->scsi_host->max_sectors = token;
 			break;
 
+		case SRP_OPT_CAN_QUEUE:
+			if (match_int(args, &token) || token < 1) {
+				pr_warn("bad can_queue parameter '%s'\n", p);
+				goto out;
+			}
+			target->scsi_host->can_queue = token;
+			target->queue_size = token + SRP_RSP_SQ_SIZE +
+					     SRP_TSK_MGMT_SQ_SIZE;
+			if (!(opt_mask & SRP_OPT_MAX_CMD_PER_LUN))
+				target->scsi_host->cmd_per_lun = token;
+			break;
+
 		case SRP_OPT_MAX_CMD_PER_LUN:
-			if (match_int(args, &token)) {
+			if (match_int(args, &token) || token < 1) {
 				pr_warn("bad max cmd_per_lun parameter '%s'\n",
 					p);
 				goto out;
 			}
-			target->scsi_host->cmd_per_lun = min(token, SRP_CMD_SQ_SIZE);
+			target->scsi_host->cmd_per_lun = token;
 			break;
 
 		case SRP_OPT_IO_CLASS:
@@ -2415,6 +2471,12 @@ static int srp_parse_options(const char *buf, struct srp_target_port *target)
 				pr_warn("target creation request is missing parameter '%s'\n",
 					srp_opt_tokens[i].pattern);
 
+	if (target->scsi_host->cmd_per_lun > target->scsi_host->can_queue
+	    && (opt_mask & SRP_OPT_MAX_CMD_PER_LUN))
+		pr_warn("cmd_per_lun = %d > can_queue = %d\n",
+			target->scsi_host->cmd_per_lun,
+			target->scsi_host->can_queue);
+
 out:
 	kfree(options);
 	return ret;
@@ -2453,11 +2515,14 @@ static ssize_t srp_create_target(struct device *dev,
 	target->sg_tablesize	= indirect_sg_entries ? : cmd_sg_entries;
 	target->allow_ext_sg	= allow_ext_sg;
 	target->tl_retry_count	= 7;
+	target->queue_size	= SRP_DEFAULT_QUEUE_SIZE;
 
 	ret = srp_parse_options(buf, target);
 	if (ret)
 		goto err;
 
+	target->req_ring_size = target->queue_size - SRP_TSK_MGMT_SQ_SIZE;
+
 	if (!srp_conn_unique(target->srp_host, target)) {
 		shost_printk(KERN_INFO, target->scsi_host,
 			     PFX "Already connected to target port with id_ext=%016llx;ioc_guid=%016llx;initiator_ext=%016llx\n",
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index 446b045..5756810 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -57,14 +57,11 @@ enum {
 	SRP_MAX_LUN		= 512,
 	SRP_DEF_SG_TABLESIZE	= 12,
 
-	SRP_RQ_SHIFT    	= 6,
-	SRP_RQ_SIZE		= 1 << SRP_RQ_SHIFT,
-
-	SRP_SQ_SIZE		= SRP_RQ_SIZE,
+	SRP_DEFAULT_QUEUE_SIZE	= 1 << 6,
 	SRP_RSP_SQ_SIZE		= 1,
-	SRP_REQ_SQ_SIZE		= SRP_SQ_SIZE - SRP_RSP_SQ_SIZE,
 	SRP_TSK_MGMT_SQ_SIZE	= 1,
-	SRP_CMD_SQ_SIZE		= SRP_REQ_SQ_SIZE - SRP_TSK_MGMT_SQ_SIZE,
+	SRP_DEFAULT_CMD_SQ_SIZE = SRP_DEFAULT_QUEUE_SIZE - SRP_RSP_SQ_SIZE -
+				  SRP_TSK_MGMT_SQ_SIZE,
 
 	SRP_TAG_NO_REQ		= ~0U,
 	SRP_TAG_TSK_MGMT	= 1U << 31,
@@ -156,6 +153,8 @@ struct srp_target_port {
 	char			target_name[32];
 	unsigned int		scsi_id;
 	unsigned int		sg_tablesize;
+	int			queue_size;
+	int			req_ring_size;
 	int			comp_vector;
 	int			tl_retry_count;
 
@@ -173,9 +172,9 @@ struct srp_target_port {
 
 	int			zero_req_lim;
 
-	struct srp_iu	       *tx_ring[SRP_SQ_SIZE];
-	struct srp_iu	       *rx_ring[SRP_RQ_SIZE];
-	struct srp_request	req_ring[SRP_CMD_SQ_SIZE];
+	struct srp_iu	       **tx_ring;
+	struct srp_iu	       **rx_ring;
+	struct srp_request	*req_ring;
 
 	struct work_struct	tl_err_work;
 	struct work_struct	remove_work;
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]     ` <52136609.3090406-HInyCGIudOg@public.gmane.org>
@ 2013-08-20 15:34       ` Sagi Grimberg
       [not found]         ` <52138C6E.6080201-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2013-09-10  3:01       ` David Dillow
  1 sibling, 1 reply; 31+ messages in thread
From: Sagi Grimberg @ 2013-08-20 15:34 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Roland Dreier, David Dillow, Vu Pham, Sebastian Riemer,
	linux-rdma, Konrad Grzybowski

On 8/20/2013 3:50 PM, Bart Van Assche wrote:
> Certain storage configurations, e.g. a sufficiently large array of
> hard disks in a RAID configuration, need a queue depth above 64 to
> achieve optimal performance. Hence make the queue depth configurable.
>
> [ ... ]

Hey Bart,

I noticed this patch in your github tree and played with it. I agree
that this patch has been needed for a long time...

Question: if SRP now allows larger queues while still using a single
global FMR pool of size 1024, isn't it more likely that under I/O
stress SRP will run out of FMRs to handle I/O commands? Let's say you
have x SCSI hosts, each with a can_queue size of around 512, and all
of them are running an I/O stress load; is it possible that all FMRs
will be in use and no FMR will be available to register the next I/O
SG-list? Did you try out such a scenario?
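
Just to make the numbers concrete (an illustration, not a measured
case):

  4 hosts * 512 outstanding commands = 2048 concurrent registrations,
  i.e. twice the capacity of a 1024-entry global FMR pool.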

I guess that in such a case the IB core will return -EAGAIN and SRP
will return SCSI_MLQUEUE_HOST_BUSY. I think it would be a good idea to
make the FMR pool per connection rather than a single global pool.
What do you think?

Cheers,
-Sagi

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]         ` <52138C6E.6080201-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-08-20 15:55           ` Bart Van Assche
       [not found]             ` <5213917B.3020403-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Bart Van Assche @ 2013-08-20 15:55 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Roland Dreier, David Dillow, Vu Pham, Sebastian Riemer,
	linux-rdma, Konrad Grzybowski

On 08/20/13 17:34, Sagi Grimberg wrote:
> On 8/20/2013 3:50 PM, Bart Van Assche wrote:
>> Certain storage configurations, e.g. a sufficiently large array of
>> hard disks in a RAID configuration, need a queue depth above 64 to
>> achieve optimal performance. Hence make the queue depth configurable.
>> [ ... ]
>
> I noticed this patch in your github tree and played with it. I agree
> that this patch has been needed for a long time...
>
> Question: if SRP now allows larger queues while still using a single
> global FMR pool of size 1024, isn't it more likely that under I/O
> stress SRP will run out of FMRs to handle I/O commands? Let's say you
> have x SCSI hosts, each with a can_queue size of around 512, and all
> of them are running an I/O stress load; is it possible that all FMRs
> will be in use and no FMR will be available to register the next I/O
> SG-list? Did you try out such a scenario?
>
> I guess that in such a case the IB core will return -EAGAIN and SRP
> will return SCSI_MLQUEUE_HOST_BUSY. I think it would be a good idea to
> make the FMR pool per connection rather than a single global pool.
> What do you think?

That makes sense to me. And as long as the above has not yet been 
implemented I'm fine with dropping patch 8/8 from this patch set.

Bart.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]             ` <5213917B.3020403-HInyCGIudOg@public.gmane.org>
@ 2013-08-20 17:43               ` David Dillow
       [not found]                 ` <1377020595.22321.6.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: David Dillow @ 2013-08-20 17:43 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Sagi Grimberg, Roland Dreier, Vu Pham, Sebastian Riemer,
	linux-rdma, Konrad Grzybowski

On Tue, 2013-08-20 at 17:55 +0200, Bart Van Assche wrote:
> On 08/20/13 17:34, Sagi Grimberg wrote:
> > Question: if SRP now allows larger queues while still using a single
> > global FMR pool of size 1024, isn't it more likely that under I/O
> > stress SRP will run out of FMRs to handle I/O commands? Let's say you
> > have x SCSI hosts, each with a can_queue size of around 512, and all
> > of them are running an I/O stress load; is it possible that all FMRs
> > will be in use and no FMR will be available to register the next I/O
> > SG-list? Did you try out such a scenario?
> >
> > I guess that in such a case the IB core will return -EAGAIN and SRP
> > will return SCSI_MLQUEUE_HOST_BUSY. I think it would be a good idea
> > to make the FMR pool per connection rather than a single global pool.
> > What do you think?
> 
> That makes sense to me. And as long as the above has not yet been 
> implemented I'm fine with dropping patch 8/8 from this patch set.

Don't drop it; most configs won't have all that many connections and
shouldn't have an issue; even those that do will only see a potential
slowdown when running with everything at once.

We can address the FMR/BMME issues on top of this patch.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]                 ` <1377020595.22321.6.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
@ 2013-08-21  7:19                   ` Sagi Grimberg
  0 siblings, 0 replies; 31+ messages in thread
From: Sagi Grimberg @ 2013-08-21  7:19 UTC (permalink / raw)
  To: David Dillow
  Cc: Bart Van Assche, Roland Dreier, Vu Pham, Sebastian Riemer,
	linux-rdma, Konrad Grzybowski

On 8/20/2013 8:43 PM, David Dillow wrote:
> On Tue, 2013-08-20 at 17:55 +0200, Bart Van Assche wrote:
>> On 08/20/13 17:34, Sagi Grimberg wrote:
>>> Question: if SRP now allows larger queues while still using a single
>>> global FMR pool of size 1024, isn't it more likely that under I/O
>>> stress SRP will run out of FMRs to handle I/O commands? Let's say you
>>> have x SCSI hosts, each with a can_queue size of around 512, and all
>>> of them are running an I/O stress load; is it possible that all FMRs
>>> will be in use and no FMR will be available to register the next I/O
>>> SG-list? Did you try out such a scenario?
>>>
>>> I guess that in such a case the IB core will return -EAGAIN and SRP
>>> will return SCSI_MLQUEUE_HOST_BUSY. I think it would be a good idea
>>> to make the FMR pool per connection rather than a single global pool.
>>> What do you think?
>> That makes sense to me. And as long as the above has not yet been
>> implemented I'm fine with dropping patch 8/8 from this patch set.
> Don't drop it; most configs won't have all that many connections and
> shouldn't have an issue; even those that do will only see a potential
> slowdown when running with everything at once.
>
> We can address the FMR/BMME issues on top of this patch.

Agree.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/8] IB SRP initiator patches for kernel 3.12
       [not found] ` <521363EA.8080906-HInyCGIudOg@public.gmane.org>
                     ` (7 preceding siblings ...)
  2013-08-20 12:50   ` [PATCH 8/8] IB/srp: Make queue size configurable Bart Van Assche
@ 2013-09-10  2:53   ` David Dillow
  8 siblings, 0 replies; 31+ messages in thread
From: David Dillow @ 2013-09-10  2:53 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Roland Dreier, Vu Pham, Sebastian Riemer, linux-rdma, linux-scsi

On Tue, 2013-08-20 at 14:41 +0200, Bart Van Assche wrote:
> Changes since the previous patch series are:
> - Rewrote the srp_tmo_valid() to improve readability (requested by Dave
>    Dillow).
> - The combination (reconnect_delay < 0 && fast_io_fail_tmo < 0 &&
>    dev_loss_tmo < 0) is now rejected as requested by Dave Dillow.
> - Fixed a race between transport layer failure handling and device
>    removal. This issue was reported by Vu Pham.

For patches 1-6,

Acked-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]     ` <52136609.3090406-HInyCGIudOg@public.gmane.org>
  2013-08-20 15:34       ` Sagi Grimberg
@ 2013-09-10  3:01       ` David Dillow
       [not found]         ` <1378782080.3794.6.camel-VK19RVc5TWXUd6DVheFtbw@public.gmane.org>
  1 sibling, 1 reply; 31+ messages in thread
From: David Dillow @ 2013-09-10  3:01 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Roland Dreier, Vu Pham, Sebastian Riemer, linux-rdma,
	Konrad Grzybowski

On Tue, 2013-08-20 at 14:50 +0200, Bart Van Assche wrote:
> @@ -2227,6 +2270,7 @@ static const match_table_t srp_opt_tokens = {
>         { SRP_OPT_SG_TABLESIZE,         "sg_tablesize=%u"       },
>         { SRP_OPT_COMP_VECTOR,          "comp_vector=%u"        },
>         { SRP_OPT_TL_RETRY_COUNT,       "tl_retry_count=%u"     },
> +       { SRP_OPT_CAN_QUEUE,            "can_queue=%d"          },

I'm pretty much OK with the patch, but since we're stuck with it going
forward, I'd like to have a better externally visible name here --
queue_depth? max_queue? queue_size?

Otherwise,
Acked-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>

-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]         ` <1378782080.3794.6.camel-VK19RVc5TWXUd6DVheFtbw@public.gmane.org>
@ 2013-09-10 17:44           ` Bart Van Assche
       [not found]             ` <522F5A81.8040101-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Bart Van Assche @ 2013-09-10 17:44 UTC (permalink / raw)
  To: David Dillow
  Cc: Roland Dreier, Vu Pham, Sebastian Riemer, linux-rdma,
	Konrad Grzybowski

On 09/10/13 05:01, David Dillow wrote:
> On Tue, 2013-08-20 at 14:50 +0200, Bart Van Assche wrote:
>> @@ -2227,6 +2270,7 @@ static const match_table_t srp_opt_tokens = {
>>          { SRP_OPT_SG_TABLESIZE,         "sg_tablesize=%u"       },
>>          { SRP_OPT_COMP_VECTOR,          "comp_vector=%u"        },
>>          { SRP_OPT_TL_RETRY_COUNT,       "tl_retry_count=%u"     },
>> +       { SRP_OPT_CAN_QUEUE,            "can_queue=%d"          },
>
> I'm pretty much OK with the patch, but since we're stuck with it going
> forward, I'd like to have a better externally visible name here --
> queue_depth? max_queue? queue_size?
>
> Otherwise,
> Acked-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>

Hello Dave,

If this name were not yet in use in any interface that is visible in
user space, I would agree that we should come up with a better name.
However, the SCSI mid-layer already uses that name today to export the
queue size. To me this looks like a good reason to reuse the name
"can_queue". An example:

$ cat /sys/class/scsi_host/host93/can_queue
62

See also the shost_rd_attr(can_queue, "%hd\n") statement in 
drivers/scsi/scsi_sysfs.c.
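
The 62 in that example is consistent with the defaults introduced by
patch 8/8:

  SRP_DEFAULT_QUEUE_SIZE - SRP_RSP_SQ_SIZE - SRP_TSK_MGMT_SQ_SIZE
    = 64 - 1 - 1 = 62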

Bart.



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]             ` <522F5A81.8040101-HInyCGIudOg@public.gmane.org>
@ 2013-09-11 22:16               ` David Dillow
       [not found]                 ` <1378937796.6649.5.camel-a7a0dvSY7KqLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: David Dillow @ 2013-09-11 22:16 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: David Dillow, Roland Dreier, Vu Pham, Sebastian Riemer,
	linux-rdma, Konrad Grzybowski

On Tue, 2013-09-10 at 19:44 +0200, Bart Van Assche wrote:
> If this name were not yet in use in any interface that is visible in
> user space, I would agree that we should come up with a better name.
> However, the SCSI mid-layer already uses that name today to export the
> queue size. To me this looks like a good reason to reuse the name
> "can_queue". An example:
> 
> $ cat /sys/class/scsi_host/host93/can_queue
> 62

Yes, I know it has been used before, but I'm torn between not furthering
a bad naming choice and consistency. Foolish consistency and all that...

I really don't like "can_queue", but I'll not complain if Roland decides
to take it as-is.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]                 ` <1378937796.6649.5.camel-a7a0dvSY7KqLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
@ 2013-09-12 16:16                   ` Jack Wang
       [not found]                     ` <5231E8CE.5060105-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2013-09-16 14:25                   ` Bart Van Assche
  1 sibling, 1 reply; 31+ messages in thread
From: Jack Wang @ 2013-09-12 16:16 UTC (permalink / raw)
  To: David Dillow
  Cc: Bart Van Assche, David Dillow, Roland Dreier, Vu Pham,
	Sebastian Riemer, linux-rdma, Konrad Grzybowski

On 09/12/2013 12:16 AM, David Dillow wrote:
> On Tue, 2013-09-10 at 19:44 +0200, Bart Van Assche wrote:
>> [ ... ]
> 
> Yes, I know it has been used before, but I'm torn between not furthering
> a bad naming choice and consistency. Foolish consistency and all that...
> 
> I really don't like "can_queue", but I'll not complain if Roland decides
> to take it as-is.
> 

Hi,

What is the allowed range for this queue size? Having the same default
value for cmd_per_lun and can_queue makes no sense to me. Could we
bump can_queue to a bigger value like 512?

Best
Jack


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]                     ` <5231E8CE.5060105-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2013-09-12 16:30                       ` Bart Van Assche
       [not found]                         ` <5231EC1A.7030902-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Bart Van Assche @ 2013-09-12 16:30 UTC (permalink / raw)
  To: Jack Wang
  Cc: David Dillow, David Dillow, Roland Dreier, Vu Pham,
	Sebastian Riemer, linux-rdma, Konrad Grzybowski

On 09/12/13 18:16, Jack Wang wrote:
> On 09/12/2013 12:16 AM, David Dillow wrote:
>> [ ... ]
>
> What is the allowed range for this queue size? Having the same default
> value for cmd_per_lun and can_queue makes no sense to me. Could we
> bump can_queue to a bigger value like 512?

Increasing the default value is only necessary when using a hard disk 
array at the target side. When using a single hard disk or an SSD at the 
target side the default value is fine.

Bart.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]                         ` <5231EC1A.7030902-HInyCGIudOg@public.gmane.org>
@ 2013-09-13  8:06                           ` Jack Wang
       [not found]                             ` <5232C76B.4010704-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Jack Wang @ 2013-09-13  8:06 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: David Dillow, David Dillow, Roland Dreier, Vu Pham,
	Sebastian Riemer, linux-rdma, Konrad Grzybowski

On 09/12/2013 06:30 PM, Bart Van Assche wrote:
> On 09/12/13 18:16, Jack Wang wrote:
>> On 09/12/2013 12:16 AM, David Dillow wrote:
>>> [ ... ]
>>
>> What is the allowed range for this queue size? Having the same default
>> value for cmd_per_lun and can_queue makes no sense to me. Could we
>> bump can_queue to a bigger value like 512?
> 
> Increasing the default value is only necessary when using a hard disk
> array at the target side. When using a single hard disk or an SSD at the
> target side the default value is fine.
> 
> Bart.
> 
Agree. On the other hand, increasing the default can_queue value is
harmless for a single hard disk/SSD, but it will boost performance for
multiple LUNs attached to the same host, so why not?

Jack

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]                             ` <5232C76B.4010704-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2013-09-13  8:40                               ` Bart Van Assche
       [not found]                                 ` <5232CF86.20507-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Bart Van Assche @ 2013-09-13  8:40 UTC (permalink / raw)
  To: Jack Wang
  Cc: David Dillow, David Dillow, Roland Dreier, Vu Pham,
	Sebastian Riemer, linux-rdma, Konrad Grzybowski

On 09/13/13 10:06, Jack Wang wrote:
> On 09/12/2013 06:30 PM, Bart Van Assche wrote:
>> On 09/12/13 18:16, Jack Wang wrote:
>>> On 09/12/2013 12:16 AM, David Dillow wrote:
>>>> [ ... ]
>>>
>>> What is the allowed range for this queue size? Having the same default
>>> value for cmd_per_lun and can_queue makes no sense to me. Could we
>>> bump can_queue to a bigger value like 512?
>>
>> Increasing the default value is only necessary when using a hard disk
>> array at the target side. When using a single hard disk or an SSD at the
>> target side the default value is fine.
>
> Agree. On the other hand, increasing the default can_queue value is
> harmless for a single hard disk/SSD, but it will boost performance for
> multiple LUNs attached to the same host, so why not?

Increasing that parameter increases memory consumption for each path 
between initiator and target. In a setup where both the initiator and 
the target have multiple IB ports the number of paths can be large even 
when only a single LUN is exported by the target. That's why I'm not 
enthusiastic about increasing the default value of the queue size.
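
As a rough sketch of where that memory goes (based on the allocations
in patch 8/8; the sizes below are symbolic, not measured), each path
allocates:

  req_ring: req_ring_size * (sizeof(struct srp_request)
            + cmd_sg_cnt * sizeof(void *)      /* fmr_list */
            + SRP_FMR_SIZE * sizeof(void *)    /* map_page */
            + indirect_size)                   /* indirect_desc */
  rx_ring:  queue_size information units of max_ti_iu_len bytes
  tx_ring:  queue_size information units of max_iu_len bytes

That footprint scales linearly with the queue size and is multiplied
by the number of paths.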

Bart.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]                                 ` <5232CF86.20507-HInyCGIudOg@public.gmane.org>
@ 2013-09-13  9:24                                   ` Bart Van Assche
       [not found]                                     ` <5232D9BC.7090808-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Bart Van Assche @ 2013-09-13  9:24 UTC (permalink / raw)
  To: Jack Wang
  Cc: David Dillow, David Dillow, Roland Dreier, Vu Pham,
	Sebastian Riemer, linux-rdma, Konrad Grzybowski

On 09/13/13 10:40, Bart Van Assche wrote:
> On 09/13/13 10:06, Jack Wang wrote:
>> [ ... ]
>
> Increasing that parameter increases memory consumption for each path
> between initiator and target. In a setup where both the initiator and
> the target have multiple IB ports the number of paths can be large even
> when only a single LUN is exported by the target. That's why I'm not
> enthusiastic about increasing the default value of the queue size.

(replying to my own e-mail)

Note: with the srp_daemon implementation available on github it's 
possible to set the can_queue parameter for all logins initiated by 
srp_daemon by adding something like the following at the end of 
/etc/srp_daemon.conf:

a can_queue=512

See also http://github.com/bvanassche/srptools.

Bart.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]                                     ` <5232D9BC.7090808-HInyCGIudOg@public.gmane.org>
@ 2013-09-13 12:25                                       ` Jack Wang
       [not found]                                         ` <5233043F.5020804-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Jack Wang @ 2013-09-13 12:25 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: David Dillow, David Dillow, Roland Dreier, Vu Pham,
	Sebastian Riemer, linux-rdma, Konrad Grzybowski

On 09/13/2013 11:24 AM, Bart Van Assche wrote:
> On 09/13/13 10:40, Bart Van Assche wrote:
>> On 09/13/13 10:06, Jack Wang wrote:
>>> On 09/12/2013 06:30 PM, Bart Van Assche wrote:
>>>> On 09/12/13 18:16, Jack Wang wrote:
>>>>> On 09/12/2013 12:16 AM, David Dillow wrote:
>>>>>> On Tue, 2013-09-10 at 19:44 +0200, Bart Van Assche wrote:
>>>>>>> If this name was not yet in use in any interface that is visible in
>>>>>>> user
>>>>>>> space, I would agree that we should come up with a better name.
>>>>>>> However,
>>>>>>> the SCSI mid-layer already uses that name today to export the queue
>>>>>>> size. To me this looks like a good reason to use the name
>>>>>>> "can_queue" ?
>>>>>>> An example:
>>>>>>>
>>>>>>> $ cat /sys/class/scsi_host/host93/can_queue
>>>>>>> 62
>>>>>>
>>>>>> Yes, I know it has been used before, but I'm torn between not
>>>>>> furthering
>>>>>> a bad naming choice and consistency. Foolish consistency and all
>>>>>> that...
>>>>>>
>>>>>> I really don't like "can_queue", but I'll not complain if Roland
>>>>>> decides
>>>>>> to take it as-is.
>>>>>
>>>>> What is the allowed range for this queue size?
>>>>> Defaulting cmd_per_lun and can_queue to the same value makes no
>>>>> sense to me.
>>>>> Could we bump can_queue to a bigger value like 512?
>>>>
>>>> Increasing the default value is only necessary when using a hard disk
>>>> array at the target side. When using a single hard disk or an SSD at
>>>> the
>>>> target side the default value is fine.
>>>
>>> Agreed. On the other hand, increasing the default can_queue value is
>>> harmless for a single hard disk/SSD but will boost performance when
>>> multiple LUNs are attached to the same host, so why not?
>>
>> Increasing that parameter increases memory consumption for each path
>> between initiator and target. In a setup where both the initiator and
>> the target have multiple IB ports the number of paths can be large even
>> when only a single LUN is exported by the target. That's why I'm not
>> enthusiastic about increasing the default value of the queue size.
> 
> (replying to my own e-mail)
> 
> Note: with the srp_daemon implementation available on github it's
> possible to set the can_queue parameter for all logins initiated by
> srp_daemon by adding something like the following at the end of
> /etc/srp_daemon.conf:
> 
> a can_queue=512
> 
> See also http://github.com/bvanassche/srptools.
> 
> Bart.
Thanks Bart,

I tried your srp-ha branch from github. Echoing the string
> SRP2="id_ext=${THCA2_GUID},ioc_guid=${THCA2_GUID},dgid=${TGID_P2},pkey=${PKEY},service_id=${THCA2_GUID},can_queue=512"
into add_target failed with

>  ib_srp: unknown parameter or missing value 'can_queue=512
> [  122.405385] ' in target creation request
Am I missing something?

Cheers,
Jack


* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]                                         ` <5233043F.5020804-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2013-09-13 13:33                                           ` Bart Van Assche
       [not found]                                             ` <52331444.8070007-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Bart Van Assche @ 2013-09-13 13:33 UTC (permalink / raw)
  To: Jack Wang
  Cc: David Dillow, David Dillow, Roland Dreier, Vu Pham,
	Sebastian Riemer, linux-rdma, Konrad Grzybowski

On 09/13/13 14:25, Jack Wang wrote:
> I tried your srp-ha branch from github. Echoing the string
>> SRP2="id_ext=${THCA2_GUID},ioc_guid=${THCA2_GUID},dgid=${TGID_P2},pkey=${PKEY},service_id=${THCA2_GUID},can_queue=512"
> into add_target failed with
>
>>   ib_srp: unknown parameter or missing value 'can_queue=512
>> [  122.405385] ' in target creation request
> Am I missing something?

Hello Jack,

Have you tried the same command with "echo -n" instead of "echo"? If I 
remember correctly, the SRP initiator doesn't like a newline at the end 
of the string written into the sysfs add_target attribute.
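For example, a minimal sketch (the srp device name "srp-mlx4_0-1" below 
is illustrative; use the one that matches your HCA and port):

$ echo -n "${SRP2}" > /sys/class/infiniband_srp/srp-mlx4_0-1/add_target

Plain "echo" appends a trailing newline to the parameter string, which 
is exactly the stray newline visible in the error message you quoted.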

Hope this helps,

Bart.


* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]                                             ` <52331444.8070007-HInyCGIudOg@public.gmane.org>
@ 2013-09-13 13:51                                               ` Jack Wang
       [not found]                                                 ` <52331854.9010607-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Jack Wang @ 2013-09-13 13:51 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: David Dillow, David Dillow, Roland Dreier, Vu Pham,
	Sebastian Riemer, linux-rdma, Konrad Grzybowski

On 09/13/2013 03:33 PM, Bart Van Assche wrote:
> On 09/13/13 14:25, Jack Wang wrote:
>> I tried your srp-ha branch from github. Echoing the string
>>> SRP2="id_ext=${THCA2_GUID},ioc_guid=${THCA2_GUID},dgid=${TGID_P2},pkey=${PKEY},service_id=${THCA2_GUID},can_queue=512"
>> into add_target failed with
>>
>>>   ib_srp: unknown parameter or missing value 'can_queue=512
>>> [  122.405385] ' in target creation request
>> Am I missing something?
> 
> Hello Jack,
> 
> Have you tried the same command with "echo -n" instead of "echo"? If I
> remember correctly, the SRP initiator doesn't like a newline at the end
> of the string written into the sysfs add_target attribute.
> 
> Hope this helps,
> 
> Bart.
> 
Thanks Bart,

You're right, that was the problem; with the -n option I can set
can_queue successfully.

But cat /sys/class/scsi_host/hostx/can_queue still shows 63.

It looks like this is caused by the logic below:


> /*
>  * Reserve credits for task management so we don't
>  * bounce requests back to the SCSI mid-layer.
>  */
> target->scsi_host->can_queue
>         = min(target->req_lim - SRP_TSK_MGMT_SQ_SIZE,
>               target->scsi_host->can_queue);
> target->scsi_host->cmd_per_lun
>         = min_t(int, target->scsi_host->can_queue,
>                 target->scsi_host->cmd_per_lun);

Could you give a hint why we need to update can_queue and cmd_per_lun
here?

Cheers,
Jack

* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]                                                 ` <52331854.9010607-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2013-09-13 14:03                                                   ` Bart Van Assche
       [not found]                                                     ` <52331B47.9070202-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Bart Van Assche @ 2013-09-13 14:03 UTC (permalink / raw)
  To: Jack Wang
  Cc: David Dillow, David Dillow, Roland Dreier, Vu Pham,
	Sebastian Riemer, linux-rdma, Konrad Grzybowski

On 09/13/13 15:51, Jack Wang wrote:
> On 09/13/2013 03:33 PM, Bart Van Assche wrote:
>> On 09/13/13 14:25, Jack Wang wrote:
>>> I tried your srp-ha branch from github. Echoing the string
>>>> SRP2="id_ext=${THCA2_GUID},ioc_guid=${THCA2_GUID},dgid=${TGID_P2},pkey=${PKEY},service_id=${THCA2_GUID},can_queue=512"
>>> into add_target failed with
>>>
>>>>    ib_srp: unknown parameter or missing value 'can_queue=512
>>>> [  122.405385] ' in target creation request
>>> Am I missing something?
>>
>> Hello Jack,
>>
>> Have you tried the same command with "echo -n" instead of "echo"? If I
>> remember correctly, the SRP initiator doesn't like a newline at the end
>> of the string written into the sysfs add_target attribute.
>>
>> Hope this helps,
>>
>> Bart.
>>
> Thanks Bart,
>
> You're right, that was the problem; with the -n option I can set
> can_queue successfully.
>
> But cat /sys/class/scsi_host/hostx/can_queue still shows 63.
>
> It looks like this is caused by the logic below:
>
>
>> /*
>>  * Reserve credits for task management so we don't
>>  * bounce requests back to the SCSI mid-layer.
>>  */
>> target->scsi_host->can_queue
>>         = min(target->req_lim - SRP_TSK_MGMT_SQ_SIZE,
>>               target->scsi_host->can_queue);
>> target->scsi_host->cmd_per_lun
>>         = min_t(int, target->scsi_host->can_queue,
>>                 target->scsi_host->cmd_per_lun);
>
> Could you give a hint why we need to update can_queue and cmd_per_lun
> here?

Hello Jack,

What is the value of target->req_lim on your setup? The initiator 
limits the queue size to the target's request limit because any attempt 
to queue more commands would cause the target to reply with "BUSY" anyway.
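For illustration, a sketch of that clamping with hypothetical numbers, 
assuming SRP_TSK_MGMT_SQ_SIZE == 1 as in the ib_srp sources of that era:

	/* target reports req_lim = 64; user requested can_queue = 512 */
	can_queue = min(req_lim - SRP_TSK_MGMT_SQ_SIZE, 512);
		  /* = min(64 - 1, 512) = 63 */

which would match a sysfs can_queue of 63.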

Bart.



* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]                                                     ` <52331B47.9070202-HInyCGIudOg@public.gmane.org>
@ 2013-09-13 14:15                                                       ` Jack Wang
       [not found]                                                         ` <52331E01.3060005-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Jack Wang @ 2013-09-13 14:15 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: David Dillow, David Dillow, Roland Dreier, Vu Pham,
	Sebastian Riemer, linux-rdma, Konrad Grzybowski

On 09/13/2013 04:03 PM, Bart Van Assche wrote:
> On 09/13/13 15:51, Jack Wang wrote:
>> On 09/13/2013 03:33 PM, Bart Van Assche wrote:
>>> On 09/13/13 14:25, Jack Wang wrote:
>>>> I tried your srp-ha branch from github. Echoing the string
>>>>> SRP2="id_ext=${THCA2_GUID},ioc_guid=${THCA2_GUID},dgid=${TGID_P2},pkey=${PKEY},service_id=${THCA2_GUID},can_queue=512"
>>>> into add_target failed with
>>>>
>>>>>    ib_srp: unknown parameter or missing value 'can_queue=512
>>>>> [  122.405385] ' in target creation request
>>>> Am I missing something?
>>>
>>> Hello Jack,
>>>
>>> Have you tried the same command with "echo -n" instead of "echo"? If I
>>> remember correctly, the SRP initiator doesn't like a newline at the end
>>> of the string written into the sysfs add_target attribute.
>>>
>>> Hope this helps,
>>>
>>> Bart.
>>>
>> Thanks Bart,
>>
>> You're right, that was the problem; with the -n option I can set
>> can_queue successfully.
>>
>> But cat /sys/class/scsi_host/hostx/can_queue still shows 63.
>>
>> It looks like this is caused by the logic below:
>>
>>
>>> /*
>>>  * Reserve credits for task management so we don't
>>>  * bounce requests back to the SCSI mid-layer.
>>>  */
>>> target->scsi_host->can_queue
>>>         = min(target->req_lim - SRP_TSK_MGMT_SQ_SIZE,
>>>               target->scsi_host->can_queue);
>>> target->scsi_host->cmd_per_lun
>>>         = min_t(int, target->scsi_host->can_queue,
>>>                 target->scsi_host->cmd_per_lun);
>>
>> Could you give a hint why we need to update can_queue and cmd_per_lun
>> here?
> 
> Hello Jack,
> 
> What is the value of target->req_lim on your setup? The initiator
> limits the queue size to the target's request limit because any attempt
> to queue more commands would cause the target to reply with "BUSY" anyway.
> 
> Bart.
> 
> 
Hello Bart,

> cat /sys/class/scsi_host/host36/req_lim 
> 64

I just checked the SRP spec, which does define this behaviour. I wonder
how the request limit is chosen in SCST/SRPT: is it reported by the
low-level hardware driver?

Thanks
Jack




* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]                                                         ` <52331E01.3060005-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2013-09-13 14:31                                                           ` Jack Wang
  2013-09-13 14:31                                                           ` Bart Van Assche
  1 sibling, 0 replies; 31+ messages in thread
From: Jack Wang @ 2013-09-13 14:31 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: David Dillow, David Dillow, Roland Dreier, Vu Pham,
	Sebastian Riemer, linux-rdma, Konrad Grzybowski

On 09/13/2013 04:15 PM, Jack Wang wrote:
> On 09/13/2013 04:03 PM, Bart Van Assche wrote:
>> On 09/13/13 15:51, Jack Wang wrote:
>>> On 09/13/2013 03:33 PM, Bart Van Assche wrote:
>>>> On 09/13/13 14:25, Jack Wang wrote:
>>>>> I tried your srp-ha branch from github. Echoing the string
>>>>>> SRP2="id_ext=${THCA2_GUID},ioc_guid=${THCA2_GUID},dgid=${TGID_P2},pkey=${PKEY},service_id=${THCA2_GUID},can_queue=512"
>>>>> into add_target failed with
>>>>>
>>>>>>    ib_srp: unknown parameter or missing value 'can_queue=512
>>>>>> [  122.405385] ' in target creation request
>>>>> Am I missing something?
>>>>
>>>> Hello Jack,
>>>>
>>>> Have you tried the same command with "echo -n" instead of "echo"? If I
>>>> remember correctly, the SRP initiator doesn't like a newline at the end
>>>> of the string written into the sysfs add_target attribute.
>>>>
>>>> Hope this helps,
>>>>
>>>> Bart.
>>>>
>>> Thanks Bart,
>>>
>>> You're right, that was the problem; with the -n option I can set
>>> can_queue successfully.
>>>
>>> But cat /sys/class/scsi_host/hostx/can_queue still shows 63.
>>>
>>> It looks like this is caused by the logic below:
>>>
>>>
>>>> /*
>>>>  * Reserve credits for task management so we don't
>>>>  * bounce requests back to the SCSI mid-layer.
>>>>  */
>>>> target->scsi_host->can_queue
>>>>         = min(target->req_lim - SRP_TSK_MGMT_SQ_SIZE,
>>>>               target->scsi_host->can_queue);
>>>> target->scsi_host->cmd_per_lun
>>>>         = min_t(int, target->scsi_host->can_queue,
>>>>                 target->scsi_host->cmd_per_lun);
>>>
>>> Could you give a hint why we need to update can_queue and cmd_per_lun
>>> here?
>>
>> Hello Jack,
>>
>> What is the value of target->req_lim on your setup? The initiator
>> limits the queue size to the target's request limit because any attempt
>> to queue more commands would cause the target to reply with "BUSY" anyway.
>>
>> Bart.
>>
>>
> Hello Bart,
> 
>> cat /sys/class/scsi_host/host36/req_lim 
>> 64
> 
> I just checked the SRP spec, which does define this behaviour. I wonder
> how the request limit is chosen in SCST/SRPT: is it reported by the
> low-level hardware driver?
> 
> Thanks
> Jack
> 
> 
> 

Replying to myself: by default SRPT uses the following SCST limit as
the request limit in its login response:
> #define SCST_MAX_TGT_DEV_COMMANDS            64

Cheers,
Jack


* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]                                                         ` <52331E01.3060005-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2013-09-13 14:31                                                           ` Jack Wang
@ 2013-09-13 14:31                                                           ` Bart Van Assche
  1 sibling, 0 replies; 31+ messages in thread
From: Bart Van Assche @ 2013-09-13 14:31 UTC (permalink / raw)
  To: Jack Wang
  Cc: David Dillow, David Dillow, Roland Dreier, Vu Pham,
	Sebastian Riemer, linux-rdma, Konrad Grzybowski

On 09/13/13 16:15, Jack Wang wrote:
> Hello Bart,
>
>> cat /sys/class/scsi_host/host36/req_lim
>> 64
>
> I just checked the SRP spec, which does define this behaviour. I wonder
> how the request limit is chosen in SCST/SRPT: is it reported by the
> low-level hardware driver?

Hello Jack,

The following code probably provides the information you are looking for:

/*
  * Avoid QUEUE_FULL conditions by limiting the number of buffers used
  * for the SRP protocol to the SCST SCSI command queue size.
  */
ch->rq_size = min(SRPT_RQ_SIZE, scst_get_max_lun_commands(NULL, 0));

In other words, req_lim can be increased by increasing both SRPT_RQ_SIZE 
and SCST_MAX_DEV_COMMANDS.
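For illustration, a sketch of how the observed numbers could fit 
together (the SRPT_RQ_SIZE value is an assumption; the 64 is the 
SCST_MAX_TGT_DEV_COMMANDS limit you quoted, and SRP_TSK_MGMT_SQ_SIZE is 
assumed to be 1):

ch->rq_size = min(SRPT_RQ_SIZE /* e.g. 128 */, 64); /* -> req_lim = 64 */
/* initiator side: can_queue = req_lim - SRP_TSK_MGMT_SQ_SIZE = 63 */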

Bart.


* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]                 ` <1378937796.6649.5.camel-a7a0dvSY7KqLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
  2013-09-12 16:16                   ` Jack Wang
@ 2013-09-16 14:25                   ` Bart Van Assche
       [not found]                     ` <523714D8.3020104-HInyCGIudOg@public.gmane.org>
  1 sibling, 1 reply; 31+ messages in thread
From: Bart Van Assche @ 2013-09-16 14:25 UTC (permalink / raw)
  To: David Dillow
  Cc: David Dillow, Roland Dreier, Vu Pham, Sebastian Riemer,
	linux-rdma, Konrad Grzybowski

On 09/12/13 00:16, David Dillow wrote:
> On Tue, 2013-09-10 at 19:44 +0200, Bart Van Assche wrote:
>> If this name was not yet in use in any interface that is visible in user
>> space, I would agree that we should come up with a better name. However,
>> the SCSI mid-layer already uses that name today to export the queue
>> size. To me this looks like a good reason to use the name "can_queue"?
>> An example:
>>
>> $ cat /sys/class/scsi_host/host93/can_queue
>> 62
>
> Yes, I know it has been used before, but I'm torn between not furthering
> a bad naming choice and consistency. Foolish consistency and all that...
>
> I really don't like "can_queue", but I'll not complain if Roland decides
> to take it as-is.

The merge window closed early, which means that I'll have to resend 
this patch series anyway. How about using the name "queue_size" 
instead?

Bart.


* Re: [PATCH 8/8] IB/srp: Make queue size configurable
       [not found]                     ` <523714D8.3020104-HInyCGIudOg@public.gmane.org>
@ 2013-09-16 14:28                       ` David Dillow
  0 siblings, 0 replies; 31+ messages in thread
From: David Dillow @ 2013-09-16 14:28 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: David Dillow, Roland Dreier, Vu Pham, Sebastian Riemer,
	linux-rdma, Konrad Grzybowski

On Mon, 2013-09-16 at 16:25 +0200, Bart Van Assche wrote:
> On 09/12/13 00:16, David Dillow wrote:
> > On Tue, 2013-09-10 at 19:44 +0200, Bart Van Assche wrote:
> >> If this name was not yet in use in any interface that is visible in user
> >> space, I would agree that we should come up with a better name. However,
> >> the SCSI mid-layer already uses that name today to export the queue
> >> size. To me this looks like a good reason to use the name "can_queue"?
> >> An example:
> >>
> >> $ cat /sys/class/scsi_host/host93/can_queue
> >> 62
> >
> > Yes, I know it has been used before, but I'm torn between not furthering
> > a bad naming choice and consistency. Foolish consistency and all that...
> >
> > I really don't like "can_queue", but I'll not complain if Roland decides
> > to take it as-is.
> 
> The merge window closed early, which means that I'll have to resend 
> this patch series anyway. How about using the name "queue_size" 
> instead?

I'm good with that, but I should be able to make some time to pull it
all into a git tree for Roland; this will keep you from having to
resubmit.

Dave


Thread overview: 31+ messages
2013-08-20 12:41 [PATCH 0/8] IB SRP initiator patches for kernel 3.12 Bart Van Assche
     [not found] ` <521363EA.8080906-HInyCGIudOg@public.gmane.org>
2013-08-20 12:43   ` [PATCH 1/8] IB/srp: Keep rport as long as the IB transport layer Bart Van Assche
2013-08-20 12:44   ` [PATCH 2/8] scsi_transport_srp: Add transport layer error handling Bart Van Assche
2013-08-20 12:45   ` [PATCH 3/8] IB/srp: Add srp_terminate_io() Bart Van Assche
2013-08-20 12:46   ` [PATCH 4/8] IB/srp: Use SRP transport layer error recovery Bart Van Assche
2013-08-20 12:46   ` [PATCH 5/8] IB/srp: Start timers if a transport layer error occurs Bart Van Assche
2013-08-20 12:47   ` [PATCH 6/8] IB/srp: Make transport layer retry count configurable Bart Van Assche
2013-08-20 12:48   ` [PATCH 7/8] IB/srp: Introduce srp_alloc_req_data() Bart Van Assche
2013-08-20 12:50   ` [PATCH 8/8] IB/srp: Make queue size configurable Bart Van Assche
     [not found]     ` <52136609.3090406-HInyCGIudOg@public.gmane.org>
2013-08-20 15:34       ` Sagi Grimberg
     [not found]         ` <52138C6E.6080201-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-08-20 15:55           ` Bart Van Assche
     [not found]             ` <5213917B.3020403-HInyCGIudOg@public.gmane.org>
2013-08-20 17:43               ` David Dillow
     [not found]                 ` <1377020595.22321.6.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
2013-08-21  7:19                   ` Sagi Grimberg
2013-09-10  3:01       ` David Dillow
     [not found]         ` <1378782080.3794.6.camel-VK19RVc5TWXUd6DVheFtbw@public.gmane.org>
2013-09-10 17:44           ` Bart Van Assche
     [not found]             ` <522F5A81.8040101-HInyCGIudOg@public.gmane.org>
2013-09-11 22:16               ` David Dillow
     [not found]                 ` <1378937796.6649.5.camel-a7a0dvSY7KqLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
2013-09-12 16:16                   ` Jack Wang
     [not found]                     ` <5231E8CE.5060105-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2013-09-12 16:30                       ` Bart Van Assche
     [not found]                         ` <5231EC1A.7030902-HInyCGIudOg@public.gmane.org>
2013-09-13  8:06                           ` Jack Wang
     [not found]                             ` <5232C76B.4010704-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2013-09-13  8:40                               ` Bart Van Assche
     [not found]                                 ` <5232CF86.20507-HInyCGIudOg@public.gmane.org>
2013-09-13  9:24                                   ` Bart Van Assche
     [not found]                                     ` <5232D9BC.7090808-HInyCGIudOg@public.gmane.org>
2013-09-13 12:25                                       ` Jack Wang
     [not found]                                         ` <5233043F.5020804-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2013-09-13 13:33                                           ` Bart Van Assche
     [not found]                                             ` <52331444.8070007-HInyCGIudOg@public.gmane.org>
2013-09-13 13:51                                               ` Jack Wang
     [not found]                                                 ` <52331854.9010607-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2013-09-13 14:03                                                   ` Bart Van Assche
     [not found]                                                     ` <52331B47.9070202-HInyCGIudOg@public.gmane.org>
2013-09-13 14:15                                                       ` Jack Wang
     [not found]                                                         ` <52331E01.3060005-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2013-09-13 14:31                                                           ` Jack Wang
2013-09-13 14:31                                                           ` Bart Van Assche
2013-09-16 14:25                   ` Bart Van Assche
     [not found]                     ` <523714D8.3020104-HInyCGIudOg@public.gmane.org>
2013-09-16 14:28                       ` David Dillow
2013-09-10  2:53   ` [PATCH 0/8] IB SRP initiator patches for kernel 3.12 David Dillow
