From: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
To: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>,
Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
Sebastian Riemer
<sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>,
linux-rdma <linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: [PATCH 07/10] IB/srp: Add periodic reconnect functionality
Date: Thu, 10 Oct 2013 14:16:59 +0200 [thread overview]
Message-ID: <52569ABB.4060205@acm.org> (raw)
In-Reply-To: <52569806.9080201-HInyCGIudOg@public.gmane.org>
After a transport layer occurred, periodically try to reconnect
to the target until the dev_loss timer expires. Protect the
callback functions that can be invoked from inside the SCSI EH
against concurrent invocation with srp_reconnect_rport() via the
rport mutex. Change the default dev_loss_tmo from 60s into 600s
to give the reconnect mechanism a chance to kick in.
Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Acked-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
Documentation/ABI/stable/sysfs-transport-srp | 8 +++++
drivers/infiniband/ulp/srp/ib_srp.c | 52 ++++++++++++++++++++++++----
2 files changed, 54 insertions(+), 6 deletions(-)
diff --git a/Documentation/ABI/stable/sysfs-transport-srp b/Documentation/ABI/stable/sysfs-transport-srp
index 178a44d..21bd480 100644
--- a/Documentation/ABI/stable/sysfs-transport-srp
+++ b/Documentation/ABI/stable/sysfs-transport-srp
@@ -30,6 +30,14 @@ Contact: linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Description: 16-byte local SRP port identifier in hexadecimal format. An
example: 4c:49:4e:55:58:20:56:49:4f:00:00:00:00:00:00:00.
+What: /sys/class/srp_remote_ports/port-<h>:<n>/reconnect_delay
+Date: December 1, 2013
+KernelVersion: 3.12
+Contact: linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
+Description: Number of seconds the SCSI layer will wait after a reconnect
+ attempt failed before retrying. Setting this attribute to
+ "off" will disable time-based reconnecting.
+
What: /sys/class/srp_remote_ports/port-<h>:<n>/roles
Date: June 27, 2007
KernelVersion: 2.6.24
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index ceb84b6..4be3eb8 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -88,6 +88,11 @@ MODULE_PARM_DESC(topspin_workarounds,
static struct kernel_param_ops srp_tmo_ops;
+static int srp_reconnect_delay = 10;
+module_param_cb(reconnect_delay, &srp_tmo_ops, &srp_reconnect_delay,
+ S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(reconnect_delay, "Time between successive reconnect attempts");
+
static int srp_fast_io_fail_tmo = 15;
module_param_cb(fast_io_fail_tmo, &srp_tmo_ops, &srp_fast_io_fail_tmo,
S_IRUGO | S_IWUSR);
@@ -96,7 +101,7 @@ MODULE_PARM_DESC(fast_io_fail_tmo,
" layer error and failing all I/O. \"off\" means that this"
" functionality is disabled.");
-static int srp_dev_loss_tmo = 60;
+static int srp_dev_loss_tmo = 600;
module_param_cb(dev_loss_tmo, &srp_tmo_ops, &srp_dev_loss_tmo,
S_IRUGO | S_IWUSR);
MODULE_PARM_DESC(dev_loss_tmo,
@@ -144,10 +149,14 @@ static int srp_tmo_set(const char *val, const struct kernel_param *kp)
} else {
tmo = -1;
}
- if (kp->arg == &srp_fast_io_fail_tmo)
- res = srp_tmo_valid(tmo, srp_dev_loss_tmo);
+ if (kp->arg == &srp_reconnect_delay)
+ res = srp_tmo_valid(tmo, srp_fast_io_fail_tmo,
+ srp_dev_loss_tmo);
+ else if (kp->arg == &srp_fast_io_fail_tmo)
+ res = srp_tmo_valid(srp_reconnect_delay, tmo, srp_dev_loss_tmo);
else
- res = srp_tmo_valid(srp_fast_io_fail_tmo, tmo);
+ res = srp_tmo_valid(srp_reconnect_delay, srp_fast_io_fail_tmo,
+ tmo);
if (res)
goto out;
*(int *)kp->arg = tmo;
@@ -1426,18 +1435,29 @@ static void srp_send_completion(struct ib_cq *cq, void *target_ptr)
static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
{
struct srp_target_port *target = host_to_target(shost);
+ struct srp_rport *rport = target->rport;
struct srp_request *req;
struct srp_iu *iu;
struct srp_cmd *cmd;
struct ib_device *dev;
unsigned long flags;
int len, result;
+ const bool in_scsi_eh = !in_interrupt() && current == shost->ehandler;
+
+ /*
+ * The SCSI EH thread is the only context from which srp_queuecommand()
+ * can get invoked for blocked devices (SDEV_BLOCK /
+ * SDEV_CREATED_BLOCK). Avoid racing with srp_reconnect_rport() by
+ * locking the rport mutex if invoked from inside the SCSI EH.
+ */
+ if (in_scsi_eh)
+ mutex_lock(&rport->mutex);
result = srp_chkready(target->rport);
if (unlikely(result)) {
scmnd->result = result;
scmnd->scsi_done(scmnd);
- return 0;
+ goto unlock_rport;
}
spin_lock_irqsave(&target->lock, flags);
@@ -1482,6 +1502,10 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
goto err_unmap;
}
+unlock_rport:
+ if (in_scsi_eh)
+ mutex_unlock(&rport->mutex);
+
return 0;
err_unmap:
@@ -1496,6 +1520,9 @@ err_iu:
err_unlock:
spin_unlock_irqrestore(&target->lock, flags);
+ if (in_scsi_eh)
+ mutex_unlock(&rport->mutex);
+
return SCSI_MLQUEUE_HOST_BUSY;
}
@@ -1780,6 +1807,7 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event)
static int srp_send_tsk_mgmt(struct srp_target_port *target,
u64 req_tag, unsigned int lun, u8 func)
{
+ struct srp_rport *rport = target->rport;
struct ib_device *dev = target->srp_host->srp_dev->dev;
struct srp_iu *iu;
struct srp_tsk_mgmt *tsk_mgmt;
@@ -1789,12 +1817,20 @@ static int srp_send_tsk_mgmt(struct srp_target_port *target,
init_completion(&target->tsk_mgmt_done);
+ /*
+ * Lock the rport mutex to avoid that srp_create_target_ib() is
+ * invoked while a task management function is being sent.
+ */
+ mutex_lock(&rport->mutex);
spin_lock_irq(&target->lock);
iu = __srp_get_tx_iu(target, SRP_IU_TSK_MGMT);
spin_unlock_irq(&target->lock);
- if (!iu)
+ if (!iu) {
+ mutex_unlock(&rport->mutex);
+
return -1;
+ }
ib_dma_sync_single_for_cpu(dev, iu->dma, sizeof *tsk_mgmt,
DMA_TO_DEVICE);
@@ -1811,8 +1847,11 @@ static int srp_send_tsk_mgmt(struct srp_target_port *target,
DMA_TO_DEVICE);
if (srp_post_send(target, iu, sizeof *tsk_mgmt)) {
srp_put_tx_iu(target, iu, SRP_IU_TSK_MGMT);
+ mutex_unlock(&rport->mutex);
+
return -1;
}
+ mutex_unlock(&rport->mutex);
if (!wait_for_completion_timeout(&target->tsk_mgmt_done,
msecs_to_jiffies(SRP_ABORT_TIMEOUT_MS)))
@@ -2713,6 +2752,7 @@ static void srp_remove_one(struct ib_device *device)
static struct srp_function_template ib_srp_transport_functions = {
.has_rport_state = true,
.reset_timer_if_blocked = true,
+ .reconnect_delay = &srp_reconnect_delay,
.fast_io_fail_tmo = &srp_fast_io_fail_tmo,
.dev_loss_tmo = &srp_dev_loss_tmo,
.reconnect = srp_rport_reconnect,
--
1.8.1.4
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2013-10-10 12:16 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-10-10 12:05 [PATCH 0/10] IB SRP initiator patches for kernel 3.12 Bart Van Assche
2013-10-10 12:10 ` [PATCH 02/10] IB/srp: Keep rport as long as the IB transport layer Bart Van Assche
[not found] ` <52569806.9080201-HInyCGIudOg@public.gmane.org>
2013-10-10 12:08 ` [PATCH 01/10] IB/srp: Make transport layer retry count configurable Bart Van Assche
2013-10-10 12:12 ` [PATCH 03/10] scsi_transport_srp: Add transport layer error handling Bart Van Assche
[not found] ` <525699A9.6090802-HInyCGIudOg@public.gmane.org>
2013-10-19 16:13 ` Bart Van Assche
[not found] ` <5262AF8D.7070700-HInyCGIudOg@public.gmane.org>
2013-10-25 22:42 ` David Dillow
2013-10-26 6:30 ` Bart Van Assche
2013-10-10 12:13 ` [PATCH 04/10] IB/srp: Use SRP transport layer error recovery Bart Van Assche
2013-10-10 12:14 ` [PATCH 05/10] IB/srp: Start timers if a transport layer error occurs Bart Van Assche
2013-10-10 12:15 ` [PATCH 06/10] scsi_transport_srp: Add periodic reconnect support Bart Van Assche
2013-10-10 12:16 ` Bart Van Assche [this message]
2013-10-10 12:18 ` [PATCH 08/10] IB/srp: Export sgid to sysfs Bart Van Assche
2013-10-10 12:18 ` [PATCH 09/10] IB/srp: Introduce srp_alloc_req_data() Bart Van Assche
2013-10-10 12:19 ` [PATCH 10/10] IB/srp: Make queue size configurable Bart Van Assche
[not found] ` <52569B6A.30301-HInyCGIudOg@public.gmane.org>
2013-10-11 11:34 ` Jack Wang
2013-10-25 22:13 ` David Dillow
[not found] ` <1382739213.17280.3.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
2013-10-26 6:13 ` Bart Van Assche
2013-10-25 22:14 ` [PATCH 0/10] IB SRP initiator patches for kernel 3.12 David Dillow
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=52569ABB.4060205@acm.org \
--to=bvanassche-hinycgiudog@public.gmane.org \
--cc=dillowda-1Heg1YXhbW8@public.gmane.org \
--cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
--cc=sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org \
--cc=vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.