* [PATCH 00/11] First pass at merging Bart's HA work
@ 2012-11-26 4:44 David Dillow
2012-11-26 4:44 ` [PATCH 01/11] IB/srp: enlarge block layer timeout David Dillow
` (9 more replies)
0 siblings, 10 replies; 42+ messages in thread
From: David Dillow @ 2012-11-26 4:44 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: linux-scsi-u79uwXL29TY76Z2rM5mHXA, bvanassche-HInyCGIudOg,
roland-BHEL68pLQRGGvPXPguhicg
Here is a first, UNTESTED, pass at preparing a merge of Bart's SRP HA
work to upstream. It is not complete, as I have not yet added the
transport layer error handling and related patches. It is also currently
missing the patch to maintain a single connection for an I_T nexus.
I swapped Ishai's code to recreate QP/CQs for each connection, as that
adds recovery from fatal hardware errors, and reduces the code needed to
avoid stale completions. Similarly, Vu's patch slightly reduces reconnect
times.
Target blocking/unblocking hasn't been added from Bart's patch that
removes the SRP_TARGET_CONNECTING state; I think it is probably the right
thing to do, but want to think it over a bit more.
Similarly, I'm looking for a clean way to start the reconnection effort
as soon as we get an error through the CQ -- we know that any pending
commands are lost to us at that point, so we should be able to kick them
back upstream quickly. This should allow multipath to reissue them on one
of the remaining good paths.
This series compiles, but is otherwise UNTESTED. I'll be working on that
over the next few days, with an eye on getting as much of Bart's work
into 3.8 as possible.
One may also pull this series from github:
git pull git://github.com/dillow/srp-initiator.git ha-merge-v1
Bart Van Assche (7):
IB/srp: enlarge block layer timeout
IB/srp: keep processing commands during host removal
IB/srp: Document sysfs attributes
srp_transport: Fix attribute registration
srp_transport: Simplify attribute initialization code
srp_transport: Document sysfs attributes
IB/srp: Allow SRP disconnect through sysfs
David Dillow (2):
IB/srp: simplify state tracking
IB/srp: don't send anything on a bad QP
Ishai Rabinovitz (1):
IB/srp: destroy and recreate QP and CQs on each connection
Vu Pham (1):
IB/srp: send disconnect request without waiting for CM timewait exit
Documentation/ABI/stable/sysfs-driver-ib_srp | 156 +++++++++++++++
Documentation/ABI/stable/sysfs-transport-srp | 19 ++
drivers/infiniband/ulp/srp/ib_srp.c | 274 +++++++++++++++-----------
drivers/infiniband/ulp/srp/ib_srp.h | 13 +-
drivers/scsi/scsi_transport_srp.c | 51 +++---
include/scsi/scsi_transport_srp.h | 8 +
6 files changed, 378 insertions(+), 143 deletions(-)
create mode 100644 Documentation/ABI/stable/sysfs-driver-ib_srp
create mode 100644 Documentation/ABI/stable/sysfs-transport-srp
--
1.7.7.6
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 42+ messages in thread
* [PATCH 01/11] IB/srp: enlarge block layer timeout
2012-11-26 4:44 [PATCH 00/11] First pass at merging Bart's HA work David Dillow
@ 2012-11-26 4:44 ` David Dillow
2012-11-26 4:44 ` [PATCH 02/11] IB/srp: simplify state tracking David Dillow
` (8 subsequent siblings)
9 siblings, 0 replies; 42+ messages in thread
From: David Dillow @ 2012-11-26 4:44 UTC (permalink / raw)
To: linux-rdma; +Cc: linux-scsi, bvanassche, roland
From: Bart Van Assche <bvanassche@acm.org>
Enlarge the block layer timeout for disks such that it is above
the InfiniBand transport layer timeout.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David Dillow <dillowda@ornl.gov>
---
drivers/infiniband/ulp/srp/ib_srp.c | 45 +++++++++++++++++++++++++++++++++++
drivers/infiniband/ulp/srp/ib_srp.h | 2 +
2 files changed, 47 insertions(+), 0 deletions(-)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 922d845..6c0cd66 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1419,6 +1419,33 @@ err:
return -ENOMEM;
}
+static uint32_t srp_compute_rq_tmo(struct ib_qp_attr *qp_attr, int attr_mask)
+{
+ uint64_t T_tr_ns, max_compl_time_ms;
+ uint32_t rq_tmo_jiffies;
+
+ /*
+ * According to section 11.2.4.2 in the IBTA spec (Modify Queue Pair,
+ * table 91), both the QP timeout and the retry count have to be set
+ * for RC QP's during the RTR to RTS transition.
+ */
+ WARN_ON((attr_mask & (IB_QP_TIMEOUT | IB_QP_RETRY_CNT)) !=
+ (IB_QP_TIMEOUT | IB_QP_RETRY_CNT));
+
+ /*
+ * Set target->rq_tmo_jiffies to one second more than the largest time
+ * it can take before an error completion is generated. See also
+ * C9-140..142 in the IBTA spec for more information about how to
+ * convert the QP Local ACK Timeout value to nanoseconds.
+ */
+ T_tr_ns = 4096 * (1ULL << qp_attr->timeout);
+ max_compl_time_ms = qp_attr->retry_cnt * 4 * T_tr_ns;
+ do_div(max_compl_time_ms, NSEC_PER_MSEC);
+ rq_tmo_jiffies = msecs_to_jiffies(max_compl_time_ms + 1000);
+
+ return rq_tmo_jiffies;
+}
+
static void srp_cm_rep_handler(struct ib_cm_id *cm_id,
struct srp_login_rsp *lrsp,
struct srp_target_port *target)
@@ -1478,6 +1505,8 @@ static void srp_cm_rep_handler(struct ib_cm_id *cm_id,
if (ret)
goto error_free;
+ target->rq_tmo_jiffies = srp_compute_rq_tmo(qp_attr, attr_mask);
+
ret = ib_modify_qp(target->qp, qp_attr, attr_mask);
if (ret)
goto error_free;
@@ -1729,6 +1758,21 @@ static int srp_reset_host(struct scsi_cmnd *scmnd)
return ret;
}
+static int srp_slave_configure(struct scsi_device *sdev)
+{
+ struct Scsi_Host *shost = sdev->host;
+ struct srp_target_port *target = host_to_target(shost);
+ struct request_queue *q = sdev->request_queue;
+ unsigned long timeout;
+
+ if (sdev->type == TYPE_DISK) {
+ timeout = max_t(unsigned, 30 * HZ, target->rq_tmo_jiffies);
+ blk_queue_rq_timeout(q, timeout);
+ }
+
+ return 0;
+}
+
static ssize_t show_id_ext(struct device *dev, struct device_attribute *attr,
char *buf)
{
@@ -1861,6 +1905,7 @@ static struct scsi_host_template srp_template = {
.module = THIS_MODULE,
.name = "InfiniBand SRP initiator",
.proc_name = DRV_NAME,
+ .slave_configure = srp_slave_configure,
.info = srp_target_info,
.queuecommand = srp_queuecommand,
.eh_abort_handler = srp_abort,
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index 020caf0..e3a6304 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -163,6 +163,8 @@ struct srp_target_port {
struct ib_sa_query *path_query;
int path_query_id;
+ u32 rq_tmo_jiffies;
+
struct ib_cm_id *cm_id;
int max_ti_iu_len;
--
1.7.7.6
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH 02/11] IB/srp: simplify state tracking
2012-11-26 4:44 [PATCH 00/11] First pass at merging Bart's HA work David Dillow
2012-11-26 4:44 ` [PATCH 01/11] IB/srp: enlarge block layer timeout David Dillow
@ 2012-11-26 4:44 ` David Dillow
2012-11-26 9:46 ` Bart Van Assche
2012-11-26 4:44 ` [PATCH 05/11] IB/srp: destroy and recreate QP and CQs on each connection David Dillow
` (7 subsequent siblings)
9 siblings, 1 reply; 42+ messages in thread
From: David Dillow @ 2012-11-26 4:44 UTC (permalink / raw)
To: linux-rdma; +Cc: linux-scsi, bvanassche, roland
The state of the target has several conditions that overlap, making it
easier to model as a bit-field of exceptional conditions rather than an
enum of all possible states.
Bart Van Assche did the hard work of identifying the states that can be
removed, and did a first patch that consolidated the state space.
Needs-to-be-signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David Dillow <dillowda@ornl.gov>
---
drivers/infiniband/ulp/srp/ib_srp.c | 146 +++++++++++++++++------------------
drivers/infiniband/ulp/srp/ib_srp.h | 11 +--
2 files changed, 76 insertions(+), 81 deletions(-)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 6c0cd66..1e8ce81 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -102,6 +102,34 @@ static struct ib_client srp_client = {
static struct ib_sa_client srp_sa_client;
+static inline void srp_mark_connected(struct srp_target_port *target)
+{
+ smp_mb__before_clear_bit();
+ clear_bit(SRP_STATE_ERROR, &target->state);
+ clear_bit(SRP_STATE_DISCONNECTED, &target->state);
+ smp_mb__after_clear_bit();
+}
+
+static inline bool srp_mark_disconnected(struct srp_target_port *target)
+{
+ return test_and_set_bit(SRP_STATE_DISCONNECTED, &target->state);
+}
+
+static inline bool srp_is_disconnected(struct srp_target_port *target)
+{
+ return ACCESS_ONCE(target->state) & (1UL << SRP_STATE_DISCONNECTED);
+}
+
+static inline bool srp_is_removed(struct srp_target_port *target)
+{
+ return ACCESS_ONCE(target->state) & (1UL << SRP_STATE_REMOVED);
+}
+
+static inline bool srp_in_error(struct srp_target_port *target)
+{
+ return ACCESS_ONCE(target->state) & (1UL << SRP_STATE_ERROR);
+}
+
static inline struct srp_target_port *host_to_target(struct Scsi_Host *host)
{
return (struct srp_target_port *) host->hostdata;
@@ -430,6 +458,9 @@ static int srp_send_req(struct srp_target_port *target)
static void srp_disconnect_target(struct srp_target_port *target)
{
+ if (srp_mark_disconnected(target))
+ return;
+
/* XXX should send SRP_I_LOGOUT request */
init_completion(&target->done);
@@ -441,21 +472,6 @@ static void srp_disconnect_target(struct srp_target_port *target)
wait_for_completion(&target->done);
}
-static bool srp_change_state(struct srp_target_port *target,
- enum srp_target_state old,
- enum srp_target_state new)
-{
- bool changed = false;
-
- spin_lock_irq(&target->lock);
- if (target->state == old) {
- target->state = new;
- changed = true;
- }
- spin_unlock_irq(&target->lock);
- return changed;
-}
-
static void srp_free_req_data(struct srp_target_port *target)
{
struct ib_device *ibdev = target->srp_host->srp_dev->dev;
@@ -494,9 +510,6 @@ static void srp_remove_work(struct work_struct *work)
struct srp_target_port *target =
container_of(work, struct srp_target_port, work);
- if (!srp_change_state(target, SRP_TARGET_DEAD, SRP_TARGET_REMOVED))
- return;
-
spin_lock(&target->srp_host->target_lock);
list_del(&target->list);
spin_unlock(&target->srp_host->target_lock);
@@ -504,12 +517,26 @@ static void srp_remove_work(struct work_struct *work)
srp_del_scsi_host_attr(target->scsi_host);
srp_remove_host(target->scsi_host);
scsi_remove_host(target->scsi_host);
+
+ /*
+ * Now that we've removed our SCSI host, we will not see new requests
+ * from the mid-layer. We can now clean up our IB resources.
+ */
+ srp_disconnect_target(target);
ib_destroy_cm_id(target->cm_id);
srp_free_target_ib(target);
srp_free_req_data(target);
scsi_host_put(target->scsi_host);
}
+static void srp_queued_remove(struct srp_target_port *target)
+{
+ if (!test_and_set_bit(SRP_STATE_REMOVED, &target->state)) {
+ INIT_WORK(&target->work, srp_remove_work);
+ queue_work(ib_wq, &target->work);
+ }
+}
+
static int srp_connect_target(struct srp_target_port *target)
{
int retries = 3;
@@ -650,7 +677,7 @@ static int srp_reconnect_target(struct srp_target_port *target)
struct ib_wc wc;
int i, ret;
- if (!srp_change_state(target, SRP_TARGET_LIVE, SRP_TARGET_CONNECTING))
+ if (srp_is_removed(target))
return -EAGAIN;
srp_disconnect_target(target);
@@ -686,13 +713,11 @@ static int srp_reconnect_target(struct srp_target_port *target)
for (i = 0; i < SRP_SQ_SIZE; ++i)
list_add(&target->tx_ring[i]->list, &target->free_tx);
- target->qp_in_error = 0;
ret = srp_connect_target(target);
if (ret)
goto err;
- if (!srp_change_state(target, SRP_TARGET_CONNECTING, SRP_TARGET_LIVE))
- ret = -EAGAIN;
+ srp_mark_connected(target);
return ret;
@@ -705,17 +730,8 @@ err:
* However, we have to defer the real removal because we
* are in the context of the SCSI error handler now, which
* will deadlock if we call scsi_remove_host().
- *
- * Schedule our work inside the lock to avoid a race with
- * the flush_scheduled_work() in srp_remove_one().
*/
- spin_lock_irq(&target->lock);
- if (target->state == SRP_TARGET_CONNECTING) {
- target->state = SRP_TARGET_DEAD;
- INIT_WORK(&target->work, srp_remove_work);
- queue_work(ib_wq, &target->work);
- }
- spin_unlock_irq(&target->lock);
+ srp_queued_remove(target);
return ret;
}
@@ -1273,7 +1289,7 @@ static void srp_recv_completion(struct ib_cq *cq, void *target_ptr)
shost_printk(KERN_ERR, target->scsi_host,
PFX "failed receive status %d\n",
wc.status);
- target->qp_in_error = 1;
+ set_bit(SRP_STATE_ERROR, &target->state);
break;
}
@@ -1292,7 +1308,7 @@ static void srp_send_completion(struct ib_cq *cq, void *target_ptr)
shost_printk(KERN_ERR, target->scsi_host,
PFX "failed send status %d\n",
wc.status);
- target->qp_in_error = 1;
+ set_bit(SRP_STATE_ERROR, &target->state);
break;
}
@@ -1311,14 +1327,15 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
unsigned long flags;
int len;
- if (target->state == SRP_TARGET_CONNECTING)
- goto err;
+ if (unlikely(target->state)) {
+ if (srp_is_disconnected(target))
+ goto err;
- if (target->state == SRP_TARGET_DEAD ||
- target->state == SRP_TARGET_REMOVED) {
- scmnd->result = DID_BAD_TARGET << 16;
- scmnd->scsi_done(scmnd);
- return 0;
+ if (srp_is_removed(target)) {
+ scmnd->result = DID_BAD_TARGET << 16;
+ scmnd->scsi_done(scmnd);
+ return 0;
+ }
}
spin_lock_irqsave(&target->lock, flags);
@@ -1628,6 +1645,7 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event)
case IB_CM_DREQ_RECEIVED:
shost_printk(KERN_WARNING, target->scsi_host,
PFX "DREQ received - connection closed\n");
+ srp_mark_disconnected(target);
if (ib_send_cm_drep(cm_id, NULL, 0))
shost_printk(KERN_ERR, target->scsi_host,
PFX "Sending CM DREP failed\n");
@@ -1665,8 +1683,7 @@ static int srp_send_tsk_mgmt(struct srp_target_port *target,
struct srp_iu *iu;
struct srp_tsk_mgmt *tsk_mgmt;
- if (target->state == SRP_TARGET_DEAD ||
- target->state == SRP_TARGET_REMOVED)
+ if (srp_is_removed(target))
return -1;
init_completion(&target->tsk_mgmt_done);
@@ -1710,7 +1727,9 @@ static int srp_abort(struct scsi_cmnd *scmnd)
shost_printk(KERN_ERR, target->scsi_host, "SRP abort called\n");
- if (!req || target->qp_in_error || !srp_claim_req(target, req, scmnd))
+ if (srp_in_error(target))
+ return FAILED;
+ if (!req || !srp_claim_req(target, req, scmnd))
return FAILED;
srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
SRP_TSK_ABORT_TASK);
@@ -1728,7 +1747,7 @@ static int srp_reset_device(struct scsi_cmnd *scmnd)
shost_printk(KERN_ERR, target->scsi_host, "SRP reset_device called\n");
- if (target->qp_in_error)
+ if (srp_in_error(target))
return FAILED;
if (srp_send_tsk_mgmt(target, SRP_TAG_NO_REQ, scmnd->device->lun,
SRP_TSK_LUN_RESET))
@@ -1943,7 +1962,7 @@ static int srp_add_target(struct srp_host *host, struct srp_target_port *target)
list_add_tail(&target->list, &host->target_list);
spin_unlock(&host->target_lock);
- target->state = SRP_TARGET_LIVE;
+ srp_mark_connected(target);
scsi_scan_target(&target->scsi_host->shost_gendev,
0, target->scsi_id, SCAN_WILD_CARD, 0);
@@ -2207,6 +2226,7 @@ static ssize_t srp_create_target(struct device *dev,
target = host_to_target(target_host);
+ target->state = 1UL << SRP_STATE_DISCONNECTED;
target->io_class = SRP_REV16A_IB_IO_CLASS;
target->scsi_host = target_host;
target->srp_host = host;
@@ -2277,7 +2297,6 @@ static ssize_t srp_create_target(struct device *dev,
if (ret)
goto err_free_ib;
- target->qp_in_error = 0;
ret = srp_connect_target(target);
if (ret) {
shost_printk(KERN_ERR, target->scsi_host,
@@ -2467,8 +2486,7 @@ static void srp_remove_one(struct ib_device *device)
{
struct srp_device *srp_dev;
struct srp_host *host, *tmp_host;
- LIST_HEAD(target_list);
- struct srp_target_port *target, *tmp_target;
+ struct srp_target_port *target;
srp_dev = ib_get_client_data(device, &srp_client);
@@ -2481,36 +2499,14 @@ static void srp_remove_one(struct ib_device *device)
wait_for_completion(&host->released);
/*
- * Mark all target ports as removed, so we stop queueing
- * commands and don't try to reconnect.
+ * Initiate removal of our targets, and wait for that to
+ * complete.
*/
spin_lock(&host->target_lock);
- list_for_each_entry(target, &host->target_list, list) {
- spin_lock_irq(&target->lock);
- target->state = SRP_TARGET_REMOVED;
- spin_unlock_irq(&target->lock);
- }
+ list_for_each_entry(target, &host->target_list, list)
+ srp_queued_remove(target);
spin_unlock(&host->target_lock);
-
- /*
- * Wait for any reconnection tasks that may have
- * started before we marked our target ports as
- * removed, and any target port removal tasks.
- */
flush_workqueue(ib_wq);
-
- list_for_each_entry_safe(target, tmp_target,
- &host->target_list, list) {
- srp_del_scsi_host_attr(target->scsi_host);
- srp_remove_host(target->scsi_host);
- scsi_remove_host(target->scsi_host);
- srp_disconnect_target(target);
- ib_destroy_cm_id(target->cm_id);
- srp_free_target_ib(target);
- srp_free_req_data(target);
- scsi_host_put(target->scsi_host);
- }
-
kfree(host);
}
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index e3a6304..750ba47 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -78,11 +78,10 @@ enum {
SRP_MAP_NO_FMR = 1,
};
-enum srp_target_state {
- SRP_TARGET_LIVE,
- SRP_TARGET_CONNECTING,
- SRP_TARGET_DEAD,
- SRP_TARGET_REMOVED
+enum srp_state_bits {
+ SRP_STATE_REMOVED = 0,
+ SRP_STATE_DISCONNECTED = 1,
+ SRP_STATE_ERROR = 2,
};
enum srp_iu_type {
@@ -137,7 +136,7 @@ struct srp_target_port {
struct ib_qp *qp;
u32 lkey;
u32 rkey;
- enum srp_target_state state;
+ unsigned long state;
unsigned int max_iu_len;
unsigned int cmd_sg_cnt;
unsigned int indirect_size;
--
1.7.7.6
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH 03/11] IB/srp: don't send anything on a bad QP
[not found] ` <cover.1353903448.git.dillowda-1Heg1YXhbW8@public.gmane.org>
@ 2012-11-26 4:44 ` David Dillow
2012-11-26 9:17 ` Bart Van Assche
2012-11-26 4:44 ` [PATCH 04/11] IB/srp: keep processing commands during host removal David Dillow
` (3 subsequent siblings)
4 siblings, 1 reply; 42+ messages in thread
From: David Dillow @ 2012-11-26 4:44 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: linux-scsi-u79uwXL29TY76Z2rM5mHXA, bvanassche-HInyCGIudOg,
roland-BHEL68pLQRGGvPXPguhicg
Once we know we have an issue with the QP, there is no point trying to
send anything else down the pipe. This also allows us to consolidate
code in the SCSI EH path.
Needs-to-be-signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
[ adapted to new state tracking code ]
Signed-off-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
---
drivers/infiniband/ulp/srp/ib_srp.c | 21 ++++++++-------------
1 files changed, 8 insertions(+), 13 deletions(-)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 1e8ce81..2951e1c 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1328,14 +1328,12 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
int len;
if (unlikely(target->state)) {
- if (srp_is_disconnected(target))
+ if (!srp_is_removed(target))
goto err;
- if (srp_is_removed(target)) {
- scmnd->result = DID_BAD_TARGET << 16;
- scmnd->scsi_done(scmnd);
- return 0;
- }
+ scmnd->result = DID_BAD_TARGET << 16;
+ scmnd->scsi_done(scmnd);
+ return 0;
}
spin_lock_irqsave(&target->lock, flags);
@@ -1683,7 +1681,7 @@ static int srp_send_tsk_mgmt(struct srp_target_port *target,
struct srp_iu *iu;
struct srp_tsk_mgmt *tsk_mgmt;
- if (srp_is_removed(target))
+ if (target->state)
return -1;
init_completion(&target->tsk_mgmt_done);
@@ -1727,12 +1725,11 @@ static int srp_abort(struct scsi_cmnd *scmnd)
shost_printk(KERN_ERR, target->scsi_host, "SRP abort called\n");
- if (srp_in_error(target))
- return FAILED;
if (!req || !srp_claim_req(target, req, scmnd))
return FAILED;
- srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
- SRP_TSK_ABORT_TASK);
+ if (srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
+ SRP_TSK_ABORT_TASK))
+ return FAILED;
srp_free_req(target, req, scmnd, 0);
scmnd->result = DID_ABORT << 16;
scmnd->scsi_done(scmnd);
@@ -1747,8 +1744,6 @@ static int srp_reset_device(struct scsi_cmnd *scmnd)
shost_printk(KERN_ERR, target->scsi_host, "SRP reset_device called\n");
- if (srp_in_error(target))
- return FAILED;
if (srp_send_tsk_mgmt(target, SRP_TAG_NO_REQ, scmnd->device->lun,
SRP_TSK_LUN_RESET))
return FAILED;
--
1.7.7.6
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH 04/11] IB/srp: keep processing commands during host removal
[not found] ` <cover.1353903448.git.dillowda-1Heg1YXhbW8@public.gmane.org>
2012-11-26 4:44 ` [PATCH 03/11] IB/srp: don't send anything on a bad QP David Dillow
@ 2012-11-26 4:44 ` David Dillow
[not found] ` <8715294a23dded5879b3a327c470d9b6a39ddbc4.1353903448.git.dillowda-1Heg1YXhbW8@public.gmane.org>
2012-11-26 4:44 ` [PATCH 08/11] srp_transport: Fix attribute registration David Dillow
` (2 subsequent siblings)
4 siblings, 1 reply; 42+ messages in thread
From: David Dillow @ 2012-11-26 4:44 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: linux-scsi-u79uwXL29TY76Z2rM5mHXA, bvanassche-HInyCGIudOg,
roland-BHEL68pLQRGGvPXPguhicg
From: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Some SCSI upper layer drivers, e.g. sd, issue SCSI commands from
inside scsi_remove_host() (see also the sd_shutdown() call in
sd_remove()). Make sure that these commands have a chance to reach
the SCSI device.
Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
[ adapted to new state tracking ]
Signed-off-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
---
drivers/infiniband/ulp/srp/ib_srp.c | 11 ++++++-----
1 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 2951e1c..f7d7e6a 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1328,12 +1328,13 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
int len;
if (unlikely(target->state)) {
- if (!srp_is_removed(target))
+ /*
+ * Only requeue commands if we cannot send them to the target.
+ * We'll let commands through during shutdown so that caches
+ * get flushed, etc.
+ */
+ if (srp_is_disconnected(target) || srp_in_error(target))
goto err;
-
- scmnd->result = DID_BAD_TARGET << 16;
- scmnd->scsi_done(scmnd);
- return 0;
}
spin_lock_irqsave(&target->lock, flags);
--
1.7.7.6
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH 05/11] IB/srp: destroy and recreate QP and CQs on each connection
2012-11-26 4:44 [PATCH 00/11] First pass at merging Bart's HA work David Dillow
2012-11-26 4:44 ` [PATCH 01/11] IB/srp: enlarge block layer timeout David Dillow
2012-11-26 4:44 ` [PATCH 02/11] IB/srp: simplify state tracking David Dillow
@ 2012-11-26 4:44 ` David Dillow
[not found] ` <8fa9a268ec4dc587970161efe94968f3263aad3b.1353903448.git.dillowda-1Heg1YXhbW8@public.gmane.org>
2012-11-26 4:44 ` [PATCH 06/11] IB/srp: send disconnect request without waiting for CM timewait exit David Dillow
` (6 subsequent siblings)
9 siblings, 1 reply; 42+ messages in thread
From: David Dillow @ 2012-11-26 4:44 UTC (permalink / raw)
To: linux-rdma
Cc: linux-scsi, bvanassche, roland, Ishai Rabinovitz,
Michael S. Tsirkin
From: Ishai Rabinovitz <ishai@mellanox.co.il>
HW QP FATAL errors persist over a reset operation, but we can recover
from that by recreating the QP and associated CQs for each connection.
Creating a new QP/CQ also completely forecloses any possibility of
getting stale completions or packets on the new connection.
Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
[ updated to current code from OFED, cleaned up commit message ]
Signed-off-by: David Dillow <dillowda@ornl.gov>
---
drivers/infiniband/ulp/srp/ib_srp.c | 66 ++++++++++++++++++----------------
1 files changed, 35 insertions(+), 31 deletions(-)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index f7d7e6a..a481727 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -250,27 +250,29 @@ static int srp_new_cm_id(struct srp_target_port *target)
static int srp_create_target_ib(struct srp_target_port *target)
{
struct ib_qp_init_attr *init_attr;
+ struct ib_cq *recv_cq, *send_cq;
+ struct ib_qp *qp;
int ret;
init_attr = kzalloc(sizeof *init_attr, GFP_KERNEL);
if (!init_attr)
return -ENOMEM;
- target->recv_cq = ib_create_cq(target->srp_host->srp_dev->dev,
- srp_recv_completion, NULL, target, SRP_RQ_SIZE, 0);
- if (IS_ERR(target->recv_cq)) {
- ret = PTR_ERR(target->recv_cq);
+ recv_cq = ib_create_cq(target->srp_host->srp_dev->dev,
+ srp_recv_completion, NULL, target, SRP_RQ_SIZE, 0);
+ if (IS_ERR(recv_cq)) {
+ ret = PTR_ERR(recv_cq);
goto err;
}
- target->send_cq = ib_create_cq(target->srp_host->srp_dev->dev,
- srp_send_completion, NULL, target, SRP_SQ_SIZE, 0);
- if (IS_ERR(target->send_cq)) {
- ret = PTR_ERR(target->send_cq);
+ send_cq = ib_create_cq(target->srp_host->srp_dev->dev,
+ srp_send_completion, NULL, target, SRP_SQ_SIZE, 0);
+ if (IS_ERR(send_cq)) {
+ ret = PTR_ERR(send_cq);
goto err_recv_cq;
}
- ib_req_notify_cq(target->recv_cq, IB_CQ_NEXT_COMP);
+ ib_req_notify_cq(recv_cq, IB_CQ_NEXT_COMP);
init_attr->event_handler = srp_qp_event;
init_attr->cap.max_send_wr = SRP_SQ_SIZE;
@@ -279,30 +281,41 @@ static int srp_create_target_ib(struct srp_target_port *target)
init_attr->cap.max_send_sge = 1;
init_attr->sq_sig_type = IB_SIGNAL_ALL_WR;
init_attr->qp_type = IB_QPT_RC;
- init_attr->send_cq = target->send_cq;
- init_attr->recv_cq = target->recv_cq;
+ init_attr->send_cq = send_cq;
+ init_attr->recv_cq = recv_cq;
- target->qp = ib_create_qp(target->srp_host->srp_dev->pd, init_attr);
- if (IS_ERR(target->qp)) {
- ret = PTR_ERR(target->qp);
+ qp = ib_create_qp(target->srp_host->srp_dev->pd, init_attr);
+ if (IS_ERR(qp)) {
+ ret = PTR_ERR(qp);
goto err_send_cq;
}
- ret = srp_init_qp(target, target->qp);
+ ret = srp_init_qp(target, qp);
if (ret)
goto err_qp;
+ if (target->qp)
+ ib_destroy_qp(target->qp);
+ if (target->recv_cq)
+ ib_destroy_cq(target->recv_cq);
+ if (target->send_cq)
+ ib_destroy_cq(target->send_cq);
+
+ target->qp = qp;
+ target->recv_cq = recv_cq;
+ target->send_cq = send_cq;
+
kfree(init_attr);
return 0;
err_qp:
- ib_destroy_qp(target->qp);
+ ib_destroy_qp(qp);
err_send_cq:
- ib_destroy_cq(target->send_cq);
+ ib_destroy_cq(send_cq);
err_recv_cq:
- ib_destroy_cq(target->recv_cq);
+ ib_destroy_cq(recv_cq);
err:
kfree(init_attr);
@@ -317,6 +330,9 @@ static void srp_free_target_ib(struct srp_target_port *target)
ib_destroy_cq(target->send_cq);
ib_destroy_cq(target->recv_cq);
+ target->qp = NULL;
+ target->send_cq = target->recv_cq = NULL;
+
for (i = 0; i < SRP_RQ_SIZE; ++i)
srp_free_iu(target->srp_host, target->rx_ring[i]);
for (i = 0; i < SRP_SQ_SIZE; ++i)
@@ -673,8 +689,6 @@ static void srp_reset_req(struct srp_target_port *target, struct srp_request *re
static int srp_reconnect_target(struct srp_target_port *target)
{
- struct ib_qp_attr qp_attr;
- struct ib_wc wc;
int i, ret;
if (srp_is_removed(target))
@@ -689,20 +703,10 @@ static int srp_reconnect_target(struct srp_target_port *target)
if (ret)
goto err;
- qp_attr.qp_state = IB_QPS_RESET;
- ret = ib_modify_qp(target->qp, &qp_attr, IB_QP_STATE);
- if (ret)
- goto err;
-
- ret = srp_init_qp(target, target->qp);
+ ret = srp_create_target_ib(target);
if (ret)
goto err;
- while (ib_poll_cq(target->recv_cq, 1, &wc) > 0)
- ; /* nothing */
- while (ib_poll_cq(target->send_cq, 1, &wc) > 0)
- ; /* nothing */
-
for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
struct srp_request *req = &target->req_ring[i];
if (req->scmnd)
--
1.7.7.6
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH 06/11] IB/srp: send disconnect request without waiting for CM timewait exit
2012-11-26 4:44 [PATCH 00/11] First pass at merging Bart's HA work David Dillow
` (2 preceding siblings ...)
2012-11-26 4:44 ` [PATCH 05/11] IB/srp: destroy and recreate QP and CQs on each connection David Dillow
@ 2012-11-26 4:44 ` David Dillow
2012-11-26 4:44 ` [PATCH 07/11] IB/srp: Document sysfs attributes David Dillow
` (5 subsequent siblings)
9 siblings, 0 replies; 42+ messages in thread
From: David Dillow @ 2012-11-26 4:44 UTC (permalink / raw)
To: linux-rdma; +Cc: linux-scsi, Vu Pham, bvanassche, roland
From: Vu Pham <vu@mellanox.com>
From: Vu Pham <vu@mellanox.com>
Now that SRP recreates the CM id, QP, and CQ for each connection,
there is no need to wait for the timewait state to complete.
Signed-off-by: Vu Pham <vu@mellanox.com>
Signed-off-by: David Dillow <dillowda@ornl.gov>
---
drivers/infiniband/ulp/srp/ib_srp.c | 3 ---
1 files changed, 0 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index a481727..c0cc245 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -479,13 +479,11 @@ static void srp_disconnect_target(struct srp_target_port *target)
/* XXX should send SRP_I_LOGOUT request */
- init_completion(&target->done);
if (ib_send_cm_dreq(target->cm_id, NULL, 0)) {
shost_printk(KERN_DEBUG, target->scsi_host,
PFX "Sending CM DREQ failed\n");
return;
}
- wait_for_completion(&target->done);
}
static void srp_free_req_data(struct srp_target_port *target)
@@ -1658,7 +1656,6 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event)
shost_printk(KERN_ERR, target->scsi_host,
PFX "connection closed\n");
- comp = 1;
target->status = 0;
break;
--
1.7.7.6
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH 07/11] IB/srp: Document sysfs attributes
2012-11-26 4:44 [PATCH 00/11] First pass at merging Bart's HA work David Dillow
` (3 preceding siblings ...)
2012-11-26 4:44 ` [PATCH 06/11] IB/srp: send disconnect request without waiting for CM timewait exit David Dillow
@ 2012-11-26 4:44 ` David Dillow
[not found] ` <cover.1353903448.git.dillowda-1Heg1YXhbW8@public.gmane.org>
` (4 subsequent siblings)
9 siblings, 0 replies; 42+ messages in thread
From: David Dillow @ 2012-11-26 4:44 UTC (permalink / raw)
To: linux-rdma; +Cc: linux-scsi, bvanassche, roland
From: Bart Van Assche <bvanassche@acm.org>
Document the sysfs attributes of the SRP initiator according
to the rules specified in Documentation/ABI/README.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David Dillow <dillowda@ornl.gov>
---
Documentation/ABI/stable/sysfs-driver-ib_srp | 156 ++++++++++++++++++++++++++
1 files changed, 156 insertions(+), 0 deletions(-)
create mode 100644 Documentation/ABI/stable/sysfs-driver-ib_srp
diff --git a/Documentation/ABI/stable/sysfs-driver-ib_srp b/Documentation/ABI/stable/sysfs-driver-ib_srp
new file mode 100644
index 0000000..481aae9
--- /dev/null
+++ b/Documentation/ABI/stable/sysfs-driver-ib_srp
@@ -0,0 +1,156 @@
+What: /sys/class/infiniband_srp/srp-<hca>-<port_number>/add_target
+Date: January 2, 2006
+KernelVersion: 2.6.15
+Contact: linux-rdma@vger.kernel.org
+Description: Interface for making ib_srp connect to a new target.
+ One can request ib_srp to connect to a new target by writing
+ a comma-separated list of login parameters to this sysfs
+ attribute. The supported parameters are:
+ * id_ext, a 16-digit hexadecimal number specifying the eight
+ byte identifier extension in the 16-byte SRP target port
+ identifier. The target port identifier is sent by ib_srp
+ to the target in the SRP_LOGIN_REQ request.
+ * ioc_guid, a 16-digit hexadecimal number specifying the eight
+ byte I/O controller GUID portion of the 16-byte target port
+ identifier.
+ * dgid, a 32-digit hexadecimal number specifying the
+ destination GID.
+ * pkey, a four-digit hexadecimal number specifying the
+ InfiniBand partition key.
+ * service_id, a 16-digit hexadecimal number specifying the
+ InfiniBand service ID used to establish communication with
+ the SRP target. How to find out the value of the service ID
+ is specified in the documentation of the SRP target.
+ * max_sect, a decimal number specifying the maximum number of
+ 512-byte sectors to be transferred via a single SCSI command.
+ * max_cmd_per_lun, a decimal number specifying the maximum
+ number of outstanding commands for a single LUN.
+ * io_class, a hexadecimal number specifying the SRP I/O class.
+ Must be either 0xff00 (rev 10) or 0x0100 (rev 16a). The I/O
+ class defines the format of the SRP initiator and target
+ port identifiers.
+ * initiator_ext, a 16-digit hexadecimal number specifying the
+ identifier extension portion of the SRP initiator port
+ identifier. This data is sent by the initiator to the target
+ in the SRP_LOGIN_REQ request.
+ * cmd_sg_entries, a number in the range 1..255 that specifies
+ the maximum number of data buffer descriptors stored in the
+ SRP_CMD information unit itself. With allow_ext_sg=0 the
+ parameter cmd_sg_entries defines the maximum S/G list length
+ for a single SRP_CMD, and commands whose S/G list length
+ exceeds this limit after S/G list collapsing will fail.
+ * allow_ext_sg, whether ib_srp is allowed to include a partial
+ memory descriptor list in an SRP_CMD instead of the entire
+ list. If a partial memory descriptor list has been included
+ in an SRP_CMD the remaining memory descriptors are
+ communicated from initiator to target via an additional RDMA
+ transfer. Setting allow_ext_sg to 1 increases the maximum
+ amount of data that can be transferred between initiator and
+ target via a single SCSI command. Since not all SRP target
+ implementations support partial memory descriptor lists the
+ default value for this option is 0.
+ * sg_tablesize, a number in the range 1..2048 specifying the
+ maximum S/G list length the SCSI layer is allowed to pass to
+ ib_srp. Specifying a value that exceeds cmd_sg_entries is
+ only safe with partial memory descriptor list support enabled
+ (allow_ext_sg=1).
+
+What: /sys/class/infiniband_srp/srp-<hca>-<port_number>/ibdev
+Date: January 2, 2006
+KernelVersion: 2.6.15
+Contact: linux-rdma@vger.kernel.org
+Description: HCA name (<hca>).
+
+What: /sys/class/infiniband_srp/srp-<hca>-<port_number>/port
+Date: January 2, 2006
+KernelVersion: 2.6.15
+Contact: linux-rdma@vger.kernel.org
+Description: HCA port number (<port_number>).
+
+What: /sys/class/scsi_host/host<n>/allow_ext_sg
+Date: May 19, 2011
+KernelVersion: 2.6.39
+Contact: linux-rdma@vger.kernel.org
+Description: Whether ib_srp is allowed to include a partial memory
+ descriptor list in an SRP_CMD when communicating with an SRP
+ target.
+
+What: /sys/class/scsi_host/host<n>/cmd_sg_entries
+Date: May 19, 2011
+KernelVersion: 2.6.39
+Contact: linux-rdma@vger.kernel.org
+Description: Maximum number of data buffer descriptors that may be sent to
+ the target in a single SRP_CMD request.
+
+What: /sys/class/scsi_host/host<n>/dgid
+Date: June 17, 2006
+KernelVersion: 2.6.17
+Contact: linux-rdma@vger.kernel.org
+Description: InfiniBand destination GID used for communication with the SRP
+ target. Differs from orig_dgid if port redirection has happened.
+
+What: /sys/class/scsi_host/host<n>/id_ext
+Date: June 17, 2006
+KernelVersion: 2.6.17
+Contact: linux-rdma@vger.kernel.org
+Description: Eight-byte identifier extension portion of the 16-byte target
+ port identifier.
+
+What: /sys/class/scsi_host/host<n>/ioc_guid
+Date: June 17, 2006
+KernelVersion: 2.6.17
+Contact: linux-rdma@vger.kernel.org
+Description: Eight-byte I/O controller GUID portion of the 16-byte target
+ port identifier.
+
+What: /sys/class/scsi_host/host<n>/local_ib_device
+Date: November 29, 2006
+KernelVersion: 2.6.19
+Contact: linux-rdma@vger.kernel.org
+Description: Name of the InfiniBand HCA used for communicating with the
+ SRP target.
+
+What: /sys/class/scsi_host/host<n>/local_ib_port
+Date: November 29, 2006
+KernelVersion: 2.6.19
+Contact: linux-rdma@vger.kernel.org
+Description: Number of the HCA port used for communicating with the
+ SRP target.
+
+What: /sys/class/scsi_host/host<n>/orig_dgid
+Date: June 17, 2006
+KernelVersion: 2.6.17
+Contact: linux-rdma@vger.kernel.org
+Description: InfiniBand destination GID specified in the parameters
+ written to the add_target sysfs attribute.
+
+What: /sys/class/scsi_host/host<n>/pkey
+Date: June 17, 2006
+KernelVersion: 2.6.17
+Contact: linux-rdma@vger.kernel.org
+Description: A 16-bit number representing the InfiniBand partition key used
+ for communication with the SRP target.
+
+What: /sys/class/scsi_host/host<n>/req_lim
+Date: October 20, 2010
+KernelVersion: 2.6.36
+Contact: linux-rdma@vger.kernel.org
+Description: Number of requests ib_srp can send to the target before it has
+ to wait for more credits. For more information see also the
+ SRP credit algorithm in the SRP specification.
+
+What: /sys/class/scsi_host/host<n>/service_id
+Date: June 17, 2006
+KernelVersion: 2.6.17
+Contact: linux-rdma@vger.kernel.org
+Description: InfiniBand service ID used for establishing communication with
+ the SRP target.
+
+What: /sys/class/scsi_host/host<n>/zero_req_lim
+Date: September 20, 2006
+KernelVersion: 2.6.18
+Contact: linux-rdma@vger.kernel.org
+Description: Number of times the initiator had to wait before sending a
+ request to the target because it ran out of credits. For more
+ information see also the SRP credit algorithm in the SRP
+ specification.
--
1.7.7.6
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH 08/11] srp_transport: Fix attribute registration
[not found] ` <cover.1353903448.git.dillowda-1Heg1YXhbW8@public.gmane.org>
2012-11-26 4:44 ` [PATCH 03/11] IB/srp: don't send anything on a bad QP David Dillow
2012-11-26 4:44 ` [PATCH 04/11] IB/srp: keep processing commands during host removal David Dillow
@ 2012-11-26 4:44 ` David Dillow
2012-11-26 4:44 ` [PATCH 09/11] srp_transport: Simplify attribute initialization code David Dillow
2012-11-26 4:44 ` [PATCH 11/11] IB/srp: Allow SRP disconnect through sysfs David Dillow
4 siblings, 0 replies; 42+ messages in thread
From: David Dillow @ 2012-11-26 4:44 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: linux-scsi-u79uwXL29TY76Z2rM5mHXA, bvanassche-HInyCGIudOg,
roland-BHEL68pLQRGGvPXPguhicg, FUJITA Tomonori, Robert Jennings
From: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Register transport attributes after the attribute array has been
set up instead of before. The current code can trigger a race
condition because the code reading the attribute array can run
on another thread than the code that initialized that array.
Make sure that any code reading the attribute array will see all
values written into that array.
Cc: FUJITA Tomonori <fujita.tomonori-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
Cc: Robert Jennings <rcj-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Signed-off-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
---
drivers/scsi/scsi_transport_srp.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
index 21a045e..07c4394 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -324,13 +324,14 @@ srp_attach_transport(struct srp_function_template *ft)
i->rport_attr_cont.ac.attrs = &i->rport_attrs[0];
i->rport_attr_cont.ac.class = &srp_rport_class.class;
i->rport_attr_cont.ac.match = srp_rport_match;
- transport_container_register(&i->rport_attr_cont);
count = 0;
SETUP_RPORT_ATTRIBUTE_RD(port_id);
SETUP_RPORT_ATTRIBUTE_RD(roles);
i->rport_attrs[count] = NULL;
+ transport_container_register(&i->rport_attr_cont);
+
i->f = ft;
return &i->t;
--
1.7.7.6
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH 09/11] srp_transport: Simplify attribute initialization code
[not found] ` <cover.1353903448.git.dillowda-1Heg1YXhbW8@public.gmane.org>
` (2 preceding siblings ...)
2012-11-26 4:44 ` [PATCH 08/11] srp_transport: Fix attribute registration David Dillow
@ 2012-11-26 4:44 ` David Dillow
2012-11-26 5:02 ` David Dillow
2012-11-26 4:44 ` [PATCH 11/11] IB/srp: Allow SRP disconnect through sysfs David Dillow
4 siblings, 1 reply; 42+ messages in thread
From: David Dillow @ 2012-11-26 4:44 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: linux-scsi-u79uwXL29TY76Z2rM5mHXA, David Dillow,
bvanassche-HInyCGIudOg, roland-BHEL68pLQRGGvPXPguhicg,
FUJITA Tomonori, Robert Jennings
From: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org
From: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Eliminate the private_rport_attrs[] array and the SETUP_*() macros
used to set up that array since the information in that array
duplicates the information in the static device attributes. Also,
verify whether SRP_RPORT_ATTRS is large enough since it is easy
to forget to update that macro when adding new attributes.
Cc: FUJITA Tomonori <fujita.tomonori-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
Cc: Robert Jennings <rcj-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Signed-off-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
---
drivers/scsi/scsi_transport_srp.c | 26 ++++----------------------
1 files changed, 4 insertions(+), 22 deletions(-)
diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
index 07c4394..0d85f79 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -47,7 +47,6 @@ struct srp_internal {
struct device_attribute *host_attrs[SRP_HOST_ATTRS + 1];
struct device_attribute *rport_attrs[SRP_RPORT_ATTRS + 1];
- struct device_attribute private_rport_attrs[SRP_RPORT_ATTRS];
struct transport_container rport_attr_cont;
};
@@ -72,24 +71,6 @@ static DECLARE_TRANSPORT_CLASS(srp_host_class, "srp_host", srp_host_setup,
static DECLARE_TRANSPORT_CLASS(srp_rport_class, "srp_remote_ports",
NULL, NULL, NULL);
-#define SETUP_TEMPLATE(attrb, field, perm, test, ro_test, ro_perm) \
- i->private_##attrb[count] = dev_attr_##field; \
- i->private_##attrb[count].attr.mode = perm; \
- if (ro_test) { \
- i->private_##attrb[count].attr.mode = ro_perm; \
- i->private_##attrb[count].store = NULL; \
- } \
- i->attrb[count] = &i->private_##attrb[count]; \
- if (test) \
- count++
-
-#define SETUP_RPORT_ATTRIBUTE_RD(field) \
- SETUP_TEMPLATE(rport_attrs, field, S_IRUGO, 1, 0, 0)
-
-#define SETUP_RPORT_ATTRIBUTE_RW(field) \
- SETUP_TEMPLATE(rport_attrs, field, S_IRUGO | S_IWUSR, \
- 1, 1, S_IRUGO)
-
#define SRP_PID(p) \
(p)->port_id[0], (p)->port_id[1], (p)->port_id[2], (p)->port_id[3], \
(p)->port_id[4], (p)->port_id[5], (p)->port_id[6], (p)->port_id[7], \
@@ -326,9 +307,10 @@ srp_attach_transport(struct srp_function_template *ft)
i->rport_attr_cont.ac.match = srp_rport_match;
count = 0;
- SETUP_RPORT_ATTRIBUTE_RD(port_id);
- SETUP_RPORT_ATTRIBUTE_RD(roles);
- i->rport_attrs[count] = NULL;
+ i->rport_attrs[count++] = &dev_attr_port_id;
+ i->rport_attrs[count++] = &dev_attr_roles;
+ i->rport_attrs[count++] = NULL;
+ BUG_ON(count > ARRAY_SIZE(i->rport_attrs));
transport_container_register(&i->rport_attr_cont);
--
1.7.7.6
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH 10/11] srp_transport: Document sysfs attributes
2012-11-26 4:44 [PATCH 00/11] First pass at merging Bart's HA work David Dillow
` (5 preceding siblings ...)
[not found] ` <cover.1353903448.git.dillowda-1Heg1YXhbW8@public.gmane.org>
@ 2012-11-26 4:44 ` David Dillow
2012-11-26 7:57 ` [PATCH 00/11] First pass at merging Bart's HA work Or Gerlitz
` (2 subsequent siblings)
9 siblings, 0 replies; 42+ messages in thread
From: David Dillow @ 2012-11-26 4:44 UTC (permalink / raw)
To: linux-rdma
Cc: linux-scsi, bvanassche, roland, FUJITA Tomonori, Robert Jennings
From: Bart Van Assche <bvanassche@acm.org>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David Dillow <dillowda@ornl.gov>
---
Documentation/ABI/stable/sysfs-transport-srp | 12 ++++++++++++
1 files changed, 12 insertions(+), 0 deletions(-)
create mode 100644 Documentation/ABI/stable/sysfs-transport-srp
diff --git a/Documentation/ABI/stable/sysfs-transport-srp b/Documentation/ABI/stable/sysfs-transport-srp
new file mode 100644
index 0000000..7b0d4a5
--- /dev/null
+++ b/Documentation/ABI/stable/sysfs-transport-srp
@@ -0,0 +1,12 @@
+What: /sys/class/srp_remote_ports/port-<h>:<n>/port_id
+Date: June 27, 2007
+KernelVersion: 2.6.24
+Contact: linux-scsi@vger.kernel.org
+Description: 16-byte local SRP port identifier in hexadecimal format. An
+ example: 4c:49:4e:55:58:20:56:49:4f:00:00:00:00:00:00:00.
+
+What: /sys/class/srp_remote_ports/port-<h>:<n>/roles
+Date: June 27, 2007
+KernelVersion: 2.6.24
+Contact: linux-scsi@vger.kernel.org
+Description: Role of the remote port. Either "SRP Initiator" or "SRP Target".
--
1.7.7.6
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH 11/11] IB/srp: Allow SRP disconnect through sysfs
[not found] ` <cover.1353903448.git.dillowda-1Heg1YXhbW8@public.gmane.org>
` (3 preceding siblings ...)
2012-11-26 4:44 ` [PATCH 09/11] srp_transport: Simplify attribute initialization code David Dillow
@ 2012-11-26 4:44 ` David Dillow
4 siblings, 0 replies; 42+ messages in thread
From: David Dillow @ 2012-11-26 4:44 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: linux-scsi-u79uwXL29TY76Z2rM5mHXA, bvanassche-HInyCGIudOg,
roland-BHEL68pLQRGGvPXPguhicg, FUJITA Tomonori, Robert Jennings
From: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Make it possible to disconnect the IB RC connection used by the
SRP protocol to communicate with a target.
Let the SRP transport layer create a sysfs "delete" attribute for
initiator drivers that support this functionality.
Cc: FUJITA Tomonori <fujita.tomonori-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
Cc: Robert Jennings <rcj-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Signed-off-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
---
Documentation/ABI/stable/sysfs-transport-srp | 7 +++++++
drivers/infiniband/ulp/srp/ib_srp.c | 10 ++++++++++
drivers/scsi/scsi_transport_srp.c | 22 +++++++++++++++++++++-
include/scsi/scsi_transport_srp.h | 8 ++++++++
4 files changed, 46 insertions(+), 1 deletions(-)
diff --git a/Documentation/ABI/stable/sysfs-transport-srp b/Documentation/ABI/stable/sysfs-transport-srp
index 7b0d4a5..b36fb0d 100644
--- a/Documentation/ABI/stable/sysfs-transport-srp
+++ b/Documentation/ABI/stable/sysfs-transport-srp
@@ -1,3 +1,10 @@
+What: /sys/class/srp_remote_ports/port-<h>:<n>/delete
+Date: June 1, 2012
+KernelVersion: 3.7
+Contact: linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
+Description: Instructs an SRP initiator to disconnect from a target and to
+ remove all LUNs imported from that target.
+
What: /sys/class/srp_remote_ports/port-<h>:<n>/port_id
Date: June 27, 2007
KernelVersion: 2.6.24
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index c0cc245..8b987bf 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -551,6 +551,13 @@ static void srp_queued_remove(struct srp_target_port *target)
}
}
+static void srp_rport_delete(struct srp_rport *rport)
+{
+ struct srp_target_port *target = rport->lld_data;
+
+ srp_queued_remove(target);
+}
+
static int srp_connect_target(struct srp_target_port *target)
{
int retries = 3;
@@ -1955,6 +1962,8 @@ static int srp_add_target(struct srp_host *host, struct srp_target_port *target)
return PTR_ERR(rport);
}
+ rport->lld_data = target;
+
spin_lock(&host->target_lock);
list_add_tail(&target->list, &host->target_list);
spin_unlock(&host->target_lock);
@@ -2516,6 +2525,7 @@ static void srp_remove_one(struct ib_device *device)
}
static struct srp_function_template ib_srp_transport_functions = {
+ .rport_delete = srp_rport_delete,
};
static int __init srp_init_module(void)
diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
index 0d85f79..f379c7f 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -38,7 +38,7 @@ struct srp_host_attrs {
#define to_srp_host_attrs(host) ((struct srp_host_attrs *)(host)->shost_data)
#define SRP_HOST_ATTRS 0
-#define SRP_RPORT_ATTRS 2
+#define SRP_RPORT_ATTRS 3
struct srp_internal {
struct scsi_transport_template t;
@@ -116,6 +116,24 @@ show_srp_rport_roles(struct device *dev, struct device_attribute *attr,
static DEVICE_ATTR(roles, S_IRUGO, show_srp_rport_roles, NULL);
+static ssize_t store_srp_rport_delete(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct srp_rport *rport = transport_class_to_srp_rport(dev);
+ struct Scsi_Host *shost = dev_to_shost(dev);
+ struct srp_internal *i = to_srp_internal(shost->transportt);
+
+ if (i->f->rport_delete) {
+ i->f->rport_delete(rport);
+ return count;
+ } else {
+ return -ENOSYS;
+ }
+}
+
+static DEVICE_ATTR(delete, S_IWUSR, NULL, store_srp_rport_delete);
+
static void srp_rport_release(struct device *dev)
{
struct srp_rport *rport = dev_to_rport(dev);
@@ -309,6 +327,8 @@ srp_attach_transport(struct srp_function_template *ft)
count = 0;
i->rport_attrs[count++] = &dev_attr_port_id;
i->rport_attrs[count++] = &dev_attr_roles;
+ if (ft->rport_delete)
+ i->rport_attrs[count++] = &dev_attr_delete;
i->rport_attrs[count++] = NULL;
BUG_ON(count > ARRAY_SIZE(i->rport_attrs));
diff --git a/include/scsi/scsi_transport_srp.h b/include/scsi/scsi_transport_srp.h
index 9c60ca1..ff0f04a 100644
--- a/include/scsi/scsi_transport_srp.h
+++ b/include/scsi/scsi_transport_srp.h
@@ -14,13 +14,21 @@ struct srp_rport_identifiers {
};
struct srp_rport {
+ /* for initiator and target drivers */
+
struct device dev;
u8 port_id[16];
u8 roles;
+
+ /* for initiator drivers */
+
+ void *lld_data; /* LLD private data */
};
struct srp_function_template {
+ /* for initiator drivers */
+ void (*rport_delete)(struct srp_rport *rport);
/* for target drivers */
int (* tsk_mgmt_response)(struct Scsi_Host *, u64, u64, int);
int (* it_nexus_response)(struct Scsi_Host *, u64, int);
--
1.7.7.6
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [PATCH 09/11] srp_transport: Simplify attribute initialization code
2012-11-26 4:44 ` [PATCH 09/11] srp_transport: Simplify attribute initialization code David Dillow
@ 2012-11-26 5:02 ` David Dillow
0 siblings, 0 replies; 42+ messages in thread
From: David Dillow @ 2012-11-26 5:02 UTC (permalink / raw)
To: linux-rdma; +Cc: linux-scsi, bvanassche
On Sun, 2012-11-25 at 23:44 -0500, David Dillow wrote:
> From: David Dillow <dillowda@ornl.gov
>
> From: Bart Van Assche <bvanassche@acm.org>
>
> Eliminate the private_rport_attrs[] array and the SETUP_*() macros
> used to set up that array since the information in that array
> duplicates the information in the static device attributes. Also,
> verify whether SRP_RPORT_ATTRS is large enough since it is easy
> to forget to update that macro when adding new attributes.
Bah, screwed up some edits for the batch email. This one (and Vu's) is
properly attributed in the repository.
--
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
2012-11-26 4:44 [PATCH 00/11] First pass at merging Bart's HA work David Dillow
` (6 preceding siblings ...)
2012-11-26 4:44 ` [PATCH 10/11] srp_transport: Document sysfs attributes David Dillow
@ 2012-11-26 7:57 ` Or Gerlitz
2012-11-27 4:53 ` David Dillow
2012-11-26 18:50 ` Roland Dreier
2012-11-27 16:34 ` Bart Van Assche
9 siblings, 1 reply; 42+ messages in thread
From: Or Gerlitz @ 2012-11-26 7:57 UTC (permalink / raw)
To: David Dillow; +Cc: linux-rdma, linux-scsi, bvanassche, roland
On Mon, Nov 26, 2012 at 6:44 AM, David Dillow <dillowda@ornl.gov> wrote:
> One may also pull this series from github:
> git pull git://github.com/dillow/srp-initiator.git ha-merge-v1
Hi Dave,
The kernel maintainers file specifies the following tree
git://git.kernel.org/pub/scm/linux/kernel/git/dad/srp-initiator.git
for you -- I assume you've moved to github during the kernel.org
sitedown period, could you go back to kernel.org? its nice to have the
same look and feel e.g through git web for the IB, SRP, Networking,
SCSI etc trees.
Or.
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 03/11] IB/srp: don't send anything on a bad QP
2012-11-26 4:44 ` [PATCH 03/11] IB/srp: don't send anything on a bad QP David Dillow
@ 2012-11-26 9:17 ` Bart Van Assche
[not found] ` <50B333AF.6040509-HInyCGIudOg@public.gmane.org>
0 siblings, 1 reply; 42+ messages in thread
From: Bart Van Assche @ 2012-11-26 9:17 UTC (permalink / raw)
To: David Dillow; +Cc: linux-rdma, linux-scsi, roland
On 11/26/12 05:44, David Dillow wrote:
> Once we know we have an issue with the QP, there is no point trying to
> send anything else down the pipe. This also allows us to consolidate
> code in the SCSI EH path.
>
[ ... ]
> @@ -1683,7 +1681,7 @@ static int srp_send_tsk_mgmt(struct srp_target_port *target,
> struct srp_iu *iu;
> struct srp_tsk_mgmt *tsk_mgmt;
>
> - if (srp_is_removed(target))
> + if (target->state)
> return -1;
>
Hi Dave,
After I posted the patch on which the above patch has been based I
realized that testing the connection state at the start of
srp_send_tsk_mgmt() is not sufficient to avoid QPN use-after-free. If a
DREQ is received by the initiator after the above test has been
performed and before the task management function has been sent it is
still possible to send a task management function over a closed QP. I'd
like to address this in a different way - see also the thread called
"SCSI LLDs, the SCSI error handler and host resource lifetime" on the
linux-scsi mailing list (November 20,
http://marc.info/?t=135342155500003&r=1). Sorry for the confusion I caused.
Bart.
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 04/11] IB/srp: keep processing commands during host removal
[not found] ` <8715294a23dded5879b3a327c470d9b6a39ddbc4.1353903448.git.dillowda-1Heg1YXhbW8@public.gmane.org>
@ 2012-11-26 9:43 ` Bart Van Assche
2012-11-27 3:16 ` David Dillow
0 siblings, 1 reply; 42+ messages in thread
From: Bart Van Assche @ 2012-11-26 9:43 UTC (permalink / raw)
To: David Dillow
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-scsi-u79uwXL29TY76Z2rM5mHXA, roland-BHEL68pLQRGGvPXPguhicg
On 11/26/12 05:44, David Dillow wrote:
> From: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
>
> Some SCSI upper layer drivers, e.g. sd, issue SCSI commands from
> inside scsi_remove_host() (see also the sd_shutdown() call in
> sd_remove()). Make sure that these commands have a chance to reach
> the SCSI device.
>
> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
> [ adapted to new state tracking ]
> Signed-off-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
> ---
> drivers/infiniband/ulp/srp/ib_srp.c | 11 ++++++-----
> 1 files changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 2951e1c..f7d7e6a 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -1328,12 +1328,13 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
> int len;
>
> if (unlikely(target->state)) {
> - if (!srp_is_removed(target))
> + /*
> + * Only requeue commands if we cannot send them to the target.
> + * We'll let commands through during shutdown so that caches
> + * get flushed, etc.
> + */
> + if (srp_is_disconnected(target) || srp_in_error(target))
> goto err;
> -
> - scmnd->result = DID_BAD_TARGET << 16;
> - scmnd->scsi_done(scmnd);
> - return 0;
> }
>
> spin_lock_irqsave(&target->lock, flags);
Hi Dave,
After I posted the patch on which the above patch has been based I
realized that it can be simplified. Since the patch called "Eliminate
state SRP_TARGET_CONNECTING" blocks the SCSI host as long as the RDMA
RC connection is being reconnected srp_queuecommand() won't be invoked
during the "disconnected" or "error" state. How about adding the patch
below just after the patch that eliminates the state
SRP_TARGET_CONNECTING ?
Bart.
[PATCH] ib_srp: Keep processing commands during host removal
Some SCSI upper layer drivers, e.g. sd, issue SCSI commands from
inside scsi_remove_host(). See also the sd_shutdown() call in
sd_remove(). Make sure that these commands have a chance to reach
the SCSI device.
Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
---
drivers/infiniband/ulp/srp/ib_srp.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 39723e7..03b9fa0 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1324,13 +1324,6 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
unsigned long flags;
int len;
- if (target->state == SRP_TARGET_DEAD ||
- target->state == SRP_TARGET_REMOVED) {
- scmnd->result = DID_BAD_TARGET << 16;
- scmnd->scsi_done(scmnd);
- return 0;
- }
-
spin_lock_irqsave(&target->lock, flags);
iu = __srp_get_tx_iu(target, SRP_IU_CMD);
if (!iu)
--
1.7.10.4
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [PATCH 02/11] IB/srp: simplify state tracking
2012-11-26 4:44 ` [PATCH 02/11] IB/srp: simplify state tracking David Dillow
@ 2012-11-26 9:46 ` Bart Van Assche
[not found] ` <50B33A91.3060103-HInyCGIudOg@public.gmane.org>
0 siblings, 1 reply; 42+ messages in thread
From: Bart Van Assche @ 2012-11-26 9:46 UTC (permalink / raw)
To: David Dillow; +Cc: linux-rdma, linux-scsi, roland
On 11/26/12 05:44, David Dillow wrote:
> The state of the target has several conditions that overlap, making it
> easier to model as a bit-field of exceptional conditions rather than an
> enum of all possible states.
>
> Bart Van Assche did the hard work of identifying the states that can be
> removed, and did a first patch that consolidated the state space.
[ ... ]
Hi Dave,
Could you please explain why you would prefer to use test_bit () /
test_and_set_bit() and clear_bit() for managing the SRP target state ?
As far as I can see the target state is not changed in any code path
where it matters how fast the state is changed. Maybe I'm missing
something ?
Thanks,
Bart.
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
2012-11-26 4:44 [PATCH 00/11] First pass at merging Bart's HA work David Dillow
` (7 preceding siblings ...)
2012-11-26 7:57 ` [PATCH 00/11] First pass at merging Bart's HA work Or Gerlitz
@ 2012-11-26 18:50 ` Roland Dreier
2012-11-26 19:15 ` James Bottomley
2012-11-27 16:34 ` Bart Van Assche
9 siblings, 1 reply; 42+ messages in thread
From: Roland Dreier @ 2012-11-26 18:50 UTC (permalink / raw)
To: David Dillow; +Cc: linux-rdma@vger.kernel.org, linux-scsi, Bart Van Assche
> This series compiles, but is otherwise UNTESTED. I'll be working on that
> over the next few days, with an eye on getting as much of Bart's work
> into 3.8 as possible.
Hi Dave,
Great to have you back. Certainly I'd like to get this stuff into 3.8 too.
A couple of comments:
- I think the srp_transport stuff should go through linux-scsi / James B.
instead of my tree, esp. since it's shared with the IBM vscsi stuff (I think)
- I see Bart had a few comments about a few of your patches, I'll wait
for you guys to hash that out.
Otherwise definitely happy to merge this for 3.8!
Thanks,
Roland
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 05/11] IB/srp: destroy and recreate QP and CQs on each connection
[not found] ` <8fa9a268ec4dc587970161efe94968f3263aad3b.1353903448.git.dillowda-1Heg1YXhbW8@public.gmane.org>
@ 2012-11-26 18:57 ` Bart Van Assche
0 siblings, 0 replies; 42+ messages in thread
From: Bart Van Assche @ 2012-11-26 18:57 UTC (permalink / raw)
To: David Dillow
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-scsi-u79uwXL29TY76Z2rM5mHXA, roland-BHEL68pLQRGGvPXPguhicg,
Ishai Rabinovitz, Michael S. Tsirkin
On 11/26/12 05:44, David Dillow wrote:
> From: Ishai Rabinovitz <ishai-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org>
>
> HW QP FATAL errors persist over a reset operation, but we can recover
> from that by recreating the QP and associated CQs for each connection.
> Creating a new QP/CQ also completely forecloses any possibility of
> getting stale completions or packets on the new connection.
[ ... ]
This looks like a more elegant way than what I had came up with ("Make
srp_disconnect_target() wait for IB completions") so I'm fine with this
patch, and with the next one too.
Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
2012-11-26 18:50 ` Roland Dreier
@ 2012-11-26 19:15 ` James Bottomley
2012-11-26 19:22 ` Roland Dreier
2012-11-27 4:04 ` David Dillow
0 siblings, 2 replies; 42+ messages in thread
From: James Bottomley @ 2012-11-26 19:15 UTC (permalink / raw)
To: Roland Dreier
Cc: David Dillow, linux-rdma@vger.kernel.org, linux-scsi,
Bart Van Assche
On Mon, 2012-11-26 at 10:50 -0800, Roland Dreier wrote:
> - I think the srp_transport stuff should go through linux-scsi /
> James B.
> instead of my tree, esp. since it's shared with the IBM vscsi stuff
> (I think)
> - I see Bart had a few comments about a few of your patches, I'll
> wait
> for you guys to hash that out.
I'm amenable to that, but we do need an agreed patch set, as Roland
says. I also hate to apply the pressure, but I suspect -rc7 was the
last -rc, so I'm expecting the merge window to open on 2/12.
James
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
2012-11-26 19:15 ` James Bottomley
@ 2012-11-26 19:22 ` Roland Dreier
2012-11-27 4:04 ` David Dillow
1 sibling, 0 replies; 42+ messages in thread
From: Roland Dreier @ 2012-11-26 19:22 UTC (permalink / raw)
To: James Bottomley
Cc: David Dillow, linux-rdma@vger.kernel.org, linux-scsi,
Bart Van Assche
> I'm amenable to that, but we do need an agreed patch set, as Roland
> says. I also hate to apply the pressure, but I suspect -rc7 was the
> last -rc, so I'm expecting the merge window to open on 2/12.
I think the srp_transport bits are all simple and non-controversial.
So at least from my perspective, OK to merge right now, except
that Dave mentioned he screwed up the attribution on one email,
but that should be easy to fix with a quick resend.
- R.
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 04/11] IB/srp: keep processing commands during host removal
2012-11-26 9:43 ` Bart Van Assche
@ 2012-11-27 3:16 ` David Dillow
0 siblings, 0 replies; 42+ messages in thread
From: David Dillow @ 2012-11-27 3:16 UTC (permalink / raw)
To: Bart Van Assche; +Cc: linux-rdma, linux-scsi, roland
On Mon, 2012-11-26 at 10:43 +0100, Bart Van Assche wrote:
> On 11/26/12 05:44, David Dillow wrote:
> > From: Bart Van Assche <bvanassche@acm.org>
> >
> > Some SCSI upper layer drivers, e.g. sd, issue SCSI commands from
> > inside scsi_remove_host() (see also the sd_shutdown() call in
> > sd_remove()). Make sure that these commands have a chance to reach
> > the SCSI device.
> Hi Dave,
>
> After I posted the patch on which the above patch has been based I
> realized that it can be simplified. Since the patch called "Eliminate
> state SRP_TARGET_CONNECTING" blocks the SCSI host as long as the RDMA
> RC connection is being reconnected srp_queuecommand() won't be invoked
> during the "disconnected" or "error" state. How about adding the patch
> below just after the patch that eliminates the state
> SRP_TARGET_CONNECTING ?
Right, I left off out the part of that patch that blocks the target as I
wanted to think about its implications a bit more. I think it is
probably the right thing to do, and will likely add that functionality
back.
--
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 03/11] IB/srp: don't send anything on a bad QP
[not found] ` <50B333AF.6040509-HInyCGIudOg@public.gmane.org>
@ 2012-11-27 3:31 ` David Dillow
0 siblings, 0 replies; 42+ messages in thread
From: David Dillow @ 2012-11-27 3:31 UTC (permalink / raw)
To: Bart Van Assche
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-scsi-u79uwXL29TY76Z2rM5mHXA, roland-BHEL68pLQRGGvPXPguhicg
On Mon, 2012-11-26 at 10:17 +0100, Bart Van Assche wrote:
> On 11/26/12 05:44, David Dillow wrote:
> > Once we know we have an issue with the QP, there is no point trying to
> > send anything else down the pipe. This also allows us to consolidate
> > code in the SCSI EH path.
> After I posted the patch on which the above patch has been based I
> realized that testing the connection state at the start of
> srp_send_tsk_mgmt() is not sufficient to avoid QPN use-after-free. If a
> DREQ is received by the initiator after the above test has been
> performed and before the task management function has been sent it is
> still possible to send a task management function over a closed QP.
AFIACT, DREQ does not actually close the QP -- it only tells us that the
other side would like to. We don't actually close the connection until
we try to send on it again, I think -- not sure if we see recv failures
for the queued work items.
Regardless, the issue of resource lifetime is an issue that needs
solving.
> I'd like to address this in a different way - see also the thread called
> "SCSI LLDs, the SCSI error handler and host resource lifetime" on the
> linux-scsi mailing list (November 20,
> http://marc.info/?t=135342155500003&r=1).
I like the direction you propose there. It seems that scsi_remove_host()
at one point waited for the EH thread to exit -- or perhaps it was part
of scsi_host_put() chain -- as there's the longstanding deferral to the
work queue for the SRP target removal. Of course, that's been there for
~5 years now, and things have changed in the SCSI stack.
--
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 02/11] IB/srp: simplify state tracking
[not found] ` <50B33A91.3060103-HInyCGIudOg@public.gmane.org>
@ 2012-11-27 3:56 ` David Dillow
0 siblings, 0 replies; 42+ messages in thread
From: David Dillow @ 2012-11-27 3:56 UTC (permalink / raw)
To: Bart Van Assche
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-scsi-u79uwXL29TY76Z2rM5mHXA, roland-BHEL68pLQRGGvPXPguhicg
On Mon, 2012-11-26 at 10:46 +0100, Bart Van Assche wrote:
> On 11/26/12 05:44, David Dillow wrote:
> > The state of the target has several conditions that overlap, making it
> > easier to model as a bit-field of exceptional conditions rather than an
> > enum of all possible states.
> >
> > Bart Van Assche did the hard work of identifying the states that can be
> > removed, and did a first patch that consolidated the state space.
> [ ... ]
>
> Hi Dave,
>
> Could you please explain why you would prefer to use test_bit () /
> test_and_set_bit() and clear_bit() for managing the SRP target state ?
> As far as I can see the target state is not changed in any code path
> where it matters how fast the state is changed. Maybe I'm missing
> something ?
I don't think you are missing anything; if anything, I probably am. I
still believe that there is a possibility of overlapping states, and
some of the changes you had introduced had us checking multiple state
variables to see if we could perform an action. Moving to bit flags
cleaned that up, at least in my head.
Your more recent series does not need to check multiple variables given
the target block, and while resource lifetimes confuse the various
states, I may be inclined to go back to your set. It's had more testing,
for sure.
--
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
2012-11-26 19:15 ` James Bottomley
2012-11-26 19:22 ` Roland Dreier
@ 2012-11-27 4:04 ` David Dillow
[not found] ` <1353989041.28917.24.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
2012-11-29 20:21 ` Roland Dreier
1 sibling, 2 replies; 42+ messages in thread
From: David Dillow @ 2012-11-27 4:04 UTC (permalink / raw)
To: James Bottomley
Cc: Roland Dreier, linux-rdma@vger.kernel.org, linux-scsi,
Bart Van Assche, fujita.tomonori, rcj
On Mon, 2012-11-26 at 23:15 +0400, James Bottomley wrote:
> On Mon, 2012-11-26 at 10:50 -0800, Roland Dreier wrote:
> > - I think the srp_transport stuff should go through linux-scsi /
> > James B.
> > instead of my tree, esp. since it's shared with the IBM vscsi stuff
> > (I think)
> > - I see Bart had a few comments about a few of your patches, I'll
> > wait
> > for you guys to hash that out.
>
> I'm amenable to that, but we do need an agreed patch set, as Roland
> says. I also hate to apply the pressure, but I suspect -rc7 was the
> last -rc, so I'm expecting the merge window to open on 2/12.
We can push it through James's tree if need be, but Bart's code is
pretty self-contained, and going through the SCSI tree will introduce
merge dependencies. It'd be much easier to push it all through the RDMA
tree, especially if we want to get this landed for 3.8.
I'd want Fujita Tomonori, Robert Jennings, and James to ack the changes
the SRP transport code, though they've been pending for a long time w/o
comment so perhaps silence is consent?
--
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
2012-11-26 7:57 ` [PATCH 00/11] First pass at merging Bart's HA work Or Gerlitz
@ 2012-11-27 4:53 ` David Dillow
0 siblings, 0 replies; 42+ messages in thread
From: David Dillow @ 2012-11-27 4:53 UTC (permalink / raw)
To: Or Gerlitz; +Cc: linux-rdma, linux-scsi, bvanassche, roland
On Mon, 2012-11-26 at 09:57 +0200, Or Gerlitz wrote:
> On Mon, Nov 26, 2012 at 6:44 AM, David Dillow <dillowda@ornl.gov> wrote:
> > One may also pull this series from github:
> > git pull git://github.com/dillow/srp-initiator.git ha-merge-v1
>
> Hi Dave,
>
> The kernel maintainers file specifies the following tree
> git://git.kernel.org/pub/scm/linux/kernel/git/dad/srp-initiator.git
> for you -- I assume you've moved to github during the kernel.org
> sitedown period, could you go back to kernel.org? its nice to have the
> same look and feel e.g through git web for the IB, SRP, Networking,
> SCSI etc trees.
I'd love to, but I don't have enough PGP connectivity to the kernel.org
folks. I've been trying to meet up with other kernel guys during
conferences, but that's been slow going.
--
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
[not found] ` <1353989041.28917.24.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
@ 2012-11-27 6:42 ` Or Gerlitz
0 siblings, 0 replies; 42+ messages in thread
From: Or Gerlitz @ 2012-11-27 6:42 UTC (permalink / raw)
To: David Dillow, James Bottomley
Cc: Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-scsi, Bart Van Assche,
fujita.tomonori-Zyj7fXuS5i5L9jVzuh4AOg,
rcj-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8
On 27/11/2012 06:04, David Dillow wrote:
> We can push it through James's tree if need be, but Bart's code is
> pretty self-contained, and going through the SCSI tree will introduce
> merge dependencies. It'd be much easier to push it all through the
> RDMA tree
Yep, this makes sense to me even without taking into account the time
left for the merge window to open,
the patches have been around for long time and relate directly to the
SRP code in the IB subsystem, there's no point in introducing merge
dependencies where it can be avoided.
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
2012-11-26 4:44 [PATCH 00/11] First pass at merging Bart's HA work David Dillow
` (8 preceding siblings ...)
2012-11-26 18:50 ` Roland Dreier
@ 2012-11-27 16:34 ` Bart Van Assche
[not found] ` <50B4EBA3.7070400-HInyCGIudOg@public.gmane.org>
9 siblings, 1 reply; 42+ messages in thread
From: Bart Van Assche @ 2012-11-27 16:34 UTC (permalink / raw)
To: David Dillow; +Cc: linux-rdma, linux-scsi, roland
On 11/26/12 05:44, David Dillow wrote:
> Here is a first, UNTESTED, pass at preparing a merge of Bart's SRP HA
> work to upstream. It is not complete, as I have not yet added the
> transport layer error handling and related patches. It is also currently
> missing the patch to maintain a single connection for an I_T nexus.
>
> I swapped Ishai's code to recreate QP/CQs for each connection, as that
> adds recovery from fatal hardware errors, and reduces the code needed to
> avoid stale completions. Similarly, Vu's patch slightly reduces reconnect
> times.
>
> Target blocking/unblocking hasn't been added from Bart's patch that
> removes the SRP_TARGET_CONNECTING state; I think it is probably the right
> thing to do, but want to think it over a bit more.
>
> Similarly, I'm looking for a clean way to start the reconnection effort
> as soon as we get an error through the CQ -- we know that any pending
> commands are lost to us at that point, so we should be able to kick them
> back upstream quickly. This should allow multipath to reissue them on one
> of the remaining good paths.
>
> This series compiles, but is otherwise UNTESTED. I'll be working on that
> over the next few days, with an eye on getting as much of Bart's work
> into 3.8 as possible.
Thanks Dave for doing all this work. A reworked and retested patch
series that should address all comments that have been posted so far can
be found here: http://github.com/bvanassche/linux/srp-ha. I can repost
the entire patch series if you want. The changes compared to the
previous time this patch series was posted are:
* Integrated Ishai's and Vu's patches for recreating the QP and the CQs.
* Took advantage of SCSI core changes that make scsi_remove_host() wait
until device removal and error handling finished (will post these
shortly).
Bart.
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
[not found] ` <50B4EBA3.7070400-HInyCGIudOg@public.gmane.org>
@ 2012-11-27 18:10 ` Joseph Glanville
0 siblings, 0 replies; 42+ messages in thread
From: Joseph Glanville @ 2012-11-27 18:10 UTC (permalink / raw)
To: Bart Van Assche
Cc: David Dillow, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-scsi-u79uwXL29TY76Z2rM5mHXA, roland-BHEL68pLQRGGvPXPguhicg
On 28 November 2012 03:34, Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org> wrote:
> On 11/26/12 05:44, David Dillow wrote:
>>
>> Here is a first, UNTESTED, pass at preparing a merge of Bart's SRP HA
>> work to upstream. It is not complete, as I have not yet added the
>> transport layer error handling and related patches. It is also currently
>> missing the patch to maintain a single connection for an I_T nexus.
>>
>> I swapped Ishai's code to recreate QP/CQs for each connection, as that
>> adds recovery from fatal hardware errors, and reduces the code needed to
>> avoid stale completions. Similarly, Vu's patch slightly reduces reconnect
>> times.
>>
>> Target blocking/unblocking hasn't been added from Bart's patch that
>> removes the SRP_TARGET_CONNECTING state; I think it is probably the right
>> thing to do, but want to think it over a bit more.
>>
>> Similarly, I'm looking for a clean way to start the reconnection effort
>> as soon as we get an error through the CQ -- we know that any pending
>> commands are lost to us at that point, so we should be able to kick them
>> back upstream quickly. This should allow multipath to reissue them on one
>> of the remaining good paths.
>>
>> This series compiles, but is otherwise UNTESTED. I'll be working on that
>> over the next few days, with an eye on getting as much of Bart's work
>> into 3.8 as possible.
>
>
> Thanks Dave for doing all this work. A reworked and retested patch series
> that should address all comments that have been posted so far can be found
> here: http://github.com/bvanassche/linux/srp-ha. I can repost the entire
> patch series if you want. The changes compared to the previous time this
> patch series was posted are:
> * Integrated Ishai's and Vu's patches for recreating the QP and the CQs.
> * Took advantage of SCSI core changes that make scsi_remove_host() wait
> until device removal and error handling finished (will post these
> shortly).
>
> Bart.
This is grreat news, would be really good to see this merged for 3.8
Bart - I will test the new srp-ha branch in our environment.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
Joseph.
--
CTO | Orion Virtualisation Solutions | www.orionvm.com.au
Phone: 1300 56 99 52 | Mobile: 0428 754 846
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
2012-11-27 4:04 ` David Dillow
[not found] ` <1353989041.28917.24.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
@ 2012-11-29 20:21 ` Roland Dreier
[not found] ` <CAL1RGDXpdWL_r7sWp=vvvXH4jxFgjDL+XcEGgKo-44=wrOBmtA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
1 sibling, 1 reply; 42+ messages in thread
From: Roland Dreier @ 2012-11-29 20:21 UTC (permalink / raw)
To: David Dillow
Cc: James Bottomley, linux-rdma@vger.kernel.org, linux-scsi,
Bart Van Assche, fujita.tomonori, rcj
On Mon, Nov 26, 2012 at 8:04 PM, David Dillow <dillowda@ornl.gov> wrote:
> We can push it through James's tree if need be, but Bart's code is
> pretty self-contained, and going through the SCSI tree will introduce
> merge dependencies. It'd be much easier to push it all through the RDMA
> tree, especially if we want to get this landed for 3.8.
OK, I guess all the srp_transport stuff looks quite simple and good to
me, so I'm OK merging it.
Is there some subset of patches that you and Bart agree are good,
which I can pick up now? I'd love to get at least some of the SRP
stuff into 3.8, and that window is opening pretty soon.
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
[not found] ` <CAL1RGDXpdWL_r7sWp=vvvXH4jxFgjDL+XcEGgKo-44=wrOBmtA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-11-30 2:21 ` David Dillow
2012-12-05 18:23 ` Or Gerlitz
0 siblings, 1 reply; 42+ messages in thread
From: David Dillow @ 2012-11-30 2:21 UTC (permalink / raw)
To: Roland Dreier
Cc: James Bottomley,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-scsi,
Bart Van Assche, fujita.tomonori-Zyj7fXuS5i5L9jVzuh4AOg,
rcj-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8
On Thu, 2012-11-29 at 12:21 -0800, Roland Dreier wrote:
> On Mon, Nov 26, 2012 at 8:04 PM, David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org> wrote:
> > We can push it through James's tree if need be, but Bart's code is
> > pretty self-contained, and going through the SCSI tree will introduce
> > merge dependencies. It'd be much easier to push it all through the RDMA
> > tree, especially if we want to get this landed for 3.8.
>
> OK, I guess all the srp_transport stuff looks quite simple and good to
> me, so I'm OK merging it.
>
> Is there some subset of patches that you and Bart agree are good,
> which I can pick up now? I'd love to get at least some of the SRP
> stuff into 3.8, and that window is opening pretty soon.
Modulo a few style issues (braces around one line if branches, etc.) and
having three state variables vs one, I can live with everything up to
aabfa852acd27962 at git://github.com/bvanassche/linux.git#srp-ha. Those
two are small things that can be fixed later and are not worth holding
things up any further.
I'll try to spend some time on the final four patches tomorrow
afternoon.
--
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
2012-11-30 2:21 ` David Dillow
@ 2012-12-05 18:23 ` Or Gerlitz
[not found] ` <CAJZOPZJBTRXftrW5NWEEHnf2QWsni0HMTAV_PKSgDtA7GO=wRw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 42+ messages in thread
From: Or Gerlitz @ 2012-12-05 18:23 UTC (permalink / raw)
To: David Dillow, Bart Van Assche
Cc: Roland Dreier, James Bottomley, linux-rdma@vger.kernel.org,
linux-scsi, fujita.tomonori, rcj, Alex Turin
On Fri, Nov 30, 2012 at 4:21 AM, David Dillow <dillowda@ornl.gov> wrote:
[...]
> Modulo a few style issues (braces around one line if branches, etc.) and
> having three state variables vs one, I can live with everything up to
> aabfa852acd27962 at git://github.com/bvanassche/linux.git#srp-ha. Those
> two are small things that can be fixed later and are not worth holding
> things up any further.
>
> I'll try to spend some time on the final four patches tomorrow afternoon.
Dave, Bart
My colleague Alex Turin <alextu@mellanox.com> tried today the bits as
they appear in Roland's kernel.org tree / for-next branch up to commit
fb57e1dbbd4 and here's some feedback
Basically, what he did was connecting to a target, next take down the
IB port on the initiator side, and issue some IOs (dd if=/dev/sdb
of=/dev/null count=1)
Our recollection of events from the logs (below) is the following
1. queued command get completion status 5
2. as part of error handling srp_reset_host() was called,
3. srp_reset_host() calls to srp_reconnect_target() which fails cause
port is down.
4. srp_reconnect_target() on failure calls to srp_queue_remove_work()
which sets
target->status to SRP_TARGET_REMOVED.
5.srp_reset_host() called second time. it calls to
srp_reconnect_target() but target->state == SRP_TARGET_REMOVED.
srp_reconnect_target() checks if target->state != SRP_TARGET_LIVE and
return -EAGAIN.
This probably means that even after enabling port it will still fail
to reconnect?
Or.
Dec 5 16:19:13 rsws42 kernel: scsi host7: ib_srp: failed send status 5
Dec 5 16:19:42 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:19:42 rsws42 kernel: scsi host7: SRP reset_device called
Dec 5 16:19:42 rsws42 kernel: scsi host7: ib_srp: SRP reset_host called
Dec 5 16:19:43 rsws42 kernel: scsi host7: ib_srp: Got failed path rec
status -110
Dec 5 16:19:43 rsws42 kernel: scsi host7: ib_srp: Path record query failed
Dec 5 16:19:43 rsws42 kernel: scsi host7: ib_srp: reconnect failed
(-110), removing target port.
Dec 5 16:19:43 rsws42 kernel: sd 7:0:0:11: Device offlined - not
ready after error recovery
Dec 5 16:19:43 rsws42 kernel: sd 7:0:0:11: [sdb] Synchronizing SCSI cache
Dec 5 16:20:45 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:20:50 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:21:05 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:21:10 rsws42 kernel: scsi host7: SRP reset_device called
Dec 5 16:21:15 rsws42 kernel: scsi host7: ib_srp: SRP reset_host called
Dec 5 16:21:15 rsws42 kernel: sd 7:0:0:11: Device offlined - not
ready after error recovery
Dec 5 16:21:15 rsws42 kernel: sd 7:0:0:11: Device offlined - not
ready after error recovery
repeating part:
Dec 5 16:22:17 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:22:22 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:22:37 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:22:42 rsws42 kernel: scsi host7: SRP reset_device called
Dec 5 16:22:47 rsws42 kernel: scsi host7: ib_srp: SRP reset_host called
Dec 5 16:22:47 rsws42 kernel: sd 7:0:0:11: Device offlined - not
ready after error recovery
Dec 5 16:22:47 rsws42 kernel: sd 7:0:0:11: Device offlined - not
ready after error recovery
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
[not found] ` <CAJZOPZJBTRXftrW5NWEEHnf2QWsni0HMTAV_PKSgDtA7GO=wRw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-12-05 18:50 ` Bart Van Assche
[not found] ` <50BF9760.2080801-HInyCGIudOg@public.gmane.org>
0 siblings, 1 reply; 42+ messages in thread
From: Bart Van Assche @ 2012-12-05 18:50 UTC (permalink / raw)
To: Or Gerlitz
Cc: David Dillow, Roland Dreier, James Bottomley,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-scsi,
fujita.tomonori-Zyj7fXuS5i5L9jVzuh4AOg,
rcj-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Alex Turin
On 12/05/12 19:23, Or Gerlitz wrote:
> On Fri, Nov 30, 2012 at 4:21 AM, David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org> wrote:
> [...]
>> Modulo a few style issues (braces around one line if branches, etc.) and
>> having three state variables vs one, I can live with everything up to
>> aabfa852acd27962 at git://github.com/bvanassche/linux.git#srp-ha. Those
>> two are small things that can be fixed later and are not worth holding
>> things up any further.
>>
>> I'll try to spend some time on the final four patches tomorrow afternoon.
>
> Dave, Bart
>
> My colleague Alex Turin <alextu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> tried today the bits as
> they appear in Roland's kernel.org tree / for-next branch up to commit
> fb57e1dbbd4 and here's some feedback
>
> Basically, what he did was connecting to a target, next take down the
> IB port on the initiator side, and issue some IOs (dd if=/dev/sdb
> of=/dev/null count=1)
>
> Our recollection of events from the logs (below) is the following
>
> 1. queued command get completion status 5
>
> 2. as part of error handling srp_reset_host() was called,
>
> 3. srp_reset_host() calls to srp_reconnect_target() which fails cause
> port is down.
>
> 4. srp_reconnect_target() on failure calls to srp_queue_remove_work()
> which sets
> target->status to SRP_TARGET_REMOVED.
>
> 5.srp_reset_host() called second time. it calls to
> srp_reconnect_target() but target->state == SRP_TARGET_REMOVED.
> srp_reconnect_target() checks if target->state != SRP_TARGET_LIVE and
> return -EAGAIN.
>
> This probably means that even after enabling port it will still fail
> to reconnect?
Hello Or,
The only way to make I/O work reliably if a failure can occur at the
transport layer is to use multipathd on top of ib_srp. If a connection
fails for some reason, then the SRP SCSI host will be removed after the
SCSI error handler has finished with its error recovery strategy. And
once the transport layer is operational again and srp_daemon detects
that the initiator is no longer logged in srp_daemon will make ib_srp
log in again. multipathd will then cause I/O to continue over the new path.
Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
[not found] ` <50BF9760.2080801-HInyCGIudOg@public.gmane.org>
@ 2012-12-05 19:50 ` Bart Van Assche
2012-12-05 21:32 ` Or Gerlitz
1 sibling, 0 replies; 42+ messages in thread
From: Bart Van Assche @ 2012-12-05 19:50 UTC (permalink / raw)
Cc: Or Gerlitz, David Dillow, Roland Dreier, James Bottomley,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-scsi,
fujita.tomonori-Zyj7fXuS5i5L9jVzuh4AOg,
rcj-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Alex Turin
On 12/05/12 19:50, Bart Van Assche wrote:
> On 12/05/12 19:23, Or Gerlitz wrote:
>> On Fri, Nov 30, 2012 at 4:21 AM, David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org> wrote:
>> [...]
>>> Modulo a few style issues (braces around one line if branches, etc.) and
>>> having three state variables vs one, I can live with everything up to
>>> aabfa852acd27962 at git://github.com/bvanassche/linux.git#srp-ha. Those
>>> two are small things that can be fixed later and are not worth holding
>>> things up any further.
>>>
>>> I'll try to spend some time on the final four patches tomorrow
>>> afternoon.
>>
>> Dave, Bart
>>
>> My colleague Alex Turin <alextu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> tried today the bits as
>> they appear in Roland's kernel.org tree / for-next branch up to commit
>> fb57e1dbbd4 and here's some feedback
>>
>> Basically, what he did was connecting to a target, next take down the
>> IB port on the initiator side, and issue some IOs (dd if=/dev/sdb
>> of=/dev/null count=1)
>>
>> Our recollection of events from the logs (below) is the following
>>
>> 1. queued command get completion status 5
>>
>> 2. as part of error handling srp_reset_host() was called,
>>
>> 3. srp_reset_host() calls to srp_reconnect_target() which fails cause
>> port is down.
>>
>> 4. srp_reconnect_target() on failure calls to srp_queue_remove_work()
>> which sets
>> target->status to SRP_TARGET_REMOVED.
>>
>> 5.srp_reset_host() called second time. it calls to
>> srp_reconnect_target() but target->state == SRP_TARGET_REMOVED.
>> srp_reconnect_target() checks if target->state != SRP_TARGET_LIVE and
>> return -EAGAIN.
>>
>> This probably means that even after enabling port it will still fail
>> to reconnect?
>
> Hello Or,
>
> The only way to make I/O work reliably if a failure can occur at the
> transport layer is to use multipathd on top of ib_srp. If a connection
> fails for some reason, then the SRP SCSI host will be removed after the
> SCSI error handler has finished with its error recovery strategy. And
> once the transport layer is operational again and srp_daemon detects
> that the initiator is no longer logged in srp_daemon will make ib_srp
> log in again. multipathd will then cause I/O to continue over the new path.
(replying to my own e-mail)
Another possible approach would be to follow the FC model and to block
I/O when a port goes down and to unblock I/O once I/O is again possible.
Some time ago I had posted a patch that went somewhat in this direction
and in which ib_srp tried to reconnect to a target repeatedly after a
transport layer failure. That patch can be found here:
http://www.mail-archive.com/linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org/msg10158.html
Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
[not found] ` <50BF9760.2080801-HInyCGIudOg@public.gmane.org>
2012-12-05 19:50 ` Bart Van Assche
@ 2012-12-05 21:32 ` Or Gerlitz
2012-12-06 14:10 ` Bart Van Assche
1 sibling, 1 reply; 42+ messages in thread
From: Or Gerlitz @ 2012-12-05 21:32 UTC (permalink / raw)
To: Bart Van Assche
Cc: David Dillow, Roland Dreier, James Bottomley,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-scsi,
fujita.tomonori-Zyj7fXuS5i5L9jVzuh4AOg,
rcj-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Alex Turin
On Wed, Dec 5, 2012 at 8:50 PM, Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org> wrote:
[...]
> The only way to make I/O work reliably if a failure can occur at the
> transport layer is to use multipathd on top of ib_srp. If a connection fails
> for some reason, then the SRP SCSI host will be removed after the SCSI error
> handler has finished with its error recovery strategy. And once the
> transport layer is operational again and srp_daemon detects that the
> initiator is no longer logged in srp_daemon will make ib_srp log in again.
> multipathd will then cause I/O to continue over the new path.
Claim basically understood and agreed however, does this also hold
when the link is back again, that is can't SRP login via this single
path also when there's no multipath on top?
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
2012-12-05 21:32 ` Or Gerlitz
@ 2012-12-06 14:10 ` Bart Van Assche
[not found] ` <50C0A76C.20500-HInyCGIudOg@public.gmane.org>
0 siblings, 1 reply; 42+ messages in thread
From: Bart Van Assche @ 2012-12-06 14:10 UTC (permalink / raw)
To: Or Gerlitz
Cc: David Dillow, Roland Dreier, James Bottomley,
linux-rdma@vger.kernel.org, linux-scsi, fujita.tomonori, rcj,
Alex Turin
On 12/05/12 22:32, Or Gerlitz wrote:
> On Wed, Dec 5, 2012 at 8:50 PM, Bart Van Assche <bvanassche@acm.org> wrote:
> [...]
>> The only way to make I/O work reliably if a failure can occur at the
>> transport layer is to use multipathd on top of ib_srp. If a connection fails
>> for some reason, then the SRP SCSI host will be removed after the SCSI error
>> handler has finished with its error recovery strategy. And once the
>> transport layer is operational again and srp_daemon detects that the
>> initiator is no longer logged in srp_daemon will make ib_srp log in again.
>> multipathd will then cause I/O to continue over the new path.
>
> Claim basically understood and agreed however, does this also hold
> when the link is back again, that is can't SRP login via this single
> path also when there's no multipath on top?
As far as I can remember the behavior of ib_srp has always been to try
to reconnect once to the SRP target after the SCSI error handler kicked
in. Other SCSI LLDs, e.g. the iSCSI initiator, can be configured to keep
trying to reconnect after a transport layer failure. That has the
advantage that the SCSI host number remains the same after reconnecting
succeeded as before reconnecting started.
Bart.
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
[not found] ` <50C0A76C.20500-HInyCGIudOg@public.gmane.org>
@ 2012-12-06 14:27 ` Or Gerlitz
[not found] ` <50C0AB42.8040402-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 42+ messages in thread
From: Or Gerlitz @ 2012-12-06 14:27 UTC (permalink / raw)
To: Bart Van Assche
Cc: David Dillow, Roland Dreier, James Bottomley,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-scsi,
fujita.tomonori-Zyj7fXuS5i5L9jVzuh4AOg,
rcj-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Alex Turin
On 06/12/2012 16:10, Bart Van Assche wrote:
> On 12/05/12 22:32, Or Gerlitz wrote:
>> On Wed, Dec 5, 2012 at 8:50 PM, Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
>> wrote:
>> [...]
>>> The only way to make I/O work reliably if a failure can occur at the
>>> transport layer is to use multipathd on top of ib_srp. If a
>>> connection fails
>>> for some reason, then the SRP SCSI host will be removed after the
>>> SCSI error
>>> handler has finished with its error recovery strategy. And once the
>>> transport layer is operational again and srp_daemon detects that the
>>> initiator is no longer logged in srp_daemon will make ib_srp log in
>>> again.
>>> multipathd will then cause I/O to continue over the new path.
>>
>> Claim basically understood and agreed however, does this also hold
>> when the link is back again, that is can't SRP login via this single
>> path also when there's no multipath on top?
>
> As far as I can remember the behavior of ib_srp has always been to try
> to reconnect once to the SRP target after the SCSI error handler
> kicked in. Other SCSI LLDs, e.g. the iSCSI initiator, can be
> configured to keep trying to reconnect after a transport layer
> failure. That has the advantage that the SCSI host number remains the
> same after reconnecting succeeded as before reconnecting started.
>
Bart,
The core problem here seems to be that scsi_remove_host simply never ends.
Observing all the tasks in the system (e.g using "echo t >
/proc/sysrq-trigger"), we've noted that
none of the SCSI EH are currently running, that is for all of them their
trace is the following
scsi_eh_0 S 0000000000000000 0 380 2 0x00000000
ffff88042c31be08 0000000000000046 ffff88042c31bfd8 0000000000014380
ffff88042c31a010 0000000000014380 0000000000014380 0000000000014380
ffff88042c31bfd8 0000000000014380 ffff88042f5be5c0 ffff88042bb48c40
Call Trace:
[<ffffffff8139b2c0>] ? scsi_unjam_host+0x1f0/0x1f0
[<ffffffff8155c599>] schedule+0x29/0x70
[<ffffffff8139b335>] scsi_error_handler+0x75/0x1c0
[<ffffffff8139b2c0>] ? scsi_unjam_host+0x1f0/0x1f0
[<ffffffff8107cc2e>] kthread+0xee/0x100
[<ffffffff8107cb40>] ? __init_kthread_worker+0x70/0x70
[<ffffffff8156676c>] ret_from_fork+0x7c/0xb0
[<ffffffff8107cb40>] ? __init_kthread_worker+0x70/0x70
However, the flow starting in srp_remove_target hangs somewhere in the
block layer waiting for something to happen
worker/11:1 D 0000000000000000 0 163 2 0x00000000
ffff88082be6f738 0000000000000046 ffff88082be6ffd8 0000000000014380
ffff88082be6e010 0000000000014380 0000000000014380 0000000000014380
ffff88082be6ffd8 0000000000014380 ffff88042f5ba580 ffff88082be6c1c0
Call Trace:
[<ffffffff8155c599>] schedule+0x29/0x70
[<ffffffff8155a60f>] schedule_timeout+0x14f/0x240
[<ffffffff810674f0>] ? lock_timer_base+0x70/0x70
[<ffffffff8155c43b>] wait_for_common+0x11b/0x170
[<ffffffff81091ab0>] ? try_to_wake_up+0x300/0x300
[<ffffffff8155c543>] wait_for_completion_timeout+0x13/0x20
[<ffffffff8125ecc3>] blk_execute_rq+0x133/0x1c0
[<ffffffff81257830>] ? get_request+0x210/0x3d0
[<ffffffff8139dfb8>] scsi_execute+0xe8/0x180
[<ffffffff8139e1f7>] scsi_execute_req+0xa7/0x110
[<ffffffffa0086498>] sd_sync_cache+0xd8/0x130 [sd_mod]
[<ffffffff8137180e>] ? __dev_printk+0x3e/0x90
[<ffffffff81371b45>] ? dev_printk+0x45/0x50
[<ffffffffa0086700>] sd_shutdown+0xd0/0x150 [sd_mod]
[<ffffffffa008691c>] sd_remove+0x7c/0xc0 [sd_mod]
[<ffffffff81375dec>] __device_release_driver+0x7c/0xe0
[<ffffffff81375f5f>] device_release_driver+0x2f/0x50
[<ffffffff81374e46>] bus_remove_device+0x126/0x190
[<ffffffff81372bbb>] device_del+0x14b/0x250
[<ffffffff813a2878>] __scsi_remove_device+0x1b8/0x1d0
[<ffffffff8139eba6>] scsi_forget_host+0xf6/0x110
[<ffffffff81396448>] scsi_remove_host+0x108/0x1e0
[<ffffffffa0536c38>] srp_remove_target+0xb8/0x150 [ib_srp]
[<ffffffffa0536d34>] srp_remove_work+0x64/0xa0 [ib_srp]
[<ffffffff81074ce2>] process_one_work+0x1c2/0x4a0
[<ffffffff81074c70>] ? process_one_work+0x150/0x4a0
[<ffffffffa0536cd0>] ? srp_remove_target+0x150/0x150 [ib_srp]
[<ffffffff8107746e>] worker_thread+0x12e/0x370
[<ffffffff81077340>] ? manage_workers+0x180/0x180
[<ffffffff8107cc2e>] kthread+0xee/0x100
[<ffffffff8107cb40>] ? __init_kthread_worker+0x70/0x70
[<ffffffff8156676c>] ret_from_fork+0x7c/0xb0
[<ffffffff8107cb40>] ? __init_kthread_worker+0x70/0x70
looking on the current locks in the system, we see that this kworker task
holds four locks, but none of them seems to be mutually held by another
task,
Showing all locks held in the system:
4 locks held by kworker/11:1/163:
#0: (events_long){.+.+.+}, at: [<ffffffff81074c70>]
process_one_work+0x150/0x4a0
#1: ((&target->remove_work)){+.+.+.}, at: [<ffffffff81074c70>]
process_one_work+0x150/0x4a0
#2: (&shost->scan_mutex){+.+.+.}, at: [<ffffffff81396374>]
scsi_remove_host+0x34/0x1e0
#3: (&__lockdep_no_validate__){......}, at: [<ffffffff81375f57>]
device_release_driver+0x27/0x50
1 lock held by bash/6298:
#0: (&tty->atomic_read_lock){+.+...}, at: [<ffffffff81339a9e>]
n_tty_read+0x58e/0x960
1 lock held by mingetty/6319:
#0: (&tty->atomic_read_lock){+.+...}, at: [<ffffffff81339a9e>]
n_tty_read+0x58e/0x960
1 lock held by mingetty/6321:
#0: (&tty->atomic_read_lock){+.+...}, at: [<ffffffff81339a9e>]
n_tty_read+0x58e/0x960
1 lock held by mingetty/6323:
#0: (&tty->atomic_read_lock){+.+...}, at: [<ffffffff81339a9e>]
n_tty_read+0x58e/0x960
1 lock held by mingetty/6325:
#0: (&tty->atomic_read_lock){+.+...}, at: [<ffffffff81339a9e>]
n_tty_read+0x58e/0x960
1 lock held by mingetty/6327:
#0: (&tty->atomic_read_lock){+.+...}, at: [<ffffffff81339a9e>]
n_tty_read+0x58e/0x960
1 lock held by mingetty/6329:
#0: (&tty->atomic_read_lock){+.+...}, at: [<ffffffff81339a9e>]
n_tty_read+0x58e/0x960
1 lock held by agetty/6337:
#0: (&tty->atomic_read_lock){+.+...}, at: [<ffffffff81339a9e>]
n_tty_read+0x58e/0x960
2 locks held by bash/6479:
#0: (sysrq_key_table_lock){......}, at: [<ffffffff81340f52>]
__handle_sysrq+0x32/0x190
#1: (tasklist_lock){.+.+..}, at: [<ffffffff810b7b04>]
debug_show_all_locks+0x44/0x1e0
Alex and Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
[not found] ` <50C0AB42.8040402-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2012-12-06 15:04 ` Bart Van Assche
[not found] ` <50C0B407.4010706-HInyCGIudOg@public.gmane.org>
2012-12-07 8:19 ` Or Gerlitz
1 sibling, 1 reply; 42+ messages in thread
From: Bart Van Assche @ 2012-12-06 15:04 UTC (permalink / raw)
To: Or Gerlitz
Cc: David Dillow, Roland Dreier, James Bottomley,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-scsi,
fujita.tomonori-Zyj7fXuS5i5L9jVzuh4AOg,
rcj-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Alex Turin
On 12/06/12 15:27, Or Gerlitz wrote:
> The core problem here seems to be that scsi_remove_host simply never ends.
Hello Or,
The later patches in the srp-ha patch series avoided such behavior by
checking whether the connection between SRP initiator and target is
unique, and by removing duplicate SCSI hosts for which the transport
layer failed. Unfortunately these patches are still under review.
Unless someone can come up with a better solution I will post a patch
one of the next days that makes ib_srp again fail all commands after
host removal started. That will avoid spending a long time doing error
recovery.
Also, you might have noticed that Hannes Reinecke reported a few days
ago that the SCSI error handler may need a lot of time for other
transport types - this behavior is not SRP specific.
Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
[not found] ` <50C0B407.4010706-HInyCGIudOg@public.gmane.org>
@ 2012-12-06 15:46 ` Or Gerlitz
2012-12-06 15:55 ` Alex Turin
1 sibling, 0 replies; 42+ messages in thread
From: Or Gerlitz @ 2012-12-06 15:46 UTC (permalink / raw)
To: Bart Van Assche
Cc: David Dillow, Roland Dreier, James Bottomley,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-scsi,
fujita.tomonori-Zyj7fXuS5i5L9jVzuh4AOg,
rcj-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Alex Turin
On 06/12/2012 17:04, Bart Van Assche wrote:
> On 12/06/12 15:27, Or Gerlitz wrote:
>> The core problem here seems to be that scsi_remove_host simply never
>> ends.
>
> Hello Or,
>
> The later patches in the srp-ha patch series avoided such behavior by
> checking whether the connection between SRP initiator and target is
> unique, and by removing duplicate SCSI hosts for which the transport
> layer failed. Unfortunately these patches are still under review.
> Unless someone can come up with a better solution I will post a patch
> one of the next days that makes ib_srp again fail all commands after
> host removal started. That will avoid spending a long time doing error
> recovery.
>
> Also, you might have noticed that Hannes Reinecke reported a few days
> ago that the SCSI error handler may need a lot of time for other
> transport types - this behavior is not SRP specific.
I'm not sure what to you exactly refer by duplicated SCSI hosts in this
context or why we have them. Again, at the time we've took the stack
traces snapshot from the system none of the SCSI EH threads was active,
so I'm not sure either your comment about spending long time in the
error recovery flow, as the flow we've run into seems to simply wait
forever.
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
[not found] ` <50C0B407.4010706-HInyCGIudOg@public.gmane.org>
2012-12-06 15:46 ` Or Gerlitz
@ 2012-12-06 15:55 ` Alex Turin
[not found] ` <50C0BFE0.909-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
1 sibling, 1 reply; 42+ messages in thread
From: Alex Turin @ 2012-12-06 15:55 UTC (permalink / raw)
To: Bart Van Assche
Cc: Or Gerlitz, David Dillow, Roland Dreier, James Bottomley,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-scsi,
fujita.tomonori-Zyj7fXuS5i5L9jVzuh4AOg,
rcj-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8
On 12/6/2012 5:04 PM, Bart Van Assche wrote:
> On 12/06/12 15:27, Or Gerlitz wrote:
>> The core problem here seems to be that scsi_remove_host simply never
>> ends.
>
> Hello Or,
>
> The later patches in the srp-ha patch series avoided such behavior by
> checking whether the connection between SRP initiator and target is
> unique, and by removing duplicate SCSI hosts for which the transport
> layer failed. Unfortunately these patches are still under review.
> Unless someone can come up with a better solution I will post a patch
> one of the next days that makes ib_srp again fail all commands after
> host removal started. That will avoid spending a long time doing error
> recovery.
>
> Also, you might have noticed that Hannes Reinecke reported a few days
> ago that the SCSI error handler may need a lot of time for other
> transport types - this behavior is not SRP specific.
>
> Bart.
>
Hello Bart,
In our case we don't have duplicate hosts or targets. We are working
with a single SCSI disk.
To make scsi_remove_host hang we simply disabling a IB port and run "dd
if=/dev/sdb of=/dev/null count=1".
Alex
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
[not found] ` <50C0AB42.8040402-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2012-12-06 15:04 ` Bart Van Assche
@ 2012-12-07 8:19 ` Or Gerlitz
1 sibling, 0 replies; 42+ messages in thread
From: Or Gerlitz @ 2012-12-07 8:19 UTC (permalink / raw)
To: Bart Van Assche, David Dillow
Cc: Roland Dreier, James Bottomley,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-scsi,
fujita.tomonori-Zyj7fXuS5i5L9jVzuh4AOg,
rcj-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Alex Turin
On Thu, Dec 6, 2012 at 4:27 PM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
[...]
> looking on the current locks in the system, we see that this kworker task
> holds four locks, but none of them seems to be mutually held by another task,
That was ofcourse a wrong assertion, as a lock can't be mutually held
by two tasks...
I wanted to say that from the general view of which locks are being
held there's no obvious deadlock here, since none of the other locks
holders relates to the block/scsi layers...
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH 00/11] First pass at merging Bart's HA work
[not found] ` <50C0BFE0.909-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2012-12-07 21:47 ` Vu Pham
0 siblings, 0 replies; 42+ messages in thread
From: Vu Pham @ 2012-12-07 21:47 UTC (permalink / raw)
To: Alex Turin
Cc: Bart Van Assche, Or Gerlitz, David Dillow, Roland Dreier,
James Bottomley,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-scsi,
fujita.tomonori-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org,
rcj-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org
Alex Turin wrote:
> On 12/6/2012 5:04 PM, Bart Van Assche wrote:
>
>> On 12/06/12 15:27, Or Gerlitz wrote:
>>
>>> The core problem here seems to be that scsi_remove_host simply never
>>> ends.
>>>
>> Hello Or,
>>
>> The later patches in the srp-ha patch series avoided such behavior by
>> checking whether the connection between SRP initiator and target is
>> unique, and by removing duplicate SCSI hosts for which the transport
>> layer failed. Unfortunately these patches are still under review.
>> Unless someone can come up with a better solution I will post a patch
>> one of the next days that makes ib_srp again fail all commands after
>> host removal started. That will avoid spending a long time doing error
>> recovery.
>>
>> Also, you might have noticed that Hannes Reinecke reported a few days
>> ago that the SCSI error handler may need a lot of time for other
>> transport types - this behavior is not SRP specific.
>>
>> Bart.
>>
>>
> Hello Bart,
>
> In our case we don't have duplicate hosts or targets. We are working
> with a single SCSI disk.
> To make scsi_remove_host hang we simply disabling a IB port and run "dd
> if=/dev/sdb of=/dev/null count=1".
>
>
Hello Bart,
I applied your latest patch [PATCH for-next] IB/srp: Make SCSI error
handling finish
and test
Let me capture what I'm seeing:
Host has two paths (scsi_host 7 & 8) to target thru two physical ports 1 & 2
[root@rsws42 ~]# multipath -l
size=50G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=0 status=active
| `- 7:0:0:11 sdb 8:16 active undef running
`-+- policy='round-robin 0' prio=0 status=enabled
`- 8:0:0:11 sdc 8:32 active undef running
Cable pull by disable port 1, I/Os fail-over fine, the problem is the
cleaning of scsi_host 7 of fail path.
IB RC failure, scsi error recovery kick in.
srp _reconnect_target() failed, srp_remove_target() run to remove
scsi_host 7; however, I think it get stuck at device_del(dev) inside
__scsi_remove_device(dev)
Error recovery continuously happen again and again on scsi host 7 for
9-10 minutes.
scsi_host 7 cannot be cleaned up, its sysfs entry is still there
(/sys/class/scsi_host/host7), its state is SHOST_CANCEL.
I brought port 1 back online, scsi_host 7 cannot reconnect to target
because its state in SRP_TARGET_REMOVED.
scci_host 7 sysfs entry does not contain target login info (ioc_guid,
id_ext, dgid...).
I think srp_daemon can reconnect to target by creating new path with new
scsi hosst; however, I cannot check because I currently don't have a
working srp_daemon.
I need to manually reconnect to target with echo command
Bottom line, I/Os can fail-over/failback; however, old scsi hosts cannot
be removed (sysfs entry is still there) with state SHOST_CANCEL, error
recovery keep happening on old scsi hosts for 10-20 minutes.
thanks,
-vu
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 42+ messages in thread
end of thread, other threads:[~2012-12-07 21:47 UTC | newest]
Thread overview: 42+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-11-26 4:44 [PATCH 00/11] First pass at merging Bart's HA work David Dillow
2012-11-26 4:44 ` [PATCH 01/11] IB/srp: enlarge block layer timeout David Dillow
2012-11-26 4:44 ` [PATCH 02/11] IB/srp: simplify state tracking David Dillow
2012-11-26 9:46 ` Bart Van Assche
[not found] ` <50B33A91.3060103-HInyCGIudOg@public.gmane.org>
2012-11-27 3:56 ` David Dillow
2012-11-26 4:44 ` [PATCH 05/11] IB/srp: destroy and recreate QP and CQs on each connection David Dillow
[not found] ` <8fa9a268ec4dc587970161efe94968f3263aad3b.1353903448.git.dillowda-1Heg1YXhbW8@public.gmane.org>
2012-11-26 18:57 ` Bart Van Assche
2012-11-26 4:44 ` [PATCH 06/11] IB/srp: send disconnect request without waiting for CM timewait exit David Dillow
2012-11-26 4:44 ` [PATCH 07/11] IB/srp: Document sysfs attributes David Dillow
[not found] ` <cover.1353903448.git.dillowda-1Heg1YXhbW8@public.gmane.org>
2012-11-26 4:44 ` [PATCH 03/11] IB/srp: don't send anything on a bad QP David Dillow
2012-11-26 9:17 ` Bart Van Assche
[not found] ` <50B333AF.6040509-HInyCGIudOg@public.gmane.org>
2012-11-27 3:31 ` David Dillow
2012-11-26 4:44 ` [PATCH 04/11] IB/srp: keep processing commands during host removal David Dillow
[not found] ` <8715294a23dded5879b3a327c470d9b6a39ddbc4.1353903448.git.dillowda-1Heg1YXhbW8@public.gmane.org>
2012-11-26 9:43 ` Bart Van Assche
2012-11-27 3:16 ` David Dillow
2012-11-26 4:44 ` [PATCH 08/11] srp_transport: Fix attribute registration David Dillow
2012-11-26 4:44 ` [PATCH 09/11] srp_transport: Simplify attribute initialization code David Dillow
2012-11-26 5:02 ` David Dillow
2012-11-26 4:44 ` [PATCH 11/11] IB/srp: Allow SRP disconnect through sysfs David Dillow
2012-11-26 4:44 ` [PATCH 10/11] srp_transport: Document sysfs attributes David Dillow
2012-11-26 7:57 ` [PATCH 00/11] First pass at merging Bart's HA work Or Gerlitz
2012-11-27 4:53 ` David Dillow
2012-11-26 18:50 ` Roland Dreier
2012-11-26 19:15 ` James Bottomley
2012-11-26 19:22 ` Roland Dreier
2012-11-27 4:04 ` David Dillow
[not found] ` <1353989041.28917.24.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
2012-11-27 6:42 ` Or Gerlitz
2012-11-29 20:21 ` Roland Dreier
[not found] ` <CAL1RGDXpdWL_r7sWp=vvvXH4jxFgjDL+XcEGgKo-44=wrOBmtA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-11-30 2:21 ` David Dillow
2012-12-05 18:23 ` Or Gerlitz
[not found] ` <CAJZOPZJBTRXftrW5NWEEHnf2QWsni0HMTAV_PKSgDtA7GO=wRw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-12-05 18:50 ` Bart Van Assche
[not found] ` <50BF9760.2080801-HInyCGIudOg@public.gmane.org>
2012-12-05 19:50 ` Bart Van Assche
2012-12-05 21:32 ` Or Gerlitz
2012-12-06 14:10 ` Bart Van Assche
[not found] ` <50C0A76C.20500-HInyCGIudOg@public.gmane.org>
2012-12-06 14:27 ` Or Gerlitz
[not found] ` <50C0AB42.8040402-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2012-12-06 15:04 ` Bart Van Assche
[not found] ` <50C0B407.4010706-HInyCGIudOg@public.gmane.org>
2012-12-06 15:46 ` Or Gerlitz
2012-12-06 15:55 ` Alex Turin
[not found] ` <50C0BFE0.909-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2012-12-07 21:47 ` Vu Pham
2012-12-07 8:19 ` Or Gerlitz
2012-11-27 16:34 ` Bart Van Assche
[not found] ` <50B4EBA3.7070400-HInyCGIudOg@public.gmane.org>
2012-11-27 18:10 ` Joseph Glanville
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).