* [PATCH 6/6] SRP handles send/recv errors and connection closed event
@ 2009-11-09 21:35 Vu Pham
From: Vu Pham @ 2009-11-09 21:35 UTC (permalink / raw)
To: Linux RDMA list; +Cc: Roland Dreier
[-- Attachment #1: Type: text/plain, Size: 796 bytes --]
Set up a timer for SRP_CONN_ERR_TIMEOUT seconds to propagate I/O errors
and clean up connection resources. By the time we receive a CQE with an
error status or a connection-closed callback, the target has already been
gone from the fabric for a while (30 seconds < n < ...); therefore, we
just set SRP_CONN_ERR_TIMEOUT to a default value of 1 second.
The best solution would be to register for traps when a target joins or
leaves the fabric and to set up the timer for target->device_loss_timeout
seconds to propagate I/O errors and clean up connection resources; however,
this solution requires a lot of changes in srp_daemon to register for
traps and pass these trap events down to the srp driver, and the srp driver
would need to create sys entry points to receive them.
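
For illustration only, here is a minimal userspace sketch of the guard that
both the completion handler and the CM handler apply in this patch: the
timer is armed only for the first error seen on a live target, so repeated
error completions do not re-arm it. All names prefixed with sketch_ are
hypothetical stand-ins and not part of ib_srp; the real code does this under
host_lock and calls srp_qp_err_add_timer(), which is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver state; not the real ib_srp types. */
enum sketch_state { SKETCH_TARGET_LIVE, SKETCH_TARGET_REMOVED };

struct sketch_target {
	enum sketch_state state;
	bool qp_in_error;
	int timers_armed;	/* how many times the timer was armed */
	int timeout_secs;	/* timeout passed when it was armed */
};

/*
 * Mark the connection as failed and "arm" the cleanup timer, but only for
 * the first error on a live target. Locking is omitted because this model
 * is single-threaded; the patch takes host_lock around the same check.
 * Returns true if this call armed the timer.
 */
static bool sketch_report_conn_error(struct sketch_target *t, int timeout_secs)
{
	assert(t);

	if (t->qp_in_error || t->state != SKETCH_TARGET_LIVE)
		return false;

	t->qp_in_error = true;
	t->timeout_secs = timeout_secs;
	t->timers_armed++;
	return true;
}
```

Calling sketch_report_conn_error() twice on the same live target arms the
timer once; calling it on a target that is not live does nothing.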
Signed-off-by: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
[-- Attachment #2: srp_6_handling_send_recv_conn_errors.patch --]
[-- Type: text/plain, Size: 2212 bytes --]
drivers/infiniband/ulp/srp/ib_srp.c | 18 +++++++++++++++++-
drivers/infiniband/ulp/srp/ib_srp.h | 1 +
2 files changed, 18 insertions(+), 1 deletions(-)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index f62ef8f..04f4ece 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -950,11 +950,19 @@ static void srp_completion(struct ib_cq *cq, void *target_ptr)
 	ib_req_notify_cq(cq, IB_CQ_NEXT_COMP);
 	while (ib_poll_cq(cq, 1, &wc) > 0) {
 		if (wc.status) {
+			unsigned long flags;
+
 			shost_printk(KERN_ERR, target->scsi_host,
 				     PFX "failed %s status %d\n",
 				     wc.wr_id & SRP_OP_RECV ? "receive" : "send",
 				     wc.status);
-			target->qp_in_error = 1;
+			spin_lock_irqsave(target->scsi_host->host_lock, flags);
+			if (!target->qp_in_error &&
+			    target->state == SRP_TARGET_LIVE) {
+				target->qp_in_error = 1;
+				srp_qp_err_add_timer(target, SRP_CONN_ERR_TIMEOUT);
+			}
+			spin_unlock_irqrestore(target->scsi_host->host_lock, flags);
 			break;
 		}
@@ -1258,6 +1266,7 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event)
 	int attr_mask = 0;
 	int comp = 0;
 	int opcode = 0;
+	unsigned long flags;
 	switch (event->event) {
 	case IB_CM_REQ_ERROR:
@@ -1344,6 +1353,13 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event)
 	case IB_CM_TIMEWAIT_EXIT:
 		shost_printk(KERN_ERR, target->scsi_host,
 			     PFX "connection closed\n");
+		spin_lock_irqsave(target->scsi_host->host_lock, flags);
+		if (!target->qp_in_error &&
+		    target->state == SRP_TARGET_LIVE) {
+			target->qp_in_error = 1;
+			srp_qp_err_add_timer(target, SRP_CONN_ERR_TIMEOUT);
+		}
+		spin_unlock_irqrestore(target->scsi_host->host_lock, flags);
 		target->status = 0;
 		break;
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index 74d1f09..131f7a8 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -49,6 +49,7 @@
 enum {
 	SRP_PATH_REC_TIMEOUT_MS = 1000,
 	SRP_ABORT_TIMEOUT_MS = 5000,
+	SRP_CONN_ERR_TIMEOUT = 1,
 	SRP_PORT_REDIRECT = 1,
 	SRP_DLID_REDIRECT = 2,