* [PATCH 6/6] SRP handles send/recv errors and connection closed event
@ 2009-11-09 21:35 Vu Pham
From: Vu Pham @ 2009-11-09 21:35 UTC (permalink / raw)
  To: Linux RDMA list; +Cc: Roland Dreier

[-- Attachment #1: Type: text/plain, Size: 796 bytes --]

Set up a timer of SRP_CONN_ERR_TIMEOUT seconds to propagate I/O errors
and clean up connection resources. By the time we receive a CQE with an
error status or a connection-closed callback, the target has already been
gone from the fabric for a while (30 seconds < n < ...); therefore, we
just use a default value of 1 second for SRP_CONN_ERR_TIMEOUT.

The best solution would be to register for traps when a target joins or
leaves the fabric and to set the timer to target->device_loss_timeout
seconds to propagate I/O errors and clean up connection resources.
However, that solution requires a lot of changes in srp_daemon to
register for traps and pass the trap events down to the srp driver, and
the srp driver would need to create sysfs entry points to receive them.
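
Both error paths (failed completion and connection closed) use the same
check-and-arm guard, so the timer is set up at most once per failure and
only while the target is live. A minimal user-space sketch of that guard
follows. Every sketch_* name is hypothetical and only stands in for the
driver's types and for srp_qp_err_add_timer(); locking is not modeled, and
in the driver the guarded body runs with the SCSI host_lock held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types; not the real ib_srp.h. */
enum sketch_state { SKETCH_TARGET_LIVE, SKETCH_TARGET_REMOVED };

struct sketch_target {
	enum sketch_state state;
	bool qp_in_error;
	int timers_armed;      /* how many times the stub timer was set up */
	unsigned int timeout;  /* seconds passed to the last timer setup */
};

/* Stub standing in for srp_qp_err_add_timer(); it only records the call. */
static void sketch_add_timer(struct sketch_target *t, unsigned int secs)
{
	t->timers_armed++;
	t->timeout = secs;
}

/*
 * Called from either error path. In the driver this body would run under
 * host_lock so that the check and the flag update cannot interleave with
 * the other path. Returns true if this call armed the timer.
 */
static bool sketch_report_conn_error(struct sketch_target *t,
				     unsigned int secs)
{
	assert(t != NULL);

	/* Already handling an error, or target not live: do nothing. */
	if (t->qp_in_error || t->state != SKETCH_TARGET_LIVE)
		return false;

	t->qp_in_error = true;
	sketch_add_timer(t, secs);
	return true;
}
```

With this guard, a completion error followed by a connection-closed event
on the same target arms the timer only on the first of the two calls.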

Signed-off-by: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>


[-- Attachment #2: srp_6_handling_send_recv_conn_errors.patch --]
[-- Type: text/plain, Size: 2212 bytes --]

 drivers/infiniband/ulp/srp/ib_srp.c |   18 +++++++++++++++++-
 drivers/infiniband/ulp/srp/ib_srp.h |    1 +
 2 files changed, 18 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index f62ef8f..04f4ece 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -950,11 +950,19 @@ static void srp_completion(struct ib_cq *cq, void *target_ptr)
 	ib_req_notify_cq(cq, IB_CQ_NEXT_COMP);
 	while (ib_poll_cq(cq, 1, &wc) > 0) {
 		if (wc.status) {
+			unsigned long flags;
+
 			shost_printk(KERN_ERR, target->scsi_host,
 				     PFX "failed %s status %d\n",
 				     wc.wr_id & SRP_OP_RECV ? "receive" : "send",
 				     wc.status);
-			target->qp_in_error = 1;
+			spin_lock_irqsave(target->scsi_host->host_lock, flags);
+			if (!target->qp_in_error &&
+			    target->state == SRP_TARGET_LIVE) {
+				target->qp_in_error = 1;
+				srp_qp_err_add_timer(target, SRP_CONN_ERR_TIMEOUT);
+			}
+			spin_unlock_irqrestore(target->scsi_host->host_lock, flags);
 			break;
 		}
 
@@ -1258,6 +1266,7 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event)
 	int attr_mask = 0;
 	int comp = 0;
 	int opcode = 0;
+	unsigned long flags;
 
 	switch (event->event) {
 	case IB_CM_REQ_ERROR:
@@ -1344,6 +1353,13 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event)
 	case IB_CM_TIMEWAIT_EXIT:
 		shost_printk(KERN_ERR, target->scsi_host,
 			     PFX "connection closed\n");
+		spin_lock_irqsave(target->scsi_host->host_lock, flags);
+		if (!target->qp_in_error &&
+		    target->state == SRP_TARGET_LIVE) {
+			target->qp_in_error = 1;
+			srp_qp_err_add_timer(target, SRP_CONN_ERR_TIMEOUT);
+		}
+		spin_unlock_irqrestore(target->scsi_host->host_lock, flags);
 		target->status = 0;
 		break;
 
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index 74d1f09..131f7a8 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -49,6 +49,7 @@
 enum {
 	SRP_PATH_REC_TIMEOUT_MS	= 1000,
 	SRP_ABORT_TIMEOUT_MS	= 5000,
+	SRP_CONN_ERR_TIMEOUT	= 1,
 
 	SRP_PORT_REDIRECT	= 1,
 	SRP_DLID_REDIRECT	= 2,
