From: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
To: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>,
	"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Alex Turin <alextu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Subject: [PATCH v2 2/2] IB/srp: Avoid endless SCSI error handling loop
Date: Fri, 14 Dec 2012 16:38:15 +0100
Message-ID: <50CB47E7.2060308@acm.org>
In-Reply-To: <50CB46A4.4050300-HInyCGIudOg@public.gmane.org>

If a SCSI command times out, it is passed to the SCSI error
handler. The SCSI error handler first tries to abort the command
that timed out. If the abort fails, a device reset is
attempted. If the device reset fails too, a host reset is
attempted. If the host reset also fails, the whole procedure
is repeated.

Since srp_abort() and srp_reset_device() fail for a QP in the
error state, and since srp_reset_host() fails after host removal
has started, an endless loop will be triggered.

Hence modify the SCSI error handling functions in ib_srp as
follows:
- Abort SCSI commands properly even if the QP is in the error
  state.
- Make srp_reset_host() reset SCSI requests even if host
  removal has already started or if reconnecting fails.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Reported-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Alex Turin <alextu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |   29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)
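
The escalation described in the commit message can be sketched as a
small user-space model. This is only an illustration under stated
assumptions: struct eh_model, the model_*() helpers and eh_rounds() are
hypothetical names and are not part of ib_srp or the SCSI midlayer. The
model also simplifies the patched handlers to "always makes progress",
which the real driver does not promise in every case.

```c
#include <stdbool.h>

/* Hypothetical model of the abort -> device reset -> host reset
 * escalation; none of these names exist in the kernel. */
enum eh_result { EH_SUCCESS, EH_FAILED };

struct eh_model {
	bool qp_in_error;     /* stands in for target->qp_in_error */
	bool removal_started; /* stands in for target->state != SRP_TARGET_LIVE */
	bool patched;         /* true: handlers behave as after this patch */
};

static enum eh_result model_abort(const struct eh_model *m)
{
	/* Unpatched: abort gives up as soon as the QP is in the error state. */
	if (!m->patched && m->qp_in_error)
		return EH_FAILED;
	return EH_SUCCESS;
}

static enum eh_result model_reset_device(const struct eh_model *m)
{
	/* The commit message states that device reset fails for a QP in error. */
	return m->qp_in_error ? EH_FAILED : EH_SUCCESS;
}

static enum eh_result model_reset_host(const struct eh_model *m)
{
	/* Unpatched: host reset fails once host removal has started. */
	if (!m->patched && m->removal_started)
		return EH_FAILED;
	return EH_SUCCESS;
}

/* Returns the round in which recovery succeeded, or -1 if nothing
 * succeeded within max_rounds (the model's stand-in for "endless"). */
int eh_rounds(const struct eh_model *m, int max_rounds)
{
	for (int round = 1; round <= max_rounds; ++round) {
		if (model_abort(m) == EH_SUCCESS)
			return round;
		if (model_reset_device(m) == EH_SUCCESS)
			return round;
		if (model_reset_host(m) == EH_SUCCESS)
			return round;
		/* All three steps failed: the whole procedure repeats. */
	}
	return -1;
}
```

With qp_in_error and removal_started both set, the unpatched model
never leaves the loop, while the patched model terminates in the first
round because the abort step no longer bails out early.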

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 94f76b9..42b6286 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -700,23 +700,24 @@ static int srp_reconnect_target(struct srp_target_port *target)
 	struct Scsi_Host *shost = target->scsi_host;
 	int i, ret;
 
-	if (target->state != SRP_TARGET_LIVE)
-		return -EAGAIN;
-
 	scsi_target_block(&shost->shost_gendev);
 
 	srp_disconnect_target(target);
 	/*
-	 * Now get a new local CM ID so that we avoid confusing the
-	 * target in case things are really fouled up.
+	 * Now get a new local CM ID so that we avoid confusing the target in
+	 * case things are really fouled up. Doing so also ensures that all CM
+	 * callbacks will have finished before a new QP is allocated.
 	 */
 	ret = srp_new_cm_id(target);
-	if (ret)
-		goto unblock;
-
-	ret = srp_create_target_ib(target);
-	if (ret)
-		goto unblock;
+	/*
+	 * Whether or not creating a new CM ID succeeded, create a new
+	 * QP. This guarantees that all completion callback function
+	 * invocations have finished before request resetting starts.
+	 */
+	if (ret == 0)
+		ret = srp_create_target_ib(target);
+	else
+		srp_create_target_ib(target);
 
 	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
 		struct srp_request *req = &target->req_ring[i];
@@ -728,9 +729,9 @@ static int srp_reconnect_target(struct srp_target_port *target)
 	for (i = 0; i < SRP_SQ_SIZE; ++i)
 		list_add(&target->tx_ring[i]->list, &target->free_tx);
 
-	ret = srp_connect_target(target);
+	if (ret == 0)
+		ret = srp_connect_target(target);
 
-unblock:
 	scsi_target_unblock(&shost->shost_gendev, ret == 0 ? SDEV_RUNNING :
 			    SDEV_TRANSPORT_OFFLINE);
 
@@ -1736,7 +1737,7 @@ static int srp_abort(struct scsi_cmnd *scmnd)
 
 	shost_printk(KERN_ERR, target->scsi_host, "SRP abort called\n");
 
-	if (!req || target->qp_in_error || !srp_claim_req(target, req, scmnd))
+	if (!req || !srp_claim_req(target, req, scmnd))
 		return FAILED;
 	srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
 			  SRP_TSK_ABORT_TASK);
-- 
1.7.10.4

