linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/3] IB/SRP patches for kernel 3.8
@ 2012-12-19 14:20 Bart Van Assche
       [not found] ` <50D1CD49.5080607-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2012-12-19 14:20 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, David Dillow,
	Or Gerlitz, Vu Pham, Alex Turin

This patch series avoids that SCSI error handling triggers an endless 
loop and also restores reporting of QP errors in the kernel log.

Changes between v3 and v2:
- As proposed by Dave, added a patch that prevents sending of a task
   management function over a closed connection.

Changes between v2 and v1:
- Track connection state properly.
- Make srp_reset_host() reset requests even if reconnecting fails

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v3 1/3] IB/srp: Track connection state properly
       [not found] ` <50D1CD49.5080607-HInyCGIudOg@public.gmane.org>
@ 2012-12-19 14:21   ` Bart Van Assche
       [not found]     ` <50D1CD7D.6030905-HInyCGIudOg@public.gmane.org>
  2012-12-19 14:22   ` [PATCH v3 2/3] IB/srp: Avoid sending a task management function needlessly Bart Van Assche
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2012-12-19 14:21 UTC (permalink / raw)
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, David Dillow,
	Or Gerlitz, Vu Pham, Alex Turin, Roland Dreier

The connection state must be initialized before srp_connect_target()
is invoked. Drop the assignment in srp_add_target() since it occurs
after srp_connect_target() and since scsi_host_alloc()
zero-initializes the Scsi_Host structure anyway.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Acked-by: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index d5088ce..94f76b9 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1972,7 +1972,6 @@ static int srp_add_target(struct srp_host *host, struct srp_target_port *target)
 	spin_unlock(&host->target_lock);
 
 	target->state = SRP_TARGET_LIVE;
-	target->connected = false;
 
 	scsi_scan_target(&target->scsi_host->shost_gendev,
 			 0, target->scsi_id, SCAN_WILD_CARD, 0);
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 2/3] IB/srp: Avoid sending a task management function needlessly
       [not found] ` <50D1CD49.5080607-HInyCGIudOg@public.gmane.org>
  2012-12-19 14:21   ` [PATCH v3 1/3] IB/srp: Track connection state properly Bart Van Assche
@ 2012-12-19 14:22   ` Bart Van Assche
       [not found]     ` <50D1CDB4.9070906-HInyCGIudOg@public.gmane.org>
  2012-12-19 14:24   ` [PATCH v3 3/3] IB/srp: Avoid endless SCSI error handling loop Bart Van Assche
  2012-12-20 12:38   ` [PATCH v3 0/3] IB/SRP patches for kernel 3.8 Or Gerlitz
  3 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2012-12-19 14:22 UTC (permalink / raw)
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, David Dillow,
	Or Gerlitz, Vu Pham, Alex Turin, Roland Dreier

Do not send a task management function if sending will fail anyway
because either there is no RDMA/RC connection or the QP is in the
error state.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 94f76b9..2633258 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1695,6 +1695,9 @@ static int srp_send_tsk_mgmt(struct srp_target_port *target,
 	struct srp_iu *iu;
 	struct srp_tsk_mgmt *tsk_mgmt;
 
+	if (!target->connected || target->qp_in_error)
+		return -1;
+
 	init_completion(&target->tsk_mgmt_done);
 
 	spin_lock_irq(&target->lock);
@@ -1754,8 +1757,6 @@ static int srp_reset_device(struct scsi_cmnd *scmnd)
 
 	shost_printk(KERN_ERR, target->scsi_host, "SRP reset_device called\n");
 
-	if (target->qp_in_error)
-		return FAILED;
 	if (srp_send_tsk_mgmt(target, SRP_TAG_NO_REQ, scmnd->device->lun,
 			      SRP_TSK_LUN_RESET))
 		return FAILED;
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 3/3] IB/srp: Avoid endless SCSI error handling loop
       [not found] ` <50D1CD49.5080607-HInyCGIudOg@public.gmane.org>
  2012-12-19 14:21   ` [PATCH v3 1/3] IB/srp: Track connection state properly Bart Van Assche
  2012-12-19 14:22   ` [PATCH v3 2/3] IB/srp: Avoid sending a task management function needlessly Bart Van Assche
@ 2012-12-19 14:24   ` Bart Van Assche
       [not found]     ` <50D1CE0B.2080902-HInyCGIudOg@public.gmane.org>
  2012-12-20 12:38   ` [PATCH v3 0/3] IB/SRP patches for kernel 3.8 Or Gerlitz
  3 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2012-12-19 14:24 UTC (permalink / raw)
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, David Dillow,
	Or Gerlitz, Vu Pham, Alex Turin, Roland Dreier

If a SCSI command times out it is passed to the SCSI error
handler. The SCSI error handler will try to abort the commands
that timed out. If aborting failed a device reset will be
attempted. If the device reset also fails a host reset will
be attempted. If the host reset also fails the whole procedure
will be repeated.

srp_abort() and srp_reset_device() fail for a QP in the error
state. srp_reset_host() fails after host removal has started.
Hence if the SCSI error handler gets invoked after host removal
has started and with the QP in the error state an endless loop
will be triggered.

Modify the SCSI error handling functions in ib_srp as follows:
- Abort SCSI commands properly even if the QP is in the error
  state.
- Make srp_reset_host() reset SCSI requests even after host
  removal has already started or if reconnecting fails.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Alex Turin <alextu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |   29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 2633258..8a7eb9f 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -700,23 +700,24 @@ static int srp_reconnect_target(struct srp_target_port *target)
 	struct Scsi_Host *shost = target->scsi_host;
 	int i, ret;
 
-	if (target->state != SRP_TARGET_LIVE)
-		return -EAGAIN;
-
 	scsi_target_block(&shost->shost_gendev);
 
 	srp_disconnect_target(target);
 	/*
-	 * Now get a new local CM ID so that we avoid confusing the
-	 * target in case things are really fouled up.
+	 * Now get a new local CM ID so that we avoid confusing the target in
+	 * case things are really fouled up. Doing so also ensures that all CM
+	 * callbacks will have finished before a new QP is allocated.
 	 */
 	ret = srp_new_cm_id(target);
-	if (ret)
-		goto unblock;
-
-	ret = srp_create_target_ib(target);
-	if (ret)
-		goto unblock;
+	/*
+	 * Whether or not creating a new CM ID succeeded, create a new
+	 * QP. This guarantees that all completion callback function
+	 * invocations have finished before request resetting starts.
+	 */
+	if (ret == 0)
+		ret = srp_create_target_ib(target);
+	else
+		srp_create_target_ib(target);
 
 	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
 		struct srp_request *req = &target->req_ring[i];
@@ -728,9 +729,9 @@ static int srp_reconnect_target(struct srp_target_port *target)
 	for (i = 0; i < SRP_SQ_SIZE; ++i)
 		list_add(&target->tx_ring[i]->list, &target->free_tx);
 
-	ret = srp_connect_target(target);
+	if (ret == 0)
+		ret = srp_connect_target(target);
 
-unblock:
 	scsi_target_unblock(&shost->shost_gendev, ret == 0 ? SDEV_RUNNING :
 			    SDEV_TRANSPORT_OFFLINE);
 
@@ -1739,7 +1740,7 @@ static int srp_abort(struct scsi_cmnd *scmnd)
 
 	shost_printk(KERN_ERR, target->scsi_host, "SRP abort called\n");
 
-	if (!req || target->qp_in_error || !srp_claim_req(target, req, scmnd))
+	if (!req || !srp_claim_req(target, req, scmnd))
 		return FAILED;
 	srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
 			  SRP_TSK_ABORT_TASK);
-- 
1.7.10.4


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 1/3] IB/srp: Track connection state properly
       [not found]     ` <50D1CD7D.6030905-HInyCGIudOg@public.gmane.org>
@ 2012-12-19 18:04       ` David Dillow
       [not found]         ` <1355940274.23687.2.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: David Dillow @ 2012-12-19 18:04 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Or Gerlitz,
	Vu Pham, Alex Turin, Roland Dreier

On Wed, 2012-12-19 at 15:21 +0100, Bart Van Assche wrote:
> The connection state must be initialized before srp_connect_target()
> is invoked. Drop the assignment in srp_add_target() since it occurs
> after srp_connect_target() and since scsi_host_alloc()
> zero-initializes the Scsi_Host structure anyway.

The commit message is still wrong. The issue is that
srp_connect_target() has marked us as active, and this patch removes an
assignment that reverses that.

This shouldn't go upstream without the fixed commit message.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 2/3] IB/srp: Avoid sending a task management function needlessly
       [not found]     ` <50D1CDB4.9070906-HInyCGIudOg@public.gmane.org>
@ 2012-12-19 18:05       ` David Dillow
  0 siblings, 0 replies; 17+ messages in thread
From: David Dillow @ 2012-12-19 18:05 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Or Gerlitz,
	Vu Pham, Alex Turin, Roland Dreier

On Wed, 2012-12-19 at 15:22 +0100, Bart Van Assche wrote:
> Do not send a task management function if sending will fail anyway
> because either there is no RDMA/RC connection or the QP is in the
> error state.
> 
> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>

Acked-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 3/3] IB/srp: Avoid endless SCSI error handling loop
       [not found]     ` <50D1CE0B.2080902-HInyCGIudOg@public.gmane.org>
@ 2012-12-19 18:07       ` David Dillow
  0 siblings, 0 replies; 17+ messages in thread
From: David Dillow @ 2012-12-19 18:07 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Or Gerlitz,
	Vu Pham, Alex Turin, Roland Dreier

On Wed, 2012-12-19 at 15:24 +0100, Bart Van Assche wrote:
> If a SCSI command times out it is passed to the SCSI error
> handler. The SCSI error handler will try to abort the commands
> that timed out. If aborting failed a device reset will be
> attempted. If the device reset also fails a host reset will
> be attempted. If the host reset also fails the whole procedure
> will be repeated.
> 
> srp_abort() and srp_reset_device() fail for a QP in the error
> state. srp_reset_host() fails after host removal has started.
> Hence if the SCSI error handler gets invoked after host removal
> has started and with the QP in the error state an endless loop
> will be triggered.
> 
> Modify the SCSI error handling functions in ib_srp as follows:
> - Abort SCSI commands properly even if the QP is in the error
>   state.
> - Make srp_reset_host() reset SCSI requests even after host
>   removal has already started or if reconnecting fails.
> 
> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>

It's not the cleanest way to handle this, but it is the simplest and
we're short on time.

Acked-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 1/3] IB/srp: Track connection state properly
       [not found]         ` <1355940274.23687.2.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
@ 2012-12-20  8:13           ` Bart Van Assche
       [not found]             ` <50D2C8C4.3030308-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2012-12-20  8:13 UTC (permalink / raw)
  To: David Dillow; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On 12/19/12 19:04, David Dillow wrote:
> On Wed, 2012-12-19 at 15:21 +0100, Bart Van Assche wrote:
>> The connection state must be initialized before srp_connect_target()
>> is invoked. Drop the assignment in srp_add_target() since it occurs
>> after srp_connect_target() and since scsi_host_alloc()
>> zero-initializes the Scsi_Host structure anyway.
>
> The commit message is still wrong. The issue is that
> srp_connect_target() has marked us as active, and this patch removes an
> assignment that reverses that.
>
> This shouldn't go upstream without the fixed commit message.

Hi Dave,

How about replacing the commit message by the following text:

"Remove an assignment that incorrectly overwrites the connection state 
update by srp_connect_target()."

Thanks,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 0/3] IB/SRP patches for kernel 3.8
       [not found] ` <50D1CD49.5080607-HInyCGIudOg@public.gmane.org>
                     ` (2 preceding siblings ...)
  2012-12-19 14:24   ` [PATCH v3 3/3] IB/srp: Avoid endless SCSI error handling loop Bart Van Assche
@ 2012-12-20 12:38   ` Or Gerlitz
       [not found]     ` <50D306DE.7080505-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  3 siblings, 1 reply; 17+ messages in thread
From: Or Gerlitz @ 2012-12-20 12:38 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, David Dillow,
	Vu Pham, Alex Turin

On 19/12/2012 16:20, Bart Van Assche wrote:
> This patch series avoids that SCSI error handling triggers an endless 
> loop and also restores reporting of QP errors in the kernel log.
>
> Changes between v3 and v2:
> - As proposed by Dave, added a patch that prevents sending of a task
>   management function over a closed connection.
>
> Changes between v2 and v1:
> - Track connection state properly.
> - Make srp_reset_host() reset requests even if reconnecting fails
>

Bart,

I tried the patches now, took these three commits from 
git://github.com/bvanassche/linux.git
branch 3.8-ib-srp-fixes which I assume are the ones you posted here and 
applied them on Roland's for-next branch

commit 62862c0f93d47853b8d321fe5dcdd5d789e92d08
IB/srp: Avoid endless SCSI error handling loop

commit d73006faf751df4fc2fff7514f6fc6a74c41fd35
IB/srp: Avoid sending a task management function needlessly

commit befde6a2ceca6b894fb183ff96d175d6c380546b
IB/srp: Track connection state properly


Basically, I connected to an SRP target, later took down the initiator 
IB port, and then attempted to issue some IO over the SRP lun (no 
multipathing). What happens is that things are seemed to be detected 
properly, reconnection goes in a loop, but the SCSI host  (host6) isn't 
deleted, I can't unload the module since there's non-zero ref count. 
Only when I bring back
the port, I see

scsi host6: ib_srp: reconnect succeeded
sd 6:0:0:1: [sdc] Synchronizing SCSI cache
... etc on all the luns of that host

and only then the SCSI host is finally removed, the module ref count 
goes to zero and I canunload the module. Maybe another patch is needed 
here? I think few days ago you had a patch on your tree named "Save and 
restore host_scribble during error handling", is it possible we need 
this here for happy removal of the scsi host?

Or.

Dec 20 14:06:13 vsa33 kernel: scsi6 : SRP.T10:0002C9030010B014
Dec 20 14:06:13 vsa33 kernel: scsi 6:0:0:0: Direct-Access     SUN      COMSTAR          1.0  PQ: 0 ANSI: 5
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:0: Attached scsi generic sg1 type 0
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:0: [sdb] 409600 512-byte logical blocks: (209 MB/200 MiB)
Dec 20 14:06:13 vsa33 kernel: scsi 6:0:0:1: Direct-Access     SUN      COMSTAR          1.0  PQ: 0 ANSI: 5
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:1: Attached scsi generic sg2 type 0
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:1: [sdc] 409600 512-byte logical blocks: (209 MB/200 MiB)
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:1: [sdc] Write Protect is off
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:1: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 20 14:06:13 vsa33 kernel: scsi 6:0:0:2: Direct-Access     SUN      COMSTAR          1.0  PQ: 0 ANSI: 5
Dec 20 14:06:13 vsa33 kernel: sdc: unknown partition table
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:2: Attached scsi generic sg3 type 0
Dec 20 14:06:13 vsa33 kernel: scsi 6:0:0:3: Direct-Access     SUN      COMSTAR          1.0  PQ: 0 ANSI: 5
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:3: Attached scsi generic sg4 type 0
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:2: [sdd] 409600 512-byte logical blocks: (209 MB/200 MiB)
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:3: [sde] 104857600 512-byte logical blocks: (53.6 GB/50.0 GiB)
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:0: [sdb] Write Protect is off
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:3: [sde] Write Protect is off
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:3: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:1: [sdc] Attached SCSI disk
Dec 20 14:06:13 vsa33 kernel: sdb: unknown partition table
Dec 20 14:06:13 vsa33 kernel: sde: unknown partition table
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:2: [sdd] Write Protect is off
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:0: [sdb] Attached SCSI disk
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:3: [sde] Attached SCSI disk
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:2: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 20 14:06:13 vsa33 kernel: sdd: unknown partition table
Dec 20 14:06:13 vsa33 kernel: sd 6:0:0:2: [sdd] Attached SCSI disk
Dec 20 14:08:02 vsa33 kernel: scsi host6: ib_srp: failed send status 5
Dec 20 14:08:29 vsa33 kernel: scsi host6: SRP abort called
Dec 20 14:08:39 vsa33 kernel: scsi host6: SRP abort called
Dec 20 14:08:39 vsa33 kernel: scsi host6: SRP reset_device called
Dec 20 14:08:39 vsa33 kernel: scsi host6: ib_srp: SRP reset_host called
Dec 20 14:08:40 vsa33 kernel: scsi host6: ib_srp: Got failed path rec status -110
Dec 20 14:08:40 vsa33 kernel: scsi host6: ib_srp: Path record query failed
Dec 20 14:08:40 vsa33 kernel: scsi host6: ib_srp: reconnect failed (-110), removing target port.
Dec 20 14:08:40 vsa33 kernel: sd 6:0:0:1: Device offlined - not ready after error recovery
Dec 20 14:08:40 vsa33 kernel: sd 6:0:0:1: [sdc] Unhandled error code
Dec 20 14:08:40 vsa33 kernel: sd 6:0:0:1: [sdc]
Dec 20 14:08:40 vsa33 kernel: sd 6:0:0:1: [sdc] CDB:
Dec 20 14:08:40 vsa33 kernel: end_request: I/O error, dev sdc, sector 0
Dec 20 14:08:40 vsa33 kernel: Buffer I/O error on device sdc, logical block 0
Dec 20 14:08:40 vsa33 kernel: Buffer I/O error on device sdc, logical block 1
Dec 20 14:08:40 vsa33 kernel: sd 6:0:0:1: rejecting I/O to offline device
Dec 20 14:08:40 vsa33 kernel: Buffer I/O error on device sdc, logical block 2
Dec 20 14:08:40 vsa33 kernel: Buffer I/O error on device sdc, logical block 3
Dec 20 14:08:40 vsa33 kernel: sd 6:0:0:0: [sdb] Synchronizing SCSI cache
Dec 20 14:09:41 vsa33 kernel: scsi host6: SRP abort called
Dec 20 14:09:51 vsa33 kernel: scsi host6: SRP abort called
Dec 20 14:09:51 vsa33 kernel: scsi host6: SRP reset_device called
Dec 20 14:09:51 vsa33 kernel: scsi host6: ib_srp: SRP reset_host called
Dec 20 14:09:52 vsa33 kernel: scsi host6: ib_srp: Got failed path rec status -110
Dec 20 14:09:52 vsa33 kernel: scsi host6: ib_srp: Path record query failed
Dec 20 14:09:52 vsa33 kernel: scsi host6: ib_srp: reconnect failed (-110), removing target port.
Dec 20 14:09:52 vsa33 kernel: sd 6:0:0:0: Device offlined - not ready after error recovery
Dec 20 14:10:53 vsa33 kernel: scsi host6: SRP abort called
Dec 20 14:11:03 vsa33 kernel: scsi host6: SRP abort called
Dec 20 14:11:03 vsa33 kernel: scsi host6: SRP reset_device called
Dec 20 14:11:03 vsa33 kernel: scsi host6: ib_srp: SRP reset_host called
Dec 20 14:11:04 vsa33 kernel: scsi host6: ib_srp: Got failed path rec status -110
Dec 20 14:11:04 vsa33 kernel: scsi host6: ib_srp: Path record query failed
Dec 20 14:11:04 vsa33 kernel: scsi host6: ib_srp: reconnect failed (-110), removing target port.
Dec 20 14:11:04 vsa33 kernel: sd 6:0:0:0: Device offlined - not ready after error recovery
Dec 20 14:12:05 vsa33 kernel: scsi host6: SRP abort called
Dec 20 14:12:15 vsa33 kernel: scsi host6: SRP abort called
Dec 20 14:12:15 vsa33 kernel: scsi host6: SRP reset_device called
Dec 20 14:12:15 vsa33 kernel: scsi host6: ib_srp: SRP reset_host called
Dec 20 14:12:15 vsa33 kernel: scsi host6: ib_srp: reconnect succeeded
Dec 20 14:12:25 vsa33 kernel: sd 6:0:0:1: [sdc] Synchronizing SCSI cache
Dec 20 14:12:25 vsa33 kernel: sd 6:0:0:2: [sdd] Synchronizing SCSI cache
Dec 20 14:12:25 vsa33 kernel: sd 6:0:0:3: [sde] Synchronizing SCSI cache



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 0/3] IB/SRP patches for kernel 3.8
       [not found]     ` <50D306DE.7080505-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2012-12-20 13:19       ` Bart Van Assche
       [not found]         ` <50D3105C.10005-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2012-12-20 13:19 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, David Dillow

On 12/20/12 13:38, Or Gerlitz wrote:
> I think few days ago you had a patch on your tree named "Save and
> restore host_scribble during error handling", is it possible we need
> this here for happy removal of the scsi host?

No. Host removal works fine even without that patch. That's because 
srp_abort() has been modified such that it finishes a request whether or 
not sending a task management function to the target succeeded. As you 
can see in scsi_eh_abort_cmnds() if the eh_abort_handler callback 
function returns SUCCESS then scsi_eh_test_devices() will be passed an 
empty work_q list. Hence the SCSI error handler function 
scsi_eh_prep_cmnd() does not get invoked and hence does not have a 
chance of overwriting the host_scribble field.

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 0/3] IB/SRP patches for kernel 3.8
       [not found]         ` <50D3105C.10005-HInyCGIudOg@public.gmane.org>
@ 2012-12-20 14:56           ` Or Gerlitz
  0 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2012-12-20 14:56 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, David Dillow

On 20/12/2012 15:19, Bart Van Assche wrote:
> On 12/20/12 13:38, Or Gerlitz wrote:
>> I think few days ago you had a patch on your tree named "Save and
>> restore host_scribble during error handling", is it possible we need
>> this here for happy removal of the scsi host?
>
> No. Host removal works fine even without that patch. 

not on my setup, how do you suggest to proceed here?

Or.

> That's because srp_abort() has been modified such that it finishes a 
> request whether or not sending a task management function to the 
> target succeeded. As you can see in scsi_eh_abort_cmnds() if the 
> eh_abort_handler callback function returns SUCCESS then 
> scsi_eh_test_devices() will be passed an empty work_q list. Hence the 
> SCSI error handler function scsi_eh_prep_cmnd() does not get invoked 
> and hence does not have a chance of overwriting the host_scribble field.
>

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 1/3] IB/srp: Track connection state properly
       [not found]             ` <50D2C8C4.3030308-HInyCGIudOg@public.gmane.org>
@ 2012-12-20 15:10               ` David Dillow
       [not found]                 ` <1356016241.2507.1.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: David Dillow @ 2012-12-20 15:10 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Thu, 2012-12-20 at 09:13 +0100, Bart Van Assche wrote:
> On 12/19/12 19:04, David Dillow wrote:
> > On Wed, 2012-12-19 at 15:21 +0100, Bart Van Assche wrote:
> >> The connection state must be initialized before srp_connect_target()
> >> is invoked. Drop the assignment in srp_add_target() since it occurs
> >> after srp_connect_target() and since scsi_host_alloc()
> >> zero-initializes the Scsi_Host structure anyway.
> >
> > The commit message is still wrong. The issue is that
> > srp_connect_target() has marked us as active, and this patch removes an
> > assignment that reverses that.
> >
> > This shouldn't go upstream without the fixed commit message.
> 
> Hi Dave,
> 
> How about replacing the commit message by the following text:
> 
> "Remove an assignment that incorrectly overwrites the connection state 
> update by srp_connect_target()."

Works for me, ship it!

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 1/3] IB/srp: Track connection state properly
       [not found]                 ` <1356016241.2507.1.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
@ 2012-12-20 15:24                   ` Bart Van Assche
       [not found]                     ` <50D32DC7.9000007-HInyCGIudOg@public.gmane.org>
  2012-12-20 15:27                   ` [PATCH v3 " Or Gerlitz
  1 sibling, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2012-12-20 15:24 UTC (permalink / raw)
  To: David Dillow; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On 12/20/12 16:10, David Dillow wrote:
> On Thu, 2012-12-20 at 09:13 +0100, Bart Van Assche wrote:
>> On 12/19/12 19:04, David Dillow wrote:
>>> On Wed, 2012-12-19 at 15:21 +0100, Bart Van Assche wrote:
>>>> The connection state must be initialized before srp_connect_target()
>>>> is invoked. Drop the assignment in srp_add_target() since it occurs
>>>> after srp_connect_target() and since scsi_host_alloc()
>>>> zero-initializes the Scsi_Host structure anyway.
>>>
>>> The commit message is still wrong. The issue is that
>>> srp_connect_target() has marked us as active, and this patch removes an
>>> assignment that reverses that.
>>>
>>> This shouldn't go upstream without the fixed commit message.
>>
>> Hi Dave,
>>
>> How about replacing the commit message by the following text:
>>
>> "Remove an assignment that incorrectly overwrites the connection state
>> update by srp_connect_target()."
>
> Works for me, ship it!

Thanks ! I will post a fourth version of this patch with the above 
commit message to make it easier for Roland.

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v4 1/3] IB/srp: Track connection state properly
       [not found]                     ` <50D32DC7.9000007-HInyCGIudOg@public.gmane.org>
@ 2012-12-20 15:26                       ` Bart Van Assche
  0 siblings, 0 replies; 17+ messages in thread
From: Bart Van Assche @ 2012-12-20 15:26 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
  Cc: David Dillow, Roland Dreier

Remove an assignment that incorrectly overwrites the connection
state update by srp_connect_target().

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Acked-by: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index d5088ce..94f76b9 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1972,7 +1972,6 @@ static int srp_add_target(struct srp_host *host, struct srp_target_port *target)
 	spin_unlock(&host->target_lock);
 
 	target->state = SRP_TARGET_LIVE;
-	target->connected = false;
 
 	scsi_scan_target(&target->scsi_host->shost_gendev,
 			 0, target->scsi_id, SCAN_WILD_CARD, 0);
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 1/3] IB/srp: Track connection state properly
       [not found]                 ` <1356016241.2507.1.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
  2012-12-20 15:24                   ` Bart Van Assche
@ 2012-12-20 15:27                   ` Or Gerlitz
       [not found]                     ` <50D32E7A.4070704-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  1 sibling, 1 reply; 17+ messages in thread
From: Or Gerlitz @ 2012-12-20 15:27 UTC (permalink / raw)
  To: David Dillow
  Cc: Bart Van Assche,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On 20/12/2012 17:10, David Dillow wrote:
> Works for me, ship it!

Dave, did you gave the patches a try?

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 1/3] IB/srp: Track connection state properly
       [not found]                     ` <50D32E7A.4070704-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2012-12-20 16:27                       ` David Dillow
       [not found]                         ` <1356020823.8322.0.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: David Dillow @ 2012-12-20 16:27 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Bart Van Assche,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Thu, 2012-12-20 at 17:27 +0200, Or Gerlitz wrote:
> On 20/12/2012 17:10, David Dillow wrote:
> > Works for me, ship it!
> 
> Dave, did you gave the patches a try?

This one looks to be an obvious fix, but I have not tested the others; I
have been relying on Bart, Alex, and yourself for the validation.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 1/3] IB/srp: Track connection state properly
       [not found]                         ` <1356020823.8322.0.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
@ 2012-12-20 20:06                           ` Or Gerlitz
  0 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2012-12-20 20:06 UTC (permalink / raw)
  To: David Dillow
  Cc: Or Gerlitz, Bart Van Assche,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Thu, Dec 20, 2012 at 6:27 PM, David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org> wrote:
> On Thu, 2012-12-20 at 17:27 +0200, Or Gerlitz wrote:

>> Dave, did you gave the patches a try?

> This one looks to be an obvious fix, but I have not tested the others; I
> have been relying on Bart, Alex, and yourself for the validation.

Taking into account that we've been going back and forth w.r.t that
validation -- that is initially things weren't working on our setups,
later with picking some patches from Bart's tree, that host removal
went OK, and now with the patches posted here, it doesn't go fine on
my setup  -- all in all, sounds like a 2nd opinion which is strengthen
by a test is needed here. You only need to boot with Roland's for-next
+ Bart's patches, connect to a target, take down the IB link and issue
dd. No multipath scheme is needed for the very basic validation, can
you?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2012-12-20 20:06 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-12-19 14:20 [PATCH v3 0/3] IB/SRP patches for kernel 3.8 Bart Van Assche
     [not found] ` <50D1CD49.5080607-HInyCGIudOg@public.gmane.org>
2012-12-19 14:21   ` [PATCH v3 1/3] IB/srp: Track connection state properly Bart Van Assche
     [not found]     ` <50D1CD7D.6030905-HInyCGIudOg@public.gmane.org>
2012-12-19 18:04       ` David Dillow
     [not found]         ` <1355940274.23687.2.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
2012-12-20  8:13           ` Bart Van Assche
     [not found]             ` <50D2C8C4.3030308-HInyCGIudOg@public.gmane.org>
2012-12-20 15:10               ` David Dillow
     [not found]                 ` <1356016241.2507.1.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
2012-12-20 15:24                   ` Bart Van Assche
     [not found]                     ` <50D32DC7.9000007-HInyCGIudOg@public.gmane.org>
2012-12-20 15:26                       ` [PATCH v4 " Bart Van Assche
2012-12-20 15:27                   ` [PATCH v3 " Or Gerlitz
     [not found]                     ` <50D32E7A.4070704-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2012-12-20 16:27                       ` David Dillow
     [not found]                         ` <1356020823.8322.0.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
2012-12-20 20:06                           ` Or Gerlitz
2012-12-19 14:22   ` [PATCH v3 2/3] IB/srp: Avoid sending a task management function needlessly Bart Van Assche
     [not found]     ` <50D1CDB4.9070906-HInyCGIudOg@public.gmane.org>
2012-12-19 18:05       ` David Dillow
2012-12-19 14:24   ` [PATCH v3 3/3] IB/srp: Avoid endless SCSI error handling loop Bart Van Assche
     [not found]     ` <50D1CE0B.2080902-HInyCGIudOg@public.gmane.org>
2012-12-19 18:07       ` David Dillow
2012-12-20 12:38   ` [PATCH v3 0/3] IB/SRP patches for kernel 3.8 Or Gerlitz
     [not found]     ` <50D306DE.7080505-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2012-12-20 13:19       ` Bart Van Assche
     [not found]         ` <50D3105C.10005-HInyCGIudOg@public.gmane.org>
2012-12-20 14:56           ` Or Gerlitz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).