* [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8
@ 2013-02-01 15:18 Bart Van Assche
[not found] ` <510BDCAA.204-HInyCGIudOg@public.gmane.org>
0 siblings, 1 reply; 23+ messages in thread
From: Bart Van Assche @ 2013-02-01 15:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, David Dillow,
Or Gerlitz, Vu Pham, Alex Turin
This patch series avoids that SCSI error handling triggers an endless
loop and also restores reporting of QP errors in the kernel log.
Changes between v3 and v2:
- As proposed by Dave, added a patch that prevents sending of a task
management function over a closed connection.
Changes between v2 and v1:
- Track connection state properly.
- Make srp_reset_host() reset requests even if reconnecting fails
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 23+ messages in thread[parent not found: <510BDCAA.204-HInyCGIudOg@public.gmane.org>]
* [PATCH for 3.8 v3, resend 1/3] IB/srp: Track connection state properly [not found] ` <510BDCAA.204-HInyCGIudOg@public.gmane.org> @ 2013-02-01 15:18 ` Bart Van Assche 2013-02-01 15:19 ` [PATCH for 3.8 v3, resend 2/3] IB/srp: Avoid sending a task management function needlessly Bart Van Assche ` (2 subsequent siblings) 3 siblings, 0 replies; 23+ messages in thread From: Bart Van Assche @ 2013-02-01 15:18 UTC (permalink / raw) Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, David Dillow, Or Gerlitz, Vu Pham, Alex Turin Remove an assignment that incorrectly overwrites the connection state update by srp_connect_target(). Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org> Acked-by: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org> --- drivers/infiniband/ulp/srp/ib_srp.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index d5088ce..94f76b9 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -1972,7 +1972,6 @@ static int srp_add_target(struct srp_host *host, struct srp_target_port *target) spin_unlock(&host->target_lock); target->state = SRP_TARGET_LIVE; - target->connected = false; scsi_scan_target(&target->scsi_host->shost_gendev, 0, target->scsi_id, SCAN_WILD_CARD, 0); -- 1.7.10.4 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH for 3.8 v3, resend 2/3] IB/srp: Avoid sending a task management function needlessly [not found] ` <510BDCAA.204-HInyCGIudOg@public.gmane.org> 2013-02-01 15:18 ` [PATCH for 3.8 v3, resend 1/3] IB/srp: Track connection state properly Bart Van Assche @ 2013-02-01 15:19 ` Bart Van Assche 2013-02-01 15:21 ` [PATCH for 3.8 v3, resend 3/3] IB/srp: Avoid endless SCSI error handling loop Bart Van Assche 2013-02-04 21:11 ` [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 Or Gerlitz 3 siblings, 0 replies; 23+ messages in thread From: Bart Van Assche @ 2013-02-01 15:19 UTC (permalink / raw) Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, David Dillow, Or Gerlitz, Vu Pham, Alex Turin, Roland Dreier Do not send a task management function if sending will fail anyway because either there is no RDMA/RC connection or the QP is in the error state. Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org> Acked-by: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org> --- drivers/infiniband/ulp/srp/ib_srp.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 94f76b9..2633258 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -1695,6 +1695,9 @@ static int srp_send_tsk_mgmt(struct srp_target_port *target, struct srp_iu *iu; struct srp_tsk_mgmt *tsk_mgmt; + if (!target->connected || target->qp_in_error) + return -1; + init_completion(&target->tsk_mgmt_done); spin_lock_irq(&target->lock); @@ -1754,8 +1757,6 @@ static int srp_reset_device(struct scsi_cmnd *scmnd) shost_printk(KERN_ERR, target->scsi_host, "SRP reset_device called\n"); - if (target->qp_in_error) - return FAILED; if (srp_send_tsk_mgmt(target, SRP_TAG_NO_REQ, scmnd->device->lun, SRP_TSK_LUN_RESET)) return FAILED; -- 1.7.10.4 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH for 3.8 v3, resend 3/3] IB/srp: Avoid endless SCSI error handling loop [not found] ` <510BDCAA.204-HInyCGIudOg@public.gmane.org> 2013-02-01 15:18 ` [PATCH for 3.8 v3, resend 1/3] IB/srp: Track connection state properly Bart Van Assche 2013-02-01 15:19 ` [PATCH for 3.8 v3, resend 2/3] IB/srp: Avoid sending a task management function needlessly Bart Van Assche @ 2013-02-01 15:21 ` Bart Van Assche 2013-02-04 21:11 ` [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 Or Gerlitz 3 siblings, 0 replies; 23+ messages in thread From: Bart Van Assche @ 2013-02-01 15:21 UTC (permalink / raw) Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, David Dillow, Or Gerlitz, Vu Pham, Alex Turin, Roland Dreier If a SCSI command times out it is passed to the SCSI error handler. The SCSI error handler will try to abort the commands that timed out. If aborting failed a device reset will be attempted. If the device reset also fails a host reset will be attempted. If the host reset also fails the whole procedure will be repeated. srp_abort() and srp_reset_device() fail for a QP in the error state. srp_reset_host() fails after host removal has started. Hence if the SCSI error handler gets invoked after host removal has started and with the QP in the error state an endless loop will be triggered. Modify the SCSI error handling functions in ib_srp as follows: - Abort SCSI commands properly even if the QP is in the error state. - Make srp_reset_host() reset SCSI requests even after host removal has already started or if reconnecting fails. Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org> Acked-by: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org> Cc: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> Cc: Alex Turin <alextu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> --- drivers/infiniband/ulp/srp/ib_srp.c | 29 +++++++++++++++-------------- 1 file changed, 15 insertions(+), 14 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 2633258..8a7eb9f 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -700,23 +700,24 @@ static int srp_reconnect_target(struct srp_target_port *target) struct Scsi_Host *shost = target->scsi_host; int i, ret; - if (target->state != SRP_TARGET_LIVE) - return -EAGAIN; - scsi_target_block(&shost->shost_gendev); srp_disconnect_target(target); /* - * Now get a new local CM ID so that we avoid confusing the - * target in case things are really fouled up. + * Now get a new local CM ID so that we avoid confusing the target in + * case things are really fouled up. Doing so also ensures that all CM + * callbacks will have finished before a new QP is allocated. */ ret = srp_new_cm_id(target); - if (ret) - goto unblock; - - ret = srp_create_target_ib(target); - if (ret) - goto unblock; + /* + * Whether or not creating a new CM ID succeeded, create a new + * QP. This guarantees that all completion callback function + * invocations have finished before request resetting starts. + */ + if (ret == 0) + ret = srp_create_target_ib(target); + else + srp_create_target_ib(target); for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) { struct srp_request *req = &target->req_ring[i]; @@ -728,9 +729,9 @@ static int srp_reconnect_target(struct srp_target_port *target) for (i = 0; i < SRP_SQ_SIZE; ++i) list_add(&target->tx_ring[i]->list, &target->free_tx); - ret = srp_connect_target(target); + if (ret == 0) + ret = srp_connect_target(target); -unblock: scsi_target_unblock(&shost->shost_gendev, ret == 0 ? SDEV_RUNNING : SDEV_TRANSPORT_OFFLINE); @@ -1739,7 +1740,7 @@ static int srp_abort(struct scsi_cmnd *scmnd) shost_printk(KERN_ERR, target->scsi_host, "SRP abort called\n"); - if (!req || target->qp_in_error || !srp_claim_req(target, req, scmnd)) + if (!req || !srp_claim_req(target, req, scmnd)) return FAILED; srp_send_tsk_mgmt(target, req->index, scmnd->device->lun, SRP_TSK_ABORT_TASK); -- 1.7.10.4 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 [not found] ` <510BDCAA.204-HInyCGIudOg@public.gmane.org> ` (2 preceding siblings ...) 2013-02-01 15:21 ` [PATCH for 3.8 v3, resend 3/3] IB/srp: Avoid endless SCSI error handling loop Bart Van Assche @ 2013-02-04 21:11 ` Or Gerlitz [not found] ` <CAJZOPZLKQV0QvrW5sK8hQJf7AZc+1nUzp+5YCkZ3iVU4oTWbLg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 3 siblings, 1 reply; 23+ messages in thread From: Or Gerlitz @ 2013-02-04 21:11 UTC (permalink / raw) To: Bart Van Assche, David Dillow, Roland Dreier Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Or Gerlitz, Vu Pham, Oren Duer On Fri, Feb 1, 2013 at 5:18 PM, Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org> wrote: > This patch series avoids that SCSI error handling triggers an endless loop > and also restores reporting of QP errors in the kernel log. Bart, You wrote "resend" in the subject line, anything new, in these patches OR in other patches merged through the SCSI tree for 3.8 vs. what you had posted earlier on Dec 20th, 2012 (http://marc.info/?t=135592692800001&r=1&w=2)?! as I wrote there, it didn't work for me. Dave, Roland, as I wrote here http://marc.info/?l=linux-rdma&m=135603401830703&w=2 I think what we need now is 2nd opinion, and I asked if Dave can give the patches a try, no more but no less... we need to know if it works Or. > Changes between v3 and v2: > - As proposed by Dave, added a patch that prevents sending of a task > management function over a closed connection. > > Changes between v2 and v1: > - Track connection state properly. > - Make srp_reset_host() reset requests even if reconnecting fails -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <CAJZOPZLKQV0QvrW5sK8hQJf7AZc+1nUzp+5YCkZ3iVU4oTWbLg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 [not found] ` <CAJZOPZLKQV0QvrW5sK8hQJf7AZc+1nUzp+5YCkZ3iVU4oTWbLg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2013-02-05 16:25 ` Bart Van Assche [not found] ` <5111327F.6050402-HInyCGIudOg@public.gmane.org> 0 siblings, 1 reply; 23+ messages in thread From: Bart Van Assche @ 2013-02-05 16:25 UTC (permalink / raw) To: Or Gerlitz Cc: David Dillow, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Or Gerlitz, Vu Pham, Oren Duer On 02/04/13 22:11, Or Gerlitz wrote: > On Fri, Feb 1, 2013 at 5:18 PM, Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org> wrote: >> This patch series avoids that SCSI error handling triggers an endless loop >> and also restores reporting of QP errors in the kernel log. > > Bart, > > You wrote "resend" in the subject line, anything new, in these patches > OR in other patches merged through the SCSI tree for 3.8 vs. what you > had posted earlier on Dec 20th, 2012. Hello Or, I have just reposted my SCSI device removal patch series. There are several changes in v8 of that patch series compared to v7, including: * Saving and restoring the host scribble field in the SCSI error handler. * A robustness improvement for when the SCSI timeout is below the maximum LLD response processing time. Version eight of the SCSI device removal patch series can be found here: http://github.com/bvanassche/linux/tree/device-removal-fixes. Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <5111327F.6050402-HInyCGIudOg@public.gmane.org>]
* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 [not found] ` <5111327F.6050402-HInyCGIudOg@public.gmane.org> @ 2013-02-05 20:54 ` Or Gerlitz [not found] ` <CAJZOPZ+-Zg=jnqg4ZmFL5Yo4_2DoWGcgy=3u6g3Rf9y80pXnpg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 23+ messages in thread From: Or Gerlitz @ 2013-02-05 20:54 UTC (permalink / raw) To: Bart Van Assche Cc: David Dillow, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Or Gerlitz, Vu Pham, Oren Duer On Tue, Feb 5, 2013 at 6:25 PM, Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org> wrote: > On 02/04/13 22:11, Or Gerlitz wrote: >> You wrote "resend" in the subject line, anything new, in these patches >> OR in other patches merged through the SCSI tree for 3.8 vs. what you >> had posted earlier on Dec 20th, 2012. > I have just reposted my SCSI device removal patch series. There are Bart, I'd like to sharpen the point: could you please clarify if the series posted to linux-rdma stands for itself in the sense that SRP HA scheme X (please state it) now works/better when the patches applied on top of the latest 3.8-rc cut? OR for X to do better/work, one needs this series AND the one you posted to linux-scsi. Or. > several changes in v8 of that patch series compared to v7, including: > * Saving and restoring the host scribble field in the SCSI error handler. > * A robustness improvement for when the SCSI timeout is below the > maximum LLD response processing time. > > Version eight of the SCSI device removal patch series can be found here: > http://github.com/bvanassche/linux/tree/device-removal-fixes. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <CAJZOPZ+-Zg=jnqg4ZmFL5Yo4_2DoWGcgy=3u6g3Rf9y80pXnpg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 [not found] ` <CAJZOPZ+-Zg=jnqg4ZmFL5Yo4_2DoWGcgy=3u6g3Rf9y80pXnpg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2013-02-06 7:22 ` Bart Van Assche [not found] ` <5112049B.8030406-HInyCGIudOg@public.gmane.org> 0 siblings, 1 reply; 23+ messages in thread From: Bart Van Assche @ 2013-02-06 7:22 UTC (permalink / raw) To: Or Gerlitz Cc: David Dillow, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Or Gerlitz, Vu Pham, Oren Duer On 02/05/13 21:54, Or Gerlitz wrote: > On Tue, Feb 5, 2013 at 6:25 PM, Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org> wrote: >> On 02/04/13 22:11, Or Gerlitz wrote: > Bart, I'd like to sharpen the point: could you please clarify if the > series posted to linux-rdma stands for itself in the sense that SRP HA > scheme X (please state it) now works/better when the patches applied > on top of the latest 3.8-rc cut? OR for X to do better/work, one needs > this series AND the one you posted to linux-scsi. Hello Or, A huge number of patches have been taken upstream between 3.8-rc1 and 3.8-rc6. I have retested these three patches with 3.8-rc6 and would appreciate if you would also repeat your tests. Thanks, Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <5112049B.8030406-HInyCGIudOg@public.gmane.org>]
* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 [not found] ` <5112049B.8030406-HInyCGIudOg@public.gmane.org> @ 2013-02-06 7:44 ` Or Gerlitz [not found] ` <511209E5.1010807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2013-02-06 21:42 ` Vu Pham 1 sibling, 1 reply; 23+ messages in thread From: Or Gerlitz @ 2013-02-06 7:44 UTC (permalink / raw) To: Bart Van Assche Cc: Or Gerlitz, David Dillow, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Vu Pham, Oren Duer, Sagi Grimberg On 06/02/2013 09:22, Bart Van Assche wrote: > > A huge number of patches have been taken upstream between 3.8-rc1 and > 3.8-rc6. I have retested these three patches with 3.8-rc6 and would > appreciate if you would also repeat your tests. not really... this is what I see on Linus tree for the relevant directories, anywhere else I need to look linux-2.6]# git log --oneline v3.8-rc1..v3.8-rc6 drivers/scsi/ drivers/block/ block/drivers/infiniband/ulp/srp bdb0ae6 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 83e6818 efi: Make 'efi_enabled' a function to query EFI facilities 2263647 Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux 8d85fce Drivers: block: remove __dev* attributes. 6f03979 Drivers: scsi: remove __dev* attributes. f4953fe virtio-blk: Don't free ida when disk is in use -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <511209E5.1010807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 [not found] ` <511209E5.1010807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> @ 2013-02-06 7:59 ` Bart Van Assche [not found] ` <51120D4F.2070102-HInyCGIudOg@public.gmane.org> 0 siblings, 1 reply; 23+ messages in thread From: Bart Van Assche @ 2013-02-06 7:59 UTC (permalink / raw) To: Or Gerlitz Cc: Or Gerlitz, David Dillow, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Vu Pham, Oren Duer, Sagi Grimberg On 02/06/13 08:44, Or Gerlitz wrote: > On 06/02/2013 09:22, Bart Van Assche wrote: >> >> A huge number of patches have been taken upstream between 3.8-rc1 and >> 3.8-rc6. I have retested these three patches with 3.8-rc6 and would >> appreciate if you would also repeat your tests. > > not really... this is what I see on Linus tree for the relevant > directories, anywhere else I need to look > > linux-2.6]# git log --oneline v3.8-rc1..v3.8-rc6 drivers/scsi/ > drivers/block/ block/drivers/infiniband/ulp/srp > bdb0ae6 Merge branch 'x86-urgent-for-linus' of > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip > 83e6818 efi: Make 'efi_enabled' a function to query EFI facilities > 2263647 Merge tag 'fixes-for-linus' of > git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux > 8d85fce Drivers: block: remove __dev* attributes. > 6f03979 Drivers: scsi: remove __dev* attributes. > f4953fe virtio-blk: Don't free ida when disk is in use Hello Or, Nobody outside Mellanox has ever been able to reproduce the behavior reported by you. Something in your tests might have been specific to the Mellanox environment. Have you perhaps been running your tests with a firmware version that is not available to the general public ? I would appreciate it if you could check your test environment and repeat your tests. Thanks for your understanding, Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <51120D4F.2070102-HInyCGIudOg@public.gmane.org>]
* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 [not found] ` <51120D4F.2070102-HInyCGIudOg@public.gmane.org> @ 2013-02-06 8:25 ` Or Gerlitz 0 siblings, 0 replies; 23+ messages in thread From: Or Gerlitz @ 2013-02-06 8:25 UTC (permalink / raw) To: Bart Van Assche Cc: Or Gerlitz, David Dillow, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Vu Pham, Oren Duer, Sagi Grimberg On 06/02/2013 09:59, Bart Van Assche wrote: > On 02/06/13 08:44, Or Gerlitz wrote: >> On 06/02/2013 09:22, Bart Van Assche wrote: >>> >>> A huge number of patches have been taken upstream between 3.8-rc1 and >>> 3.8-rc6. I have retested these three patches with 3.8-rc6 and would >>> appreciate if you would also repeat your tests. >> >> not really... this is what I see on Linus tree for the relevant >> directories, anywhere else I need to look >> >> linux-2.6]# git log --oneline v3.8-rc1..v3.8-rc6 drivers/scsi/ >> drivers/block/ block/drivers/infiniband/ulp/srp >> bdb0ae6 Merge branch 'x86-urgent-for-linus' of >> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip >> 83e6818 efi: Make 'efi_enabled' a function to query EFI facilities >> 2263647 Merge tag 'fixes-for-linus' of >> git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux >> 8d85fce Drivers: block: remove __dev* attributes. >> 6f03979 Drivers: scsi: remove __dev* attributes. >> f4953fe virtio-blk: Don't free ida when disk is in use > > Nobody outside Mellanox has ever been able to reproduce the behavior > reported by you. I have asked for 2nd opinion so we can get a quorum either way. > Something in your tests might have been specific to the Mellanox > environment. Have you perhaps been running your tests with a firmware > version that is not available to the general public ? NO > I would appreciate it if you could check your test environment and > repeat your tests. We will repeat the tests, indeed. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 [not found] ` <5112049B.8030406-HInyCGIudOg@public.gmane.org> 2013-02-06 7:44 ` Or Gerlitz @ 2013-02-06 21:42 ` Vu Pham [not found] ` <5112CE60.2030607-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 1 sibling, 1 reply; 23+ messages in thread From: Vu Pham @ 2013-02-06 21:42 UTC (permalink / raw) To: Bart Van Assche Cc: Or Gerlitz, David Dillow, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Or Gerlitz, Oren Duer, Sagi Grimberg [-- Attachment #1: Type: text/plain, Size: 3961 bytes --] Bart Van Assche wrote: > On 02/05/13 21:54, Or Gerlitz wrote: >> On Tue, Feb 5, 2013 at 6:25 PM, Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org> >> wrote: >>> On 02/04/13 22:11, Or Gerlitz wrote: >> Bart, I'd like to sharpen the point: could you please clarify if the >> series posted to linux-rdma stands for itself in the sense that SRP HA >> scheme X (please state it) now works/better when the patches applied >> on top of the latest 3.8-rc cut? OR for X to do better/work, one needs >> this series AND the one you posted to linux-scsi. > > Hello Or, > > A huge number of patches have been taken upstream between 3.8-rc1 and > 3.8-rc6. I have retested these three patches with 3.8-rc6 and would > appreciate if you would also repeat your tests. > > Thanks, > > Bart. Hello Bart, I tested your 3.8 v3 patchset. I did the following: - clone & checkout Roland's ib tree for-next branch - applied Bart's 3.8 v3 patchset - applied "save & restore host_scribble during error handling" patch - http://www.mail-archive.com/linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org/msg17809.html I have two paths to target thru port 1 & 2 (scsi_host host9 & host10) - run I/Os - disable port 1 @ 19:11:30 - error recovery for host9 kick in @ 19:12:04 - multipath remove the path, I/Os fail-over @ 19:12:51 - error recovery was still going on with host9 (sysfs entry for host9 still intact) - enable port 1 @19:15:00 - host9 reconnect to target thru error recovery, multipathd module re-instate the path in kernel; and then host9 is REMOVED, usermode "multipath -l" did not show re-instate path thru host9 Feb 6 19:15:04 vsa30 kernel: scsi host9: SRP abort called Feb 6 19:15:05 vsa30 multipathd: overflow in attribute '/sys/devices/pci0000:00/0000:00:02.0/0000:02:00.0/host9/target9:0:0/9:0:0:2/state' Feb 6 19:15:14 vsa30 kernel: scsi host9: SRP abort called Feb 6 19:15:14 vsa30 kernel: scsi host9: SRP reset_device called Feb 6 19:15:14 vsa30 kernel: scsi host9: ib_srp: SRP reset_host called Feb 6 19:15:14 vsa30 kernel: scsi host9: ib_srp: reconnect succeeded Feb 6 19:15:26 vsa30 multipathd: 3600144f0665c4400000050a522180003: sdd - tur checker reports path is up Feb 6 19:15:26 vsa30 multipathd: 8:48: reinstated Feb 6 19:15:26 vsa30 multipathd: 3600144f0665c4400000050a522180003: remaining active paths: 2 Feb 6 19:15:26 vsa30 multipathd: 3600144f0665c4400000050a522180002: sdc - tur checker reports path is up Feb 6 19:15:26 vsa30 multipathd: 8:32: reinstated Feb 6 19:15:26 vsa30 multipathd: 3600144f0665c4400000050a522180002: remaining active paths: 2 Feb 6 19:15:26 vsa30 multipathd: sdc: remove path (uevent) Feb 6 19:15:26 vsa30 multipathd: 3600144f0665c4400000050a522180002: load table [0 409600 multipath 0 0 1 1 round-robin 0 1 1 8:80 1] Feb 6 19:15:26 vsa30 multipathd: sdc: path removed from map 3600144f0665c4400000050a522180002 Feb 6 19:15:26 vsa30 kernel: sd 9:0:0:1: [sdc] Synchronizing SCSI cache Feb 6 19:15:26 vsa30 multipathd: sdd: remove path (uevent) Feb 6 19:15:26 vsa30 multipathd: 3600144f0665c4400000050a522180003: load table [0 409600 multipath 0 0 1 1 round-robin 0 1 1 8:96 1] Feb 6 19:15:26 vsa30 multipathd: sdd: path removed from map 3600144f0665c4400000050a522180003 Feb 6 19:15:26 vsa30 kernel: sd 9:0:0:2: [sdd] Synchronizing SCSI cache - disable port 2 @19:22:50 - error recovery kicked in on host10 @ 19:23:40 - I/Os failed with NO path to target @ 19:24:27 - without enabling port 2, error recovery was still going on host10 still 19:57:52 and stop. - host10 was still in sysfs /sys/class/scsi_host/host10 & taking reference on ib_srp module - enable port 2 - nothing happened. Conclusion: 1. disable the port/path long enough >35 minutes, we have dangling scsi host. 2. enable the port within 30 minute, scsi host re-establish connection, path re-instate and then scsi_host was removed (no entry in sysfs) I attached a log here to show what happened above. thanks, -vu [-- Attachment #2: messages.bz2 --] [-- Type: application/octet-stream, Size: 10661 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <5112CE60.2030607-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 [not found] ` <5112CE60.2030607-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> @ 2013-02-07 9:05 ` Bart Van Assche [not found] ` <51136E74.9090209-HInyCGIudOg@public.gmane.org> 0 siblings, 1 reply; 23+ messages in thread From: Bart Van Assche @ 2013-02-07 9:05 UTC (permalink / raw) To: Vu Pham Cc: Or Gerlitz, David Dillow, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Or Gerlitz, Oren Duer, Sagi Grimberg On 02/06/13 22:42, Vu Pham wrote: > Conclusion: > 1. disable the port/path long enough >35 minutes, we have dangling scsi > host. > 2. enable the port within 30 minute, scsi host re-establish connection, > path re-instate and then scsi_host was removed (no entry in sysfs) > > I attached a log here to show what happened above. Hello Vu, I found the following in the attached logs: [ ... ] Feb 6 19:24:25 vsa30 kernel: scsi host10: ib_srp: SRP reset_host called [ ... ] Feb 6 19:25:28 vsa30 kernel: scsi host10: SRP abort called [ ... ] It is easy to see in patch 3/3 that srp_reset_host() invokes srp_reconnect_target() unconditionally and that that last function kills all outstanding requests via srp_reset_req(). So to me the above output means that the attached logs were generated by a kernel missing at least patch 3/3. This means that the above conclusions are invalid. I think it is also worth mentioning here that I asked Mellanox two months ago via private e-mail to provide me access to a setup on which this issue can be reproduced and on which I can recompile the kernel myself. However, such access was never provided. Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <51136E74.9090209-HInyCGIudOg@public.gmane.org>]
* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 [not found] ` <51136E74.9090209-HInyCGIudOg@public.gmane.org> @ 2013-02-07 9:41 ` Or Gerlitz [not found] ` <511376C2.6050100-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2013-02-07 18:20 ` Vu Pham 1 sibling, 1 reply; 23+ messages in thread From: Or Gerlitz @ 2013-02-07 9:41 UTC (permalink / raw) To: Bart Van Assche Cc: Vu Pham, Or Gerlitz, David Dillow, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Oren Duer, Sagi Grimberg On 07/02/2013 11:05, Bart Van Assche wrote: > On 02/06/13 22:42, Vu Pham wrote: >> Conclusion: >> 1. disable the port/path long enough >35 minutes, we have dangling scsi >> host. >> 2. enable the port within 30 minute, scsi host re-establish connection, >> path re-instate and then scsi_host was removed (no entry in sysfs) >> >> I attached a log here to show what happened above. > > Hello Vu, > > I found the following in the attached logs: > > [ ... ] > Feb 6 19:24:25 vsa30 kernel: scsi host10: ib_srp: SRP reset_host called > [ ... ] > Feb 6 19:25:28 vsa30 kernel: scsi host10: SRP abort called > [ ... ] > > It is easy to see in patch 3/3 that srp_reset_host() invokes > srp_reconnect_target() unconditionally and that that last function > kills all outstanding requests via srp_reset_req(). So to me the above > output means that the attached logs were generated by a kernel missing > at least patch 3/3. This means that the above conclusions are invalid. Hi Bart, Please calm down, we truly think your work is great step in the right direction. We will double check the environment and the logs provided to you. On the half side of the glass, I think Vu also saw things that work better with these patches (BTW - if the fourth patch that Vu used "save & restore host_scribble during error handling" is also needed, maybe you add it to this series, so they are reviewed/accepted together). We will set with you online session in the coming 2-3 working days to build the kernel and conduct the test together. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <511376C2.6050100-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 [not found] ` <511376C2.6050100-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> @ 2013-02-07 10:15 ` Bart Van Assche 0 siblings, 0 replies; 23+ messages in thread From: Bart Van Assche @ 2013-02-07 10:15 UTC (permalink / raw) To: Or Gerlitz Cc: Vu Pham, Or Gerlitz, David Dillow, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Oren Duer, Sagi Grimberg On 02/07/13 10:41, Or Gerlitz wrote: > (BTW - if the fourth patch that Vu used "save > & restore host_scribble during error handling" is also needed, maybe you > add it to this series, so they are reviewed/accepted together). Hello Or, The three patches I posted guarantee timely host removal even without the host_scribble patch. Even if srp_abort() would not have aborted a request because the host_scribble field had been overwritten, srp_reset_host() will be invoked later on and will finish such a request properly. Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 [not found] ` <51136E74.9090209-HInyCGIudOg@public.gmane.org> 2013-02-07 9:41 ` Or Gerlitz @ 2013-02-07 18:20 ` Vu Pham [not found] ` <5113F056.4020501-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 1 sibling, 1 reply; 23+ messages in thread From: Vu Pham @ 2013-02-07 18:20 UTC (permalink / raw) To: Bart Van Assche Cc: Or Gerlitz, David Dillow, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Or Gerlitz, Oren Duer, Sagi Grimberg Bart Van Assche wrote: > On 02/06/13 22:42, Vu Pham wrote: >> Conclusion: >> 1. disable the port/path long enough >35 minutes, we have dangling scsi >> host. >> 2. enable the port within 30 minute, scsi host re-establish connection, >> path re-instate and then scsi_host was removed (no entry in sysfs) >> >> I attached a log here to show what happened above. > > Hello Vu, > > I found the following in the attached logs: > > [ ... ] > Feb 6 19:24:25 vsa30 kernel: scsi host10: ib_srp: SRP reset_host called > [ ... ] > Feb 6 19:25:28 vsa30 kernel: scsi host10: SRP abort called > [ ... ] > > It is easy to see in patch 3/3 that srp_reset_host() invokes > srp_reconnect_target() unconditionally and that that last function > kills all outstanding requests via srp_reset_req(). So to me the above > output means that the attached logs were generated by a kernel missing > at least patch 3/3. This means that the above conclusions are invalid. > > I think it is also worth mentioning here that I asked Mellanox two > months ago via private e-mail to provide me access to a setup on which > this issue can be reproduced and on which I can recompile the kernel > myself. However, such access was never provided. > > Bart. Hello Bart, srp_reconnect_target() kill all outstanding requests, fail to reconnect (port offline), queued to remove scsi_host --> srp_reset_host() return FAILED. While scsi host has not been removed, multipath periodically still send TUR commands to check liveness of this path. Current srp_queuecommand() still process these TUR commands; therefore, the next SRP aborts are for those tur checker commands You can see these tur checker errors from multipath in the log. Sagi from Mellanox will work with you on webex. thanks, -vu -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <5113F056.4020501-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* [PATCH] IB/srp: Fail I/O requests if the transport is offline [not found] ` <5113F056.4020501-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> @ 2013-02-15 9:39 ` Bart Van Assche [not found] ` <511E024E.70002-HInyCGIudOg@public.gmane.org> 0 siblings, 1 reply; 23+ messages in thread From: Bart Van Assche @ 2013-02-15 9:39 UTC (permalink / raw) To: Vu Pham Cc: Or Gerlitz, David Dillow, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Or Gerlitz, Sagi Grimberg If an SRP target is no longer reachable and srp_reset_host() fails to reconnect then ib_srp will invoke scsi_remove_host(). That function will invoke __scsi_remove_device() for each LUN. And that last function will change the device state from SDEV_TRANSPORT_OFFLINE into SDEV_CANCEL. Certain user space software, e.g. older versions of multipathd, continue queueing I/O to SCSI devices that are in the SDEV_CANCEL state. If these I/O requests are submitted as SG_IO that means that the REQ_PREEMPT flag will be set and hence that these requests will be passed to srp_queuecommand(). These requests will time out. If new requests are queued fast enough from user space these active requests will prevent __scsi_remove_device() to finish. Avoid this by failing I/O requests in the SDEV_CANCEL state if the transport is offline. Introduce a new variable to keep track of the transport state instead of failing requests if (!target->connected || target->qp_in_error) such that the SCSI error handler has a chance to retry commands after a transport layer failure occurred. Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org> Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org> Cc: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> --- drivers/infiniband/ulp/srp/ib_srp.c | 7 +++++++ drivers/infiniband/ulp/srp/ib_srp.h | 1 + 2 files changed, 8 insertions(+) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 8a7eb9f..b34752d 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -734,6 +734,7 @@ static int srp_reconnect_target(struct srp_target_port *target) scsi_target_unblock(&shost->shost_gendev, ret == 0 ? SDEV_RUNNING : SDEV_TRANSPORT_OFFLINE); + target->transport_offline = ret != 0; if (ret) goto err; @@ -1353,6 +1354,12 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd) unsigned long flags; int len; + if (unlikely(target->transport_offline)) { + scmnd->result = DID_NO_CONNECT << 16; + scmnd->scsi_done(scmnd); + return 0; + } + spin_lock_irqsave(&target->lock, flags); iu = __srp_get_tx_iu(target, SRP_IU_CMD); if (!iu) diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h index de2d0b3..66fbedd 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.h +++ b/drivers/infiniband/ulp/srp/ib_srp.h @@ -140,6 +140,7 @@ struct srp_target_port { unsigned int cmd_sg_cnt; unsigned int indirect_size; bool allow_ext_sg; + bool transport_offline; /* Everything above this point is used in the hot path of * command processing. Try to keep them packed into cachelines. -- 1.7.10.4 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 23+ messages in thread
[parent not found: <511E024E.70002-HInyCGIudOg@public.gmane.org>]
* Re: [PATCH] IB/srp: Fail I/O requests if the transport is offline [not found] ` <511E024E.70002-HInyCGIudOg@public.gmane.org> @ 2013-02-18 4:06 ` David Dillow [not found] ` <1361160385.7415.2.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org> 0 siblings, 1 reply; 23+ messages in thread From: David Dillow @ 2013-02-18 4:06 UTC (permalink / raw) To: Bart Van Assche Cc: Vu Pham, Or Gerlitz, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Or Gerlitz, Sagi Grimberg On Fri, 2013-02-15 at 10:39 +0100, Bart Van Assche wrote: > diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c > index 8a7eb9f..b34752d 100644 > --- a/drivers/infiniband/ulp/srp/ib_srp.c > +++ b/drivers/infiniband/ulp/srp/ib_srp.c > @@ -734,6 +734,7 @@ static int srp_reconnect_target(struct srp_target_port *target) > > scsi_target_unblock(&shost->shost_gendev, ret == 0 ? SDEV_RUNNING : > SDEV_TRANSPORT_OFFLINE); > + target->transport_offline = ret != 0; Minor nit, that line is hard to read; I keep thinking it needs parens around the conditional... Perhaps target->transport_offline = !!ret; or target->transport_offline = ret; gcc should do the right conversion since we're assigning to a bool. Or, Vu, does this solve the issue you've seen? I may have time to test later this week, but not before. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <1361160385.7415.2.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>]
* Re: [PATCH] IB/srp: Fail I/O requests if the transport is offline [not found] ` <1361160385.7415.2.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org> @ 2013-02-18 8:11 ` Sagi Grimberg [not found] ` <5121E217.3080003-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2013-02-21 16:10 ` Bart Van Assche 1 sibling, 1 reply; 23+ messages in thread From: Sagi Grimberg @ 2013-02-18 8:11 UTC (permalink / raw) To: David Dillow Cc: Bart Van Assche, Vu Pham, Or Gerlitz, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Or Gerlitz On 2/18/2013 6:06 AM, David Dillow wrote: > On Fri, 2013-02-15 at 10:39 +0100, Bart Van Assche wrote: >> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c >> index 8a7eb9f..b34752d 100644 >> --- a/drivers/infiniband/ulp/srp/ib_srp.c >> +++ b/drivers/infiniband/ulp/srp/ib_srp.c >> @@ -734,6 +734,7 @@ static int srp_reconnect_target(struct srp_target_port *target) >> >> scsi_target_unblock(&shost->shost_gendev, ret == 0 ? SDEV_RUNNING : >> SDEV_TRANSPORT_OFFLINE); >> + target->transport_offline = ret != 0; > Minor nit, that line is hard to read; I keep thinking it needs parens > around the conditional... > > Perhaps > target->transport_offline = !!ret; > or > target->transport_offline = ret; > > gcc should do the right conversion since we're assigning to a bool. > > > Or, Vu, does this solve the issue you've seen? I may have time to test > later this week, but not before. > Hey David, This indeed solve scsi_host removal issues. Vu is on vacation, I'll perform some more failover tests... -Sagi -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <5121E217.3080003-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH] IB/srp: Fail I/O requests if the transport is offline [not found] ` <5121E217.3080003-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> @ 2013-02-24 8:09 ` Bart Van Assche [not found] ` <5129CAB6.5030506-HInyCGIudOg@public.gmane.org> 0 siblings, 1 reply; 23+ messages in thread From: Bart Van Assche @ 2013-02-24 8:09 UTC (permalink / raw) To: Sagi Grimberg Cc: David Dillow, Vu Pham, Or Gerlitz, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Or Gerlitz On 02/18/13 09:11, Sagi Grimberg wrote: > On 2/18/2013 6:06 AM, David Dillow wrote: >> On Fri, 2013-02-15 at 10:39 +0100, Bart Van Assche wrote: >>> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c >>> b/drivers/infiniband/ulp/srp/ib_srp.c >>> index 8a7eb9f..b34752d 100644 >>> --- a/drivers/infiniband/ulp/srp/ib_srp.c >>> +++ b/drivers/infiniband/ulp/srp/ib_srp.c >>> @@ -734,6 +734,7 @@ static int srp_reconnect_target(struct >>> srp_target_port *target) >>> scsi_target_unblock(&shost->shost_gendev, ret == 0 ? >>> SDEV_RUNNING : >>> SDEV_TRANSPORT_OFFLINE); >>> + target->transport_offline = ret != 0; >> Minor nit, that line is hard to read; I keep thinking it needs parens >> around the conditional... >> >> Perhaps >> target->transport_offline = !!ret; >> or >> target->transport_offline = ret; >> >> gcc should do the right conversion since we're assigning to a bool. >> >> >> Or, Vu, does this solve the issue you've seen? I may have time to test >> later this week, but not before. > > This indeed solve scsi_host removal issues. > Vu is on vacation, I'll perform some more failover tests... Hello Sagi, Since no further feedback was posted on the list I assume that means that all tests passed ? Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <5129CAB6.5030506-HInyCGIudOg@public.gmane.org>]
* Re: [PATCH] IB/srp: Fail I/O requests if the transport is offline [not found] ` <5129CAB6.5030506-HInyCGIudOg@public.gmane.org> @ 2013-02-24 8:59 ` Sagi Grimberg [not found] ` <5129D665.3070206-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 0 siblings, 1 reply; 23+ messages in thread From: Sagi Grimberg @ 2013-02-24 8:59 UTC (permalink / raw) To: Bart Van Assche Cc: David Dillow, Vu Pham, Or Gerlitz, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Or Gerlitz On 2/24/2013 10:09 AM, Bart Van Assche wrote: > On 02/18/13 09:11, Sagi Grimberg wrote: >> On 2/18/2013 6:06 AM, David Dillow wrote: >>> On Fri, 2013-02-15 at 10:39 +0100, Bart Van Assche wrote: >>>> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c >>>> b/drivers/infiniband/ulp/srp/ib_srp.c >>>> index 8a7eb9f..b34752d 100644 >>>> --- a/drivers/infiniband/ulp/srp/ib_srp.c >>>> +++ b/drivers/infiniband/ulp/srp/ib_srp.c >>>> @@ -734,6 +734,7 @@ static int srp_reconnect_target(struct >>>> srp_target_port *target) >>>> scsi_target_unblock(&shost->shost_gendev, ret == 0 ? >>>> SDEV_RUNNING : >>>> SDEV_TRANSPORT_OFFLINE); >>>> + target->transport_offline = ret != 0; >>> Minor nit, that line is hard to read; I keep thinking it needs parens >>> around the conditional... >>> >>> Perhaps >>> target->transport_offline = !!ret; >>> or >>> target->transport_offline = ret; >>> >>> gcc should do the right conversion since we're assigning to a bool. >>> >>> >>> Or, Vu, does this solve the issue you've seen? I may have time to test >>> later this week, but not before. >> >> This indeed solve scsi_host removal issues. >> Vu is on vacation, I'll perform some more failover tests... > > Hello Sagi, > > Since no further feedback was posted on the list I assume that means > that all tests passed ? > > Bart. > Hey Bart, Sorry for the delay I was just about to reply... From my end, the related patchset seems solve the scsi_host removal issue and prevents the SCSI error handling loop. Generally our tests passed, I still have some issue with long-term failover test but I'm not sure its SRP (perhaps might origin in IB layer). So ack from me... -Sagi -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <5129D665.3070206-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH] IB/srp: Fail I/O requests if the transport is offline [not found] ` <5129D665.3070206-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> @ 2013-02-24 14:42 ` Or Gerlitz 0 siblings, 0 replies; 23+ messages in thread From: Or Gerlitz @ 2013-02-24 14:42 UTC (permalink / raw) To: Sagi Grimberg, David Dillow, Roland Dreier Cc: Bart Van Assche, Vu Pham, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Or Gerlitz On Sun, Feb 24, 2013 at 10:59 AM, Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote: > On 2/24/2013 10:09 AM, Bart Van Assche wrote: >> On 02/18/13 09:11, Sagi Grimberg wrote: >>> On 2/18/2013 6:06 AM, David Dillow wrote: >>>> Or, Vu, does this solve the issue you've seen? I may have time to test >>>> later this week, but not before. >>> This indeed solve scsi_host removal issues. >>> Vu is on vacation, I'll perform some more failover tests... >> Hello Sagi, Since no further feedback >> was posted on the list I assume that means that all tests passed ? > Hey Bart, Sorry for the delay I was just about to reply... > From my end, the related patchset seems solve the scsi_host removal issue > and prevents the SCSI error handling loop. > Generally our tests passed, I still have some issue with long-term > failover test but I'm not sure its SRP (perhaps might origin in IB layer). > So ack from me... OK Dave/Roland - sounds like we did managed to get real progress here... lets get these fix in for 3.9 and 3.8-stable too, Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] IB/srp: Fail I/O requests if the transport is offline [not found] ` <1361160385.7415.2.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org> 2013-02-18 8:11 ` Sagi Grimberg @ 2013-02-21 16:10 ` Bart Van Assche 1 sibling, 0 replies; 23+ messages in thread From: Bart Van Assche @ 2013-02-21 16:10 UTC (permalink / raw) To: David Dillow Cc: Vu Pham, Or Gerlitz, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Or Gerlitz, Sagi Grimberg On 02/18/13 05:06, David Dillow wrote: > On Fri, 2013-02-15 at 10:39 +0100, Bart Van Assche wrote: >> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c >> index 8a7eb9f..b34752d 100644 >> --- a/drivers/infiniband/ulp/srp/ib_srp.c >> +++ b/drivers/infiniband/ulp/srp/ib_srp.c >> @@ -734,6 +734,7 @@ static int srp_reconnect_target(struct srp_target_port *target) >> >> scsi_target_unblock(&shost->shost_gendev, ret == 0 ? SDEV_RUNNING : >> SDEV_TRANSPORT_OFFLINE); >> + target->transport_offline = ret != 0; > > Minor nit, that line is hard to read; I keep thinking it needs parens > around the conditional... > > Perhaps > target->transport_offline = !!ret; > or > target->transport_offline = ret; > > gcc should do the right conversion since we're assigning to a bool. Personally I prefer changing the assignment into the first alternative since it's more explicit than the second alternative - it doesn't require the person who's reading the source code that the transport_offline variable has been declared as "bool". Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2013-02-24 14:42 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-02-01 15:18 [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 Bart Van Assche
[not found] ` <510BDCAA.204-HInyCGIudOg@public.gmane.org>
2013-02-01 15:18 ` [PATCH for 3.8 v3, resend 1/3] IB/srp: Track connection state properly Bart Van Assche
2013-02-01 15:19 ` [PATCH for 3.8 v3, resend 2/3] IB/srp: Avoid sending a task management function needlessly Bart Van Assche
2013-02-01 15:21 ` [PATCH for 3.8 v3, resend 3/3] IB/srp: Avoid endless SCSI error handling loop Bart Van Assche
2013-02-04 21:11 ` [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 Or Gerlitz
[not found] ` <CAJZOPZLKQV0QvrW5sK8hQJf7AZc+1nUzp+5YCkZ3iVU4oTWbLg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-02-05 16:25 ` Bart Van Assche
[not found] ` <5111327F.6050402-HInyCGIudOg@public.gmane.org>
2013-02-05 20:54 ` Or Gerlitz
[not found] ` <CAJZOPZ+-Zg=jnqg4ZmFL5Yo4_2DoWGcgy=3u6g3Rf9y80pXnpg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-02-06 7:22 ` Bart Van Assche
[not found] ` <5112049B.8030406-HInyCGIudOg@public.gmane.org>
2013-02-06 7:44 ` Or Gerlitz
[not found] ` <511209E5.1010807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-02-06 7:59 ` Bart Van Assche
[not found] ` <51120D4F.2070102-HInyCGIudOg@public.gmane.org>
2013-02-06 8:25 ` Or Gerlitz
2013-02-06 21:42 ` Vu Pham
[not found] ` <5112CE60.2030607-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-02-07 9:05 ` Bart Van Assche
[not found] ` <51136E74.9090209-HInyCGIudOg@public.gmane.org>
2013-02-07 9:41 ` Or Gerlitz
[not found] ` <511376C2.6050100-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-02-07 10:15 ` Bart Van Assche
2013-02-07 18:20 ` Vu Pham
[not found] ` <5113F056.4020501-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-02-15 9:39 ` [PATCH] IB/srp: Fail I/O requests if the transport is offline Bart Van Assche
[not found] ` <511E024E.70002-HInyCGIudOg@public.gmane.org>
2013-02-18 4:06 ` David Dillow
[not found] ` <1361160385.7415.2.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
2013-02-18 8:11 ` Sagi Grimberg
[not found] ` <5121E217.3080003-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-02-24 8:09 ` Bart Van Assche
[not found] ` <5129CAB6.5030506-HInyCGIudOg@public.gmane.org>
2013-02-24 8:59 ` Sagi Grimberg
[not found] ` <5129D665.3070206-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-02-24 14:42 ` Or Gerlitz
2013-02-21 16:10 ` Bart Van Assche
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).