From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bart Van Assche
Subject: Re: [PATCH v2 04/15] IB/srp: Fail I/O fast if target offline
Date: Mon, 01 Jul 2013 13:38:33 +0200
Message-ID: <51D16A39.4050709@acm.org>
References: <51CD856A.3010102@acm.org> <51CD8676.6080205@acm.org> <51D14AF1.4000803@profitbricks.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <51D14AF1.4000803-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
Sender: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Sebastian Riemer
Cc: Roland Dreier, David Dillow, Vu Pham, linux-rdma
List-Id: linux-rdma@vger.kernel.org

On 07/01/13 11:25, Sebastian Riemer wrote:
> On 28.06.2013 14:49, Bart Van Assche wrote:
>> If reconnecting failed we know that no command completion will
>> be received anymore. Hence let the SCSI error handler fail such
>> commands immediately.
>>
>> Signed-off-by: Bart Van Assche
>> Cc: Roland Dreier
>> Cc: David Dillow
>> Cc: Sebastian Riemer
>> Cc: Vu Pham
>> ---
>>  drivers/infiniband/ulp/srp/ib_srp.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
>> index 8c95262..5c91521 100644
>> --- a/drivers/infiniband/ulp/srp/ib_srp.c
>> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
>> @@ -1755,6 +1755,8 @@ static int srp_abort(struct scsi_cmnd *scmnd)
>>  	if (srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
>>  			      SRP_TSK_ABORT_TASK) == 0)
>>  		ret = SUCCESS;
>> +	else if (target->transport_offline)
>> +		ret = FAST_IO_FAIL;
>>  	else
>>  		ret = FAILED;
>>  	srp_free_req(target, req, scmnd, 0);
>
> I'm also missing the concept for srp_reset_device(). There is a very
> common case that the SCSI error handling and the transport layer error
> handling run in parallel: Congestion.

Can you explain this comment further, and also how this comment relates
to patch 04/15?

> In congestion some LUNs are blocked while others can still transmit. A
> little bit later the QP timeout triggers in the middle of the SCSI error
> handling in srp_abort(), srp_reset_device() or less likely in
> srp_reset_host().

I am aware this can result in concurrent srp_reconnect_rport() calls.
However, such concurrent calls are serialized via rport->mutex.

Bart.
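
The serialization argument above can be illustrated with a small userspace sketch. It is not the ib_srp code: it assumes POSIX threads in place of the kernel mutex API, and the names rport_sketch, reconnect_sketch() and run_two_callers() are hypothetical stand-ins for the rport object, srp_reconnect_rport(), and the two error-handling paths (SCSI error handler and transport layer). The point it shows is only that when the whole reconnect body runs under one mutex, two concurrent callers never overlap inside it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the rport object; not the kernel structure. */
struct rport_sketch {
	pthread_mutex_t mutex;    /* plays the role of rport->mutex */
	int in_reconnect;         /* callers currently inside the locked section */
	int max_in_reconnect;     /* highest value in_reconnect ever reached */
	int reconnects;           /* completed reconnect attempts */
};

static void rport_sketch_init(struct rport_sketch *r)
{
	assert(r != NULL);
	pthread_mutex_init(&r->mutex, NULL);
	r->in_reconnect = 0;
	r->max_in_reconnect = 0;
	r->reconnects = 0;
}

/* Stand-in for a reconnect routine whose whole body runs under the mutex. */
static int reconnect_sketch(struct rport_sketch *r)
{
	pthread_mutex_lock(&r->mutex);
	r->in_reconnect++;
	if (r->in_reconnect > r->max_in_reconnect)
		r->max_in_reconnect = r->in_reconnect;
	/* ... tearing down and re-establishing the connection would go here ... */
	r->reconnects++;
	r->in_reconnect--;
	pthread_mutex_unlock(&r->mutex);
	return 0;
}

/* Each thread models one error-handling path that triggers reconnects. */
static void *caller(void *arg)
{
	struct rport_sketch *r = arg;
	int i;

	for (i = 0; i < 1000; i++)
		reconnect_sketch(r);
	return NULL;
}

/* Run both paths concurrently; returns 0 on success, -1 if a thread
 * could not be created. */
static int run_two_callers(struct rport_sketch *r)
{
	pthread_t eh_path, transport_path;

	if (pthread_create(&eh_path, NULL, caller, r) != 0)
		return -1;
	if (pthread_create(&transport_path, NULL, caller, r) != 0) {
		pthread_join(eh_path, NULL);
		return -1;
	}
	pthread_join(eh_path, NULL);
	pthread_join(transport_path, NULL);
	return 0;
}
```

After rport_sketch_init() and run_two_callers(), max_in_reconnect stays at 1 although both threads called reconnect_sketch() repeatedly. The sketch says nothing about what each caller should return to the SCSI midlayer once the reconnect has finished, which is the question the patch itself addresses.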