From mboxrd@z Thu Jan 1 00:00:00 1970 From: Mike Christie Subject: Re: [PATCH] fc-transport: Close state transition-window during rport deletion. Date: Wed, 29 Apr 2009 13:12:39 -0500 Message-ID: <49F89897.5070007@cs.wisc.edu> References: <20090428140152.GA16413@plap4-2.qlogic.org> <49F71006.4060703@cs.wisc.edu> <20090428143746.GE16413@plap4-2.qlogic.org> <49F71A52.5090802@cs.wisc.edu> <20090428160110.GI16413@plap4-2.qlogic.org> <49F8914C.7050803@cs.wisc.edu> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="------------000609060103040004020904" Return-path: Received: from sabe.cs.wisc.edu ([128.105.6.20]:46376 "EHLO sabe.cs.wisc.edu" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752550AbZD2SMw (ORCPT ); Wed, 29 Apr 2009 14:12:52 -0400 In-Reply-To: <49F8914C.7050803@cs.wisc.edu> Sender: linux-scsi-owner@vger.kernel.org List-Id: linux-scsi@vger.kernel.org To: Andrew Vasquez Cc: Linux SCSI Mailing List , James Smart This is a multi-part message in MIME format. --------------000609060103040004020904 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Mike Christie wrote: > Andrew Vasquez wrote: >> On Tue, 28 Apr 2009, Mike Christie wrote: >> >>> Andrew Vasquez wrote: >>>> On Tue, 28 Apr 2009, Mike Christie wrote: >>>> >>>>> Andrew Vasquez wrote: >>>>>> After an rport's state has transitioned to FC_PORTSTATE_BLOCKED, >>>>>> but, prior to making the upcall to 'block' the scsi-target >>>>>> associated with an rport, queued commands can recycle and >>>>>> ultimately run out of retries causing failures to propagate to >>>>>> upper-level drivers. Close this transition-window by returning >>>>>> the non-'retries' modifying DID_IMM_RETRY status for submitted >>>>>> I/Os. >>>>>> >>>>>> Issue seen during continuous LIP-injection. >>>>>> >>>>>> Signed-off-by: Andrew Vasquez >>>>>> >>>>>> --- >>>>>> >>>>>> diff --git a/include/scsi/scsi_transport_fc.h >>>>>> b/include/scsi/scsi_transport_fc.h >>>>>> index c9184f7..d189e0e 100644 >>>>>> --- a/include/scsi/scsi_transport_fc.h >>>>>> +++ b/include/scsi/scsi_transport_fc.h >>>>>> @@ -687,6 +687,8 @@ fc_remote_port_chkready(struct fc_rport *rport) >>>>>> case FC_PORTSTATE_BLOCKED: >>>>>> if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT) >>>>>> result = DID_TRANSPORT_FAILFAST << 16; >>>>>> + else if (rport->flags & FC_RPORT_DEVLOSS_PENDING) >>>>>> + result = DID_IMM_RETRY << 16; >>>>>> else >>>>>> result = DID_TRANSPORT_DISRUPTED << 16; >>>>> I think you can just remove this DID_TRANSPORT_DISRUPTED. The >>>>> deletion, role change or re-addition code will do the right thing >>>>> with the IO when it finishes the transition for this case. >>>> Just to be clear here, you're proposing this as an alternate? >>>> >>>> -- av >>>> >>>> diff --git a/include/scsi/scsi_transport_fc.h >>>> b/include/scsi/scsi_transport_fc.h >>>> index c9184f7..a53a0fd 100644 >>>> --- a/include/scsi/scsi_transport_fc.h >>>> +++ b/include/scsi/scsi_transport_fc.h >>>> @@ -688,7 +688,7 @@ fc_remote_port_chkready(struct fc_rport *rport) >>>> if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT) >>>> result = DID_TRANSPORT_FAILFAST << 16; >>>> else >>>> - result = DID_TRANSPORT_DISRUPTED << 16; >>>> + result = DID_IMM_RETRY << 16; >>> Yeah, I think that is what we want. We originally had only >>> DID_IMM_RETRY. When I added DID_TRANSPORT_DISRUPTED, it initially had >>> infinite retries like DID_IMM_RETRY so the behavior was not changed. >>> When I fixed DID_TRANSPORT_DISRUPTED to follow the cmd >>> retries/allowed, I should have changed this code back to use >>> DID_IMM_RETRY. >> >> Ok, here's a final one with an updated commit message. >> >> --- >> >> fc-transport: Close state transition-window during rport deletion. >> >> After an rport's state has transitioned to FC_PORTSTATE_BLOCKED, >> but, prior to making the upcall to 'block' the scsi-target >> associated with an rport, queued commands can recycle and >> ultimately run out of retries causing failures to propagate to >> upper-level drivers. Close this transition-window by returning >> the non-'retries' modifying DID_IMM_RETRY status for submitted >> I/Os. >> >> Issue seen during continuous LIP-injection. >> >> Mike Christie (michaelc@cs.wisc.edu) also notes that this is a >> partial revert of f46e307da925a7b71a0018c0510cdc6e588b87fc >> ([SCSI] fc class: Add support for new transport errors), as >> follow-on transport changes now have DID_TRANSPORT_* statuses >> follow a command's retries/allowed values. >> >> Signed-off-by: Andrew Vasquez >> >> -- >> >> diff --git a/include/scsi/scsi_transport_fc.h >> b/include/scsi/scsi_transport_fc.h >> index c9184f7..a53a0fd 100644 >> --- a/include/scsi/scsi_transport_fc.h >> +++ b/include/scsi/scsi_transport_fc.h >> @@ -688,7 +688,7 @@ fc_remote_port_chkready(struct fc_rport *rport) >> if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT) >> result = DID_TRANSPORT_FAILFAST << 16; >> else >> - result = DID_TRANSPORT_DISRUPTED << 16; >> + result = DID_IMM_RETRY << 16; >> break; >> default: >> result = DID_NO_CONNECT << 16; >> > > Hey, I did the attached patch to convert the port online devloss case > and iscsi since it will have the same problem. > I forgot to add my singed off on the patch. Here is a updated one. --------------000609060103040004020904 Content-Type: text/plain; name="use-did-imm-retry-w-signed.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="use-did-imm-retry-w-signed.patch" >>From Andrew Vasquez: > fc-transport: Close state transition-window during rport deletion. > > After an rport's state has transitioned to FC_PORTSTATE_BLOCKED, > but, prior to making the upcall to 'block' the scsi-target > associated with an rport, queued commands can recycle and > ultimately run out of retries causing failures to propagate to > upper-level drivers. Close this transition-window by returning > the non-'retries' modifying DID_IMM_RETRY status for submitted > I/Os. The same can happen for iscsi when transitioning from logged in to failed and blocking the sdevs. This patch converts iscsi and fc's transitions back to use DID_IMM_RETRY instead of DID_TRANSPORT_DISRUPTED which has a limited number of retries that we do not want to use for handling this race. Signed-off-by: Andrew Vasquez [Addition of iscsi and fc port online devloss case conversion by Mike Christie] Signed-off-by: Mike Christie diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index 0b117c5..04d9da6 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -357,7 +357,7 @@ int iscsi_session_chkready(struct iscsi_cls_session *session) err = 0; break; case ISCSI_SESSION_FAILED: - err = DID_TRANSPORT_DISRUPTED << 16; + err = DID_IMM_RETRY << 16; break; case ISCSI_SESSION_FREE: err = DID_TRANSPORT_FAILFAST << 16; diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h index c9184f7..68a8d87 100644 --- a/include/scsi/scsi_transport_fc.h +++ b/include/scsi/scsi_transport_fc.h @@ -680,7 +680,7 @@ fc_remote_port_chkready(struct fc_rport *rport) if (rport->roles & FC_PORT_ROLE_FCP_TARGET) result = 0; else if (rport->flags & FC_RPORT_DEVLOSS_PENDING) - result = DID_TRANSPORT_DISRUPTED << 16; + result = DID_IMM_RETRY << 16; else result = DID_NO_CONNECT << 16; break; @@ -688,7 +688,7 @@ fc_remote_port_chkready(struct fc_rport *rport) if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT) result = DID_TRANSPORT_FAILFAST << 16; else - result = DID_TRANSPORT_DISRUPTED << 16; + result = DID_IMM_RETRY << 16; break; default: result = DID_NO_CONNECT << 16; --------------000609060103040004020904--