linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] fc-transport: Close state transition-window during rport deletion.
@ 2009-04-28 14:01 Andrew Vasquez
  2009-04-28 14:17 ` Mike Christie
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Vasquez @ 2009-04-28 14:01 UTC (permalink / raw)
  To: Linux SCSI Mailing List; +Cc: James Smart

After an rport's state has transitioned to FC_PORTSTATE_BLOCKED,
but, prior to making the upcall to 'block' the scsi-target
associated with an rport, queued commands can recycle and
ultimately run out of retries causing failures to propagate to
upper-level drivers.  Close this transition-window by returning
the non-'retries' modifying DID_IMM_RETRY status for submitted
I/Os.

Issue seen during continuous LIP-injection.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>

---

diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
index c9184f7..d189e0e 100644
--- a/include/scsi/scsi_transport_fc.h
+++ b/include/scsi/scsi_transport_fc.h
@@ -687,6 +687,8 @@ fc_remote_port_chkready(struct fc_rport *rport)
 	case FC_PORTSTATE_BLOCKED:
 		if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
 			result = DID_TRANSPORT_FAILFAST << 16;
+		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
+			result = DID_IMM_RETRY << 16;
 		else
 			result = DID_TRANSPORT_DISRUPTED << 16;
 		break;

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] fc-transport: Close state transition-window during rport deletion.
  2009-04-28 14:01 [PATCH] fc-transport: Close state transition-window during rport deletion Andrew Vasquez
@ 2009-04-28 14:17 ` Mike Christie
  2009-04-28 14:37   ` Andrew Vasquez
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Christie @ 2009-04-28 14:17 UTC (permalink / raw)
  To: Andrew Vasquez; +Cc: Linux SCSI Mailing List, James Smart

Andrew Vasquez wrote:
> After an rport's state has transitioned to FC_PORTSTATE_BLOCKED,
> but, prior to making the upcall to 'block' the scsi-target
> associated with an rport, queued commands can recycle and
> ultimately run out of retries causing failures to propagate to
> upper-level drivers.  Close this transition-window by returning
> the non-'retries' modifying DID_IMM_RETRY status for submitted
> I/Os.
> 
> Issue seen during continuous LIP-injection.
> 
> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
> 
> ---
> 
> diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
> index c9184f7..d189e0e 100644
> --- a/include/scsi/scsi_transport_fc.h
> +++ b/include/scsi/scsi_transport_fc.h
> @@ -687,6 +687,8 @@ fc_remote_port_chkready(struct fc_rport *rport)
>  	case FC_PORTSTATE_BLOCKED:
>  		if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
>  			result = DID_TRANSPORT_FAILFAST << 16;
> +		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
> +			result = DID_IMM_RETRY << 16;
>  		else
>  			result = DID_TRANSPORT_DISRUPTED << 16;

I think you can just remove this DID_TRANSPORT_DISRUPTED. The deletion, 
role change or re-addition code will do the right thing with the IO when 
it finishes the transition for this case.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] fc-transport: Close state transition-window during rport deletion.
  2009-04-28 14:17 ` Mike Christie
@ 2009-04-28 14:37   ` Andrew Vasquez
  2009-04-28 15:01     ` Mike Christie
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Vasquez @ 2009-04-28 14:37 UTC (permalink / raw)
  To: Mike Christie; +Cc: Linux SCSI Mailing List, James Smart

On Tue, 28 Apr 2009, Mike Christie wrote:

> Andrew Vasquez wrote:
> > After an rport's state has transitioned to FC_PORTSTATE_BLOCKED,
> > but, prior to making the upcall to 'block' the scsi-target
> > associated with an rport, queued commands can recycle and
> > ultimately run out of retries causing failures to propagate to
> > upper-level drivers.  Close this transition-window by returning
> > the non-'retries' modifying DID_IMM_RETRY status for submitted
> > I/Os.
> > 
> > Issue seen during continuous LIP-injection.
> > 
> > Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
> > 
> > ---
> > 
> > diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
> > index c9184f7..d189e0e 100644
> > --- a/include/scsi/scsi_transport_fc.h
> > +++ b/include/scsi/scsi_transport_fc.h
> > @@ -687,6 +687,8 @@ fc_remote_port_chkready(struct fc_rport *rport)
> >  	case FC_PORTSTATE_BLOCKED:
> >  		if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
> >  			result = DID_TRANSPORT_FAILFAST << 16;
> > +		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
> > +			result = DID_IMM_RETRY << 16;
> >  		else
> >  			result = DID_TRANSPORT_DISRUPTED << 16;
> 
> I think you can just remove this DID_TRANSPORT_DISRUPTED. The deletion, 
> role change or re-addition code will do the right thing with the IO when 
> it finishes the transition for this case.

Just to be clear here, you're proposing this as an alternate?

-- av

diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
index c9184f7..a53a0fd 100644
--- a/include/scsi/scsi_transport_fc.h
+++ b/include/scsi/scsi_transport_fc.h
@@ -688,7 +688,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
 		if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
 			result = DID_TRANSPORT_FAILFAST << 16;
 		else
-			result = DID_TRANSPORT_DISRUPTED << 16;
+			result = DID_IMM_RETRY << 16;
 		break;
 	default:
 		result = DID_NO_CONNECT << 16;


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] fc-transport: Close state transition-window during rport deletion.
  2009-04-28 14:37   ` Andrew Vasquez
@ 2009-04-28 15:01     ` Mike Christie
  2009-04-28 16:01       ` Andrew Vasquez
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Christie @ 2009-04-28 15:01 UTC (permalink / raw)
  To: Andrew Vasquez; +Cc: Linux SCSI Mailing List, James Smart

Andrew Vasquez wrote:
> On Tue, 28 Apr 2009, Mike Christie wrote:
> 
>> Andrew Vasquez wrote:
>>> After an rport's state has transitioned to FC_PORTSTATE_BLOCKED,
>>> but, prior to making the upcall to 'block' the scsi-target
>>> associated with an rport, queued commands can recycle and
>>> ultimately run out of retries causing failures to propagate to
>>> upper-level drivers.  Close this transition-window by returning
>>> the non-'retries' modifying DID_IMM_RETRY status for submitted
>>> I/Os.
>>>
>>> Issue seen during continuous LIP-injection.
>>>
>>> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
>>>
>>> ---
>>>
>>> diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
>>> index c9184f7..d189e0e 100644
>>> --- a/include/scsi/scsi_transport_fc.h
>>> +++ b/include/scsi/scsi_transport_fc.h
>>> @@ -687,6 +687,8 @@ fc_remote_port_chkready(struct fc_rport *rport)
>>>  	case FC_PORTSTATE_BLOCKED:
>>>  		if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
>>>  			result = DID_TRANSPORT_FAILFAST << 16;
>>> +		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
>>> +			result = DID_IMM_RETRY << 16;
>>>  		else
>>>  			result = DID_TRANSPORT_DISRUPTED << 16;
>> I think you can just remove this DID_TRANSPORT_DISRUPTED. The deletion, 
>> role change or re-addition code will do the right thing with the IO when 
>> it finishes the transition for this case.
> 
> Just to be clear here, you're proposing this as an alternate?
> 
> -- av
> 
> diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
> index c9184f7..a53a0fd 100644
> --- a/include/scsi/scsi_transport_fc.h
> +++ b/include/scsi/scsi_transport_fc.h
> @@ -688,7 +688,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
>  		if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
>  			result = DID_TRANSPORT_FAILFAST << 16;
>  		else
> -			result = DID_TRANSPORT_DISRUPTED << 16;
> +			result = DID_IMM_RETRY << 16;

Yeah, I think that is what we want. We originally had only 
DID_IMM_RETRY. When I added DID_TRANSPORT_DISRUPTED, it initially had 
infinite retries like DID_IMM_RETRY so the behavior was not changed. 
When I fixed DID_TRANSPORT_DISRUPTED to follow the cmd retries/allowed, 
I should have changed this code back to use DID_IMM_RETRY.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] fc-transport: Close state transition-window during rport deletion.
  2009-04-28 15:01     ` Mike Christie
@ 2009-04-28 16:01       ` Andrew Vasquez
  2009-04-29 17:41         ` Mike Christie
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Vasquez @ 2009-04-28 16:01 UTC (permalink / raw)
  To: Linux SCSI Mailing List; +Cc: Mike Christie, James Smart

On Tue, 28 Apr 2009, Mike Christie wrote:

> Andrew Vasquez wrote:
> > On Tue, 28 Apr 2009, Mike Christie wrote:
> > 
> >> Andrew Vasquez wrote:
> >>> After an rport's state has transitioned to FC_PORTSTATE_BLOCKED,
> >>> but, prior to making the upcall to 'block' the scsi-target
> >>> associated with an rport, queued commands can recycle and
> >>> ultimately run out of retries causing failures to propagate to
> >>> upper-level drivers.  Close this transition-window by returning
> >>> the non-'retries' modifying DID_IMM_RETRY status for submitted
> >>> I/Os.
> >>>
> >>> Issue seen during continuous LIP-injection.
> >>>
> >>> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
> >>>
> >>> ---
> >>>
> >>> diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
> >>> index c9184f7..d189e0e 100644
> >>> --- a/include/scsi/scsi_transport_fc.h
> >>> +++ b/include/scsi/scsi_transport_fc.h
> >>> @@ -687,6 +687,8 @@ fc_remote_port_chkready(struct fc_rport *rport)
> >>>  	case FC_PORTSTATE_BLOCKED:
> >>>  		if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
> >>>  			result = DID_TRANSPORT_FAILFAST << 16;
> >>> +		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
> >>> +			result = DID_IMM_RETRY << 16;
> >>>  		else
> >>>  			result = DID_TRANSPORT_DISRUPTED << 16;
> >> I think you can just remove this DID_TRANSPORT_DISRUPTED. The deletion, 
> >> role change or re-addition code will do the right thing with the IO when 
> >> it finishes the transition for this case.
> > 
> > Just to be clear here, you're proposing this as an alternate?
> > 
> > -- av
> > 
> > diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
> > index c9184f7..a53a0fd 100644
> > --- a/include/scsi/scsi_transport_fc.h
> > +++ b/include/scsi/scsi_transport_fc.h
> > @@ -688,7 +688,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
> >  		if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
> >  			result = DID_TRANSPORT_FAILFAST << 16;
> >  		else
> > -			result = DID_TRANSPORT_DISRUPTED << 16;
> > +			result = DID_IMM_RETRY << 16;
> 
> Yeah, I think that is what we want. We originally had only 
> DID_IMM_RETRY. When I added DID_TRANSPORT_DISRUPTED, it initially had 
> infinite retries like DID_IMM_RETRY so the behavior was not changed. 
> When I fixed DID_TRANSPORT_DISRUPTED to follow the cmd retries/allowed, 
> I should have changed this code back to use DID_IMM_RETRY.

Ok, here's a final one with an updated commit message.

---

fc-transport: Close state transition-window during rport deletion.

After an rport's state has transitioned to FC_PORTSTATE_BLOCKED,
but, prior to making the upcall to 'block' the scsi-target
associated with an rport, queued commands can recycle and
ultimately run out of retries causing failures to propagate to
upper-level drivers.  Close this transition-window by returning
the non-'retries' modifying DID_IMM_RETRY status for submitted
I/Os.

Issue seen during continuous LIP-injection.

Mike Christie (michaelc@cs.wisc.edu) also notes that this is a
partial revert of f46e307da925a7b71a0018c0510cdc6e588b87fc
([SCSI] fc class: Add support for new transport errors), as
follow-on transport changes now have DID_TRANSPORT_* statuses
follow a command's retries/allowed values.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>

--

diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
index c9184f7..a53a0fd 100644
--- a/include/scsi/scsi_transport_fc.h
+++ b/include/scsi/scsi_transport_fc.h
@@ -688,7 +688,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
 		if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
 			result = DID_TRANSPORT_FAILFAST << 16;
 		else
-			result = DID_TRANSPORT_DISRUPTED << 16;
+			result = DID_IMM_RETRY << 16;
 		break;
 	default:
 		result = DID_NO_CONNECT << 16;


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] fc-transport: Close state transition-window during rport deletion.
  2009-04-28 16:01       ` Andrew Vasquez
@ 2009-04-29 17:41         ` Mike Christie
  2009-04-29 18:12           ` Mike Christie
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Christie @ 2009-04-29 17:41 UTC (permalink / raw)
  To: Andrew Vasquez; +Cc: Linux SCSI Mailing List, James Smart

[-- Attachment #1: Type: text/plain, Size: 4042 bytes --]

Andrew Vasquez wrote:
> On Tue, 28 Apr 2009, Mike Christie wrote:
> 
>> Andrew Vasquez wrote:
>>> On Tue, 28 Apr 2009, Mike Christie wrote:
>>>
>>>> Andrew Vasquez wrote:
>>>>> After an rport's state has transitioned to FC_PORTSTATE_BLOCKED,
>>>>> but, prior to making the upcall to 'block' the scsi-target
>>>>> associated with an rport, queued commands can recycle and
>>>>> ultimately run out of retries causing failures to propagate to
>>>>> upper-level drivers.  Close this transition-window by returning
>>>>> the non-'retries' modifying DID_IMM_RETRY status for submitted
>>>>> I/Os.
>>>>>
>>>>> Issue seen during continuous LIP-injection.
>>>>>
>>>>> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
>>>>>
>>>>> ---
>>>>>
>>>>> diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
>>>>> index c9184f7..d189e0e 100644
>>>>> --- a/include/scsi/scsi_transport_fc.h
>>>>> +++ b/include/scsi/scsi_transport_fc.h
>>>>> @@ -687,6 +687,8 @@ fc_remote_port_chkready(struct fc_rport *rport)
>>>>>  	case FC_PORTSTATE_BLOCKED:
>>>>>  		if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
>>>>>  			result = DID_TRANSPORT_FAILFAST << 16;
>>>>> +		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
>>>>> +			result = DID_IMM_RETRY << 16;
>>>>>  		else
>>>>>  			result = DID_TRANSPORT_DISRUPTED << 16;
>>>> I think you can just remove this DID_TRANSPORT_DISRUPTED. The deletion, 
>>>> role change or re-addition code will do the right thing with the IO when 
>>>> it finishes the transition for this case.
>>> Just to be clear here, you're proposing this as an alternate?
>>>
>>> -- av
>>>
>>> diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
>>> index c9184f7..a53a0fd 100644
>>> --- a/include/scsi/scsi_transport_fc.h
>>> +++ b/include/scsi/scsi_transport_fc.h
>>> @@ -688,7 +688,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
>>>  		if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
>>>  			result = DID_TRANSPORT_FAILFAST << 16;
>>>  		else
>>> -			result = DID_TRANSPORT_DISRUPTED << 16;
>>> +			result = DID_IMM_RETRY << 16;
>> Yeah, I think that is what we want. We originally had only 
>> DID_IMM_RETRY. When I added DID_TRANSPORT_DISRUPTED, it initially had 
>> infinite retries like DID_IMM_RETRY so the behavior was not changed. 
>> When I fixed DID_TRANSPORT_DISRUPTED to follow the cmd retries/allowed, 
>> I should have changed this code back to use DID_IMM_RETRY.
> 
> Ok, here's a final one with an updated commit message.
> 
> ---
> 
> fc-transport: Close state transition-window during rport deletion.
> 
> After an rport's state has transitioned to FC_PORTSTATE_BLOCKED,
> but, prior to making the upcall to 'block' the scsi-target
> associated with an rport, queued commands can recycle and
> ultimately run out of retries causing failures to propagate to
> upper-level drivers.  Close this transition-window by returning
> the non-'retries' modifying DID_IMM_RETRY status for submitted
> I/Os.
> 
> Issue seen during continuous LIP-injection.
> 
> Mike Christie (michaelc@cs.wisc.edu) also notes that this is a
> partial revert of f46e307da925a7b71a0018c0510cdc6e588b87fc
> ([SCSI] fc class: Add support for new transport errors), as
> follow-on transport changes now have DID_TRANSPORT_* statuses
> follow a command's retries/allowed values.
> 
> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
> 
> --
> 
> diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
> index c9184f7..a53a0fd 100644
> --- a/include/scsi/scsi_transport_fc.h
> +++ b/include/scsi/scsi_transport_fc.h
> @@ -688,7 +688,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
>  		if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
>  			result = DID_TRANSPORT_FAILFAST << 16;
>  		else
> -			result = DID_TRANSPORT_DISRUPTED << 16;
> +			result = DID_IMM_RETRY << 16;
>  		break;
>  	default:
>  		result = DID_NO_CONNECT << 16;
> 

Hey, I did the attached patch to convert the port online devloss case 
and iscsi since it will have the same problem.

[-- Attachment #2: use-did-imm-retry.patch --]
[-- Type: text/plain, Size: 2210 bytes --]

>From Andrew Vasquez:

> fc-transport: Close state transition-window during rport deletion.
> 
> After an rport's state has transitioned to FC_PORTSTATE_BLOCKED,
> but, prior to making the upcall to 'block' the scsi-target
> associated with an rport, queued commands can recycle and
> ultimately run out of retries causing failures to propagate to
> upper-level drivers.  Close this transition-window by returning
> the non-'retries' modifying DID_IMM_RETRY status for submitted
> I/Os.

The same can happen for iscsi when transitioning from logged in
to failed and blocking the sdevs.

This patch converts iscsi and fc's transitions back to use DID_IMM_RETRY
instead of DID_TRANSPORT_DISRUPTED which has a limited number of retries
that we do not want to use for handling this race.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Addition of iscsi and fc port online devloss case conversion by Mike Christie


diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 0b117c5..04d9da6 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -357,7 +357,7 @@ int iscsi_session_chkready(struct iscsi_cls_session *session)
 		err = 0;
 		break;
 	case ISCSI_SESSION_FAILED:
-		err = DID_TRANSPORT_DISRUPTED << 16;
+		err = DID_IMM_RETRY << 16;
 		break;
 	case ISCSI_SESSION_FREE:
 		err = DID_TRANSPORT_FAILFAST << 16;
diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
index c9184f7..68a8d87 100644
--- a/include/scsi/scsi_transport_fc.h
+++ b/include/scsi/scsi_transport_fc.h
@@ -680,7 +680,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
 		if (rport->roles & FC_PORT_ROLE_FCP_TARGET)
 			result = 0;
 		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
-			result = DID_TRANSPORT_DISRUPTED << 16;
+			result = DID_IMM_RETRY << 16;
 		else
 			result = DID_NO_CONNECT << 16;
 		break;
@@ -688,7 +688,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
 		if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
 			result = DID_TRANSPORT_FAILFAST << 16;
 		else
-			result = DID_TRANSPORT_DISRUPTED << 16;
+			result = DID_IMM_RETRY << 16;
 		break;
 	default:
 		result = DID_NO_CONNECT << 16;

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] fc-transport: Close state transition-window during rport deletion.
  2009-04-29 17:41         ` Mike Christie
@ 2009-04-29 18:12           ` Mike Christie
  0 siblings, 0 replies; 7+ messages in thread
From: Mike Christie @ 2009-04-29 18:12 UTC (permalink / raw)
  To: Andrew Vasquez; +Cc: Linux SCSI Mailing List, James Smart

[-- Attachment #1: Type: text/plain, Size: 4391 bytes --]

Mike Christie wrote:
> Andrew Vasquez wrote:
>> On Tue, 28 Apr 2009, Mike Christie wrote:
>>
>>> Andrew Vasquez wrote:
>>>> On Tue, 28 Apr 2009, Mike Christie wrote:
>>>>
>>>>> Andrew Vasquez wrote:
>>>>>> After an rport's state has transitioned to FC_PORTSTATE_BLOCKED,
>>>>>> but, prior to making the upcall to 'block' the scsi-target
>>>>>> associated with an rport, queued commands can recycle and
>>>>>> ultimately run out of retries causing failures to propagate to
>>>>>> upper-level drivers.  Close this transition-window by returning
>>>>>> the non-'retries' modifying DID_IMM_RETRY status for submitted
>>>>>> I/Os.
>>>>>>
>>>>>> Issue seen during continuous LIP-injection.
>>>>>>
>>>>>> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
>>>>>>
>>>>>> ---
>>>>>>
>>>>>> diff --git a/include/scsi/scsi_transport_fc.h 
>>>>>> b/include/scsi/scsi_transport_fc.h
>>>>>> index c9184f7..d189e0e 100644
>>>>>> --- a/include/scsi/scsi_transport_fc.h
>>>>>> +++ b/include/scsi/scsi_transport_fc.h
>>>>>> @@ -687,6 +687,8 @@ fc_remote_port_chkready(struct fc_rport *rport)
>>>>>>      case FC_PORTSTATE_BLOCKED:
>>>>>>          if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
>>>>>>              result = DID_TRANSPORT_FAILFAST << 16;
>>>>>> +        else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
>>>>>> +            result = DID_IMM_RETRY << 16;
>>>>>>          else
>>>>>>              result = DID_TRANSPORT_DISRUPTED << 16;
>>>>> I think you can just remove this DID_TRANSPORT_DISRUPTED. The 
>>>>> deletion, role change or re-addition code will do the right thing 
>>>>> with the IO when it finishes the transition for this case.
>>>> Just to be clear here, you're proposing this as an alternate?
>>>>
>>>> -- av
>>>>
>>>> diff --git a/include/scsi/scsi_transport_fc.h 
>>>> b/include/scsi/scsi_transport_fc.h
>>>> index c9184f7..a53a0fd 100644
>>>> --- a/include/scsi/scsi_transport_fc.h
>>>> +++ b/include/scsi/scsi_transport_fc.h
>>>> @@ -688,7 +688,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
>>>>          if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
>>>>              result = DID_TRANSPORT_FAILFAST << 16;
>>>>          else
>>>> -            result = DID_TRANSPORT_DISRUPTED << 16;
>>>> +            result = DID_IMM_RETRY << 16;
>>> Yeah, I think that is what we want. We originally had only 
>>> DID_IMM_RETRY. When I added DID_TRANSPORT_DISRUPTED, it initially had 
>>> infinite retries like DID_IMM_RETRY so the behavior was not changed. 
>>> When I fixed DID_TRANSPORT_DISRUPTED to follow the cmd 
>>> retries/allowed, I should have changed this code back to use 
>>> DID_IMM_RETRY.
>>
>> Ok, here's a final one with an updated commit message.
>>
>> ---
>>
>> fc-transport: Close state transition-window during rport deletion.
>>
>> After an rport's state has transitioned to FC_PORTSTATE_BLOCKED,
>> but, prior to making the upcall to 'block' the scsi-target
>> associated with an rport, queued commands can recycle and
>> ultimately run out of retries causing failures to propagate to
>> upper-level drivers.  Close this transition-window by returning
>> the non-'retries' modifying DID_IMM_RETRY status for submitted
>> I/Os.
>>
>> Issue seen during continuous LIP-injection.
>>
>> Mike Christie (michaelc@cs.wisc.edu) also notes that this is a
>> partial revert of f46e307da925a7b71a0018c0510cdc6e588b87fc
>> ([SCSI] fc class: Add support for new transport errors), as
>> follow-on transport changes now have DID_TRANSPORT_* statuses
>> follow a command's retries/allowed values.
>>
>> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
>>
>> -- 
>>
>> diff --git a/include/scsi/scsi_transport_fc.h 
>> b/include/scsi/scsi_transport_fc.h
>> index c9184f7..a53a0fd 100644
>> --- a/include/scsi/scsi_transport_fc.h
>> +++ b/include/scsi/scsi_transport_fc.h
>> @@ -688,7 +688,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
>>          if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
>>              result = DID_TRANSPORT_FAILFAST << 16;
>>          else
>> -            result = DID_TRANSPORT_DISRUPTED << 16;
>> +            result = DID_IMM_RETRY << 16;
>>          break;
>>      default:
>>          result = DID_NO_CONNECT << 16;
>>
> 
> Hey, I did the attached patch to convert the port online devloss case 
> and iscsi since it will have the same problem.
> 

I forgot to add my singed off on the patch. Here is a updated one.

[-- Attachment #2: use-did-imm-retry-w-signed.patch --]
[-- Type: text/plain, Size: 2264 bytes --]

>From Andrew Vasquez:

> fc-transport: Close state transition-window during rport deletion.
> 
> After an rport's state has transitioned to FC_PORTSTATE_BLOCKED,
> but, prior to making the upcall to 'block' the scsi-target
> associated with an rport, queued commands can recycle and
> ultimately run out of retries causing failures to propagate to
> upper-level drivers.  Close this transition-window by returning
> the non-'retries' modifying DID_IMM_RETRY status for submitted
> I/Os.

The same can happen for iscsi when transitioning from logged in
to failed and blocking the sdevs.

This patch converts iscsi and fc's transitions back to use DID_IMM_RETRY
instead of DID_TRANSPORT_DISRUPTED which has a limited number of retries
that we do not want to use for handling this race.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
[Addition of iscsi and fc port online devloss case conversion by Mike Christie]
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>


diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 0b117c5..04d9da6 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -357,7 +357,7 @@ int iscsi_session_chkready(struct iscsi_cls_session *session)
 		err = 0;
 		break;
 	case ISCSI_SESSION_FAILED:
-		err = DID_TRANSPORT_DISRUPTED << 16;
+		err = DID_IMM_RETRY << 16;
 		break;
 	case ISCSI_SESSION_FREE:
 		err = DID_TRANSPORT_FAILFAST << 16;
diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
index c9184f7..68a8d87 100644
--- a/include/scsi/scsi_transport_fc.h
+++ b/include/scsi/scsi_transport_fc.h
@@ -680,7 +680,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
 		if (rport->roles & FC_PORT_ROLE_FCP_TARGET)
 			result = 0;
 		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
-			result = DID_TRANSPORT_DISRUPTED << 16;
+			result = DID_IMM_RETRY << 16;
 		else
 			result = DID_NO_CONNECT << 16;
 		break;
@@ -688,7 +688,7 @@ fc_remote_port_chkready(struct fc_rport *rport)
 		if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
 			result = DID_TRANSPORT_FAILFAST << 16;
 		else
-			result = DID_TRANSPORT_DISRUPTED << 16;
+			result = DID_IMM_RETRY << 16;
 		break;
 	default:
 		result = DID_NO_CONNECT << 16;

^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2009-04-29 18:12 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-04-28 14:01 [PATCH] fc-transport: Close state transition-window during rport deletion Andrew Vasquez
2009-04-28 14:17 ` Mike Christie
2009-04-28 14:37   ` Andrew Vasquez
2009-04-28 15:01     ` Mike Christie
2009-04-28 16:01       ` Andrew Vasquez
2009-04-29 17:41         ` Mike Christie
2009-04-29 18:12           ` Mike Christie

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).