linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC][PATCH] limit state change to SDEV_BLOCK devices in scsi_internal_device_unblock
@ 2009-04-27 17:09 Takahiro Yasui
  2009-04-27 18:03 ` Matthew Wilcox
  2009-04-27 20:08 ` James Bottomley
  0 siblings, 2 replies; 8+ messages in thread
From: Takahiro Yasui @ 2009-04-27 17:09 UTC (permalink / raw)
  To: linux-scsi; +Cc: mchristi, mbarrow

Hi,

This is a patch to fix SCSI timeout error recovery issue which
I posted about one week ago.

  [RFC] SCSI timeout error recovery issue
  http://marc.info/?l=linux-scsi&m=124042136915970&w=2

scsi timeout on two or more devices may cause extremely long execution
time for user applications because SDEV_OFFLINE state is changed to
SDEV_RUNNING state during scsi error recovery procedures triggered by
a bus reset or a host reset of LLD, and scsi timeout can happens on
the same devices many times.

This happens because scsi_internal_device_unblock() changes device's state
to SDEV_RUNNING even if a device in other states than SDEV_BLOCK, but
scsi_internal_device_block() is a pair with scsi_internal_device_unblock()
and only devices in SDEV_BLOCK state are needed to be changed. This patch
adds a check of the current device state so that state change and queue
activation is not executed for devices in SDEV_BLOCK state.

This patch is based on the idea that devices in SDEV_OFFLINE state need to
stay in SDEV_OFFLINE because SDEV_OFFLINE state is done by:

 - All recovery procedures, bus reset and host reset, has already failed
   for devices in SDEV_OFFLINE state.
   (There is no (or rare?) chance to be recovered.)

 - A user has changed the device's state to SDEV_OFFLINE intentionally
   by an interface such as sysfs.
   (A user needs to enable devices in SDEV_OFFLINE state before using
    them again.)

I appreciate any comments and suggestions on this patch.

Regards,
---
Takahiro Yasui
Hitachi Computer Products (America), Inc.


Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
---
 drivers/scsi/scsi_lib.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Index: linux-2.6.29/drivers/scsi/scsi_lib.c
===================================================================
--- linux-2.6.29.orig/drivers/scsi/scsi_lib.c
+++ linux-2.6.29/drivers/scsi/scsi_lib.c
@@ -2633,9 +2633,12 @@ scsi_internal_device_unblock(struct scsi
 	unsigned long flags;
 	
 	/* 
-	 * Try to transition the scsi device to SDEV_RUNNING
-	 * and goose the device queue if successful.  
+	 * Try to transition the scsi device to SDEV_RUNNING if it is
+	 * SDEV_BLOCK and goose the device queue if successful.
 	 */
+	if (sdev->sdev_state != SDEV_BLOCK)
+		return 0;
+
 	err = scsi_device_set_state(sdev, SDEV_RUNNING);
 	if (err) {
 		err = scsi_device_set_state(sdev, SDEV_CREATED);





^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC][PATCH] limit state change to SDEV_BLOCK devices in scsi_internal_device_unblock
  2009-04-27 17:09 [RFC][PATCH] limit state change to SDEV_BLOCK devices in scsi_internal_device_unblock Takahiro Yasui
@ 2009-04-27 18:03 ` Matthew Wilcox
  2009-04-27 19:43   ` Takahiro Yasui
  2009-04-27 20:08 ` James Bottomley
  1 sibling, 1 reply; 8+ messages in thread
From: Matthew Wilcox @ 2009-04-27 18:03 UTC (permalink / raw)
  To: Takahiro Yasui; +Cc: linux-scsi, mchristi, mbarrow

On Mon, Apr 27, 2009 at 01:09:57PM -0400, Takahiro Yasui wrote:
> @@ -2633,9 +2633,12 @@ scsi_internal_device_unblock(struct scsi
>  	unsigned long flags;
>  	
>  	/* 
> -	 * Try to transition the scsi device to SDEV_RUNNING
> -	 * and goose the device queue if successful.  
> +	 * Try to transition the scsi device to SDEV_RUNNING if it is
> +	 * SDEV_BLOCK and goose the device queue if successful.

I think the code looks good, but the edit to the comment dilutes its
flavour somewhat.  How about just moving the comment down below the
check, and then you don't need to edit the comment at all?

>  	 */
> +	if (sdev->sdev_state != SDEV_BLOCK)
> +		return 0;
> +
>  	err = scsi_device_set_state(sdev, SDEV_RUNNING);
>  	if (err) {
>  		err = scsi_device_set_state(sdev, SDEV_CREATED);
> 
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC][PATCH] limit state change to SDEV_BLOCK devices in scsi_internal_device_unblock
  2009-04-27 18:03 ` Matthew Wilcox
@ 2009-04-27 19:43   ` Takahiro Yasui
  0 siblings, 0 replies; 8+ messages in thread
From: Takahiro Yasui @ 2009-04-27 19:43 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-scsi, mchristi, mbarrow

Matthew Wilcox wrote:
> On Mon, Apr 27, 2009 at 01:09:57PM -0400, Takahiro Yasui wrote:
>> @@ -2633,9 +2633,12 @@ scsi_internal_device_unblock(struct scsi
>>  	unsigned long flags;
>>  	
>>  	/* 
>> -	 * Try to transition the scsi device to SDEV_RUNNING
>> -	 * and goose the device queue if successful.  
>> +	 * Try to transition the scsi device to SDEV_RUNNING if it is
>> +	 * SDEV_BLOCK and goose the device queue if successful.
> 
> I think the code looks good, but the edit to the comment dilutes its
> flavour somewhat.  How about just moving the comment down below the
> check, and then you don't need to edit the comment at all?

Thank you for the suggestion. I updated the patch according to your
comment.

Regards,
---
Takahiro Yasui
Hitachi Computer Products (America), Inc.


Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
---
 drivers/scsi/scsi_lib.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Index: linux-2.6.29/drivers/scsi/scsi_lib.c
===================================================================
--- linux-2.6.29.orig/drivers/scsi/scsi_lib.c
+++ linux-2.6.29/drivers/scsi/scsi_lib.c
@@ -2631,7 +2631,10 @@ scsi_internal_device_unblock(struct scsi
 	struct request_queue *q = sdev->request_queue; 
 	int err;
 	unsigned long flags;
-	
+
+	if (sdev->sdev_state != SDEV_BLOCK)
+		return 0;
+
 	/* 
 	 * Try to transition the scsi device to SDEV_RUNNING
 	 * and goose the device queue if successful.  

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC][PATCH] limit state change to SDEV_BLOCK devices in scsi_internal_device_unblock
  2009-04-27 17:09 [RFC][PATCH] limit state change to SDEV_BLOCK devices in scsi_internal_device_unblock Takahiro Yasui
  2009-04-27 18:03 ` Matthew Wilcox
@ 2009-04-27 20:08 ` James Bottomley
  2009-04-27 23:52   ` Takahiro Yasui
  1 sibling, 1 reply; 8+ messages in thread
From: James Bottomley @ 2009-04-27 20:08 UTC (permalink / raw)
  To: Takahiro Yasui; +Cc: linux-scsi, mchristi, mbarrow

On Mon, 2009-04-27 at 13:09 -0400, Takahiro Yasui wrote:
> Hi,
> 
> This is a patch to fix SCSI timeout error recovery issue which
> I posted about one week ago.
> 
>   [RFC] SCSI timeout error recovery issue
>   http://marc.info/?l=linux-scsi&m=124042136915970&w=2
> 
> scsi timeout on two or more devices may cause extremely long execution
> time for user applications because SDEV_OFFLINE state is changed to
> SDEV_RUNNING state during scsi error recovery procedures triggered by
> a bus reset or a host reset of LLD, and scsi timeout can happens on
> the same devices many times.
> 
> This happens because scsi_internal_device_unblock() changes device's state
> to SDEV_RUNNING even if a device in other states than SDEV_BLOCK, but
> scsi_internal_device_block() is a pair with scsi_internal_device_unblock()
> and only devices in SDEV_BLOCK state are needed to be changed. This patch
> adds a check of the current device state so that state change and queue
> activation is not executed for devices in SDEV_BLOCK state.
> 
> This patch is based on the idea that devices in SDEV_OFFLINE state need to
> stay in SDEV_OFFLINE because SDEV_OFFLINE state is done by:
> 
>  - All recovery procedures, bus reset and host reset, has already failed
>    for devices in SDEV_OFFLINE state.
>    (There is no (or rare?) chance to be recovered.)
> 
>  - A user has changed the device's state to SDEV_OFFLINE intentionally
>    by an interface such as sysfs.
>    (A user needs to enable devices in SDEV_OFFLINE state before using
>     them again.)
> 
> I appreciate any comments and suggestions on this patch.

The analysis seems good, and the idea of putting state guards to make
sure we only transition from a correct state is fine.

The actual state transitions only check that going from in state to out
is permitted by the state model ... rather than actually being correct
for what the function is trying to do.

> Regards,
> ---
> Takahiro Yasui
> Hitachi Computer Products (America), Inc.
> 
> 
> Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
> ---
>  drivers/scsi/scsi_lib.c |    7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> Index: linux-2.6.29/drivers/scsi/scsi_lib.c
> ===================================================================
> --- linux-2.6.29.orig/drivers/scsi/scsi_lib.c
> +++ linux-2.6.29/drivers/scsi/scsi_lib.c
> @@ -2633,9 +2633,12 @@ scsi_internal_device_unblock(struct scsi
>  	unsigned long flags;
>  	
>  	/* 
> -	 * Try to transition the scsi device to SDEV_RUNNING
> -	 * and goose the device queue if successful.  
> +	 * Try to transition the scsi device to SDEV_RUNNING if it is
> +	 * SDEV_BLOCK and goose the device queue if successful.
>  	 */
> +	if (sdev->sdev_state != SDEV_BLOCK)

This isn't quite correct.  There are two blocked states in the model
currently:  SDEV_BLOCK and SDEV_CREATED_BLOCK ... you'd need to check
for both of them

> +		return 0;

Traditionally the return for an attempted invalid state transition is
-EINVAL.

I suppose for lower down, if you check the state, now we know we go 

SDEV_CREATED_BLOCK -> SDEV_CREATED

or

SDEV_BLOCK -> SDEV_RUNNING

so there's no need for the dual scsi_device_set_state.

James



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC][PATCH] limit state change to SDEV_BLOCK devices in scsi_internal_device_unblock
  2009-04-27 20:08 ` James Bottomley
@ 2009-04-27 23:52   ` Takahiro Yasui
  2009-04-28  2:31     ` Matthew Wilcox
  0 siblings, 1 reply; 8+ messages in thread
From: Takahiro Yasui @ 2009-04-27 23:52 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi, mchristi, mbarrow

James Bottomley wrote:
>> +	if (sdev->sdev_state != SDEV_BLOCK)
> 
> This isn't quite correct.  There are two blocked states in the model
> currently:  SDEV_BLOCK and SDEV_CREATED_BLOCK ... you'd need to check
> for both of them
> 
>> +		return 0;
> 
> Traditionally the return for an attempted invalid state transition is
> -EINVAL.
> 
> I suppose for lower down, if you check the state, now we know we go 
> 
> SDEV_CREATED_BLOCK -> SDEV_CREATED
> 
> or
> 
> SDEV_BLOCK -> SDEV_RUNNING
> 
> so there's no need for the dual scsi_device_set_state.

Thank you for the comments. If I understand your comments correctly, 
the state transition is better to be defined in scsi_device_set_state(),
and both transitions, "SDEV_CREATED_BLOCK -> SDEV_CREATED" and
"SDEV_BLOCK -> SDEV_RUNNING" need to be covered. I updated the patch
so that the transition of "SDEV_OFFLINE -> SDEV_RUNNING" is prohibited.

A note in this change is all transitions of "SDEV_OFFLINE -> SDEV_RUNNING"
is prohibited, and the following state change for devices in SDEV_OFFLINE
state fails.

  # echo running > /sys/block/sdX/device/state

Therefore, users need to delete the device once and then those devices
need to be scanned as follows.

  # echo yes > /sys/block/sdX/device/delete
  # echo "- - -" > /sys/class/scsi_host/hostX/scan

I appreciate your reviews on it.

Regards,
---
Takahiro Yasui
Hitachi Computer Products (America), Inc.


Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
---
 drivers/scsi/scsi_lib.c |    1 -
 1 file changed, 1 deletion(-)

Index: linux-2.6.29/drivers/scsi/scsi_lib.c
===================================================================
--- linux-2.6.29.orig/drivers/scsi/scsi_lib.c
+++ linux-2.6.29/drivers/scsi/scsi_lib.c
@@ -2258,7 +2258,6 @@ scsi_device_set_state(struct scsi_device
 	case SDEV_RUNNING:
 		switch (oldstate) {
 		case SDEV_CREATED:
-		case SDEV_OFFLINE:
 		case SDEV_QUIESCE:
 		case SDEV_BLOCK:
 			break;




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC][PATCH] limit state change to SDEV_BLOCK devices in scsi_internal_device_unblock
  2009-04-27 23:52   ` Takahiro Yasui
@ 2009-04-28  2:31     ` Matthew Wilcox
  2009-04-28  2:51       ` Takahiro Yasui
  0 siblings, 1 reply; 8+ messages in thread
From: Matthew Wilcox @ 2009-04-28  2:31 UTC (permalink / raw)
  To: Takahiro Yasui; +Cc: James Bottomley, linux-scsi, mchristi, mbarrow

On Mon, Apr 27, 2009 at 07:52:57PM -0400, Takahiro Yasui wrote:
> James Bottomley wrote:
> >> +	if (sdev->sdev_state != SDEV_BLOCK)
> > 
> > This isn't quite correct.  There are two blocked states in the model
> > currently:  SDEV_BLOCK and SDEV_CREATED_BLOCK ... you'd need to check
> > for both of them
> > 
> >> +		return 0;
> > 
> > Traditionally the return for an attempted invalid state transition is
> > -EINVAL.
> > 
> > I suppose for lower down, if you check the state, now we know we go 
> > 
> > SDEV_CREATED_BLOCK -> SDEV_CREATED
> > 
> > or
> > 
> > SDEV_BLOCK -> SDEV_RUNNING
> > 
> > so there's no need for the dual scsi_device_set_state.
> 
> Thank you for the comments. If I understand your comments correctly, 
> the state transition is better to be defined in scsi_device_set_state(),
> and both transitions, "SDEV_CREATED_BLOCK -> SDEV_CREATED" and
> "SDEV_BLOCK -> SDEV_RUNNING" need to be covered. I updated the patch
> so that the transition of "SDEV_OFFLINE -> SDEV_RUNNING" is prohibited.

I don't think that's what he meant.  I think he meant something more like:

scsi_internal_device_unblock(struct scsi_device *sdev)
{
	struct request_queue *q = sdev->request_queue;
-	int err;
	unsigned long flags;

	/*
	 * Try to transition the scsi device to SDEV_RUNNING
	 * and goose the device queue if successful.
	 */

+	if (sdev->sdev_state == SDEV_CREATED_BLOCK)
+		sdev->sdev_state = SDEV_CREATED;
+	else if (sdev->sdev_state == SDEV_BLOCK)
+		sdev->sdev_state = SDEV_RUNNING;
+	else
+		return -EINVAL;

-	err = scsi_device_set_state(sdev, SDEV_RUNNING);
-	if (err) {
-		err = scsi_device_set_state(sdev, SDEV_CREATED);
-
-		if (err)
-			return err;
-	}

	spin_lock_irqsave(q->queue_lock, flags);
	blk_start_queue(q);
	spin_unlock_irqrestore(q->queue_lock, flags);

	return 0;
}

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC][PATCH] limit state change to SDEV_BLOCK devices in scsi_internal_device_unblock
  2009-04-28  2:31     ` Matthew Wilcox
@ 2009-04-28  2:51       ` Takahiro Yasui
  2009-04-28  3:27         ` Takahiro Yasui
  0 siblings, 1 reply; 8+ messages in thread
From: Takahiro Yasui @ 2009-04-28  2:51 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: James Bottomley, linux-scsi, mchristi, mbarrow

Matthew Wilcox wrote:
> On Mon, Apr 27, 2009 at 07:52:57PM -0400, Takahiro Yasui wrote:
>> James Bottomley wrote:
>>>> +	if (sdev->sdev_state != SDEV_BLOCK)
>>> This isn't quite correct.  There are two blocked states in the model
>>> currently:  SDEV_BLOCK and SDEV_CREATED_BLOCK ... you'd need to check
>>> for both of them
>>>
>>>> +		return 0;
>>> Traditionally the return for an attempted invalid state transition is
>>> -EINVAL.
>>>
>>> I suppose for lower down, if you check the state, now we know we go 
>>>
>>> SDEV_CREATED_BLOCK -> SDEV_CREATED
>>>
>>> or
>>>
>>> SDEV_BLOCK -> SDEV_RUNNING
>>>
>>> so there's no need for the dual scsi_device_set_state.
>> Thank you for the comments. If I understand your comments correctly, 
>> the state transition is better to be defined in scsi_device_set_state(),
>> and both transitions, "SDEV_CREATED_BLOCK -> SDEV_CREATED" and
>> "SDEV_BLOCK -> SDEV_RUNNING" need to be covered. I updated the patch
>> so that the transition of "SDEV_OFFLINE -> SDEV_RUNNING" is prohibited.
> 
> I don't think that's what he meant.  I think he meant something more like:

Thanks a lot. I'll build a patch and send it.

Regards,
---
Takahiro Yasui
Hitachi Computer Products (America), Inc.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC][PATCH] limit state change to SDEV_BLOCK devices in scsi_internal_device_unblock
  2009-04-28  2:51       ` Takahiro Yasui
@ 2009-04-28  3:27         ` Takahiro Yasui
  0 siblings, 0 replies; 8+ messages in thread
From: Takahiro Yasui @ 2009-04-28  3:27 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: James Bottomley, linux-scsi, mchristi, mbarrow

Takahiro Yasui wrote:
> Matthew Wilcox wrote:
>> On Mon, Apr 27, 2009 at 07:52:57PM -0400, Takahiro Yasui wrote:
>>> James Bottomley wrote:
>>>>> +	if (sdev->sdev_state != SDEV_BLOCK)
>>>> This isn't quite correct.  There are two blocked states in the model
>>>> currently:  SDEV_BLOCK and SDEV_CREATED_BLOCK ... you'd need to check
>>>> for both of them
>>>>
>>>>> +		return 0;
>>>> Traditionally the return for an attempted invalid state transition is
>>>> -EINVAL.
>>>>
>>>> I suppose for lower down, if you check the state, now we know we go 
>>>>
>>>> SDEV_CREATED_BLOCK -> SDEV_CREATED
>>>>
>>>> or
>>>>
>>>> SDEV_BLOCK -> SDEV_RUNNING
>>>>
>>>> so there's no need for the dual scsi_device_set_state.
>>> Thank you for the comments. If I understand your comments correctly, 
>>> the state transition is better to be defined in scsi_device_set_state(),
>>> and both transitions, "SDEV_CREATED_BLOCK -> SDEV_CREATED" and
>>> "SDEV_BLOCK -> SDEV_RUNNING" need to be covered. I updated the patch
>>> so that the transition of "SDEV_OFFLINE -> SDEV_RUNNING" is prohibited.
>> I don't think that's what he meant.  I think he meant something more like:

I have one question about the order of comparisons. The original code
tries to change to SDEV_BLOCK first and retries to SDEV_CREATED.
If they are transformed just that order, I think that comparisons will
be as follows.

	if (sdev->sdev_state == SDEV_BLOCK)
		sdev->sdev_state = SDEV_RUNNING;
	else if (sdev->sdev_state == SDEV_CREATED_BLOCK)
		sdev->sdev_state = SDEV_CREATED;
	else
		return -EINVAL;

Is this wrong order?

Regards,
---
Takahiro Yasui
Hitachi Computer Products (America), Inc.


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2009-04-28  3:27 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-04-27 17:09 [RFC][PATCH] limit state change to SDEV_BLOCK devices in scsi_internal_device_unblock Takahiro Yasui
2009-04-27 18:03 ` Matthew Wilcox
2009-04-27 19:43   ` Takahiro Yasui
2009-04-27 20:08 ` James Bottomley
2009-04-27 23:52   ` Takahiro Yasui
2009-04-28  2:31     ` Matthew Wilcox
2009-04-28  2:51       ` Takahiro Yasui
2009-04-28  3:27         ` Takahiro Yasui

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).