linux-scsi.vger.kernel.org archive mirror
* [REPOST][PATCH] update max sdev block limit
From: James Smart @ 2006-05-11 14:42 UTC
  To: linux-scsi

Updated patch to address comments from Andreas Herrmann, who noted that
the initialization, which multiplied by HZ, was inconsistent with the
value's use in the FC transport.
--
This patch ups the maximum limit for how long an sdev is allowed to
be blocked. Originally, the value was 60 seconds. However, we are aware
of array failover and switch reboot times that can be as high
as 90 seconds. We're proposing to change the max to 120 seconds.

-- james s


Signed-off-by: James Smart <James.Smart@emulex.com>

diff -upNr a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
--- a/drivers/scsi/scsi_priv.h	2006-05-10 11:36:25.000000000 -0400
+++ b/drivers/scsi/scsi_priv.h	2006-05-11 10:37:57.000000000 -0400
@@ -127,7 +127,7 @@ extern struct bus_type scsi_bus_type;
  * classes.
  */
 
-#define SCSI_DEVICE_BLOCK_MAX_TIMEOUT	(HZ*60)
+#define SCSI_DEVICE_BLOCK_MAX_TIMEOUT	120	/* units in seconds */
 extern int scsi_internal_device_block(struct scsi_device *sdev);
 extern int scsi_internal_device_unblock(struct scsi_device *sdev);
 




* Re: [REPOST][PATCH] update max sdev block limit
From: Michael Reed @ 2006-05-16 15:09 UTC
  To: James.Smart; +Cc: linux-scsi

I disagree with this patch.

My personal opinion is that the previous value of 6000 (assume HZ==100),
or around 1 hour 40 minutes, was probably too long.  But, 120 seconds is too
short.  I would suggest a MAX value of maybe 10 minutes, or 600 seconds.
It appears that introducing an upper bound which is now more than an order
of magnitude smaller than the previous value could have some impact at
customer sites.
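
The arithmetic behind the 6000 figure can be sketched as follows. This is a
userspace illustration, not kernel code: `hz` stands in for the kernel's HZ,
the function names are made up for this example, and it assumes the limit is
compared against a timeout expressed in seconds, as the comment in the patch
suggests.

```c
#include <stdio.h>

/* Old limit: (HZ*60). The result is a jiffies count, but this sketch
 * assumes it is compared against a value given in seconds. */
unsigned int old_block_max(unsigned int hz)
{
	return hz * 60;
}

/* New limit from the patch: a plain number of seconds. */
unsigned int new_block_max(void)
{
	return 120;
}

/* Print a limit, read as seconds, in hours and minutes. */
void print_limit(const char *label, unsigned int secs)
{
	printf("%s: %u s = %u h %u min\n", label, secs,
	       secs / 3600, (secs % 3600) / 60);
}
```

Calling `print_limit("old", old_block_max(100))` reports 6000 s, which is
1 h 40 min, and 6000 / 120 gives the factor of 50 between the old and the
proposed limit.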

There are RAID devices which require 200+ seconds to crash, dump,
reboot, and come back online.  (Yes, I've timed it!)

Mike



* Re: [REPOST][PATCH] update max sdev block limit
From: James Smart @ 2006-05-16 16:05 UTC
  To: Michael Reed; +Cc: linux-scsi

I don't mind making it bigger, especially as this is just a max, not the
default value. I tried to keep it low, as I believe even 2 mins is a long
time from the system's perspective. 10 minutes is forever (and remember
the scan deadlock that we just worked through).

-- james



* Re: [REPOST][PATCH] update max sdev block limit
From: Patrick Mansfield @ 2006-05-16 16:34 UTC
  To: James Smart; +Cc: Michael Reed, linux-scsi

On Tue, May 16, 2006 at 12:05:19PM -0400, James Smart wrote:
> I don't mind making it bigger, especially as this is just a max, not the
> default value. I tried to keep it low, as I believe even 2 mins is a long
> time from the system's perspective. 10 minutes is forever (and remember
> the scan deadlock that we just worked through).

Yes, so add default and max settings instead of using the max as the default.
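
One way to keep the two settings apart is sketched below. The names and the
60-second default are hypothetical, and the 600-second cap is only the value
suggested earlier in this thread, not something that was merged.

```c
/* All values are in seconds. Names and the default are hypothetical. */
#define BLOCK_TIMEOUT_DEFAULT	60
#define BLOCK_TIMEOUT_MAX	600

struct block_cfg {
	unsigned int timeout;	/* current block timeout, in seconds */
};

/* Start from the default rather than from the maximum. */
void block_cfg_init(struct block_cfg *cfg)
{
	cfg->timeout = BLOCK_TIMEOUT_DEFAULT;
}

/* Accept a new timeout only if it does not exceed the maximum.
 * Returns 0 on success, -1 if the request is out of range. */
int block_cfg_set(struct block_cfg *cfg, unsigned int secs)
{
	if (secs > BLOCK_TIMEOUT_MAX)
		return -1;
	cfg->timeout = secs;
	return 0;
}
```

With this split, raising the maximum for slow arrays does not change what an
untuned system gets.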

And I still don't see how the scsi timeout can (reliably) make it through
these block/unblocks. EH_RESET_TIMER doesn't freeze the scsi timeout like
you really need, just restarts it. 

For example, with the default sd timeout of 30 seconds, you could be one second
into a command, block for 28 seconds, unblock, and then still time out.
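
The difference between restarting and freezing the timer can be sketched with
a simplified model. The function names are made up for this example, all times
are in seconds, and the model assumes the timer is re-armed for a full period
only when it expires while the device is still blocked.

```c
/* Simplified model, all times in seconds. The timer only restarts when
 * it expires while the device is still blocked; it is never paused. */
unsigned int remaining_with_restart(unsigned int timeout,
				    unsigned int elapsed,
				    unsigned int blocked_for)
{
	unsigned int unblock_at = elapsed + blocked_for;
	unsigned int deadline = timeout;

	if (timeout == 0)
		return 0;
	/* Each expiry during the block re-arms the timer for a full period. */
	while (deadline <= unblock_at)
		deadline += timeout;
	return deadline - unblock_at;
}

/* What a frozen timer would leave: the block does not consume any time. */
unsigned int remaining_if_frozen(unsigned int timeout, unsigned int elapsed)
{
	return elapsed < timeout ? timeout - elapsed : 0;
}
```

With a 30 s timeout, 1 s elapsed and a 28 s block, the first function leaves
1 s after the unblock, while a frozen timer would leave 29 s.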

-- Patrick Mansfield


* Re: [REPOST][PATCH] update max sdev block limit
From: James Smart @ 2006-05-16 18:14 UTC
  To: Patrick Mansfield; +Cc: Michael Reed, linux-scsi



Patrick Mansfield wrote:
> On Tue, May 16, 2006 at 12:05:19PM -0400, James Smart wrote:
>> I don't mind making it bigger, especially as this is just a max, not the
>> default value. I tried to keep it low, as I believe even 2 mins is a long
>> time from the system's perspective. 10 minutes is forever (and remember
>> the scan deadlock that we just worked through).
> 
> Yes, so add default and max settings instead of using the max as the default.

Agreed - doing so.

> And I still don't see how the scsi timeout can (reliably) make it through
> these block/unblocks. EH_RESET_TIMER doesn't freeze the scsi timeout like
> you really need, just restarts it. 
> 
> For example, with default sd timeout of 30, you could be one second into a
> command, block for 28 seconds, unblock, and then still timeout.

True.  However, the point was not necessarily to allow the command to
succeed. Note: any target disappearance for any real amount of time (like 28s)
is likely going to be a condition that requires a new login and kills the
i/o anyway.

The timeout was rescheduled to avoid the ramifications of letting it fire
while the target is gone: the abort would fail, as there's no target to send
the abort request to. What was happening was that the abort failed, the device
reset failed, and error handling escalated up to bus resets and adapter resets,
followed by a Test Unit Ready, which of course went to a non-existent target,
failed, and took the device offline. That then required manual intervention to
restart i/o.
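
The escalation described above can be sketched as a toy model. The names are
hypothetical and this is not the SCSI error handler's code: it only assumes
that steps needing the target fail when the target is gone, and that bus and
adapter resets complete on the host side.

```c
#include <stdbool.h>

enum dev_state { DEV_RUNNING, DEV_OFFLINE };

/* Toy model of the escalation. Steps that need the target fail when it
 * is gone; bus and adapter resets are assumed to complete on the host
 * side. *steps counts how many recovery steps were tried. */
enum dev_state recover(bool target_present, int *steps)
{
	*steps = 0;

	(*steps)++;			/* abort the command */
	if (target_present)
		return DEV_RUNNING;
	(*steps)++;			/* device reset: no target, fails */
	(*steps)++;			/* bus reset */
	(*steps)++;			/* adapter reset */
	(*steps)++;			/* Test Unit Ready: no target, fails */
	return DEV_OFFLINE;
}

/* While the device is blocked the timer is re-armed instead, so the
 * ladder above is not entered at all. */
enum dev_state on_timeout(bool blocked, bool target_present, int *steps)
{
	if (blocked) {
		*steps = 0;
		return DEV_RUNNING;
	}
	return recover(target_present, steps);
}
```

In this model a timeout on an unblocked device with no target runs all five
steps and ends offline, while the same timeout on a blocked device leaves the
device running.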

-- james

