* [REPOST][PATCH] update max sdev block limit
From: James Smart @ 2006-05-11 14:42 UTC
  To: linux-scsi

Updated patch to address comments from Andreas Herrman, who noted that
the initialization, with the HZ, was inconsistent with its use in the
FC transport.
--
This patch ups the maximum limit for how long an sdev is allowed to
be blocked. Originally, the value was 60 seconds. However, we are aware
of array failover and switch reboot times that can be as high
as 90 seconds. We're proposing to change the max to 120 seconds.

-- james s

Signed-off-by: James Smart <James.Smart@emulex.com>

diff -upNr a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
--- a/drivers/scsi/scsi_priv.h	2006-05-10 11:36:25.000000000 -0400
+++ b/drivers/scsi/scsi_priv.h	2006-05-11 10:37:57.000000000 -0400
@@ -127,7 +127,7 @@ extern struct bus_type scsi_bus_type;
  * classes.
  */
 
-#define SCSI_DEVICE_BLOCK_MAX_TIMEOUT	(HZ*60)
+#define SCSI_DEVICE_BLOCK_MAX_TIMEOUT	120	/* units in seconds */
 extern int scsi_internal_device_block(struct scsi_device *sdev);
 extern int scsi_internal_device_unblock(struct scsi_device *sdev);

* Re: [REPOST][PATCH] update max sdev block limit
From: Michael Reed @ 2006-05-16 15:09 UTC
  To: James.Smart; +Cc: linux-scsi

I disagree with this patch.

My personal opinion is that the previous value of 6000 (assume HZ==100),
or around 1 hour 40 minutes, was probably too long. But, 120 seconds is too
short. I would suggest a MAX value of maybe 10 minutes, or 600 seconds.
It appears that introducing an upper bound which is now more than an order
of magnitude smaller than the previous value could have some impact at
customer sites.

There are raid devices which require around 200+ seconds to crash, dump,
reboot, and return on line. (Yes, I've timed it!)

Mike

James Smart wrote:
> Updated patch to address comments from Andreas Herrman, who noted that
> the initialization, with the HZ, was inconsistent with its use in the
> FC transport.
> --
> This patch ups the maximum limit for how long an sdev is allowed to
> be blocked. Originally, the value was 60 seconds. However, we are aware
> of array failover and switch reboot times that can be as high
> as 90 seconds. We're proposing to change the max to 120 seconds.
>
> -- james s
>
> Signed-off-by: James Smart <James.Smart@emulex.com>
>
> diff -upNr a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
> --- a/drivers/scsi/scsi_priv.h	2006-05-10 11:36:25.000000000 -0400
> +++ b/drivers/scsi/scsi_priv.h	2006-05-11 10:37:57.000000000 -0400
> @@ -127,7 +127,7 @@ extern struct bus_type scsi_bus_type;
>   * classes.
>   */
>
> -#define SCSI_DEVICE_BLOCK_MAX_TIMEOUT	(HZ*60)
> +#define SCSI_DEVICE_BLOCK_MAX_TIMEOUT	120	/* units in seconds */
>  extern int scsi_internal_device_block(struct scsi_device *sdev);
>  extern int scsi_internal_device_unblock(struct scsi_device *sdev);

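The figure of 6000 follows from the units: the old macro expands to a tick count (HZ*60), while, per the patch description, the FC transport uses the limit as a number of seconds. A minimal sketch of that mismatch, assuming HZ is 100 as in the message above. The macro names are shortened for illustration, and `timeout_is_allowed` is a hypothetical helper, not a kernel function:

```c
/* Assumption for illustration: 100 timer ticks per second (HZ==100). */
#define HZ 100

/* Old definition from the patch: a tick count, not a number of seconds. */
#define OLD_BLOCK_MAX_TIMEOUT (HZ * 60)

/* New definition from the patch: plain seconds. */
#define NEW_BLOCK_MAX_TIMEOUT 120

/*
 * Hypothetical helper, not a kernel function: accept a requested block
 * timeout, given in seconds, if it does not exceed the limit.
 */
int timeout_is_allowed(int requested_secs, int limit)
{
    return requested_secs >= 0 && requested_secs <= limit;
}
```

With these definitions, a 600-second request passes a check against the old limit (which reads as 6000 when taken as seconds) and is rejected against the new one.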
* Re: [REPOST][PATCH] update max sdev block limit
From: James Smart @ 2006-05-16 16:05 UTC
  To: Michael Reed; +Cc: linux-scsi

I don't mind making it bigger, especially as this is just a max, not the
default value. I tried to keep it low, as I believe even 2 mins is a long
time from the system's perspective. 10 minutes is forever (and remember
the scan deadlock that we just worked through).

-- james

Michael Reed wrote:
> I disagree with this patch.
>
> My personal opinion is that the previous value of 6000 (assume HZ==100),
> or around 1 hour 40 minutes, was probably too long. But, 120 seconds is too
> short. I would suggest a MAX value of maybe 10 minutes, or 600 seconds.
> It appears that introducing an upper bound which is now more than an order
> of magnitude smaller than the previous value could have some impact at
> customer sites.
>
> There are raid devices which require around 200+ seconds to crash, dump,
> reboot, and return on line. (Yes, I've timed it!)
>
> Mike
>
> James Smart wrote:
>> Updated patch to address comments from Andreas Herrman, who noted that
>> the initialization, with the HZ, was inconsistent with its use in the
>> FC transport.
>> --
>> This patch ups the maximum limit for how long an sdev is allowed to
>> be blocked. Originally, the value was 60 seconds. However, we are aware
>> of array failover and switch reboot times that can be as high
>> as 90 seconds. We're proposing to change the max to 120 seconds.
>>
>> -- james s
>>
>> Signed-off-by: James Smart <James.Smart@emulex.com>
>>
>> diff -upNr a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
>> --- a/drivers/scsi/scsi_priv.h	2006-05-10 11:36:25.000000000 -0400
>> +++ b/drivers/scsi/scsi_priv.h	2006-05-11 10:37:57.000000000 -0400
>> @@ -127,7 +127,7 @@ extern struct bus_type scsi_bus_type;
>>   * classes.
>>   */
>>
>> -#define SCSI_DEVICE_BLOCK_MAX_TIMEOUT	(HZ*60)
>> +#define SCSI_DEVICE_BLOCK_MAX_TIMEOUT	120	/* units in seconds */
>>  extern int scsi_internal_device_block(struct scsi_device *sdev);
>>  extern int scsi_internal_device_unblock(struct scsi_device *sdev);

* Re: [REPOST][PATCH] update max sdev block limit
From: Patrick Mansfield @ 2006-05-16 16:34 UTC
  To: James Smart; +Cc: Michael Reed, linux-scsi

On Tue, May 16, 2006 at 12:05:19PM -0400, James Smart wrote:
> I don't mind making it bigger, especially as this is just a max, not the
> default value. I tried to keep it low, as I believe even 2 mins is a long
> time from the system's perspective. 10 minutes is forever (and remember
> the scan deadlock that we just worked through).

Yes, so add default and max settings instead of using the max as the default.

And I still don't see how the scsi timeout can (reliably) make it through
these block/unblocks. EH_RESET_TIMER doesn't freeze the scsi timeout like
you really need, just restarts it.

For example, with default sd timeout of 30, you could be one second into a
command, block for 28 seconds, unblock, and then still timeout.

-- Patrick Mansfield

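The arithmetic in Patrick's example can be checked with a small model. This is a simplified user-space sketch in whole seconds, not kernel code: it assumes a single blocked window, a command issued at second 0, and a timer that is re-armed for a full timeout only if it fires while the device is blocked. All names here (`block_window`, `expiry_with_restart`, `expiry_with_freeze`, `unblocked_seconds`) are hypothetical.

```c
/* One blocked interval for the device, in seconds since the command was issued. */
struct block_window {
    int start;  /* second at which the device becomes blocked */
    int end;    /* second at which it is unblocked (exclusive) */
};

/* Seconds within [0, until) during which the device was not blocked. */
int unblocked_seconds(struct block_window w, int until)
{
    int lo = w.start < until ? w.start : until;
    int hi = w.end < until ? w.end : until;
    return until - (hi - lo);
}

/*
 * Restart semantics: the timer is armed at second 0 and, if it fires while
 * the device is blocked, is re-armed for another full timeout. Returns the
 * second at which the timeout is actually acted on. Assumes timeout > 0.
 */
int expiry_with_restart(int timeout, struct block_window w)
{
    int expiry = timeout;
    while (expiry >= w.start && expiry < w.end)
        expiry += timeout;
    return expiry;
}

/*
 * Freeze semantics (the behaviour Patrick says is really needed): the
 * timeout clock does not advance while the device is blocked.
 */
int expiry_with_freeze(int timeout, struct block_window w)
{
    if (w.start >= timeout)
        return timeout;  /* the block begins only after the timer has expired */
    return timeout + (w.end - w.start);
}
```

For a 30-second timeout and a block from second 1 to second 29, the restart model expires at second 30 with only 2 unblocked seconds behind it, while the freeze model expires at second 58 after the full 30 unblocked seconds.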
* Re: [REPOST][PATCH] update max sdev block limit
From: James Smart @ 2006-05-16 18:14 UTC
  To: Patrick Mansfield; +Cc: Michael Reed, linux-scsi

Patrick Mansfield wrote:
> On Tue, May 16, 2006 at 12:05:19PM -0400, James Smart wrote:
>> I don't mind making it bigger, especially as this is just a max, not the
>> default value. I tried to keep it low, as I believe even 2 mins is a long
>> time from the system's perspective. 10 minutes is forever (and remember
>> the scan deadlock that we just worked through).
>
> Yes, so add default and max settings instead of using the max as the default.

Agreed - doing so.

> And I still don't see how the scsi timeout can (reliably) make it through
> these block/unblocks. EH_RESET_TIMER doesn't freeze the scsi timeout like
> you really need, just restarts it.
>
> For example, with default sd timeout of 30, you could be one second into a
> command, block for 28 seconds, unblock, and then still timeout.

True. However, the point was not necessarily to allow the command to
succeed. Note: any target disappearance for any real amount of time
(like 28s) is likely going to be a condition that required a new login
and killed the i/o anyway.

The rescheduling of the timeout was to avoid the ramifications of the
timeout fails, which it would do, as there's no target to send the abort
request to. What was happening was the abort was failing, the device reset
was failing, and it escalated up to bus resets and adapter resets - followed
by a Test Unit Ready being sent, which of course was to a non-existent
target, which failed and took the device offline. Which then required
manual interaction to restart io.

-- james

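The "default and max settings" agreed on above could look roughly like the following. This is only a sketch under stated assumptions, not the patch that was eventually posted: the 120-second ceiling is the value from the original patch, the 60-second default is a placeholder chosen for illustration, and `set_block_timeout` and `block_timeout_ticks` are hypothetical names.

```c
#include <errno.h>

/* Ceiling in seconds, taken from the patch under discussion. */
#define BLOCK_MAX_TIMEOUT     120
/* Placeholder default in seconds; the thread does not state a default. */
#define BLOCK_DEFAULT_TIMEOUT 60

/*
 * Hypothetical setter: validate a requested timeout in seconds against the
 * ceiling and store it unchanged. The stored value is left untouched when
 * the request is rejected.
 */
int set_block_timeout(int *stored_secs, int requested_secs)
{
    if (requested_secs < 0 || requested_secs > BLOCK_MAX_TIMEOUT)
        return -EINVAL;
    *stored_secs = requested_secs;
    return 0;
}

/* Convert seconds to timer ticks only at the point where a timer is armed. */
unsigned long block_timeout_ticks(int secs, unsigned long hz)
{
    return (unsigned long)secs * hz;
}
```

Keeping the stored value and the limit in seconds, and multiplying by the tick rate only when arming a timer, avoids the seconds-versus-ticks inconsistency that prompted the repost.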