RE: Transport affected timeouts...

public inbox for linux-scsi@vger.kernel.org

* RE: Transport affected timeouts...
@ 2004-04-22 18:54 Smart, James
  2004-04-22 19:02 ` James Bottomley
  2004-04-22 19:09 ` Brian King
  0 siblings, 2 replies; 15+ messages in thread
From: Smart, James @ 2004-04-22 18:54 UTC (permalink / raw)
  To: 'Brian King'; +Cc: 'James Bottomley', Linux SCSI Reflector

Brian,

To be honest, it's probably both.  The folks that performed the
trouble-shooting in the past blamed much of the problem on the latency, and
used link timer values to resolve it. However, since the qual was
predominantly raid arrays, I'd bet that it was heavily influenced by the
target as you indicate. (note: the resulting timeout based on r_a_tov value
is very close to just doubling the timeout). Note: I was rather surprised to
see the timeout value of sd to be 30 seconds. I know when I was in Tru64, we
had 60 seconds as a minimum.

One question though - how does the LLD really know what the timeout should
be ?  It doesn't identify a target as a raid device does it ? or what raid
level it's using ?

-- James S
 

> -----Original Message-----
> From: Brian King [mailto:brking@us.ibm.com]
> Sent: Thursday, April 22, 2004 2:15 PM
> To: Smart, James
> Cc: 'James Bottomley'; Linux SCSI Reflector
> Subject: Re: Transport affected timeouts...
> 
> 
> We are really trying to solve two different problems here. 
> The problem I am
> trying to solve with the patch I submitted is that the 
> existing r/w timeouts
> are too short for RAID array devices. Since a single read or 
> write may end
> up resulting in multiple ops in RAID 5 arrays this timeout 
> becomes far too
> short. In this scenario the LLD has the best knowledge as to 
> what this timeout
> value needs to be. A delta value here really does not make sense.
> 
> The problem you are trying to solve is more related to the 
> latencies you may
> experience due to the transport. I'm not sure my patch is the 
> best way to fix
> your problem. Updating the st driver to use this rw_timeout 
> does not sound like
> a good solution as the LLD really has no idea what the total 
> timeout for a
> read or write should be for a tape.
> 
> Option 2 works best for the ipr driver, option 3 works best 
> for you. Since we
> are really solving two different problems, how about using 
> both options?
> 
> -Brian
> 
> 
> > Potential options:
> > 1) Change the base driver timeout.  (base drivers defined 
> to be sd, st, etc)
> > 
> > I dislike this mainly because it fails (a) and (c). Also 
> concerned about
> > abilities to tune all base drivers.
> > 
> > 2) Allow the scsi-host to provide a timeout value that can 
> override the base
> > driver.
> > The IBM proposed patch does this. I dislike the patch as : 
> scsi host has no
> > input as to what the base driver timeout is; there are 
> multiple base driver
> > timeouts (sd, st, etc); thus a priori knowledge is required 
> to determine a
> > maximum. Also, application of timeout change is inconsistent.
> > 
> > 3) Allow the scsi-host to provide a transport-specific 
> increment that can be
> > added to the base driver timeout.
> > Just a refinement of (2) to hopefully remove my dislikes. 
> Still has faults
> > as the exact relationship of topology/configuration/devices 
> to needed
> > timeout is not exact/determinable.
> 
> 
> -- 
> Brian King
> eServer Storage I/O
> IBM Linux Technology Center
> 


* RE: Transport affected timeouts...
  2004-04-22 18:54 Transport affected timeouts Smart, James
@ 2004-04-22 19:02 ` James Bottomley
  2004-04-22 19:09 ` Brian King
  1 sibling, 0 replies; 15+ messages in thread
From: James Bottomley @ 2004-04-22 19:02 UTC (permalink / raw)
  To: Smart, James; +Cc: 'Brian King', Linux SCSI Reflector

On Thu, 2004-04-22 at 14:54, Smart, James wrote:
> To be honest, it's probably both.  The folks that performed the
> trouble-shooting in the past blamed much of the problem on the latency, and
> used link timer values to resolve it. However, since the qual was
> predominantly raid arrays, I'd bet that it was heavily influenced by the
> target as you indicate. (note: the resulting timeout based on r_a_tov value
> is very close to just doubling the timeout). Note: I was rather surprised to
> see the timeout value of sd to be 30 seconds. I know when I was in Tru64, we
> had 60 seconds as a minimum.
> 
> One question though - how does the LLD really know what the timeout should
> be ?  It doesn't identify a target as a raid device does it ? or what raid
> level it's using ?

You don't, really.  If the default value were larger (say 60s) would we
even be having this discussion?

I know the way Solaris does this is to have a global variable that
allows you to raise the timeout.  If we simply exposed Brian's proposed
parameter in sysfs, so you could change it from user space, would that
be sufficient?

I'd really like to keep the default as small as possible ... too many
people have eccentric setups which lose commands.  The longer the
timeout is, the longer we take to notice and correct the situation.
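
From user space, changing a sysfs-exposed timeout would amount to writing a decimal number to an attribute file. The following is a minimal sketch under stated assumptions: the thread does not name the attribute or its units, so the helper `write_timeout_attr`, the path passed to it, and the "seconds as plain text" format are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write a timeout, in seconds, as decimal text to a
 * sysfs-style attribute file. Returns 0 on success, -1 on failure.
 */
int write_timeout_attr(const char *attr_path, unsigned int secs)
{
	FILE *f;

	assert(attr_path != NULL);

	f = fopen(attr_path, "w");
	if (f == NULL)
		return -1;

	if (fprintf(f, "%u\n", secs) < 0) {
		fclose(f);
		return -1;
	}

	/* fclose() flushes the buffer, so a failed write can surface here. */
	return fclose(f) == 0 ? 0 : -1;
}
```

An administrator with an unusually slow setup could then raise the value per device while the built-in default stays small, which is the trade-off described above.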

James


* Re: Transport affected timeouts...
  2004-04-22 18:54 Transport affected timeouts Smart, James
  2004-04-22 19:02 ` James Bottomley
@ 2004-04-22 19:09 ` Brian King
  1 sibling, 0 replies; 15+ messages in thread
From: Brian King @ 2004-04-22 19:09 UTC (permalink / raw)
  To: Smart, James; +Cc: 'James Bottomley', Linux SCSI Reflector

Smart, James wrote:
> One question though - how does the LLD really know what the timeout should
> be ?  It doesn't identify a target as a raid device does it ? or what raid
> level it's using ?

Yes, the slave_configure routine knows that the device is a RAID array and
sets the rw_timeout value appropriately.
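
The idea can be illustrated with a user-space mock. This is only a sketch, not the ipr driver's code: `struct mock_scsi_device`, the `is_raid_array` flag, the 120-second figure, and the local `HZ` definition are assumptions made for illustration; only the `rw_timeout` field name and the slave_configure hook come from this thread.

```c
#include <assert.h>
#include <stddef.h>

#define HZ 100	/* illustrative tick rate for this user-space mock */

/* Mock stand-in for the kernel's struct scsi_device; not the real layout. */
struct mock_scsi_device {
	int is_raid_array;		/* hypothetical: what the LLD learned about the device */
	unsigned int rw_timeout;	/* read/write timeout in ticks; 0 = left at default (mock convention) */
};

/* Hypothetical value an LLD might pick for RAID arrays. */
#define MOCK_RAID_RW_TIMEOUT	(120 * HZ)

/* Shaped like a slave_configure hook: returns 0 on success. */
int mock_slave_configure(struct mock_scsi_device *sdev)
{
	assert(sdev != NULL);

	/* Only devices the LLD recognizes as RAID arrays get the longer value. */
	if (sdev->is_raid_array)
		sdev->rw_timeout = MOCK_RAID_RW_TIMEOUT;

	return 0;
}
```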


-- 
Brian King
eServer Storage I/O
IBM Linux Technology Center



* RE: Transport affected timeouts...
@ 2004-04-22 21:36 Smart, James
  2004-04-22 21:45 ` Brian King
  0 siblings, 1 reply; 15+ messages in thread
From: Smart, James @ 2004-04-22 21:36 UTC (permalink / raw)
  To: 'James Bottomley'; +Cc: 'Brian King', Linux SCSI Reflector



> I know the way Solaris does this is to have a global variable that
> allows you to raise the timeout.  If we simply exposed 
> Brian's proposed
> parameter in sysfs, so you could change it from user space, would that
> be sufficient?


Yes.

-- James S


* Re: Transport affected timeouts...
  2004-04-22 21:36 Smart, James
@ 2004-04-22 21:45 ` Brian King
  2004-05-03 15:49   ` Brian King
  0 siblings, 1 reply; 15+ messages in thread
From: Brian King @ 2004-04-22 21:45 UTC (permalink / raw)
  To: Smart, James; +Cc: 'James Bottomley', Linux SCSI Reflector

Smart, James wrote:
> 
>>I know the way Solaris does this is to have a global variable that
>>allows you to raise the timeout.  If we simply exposed 
>>Brian's proposed
>>parameter in sysfs, so you could change it from user space, would that
>>be sufficient?
> 
> 
> 
> Yes.

I'll respin the patch.


-- 
Brian King
eServer Storage I/O
IBM Linux Technology Center



* Re: Transport affected timeouts...
  2004-04-22 21:45 ` Brian King
@ 2004-05-03 15:49   ` Brian King
  0 siblings, 0 replies; 15+ messages in thread
From: Brian King @ 2004-05-03 15:49 UTC (permalink / raw)
  To: Brian King
  Cc: Smart, James, 'James Bottomley', Linux SCSI Reflector,
	Kai Makisara

[-- Attachment #1: Type: text/plain, Size: 201 bytes --]

The following patch makes st use the timeout field in the scsi_device struct.
It requires the scsi_timeout_mod patch I just submitted.



-- 
Brian King
eServer Storage I/O
IBM Linux Technology Center

[-- Attachment #2: st_timeout_mod.patch --]
[-- Type: text/plain, Size: 7146 bytes --]


This patch changes st to use the timeout field in the scsi_device struct
for the short timeout. This patch depends on scsi_timeout_mod patch.


---


diff -puN drivers/scsi/st.h~st_timeout_mod drivers/scsi/st.h
--- linux-2.6.6-rc3/drivers/scsi/st.h~st_timeout_mod	Fri Apr 30 09:56:04 2004
+++ linux-2.6.6-rc3-bjking1/drivers/scsi/st.h	Fri Apr 30 09:56:15 2004
@@ -100,7 +100,6 @@ typedef struct {
 	unsigned char c_algo;			/* compression algorithm */
 	unsigned char pos_unknown;			/* after reset position unknown */
 	int tape_type;
-	int timeout;		/* timeout for normal commands */
 	int long_timeout;	/* timeout for commands known to take long time */
 
 	unsigned long max_pfn;	/* the maximum page number reachable by the HBA */
diff -puN drivers/scsi/st.c~st_timeout_mod drivers/scsi/st.c
--- linux-2.6.6-rc3/drivers/scsi/st.c~st_timeout_mod	Fri Apr 30 09:56:19 2004
+++ linux-2.6.6-rc3-bjking1/drivers/scsi/st.c	Fri Apr 30 10:15:03 2004
@@ -486,7 +486,7 @@ static int cross_eof(Scsi_Tape * STp, in
 		   tape_name(STp), forward ? "forward" : "backward"));
 
 	SRpnt = st_do_scsi(NULL, STp, cmd, 0, SCSI_DATA_NONE,
-			   STp->timeout, MAX_RETRIES, TRUE);
+			   STp->device->timeout, MAX_RETRIES, TRUE);
 	if (!SRpnt)
 		return (STp->buffer)->syscall_result;
 
@@ -544,7 +544,7 @@ static int flush_write_buffer(Scsi_Tape 
 		cmd[4] = blks;
 
 		SRpnt = st_do_scsi(NULL, STp, cmd, transfer, SCSI_DATA_WRITE,
-				   STp->timeout, MAX_WRITE_RETRIES, TRUE);
+				   STp->device->timeout, MAX_WRITE_RETRIES, TRUE);
 		if (!SRpnt)
 			return (STp->buffer)->syscall_result;
 
@@ -867,7 +867,7 @@ static int check_tape(Scsi_Tape *STp, st
 		memset((void *) &cmd[0], 0, MAX_COMMAND_SIZE);
 		cmd[0] = READ_BLOCK_LIMITS;
 
-		SRpnt = st_do_scsi(SRpnt, STp, cmd, 6, SCSI_DATA_READ, STp->timeout,
+		SRpnt = st_do_scsi(SRpnt, STp, cmd, 6, SCSI_DATA_READ, STp->device->timeout,
 				   MAX_READY_RETRIES, TRUE);
 		if (!SRpnt) {
 			retval = (STp->buffer)->syscall_result;
@@ -894,7 +894,7 @@ static int check_tape(Scsi_Tape *STp, st
 	cmd[0] = MODE_SENSE;
 	cmd[4] = 12;
 
-	SRpnt = st_do_scsi(SRpnt, STp, cmd, 12, SCSI_DATA_READ, STp->timeout,
+	SRpnt = st_do_scsi(SRpnt, STp, cmd, 12, SCSI_DATA_READ, STp->device->timeout,
 			   MAX_READY_RETRIES, TRUE);
 	if (!SRpnt) {
 		retval = (STp->buffer)->syscall_result;
@@ -1115,7 +1115,7 @@ static int st_flush(struct file *filp)
 		cmd[4] = 1 + STp->two_fm;
 
 		SRpnt = st_do_scsi(NULL, STp, cmd, 0, SCSI_DATA_NONE,
-				   STp->timeout, MAX_WRITE_RETRIES, TRUE);
+				   STp->device->timeout, MAX_WRITE_RETRIES, TRUE);
 		if (!SRpnt) {
 			result = (STp->buffer)->syscall_result;
 			goto out;
@@ -1506,7 +1506,7 @@ static ssize_t
 		cmd[4] = blks;
 
 		SRpnt = st_do_scsi(SRpnt, STp, cmd, transfer, SCSI_DATA_WRITE,
-				   STp->timeout, MAX_WRITE_RETRIES, !async_write);
+				   STp->device->timeout, MAX_WRITE_RETRIES, !async_write);
 		if (!SRpnt) {
 			retval = STbp->syscall_result;
 			goto out;
@@ -1676,7 +1676,7 @@ static long read_tape(Scsi_Tape *STp, lo
 
 	SRpnt = *aSRpnt;
 	SRpnt = st_do_scsi(SRpnt, STp, cmd, bytes, SCSI_DATA_READ,
-			   STp->timeout, MAX_RETRIES, TRUE);
+			   STp->device->timeout, MAX_RETRIES, TRUE);
 	release_buffering(STp);
 	*aSRpnt = SRpnt;
 	if (!SRpnt)
@@ -2075,7 +2075,7 @@ static int st_set_options(Scsi_Tape *STp
 			printk(KERN_INFO "%s: Long timeout set to %d seconds.\n", name,
 			       (value & ~MT_ST_SET_LONG_TIMEOUT));
 		} else {
-			STp->timeout = value * HZ;
+			STp->device->timeout = value * HZ;
 			printk(KERN_INFO "%s: Normal timeout set to %d seconds.\n",
                                name, value);
 		}
@@ -2183,7 +2183,7 @@ static int read_mode_page(Scsi_Tape *STp
 	cmd[4] = 255;
 
 	SRpnt = st_do_scsi(SRpnt, STp, cmd, cmd[4], SCSI_DATA_READ,
-			   STp->timeout, 0, TRUE);
+			   STp->device->timeout, 0, TRUE);
 	if (SRpnt == NULL)
 		return (STp->buffer)->syscall_result;
 
@@ -2214,7 +2214,7 @@ static int write_mode_page(Scsi_Tape *ST
 	(STp->buffer)->b_data[pgo + MP_OFF_PAGE_NBR] &= MP_MSK_PAGE_NBR;
 
 	SRpnt = st_do_scsi(SRpnt, STp, cmd, cmd[4], SCSI_DATA_WRITE,
-			   (slow ? STp->long_timeout : STp->timeout), 0, TRUE);
+			   (slow ? STp->long_timeout : STp->device->timeout), 0, TRUE);
 	if (SRpnt == NULL)
 		return (STp->buffer)->syscall_result;
 
@@ -2326,7 +2326,7 @@ static int do_load_unload(Scsi_Tape *STp
 	}
 	if (STp->immediate) {
 		cmd[1] = 1;	/* Don't wait for completion */
-		timeout = STp->timeout;
+		timeout = STp->device->timeout;
 	}
 	else
 		timeout = STp->long_timeout;
@@ -2506,7 +2506,7 @@ static int st_int_ioctl(Scsi_Tape *STp, 
 		cmd[2] = (arg >> 16);
 		cmd[3] = (arg >> 8);
 		cmd[4] = arg;
-		timeout = STp->timeout;
+		timeout = STp->device->timeout;
                 DEBC(
                      if (cmd_in == MTWEOF)
                                printk(ST_DEB_MSG "%s: Writing %d filemarks.\n", name,
@@ -2524,7 +2524,7 @@ static int st_int_ioctl(Scsi_Tape *STp, 
 		cmd[0] = REZERO_UNIT;
 		if (STp->immediate) {
 			cmd[1] = 1;	/* Don't wait for completion */
-			timeout = STp->timeout;
+			timeout = STp->device->timeout;
 		}
                 DEBC(printk(ST_DEB_MSG "%s: Rewinding tape.\n", name));
 		fileno = blkno = at_sm = 0;
@@ -2537,7 +2537,7 @@ static int st_int_ioctl(Scsi_Tape *STp, 
 		cmd[0] = START_STOP;
 		if (STp->immediate) {
 			cmd[1] = 1;	/* Don't wait for completion */
-			timeout = STp->timeout;
+			timeout = STp->device->timeout;
 		}
 		cmd[4] = 3;
                 DEBC(printk(ST_DEB_MSG "%s: Retensioning tape.\n", name));
@@ -2570,7 +2570,7 @@ static int st_int_ioctl(Scsi_Tape *STp, 
 		cmd[1] = (arg ? 1 : 0);	/* Long erase with non-zero argument */
 		if (STp->immediate) {
 			cmd[1] |= 2;	/* Don't wait for completion */
-			timeout = STp->timeout;
+			timeout = STp->device->timeout;
 		}
 		else
 			timeout = STp->long_timeout * 8;
@@ -2622,7 +2622,7 @@ static int st_int_ioctl(Scsi_Tape *STp, 
 		(STp->buffer)->b_data[9] = (ltmp >> 16);
 		(STp->buffer)->b_data[10] = (ltmp >> 8);
 		(STp->buffer)->b_data[11] = ltmp;
-		timeout = STp->timeout;
+		timeout = STp->device->timeout;
                 DEBC(
 			if (cmd_in == MTSETBLK || cmd_in == SET_DENS_AND_BLK)
 				printk(ST_DEB_MSG
@@ -2803,7 +2803,7 @@ static int get_location(Scsi_Tape *STp, 
 		if (!logical && !STp->scsi2_logical)
 			scmd[1] = 1;
 	}
-	SRpnt = st_do_scsi(NULL, STp, scmd, 20, SCSI_DATA_READ, STp->timeout,
+	SRpnt = st_do_scsi(NULL, STp, scmd, 20, SCSI_DATA_READ, STp->device->timeout,
 			   MAX_READY_RETRIES, TRUE);
 	if (!SRpnt)
 		return (STp->buffer)->syscall_result;
@@ -2905,7 +2905,7 @@ static int set_location(Scsi_Tape *STp, 
 	}
 	if (STp->immediate) {
 		scmd[1] |= 1;		/* Don't wait for completion */
-		timeout = STp->timeout;
+		timeout = STp->device->timeout;
 	}
 
 	SRpnt = st_do_scsi(NULL, STp, scmd, 0, SCSI_DATA_NONE,
@@ -3844,7 +3844,7 @@ static int st_probe(struct device *dev)
 	tpnt->partition = 0;
 	tpnt->new_partition = 0;
 	tpnt->nbr_partitions = 0;
-	tpnt->timeout = ST_TIMEOUT;
+	tpnt->device->timeout = ST_TIMEOUT;
 	tpnt->long_timeout = ST_LONG_TIMEOUT;
 	tpnt->try_dio = try_direct_io && !SDp->host->unchecked_isa_dma;
 

_


* RE: Transport affected timeouts...
@ 2004-04-22 16:28 Smart, James
  2004-04-22 18:14 ` Brian King
  0 siblings, 1 reply; 15+ messages in thread
From: Smart, James @ 2004-04-22 16:28 UTC (permalink / raw)
  To: 'James Bottomley'; +Cc: Linux SCSI Reflector

I noted the current behavior - as it is unacceptable, and I'd like a
solution so that we can get rid of it.

The problem we're trying to solve is : there are topologies (long links,
small bb credits, large multi-lun devices thus lots of queued i/o) where the
default timeout values of the base drivers (sd, st, etc) are too short.

The optimal solution:
a) adjusts the timeout only on the scsi devices affected
b) timeout value determined dynamically by best entity (usually the scsi
host) at appropriate times (topology changes).
c) timeout adjustment does not require administrator input, a priori
knowledge, or kernel/driver rebuilds
d) addresses all commands from any source. 

Potential options:
1) Change the base driver timeout.  (base drivers defined to be sd, st, etc)

I dislike this mainly because it fails (a) and (c). Also concerned about
abilities to tune all base drivers.

2) Allow the scsi-host to provide a timeout value that can override the base
driver.
The IBM proposed patch does this. I dislike the patch as : scsi host has no
input as to what the base driver timeout is; there are multiple base driver
timeouts (sd, st, etc); thus a priori knowledge is required to determine a
maximum. Also, application of timeout change is inconsistent.

3) Allow the scsi-host to provide a transport-specific increment that can be
added to the base driver timeout.
Just a refinement of (2) to hopefully remove my dislikes. Still has faults
as the exact relationship of topology/configuration/devices to needed
timeout is not exact/determinable.
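
Options (2) and (3) differ only in how the host-supplied value combines with the base driver's timeout. A minimal sketch of that difference follows; the struct, its field names, and the "0 means not set" convention are hypothetical and are not taken from any posted patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-host settings, in seconds; 0 means "not set". */
struct host_timeout_policy {
	unsigned int override_secs;	/* option 2: replaces the base driver value */
	unsigned int increment_secs;	/* option 3: added to the base driver value */
};

unsigned int effective_timeout_secs(unsigned int base_secs,
				    const struct host_timeout_policy *p)
{
	assert(p != NULL);

	if (p->override_secs != 0)
		return p->override_secs;	/* option 2: base value is ignored */

	return base_secs + p->increment_secs;	/* option 3: base value is kept */
}
```

With illustrative numbers, an increment of 20 turns base values of 30 and 900 into 50 and 920, while an override of 120 yields 120 for both. The second case shows why an override needs prior knowledge of every base driver's timeout to avoid shortening one of them.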

If there's not enough consensus to do (3) - then I vote for moving ahead
with the IBM patch (2), and updating the st driver as well.

-- James S

note: we are using the scsi_host_self_blocked interface to bridge reconfig
events, but that's a different topic.

> -----Original Message-----
> From: James Bottomley [mailto:James.Bottomley@SteelEye.com]
> Sent: Wednesday, April 21, 2004 3:21 PM
> To: Smart, James
> Cc: Linux SCSI Reflector
> Subject: RE: Transport affected timeouts...
> 
> 
> On Wed, 2004-04-21 at 12:53, Smart, James wrote:
> > Where do we go from here ?   
> > 
> > What we are doing in our driver is the following:
> > - Cancel the mid-layer timeout
> > - Set timeout to (cmd->timeout_per_command/HZ) + hba_offset
> > - Start timer based on new timeout value
> 
> Well, this is unacceptable.  Only the mid layer should be mucking with
> mid-layer timers.
> 
> > Where hba_offset is: (2 * R_A_TOV) + administrative 
> increment (default 0)
> > Where R_A_TOV is the fabric-reported timeout. R_A_TOV is at 
> least a round
> > trip time, plus 2 times max delivery delay time within the 
> fabric. (default
> > 10 seconds).  This value can change based on fabric 
> reconfiguration or
> plugging the adapter into a different fabric.
> 
> I'm still not clear on what you're trying to achieve.
> 
> the scsi_host_self_blocked interface was created with 
> reconfig events in
> mind...it still won't stop in-progress timers, but I've been 
> considering
> adding that feature for things like FC lip events.
> 
> James
> 
> 


* Re: Transport affected timeouts...
  2004-04-22 16:28 Smart, James
@ 2004-04-22 18:14 ` Brian King
  0 siblings, 0 replies; 15+ messages in thread
From: Brian King @ 2004-04-22 18:14 UTC (permalink / raw)
  To: Smart, James; +Cc: 'James Bottomley', Linux SCSI Reflector

We are really trying to solve two different problems here. The problem I am
trying to solve with the patch I submitted is that the existing r/w timeouts
are too short for RAID array devices. Since a single read or write may end
up resulting in multiple ops in RAID 5 arrays this timeout becomes far too
short. In this scenario the LLD has the best knowledge as to what this timeout
value needs to be. A delta value here really does not make sense.

The problem you are trying to solve is more related to the latencies you may
experience due to the transport. I'm not sure my patch is the best way to fix
your problem. Updating the st driver to use this rw_timeout does not sound like
a good solution as the LLD really has no idea what the total timeout for a
read or write should be for a tape.

Option 2 works best for the ipr driver, option 3 works best for you. Since we
are really solving two different problems, how about using both options?

-Brian

> Potential options:
> 1) Change the base driver timeout.  (base drivers defined to be sd, st, etc)
> 
> I dislike this mainly because it fails (a) and (c). Also concerned about
> abilities to tune all base drivers.
> 
> 2) Allow the scsi-host to provide a timeout value that can override the base
> driver.
> The IBM proposed patch does this. I dislike the patch as : scsi host has no
> input as to what the base driver timeout is; there are multiple base driver
> timeouts (sd, st, etc); thus a priori knowledge is required to determine a
> maximum. Also, application of timeout change is inconsistent.
> 
> 3) Allow the scsi-host to provide a transport-specific increment that can be
> added to the base driver timeout.
> Just a refinement of (2) to hopefully remove my dislikes. Still has faults
> as the exact relationship of topology/configuration/devices to needed
> timeout is not exact/determinable.

-- 
Brian King
eServer Storage I/O
IBM Linux Technology Center


* RE: Transport affected timeouts...
@ 2004-04-21 16:53 Smart, James
  2004-04-21 19:20 ` James Bottomley
  0 siblings, 1 reply; 15+ messages in thread
From: Smart, James @ 2004-04-21 16:53 UTC (permalink / raw)
  To: Smart, James, 'James Bottomley'; +Cc: Linux SCSI Reflector

James B and others...

Where do we go from here ?   

What we are doing in our driver is the following:
- Cancel the mid-layer timeout
- Set timeout to (cmd->timeout_per_command/HZ) + hba_offset
- Start timer based on new timeout value

Where hba_offset is: (2 * R_A_TOV) + administrative increment (default 0)
Where R_A_TOV is the fabric-reported timeout. R_A_TOV is at least a round
trip time, plus 2 times max delivery delay time within the fabric. (default
10 seconds).  This value can change based on fabric reconfiguration or
plugging the adapter into a different fabric.
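
The arithmetic above can be written out in plain C. This is a user-space illustration rather than the driver's code: the function names and parameters are hypothetical, and the timeout is assumed to be held in timer ticks, as the division by HZ in the description implies.

```c
#include <assert.h>

/* hba_offset = (2 * R_A_TOV) + administrative increment, all in seconds. */
unsigned int hba_offset_secs(unsigned int r_a_tov_secs,
			     unsigned int admin_increment_secs)
{
	return 2 * r_a_tov_secs + admin_increment_secs;
}

/*
 * New timeout in seconds: the mid-layer value (in ticks) converted to
 * seconds, plus the host offset. hz is the number of ticks per second.
 */
unsigned int adjusted_timeout_secs(unsigned int timeout_per_command,
				   unsigned int hz,
				   unsigned int r_a_tov_secs,
				   unsigned int admin_increment_secs)
{
	assert(hz != 0);

	return timeout_per_command / hz +
	       hba_offset_secs(r_a_tov_secs, admin_increment_secs);
}
```

For example, a 30-second mid-layer timeout with the default R_A_TOV of 10 seconds and no administrative increment gives 30 + 20 = 50 seconds.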

-- James S
  

> -----Original Message-----
> From: Smart, James 
> Sent: Friday, April 16, 2004 4:13 PM
> To: 'James Bottomley'
> Cc: Linux SCSI Reflector
> Subject: RE: Transport affected timeouts...
> 
> 
> 
> > -----Original Message-----
> > From: James Bottomley [mailto:James.Bottomley@SteelEye.com]
> > Sent: Friday, April 16, 2004 3:47 PM
> > To: Smart, James
> > Cc: Linux SCSI Reflector
> > Subject: RE: Transport affected timeouts...
> > 
> > 
> > On Fri, 2004-04-16 at 14:39, Smart, James wrote:
> > > I had looked at the patch.  I don't think it works as well as the
> > > incrementer. But it would be a start. We would need the st 
> > driver, and scsi
> > > generic to use it as well. It doesn't address the timeout 
> > changing post
> > > slave_configure. The other thing that bothers me is that it 
> > uses an explicit
> > > value. As per the thread, it would have been better to 
> know what the
> > > original default was and just increment/double it - and 
> > there is the issue
> > > of different device types needing different defaults.
> > 
> > I don't think it's a good idea to alter *every* timeout, merely the
> > usual ones (hence, really only read and write in the patch).
> 
> In general, we're reflecting larger/longer topologies that 
> just induce more
> latency in performing an i/o and which can have an aggregate 
> effect overall
> to i/o queued in/for the target. Didn't matter whether it was 
> a usual i/o
> or not.
> 
> > 
> > Why do you need it to be variable post slave_configure?
> 
> Hmm... the gist of the argument is that the adapter could be 
> replugged to
> another fabric that has larger or lower timeouts needed. But - I would
> assume a midlayer rescan would have to occur on such a 
> change. New devices
> are covered as slave_configure would be called for them. But, is
> slave_configure called again on existing devices that change 
> "personality"
> as they are now a different physical device?
> 
> -- james
>  
> 


* RE: Transport affected timeouts...
  2004-04-21 16:53 Smart, James
@ 2004-04-21 19:20 ` James Bottomley
  0 siblings, 0 replies; 15+ messages in thread
From: James Bottomley @ 2004-04-21 19:20 UTC (permalink / raw)
  To: Smart, James; +Cc: Linux SCSI Reflector

On Wed, 2004-04-21 at 12:53, Smart, James wrote:
> Where do we go from here ?   
> 
> What we are doing in our driver is the following:
> - Cancel the mid-layer timeout
> - Set timeout to (cmd->timeout_per_command/HZ) + hba_offset
> - Start timer based on new timeout value

Well, this is unacceptable.  Only the mid layer should be mucking with
mid-layer timers.

> Where hba_offset is: (2 * R_A_TOV) + administrative increment (default 0)
> Where R_A_TOV is the fabric-reported timeout. R_A_TOV is at least a round
> trip time, plus 2 times max delivery delay time within the fabric. (default
> 10 seconds).  This value can change based on fabric reconfiguration or
> plugging the adapter into a different fabric.

I'm still not clear on what you're trying to achieve.

the scsi_host_self_blocked interface was created with reconfig events in
mind...it still won't stop in-progress timers, but I've been considering
adding that feature for things like FC lip events.

James




* RE: Transport affected timeouts...
@ 2004-04-16 20:13 Smart, James
  0 siblings, 0 replies; 15+ messages in thread
From: Smart, James @ 2004-04-16 20:13 UTC (permalink / raw)
  To: 'James Bottomley'; +Cc: Linux SCSI Reflector

> -----Original Message-----
> From: James Bottomley [mailto:James.Bottomley@SteelEye.com]
> Sent: Friday, April 16, 2004 3:47 PM
> To: Smart, James
> Cc: Linux SCSI Reflector
> Subject: RE: Transport affected timeouts...
> 
> 
> On Fri, 2004-04-16 at 14:39, Smart, James wrote:
> > I had looked at the patch.  I don't think it works as well as the
> > incrementer. But it would be a start. We would need the st 
> driver, and scsi
> > generic to use it as well. It doesn't address the timeout 
> changing post
> > slave_configure. The other thing that bothers me is that it 
> uses an explicit
> > value. As per the thread, it would have been better to know what the
> > original default was and just increment/double it - and 
> there is the issue
> > of different device types needing different defaults.
> 
> I don't think it's a good idea to alter *every* timeout, merely the
> usual ones (hence, really only read and write in the patch).

In general, we're reflecting larger/longer topologies that just induce more
latency in performing an i/o and which can have an aggregate effect overall
to i/o queued in/for the target. Didn't matter whether it was a usual i/o
or not.

> 
> Why do you need it to be variable post slave_configure?

Hmm... the gist of the argument is that the adapter could be replugged to
another fabric that has larger or lower timeouts needed. But - I would
assume a midlayer rescan would have to occur on such a change. New devices
are covered as slave_configure would be called for them. But, is
slave_configure called again on existing devices that change "personality"
as they are now a different physical device?

-- james


* RE: Transport affected timeouts...
@ 2004-04-16 19:39 Smart, James
  2004-04-16 19:46 ` James Bottomley
  0 siblings, 1 reply; 15+ messages in thread
From: Smart, James @ 2004-04-16 19:39 UTC (permalink / raw)
  To: 'James Bottomley'; +Cc: Linux SCSI Reflector

I had looked at the patch.  I don't think it works as well as the
incrementer. But it would be a start. We would need the st driver, and scsi
generic to use it as well. It doesn't address the timeout changing post
slave_configure. The other thing that bothers me is that it uses an explicit
value. As per the thread, it would have been better to know what the
original default was and just increment/double it - and there is the issue
of different device types needing different defaults.

-- james


> -----Original Message-----
> From: James Bottomley [mailto:James.Bottomley@SteelEye.com]
> Sent: Friday, April 16, 2004 3:24 PM
> To: Smart, James
> Cc: Linux SCSI Reflector
> Subject: Re: Transport affected timeouts...
> 
> 
> On Fri, 2004-04-16 at 10:40, Smart, James wrote:
> > One issue that we're wrestling with in our driver is 
> timeout values. In the
> > past, we've encountered large configurations where the 
> timeouts from the
> > midlayer are insufficient. In general - we'd like the scsi 
> host to be able
> > to add a transport/topology increment time to the base 
> timeout values.  The
> > methodology would have to be dynamic as it may change as 
> link connectivity
> > changes.
> > 
> > Obviously, an hba driver mucking with the timeout values 
> handed to it is
> > frowned upon. Is there a recommendation on how we should 
> handle this ?
> 
> There's no currently agreed upon framework, but would
> 
> http://www-124.ibm.com/storageio/ipr/patch-2.6.5-sd_timeout_mod
> 
> Do for what you want?
> 
> James
> 
> 


* RE: Transport affected timeouts...
  2004-04-16 19:39 Smart, James
@ 2004-04-16 19:46 ` James Bottomley
  0 siblings, 0 replies; 15+ messages in thread
From: James Bottomley @ 2004-04-16 19:46 UTC (permalink / raw)
  To: Smart, James; +Cc: Linux SCSI Reflector

On Fri, 2004-04-16 at 14:39, Smart, James wrote:
> I had looked at the patch.  I don't think it works as well as the
> incrementer. But it would be a start. We would need the st driver, and scsi
> generic to use it as well. It doesn't address the timeout changing post
> slave_configure. The other thing that bothers me is that it uses an explicit
> value. As per the thread, it would have been better to know what the
> original default was and just increment/double it - and there is the issue
> of different device types needing different defaults.

I don't think it's a good idea to alter *every* timeout, merely the
usual ones (hence, really only read and write in the patch).

Why do you need it to be variable post slave_configure?

James




* Transport affected timeouts...
@ 2004-04-16 15:40 Smart, James
  2004-04-16 19:24 ` James Bottomley
  0 siblings, 1 reply; 15+ messages in thread
From: Smart, James @ 2004-04-16 15:40 UTC (permalink / raw)
  To: Linux SCSI Reflector

All,

One issue that we're wrestling with in our driver is timeout values. In the
past, we've encountered large configurations where the timeouts from the
midlayer are insufficient. In general - we'd like the scsi host to be able
to add a transport/topology increment time to the base timeout values.  The
methodology would have to be dynamic as it may change as link connectivity
changes.

Obviously, an hba driver mucking with the timeout values handed to it is
frowned upon. Is there a recommendation on how we should handle this ?

-- james s

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Transport affected timeouts...
  2004-04-16 15:40 Smart, James
@ 2004-04-16 19:24 ` James Bottomley
  0 siblings, 0 replies; 15+ messages in thread
From: James Bottomley @ 2004-04-16 19:24 UTC (permalink / raw)
  To: Smart, James; +Cc: Linux SCSI Reflector

On Fri, 2004-04-16 at 10:40, Smart, James wrote:
> One issue that we're wrestling with in our driver is timeout values. In the
> past, we've encountered large configurations where the timeouts from the
> midlayer are insufficient. In general - we'd like the scsi host to be able
> to add a transport/topology increment time to the base timeout values.  The
> methodology would have to be dynamic as it may change as link connectivity
> changes.
> 
> Obviously, an hba driver mucking with the timeout values handed to it is
> frowned upon. Is there a recommendation on how we should handle this ?

There's no currently agreed upon framework, but would

http://www-124.ibm.com/storageio/ipr/patch-2.6.5-sd_timeout_mod

Do for what you want?

James




Thread overview: 15+ messages
2004-04-22 18:54 Transport affected timeouts Smart, James
2004-04-22 19:02 ` James Bottomley
2004-04-22 19:09 ` Brian King
  -- strict thread matches above, loose matches on Subject: below --
2004-04-22 21:36 Smart, James
2004-04-22 21:45 ` Brian King
2004-05-03 15:49   ` Brian King
2004-04-22 16:28 Smart, James
2004-04-22 18:14 ` Brian King
2004-04-21 16:53 Smart, James
2004-04-21 19:20 ` James Bottomley
2004-04-16 20:13 Smart, James
2004-04-16 19:39 Smart, James
2004-04-16 19:46 ` James Bottomley
2004-04-16 15:40 Smart, James
2004-04-16 19:24 ` James Bottomley
