[PATCH] Flexible timeout infrastructure take II

public inbox for linux-scsi@vger.kernel.org

* [PATCH] Flexible timeout infrastructure take II
@ 2004-06-16 21:37 James Bottomley
  2004-06-16 22:15 ` Luben Tuikov
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: James Bottomley @ 2004-06-16 21:37 UTC
  To: SCSI Mailing List

[This is basically the same patch posted on the flexible timeout
infrastructure thread, but with all the comments/doc stuff done as well]

The object of this infrastructure is to give HBAs early warning that
error handling is about to happen and also provide them with the
opportunity to do something about it.

It introduces the extra template callback:

eh_timed_out()

which scsi_times_out() calls, if it is populated, to notify the LLD
that an outstanding command has timed out.

There are three possible returns:

EH_HANDLED:	I've fixed the problem, please complete the command for me
(as soon as the timer fires, scsi_done will do nothing, so the timer
itself will call a special version of scsi_done that doesn't check the
timer).

EH_NOT_HANDLED:	Invoke error recovery as normal

EH_RESET_TIMER:	The command will complete, reset the timer to its
original value and start it ticking again.
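
As a rough illustration of the three returns, here is a small user-space
model of the dispatch that the scsi_times_out() hunk below performs. It is
only a sketch: struct cmd_model, enum timeout_action, model_times_out() and
the two sample callbacks are invented for this example and are not kernel
interfaces; only the enum values and the retries test are taken from the
patch.

```c
#include <assert.h>
#include <stddef.h>

/* Return codes, in the same order as the enum added by the patch. */
enum scsi_eh_timer_return {
	EH_NOT_HANDLED,
	EH_HANDLED,
	EH_RESET_TIMER,
};

/* Hypothetical: what the model decides to do after a timeout. */
enum timeout_action {
	ACT_COMPLETE,    /* complete the command without touching the timer */
	ACT_REARM_TIMER, /* restart the timer at its original value */
	ACT_START_EH,    /* hand the command to normal error recovery */
};

/* Hypothetical stand-in for struct scsi_cmnd: only what the dispatch reads. */
struct cmd_model {
	int retries;
	int allowed;
};

typedef enum scsi_eh_timer_return (*timed_out_fn)(struct cmd_model *);

/* Sample callbacks, used only to exercise the model. */
enum scsi_eh_timer_return always_handled(struct cmd_model *cmd)
{
	(void)cmd;
	return EH_HANDLED;
}

enum scsi_eh_timer_return always_reset(struct cmd_model *cmd)
{
	(void)cmd;
	return EH_RESET_TIMER;
}

enum timeout_action model_times_out(struct cmd_model *cmd, timed_out_fn cb)
{
	assert(cmd != NULL);

	if (cb) {
		switch (cb(cmd)) {
		case EH_HANDLED:
			return ACT_COMPLETE;
		case EH_RESET_TIMER:
			/* Same test as the patch: one reset is permitted
			 * even when allowed == 0. */
			if (cmd->retries++ > cmd->allowed)
				break;
			return ACT_REARM_TIMER;
		case EH_NOT_HANDLED:
			break;
		}
	}
	/* No callback, EH_NOT_HANDLED, or resets exhausted. */
	return ACT_START_EH;
}
```

With allowed == 0, a callback that always returns EH_RESET_TIMER gets the
timer re-armed once in this model; the next expiry falls through to error
recovery.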

James

===== Documentation/scsi/scsi_mid_low_api.txt 1.16 vs edited =====
--- 1.16/Documentation/scsi/scsi_mid_low_api.txt	2004-02-01 04:45:23 -06:00
+++ edited/Documentation/scsi/scsi_mid_low_api.txt	2004-06-16 14:53:28 -05:00
@@ -827,6 +827,7 @@
 Summary:
    bios_param - fetch head, sector, cylinder info for a disk
    detect - detects HBAs this driver wants to control
+   eh_timed_out - notify the host that a command timer expired
    eh_abort_handler - abort given command
    eh_bus_reset_handler - issue SCSI bus reset
    eh_device_reset_handler - issue SCSI device reset
@@ -892,6 +893,32 @@
  *                       not invoked in "hotplug initialization mode")
  **/
     int detect(struct scsi_host_template * shtp)
+
+
+/**
+ *      eh_timed_out - The timer for the command has just fired
+ *      @scp: identifies command timing out
+ *
+ *      Returns:
+ *
+ *	EH_HANDLED:		I fixed the error, please complete the command
+ *	EH_RESET_TIMER:		I need more time, reset the timer and
+ *				begin counting again
+ *	EH_NOT_HANDLED:		Begin normal error recovery
+
+ *
+ *      Locks: None held
+ *
+ *      Calling context: interrupt
+ *
+ *	Notes: This is to give the LLD an opportunity to do local recovery.
+ *	This recovery is limited to determining if the outstanding command
+ *	will ever complete.  You may not abort and restart the command from
+ *	this callback.
+ *
+ *      Optionally defined in: LLD
+ **/
+     enum scsi_eh_timer_return eh_timed_out(struct scsi_cmnd * scp)
 

 /**
===== drivers/scsi/scsi.c 1.143 vs edited =====
--- 1.143/drivers/scsi/scsi.c	2004-04-28 11:32:09 -05:00
+++ edited/drivers/scsi/scsi.c	2004-06-16 10:47:05 -05:00
@@ -689,8 +689,6 @@
  */
 void scsi_done(struct scsi_cmnd *cmd)
 {
-	unsigned long flags;
-
 	/*
 	 * We don't have to worry about this one timing out any more.
 	 * If we are unable to remove the timer, then the command
@@ -701,6 +699,14 @@
 	 */
 	if (!scsi_delete_timer(cmd))
 		return;
+	__scsi_done(cmd);
+}
+
+/* Private entry to scsi_done() to complete a command when the timer
+ * isn't running --- used by scsi_times_out */
+void __scsi_done(struct scsi_cmnd *cmd)
+{
+	unsigned long flags;
 
 	/*
 	 * Set the serial numbers back to zero
===== drivers/scsi/scsi_error.c 1.77 vs edited =====
--- 1.77/drivers/scsi/scsi_error.c	2004-06-06 06:19:15 -05:00
+++ edited/drivers/scsi/scsi_error.c	2004-06-16 10:53:02 -05:00
@@ -162,6 +162,24 @@
 void scsi_times_out(struct scsi_cmnd *scmd)
 {
 	scsi_log_completion(scmd, TIMEOUT_ERROR);
+
+	if (scmd->device->host->hostt->eh_timed_out)
+		switch (scmd->device->host->hostt->eh_timed_out(scmd)) {
+		case EH_HANDLED:
+			__scsi_done(scmd);
+			return;
+		case EH_RESET_TIMER:
+			/* This allows a single retry even of a command
+			 * with allowed == 0 */
+			if (scmd->retries++ > scmd->allowed)
+				break;
+			scsi_add_timer(scmd, scmd->timeout_per_command,
+				       scsi_times_out);
+			return;
+		case EH_NOT_HANDLED:
+			break;
+		}
+
 	if (unlikely(!scsi_eh_scmd_add(scmd, SCSI_EH_CANCEL_CMD))) {
 		panic("Error handler thread not present at %p %p %s %d",
 		      scmd, scmd->device->host, __FILE__, __LINE__);
===== drivers/scsi/scsi_priv.h 1.32 vs edited =====
--- 1.32/drivers/scsi/scsi_priv.h	2004-03-10 22:20:08 -06:00
+++ edited/drivers/scsi/scsi_priv.h	2004-06-16 10:45:44 -05:00
@@ -82,6 +82,7 @@
 extern void scsi_init_cmd_from_req(struct scsi_cmnd *cmd,
 		struct scsi_request *sreq);
 extern void __scsi_release_request(struct scsi_request *sreq);
+extern void __scsi_done(struct scsi_cmnd *cmd);
 #ifdef CONFIG_SCSI_LOGGING
 void scsi_log_send(struct scsi_cmnd *cmd);
 void scsi_log_completion(struct scsi_cmnd *cmd, int disposition);
===== include/scsi/scsi_host.h 1.17 vs edited =====
--- 1.17/include/scsi/scsi_host.h	2004-06-04 11:51:31 -05:00
+++ edited/include/scsi/scsi_host.h	2004-06-16 14:36:04 -05:00
@@ -30,6 +30,12 @@
 #define DISABLE_CLUSTERING 0
 #define ENABLE_CLUSTERING 1
 
+enum scsi_eh_timer_return {
+	EH_NOT_HANDLED,
+	EH_HANDLED,
+	EH_RESET_TIMER,
+};
+
 
 struct scsi_host_template {
 	struct module *module;
@@ -124,6 +130,20 @@
 	int (* eh_device_reset_handler)(struct scsi_cmnd *);
 	int (* eh_bus_reset_handler)(struct scsi_cmnd *);
 	int (* eh_host_reset_handler)(struct scsi_cmnd *);
+
+	/*
+	 * This is an optional routine to notify the host that the scsi
+	 * timer just fired.  The returns tell the timer routine what to
+	 * do about this:
+	 *
+	 * EH_HANDLED:		I fixed the error, please complete the command
+	 * EH_RESET_TIMER:	I need more time, reset the timer and
+	 *			begin counting again
+	 * EH_NOT_HANDLED:	Begin normal error recovery
+	 *
+	 * Status: OPTIONAL
+	 */
+	enum scsi_eh_timer_return (* eh_timed_out)(struct scsi_cmnd *);
 
 	/*
 	 * Old EH handlers, no longer used. Make them warn the user of old



* Re: [PATCH] Flexible timeout infrastructure take II
From: Luben Tuikov @ 2004-06-16 22:15 UTC
  To: James Bottomley; +Cc: SCSI Mailing List

James Bottomley wrote:
> [This is basically the same patch posted on the flexible timeout
> infrastructure thread, but with all the comments/doc stuff done as well]
> 
> The object of this infrastructure is to give HBAs early warning that
> error handling is about to happen and also provide them with the
> opportunity to do something about it.
> 
> It introduces the extra template callback:
> 
> eh_timed_out()
> 
> which scsi_times_out() will call if it is populated to notify the LLD
> that an outstanding command took a timeout.
> 
> There are three possible returns:
> 
> EH_HANDLED:     I've fixed the problem, please complete the command for me
> (as soon as the timer fires, scsi_done will do nothing, so the timer
> itself will call a special version of scsi_done that doesn't check the
> timer).

Maybe this:

EH_HANDLED: The command has completed. The driver has filled
in the status and service response values in the scsi command
structure.  The command is ready to be given ownership back
to SCSI Core. The driver has just NOT called scsi_done(). SCSI
Core will do that for the driver.

The text in parentheses is confusing, since it implies that a
timer is running at that point when none is: we are here because
it has already fired.  An LLDD need not know the internal
workings of SCSI Core (the scsi_done() vs. __scsi_done() mess).
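
To make the suggested wording concrete, here is a minimal user-space sketch
of a callback that follows it: the driver fills in the status and returns
EH_HANDLED without calling the completion routine itself. Everything except
the enum values is hypothetical (struct cmd_model, its fields and
example_eh_timed_out() are invented for illustration and are not taken from
any real driver).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum scsi_eh_timer_return { EH_NOT_HANDLED, EH_HANDLED, EH_RESET_TIMER };

/* Hypothetical, simplified command state. */
struct cmd_model {
	int result;        /* status handed back to the mid-layer */
	bool hw_completed; /* hypothetical: hardware finished the command */
	int hw_status;     /* hypothetical: status read back from hardware */
	int done_calls;    /* how often the driver called its completion path */
};

/* Hypothetical LLDD callback. */
enum scsi_eh_timer_return example_eh_timed_out(struct cmd_model *cmd)
{
	assert(cmd != NULL);

	if (!cmd->hw_completed)
		return EH_NOT_HANDLED; /* nothing to report, let EH run */

	/* Fill in the status; ownership then goes back to the mid-layer. */
	cmd->result = cmd->hw_status;

	/* Deliberately no completion call here (done_calls stays as it
	 * was): under the proposed wording the mid-layer completes the
	 * command on the driver's behalf. */
	return EH_HANDLED;
}
```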
 
> EH_NOT_HANDLED: Invoke error recovery as normal
> 
> EH_RESET_TIMER: The command will complete, reset the timer to its
> original value and start it ticking again.
> 

-- 
Luben




* Re: [PATCH] Flexible timeout infrastructure take II
From: Mike Anderson @ 2004-06-17 21:52 UTC
  To: James Bottomley; +Cc: SCSI Mailing List


Just an FYI: I applied the patch to scsi-misc-2.6. I patched scsi_debug
to add the new eh_timed_out interface and extended the scsi_debug opts
module parameter to control the return value, so that I could test the
three cases.
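
The scsi_debug change itself is not included in this message. As a purely
illustrative sketch of the idea, a test callback could pick its return value
from option bits along these lines. The bit values and names below
(OPT_TIMEOUT_HANDLED, OPT_TIMEOUT_RESET_TIMER, pick_timeout_return()) are
assumptions made up for this example, not the ones used in the test.

```c
#include <assert.h>

enum scsi_eh_timer_return { EH_NOT_HANDLED, EH_HANDLED, EH_RESET_TIMER };

/* Hypothetical option bits; the real values are not shown in the thread. */
#define OPT_TIMEOUT_HANDLED     0x100u
#define OPT_TIMEOUT_RESET_TIMER 0x200u

/* Map the option word to the value the test callback should return. */
enum scsi_eh_timer_return pick_timeout_return(unsigned int opts)
{
	/* The two selector bits must be distinct for the mapping to work. */
	assert((OPT_TIMEOUT_HANDLED & OPT_TIMEOUT_RESET_TIMER) == 0);

	if (opts & OPT_TIMEOUT_HANDLED)
		return EH_HANDLED;
	if (opts & OPT_TIMEOUT_RESET_TIMER)
		return EH_RESET_TIMER;
	return EH_NOT_HANDLED; /* default: normal error recovery */
}
```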

Below is the output from this artificial test, which runs a single dd
for each return value. I have also included the /var/log/messages output
with scsi_error logging enabled.

-andmike
--
Michael Anderson
andmike@us.ibm.com

1.) Output of the test run.
# ./test_eh_timed_out /dev/sdc
andmike: Starting eh_timed_out tests
andmike: Starting eh_timed_out EH_HANDLED test
dev.scsi.logging_level = 0x7
1+0 records in
1+0 records out
512 bytes transferred in 30.017646 seconds (17 bytes/sec)
dev.scsi.logging_level = 0

andmike: Starting eh_timed_out EH_NOT_HANDLED test
dev.scsi.logging_level = 0x7
1+0 records in
1+0 records out
512 bytes transferred in 30.241254 seconds (17 bytes/sec)
dev.scsi.logging_level = 0

andmike: Starting eh_timed_out EH_RESET_TIMER test
dev.scsi.logging_level = 0x7
1+0 records in
1+0 records out
512 bytes transferred in 60.017334 seconds (9 bytes/sec)
dev.scsi.logging_level = 0
andmike: Ending eh_timed_out tests


2.) /var/log/messages output
Jun 17 19:22:05 elm andmike: Starting eh_timed_out tests

#
# EH_HANDLED
#
Jun 17 19:22:05 elm andmike: Starting eh_timed_out EH_HANDLED test
Jun 17 19:22:05 elm kernel: scsi_block_when_processing_errors: rtn: 1
Jun 17 19:22:05 elm kernel: scsi_add_timer: scmd: c9ccce98, time: 30000, (c028cd50)
Jun 17 19:22:05 elm kernel: scsi_debug: scmd: c9ccce98 28 00 00 00 00 00 00 00 80 00 
Jun 17 19:22:35 elm kernel: scsi_debug_timed_out: scmd: c9ccce98


#
# EH_NOT_HANDLED
#
Jun 17 19:22:35 elm andmike: Starting eh_timed_out EH_NOT_HANDLED test
Jun 17 19:22:35 elm kernel: scsi_block_when_processing_errors: rtn: 1
Jun 17 19:22:35 elm kernel: scsi_add_timer: scmd: c9ccce98, time: 30000, (c028cd50)
Jun 17 19:22:35 elm kernel: scsi_debug: scmd: c9ccce98 28 00 00 00 00 00 00 00 80 00
Jun 17 19:23:05 elm kernel: scsi_debug_timed_out: scmd: c9ccce98
Jun 17 19:23:05 elm kernel: scsi_debug_timed_out: scmd: c9ccce98 NOT_HNDLD
Jun 17 19:23:05 elm kernel: Waking error handler thread
Jun 17 19:23:05 elm kernel: Error handler scsi_eh_2 waking up
Jun 17 19:23:05 elm kernel: scsi_eh_prt_fail_stats: 2:0:0:0 cmds failed: 0, cancel: 1
Jun 17 19:23:05 elm kernel: Total of 1 commands on 1 devices require eh work
Jun 17 19:23:05 elm kernel: scsi_eh_2: aborting cmd:0xc9ccce98
Jun 17 19:23:05 elm kernel: scsi_debug: abort
Jun 17 19:23:05 elm kernel: scsi_add_timer: scmd: c9ccce98, time: 10000, (c028cfe0)
Jun 17 19:23:05 elm kernel: scsi_debug: scmd: c9ccce98 00 00 00 00 00 00 
Jun 17 19:23:05 elm kernel: scsi_add_timer: scmd: cc59ee98, time: 30000, (c028cd50)
Jun 17 19:23:05 elm kernel: scsi_eh_done scmd: c9ccce98 result: 0
Jun 17 19:23:05 elm kernel: scsi_send_eh_cmnd: scmd: c9ccce98, rtn:2002
Jun 17 19:23:05 elm kernel: scsi_send_eh_cmnd: scsi_eh_completed_normally 2002
Jun 17 19:23:05 elm kernel: scsi_eh_tur: scmd c9ccce98 rtn 2002
Jun 17 19:23:05 elm kernel: scsi_eh_2: flush retry cmd: c9ccce98
Jun 17 19:23:05 elm kernel: scsi_delete_timer: scmd: c9ccce98, rtn: 0
Jun 17 19:23:05 elm kernel: scsi_add_timer: scmd: c9ccce98, time: 30000, (c028cd50)
Jun 17 19:23:05 elm kernel: scsi_debug: scmd: c9ccce98 28 00 00 00 00 00 00 00 80 00 
Jun 17 19:23:05 elm kernel: Error handler scsi_eh_2 sleeping
Jun 17 19:23:05 elm kernel: scsi_delete_timer: scmd: c9ccce98, rtn: 1


#
# EH_RESET_TIMER
#
Jun 17 19:23:05 elm andmike: Starting eh_timed_out EH_RESET_TIMER test
Jun 17 19:23:05 elm kernel: scsi_block_when_processing_errors: rtn: 1
Jun 17 19:23:05 elm kernel: scsi_add_timer: scmd: c9ccce98, time: 30000, (c028cd50)
Jun 17 19:23:05 elm kernel: scsi_debug: scmd: c9ccce98 28 00 00 00 00 00 00 00 80 00 
Jun 17 19:23:35 elm kernel: scsi_debug_timed_out: scmd: c9ccce98
Jun 17 19:23:35 elm kernel: scsi_debug_timed_out: scmd: c9ccce98 RST_TIME
Jun 17 19:23:35 elm kernel: scsi_add_timer: scmd: c9ccce98, time: 30000, (c028cd50)
Jun 17 19:24:05 elm kernel: scsi_delete_timer: scmd: c9ccce98, rtn: 1
Jun 17 19:24:05 elm andmike: Ending eh_timed_out tests
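
In the EH_RESET_TIMER run above, the log shows the timer being re-armed once
for another 30000 ms, which is consistent with the roughly 60-second dd time.
As a rough sketch of the arithmetic, the model below counts how many re-arms
the "retries++ > allowed" test from the patch permits when the callback keeps
returning EH_RESET_TIMER. The function names are invented for this example,
and it assumes nothing else modifies retries in the meantime.

```c
#include <assert.h>

/* Number of times the timer is re-armed before the timeout path falls
 * through to error recovery, if every callback returns EH_RESET_TIMER. */
int resets_before_escalation(int allowed)
{
	int retries = 0;
	int resets = 0;

	assert(allowed >= 0);

	for (;;) {
		/* Same comparison and post-increment as in the patch. */
		if (retries++ > allowed)
			break;
		resets++;
	}
	return resets;
}

/* Total time spent waiting on the timer before escalation: the first
 * timer period plus one period per re-arm. */
long wait_before_escalation_ms(long timeout_ms, int allowed)
{
	return timeout_ms * (resets_before_escalation(allowed) + 1);
}
```

For allowed == 0 this gives one re-arm, matching the comment in the patch
that a single retry is possible even then.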



* Re: [PATCH] Flexible timeout infrastructure take II
From: Justin T. Gibbs @ 2004-06-21 17:24 UTC
  To: James Bottomley, SCSI Mailing List

> [This is basically the same patch posted on the flexible timeout
> infrastructure thread, but with all the comments/doc stuff done as well]
> 
> The object of this infrastructure is to give HBAs early warning that
> error handling is about to happen and also provide them with the
> opportunity to do something about it.

..

> There are three possible returns:
> 
> EH_HANDLED:	I've fixed the problem, please complete the command for me
> (as soon as the timer fires, scsi_done will do nothing, so the timer
> itself will call a special version of scsi_done that doesn't check the
> timer).
> 
> EH_NOT_HANDLED:	Invoke error recovery as normal
> 
> EH_RESET_TIMER:	The command will complete, reset the timer to its
> original value and start it ticking again.

You also need an:

EH_NOT_FOUND:		The HBA knows nothing about this command and
			has probably already completed it, but has no
			state to confirm this.

or do you propose that the drivers just blindly return EH_HANDLED?

Since scsi_done() has no return value, the LLD cannot know if the
attempted completion was ignored due to losing the race with the timer.
So it cannot keep state to ensure that it provides the correct response
(EH_HANDLED) in this case.
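
The race can be illustrated with a small user-space model. model_done()
mirrors the shape of scsi_done() in the patch: if the timer can no longer be
deleted it returns silently. model_done_checked() is a hypothetical variant,
not part of the patch, that reports whether the completion was accepted. The
struct and both function names are invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified command state. */
struct cmd_model {
	bool timer_pending; /* true until the command timer has fired */
	bool completed;     /* true once the completion was accepted */
};

/* Shaped like scsi_done() in the patch: no return value, so the caller
 * cannot tell whether the completion was dropped. */
void model_done(struct cmd_model *cmd)
{
	assert(cmd != NULL);

	if (!cmd->timer_pending)
		return; /* timer already fired: completion is ignored */
	cmd->timer_pending = false;
	cmd->completed = true;
}

/* Hypothetical variant: tells the caller whether it won the race. */
bool model_done_checked(struct cmd_model *cmd)
{
	assert(cmd != NULL);

	if (!cmd->timer_pending)
		return false; /* lost the race with the timer */
	cmd->timer_pending = false;
	cmd->completed = true;
	return true;
}
```

With only model_done(), a driver whose completion arrives after the timer has
fired sees nothing unusual, which is why it has no state to consult when the
timeout callback is invoked later.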

--
Justin



* Re: [PATCH] Flexible timeout infrastructure take II
From: James Bottomley @ 2004-06-21 18:11 UTC
  To: Justin T. Gibbs; +Cc: SCSI Mailing List

On Mon, 2004-06-21 at 12:24, Justin T. Gibbs wrote:
> You also need an:
> 
> EH_NOT_FOUND:		The HBA knows nothing about this command and
> 			has probably already completed it, but has no
> 			state to confirm this.
> 
> or do you propose that the drivers just blindly return EH_HANDLED?
> 
> Since scsi_done() has no return value, the LLD cannot know if the
> attempted completion was ignored due to losing the race with the timer.
> So it cannot keep state to ensure that it provides the correct response
> (EH_HANDLED) in this case.

How is this case different from EH_HANDLED?  If the command has been
completed after the timer fired then it still needs to be completed when
the driver is notified of the timeout.

James




* Re: [PATCH] Flexible timeout infrastructure take II
From: Justin T. Gibbs @ 2004-06-21 19:08 UTC
  To: James Bottomley; +Cc: SCSI Mailing List

> On Mon, 2004-06-21 at 12:24, Justin T. Gibbs wrote:
>> You also need an:
>> 
>> EH_NOT_FOUND:		The HBA knows nothing about this command and
>> 			has probably already completed it, but has no
>> 			state to confirm this.
>> 
>> or do you propose that the drivers just blindly return EH_HANDLED?
>> 
>> Since scsi_done() has no return value, the LLD cannot know if the
>> attempted completion was ignored due to losing the race with the timer.
>> So it cannot keep state to ensure that it provides the correct response
>> (EH_HANDLED) in this case.
> 
> How is this case different from EH_HANDLED?  If the command has been
> completed after the timer fired then it still needs to be completed when
> the driver is notified of the timeout.

I was just seeking clarification of the semantics, since this particular
race was not explicitly covered.

It would seem more in line with your "do it in the mid-layer" approach to
fix this race condition there instead of in each driver.  Either way, the
documentation and/or code should address this case.

--
Justin

