* [PATCH] scsi_error device offline fix
@ 2002-10-21  7:37 Mike Anderson
  2002-10-21 16:39 ` Richard Gooch
  2002-10-21 16:50 ` Mike Anderson
  0 siblings, 2 replies; 11+ messages in thread
From: Mike Anderson @ 2002-10-21  7:37 UTC (permalink / raw)
  To: linux-kernel, linux-scsi

This patch corrects a problem in scsi error handling.

When a device is offlined, as indicated by a message like

...Device offlined - not ready...

the command's return status was not updated with a failure status if
the I/O had timed out.

I tested the patch on a system with ips, aic, and qlogic fc adapters, but
was unable to generate a satisfactory device offline test case.

I did test this fix on uml with scsi_debug, where I generated a device
offline condition and verified that the fix works correctly.

-andmike
--
Michael Anderson
andmike@us.ibm.com

 scsi_error.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)
------

===== drivers/scsi/scsi_error.c 1.18 vs edited =====
--- 1.18/drivers/scsi/scsi_error.c	Thu Oct 17 10:52:39 2002
+++ edited/drivers/scsi/scsi_error.c	Sat Oct 19 15:24:06 2002
@@ -1145,14 +1145,18 @@
 		if (!scsi_eh_eflags_chk(scmd, SCSI_EH_CMD_ERR))
 			continue;
 
-		printk(KERN_INFO "%s: Device offlined - not"
+		printk(KERN_INFO "scsi: Device offlined - not"
 				" ready or command retry failed"
 				" after error recovery: host"
 				" %d channel %d id %d lun %d\n",
-				__FUNCTION__, shost->host_no,
+				shost->host_no,
 				scmd->device->channel,
 				scmd->device->id,
 				scmd->device->lun);
+
+		if (scsi_eh_eflags_chk(scmd, SCSI_EH_CMD_TIMEOUT))
+			scmd->result |= (DRIVER_TIMEOUT << 24);
+
 		scmd->device->online = FALSE;
 		scsi_eh_finish_cmd(scmd, shost);
 	}


* Re: [PATCH] scsi_error device offline fix
  2002-10-21  7:37 [PATCH] scsi_error device offline fix Mike Anderson
@ 2002-10-21 16:39 ` Richard Gooch
  2002-10-21 16:51   ` Mike Anderson
  2002-10-21 16:50 ` Mike Anderson
  1 sibling, 1 reply; 11+ messages in thread
From: Richard Gooch @ 2002-10-21 16:39 UTC (permalink / raw)
  To: Mike Anderson; +Cc: linux-kernel, linux-scsi

Mike Anderson writes:
> This patch corrects a problem in scsi error handling.

For which kernel version?

				Regards,

					Richard....
Permanent: rgooch@atnf.csiro.au
Current:   rgooch@ras.ucalgary.ca


* Re: [PATCH] scsi_error device offline fix
  2002-10-21  7:37 [PATCH] scsi_error device offline fix Mike Anderson
  2002-10-21 16:39 ` Richard Gooch
@ 2002-10-21 16:50 ` Mike Anderson
  1 sibling, 0 replies; 11+ messages in thread
From: Mike Anderson @ 2002-10-21 16:50 UTC (permalink / raw)
  To: linux-kernel, linux-scsi

Sorry for not stating clearly in the mail this patch is against 2.5.44

-andmike
--
Michael Anderson
andmike@us.ibm.com


* Re: [PATCH] scsi_error device offline fix
  2002-10-21 16:39 ` Richard Gooch
@ 2002-10-21 16:51   ` Mike Anderson
  0 siblings, 0 replies; 11+ messages in thread
From: Mike Anderson @ 2002-10-21 16:51 UTC (permalink / raw)
  To: Richard Gooch; +Cc: linux-kernel, linux-scsi

Sorry for not making it clear in the mail this patch is against 2.5.44

Richard Gooch [rgooch@ras.ucalgary.ca] wrote:
> Mike Anderson writes:
> > This patch corrects a problem in scsi error handling.
> 
> For which kernel version?
> 
> 				Regards,
> 
> 					Richard....
> Permanent: rgooch@atnf.csiro.au
> Current:   rgooch@ras.ucalgary.ca
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-andmike
--
Michael Anderson
andmike@us.ibm.com



* Re: [PATCH] scsi_error device offline fix
@ 2002-10-21 17:39 andy barlak
  2002-10-21 19:33 ` Mike Anderson
  0 siblings, 1 reply; 11+ messages in thread
From: andy barlak @ 2002-10-21 17:39 UTC (permalink / raw)
  To: linux-kernel


This patch to scsi_error.c makes no improvement
in my BusLogic 958 difficulties.  Still get these messages
and timeouts with the patch.

scsi_eh_offline_sdevs: Device offlined - not ready or command retry failed after
 error recovery: host 0 channel 0 id 1 lun 0
.
.
.

-- 

 Andy Barlak



* Re: [PATCH] scsi_error device offline fix
  2002-10-21 17:39 andy barlak
@ 2002-10-21 19:33 ` Mike Anderson
  2002-10-21 20:01   ` andy barlak
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Anderson @ 2002-10-21 19:33 UTC (permalink / raw)
  To: andy barlak; +Cc: linux-kernel

andy barlak [andyb@island.net] wrote:
> 
> This patch to scsi_error.c makes no improvement
> in my BusLogic 958 difficulties.  Still get these messages
> and timeouts with the patch.
> 
> scsi_eh_offline_sdevs: Device offlined - not ready or command retry failed after
>  error recovery: host 0 channel 0 id 1 lun 0
> .
> .
> .
> 
> -- 
> 
>  Andy Barlak

Is the patch applied correctly?

In the patch the printk prefix is changed to "scsi:" instead of
"scsi_eh_offline_sdevs:"

-andmike
--
Michael Anderson
andmike@us.ibm.com



* Re: [PATCH] scsi_error device offline fix
  2002-10-21 19:33 ` Mike Anderson
@ 2002-10-21 20:01   ` andy barlak
  2002-10-21 22:52     ` Patrick Mansfield
  0 siblings, 1 reply; 11+ messages in thread
From: andy barlak @ 2002-10-21 20:01 UTC (permalink / raw)
  To: Mike Anderson; +Cc: linux-kernel


Sorry,  used the wrong dmesg file for the copy and paste of the error message.

yes the printk error message issued is:

scsi: Device offlined - not ready or command retry failed after error recovery:
host 0 channel 0 id 0 lun 0

over and over through all ids, existing or not.
Patch was successfully applied to 2.5.44.



On Mon, 21 Oct 2002, Mike Anderson wrote:

> andy barlak [andyb@island.net] wrote:
> >
> > This patch to scsi_error.c makes no improvement
> > in my BusLogic 958 difficulties.  Still get these messages
> > and timeouts with the patch.
> >
> > scsi_eh_offline_sdevs: Device offlined - not ready or command retry failed after
> >  error recovery: host 0 channel 0 id 1 lun 0
> > .
> > .
> > .
> >
> > --
> >
> >  Andy Barlak
>
> Is the patch applied correctly?
>
> In the patch the printk prefix is changed to "scsi:" instead of
> "scsi_eh_offline_sdevs:"
>
> -andmike
> --
> Michael Anderson
> andmike@us.ibm.com
>

-- 

 Andy Barlak




* Re: [PATCH] scsi_error device offline fix
  2002-10-21 20:01   ` andy barlak
@ 2002-10-21 22:52     ` Patrick Mansfield
  2002-10-22  0:58       ` andy barlak
  0 siblings, 1 reply; 11+ messages in thread
From: Patrick Mansfield @ 2002-10-21 22:52 UTC (permalink / raw)
  To: andy barlak; +Cc: Mike Anderson, linux-kernel

On Mon, Oct 21, 2002 at 01:01:26PM -0700, andy barlak wrote:
> 
> Sorry,  used the wrong dmesg file for the copy and paste of the error message.
> 
> yes the printk error message issued is:
> 
> scsi: Device offlined - not ready or command retry failed after error recovery:
> host 0 channel 0 id 0 lun 0
> 
> over and over through all ids, existing or not.
> Patch was successfully applied to 2.5.44.

I thought it could be one of the INQUIRY related commands to get the
id/serial numbers, since in your (previous) dmesg output, the failure
occurred after the print_inquiry() call on the same target before any
upper level attaches.

But now you are getting nothing at all, not even any of the print_inquiry()
output? 

Like you got just before the failures in your original message:

Vendor: CONNER    Model: CFP2107E  2.14GB  Rev: 1423
Type:   Direct-Access                      ANSI SCSI revision: 02
Vendor: SEAGATE   Model: SX423451W         Rev: 9E18
Type:   Direct-Access                      ANSI SCSI revision: 02

Can you turn on all scsi logging - with CONFIG_SCSI_LOGGING enabled,
on your boot command line add a "scsi_logging=1" and send
the output.

-- Patrick Mansfield


* Re: [PATCH] scsi_error device offline fix
  2002-10-21 22:52     ` Patrick Mansfield
@ 2002-10-22  0:58       ` andy barlak
  2002-10-22 15:38         ` Patrick Mansfield
  0 siblings, 1 reply; 11+ messages in thread
From: andy barlak @ 2002-10-22  0:58 UTC (permalink / raw)
  To: Patrick Mansfield; +Cc: Mike Anderson, linux-kernel

On Mon, 21 Oct 2002, Patrick Mansfield wrote:
> Can you turn on all scsi logging - with CONFIG_SCSI_LOGGING enabled,
> on your boot command line add a "scsi_logging=1" and send
> the output.
>
> -- Patrick Mansfield

Sure.  large dmesg buffer required.  This produced a 55k file that
I will pare down to what I consider informative.

SCSI subsystem driver Revision: 1.00
PCI: Assigned IRQ 10 for device 00:08.0
scsi: ***** BusLogic SCSI Driver Version 2.1.16 of 18 July 2002 *****
scsi: Copyright 1995-1998 by Leonard N. Zubkoff <lnz@dandelion.com>
Wake up parent
Error handler sleeping
scsi0: Configuring BusLogic Model BT-958 PCI Wide Ultra SCSI Host Adapter
scsi0:   Firmware Version: 5.06J, I/O Address: 0xE800, IRQ Channel: 10/Level
scsi0:   PCI Bus: 0, Device: 8, Address: 0xED001000, Host Adapter SCSI ID: 7
scsi0:   Parity Checking: Disabled, Extended Translation: Disabled
scsi0:   Synchronous Negotiation: FFFFSFF#FFFFFFFF, Wide Negotiation: YYYYNYY#YYYYYYYY
scsi0:   Disconnect/Reconnect: Enabled, Tagged Queuing: Enabled
scsi0:   Scatter/Gather Limit: 128 of 8192 segments, Mailboxes: 211
scsi0:   Driver Queue Depth: 211, Host Adapter Queue Depth: 192
scsi0:   Tagged Queue Depth: Automatic, Untagged Queue Depth: 3
scsi0:   Error Recovery Strategy: Default, SCSI Bus Reset: Disabled
scsi0:   SCSI Bus Termination: High Enabled, SCAM: Disabled
scsi0: *** BusLogic BT-958 Initialized Successfully ***
scsi0 : BusLogic BT-958
scsi scan: INQUIRY to host 0 channel 0 id 0 lun 0
scsi_do_req (host = 0, channel = 0 target = 0, buffer =c03a9060, bufflen = 36, done = c01e425c, timeout = 6000, retries = 3)
command : 12  00  00  00  24  00
Activating command for device 0 (1)
Leaving scsi_init_cmd_from_req()
scsi_add_timer: scmd: c134f000, time: 6000, (c01e8e00)
scsi_dispatch_cmnd (host = 0, channel = 0, target = 0, command = c134f064, buffer = c03a9060,
bufflen = 36, done = c01e425c)
queuecommand : routine at c01eda7c
leaving scsi_dispatch_cmnd()
Leaving scsi_do_req()
scsi_delete_timer: scmd: c134f000, rtn: 1
Command finished 1 0 0x0
Notifying upper driver of completion for device 0 0
Deactivating command for device 0 (active=0, failed=0)
scsi scan: 1st INQUIRY successful with code 0x0
  Vendor: CONNER    Model: CFP2107E  2.14GB  Rev: 1423
  Type:   Direct-Access                      ANSI SCSI revision: 02
scsi_do_req (host = 0, channel = 0 target = 0, buffer =c03a9160, bufflen = 255,
done = c01e425c, timeout = 6000, retries = 3)
command : 12  01  00  00  ff  00
Activating command for device 0 (1)
Leaving scsi_init_cmd_from_req()
scsi_add_timer: scmd: c134f000, time: 6000, (c01e8e00)
scsi_dispatch_cmnd (host = 0, channel = 0, target = 0, command = c134f064, buffer = c03a9160,
bufflen = 255, done = c01e425c)
queuecommand : routine at c01eda7c
leaving scsi_dispatch_cmnd()
Leaving scsi_do_req()
scsi_delete_timer: scmd: c134f000, rtn: 1
Command finished 1 0 0x0
Notifying upper driver of completion for device 0 0
Deactivating command for device 0 (active=0, failed=0)
scsi_do_req (host = 0, channel = 0 target = 0, buffer =c03a9260, bufflen = 255,
done = c01e425c, timeout = 6000, retries = 3)
command : 12  01  80  00  ff  00
Activating command for device 0 (1)
Leaving scsi_init_cmd_from_req()
scsi_add_timer: scmd: c134f000, time: 6000, (c01e8e00)
scsi_dispatch_cmnd (host = 0, channel = 0, target = 0, command = c134f064, buffer = c03a9260,
bufflen = 255, done = c01e425c)
queuecommand : routine at c01eda7c
leaving scsi_dispatch_cmnd()
Leaving scsi_do_req()
scsi_delete_timer: scmd: c134f000, rtn: 1
Command finished 1 0 0x0
Notifying upper driver of completion for device 0 0
Deactivating command for device 0 (active=0, failed=0)
scsi scan: host 0 channel 0 id 0 lun 0 name/id: 'SCONNER  CFP2107E  2.14GBEG95Z9W '
scsi scan: Sequential scan of host 0 channel 0 id 0
scsi scan: INQUIRY to host 0 channel 0 id 1 lun 0
scsi_do_req (host = 0, channel = 0 target = 1, buffer =c03a9060, bufflen = 36, done = c01e425c, timeout = 6000, retries = 3)
command : 12  00  00  00  24  00
Activating command for device 1 (1)
Leaving scsi_init_cmd_from_req()
scsi_add_timer: scmd: c134f000, time: 6000, (c01e8e00)
scsi_dispatch_cmnd (host = 0, channel = 0, target = 1, command = c134f064, buffer = c03a9060,
bufflen = 36, done = c01e425c)
queuecommand : routine at c01eda7c
leaving scsi_dispatch_cmnd()
Leaving scsi_do_req()
scsi_delete_timer: scmd: c134f000, rtn: 1
Command finished 1 0 0x0
Notifying upper driver of completion for device 1 0
Deactivating command for device 1 (active=0, failed=0)
scsi scan: 1st INQUIRY successful with code 0x0
scsi_do_req (host = 0, channel = 0 target = 1, buffer =c03a9060, bufflen = 144,
done = c01e425c, timeout = 6000, retries = 3)
command : 12  00  00  00  90  00
Activating command for device 1 (1)
Leaving scsi_init_cmd_from_req()
scsi_add_timer: scmd: c134f000, time: 6000, (c01e8e00)
scsi_dispatch_cmnd (host = 0, channel = 0, target = 1, command = c134f064, buffer = c03a9060,
bufflen = 144, done = c01e425c)
queuecommand : routine at c01eda7c
leaving scsi_dispatch_cmnd()
Leaving scsi_do_req()
scsi_delete_timer: scmd: c134f000, rtn: 1
Command finished 1 0 0x0
Notifying upper driver of completion for device 1 0
Deactivating command for device 1 (active=0, failed=0)
scsi scan: 2nd INQUIRY successful with code 0x0
  Vendor: SEAGATE   Model: SX423451W         Rev: 9E18
  Type:   Direct-Access                      ANSI SCSI revision: 02
scsi_do_req (host = 0, channel = 0 target = 1, buffer =c03a9160, bufflen = 255,
done = c01e425c, timeout = 6000, retries = 3)
command : 12  01  00  00  ff  00
Activating command for device 1 (1)
Leaving scsi_init_cmd_from_req()
scsi_add_timer: scmd: c134f000, time: 6000, (c01e8e00)
scsi_dispatch_cmnd (host = 0, channel = 0, target = 1, command = c134f064, buffer = c03a9160,
bufflen = 255, done = c01e425c)
queuecommand : routine at c01eda7c
leaving scsi_dispatch_cmnd()
Leaving scsi_do_req()
Waking error handler thread
Command timed out active=1 busy=1  failed=1
Error handler waking up
scsi_eh_prt_fail_stats: 0:0:1:0 cmds failed: 0, timedout: 1
Total of 1 commands on 1 devices require eh work       <<<<<<<<<<<<<<<
scsi_eh_get_sense: checking to see if we need to request sense
scsi_eh_abort_cmd: checking to see if we need to abort cmd
scsi_eh_bus_device_reset: Trying BDR
scsi_eh_bus_host_reset: Try Bus/Host RST
scsi_try_bus_reset: Snd Bus RST
scsi_try_host_reset: Snd Host RST
scsi: Device offlined - not ready or command retry failed after error recovery:
host 0 channel 0 id 1 lun 0
scsi_add_timer: scmd: c134f000, time: 100, (c01e91dc)
scsi_delete_timer: scmd: c134f000, rtn: 1
scsi_restart_operations: waking up host to restart
Error handler sleeping
scsi_decide_disposition: device offline - report as SUCCESS
Command finished 1 0 0x6000000
Notifying upper driver of completion for device 1 6000000
Deactivating command for device 1 (active=0, failed=0)
scsi_do_req (host = 0, channel = 0 target = 1, buffer =c03a9160, bufflen = 255,
done = c01e425c, timeout = 6000, retries = 3)
command : 12  01  80  00  ff  00
Activating command for device 1 (1)
Leaving scsi_init_cmd_from_req()
scsi_add_timer: scmd: c134f000, time: 6000, (c01e8e00)
scsi_dispatch_cmnd (host = 0, channel = 0, target = 1, command = c134f064, buffer = c03a9160,
bufflen = 255, done = c01e425c)
queuecommand : routine at c01eda7c
leaving scsi_dispatch_cmnd()
Leaving scsi_do_req()
Waking error handler thread
Command timed out active=1 busy=1  failed=1
Error handler waking up
scsi_eh_prt_fail_stats: 0:0:1:0 cmds failed: 0, timedout: 1
Total of 1 commands on 1 devices require eh work            <<<<<<<<<<<<<
scsi_eh_get_sense: checking to see if we need to request sense
scsi_eh_abort_cmd: checking to see if we need to abort cmd
scsi_eh_bus_device_reset: Trying BDR
scsi_eh_bus_host_reset: Try Bus/Host RST
scsi_try_bus_reset: Snd Bus RST
scsi_try_host_reset: Snd Host RST
scsi: Device offlined - not ready or command retry failed after error recovery:
host 0 channel 0 id 1 lun 0
scsi_add_timer: scmd: c134f000, time: 100, (c01e91dc)
scsi_delete_timer: scmd: c134f000, rtn: 1
scsi_restart_operations: waking up host to restart
Error handler sleeping
scsi_decide_disposition: device offline - report as SUCCESS
Command finished 1 0 0x6000000
Notifying upper driver of completion for device 1 6000000
Deactivating command for device 1 (active=0, failed=0)
scsi scan: host 0 channel 0 id 1 lun 0 name/id: ''
scsi scan: Sequential scan of host 0 channel 0 id 1
scsi scan: INQUIRY to host 0 channel 0 id 2 lun 0
scsi_do_req (host = 0, channel = 0 target = 2, buffer =c03a9060, bufflen = 36, done = c01e425c, timeout = 6000, retries = 3)
command : 12  00  00  00  24  00
Activating command for device 2 (1)
Leaving scsi_init_cmd_from_req()
scsi_add_timer: scmd: c134f000, time: 6000, (c01e8e00)
scsi_dispatch_cmnd (host = 0, channel = 0, target = 2, command = c134f064, buffer = c03a9060,
bufflen = 36, done = c01e425c)
queuecommand : routine at c01eda7c
leaving scsi_dispatch_cmnd()
Leaving scsi_do_req()
Waking error handler thread
Command timed out active=1 busy=1  failed=1
Error handler waking up
scsi_eh_prt_fail_stats: 0:0:2:0 cmds failed: 0, timedout: 1
Total of 1 commands on 1 devices require eh work
scsi_eh_get_sense: checking to see if we need to request sense
scsi_eh_abort_cmd: checking to see if we need to abort cmd
scsi_eh_bus_device_reset: Trying BDR
scsi_eh_bus_host_reset: Try Bus/Host RST
scsi_try_bus_reset: Snd Bus RST
scsi_try_host_reset: Snd Host RST
scsi: Device offlined - not ready or command retry failed after error recovery:
host 0 channel 0 id 2 lun 0
scsi_add_timer: scmd: c134f000, time: 100, (c01e91dc)
scsi_delete_timer: scmd: c134f000, rtn: 1
scsi_restart_operations: waking up host to restart
Error handler sleeping
scsi_decide_disposition: device offline - report as SUCCESS
Command finished 1 0 0x6000000
Notifying upper driver of completion for device 2 6000000
Deactivating command for device 2 (active=0, failed=0)
scsi scan: 1st INQUIRY failed with code 0x6000000
scsi scan: INQUIRY to host 0 channel 0 id 3 lun 0
scsi_do_req (host = 0, channel = 0 target = 3, buffer =c03a9060, bufflen = 36, done = c01e425c, timeout = 6000, retries = 3)
command : 12  00  00  00  24  00
Activating command for device 3 (1)
Leaving scsi_init_cmd_from_req()
scsi_add_timer: scmd: c134f000, time: 6000, (c01e8e00)
scsi_dispatch_cmnd (host = 0, channel = 0, target = 3, command = c134f064, buffer = c03a9060,
bufflen = 36, done = c01e425c)
queuecommand : routine at c01eda7c
leaving scsi_dispatch_cmnd()
Leaving scsi_do_req()
Waking error handler thread
Command timed out active=1 busy=1  failed=1
Error handler waking up
scsi_eh_prt_fail_stats: 0:0:3:0 cmds failed: 0, timedout: 1
Total of 1 commands on 1 devices require eh work
scsi_eh_get_sense: checking to see if we need to request sense
scsi_eh_abort_cmd: checking to see if we need to abort cmd
scsi_eh_bus_device_reset: Trying BDR
scsi_eh_bus_host_reset: Try Bus/Host RST
scsi_try_bus_reset: Snd Bus RST
scsi_try_host_reset: Snd Host RST
scsi: Device offlined - not ready or command retry failed after error recovery:
host 0 channel 0 id 3 lun 0
scsi_add_timer: scmd: c134f000, time: 100, (c01e91dc)
scsi_delete_timer: scmd: c134f000, rtn: 1
scsi_restart_operations: waking up host to restart
Error handler sleeping
scsi_decide_disposition: device offline - report as SUCCESS
Command finished 1 0 0x6000000
Notifying upper driver of completion for device 3 6000000
Deactivating command for device 3 (active=0, failed=0)
scsi scan: 1st INQUIRY failed with code 0x6000000
scsi scan: INQUIRY to host 0 channel 0 id 4 lun 0
scsi_do_req (host = 0, channel = 0 target = 4, buffer =c03a9060, bufflen = 36, done = c01e425c, timeout = 6000, retries = 3)
command : 12  00  00  00  24  00
Activating command for device 4 (1)
Leaving scsi_init_cmd_from_req()
scsi_add_timer: scmd: c134f000, time: 6000, (c01e8e00)
scsi_dispatch_cmnd (host = 0, channel = 0, target = 4, command = c134f064, buffer = c03a9060,
bufflen = 36, done = c01e425c)
queuecommand : routine at c01eda7c
leaving scsi_dispatch_cmnd()
Leaving scsi_do_req()
Waking error handler thread
Command timed out active=1 busy=1  failed=1
Error handler waking up
scsi_eh_prt_fail_stats: 0:0:4:0 cmds failed: 0, timedout: 1
Total of 1 commands on 1 devices require eh work
scsi_eh_get_sense: checking to see if we need to request sense
scsi_eh_abort_cmd: checking to see if we need to abort cmd
scsi_eh_bus_device_reset: Trying BDR
scsi_eh_bus_host_reset: Try Bus/Host RST
scsi_try_bus_reset: Snd Bus RST
scsi_try_host_reset: Snd Host RST
scsi: Device offlined - not ready or command retry failed after error recovery:
host 0 channel 0 id 4 lun 0
scsi_add_timer: scmd: c134f000, time: 100, (c01e91dc)
scsi_delete_timer: scmd: c134f000, rtn: 1
scsi_restart_operations: waking up host to restart
Error handler sleeping
scsi_decide_disposition: device offline - report as SUCCESS
Command finished 1 0 0x6000000
Notifying upper driver of completion for device 4 6000000
Deactivating command for device 4 (active=0, failed=0)
scsi scan: 1st INQUIRY failed with code 0x6000000
.
.
.
.
Deactivating command for device 15 (active=0, failed=0)
scsi scan: 1st INQUIRY failed with code 0x6000000
st: Version 20021015, fixed bufsize 32768, wrt 30720, s/g segs 256
init_sd: sd driver entry point
sd_detect: type=0
sd_detect: type=0
sd_init: dev_noticed=2
sd_attach: scsi device: <0,0,0,0>
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
sd_attach: scsi device: <0,0,1,0>
Attached scsi disk sdb at scsi0, channel 0, id 1, lun 0
sd_finish:
sd_init_onedisk: disk=sda
scsi_do_req (host = 0, channel = 0 target = 0, buffer =c0003000, bufflen = 0, done = c01e425c, timeout = 30000, retries = 5)
command : 00  00  00  00  00  00
Activating command for device 0 (1)
Leaving scsi_init_cmd_from_req()
scsi_add_timer: scmd: c134f400, time: 30000, (c01e8e00)
scsi_dispatch_cmnd (host = 0, channel = 0, target = 0, command = c134f464, buffer = c0003000,
bufflen = 0, done = c01e425c)
queuecommand : routine at c01eda7c
leaving scsi_dispatch_cmnd()
Leaving scsi_do_req()
Waking error handler thread
Command timed out active=1 busy=1  failed=1
Error handler waking up
scsi_eh_prt_fail_stats: 0:0:0:0 cmds failed: 0, timedout: 1
Total of 1 commands on 1 devices require eh work
scsi_eh_get_sense: checking to see if we need to request sense
scsi_eh_abort_cmd: checking to see if we need to abort cmd
scsi_eh_bus_device_reset: Trying BDR
scsi_eh_bus_host_reset: Try Bus/Host RST
scsi_try_bus_reset: Snd Bus RST
scsi_try_host_reset: Snd Host RST
scsi: Device offlined - not ready or command retry failed after error recovery:
host 0 channel 0 id 0 lun 0
scsi_add_timer: scmd: c134f400, time: 100, (c01e91dc)
scsi_delete_timer: scmd: c134f400, rtn: 1
scsi_restart_operations: waking up host to restart
Error handler sleeping
scsi_decide_disposition: device offline - report as SUCCESS
Command finished 1 0 0x6000000
Notifying upper driver of completion for device 0 6000000
Deactivating command for device 0 (active=0, failed=0)
scsi_do_req (host = 0, channel = 0 target = 0, buffer =c0003000, bufflen = 128,
done = c01e425c, timeout = 30000, retries = 5)
command : 1a  08  08  00  80  00
Activating command for device 0 (1)
Leaving scsi_init_cmd_from_req()
scsi_add_timer: scmd: c134fe00, time: 30000, (c01e8e00)
scsi_dispatch_cmnd (host = 0, channel = 0, target = 0, command = c134fe64, buffer = c0003000,
bufflen = 128, done = c01e425c)
queuecommand : routine at c01eda7c
leaving scsi_dispatch_cmnd()
Leaving scsi_do_req()
Waking error handler thread
Command timed out active=1 busy=1  failed=1
Error handler waking up
scsi_eh_prt_fail_stats: 0:0:0:0 cmds failed: 0, timedout: 1
Total of 1 commands on 1 devices require eh work
scsi_eh_get_sense: checking to see if we need to request sense
scsi_eh_abort_cmd: checking to see if we need to abort cmd
scsi_eh_bus_device_reset: Trying BDR
scsi_eh_bus_host_reset: Try Bus/Host RST
scsi_try_bus_reset: Snd Bus RST
scsi_try_host_reset: Snd Host RST
scsi: Device offlined - not ready or command retry failed after error recovery:
host 0 channel 0 id 0 lun 0
scsi_add_timer: scmd: c134fe00, time: 100, (c01e91dc)
scsi_delete_timer: scmd: c134fe00, rtn: 1
scsi_restart_operations: waking up host to restart
Error handler sleeping
scsi_decide_disposition: device offline - report as SUCCESS
Command finished 1 0 0x6000000
Notifying upper driver of completion for device 0 6000000
Deactivating command for device 0 (active=0, failed=0)
scsi_do_req (host = 0, channel = 0 target = 0, buffer =c0003000, bufflen = 128,
done = c01e425c, timeout = 30000, retries = 5)
command : 1a  08  08  00  80  00
Activating command for device 0 (1)
Leaving scsi_init_cmd_from_req()
scsi_add_timer: scmd: cfdf4000, time: 30000, (c01e8e00)
scsi_dispatch_cmnd (host = 0, channel = 0, target = 0, command = cfdf4064, buffer = c0003000,
bufflen = 128, done = c01e425c)
.
.
.
scsi: Device offlined - not ready or command retry failed after error recovery:
host 0 channel 0 id 0 lun 0
scsi_add_timer: scmd: cfdf4800, time: 100, (c01e91dc)
scsi_delete_timer: scmd: cfdf4800, rtn: 1
scsi_restart_operations: waking up host to restart
Error handler sleeping
scsi_decide_disposition: device offline - report as SUCCESS
Command finished 1 0 0x6000000
Notifying upper driver of completion for device 0 6000000
Deactivating command for device 0 (active=0, failed=0)
sda : READ CAPACITY failed.
sda : status=0, message=00, host=0, driver=06
sda : sense not available.
sd_open: disk=sda
scsi_block_when_processing_errors: rtn: 0
sd_init_onedisk: disk=sdb
scsi_do_req (host = 0, channel = 0 target = 1, buffer =c0003000, bufflen = 0, done = c01e425c, timeout = 30000, retries = 5)
command : 00  00  00  00  00  00
Activating command for device 1 (1)
Leaving scsi_init_cmd_from_req()
scsi_add_timer: scmd: c134f600, time: 30000, (c01e8e00)
scsi_dispatch_cmnd (host = 0, channel = 0, target = 1, command = c134f664,
buffer = c0003000,
bufflen = 0, done = c01e425c)
queuecommand : routine at c01eda7c
leaving scsi_dispatch_cmnd()
Leaving scsi_do_req()
Waking error handler thread
Command timed out active=1 busy=1  failed=1
Error handler waking up
scsi_eh_prt_fail_stats: 0:0:1:0 cmds failed: 0, timedout: 1
Total of 1 commands on 1 devices require eh work
scsi_eh_get_sense: checking to see if we need to request sense
scsi_eh_abort_cmd: checking to see if we need to abort cmd
scsi_eh_bus_device_reset: Trying BDR
scsi_eh_bus_host_reset: Try Bus/Host RST
scsi_try_bus_reset: Snd Bus RST
scsi_try_host_reset: Snd Host RST
scsi: Device offlined - not ready or command retry failed after error recovery:
host 0 channel 0 id 1 lun 0
scsi_add_timer: scmd: c134f600, time: 100, (c01e91dc)
scsi_delete_timer: scmd: c134f600, rtn: 1
scsi_restart_operations: waking up host to restart
Error handler sleeping
scsi_decide_disposition: device offline - report as SUCCESS
Command finished 1 0 0x6000000
Notifying upper driver of completion for device 1 6000000
Deactivating command for device 1 (active=0, failed=0)
scsi_do_req (host = 0, channel = 0 target = 1, buffer =c0003000, bufflen = 128,
done = c01e425c, timeout = 30000, retries = 5)
command : 1a  08  08  00  80  00
Activating command for device 1 (1)
Leaving scsi_init_cmd_from_req()
scsi_add_timer: scmd: cfdf4c00, time: 30000, (c01e8e00)
scsi_dispatch_cmnd (host = 0, channel = 0, target = 1, command = cfdf4c64, buffer = c0003000,
bufflen = 128, done = c01e425c)
queuecommand : routine at c01eda7c
leaving scsi_dispatch_cmnd()
Leaving scsi_do_req()
Waking error handler thread
Command timed out active=1 busy=1  failed=1
Error handler waking up
scsi_eh_prt_fail_stats: 0:0:1:0 cmds failed: 0, timedout: 1
Total of 1 commands on 1 devices require eh work
scsi_eh_get_sense: checking to see if we need to request sense
scsi_eh_abort_cmd: checking to see if we need to abort cmd


and so on.







> On Mon, Oct 21, 2002 at 01:01:26PM -0700, andy barlak wrote:
> >
> > Sorry,  used the wrong dmesg file for the copy and paste of the error message.
> >
> > yes the printk error message issued is:
> >
> > scsi: Device offlined - not ready or command retry failed after error recovery:
> > host 0 channel 0 id 0 lun 0
> >
> > over and over through all ids, existing or not.
> > Patch was successfully applied to 2.5.44.
>
> I thought it could be one of the INQUIRY related commands to get the
> id/serial numbers, since in your (previous) dmesg output, the failure
> occurred after the print_inquiry() call on the same target before any
> upper level attaches.
>
> But now you are getting nothing at all, not even any of the print_inquiry()
> output?
>
> Like you got just before the failures in your original message:
>
> Vendor: CONNER    Model: CFP2107E  2.14GB  Rev: 1423
> Type:   Direct-Access                      ANSI SCSI revision: 02
> Vendor: SEAGATE   Model: SX423451W         Rev: 9E18
> Type:   Direct-Access                      ANSI SCSI revision: 02
>
> Can you turn on all scsi logging - with CONFIG_SCSI_LOGGING enabled,
> on your boot command line add a "scsi_logging=1" and send
> the output.
>
> -- Patrick Mansfield
>

-- 

 Andy Barlak




* Re: [PATCH] scsi_error device offline fix
  2002-10-22  0:58       ` andy barlak
@ 2002-10-22 15:38         ` Patrick Mansfield
  2002-10-22 16:14           ` andy barlak
  0 siblings, 1 reply; 11+ messages in thread
From: Patrick Mansfield @ 2002-10-22 15:38 UTC (permalink / raw)
  To: andy barlak; +Cc: Mike Anderson, linux-kernel

On Mon, Oct 21, 2002 at 05:58:04PM -0700, andy barlak wrote:
> On Mon, 21 Oct 2002, Patrick Mansfield wrote:
> > Can you turn on all scsi logging - with CONFIG_SCSI_LOGGING enabled,
> > on your boot command line add a "scsi_logging=1" and send
> > the output.
> >
> > -- Patrick Mansfield
> 
> Sure.  large dmesg buffer required.  This produced a 55k file that
> I will pare down to what I consider informative.

It looks like the INQUIRY page code 0 is timing out and appears to have
hung the bus, as all other commands sent to the bus then time out.

It's surprising that that would hang the bus.

That driver really needs at least some basic reset handling.

Try removing the scsi_load_identifier call in scsi_scan.c and
see if you can boot. And/or get sg_utils and on your 2.4 system
send an INQUIRY page 0 to the device, and see if that hangs or
not, like:

[patman@elm3a50 sg_utils]$ sudo ./sg_inq  -e -o=0 /dev/sg1
EVPD INQUIRY, page code=0x00:
 Only hex output supported
 00     00 00 00 0c 00 03 80 81  c0 c1 c2 c3 c7 c8 d1 d2    ................    

FYI sg_utils is at:

http://www.torque.net/sg/index.html#Utilities:%20sg_utils%20and%20sg3_utils
http://www.torque.net/sg/p/sg3_utils-1.01.tgz

-- Patrick Mansfield


* Re: [PATCH] scsi_error device offline fix
  2002-10-22 15:38         ` Patrick Mansfield
@ 2002-10-22 16:14           ` andy barlak
  0 siblings, 0 replies; 11+ messages in thread
From: andy barlak @ 2002-10-22 16:14 UTC (permalink / raw)
  To: Patrick Mansfield; +Cc: Mike Anderson, linux-kernel


On Tue, 22 Oct 2002, Patrick Mansfield wrote:
> Try removing the scsi_load_identifier call in scsi_scan.c and
> see if you can boot. And/or get sg_utils and on your 2.4 system
> send an INQUIRY page 0 to the device, and see if that hangs or
> not, like:


On this 2.4.19 box with the BusLogic 958, that command hangs:
# ./sg_inq -e -o=0 /dev/sg1
EVPD INQUIRY, page code=0x00:

Dmesg reports a growing list of:
.
.
.
SCSI host 0 abort (pid 41290) timed out - resetting
SCSI bus is being reset for host 0 channel 0.
scsi0: Resetting BusLogic BT-958 due to Target 1
scsi0: *** BusLogic BT-958 Initialized Successfully ***
SCSI host 0 abort (pid 41292) timed out - resetting
SCSI bus is being reset for host 0 channel 0.
scsi0: Resetting BusLogic BT-958 due to Target 1
scsi0: *** BusLogic BT-958 Initialized Successfully ***
.
.
.


> On Mon, Oct 21, 2002 at 05:58:04PM -0700, andy barlak wrote:
> > On Mon, 21 Oct 2002, Patrick Mansfield wrote:
> > > Can you turn on all scsi logging - with CONFIG_SCSI_LOGGING enabled,
> > > on your boot command line add a "scsi_logging=1" and send
> > > the output.
> > >
> > > -- Patrick Mansfield
> >
> > Sure.  large dmesg buffer required.  This produced a 55k file that
> > I will pare down to what I consider informative.
>
> It looks like the INQUIRY page code 0 is timing out and appears to have
> hung the bus, as all other commands sent to the bus then time out.
>
> It's surprising that that would hang the bus.
>
> That driver really needs at least some basic reset handling.
>
> Try removing the scsi_load_identifier call in scsi_scan.c and
> see if you can boot. And/or get sg_utils and on your 2.4 system
> send an INQUIRY page 0 to the device, and see if that hangs or
> not, like:
>
> [patman@elm3a50 sg_utils]$ sudo ./sg_inq  -e -o=0 /dev/sg1
> EVPD INQUIRY, page code=0x00:
>  Only hex output supported
>  00     00 00 00 0c 00 03 80 81  c0 c1 c2 c3 c7 c8 d1 d2    ................
>
> FYI sg_utils is at:
>
> http://www.torque.net/sg/index.html#Utilities:%20sg_utils%20and%20sg3_utils
> http://www.torque.net/sg/p/sg3_utils-1.01.tgz
>
> -- Patrick Mansfield
>

-- 

 Andy Barlak


