public inbox for linux-kernel@vger.kernel.org
* Strange SCSI behavior?
@ 2001-08-17 12:26 Jon Lapham
  0 siblings, 0 replies; 4+ messages in thread
From: Jon Lapham @ 2001-08-17 12:26 UTC (permalink / raw)
  To: linux-kernel

Hello-

I'm running a heavily used (~10-40 simultaneous users) NFS/smb/email
server on which I recently installed a new SCSI HD (Atlas V 18GB).  The
system is a PIII 450 with 256MB RAM and a 2940U2W SCSI controller, running
kernel v2.4.8 (I've also tried older kernels) with the new aic7xxx
driver.  The fs is ext2.

What I'm seeing is SCSI "sense key Hardware Error" messages on the new HD
during tape backups (HPC1554 DDS-3 drive) scheduled at night (when the
system is unused):

SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 8000002
Info fld=0x214a55b, Current sd08:11: sense key Hardware Error
  I/O error: dev 08:11, sector 34907416
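
For readers decoding the numbers in this log, here is a minimal sketch. It assumes the "return code" is the SCSI mid-layer's 32-bit result word printed in hex, with the status, message, host, and driver bytes packed from low to high as in 2.4-era kernels. The helper name split_scsi_result is hypothetical.

```python
def split_scsi_result(result: int) -> dict:
    """Split a SCSI result word into its four bytes (assumed 2.4-era layout)."""
    return {
        "status": result & 0xFF,          # SCSI status byte from the target
        "msg": (result >> 8) & 0xFF,      # message byte
        "host": (result >> 16) & 0xFF,    # host adapter (low-level driver) code
        "driver": (result >> 24) & 0xFF,  # mid-layer driver code
    }


# "return code = 8000002" from the log above, read as hexadecimal.
fields = split_scsi_result(int("8000002", 16))

# "Info fld=0x214a55b" converted to decimal for comparison with the sector.
info_field = int("0x214a55b", 16)

print(fields)      # status is 0x02, driver is 0x08, msg and host are 0
print(info_field)  # 34907483
```

Under that assumption, status 0x02 is CHECK CONDITION and driver byte 0x08 indicates that sense data accompanies the result, which matches the "sense key" line in the log. The Info field decodes to 34907483, close to but not equal to the reported sector 34907416; the sketch does not attempt to explain that difference.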

Sounds like a bad HD, right?  Well, I've seen bad SCSI disks before, and
this seems different.  These messages *only* appear during tape backups,
not during the day when the machine is under *heavy* I/O load to
that HD.  It is *only* when the DAT tape gets involved that I see these
messages.  I should also say that the files corresponding to the affected
sectors in the error messages are fine; they are not corrupted.

Suggestions?  What can I do to track this down?

TIA, Jon

[root@office /root]# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
   Vendor: SEAGATE  Model: ST39175LW        Rev: 0001
   Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 01 Lun: 00
   Vendor: QUANTUM  Model: ATLAS_V_18_WLS   Rev: 0230
   Type:   Direct-Access                    ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 03 Lun: 00
   Vendor: HP       Model: C1537A           Rev: L708
   Type:   Sequential-Access                ANSI SCSI revision: 02

[root@office /root]# cat /proc/scsi/aic7xxx/0
Adaptec AIC7xxx driver version: 6.1.13
aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/255 SCBs
Channel A Target 0 Negotiation Settings
	User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
	Goal: 40.000MB/s transfers (20.000MHz, offset 15, 16bit)
	Curr: 40.000MB/s transfers (20.000MHz, offset 15, 16bit)
	Channel A Target 0 Lun 0 Settings
		Commands Queued 760609
		Commands Active 0
		Command Openings 52
		Max Tagged Openings 253
		Device Queue Frozen Count 0
Channel A Target 1 Negotiation Settings
	User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
	Goal: 40.000MB/s transfers (20.000MHz, offset 63, 16bit)
	Curr: 40.000MB/s transfers (20.000MHz, offset 63, 16bit)
	Channel A Target 1 Lun 0 Settings
		Commands Queued 790561
		Commands Active 0
		Command Openings 64
		Max Tagged Openings 253
		Device Queue Frozen Count 0
Channel A Target 2 Negotiation Settings
	User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 3 Negotiation Settings
	User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
	Goal: 10.000MB/s transfers (10.000MHz, offset 32)
	Curr: 10.000MB/s transfers (10.000MHz, offset 32)
	Channel A Target 3 Lun 0 Settings
		Commands Queued 3200334
		Commands Active 0
		Command Openings 1
		Max Tagged Openings 0
		Device Queue Frozen Count 0
Channel A Target 4 Negotiation Settings
	User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 5 Negotiation Settings
	User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 6 Negotiation Settings
	User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 7 Negotiation Settings
	User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 8 Negotiation Settings
	User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 9 Negotiation Settings
	User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 10 Negotiation Settings
	User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 11 Negotiation Settings
	User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 12 Negotiation Settings
	User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 13 Negotiation Settings
	User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 14 Negotiation Settings
	User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 15 Negotiation Settings
	User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)





* RE: Strange SCSI behavior?
@ 2001-08-17 12:46 Cress, Andrew R
  2001-08-17 13:25 ` Jon Lapham
  0 siblings, 1 reply; 4+ messages in thread
From: Cress, Andrew R @ 2001-08-17 12:46 UTC (permalink / raw)
  To: 'lapham@extracta.com.br', linux-kernel

Jon,

You really need to know what the additional sense data shows.
DAT drives often use variable-length block sizes and get errors
from some UNIX commands as a result.  Or it may be something that could be
fixed with a firmware update to the DAT drive, or a driver fix.  It depends
on the details.  Is sd08:11 the DAT drive?

Make sure 
CONFIG_SCSI_LOGGING=y 
CONFIG_SCSI_DEBUG=m (or =y)
in your kernel, and issue
echo "scsi log error 3" > /proc/scsi/scsi
and rerun the tape backup to get more info.

Andy


* RE: Strange SCSI behavior?
@ 2001-08-17 13:03 Randal, Phil
  0 siblings, 0 replies; 4+ messages in thread
From: Randal, Phil @ 2001-08-17 13:03 UTC (permalink / raw)
  To: linux-kernel

The most common cause of Sense Key errors on DDS drives is
dirty heads.  Try running a cleaning tape through the drive
a couple of times.  No, once is not enough, in my experience.

Cheers,

Phil

---------------------------------------------
Phil Randal
Network Engineer
Herefordshire Council
Hereford, UK 


* Re: Strange SCSI behavior?
  2001-08-17 12:46 Strange SCSI behavior? Cress, Andrew R
@ 2001-08-17 13:25 ` Jon Lapham
  0 siblings, 0 replies; 4+ messages in thread
From: Jon Lapham @ 2001-08-17 13:25 UTC (permalink / raw)
  To: Cress, Andrew R; +Cc: linux-kernel

Andrew-

I don't know if this helps, but this is the message tar spits out when 
this problem occurs:

/bin/tar: home/pedro/COL&AMk-TA: Read error at byte 21958656, reading 10240 byte: Input/output error

Cress, Andrew R wrote:
> Jon,
> 
> You really need to know what the additional sense data shows.
> With DAT tapes often they have variable length block sizes and get errors
> from some UNIX commands as a result.  Or, it may be something that could be
> fixed with a firmware update to the DAT drive, or a driver fix.  It depends
> on the details.  Is sd08:11 the DAT drive?

Hmmm... I have no idea!  I know that "host 0 channel 0 id 1 lun 0" 
refers to the new Atlas HD, and I know that I have another HD with id 0, 
and that the tape drive is id 3.  I do not know how to interpret 
"sd08:11", suggestions?

Maybe 'sd08' refers to the block device number?
[root@office sysadm]# cat /proc/devices
[snip]
Block devices:
   2 fd
   7 loop
   8 sd
  22 ide1
  65 sd
  66 sd

So, one of the three SCSI devices uses block 8... but which?  I don't know.
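
One way to read "08:11" is sketched below. It assumes the pair is the kernel's major:minor device number printed in hexadecimal, and that the sd driver on major 8 gives each disk 16 minors (minor 0 of each group for the whole disk, 1-15 for partitions). The function name decode_sd_dev is hypothetical.

```python
import string


def decode_sd_dev(dev: str) -> str:
    """Map a hex 'major:minor' pair on sd major 8 to a device name."""
    major_hex, minor_hex = dev.split(":")
    major, minor = int(major_hex, 16), int(minor_hex, 16)
    if major != 8:
        raise ValueError("this sketch only handles the first sd major (8)")
    # Assumed layout: 16 minors per disk, partition 0 meaning the whole disk.
    disk_index, partition = divmod(minor, 16)
    name = "sd" + string.ascii_lowercase[disk_index]
    return name if partition == 0 else f"{name}{partition}"


print(decode_sd_dev("08:11"))  # sdb1
```

If those assumptions hold, 08:11 is the first partition on the second SCSI disk, which would fit the error being reported against id 1 (the Atlas) rather than the tape drive at id 3.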

> 
> Make sure 
> CONFIG_SCSI_LOGGING=y 
> CONFIG_SCSI_DEBUG=m (or =y)
> in your kernel, and issue
> echo "scsi log error 3" > /proc/scsi/scsi
> and rerun the tape backup to get more info.
> 
> Andy
> 


Okay, good idea.  I will recompile with those symbols set, but I will not
be able to do so until the machine is idle (after 5PM tonight).  BTW,
these are my current SCSI symbol defs:

CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_SD_EXTRA_DEVS=40
CONFIG_CHR_DEV_ST=y
# CONFIG_CHR_DEV_OSST is not set
# CONFIG_BLK_DEV_SR is not set
# CONFIG_CHR_DEV_SG is not set
CONFIG_SCSI_DEBUG_QUEUES=y
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_CONSTANTS=y
# CONFIG_SCSI_LOGGING is not set
CONFIG_SCSI_AIC7XXX=y
CONFIG_AIC7XXX_CMDS_PER_DEVICE=253
CONFIG_AIC7XXX_RESET_DELAY_MS=15000



