* Strange SCSI behavior?
@ 2001-08-17 12:26 Jon Lapham
0 siblings, 0 replies; 4+ messages in thread
From: Jon Lapham @ 2001-08-17 12:26 UTC (permalink / raw)
To: linux-kernel
Hello-
I'm running a heavily used (~10-40 simultaneous users) NFS/smb/email
server on which I recently installed a new SCSI HD (Atlas V 18GB). The
system is a PIII 450, 256MB RAM, 2940U2W SCSI controller, running kernel
v2.4.8 (I've also tried older kernels) with the new aic7xxx
driver. The fs is ext2.
What I'm seeing is SCSI "sense key Hardware Error" messages on the new HD
during tape backups (HPC1554 DDS-3 drive) scheduled at night (when the
system is otherwise unused):
SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 8000002
Info fld=0x214a55b, Current sd08:11: sense key Hardware Error
I/O error: dev 08:11, sector 34907416
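The two hex values in the log lines above can be unpacked a little further. The following is a minimal sketch, not taken from the thread: it assumes the 2.4-era SCSI layer packs the result word as driver byte, host byte, message byte and status byte (high to low), and that the info field is printed in hex. The function name `split_result` is made up for illustration.

```python
# Sketch: unpack the "return code" and "Info fld" values from the log above.
# Assumption: result = (driver << 24) | (host << 16) | (msg << 8) | status.

def split_result(result: int) -> dict:
    """Split a packed SCSI result word into its four bytes."""
    return {
        "status": result & 0xFF,          # raw SCSI status byte
        "msg": (result >> 8) & 0xFF,      # message byte
        "host": (result >> 16) & 0xFF,    # host adapter byte
        "driver": (result >> 24) & 0xFF,  # mid-level driver byte
    }

# "return code = 8000002" is printed in hex.
fields = split_result(0x8000002)

# "Info fld=0x214a55b" converted to decimal for comparison with the sector.
info_lba = int("214a55b", 16)

print(fields)    # status 0x02, driver 0x08, host and msg both 0
print(info_lba)  # 34907483
```

Under that assumption the status byte 0x02 is CHECK CONDITION and the driver byte 0x08 indicates that sense data is available, which says little beyond "the disk reported an error"; the additional sense code is what would narrow it down. The info field works out to 34907483, which is 67 sectors above the sector in the "I/O error" line. One possible reading is that the kernel's figure is relative to the start of the partition, but that is a guess.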
Sounds like a bad HD, right? Well, I've seen bad SCSI disks before, and
this seems different. These messages *only* appear during tape backups,
not during the day when the machine is under *heavy* I/O load to
that HD. It is *only* when the DAT tape gets involved that I see these
messages. I should also say that the files corresponding to the affected
sectors in the error messages are fine; they are not corrupted.
Suggestions? What can I do to track this down?
TIA, Jon
[root@office /root]# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: SEAGATE Model: ST39175LW Rev: 0001
Type: Direct-Access ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 01 Lun: 00
Vendor: QUANTUM Model: ATLAS_V_18_WLS Rev: 0230
Type: Direct-Access ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 03 Lun: 00
Vendor: HP Model: C1537A Rev: L708
Type: Sequential-Access ANSI SCSI revision: 02
[root@office /root]# cat /proc/scsi/aic7xxx/0
Adaptec AIC7xxx driver version: 6.1.13
aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/255 SCBs
Channel A Target 0 Negotiation Settings
User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Goal: 40.000MB/s transfers (20.000MHz, offset 15, 16bit)
Curr: 40.000MB/s transfers (20.000MHz, offset 15, 16bit)
Channel A Target 0 Lun 0 Settings
Commands Queued 760609
Commands Active 0
Command Openings 52
Max Tagged Openings 253
Device Queue Frozen Count 0
Channel A Target 1 Negotiation Settings
User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Goal: 40.000MB/s transfers (20.000MHz, offset 63, 16bit)
Curr: 40.000MB/s transfers (20.000MHz, offset 63, 16bit)
Channel A Target 1 Lun 0 Settings
Commands Queued 790561
Commands Active 0
Command Openings 64
Max Tagged Openings 253
Device Queue Frozen Count 0
Channel A Target 2 Negotiation Settings
User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 3 Negotiation Settings
User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Goal: 10.000MB/s transfers (10.000MHz, offset 32)
Curr: 10.000MB/s transfers (10.000MHz, offset 32)
Channel A Target 3 Lun 0 Settings
Commands Queued 3200334
Commands Active 0
Command Openings 1
Max Tagged Openings 0
Device Queue Frozen Count 0
Channel A Target 4 Negotiation Settings
User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 5 Negotiation Settings
User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 6 Negotiation Settings
User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 7 Negotiation Settings
User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 8 Negotiation Settings
User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 9 Negotiation Settings
User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 10 Negotiation Settings
User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 11 Negotiation Settings
User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 12 Negotiation Settings
User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 13 Negotiation Settings
User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 14 Negotiation Settings
User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
Channel A Target 15 Negotiation Settings
User: 80.000MB/s transfers (40.000MHz, offset 255, 16bit)
* RE: Strange SCSI behavior?
@ 2001-08-17 12:46 Cress, Andrew R
2001-08-17 13:25 ` Jon Lapham
0 siblings, 1 reply; 4+ messages in thread
From: Cress, Andrew R @ 2001-08-17 12:46 UTC (permalink / raw)
To: 'lapham@extracta.com.br', linux-kernel
Jon,
You really need to know what the additional sense data shows.
DAT drives often use variable-length blocks, and some UNIX commands get
errors as a result. Or it may be something that could be
fixed with a firmware update to the DAT drive, or a driver fix. It depends
on the details. Is sd08:11 the DAT drive?
Make sure
CONFIG_SCSI_LOGGING=y
CONFIG_SCSI_DEBUG=m (or =y)
in your kernel, and issue
echo "scsi log error 3" > /proc/scsi/scsi
and rerun the tape backup to get more info.
Andy
* Re: Strange SCSI behavior?
2001-08-17 12:46 Cress, Andrew R
@ 2001-08-17 13:25 ` Jon Lapham
0 siblings, 0 replies; 4+ messages in thread
From: Jon Lapham @ 2001-08-17 13:25 UTC (permalink / raw)
To: Cress, Andrew R; +Cc: linux-kernel
Andrew-
I don't know if this helps, but this is the message tar spits out when
this problem occurs:
/bin/tar: home/pedro/COL&AMk-TA: Read error at byte 21958656, reading
10240 byte: Input/output error
Cress, Andrew R wrote:
> Jon,
>
> You really need to know what the additional sense data shows.
> With DAT tapes often they have variable length block sizes and get errors
> from some UNIX commands as a result. Or, it may be something that could be
> fixed with a firmware update to the DAT drive, or a driver fix. It depends
> on the details. Is sd08:11 the DAT drive?
Hmmm... I have no idea! I know that "host 0 channel 0 id 1 lun 0"
refers to the new Atlas HD, and I know that I have another HD with id 0,
and that the tape drive is id 3. I do not know how to interpret
"sd08:11", suggestions?
Maybe 'sd08' refers to the block device number?
[root@office sysadm]# cat /proc/devices
[snip]
Block devices:
2 fd
7 loop
8 sd
22 ide1
65 sd
66 sd
So, one of the three SCSI devices uses block 8... but which? I don't know.
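For reference, the `08:11` pair can be decoded mechanically. This is a minimal sketch, not something posted in the thread: it assumes the kernel prints the device as hexadecimal major:minor, and that the sd driver on major 8 uses the classic layout of 16 minors per disk (minor 0 for the whole disk, 1 to 15 for partitions). The helper name `sd_name` is hypothetical.

```python
# Sketch: turn a "major:minor" pair from a 2.4 kernel message into an sd name.
# Assumptions: both numbers are hexadecimal, and major 8 holds 16 disks with
# 16 minors each (whole disk plus up to 15 partitions).

MINORS_PER_DISK = 16

def sd_name(dev: str) -> str:
    """Map e.g. '08:11' to a device name such as 'sdb1'."""
    major_hex, minor_hex = dev.split(":")
    major, minor = int(major_hex, 16), int(minor_hex, 16)
    if major != 8:
        # Majors 65 and 66 also appear as "sd" in /proc/devices, but they
        # are left out of this sketch to keep it short.
        raise ValueError(f"major {major} is not handled here")
    disk, part = divmod(minor, MINORS_PER_DISK)
    name = "sd" + chr(ord("a") + disk)
    return name if part == 0 else f"{name}{part}"

print(sd_name("08:11"))  # sdb1
```

Minor 0x11 is 17, so this comes out as the first partition on the second SCSI disk. If disks are named in the order they are listed in /proc/scsi/scsi, that is the Atlas at id 1, which agrees with the "id 1" in the error line. The tape drive is a character device handled by st, so it would not show up under an sd name.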
>
> Make sure
> CONFIG_SCSI_LOGGING=y
> CONFIG_SCSI_DEBUG=m (or =y)
> in your kernel, and issue
> echo "scsi log error 3" > /proc/scsi/scsi
> and rerun the tape backup to get more info.
>
> Andy
>
Okay, good idea. I will recompile setting those symbols, but I will not
be able to do so until after the machine is idle (>5PM tonight). BTW,
these are my current SCSI symbol defs:
CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_SD_EXTRA_DEVS=40
CONFIG_CHR_DEV_ST=y
# CONFIG_CHR_DEV_OSST is not set
# CONFIG_BLK_DEV_SR is not set
# CONFIG_CHR_DEV_SG is not set
CONFIG_SCSI_DEBUG_QUEUES=y
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_CONSTANTS=y
# CONFIG_SCSI_LOGGING is not set
CONFIG_SCSI_AIC7XXX=y
CONFIG_AIC7XXX_CMDS_PER_DEVICE=253
CONFIG_AIC7XXX_RESET_DELAY_MS=15000
* RE: Strange SCSI behavior?
@ 2001-08-17 13:03 Randal, Phil
0 siblings, 0 replies; 4+ messages in thread
From: Randal, Phil @ 2001-08-17 13:03 UTC (permalink / raw)
To: linux-kernel
The most common cause of Sense Key errors on DDS drives is
dirty heads. Try running a cleaning tape through the drive
a couple of times. No, once is not enough, in my experience.
Cheers,
Phil
---------------------------------------------
Phil Randal
Network Engineer
Herefordshire Council
Hereford, UK