* Scsi debug help requested. What's happening here?
From: JF @ 2005-09-26 16:59 UTC (permalink / raw)
To: linux-scsi
Good day,
I have an old Dell workstation running RedHat Enterprise 3 Rel5 with an
onboard Adaptec 7899 and an Adaptec 29160. The 29160 connects externally to a
Promise UltraTrak100 TX8 (external SCSI-to-ATA RAID). Inside the UltraTrak
there are 8 Western Digital WD120JB drives; this gives me about 833GB of
RAID5 storage:
SCSI subsystem driver Revision: 1.00
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
<Adaptec 29160 Ultra160 SCSI adapter>
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
<Adaptec aic7899 Ultra160 SCSI adapter>
aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
scsi2 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
<Adaptec aic7899 Ultra160 SCSI adapter>
aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs
blk: queue f7fc8618, I/O limit 4095Mb (mask 0xffffffff)
(scsi1:A:0): 160.000MB/s transfers (80.000MHz DT, offset 127, 16bit)
(scsi0:A:0): 80.000MB/s transfers (40.000MHz, offset 16, 16bit)
(scsi1:A:1): 160.000MB/s transfers (80.000MHz DT, offset 127, 16bit)
Vendor: Promise Model: 8 Disk RAID5 Rev: 1.10
Type: Direct-Access ANSI SCSI revision: 03
blk: queue f7fc8418, I/O limit 4095Mb (mask 0xffffffff)
scsi0:A:0:0: Tagged Queuing enabled. Depth 32
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
SCSI device sda: 1626952320 512-byte hdwr sectors (833000 MB)
Partition check:
sda: sda1
blk: queue f7fcb018, I/O limit 4095Mb (mask 0xffffffff)
Vendor: QUANTUM Model: ATLAS10K2-TY367L Rev: DA40
Type: Direct-Access ANSI SCSI revision: 03
blk: queue f7fcce18, I/O limit 4095Mb (mask 0xffffffff)
Vendor: FUJITSU Model: MAJ3364MP Rev: 5509
Type: Direct-Access ANSI SCSI revision: 03
blk: queue f7fd6018, I/O limit 4095Mb (mask 0xffffffff)
scsi1:A:0:0: Tagged Queuing enabled. Depth 32
scsi1:A:1:0: Tagged Queuing enabled. Depth 32
Attached scsi disk sdb at scsi1, channel 0, id 0, lun 0
Attached scsi disk sdc at scsi1, channel 0, id 1, lun 0
SCSI device sdb: 71132959 512-byte hdwr sectors (36420 MB)
sdb: sdb1 sdb2 sdb3 < sdb5 sdb6 sdb7 sdb8 sdb9 >
SCSI device sdc: 71132959 512-byte hdwr sectors (36420 MB)
sdc: sdc1
blk: queue f77c0c18, I/O limit 4095Mb (mask 0xffffffff)
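As a quick sanity check on the sizes in the boot log above, here is a minimal sketch that converts the reported sector counts to decimal megabytes and works out the nominal RAID5 capacity. The function names are made up for illustration, and the 120 GB per-drive figure is an assumption taken from the drive model name, not from the log.

```python
SECTOR_BYTES = 512  # sector size as reported by the kernel ("512-byte hdwr sectors")

def sectors_to_mb(sectors, sector_bytes=SECTOR_BYTES):
    """Convert a sector count to decimal megabytes (10**6 bytes), rounded."""
    return round(sectors * sector_bytes / 10**6)

def raid5_usable_gb(drive_count, drive_gb):
    """RAID5 keeps one drive's worth of parity, so usable space is (n - 1) drives."""
    return (drive_count - 1) * drive_gb

# Sector counts copied from the "SCSI device sdX" lines above.
devices = {"sda": 1626952320, "sdb": 71132959, "sdc": 71132959}

if __name__ == "__main__":
    for name, sectors in devices.items():
        print(name, sectors_to_mb(sectors), "MB")
    # Assumption: 8 drives at a nominal 120 GB each.
    print("nominal RAID5 usable:", raid5_usable_gb(8, 120), "GB")
```

The nominal figure comes out a little above the 833000 MB the array actually presents; the log does not say where the difference goes.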
Anyhow, twice in the past two weeks the array has panicked (or caused the
kernel to panic) and gone offline. I haven't lost a byte of data, and
rebooting both the array and the host clears everything up, but twice is now
a pattern for me. /var/log/messages has the following in it:
Sep 25 00:51:24 aztec kernel: scsi0:0:0:0: Attempting to queue an ABORT
message
Sep 25 00:51:24 aztec kernel: scsi0: At time of recovery, card was not
paused
Sep 25 00:51:24 aztec kernel: scsi0: Dumping Card State while idle, at
SEQADDR 0x9
Sep 25 00:51:24 aztec kernel: SCSIPHASE[0x0] SCSISIGI[0x0] ERROR[0x0]
SCSIBUSL[0x0]
Sep 25 00:51:24 aztec kernel: LASTPHASE[0x1] SCSISEQ[0x12] SBLKCTL[0xa]
SCSIRATE[0x0]
Sep 25 00:51:24 aztec kernel: 0 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x5]
Sep 25 00:51:24 aztec kernel: 1 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x16]
Sep 25 00:51:24 aztec kernel: 2 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x3]
Sep 25 00:51:24 aztec kernel: 3 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x1c]
Sep 25 00:51:24 aztec kernel: 4 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x12]
Sep 25 00:51:24 aztec kernel: 5 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x7]
Sep 25 00:51:24 aztec kernel: 6 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x17]
Sep 25 00:51:24 aztec kernel: 7 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x9]
Sep 25 00:51:24 aztec kernel: 8 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x14]
Sep 25 00:51:24 aztec kernel: 9 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x0]
Sep 25 00:51:24 aztec kernel: 10 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x1a]
Sep 25 00:51:24 aztec kernel: 11 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0xf]
Sep 25 00:51:24 aztec kernel: 12 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x22]
Sep 25 00:51:24 aztec kernel: 13 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x8]
Sep 25 00:51:24 aztec kernel: 14 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x19]
Sep 25 00:51:24 aztec kernel: 15 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x18]
Sep 25 00:51:24 aztec kernel: 16 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x11]
Sep 25 00:51:24 aztec kernel: 17 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0xc]
Sep 25 00:51:24 aztec kernel: 18 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x23]
Sep 25 00:51:24 aztec kernel: 19 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x1b]
Sep 25 00:51:24 aztec kernel: 20 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0xe]
Sep 25 00:51:24 aztec kernel: 21 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x10]
Sep 25 00:51:24 aztec kernel: 22 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0xb]
Sep 25 00:51:24 aztec kernel: 23 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0xd]
Sep 25 00:51:24 aztec kernel: 24 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x1d]
Sep 25 00:51:24 aztec kernel: 25 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x13]
Sep 25 00:51:24 aztec kernel: 26 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x21]
Sep 25 00:51:24 aztec kernel: 27 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x2]
Sep 25 00:51:24 aztec kernel: 28 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x4]
Sep 25 00:51:24 aztec kernel: 29 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x1]
Sep 25 00:51:24 aztec kernel: 30 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0xa]
Sep 25 00:51:24 aztec kernel: 31 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x15]
Sep 25 00:51:25 aztec kernel: 4 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 25 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 7 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 23 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 1 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 8 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 22 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 12 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 2 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 13 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 0 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 15 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 28 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 5 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 14 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 35 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 11 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 16 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 33 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 9 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 24 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 3 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 21 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 34 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 19 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 26 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 29 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 10 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 18 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 27 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 17 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: 20 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
Sep 25 00:51:25 aztec kernel: (scsi0:A:0:0): Device is disconnected,
re-queuing SCB
Sep 25 00:51:25 aztec kernel: (scsi0:A:0:0): Abort Tag Message Sent
Sep 25 00:51:25 aztec kernel: (scsi0:A:0:0): SCB 23 - Abort Tag Completed.
Sep 25 00:51:34 aztec kernel: scsi0:0:0:0: Attempting to queue an ABORT
message
Sep 25 00:51:34 aztec kernel: scsi0: At time of recovery, card was not
paused
Sep 25 00:51:34 aztec kernel: scsi0: Dumping Card State while idle, at
SEQADDR 0x16b
Sep 25 00:51:34 aztec kernel: SCSIPHASE[0x0] SCSISIGI[0x14] ERROR[0x0]
SCSIBUSL[0x0]
Sep 25 00:51:24 aztec kernel: scsi0:0:0:0: Attempting to queue an ABORT
message
Sep 25 00:51:24 aztec kernel: scsi0: At time of recovery, card was not
paused
Sep 25 00:51:24 aztec kernel: scsi0: Dumping Card State while idle, at
SEQADDR 0x9
and so on and so forth.
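Since the dump above is long and the mail client has wrapped each syslog line, a small script can make it easier to read. What follows is only a sketch, not a diagnostic tool: it glues wrapped continuation lines back onto the line they belong to and tallies the SCB entries by their SCB_CONTROL value. The helper names and the three-entry sample are illustrative; the sample lines are copied from the excerpt above.

```python
import re
from collections import Counter

# A few entries copied from the log excerpt, including the mail wrapping.
SAMPLE = """\
Sep 25 00:51:24 aztec kernel: 0 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x5]
Sep 25 00:51:24 aztec kernel: 1 SCB_CONTROL[0x64] SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x16]
Sep 25 00:51:25 aztec kernel: 4 SCB_CONTROL[0x60] SCB_SCSIID[0x7]
SCB_LUN[0x0]
"""

# A syslog record starts with a timestamp such as "Sep 25 00:51:24 ".
STAMP = re.compile(r"^[A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d ")
# An SCB entry: index, then the SCB_CONTROL value in brackets.
SCB = re.compile(r"kernel: +(\d+) SCB_CONTROL\[(0x[0-9a-f]+)\]")

def rejoin(text):
    """Append wrapped continuation lines to the preceding timestamped line."""
    records = []
    for line in text.splitlines():
        if STAMP.match(line) or not records:
            records.append(line.strip())
        else:
            records[-1] += " " + line.strip()
    return records

def tally_scb_control(text):
    """Count SCB entries per SCB_CONTROL value."""
    counts = Counter()
    for record in rejoin(text):
        match = SCB.search(record)
        if match:
            counts[match.group(2)] += 1
    return counts

if __name__ == "__main__":
    for value, count in sorted(tally_scb_control(SAMPLE).items()):
        print(value, count)
```

To run it over the real log, replace SAMPLE with the contents of the saved excerpt, for example by reading a copy of the relevant part of /var/log/messages from a file.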
Since I am not a SCSI engineer, I have no idea what the problem is. I
replaced the 29160 the first time, but it happened again. I suspect the
obvious, that is, that the UltraTrak is the problem, but I cannot tell whether
it is a drive in the array or the whole unit that is dying. The UltraTrak's
limited display says the array is 'functional.'
Is anyone out there able to make heads or tails of these kernel messages?
I'd really appreciate it.
Thanks,
JF