* SATA Promise TX4 Crash
@ 2005-03-21 0:18 Neil Whelchel
2005-03-21 6:59 ` Brad Campbell
0 siblings, 1 reply; 4+ messages in thread
From: Neil Whelchel @ 2005-03-21 0:18 UTC (permalink / raw)
To: linux-kernel
Hello,
I have two Promise SATA TX4 cards connected to a total of 6 Maxtor 250 GB
drives (7Y250M0) configured into a RAID 5. All works well with small
disk load, but when a large number of requests are issued, it causes a crash
similar to the attached, except that the errors before the crash are on a
different drive nearly every time. I have tried several different
motherboards with both Nvidia and Via chipsets, with Athlon and K6
CPUs, and the crash remains the same. I have also seen the same crash
with both a preemptable and a non-preemptable kernel, with kernel
versions 2.6.9, 2.6.10, 2.6.11, and 2.6.11.2 (this one).
Any suggestions, or is this a bug?
ata3: status=0x51 { DriveReady SeekComplete Error }
ata3: error=0x40 { UncorrectableError }
ata3: status=0x51 { DriveReady SeekComplete Error }
ata3: error=0x40 { UncorrectableError }
ata3: status=0x51 { DriveReady SeekComplete Error }
ata3: error=0x40 { UncorrectableError }
ata3: status=0x51 { DriveReady SeekComplete Error }
ata3: error=0x40 { UncorrectableError }
ata3: status=0x51 { DriveReady SeekComplete Error }
ata3: error=0x40 { UncorrectableError }
ata3: command timeout
Assertion failed! qc->flags & ATA_QCFLAG_ACTIVE,drivers/scsi/libata-core.c,ata_qc_complete,line=2807
ata3: status=0x51 { DriveReady SeekComplete Error }
ata3: called with no error (51)!
------------[ cut here ]------------
kernel BUG at drivers/scsi/scsi.c:299!
invalid operand: 0000 [#1]
PREEMPT
Modules linked in:
CPU: 0
EIP: 0060:[<c02a9ddb>] Not tainted VLI
EFLAGS: 00010046 (2.6.11.2)
EIP is at scsi_put_command+0xbb/0x100
eax: 00000001 ebx: c2f5e390 ecx: 00000001 edx: c2f5e390
esi: c2f5e380 edi: 00000246 ebp: c7c4beb4 esp: c7c4be9c
ds: 007b es: 007b ss: 0068
Process scsi_eh_2 (pid: 821, threadinfo=c7c4a000 task=c7c315b0)
Stack: 00000296 c7c30000 c7c26400 c7c23030 c2f5e380 00000246 c7c4bec4 c02ae9b3
c2f5e380 c44a1740 c7c4bee0 c02aeabc c2f5e380 c7c23030 c2f5e380 08000002
c44a1740 c7c4bf28 c02aedd2 c2f5e380 00000001 00000000 00000000 00000000
Call Trace:
[<c0102c12>] show_stack+0x72/0xa0
[<c0102d64>] show_registers+0x104/0x180
[<c0102f53>] die+0xd3/0x180
[<c0103330>] do_invalid_op+0x90/0xa0
[<c010282b>] error_code+0x2b/0x30
[<c02ae9b3>] scsi_next_command+0x13/0x20
[<c02aeabc>] scsi_end_request+0xbc/0xe0
[<c02aedd2>] scsi_io_completion+0x132/0x3c0
[<c02ba698>] sd_rw_intr+0xb8/0x2c0
[<c02b8420>] ata_scsi_qc_complete+0x20/0x40
[<c02b658c>] ata_qc_complete+0x2c/0xa0
[<c02b9473>] pdc_eng_timeout+0x93/0x120
[<c02b7ef4>] ata_scsi_error+0x14/0x40
[<c02add5b>] scsi_error_handler+0x5b/0xc0
[<c0100811>] kernel_thread_helper+0x5/0x14
Code: ec 8b 42 08 ff 30 e8 e5 cd e8 ff 59 5b 8b 45 f0 05 84 01 00 00 89 45
08 8d 65 f4 5b 5e 5f c9 e9 ac 41 fc ff e8 47 6c 13 00 eb ce <0f> 0b 2b 01
d6 e8 40 c0 e9 6c ff ff ff e8 33 6c 13 00 eb 8b 89
<6>note: scsi_eh_2[821] exited with preempt_count 1
-Neil Whelchel-
First Light Internet Services
760 366-0145
- We don't do Window$, that's what the janitor is for -
Bubble Memory, n.:
A derogatory term, usually referring to a person's
intelligence. See also "vacuum tube".
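For readers decoding the ata3 lines in the log above: the text in braces is just the set bits of the ATA status and error registers spelled out. A minimal Python sketch of that decoding follows. The names ATA_STATUS_BITS, ATA_ERROR_BITS and decode are made up for illustration, and the labels for bits that do not appear in this log (for example 0x20 and 0x08 in the error register) are an assumption based on the usual ATA register layout, not something taken from this thread.

```python
# Bit -> label tables for the two ATA registers printed in the log.
# Labels for bits seen in the log (0x40, 0x10, 0x01 in status; 0x40 in
# error) match the log text; the remaining labels are assumptions.
ATA_STATUS_BITS = {
    0x80: "Busy",
    0x40: "DriveReady",
    0x20: "DeviceFault",
    0x10: "SeekComplete",
    0x08: "DataRequest",
    0x04: "CorrectedError",
    0x02: "Index",
    0x01: "Error",
}

ATA_ERROR_BITS = {
    0x80: "BadCRC",
    0x40: "UncorrectableError",
    0x20: "MediaChanged",          # assumed label
    0x10: "SectorIdNotFound",
    0x08: "MediaChangeRequested",  # assumed label
    0x04: "DriveStatusError",
    0x02: "TrackZeroNotFound",
    0x01: "AddrMarkNotFound",
}


def decode(value, table):
    """Return the labels of all bits set in value, highest bit first."""
    return [name for bit, name in sorted(table.items(), reverse=True)
            if value & bit]


print("status=0x51", decode(0x51, ATA_STATUS_BITS))
print("error=0x40", decode(0x40, ATA_ERROR_BITS))
```

0x51 decodes to DriveReady, SeekComplete and Error, and 0x40 to UncorrectableError, matching the log. The decode by itself says nothing about whether the drive, the cabling or the power supply is at fault.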
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: SATA Promise TX4 Crash
2005-03-21 0:18 SATA Promise TX4 Crash Neil Whelchel
@ 2005-03-21 6:59 ` Brad Campbell
2005-03-21 9:42 ` Raphael Jacquot
0 siblings, 1 reply; 4+ messages in thread
From: Brad Campbell @ 2005-03-21 6:59 UTC (permalink / raw)
To: Neil Whelchel; +Cc: linux-kernel
Neil Whelchel wrote:
> Hello,
> I have two Promise SATA TX4 cards connected to a total of 6 Maxtor 250 GB
> drives (7Y250M0) configured into a RAID 5. All works well with small
> disk load, but when a large number of requests are issued, it causes crash
> similar to the attached, except that the errors before the crash are on a

> EFLAGS: 00010046 (2.6.11.2)
> EIP is at scsi_put_command+0xbb/0x100

Oooh Oooh Oooh, pick me Mr Kotter!
I have seen this repeatedly, fought it and "apparently" beat it by
upgrading my PSU.
I could reliably reproduce it by running a raid resync and issuing SMART
queries to the drives, but after a PSU upgrade it has gone away.
I have tried hard to reproduce it recently but I just can't get it to
crash anymore.

I have a similar setup: 4x SATA-TX4 cards and 15x 7Y250M0 drives. I
thought it was actually a bug, but as I can't reproduce it anymore it's
making it a bit hard to track down.

Not much help, sorry.

Brad
--
"Human beings, who are almost unique in having the ability
to learn from the experience of others, are also remarkable for their
apparent disinclination to do so." -- Douglas Adams
* Re: SATA Promise TX4 Crash
2005-03-21 6:59 ` Brad Campbell
@ 2005-03-21 9:42 ` Raphael Jacquot
0 siblings, 0 replies; 4+ messages in thread
From: Raphael Jacquot @ 2005-03-21 9:42 UTC (permalink / raw)
To: Brad Campbell; +Cc: Neil Whelchel, linux-kernel
Brad Campbell wrote:
> Neil Whelchel wrote:
>
>> Hello,
>> I have two Promise SATA TX4 cards connected to a total of 6 Maxtor 250 GB
>> drives (7Y250M0) configured into a RAID 5. All works well with small
>> disk load, but when a large number of requests are issued, it causes crash
>> similar to the attached, except that the errors before the crash are on a
>
>> EFLAGS: 00010046 (2.6.11.2)
>> EIP is at scsi_put_command+0xbb/0x100
>
> Oooh Oooh Oooh, pick me Mr Kotter!
> I have seen this repeatedly, fought it and "apparently" beat it by
> upgrading my PSU.
> I could reliably reproduce it by running a raid resync and issuing SMART
> queries to the drives, but after a PSU upgrade it has gone away.
> I have tried hard to reproduce it recently but I just can't get it to
> crash anymore.
>
> I have a similar setup: 4x SATA-TX4 cards and 15x 7Y250M0 drives. I
> thought it was actually a bug, but as I can't reproduce it anymore it's
> making it a bit hard to track down.
>
> Not much help, sorry.
>
> Brad

I have similar crashes with a (netbooted) EPIA and four 250 GB Seagate
7200.8 PATA drives.
Removing kernel preemption and realtime scheduling alleviates the issue
a bit, but it occurred again yesterday.
A quirk in the EPIA forces me to reboot the box by power cycling it.
[parent not found: <1111898649.185617.170980@g14g2000cwa.googlegroups.com>]
* Re: SATA Promise TX4 Crash
[not found] <1111898649.185617.170980@g14g2000cwa.googlegroups.com>
@ 2005-03-27 22:17 ` Neil Whelchel
0 siblings, 0 replies; 4+ messages in thread
From: Neil Whelchel @ 2005-03-27 22:17 UTC (permalink / raw)
To: quasiben; +Cc: linux-kernel
On Sat, 26 Mar 2005, quasiben wrote:
> Dear Neil Whelchel,
> I have been having very similar problems. However, my setup is
> somewhat different. I have an LVM logical volume that spans two disks
> (one PATA and one SATA). Did you upgrade your PSU as one person
> suggested? If so, did it work?
>
> --Benji
>
> Neil Whelchel wrote:
> > Hello,
> > I have two Promise SATA TX4 cards connected to a total of 6 Maxtor 250 GB
> > drives (7Y250M0) configured into a RAID 5. All works well with small
> > disk load, but when a large number of requests are issued, it causes crash
> > similar to the attached, except that the errors before the crash are on a
> > different drive nearly every time. I have tried several different
> > motherboards with both Nvidia and Via chipsets, with Athlon and K6
> > CPUs, and the crash remains the same. I have also seen the same crash
> > with both a preemptable and a non-preemptable kernel, with kernel
> > versions 2.6.9, 2.6.10, 2.6.11, and 2.6.11.2 (this one).
> > Any suggestions, or is this a bug?
> >
> >
> > ata3: status=0x51 { DriveReady SeekComplete Error }
> > ata3: error=0x40 { UncorrectableError }
> > ata3: status=0x51 { DriveReady SeekComplete Error }
> > ata3: error=0x40 { UncorrectableError }
> > ata3: status=0x51 { DriveReady SeekComplete Error }
> > ata3: error=0x40 { UncorrectableError }
> > ata3: status=0x51 { DriveReady SeekComplete Error }
> > ata3: error=0x40 { UncorrectableError }
> > ata3: status=0x51 { DriveReady SeekComplete Error }
> > ata3: error=0x40 { UncorrectableError }
> > ata3: command timeout
> > Assertion failed! qc->flags &
> > ATA_QCFLAG_ACTIVE,drivers/scsi/libata-core.c,ata_qc_complete,line=2807
> > ata3: status=0x51 { DriveReady SeekComplete Error }
> > ata3: called with no error (51)!
> > ------------[ cut here ]------------
> > kernel BUG at drivers/scsi/scsi.c:299!
> > invalid operand: 0000 [#1]
> > PREEMPT
> > Modules linked in:
> > CPU: 0
> > EIP: 0060:[<c02a9ddb>] Not tainted VLI
> > EFLAGS: 00010046 (2.6.11.2)
> > EIP is at scsi_put_command+0xbb/0x100
> > eax: 00000001 ebx: c2f5e390 ecx: 00000001 edx: c2f5e390
> > esi: c2f5e380 edi: 00000246 ebp: c7c4beb4 esp: c7c4be9c
> > ds: 007b es: 007b ss: 0068
> > Process scsi_eh_2 (pid: 821, threadinfo=c7c4a000 task=c7c315b0)
> > Stack: 00000296 c7c30000 c7c26400 c7c23030 c2f5e380 00000246 c7c4bec4 c02ae9b3
> > c2f5e380 c44a1740 c7c4bee0 c02aeabc c2f5e380 c7c23030 c2f5e380 08000002
> > c44a1740 c7c4bf28 c02aedd2 c2f5e380 00000001 00000000 00000000 00000000
> > Call Trace:
> > [<c0102c12>] show_stack+0x72/0xa0
> > [<c0102d64>] show_registers+0x104/0x180
> > [<c0102f53>] die+0xd3/0x180
> > [<c0103330>] do_invalid_op+0x90/0xa0
> > [<c010282b>] error_code+0x2b/0x30
> > [<c02ae9b3>] scsi_next_command+0x13/0x20
> > [<c02aeabc>] scsi_end_request+0xbc/0xe0
> > [<c02aedd2>] scsi_io_completion+0x132/0x3c0
> > [<c02ba698>] sd_rw_intr+0xb8/0x2c0
> > [<c02b8420>] ata_scsi_qc_complete+0x20/0x40
> > [<c02b658c>] ata_qc_complete+0x2c/0xa0
> > [<c02b9473>] pdc_eng_timeout+0x93/0x120
> > [<c02b7ef4>] ata_scsi_error+0x14/0x40
> > [<c02add5b>] scsi_error_handler+0x5b/0xc0
> > [<c0100811>] kernel_thread_helper+0x5/0x14
> > Code: ec 8b 42 08 ff 30 e8 e5 cd e8 ff 59 5b 8b 45 f0 05 84 01 00 00 89 45
> > 08 8d 65 f4 5b 5e 5f c9 e9 ac 41 fc ff e8 47 6c 13 00 eb ce <0f> 0b 2b 01
> > d6 e8 40 c0 e9 6c ff ff ff e8 33 6c 13 00 eb 8b 89
> > <6>note: scsi_eh_2[821] exited with preempt_count 1
> >
> >
> > -Neil Whelchel-
> > First Light Internet Services
> > 760 366-0145
> > - We don't do Window$, that's what the janitor is for -
> >
> > Bubble Memory, n.:
> > A derogatory term, usually referring to a person's
> > intelligence. See also "vacuum tube".
>

Hello,
Yes, I did replace the PSU about 6 times. I had the same problem with 4
similar machines (all the same), and in one of them I tried two other
different power supplies, so there were a total of three completely
different supplies tested. All of them were 450 Watts, except for one
500 Watt that I tested.
While my 'feelings' tell me that the PSU is the issue, I have been
looking more at grounding and the SATA cables than anything else.
But there is one HUGE however here... If there is a communication
failure, it should not cause a crash in the kernel; this should be fixed!

-Neil Whelchel-
First Light Internet Services
760 366-0145
- We don't do Window$, that's what the janitor is for -
Bubble Memory, n.:
A derogatory term, usually referring to a person's
intelligence. See also "vacuum tube".
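The call trace in the original report is printed innermost frame first, which makes the path into the BUG a little hard to follow. Below is a small Python sketch, assuming the frame format shown in that oops, that drops the oops-reporting frames and prints the rest in call order. FRAME, OOPS_FRAMES and call_order are hypothetical helper names, and traces like this one can contain stale stack entries, so treat the result as a reading aid rather than an exact call chain.

```python
import re

# Call trace copied from the original report in this thread.
TRACE = """\
[<c0102c12>] show_stack+0x72/0xa0
[<c0102d64>] show_registers+0x104/0x180
[<c0102f53>] die+0xd3/0x180
[<c0103330>] do_invalid_op+0x90/0xa0
[<c010282b>] error_code+0x2b/0x30
[<c02ae9b3>] scsi_next_command+0x13/0x20
[<c02aeabc>] scsi_end_request+0xbc/0xe0
[<c02aedd2>] scsi_io_completion+0x132/0x3c0
[<c02ba698>] sd_rw_intr+0xb8/0x2c0
[<c02b8420>] ata_scsi_qc_complete+0x20/0x40
[<c02b658c>] ata_qc_complete+0x2c/0xa0
[<c02b9473>] pdc_eng_timeout+0x93/0x120
[<c02b7ef4>] ata_scsi_error+0x14/0x40
[<c02add5b>] scsi_error_handler+0x5b/0xc0
[<c0100811>] kernel_thread_helper+0x5/0x14
"""

# Matches "[<address>] symbol+0xoffset/0xsize" and captures the symbol.
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")

# Frames that belong to printing the oops itself, not to the failing path.
OOPS_FRAMES = {"show_stack", "show_registers", "die",
               "do_invalid_op", "error_code"}


def call_order(trace):
    """Return symbol names outermost caller first, without oops frames."""
    names = [n for n in FRAME.findall(trace) if n not in OOPS_FRAMES]
    return list(reversed(names))


print(" -> ".join(call_order(TRACE)))
```

Read this way, the trace suggests the command was being completed from pdc_eng_timeout while the SCSI error handler thread (scsi_eh_2) was running, and the BUG fired in scsi_put_command during that completion. That would be consistent with the failed ATA_QCFLAG_ACTIVE assertion logged just before the oops, though nothing in this thread confirms a root cause.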
Thread overview: 4+ messages
2005-03-21 0:18 SATA Promise TX4 Crash Neil Whelchel
2005-03-21 6:59 ` Brad Campbell
2005-03-21 9:42 ` Raphael Jacquot
[not found] <1111898649.185617.170980@g14g2000cwa.googlegroups.com>
2005-03-27 22:17 ` Neil Whelchel