* Bug? ahci.c
From: James Ray @ 2007-01-19 12:21 UTC
To: jgarzik; +Cc: linux-ide
Jeff, All,
We have just started using some new machines with the Intel ESB2
chipset in them (PCI ID: 8086:2681) for the SATA controller.
With any kernel above 2.6.15 (as far as I can tell from my testing at
least) we are getting problems with them.
This is the output I have managed to collect from the server (it's hard
to get much more since the system gets more and more unresponsive as
these errors progress):
<3>ata2.00: exception Emask 0x10 SAct 0x0 SErr 0x80002 action 0x2 frozen
<3>ata2.00: (irq_stat 0x08000000, interface fatal error)
<3>ata2.00: tag 0 cmd 0x35 Emask 0x10 stat 0x50 err 0x0 (ATA bus error)
<6>ata2: soft resetting port
<3>ata2: softreset failed (1st FIS failed)
<4>ata2: softreset failed, retrying in 5 secs
<6>ata2: hard resetting port
<4>ata2: port is slow to respond, please be patient
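For readers following the log above, the SErr value can be decoded bit by bit. The sketch below is not part of the original report; it assumes the standard SATA SError register layout, and the function name and the wording of the bit labels are illustrative only. For the reported value 0x80002 it yields a recovered communications error plus a 10b-to-8b decode error, which would be consistent with the "interface fatal error" line (irq_stat 0x08000000 is bit 27 of the AHCI port interrupt status).

```python
# Informal labels for the SATA SError register bits (ERR bits 0-15, DIAG bits 16-31).
# The bit positions follow the SATA specification; the label text is illustrative.
SERR_BITS = {
    0: "recovered data integrity error",
    1: "recovered communications error",
    8: "transient data integrity error",
    9: "persistent communication or data integrity error",
    10: "protocol error",
    11: "internal error",
    16: "PhyRdy change",
    17: "PHY internal error",
    18: "comm wake",
    19: "10b-to-8b decode error",
    20: "disparity error",
    21: "CRC error",
    22: "handshake error",
    23: "link sequence error",
    24: "transport state transition error",
    25: "unknown FIS type",
    26: "device exchanged",
}


def decode_serr(value):
    """Return the labels of all known bits set in an SError value, lowest bit first."""
    return [name for bit, name in sorted(SERR_BITS.items()) if value & (1 << bit)]


if __name__ == "__main__":
    # SErr value taken from the log in this message.
    for name in decode_serr(0x80002):
        print(name)
```

An SErr of 0x0, as in a plain command timeout, decodes to an empty list.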
I have found some other people with similar problems and this is their
output:
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: tag 0 cmd 0xb0 Emask 0x4 stat 0x40 err 0x0 (timeout)
ata2: soft resetting port
ata2: softreset failed (port busy but CLO unavailable)
ata2: softreset failed, retrying in 5 secs
ata2: hard resetting port
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: port failed to respond (30 secs, Status 0x80)
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: hard resetting port
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: configured for UDMA/133
ata2: EH complete
SCSI device sdb: 781422768 512-byte hdwr sectors (400088 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
This second bunch of output matches mine pretty much word for word
(except different size disks) as far as I can tell!
I am trying to get a detailed output from the machine from the console
server and will follow up with this and forward on my exact logs if I
can get some.
Interestingly, I only get this behaviour when I am doing software RAID on
the machine (not via the hardware, but via the Linux MD support). I
don't know if this is useful information, but I don't see the problem when
running non-mirrored disks or when just doing something like the following:
dd if=/dev/urandom of=/dev/sd[a|b] bs=1024k
Even to both disks at the same time.
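The concurrent write test described above could be scripted roughly as follows. This is only a sketch, not what was actually run: the function names are hypothetical, and unlike dd without count= (which runs until the device is full) it writes a bounded number of bytes per target. Pointing it at /dev/sda or /dev/sdb destroys the data on those disks, so try it on scratch files first.

```python
import os
import threading


def write_random(target, total_bytes, block_size=1024 * 1024):
    """Write total_bytes of random data to target in block_size chunks."""
    written = 0
    with open(target, "wb") as out:
        while written < total_bytes:
            chunk = os.urandom(min(block_size, total_bytes - written))
            out.write(chunk)
            written += len(chunk)
    return written


def write_all(targets, total_bytes):
    """Write to every target at the same time, one thread per target."""
    results = {}

    def worker(target):
        results[target] = write_random(target, total_bytes)

    threads = [threading.Thread(target=worker, args=(t,)) for t in targets]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

A call such as write_all(["/tmp/a.img", "/tmp/b.img"], 64 * 1024 * 1024) exercises the same code path harmlessly; the paths here are placeholders.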
Can you throw any light on this at all? Suggest any work around?
I'm happy to try and provide any other information I can.
--
James Ray. <j.ray@qmul.ac.uk>
Computing Services
Queen Mary, University of London
* Re: Bug? ahci.c
From: Tejun Heo @ 2007-02-27 15:20 UTC
To: James Ray; +Cc: jgarzik, linux-ide
James Ray wrote:
> Jeff, All,
> We have just started using some new machines with the Intel ESB2
> chipset in them (PCI ID: 8086:2681) for the SATA controller.
> With any kernel above 2.6.15 (as far as I can tell from my testing at
> least) we are getting problems with them.
>
> This is the output I have managed to collect from the server (it's hard
> to get much more since the system gets more and more unresponsive as
> these errors progress):
> <3>ata2.00: exception Emask 0x10 SAct 0x0 SErr 0x80002 action 0x2 frozen
> <3>ata2.00: (irq_stat 0x08000000, interface fatal error)
> <3>ata2.00: tag 0 cmd 0x35 Emask 0x10 stat 0x50 err 0x0 (ATA bus error)
> <6>ata2: soft resetting port
> <3>ata2: softreset failed (1st FIS failed)
> <4>ata2: softreset failed, retrying in 5 secs
> <6>ata2: hard resetting port
> <4>ata2: port is slow to respond, please be patient
Can you test 2.6.20.1 and report full dmesg? Thanks.
--
tejun
* Re: Bug? ahci.c
From: James Ray @ 2007-02-27 15:56 UTC
To: Tejun Heo; +Cc: jgarzik, linux-ide
Tejun Heo wrote:
> James Ray wrote:
>> Jeff, All,
>> We have just started using some new machines with the Intel ESB2
>> chipset in them (PCI ID: 8086:2681) for the SATA controller.
>> With any kernel above 2.6.15 (as far as I can tell from my testing at
>> least) we are getting problems with them.
>>
>> This is the output I have managed to collect from the server (it's hard
>> to get much more since the system gets more and more unresponsive as
>> these errors progress):
>> <3>ata2.00: exception Emask 0x10 SAct 0x0 SErr 0x80002 action 0x2 frozen
>> <3>ata2.00: (irq_stat 0x08000000, interface fatal error)
>> <3>ata2.00: tag 0 cmd 0x35 Emask 0x10 stat 0x50 err 0x0 (ATA bus error)
>> <6>ata2: soft resetting port
>> <3>ata2: softreset failed (1st FIS failed)
>> <4>ata2: softreset failed, retrying in 5 secs
>> <6>ata2: hard resetting port
>> <4>ata2: port is slow to respond, please be patient
>
> Can you test 2.6.20.1 and report full dmesg? Thanks.
>
I managed to get a response from Jeff on the linux-kernel list which I
have yet to respond to (sorry!).
This problem was caused by dodgy NCQ on the disk itself, S/N:
WMANM2828677 P/N: WD1600JS-60MHB1 (Western Digital 160 GB). If I disabled
the disk, or replaced the disks with some Seagates, it worked
fine!
I am running a 2.6.19 at the moment quite happily. Still want me to test
a 2.6.20.1? I might be able to get a bit of time later on this week.
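For anyone who suspects NCQ on a similar setup, one common way to test it is to drop the device queue depth to 1, which effectively turns NCQ off for that disk. The thread does not say this was done here; the sketch below assumes a kernel that exposes a writable /sys/block/<disk>/device/queue_depth attribute, and the helper names are made up for illustration. Writing the attribute needs root.

```python
from pathlib import Path


def queue_depth_path(disk, sysfs_root="/sys"):
    """Build the path of the queue_depth attribute for a disk such as 'sdb'."""
    return Path(sysfs_root) / "block" / disk / "device" / "queue_depth"


def read_queue_depth(disk, sysfs_root="/sys"):
    """Return the current queue depth as an integer."""
    return int(queue_depth_path(disk, sysfs_root).read_text().strip())


def disable_ncq(disk, sysfs_root="/sys"):
    """Set the queue depth to 1 and return the previous depth.

    Assumption: a depth of 1 means commands are issued one at a time,
    so the drive's NCQ implementation is no longer exercised.
    """
    path = queue_depth_path(disk, sysfs_root)
    previous = int(path.read_text().strip())
    path.write_text("1\n")
    return previous
```

The sysfs_root parameter exists only so the helpers can be tried against a scratch directory before touching a real system. The setting does not survive a reboot.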
--
James Ray. <j.ray@qmul.ac.uk>
Computing Services
Queen Mary, University of London
* Re: Bug? ahci.c
From: Tejun Heo @ 2007-02-27 16:05 UTC
To: James Ray; +Cc: jgarzik, linux-ide
James Ray wrote:
> I managed to get a response from Jeff on the linux-kernel list which I
> have yet to respond to (sorry!).
>
> This problem was caused by dodgy NCQ on the disk itself, S/N:
> WMANM2828677 P/N: WD1600JS-60MHB1 (Western Digital 160 GB). If I disabled
> the disk, or replaced the disks with some Seagates, it worked
> fine!
>
> I am running a 2.6.19 at the moment quite happily. Still want me to test
> a 2.6.20.1? I might be able to get a bit of time later on this week.
Nope, as long as you're happy. :-)
--
tejun