* sata_mv: hard resetting port
@ 2007-11-14 9:43 Tomasz Chmielewski
2007-11-14 15:27 ` Mark Lord
0 siblings, 1 reply; 10+ messages in thread
From: Tomasz Chmielewski @ 2007-11-14 9:43 UTC (permalink / raw)
To: Linux IDE
These are the log entries I saw as I started e2fsck on a 5-disk Linux
RAID-5 array; all disks connected to a sata_mv controller:
Nov 13 18:57:33 superthecus kernel: ata6.00: exception Emask 0x1 SAct
0x0 SErr 0x100000 action 0x6 frozen
Nov 13 18:57:34 superthecus kernel: ata6.00: edma_err 0x00000084, EDMA
self-disable
Nov 13 18:57:34 superthecus kernel: ata6.00: cmd
25/00:00:bf:09:24/00:02:1c:00:00/e0 tag 0 cdb 0x0 data 262144 in
Nov 13 18:57:34 superthecus kernel: res
51/84:00:bf:09:24/84:02:1c:00:00/e0 Emask 0x10 (ATA bus error)
Nov 13 18:57:34 superthecus kernel: ata6: hard resetting port
Nov 13 18:57:34 superthecus kernel: ata6: SATA link up 1.5 Gbps (SStatus
113 SControl 300)
Nov 13 18:57:34 superthecus kernel: ata6.00: configured for UDMA/133
Nov 13 18:57:34 superthecus kernel: ata6: EH complete
Nov 13 18:57:34 superthecus kernel: sd 5:0:0:0: [sde] 781422768 512-byte
hardware sectors (400088 MB)
Nov 13 18:57:34 superthecus kernel: sd 5:0:0:0: [sde] Write Protect is off
Nov 13 18:57:34 superthecus kernel: sd 5:0:0:0: [sde] Mode Sense: 00 3a
00 00
Nov 13 18:57:34 superthecus kernel: sd 5:0:0:0: [sde] Write cache:
enabled, read cache: enabled, doesn't support DPO or FUA
Nov 13 18:57:37 superthecus kernel: ata6: exception Emask 0x10 SAct 0x0
SErr 0x180000 action 0x6 frozen
Nov 13 18:57:37 superthecus kernel: ata6: edma_err 0x00000020
Nov 13 18:57:37 superthecus kernel: ata6: hard resetting port
Nov 13 18:57:37 superthecus kernel: ata6: SATA link up 1.5 Gbps (SStatus
113 SControl 300)
Nov 13 18:57:37 superthecus kernel: ata6.00: configured for UDMA/133
Nov 13 18:57:37 superthecus kernel: ata6: EH pending after completion,
repeating EH (cnt=4)
Nov 13 18:57:37 superthecus kernel: ata6: exception Emask 0x10 SAct 0x0
SErr 0x4010000 action 0x7
Nov 13 18:57:37 superthecus kernel: ata6: edma_err 0x00000010, dev connect
Nov 13 18:57:37 superthecus kernel: ata6: hard resetting port
Nov 13 18:57:37 superthecus kernel: ata6: SATA link up 1.5 Gbps (SStatus
113 SControl 300)
Nov 13 18:57:38 superthecus kernel: ata6.00: configured for UDMA/133
Nov 13 18:57:38 superthecus kernel: ata6: EH complete
Nov 13 18:57:38 superthecus kernel: sd 5:0:0:0: [sde] 781422768 512-byte
hardware sectors (400088 MB)
Nov 13 18:57:38 superthecus kernel: sd 5:0:0:0: [sde] Write Protect is off
Nov 13 18:57:38 superthecus kernel: sd 5:0:0:0: [sde] Mode Sense: 00 3a
00 00
Nov 13 18:57:38 superthecus kernel: sd 5:0:0:0: [sde] Write cache:
enabled, read cache: enabled, doesn't support DPO or FUA
Nov 13 18:57:38 superthecus kernel: sd 5:0:0:0: [sde] 781422768 512-byte
hardware sectors (400088 MB)
Nov 13 18:57:38 superthecus kernel: sd 5:0:0:0: [sde] Write Protect is off
Nov 13 18:57:38 superthecus kernel: sd 5:0:0:0: [sde] Mode Sense: 00 3a
00 00
Nov 13 18:57:38 superthecus kernel: sd 5:0:0:0: [sde] Write cache:
enabled, read cache: enabled, doesn't support DPO or FUA
That's the Marvell controller:
01:00.0 SCSI storage controller: Marvell Technology Group Ltd.
MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
smartctl didn't show any errors.
--
Tomasz Chmielewski
http://lists.wpkg.org
* Re: sata_mv: hard resetting port
2007-11-14 9:43 sata_mv: hard resetting port Tomasz Chmielewski
@ 2007-11-14 15:27 ` Mark Lord
2007-11-14 15:36 ` Tomasz Chmielewski
2007-11-15 10:26 ` Tomasz Chmielewski
0 siblings, 2 replies; 10+ messages in thread
From: Mark Lord @ 2007-11-14 15:27 UTC (permalink / raw)
To: Tomasz Chmielewski; +Cc: Linux IDE
Tomasz Chmielewski wrote:
> These are the log entries I saw as I started e2fsck on a 5-disk Linux
> RAID-5 array; all disks connected to a sata_mv controller:
>
> ata6.00: exception Emask 0x1 SAct 0x0 SErr 0x100000 action 0x6 frozen
> ata6.00: edma_err 0x00000084, EDMA self-disable
..
Translation:
"The port had a link CRC error, which caused it to drop out of host-queuing mode."
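The SErr values in these messages can be decoded bit by bit. Below is a minimal sketch in Python, assuming the standard SATA SError register layout (the same bit positions libata uses for its SERR_* constants). The table and function names are illustrative only and are not part of any kernel interface. Note that 0x100000 decodes to the disparity bit, which, like a CRC error, points at corrupted data on the link.

```python
# Bit positions follow the SATA SError register layout.
# The descriptions and names here are an illustrative sketch.
SERR_BITS = {
    0: "recovered data integrity error",
    1: "recovered communications error",
    8: "transient data integrity error",
    9: "persistent communication or data integrity error",
    10: "protocol error",
    11: "internal error",
    16: "PhyRdy change",
    17: "PHY internal error",
    18: "Comm wake",
    19: "10b to 8b decode error",
    20: "disparity error",
    21: "CRC error",
    22: "handshake error",
    23: "link sequence error",
    24: "transport state transition error",
    25: "unrecognized FIS type",
    26: "device exchanged",
}


def decode_serr(value):
    """Return the names of all set bits, lowest bit first."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            # Bits without a table entry are reported by number.
            names.append(SERR_BITS.get(bit, f"reserved bit {bit}"))
    return names


if __name__ == "__main__":
    # The SErr values that appear in this thread.
    for serr in (0x100000, 0x180000, 0x4010000, 0x0):
        decoded = ", ".join(decode_serr(serr)) or "no bits set"
        print(f"SErr {serr:#x}: {decoded}")
```

Running it over the values in this thread shows that each exception carries different link-level bits, which is why the three log groups are not simple repeats of one another.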
> ata6.00: cmd 25/00:00:bf:09:24/00:02:1c:00:00/e0 tag 0 cdb 0x0 data 262144 in
> res 51/84:00:bf:09:24/84:02:1c:00:00/e0 Emask 0x10 (ATA bus error)
> ata6: hard resetting port
> ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> ata6.00: configured for UDMA/133
> ata6: EH complete
> sd 5:0:0:0: [sde] 781422768 512-byte hardware sectors (400088 MB)
> sd 5:0:0:0: [sde] Write Protect is off
> sd 5:0:0:0: [sde] Mode Sense: 00 3a 00 00
> sd 5:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> ata6: exception Emask 0x10 SAct 0x0 SErr 0x180000 action 0x6 frozen
> ata6: edma_err 0x00000020
> ata6: hard resetting port
> ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> ata6.00: configured for UDMA/133
> ata6: EH pending after completion, repeating EH (cnt=4)
> ata6: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0x7
> ata6: edma_err 0x00000010, dev connect
> ata6: hard resetting port
> ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> ata6.00: configured for UDMA/133
> ata6: EH complete
> sd 5:0:0:0: [sde] 781422768 512-byte hardware sectors (400088 MB)
> sd 5:0:0:0: [sde] Write Protect is off
> sd 5:0:0:0: [sde] Mode Sense: 00 3a 00 00
> sd 5:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> sd 5:0:0:0: [sde] 781422768 512-byte hardware sectors (400088 MB)
> sd 5:0:0:0: [sde] Write Protect is off
> sd 5:0:0:0: [sde] Mode Sense: 00 3a 00 00
> sd 5:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
..
Translation:
"libata reset the link, and everything appeared okay,
so it reissued the failed command and continued.
No data loss."
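The `cmd` and `res` lines quoted above can be unpacked the same way. This is a rough sketch, assuming the libata field order command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device, with the status and error registers taking the place of command and feature on the `res` line. The helper names are hypothetical.

```python
# ATA status and error register bits, highest bit first.
STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x10: "DSC",
               0x08: "DRQ", 0x04: "CORR", 0x02: "IDX", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x20: "MC", 0x10: "IDNF",
              0x08: "MCR", 0x04: "ABRT", 0x02: "TK0NF", 0x01: "AMNF"}


def parse_taskfile(text):
    """Split a libata taskfile dump into its numeric fields (assumed layout)."""
    first, low, high, device = text.split("/")
    feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    _hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in high.split(":"))
    # 48-bit LBA: the "hob" (high order byte) registers hold the upper half.
    lba = (hob_lbah << 40 | hob_lbam << 32 | hob_lbal << 24
           | lbah << 16 | lbam << 8 | lbal)
    return {
        "first": int(first, 16),      # command (cmd line) or status (res line)
        "second": feature,            # feature (cmd line) or error (res line)
        "lba": lba,
        "count": hob_nsect << 8 | nsect,
        "device": int(device, 16),
    }


def flag_names(value, table):
    """Names of the bits set in value, in table order."""
    return [name for mask, name in table.items() if value & mask]


if __name__ == "__main__":
    cmd = parse_taskfile("25/00:00:bf:09:24/00:02:1c:00:00/e0")
    res = parse_taskfile("51/84:00:bf:09:24/84:02:1c:00:00/e0")
    print(f"command {cmd['first']:#04x}, LBA {cmd['lba']}, "
          f"{cmd['count']} sectors = {cmd['count'] * 512} bytes")
    print("status:", flag_names(res["first"], STATUS_BITS))
    print("error: ", flag_names(res["second"], ERROR_BITS))
```

Command 0x25 is READ DMA EXT, and the 512-sector count matches the "data 262144 in" shown in the log. The error value 0x84 sets ICRC and ABRT, consistent with the "link CRC error" reading given earlier.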
> That's the Marvell controller:
>
> 01:00.0 SCSI storage controller: Marvell Technology Group Ltd.
> MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
* Re: sata_mv: hard resetting port
2007-11-14 15:27 ` Mark Lord
@ 2007-11-14 15:36 ` Tomasz Chmielewski
2007-11-14 16:11 ` Mark Lord
2007-11-15 10:26 ` Tomasz Chmielewski
1 sibling, 1 reply; 10+ messages in thread
From: Tomasz Chmielewski @ 2007-11-14 15:36 UTC (permalink / raw)
To: Mark Lord; +Cc: Linux IDE
Mark Lord schrieb:
> Tomasz Chmielewski wrote:
>> These are the log entries I saw as I started e2fsck on a 5-disk Linux
>> RAID-5 array; all disks connected to a sata_mv controller:
>>
>> ata6.00: exception Emask 0x1 SAct 0x0 SErr 0x100000 action 0x6 frozen
>> ata6.00: edma_err 0x00000084, EDMA self-disable
> ..
>
> Translation:
> "The port had a link CRC error, which caused it to drop out of
> host-queuing mode."
Is there an online Kernelish-English dictionary anywhere? ;)
--
Tomasz Chmielewski
http://wpkg.org
* Re: sata_mv: hard resetting port
2007-11-14 15:36 ` Tomasz Chmielewski
@ 2007-11-14 16:11 ` Mark Lord
2007-11-15 3:36 ` Tejun Heo
0 siblings, 1 reply; 10+ messages in thread
From: Mark Lord @ 2007-11-14 16:11 UTC (permalink / raw)
To: Tomasz Chmielewski; +Cc: Linux IDE
Tomasz Chmielewski wrote:
> Mark Lord schrieb:
>> Tomasz Chmielewski wrote:
>>> These are the log entries I saw as I started e2fsck on a 5-disk Linux
>>> RAID-5 array; all disks connected to a sata_mv controller:
>>>
>>> ata6.00: exception Emask 0x1 SAct 0x0 SErr 0x100000 action 0x6 frozen
>>> ata6.00: edma_err 0x00000084, EDMA self-disable
>> ..
>>
>> Translation:
>> "The port had a link CRC error, which caused it to drop out of
>> host-queuing mode."
>
> Is there an online Kernelish-English dictionary anywhere? ;)
..
Unfortunately not.
At some point, we *really* need to convince {Tejun,Jeff} that libata
messages should be simpler, fewer, and more human readable by default,
with perhaps a sysfs flag to re-enable the {Tejun,Jeff}-speak versions.
Cheers
* Re: sata_mv: hard resetting port
2007-11-14 16:11 ` Mark Lord
@ 2007-11-15 3:36 ` Tejun Heo
2007-11-15 4:12 ` Mark Lord
2007-11-15 10:16 ` Tomasz Chmielewski
0 siblings, 2 replies; 10+ messages in thread
From: Tejun Heo @ 2007-11-15 3:36 UTC (permalink / raw)
To: Mark Lord; +Cc: Tomasz Chmielewski, Linux IDE
Mark Lord wrote:
> Tomasz Chmielewski wrote:
>> Is there an online Kernelish-English dictionary anywhere? ;)
> ..
>
> Unfortunately not.
>
> At some point, we *really* need to convince {Tejun,Jeff} that libata
> messages should be simpler, fewer, and more human readable by default,
> with perhaps a sysfs flag to re-enable the {Tejun,Jeff}-speak versions.
Ummm... Okay. That might have something to do with me not being a
native speaker. :-)
I agree that the current messages look too scary and detailed. Yeah,
messages for recovered errors can be shorter and sweeter. Those messages
are pretty helpful to developers tho. Always having duplicate SCSI
messages doesn't help either.
Thanks. I'm putting it on the todo list but I have to admit it is of
low priority ATM.
--
tejun
* Re: sata_mv: hard resetting port
2007-11-15 3:36 ` Tejun Heo
@ 2007-11-15 4:12 ` Mark Lord
2007-11-15 10:16 ` Tomasz Chmielewski
1 sibling, 0 replies; 10+ messages in thread
From: Mark Lord @ 2007-11-15 4:12 UTC (permalink / raw)
To: Tejun Heo; +Cc: Tomasz Chmielewski, Linux IDE
Tejun Heo wrote:
> Mark Lord wrote:
>> Tomasz Chmielewski wrote:
>>> Is there an online Kernelish-English dictionary anywhere? ;)
>> ..
>>
>> Unfortunately not.
>>
>> At some point, we *really* need to convince {Tejun,Jeff} that libata
>> messages should be simpler, fewer, and more human readable by default,
>> with perhaps a sysfs flag to re-enable the {Tejun,Jeff}-speak versions.
>
> Ummm... Okay. That might have something to do with me not being a
> native speaker. :-)
>
> I agree that the current messages look too scary and detailed. Yeah,
> messages for recovered errors can be shorter and sweeter. Those messages
> are pretty helpful to developers tho. Always having duplicate SCSI
> messages doesn't help either.
>
> Thanks. I'm putting it on the todo list but I have to admit it is of
> low priority ATM.
..
Nothing bad intended by that.. I can read the current messages just fine :)
though there do seem to be a *lot* of them.
But that is good for libata, in its current level of immaturity and rapid
development. In a release or three, once things settle down a bit,
we'll likely feel more pressure to make things understandable by others.
Cheers
* Re: sata_mv: hard resetting port
2007-11-15 3:36 ` Tejun Heo
2007-11-15 4:12 ` Mark Lord
@ 2007-11-15 10:16 ` Tomasz Chmielewski
1 sibling, 0 replies; 10+ messages in thread
From: Tomasz Chmielewski @ 2007-11-15 10:16 UTC (permalink / raw)
To: Tejun Heo; +Cc: Mark Lord, Linux IDE
Tejun Heo schrieb:
> Mark Lord wrote:
>> Tomasz Chmielewski wrote:
>>> Is there an online Kernelish-English dictionary anywhere? ;)
>> ..
>>
>> Unfortunately not.
>>
>> At some point, we *really* need to convince {Tejun,Jeff} that libata
>> messages should be simpler, fewer, and more human readable by default,
>> with perhaps a sysfs flag to re-enable the {Tejun,Jeff}-speak versions.
>
> Ummm... Okay. That might have something to do with me not being a
> native speaker. :-)
>
> I agree that the current messages look too scary and detailed. Yeah,
> messages for recovered errors can be shorter and sweeter. Those messages
> are pretty helpful to developers tho.
Note that I only sent this kernel message to this list because I
couldn't understand it.
So, I wasn't really helpful to you or Mark, but I actually took your
time - to read and reply to this posting. And time is a scarce resource.
My 3 cents.
--
Tomasz Chmielewski
http://wpkg.org
* Re: sata_mv: hard resetting port
2007-11-14 15:27 ` Mark Lord
2007-11-14 15:36 ` Tomasz Chmielewski
@ 2007-11-15 10:26 ` Tomasz Chmielewski
2007-11-15 14:31 ` Mark Lord
1 sibling, 1 reply; 10+ messages in thread
From: Tomasz Chmielewski @ 2007-11-15 10:26 UTC (permalink / raw)
To: Mark Lord; +Cc: Linux IDE
Mark Lord schrieb:
(...)
>> ata6: exception Emask 0x10 SAct 0x0 SErr 0x180000 action 0x6 frozen
>> ata6: edma_err 0x00000020
>> ata6: hard resetting port
>> ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>> ata6.00: configured for UDMA/133
>> ata6: EH pending after completion, repeating EH (cnt=4)
>> ata6: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0x7
>> ata6: edma_err 0x00000010, dev connect
>> ata6: hard resetting port
> ..
>
> Translation:
> "libata reset the link, and everything appeared okay,
> so it reissued the failed command and continued.
> No data loss."
And today the kernel (2.6.23.1) on the same machine has spoken to
not-mere-mortals again:
ata6: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0x6 frozen
ata6: edma_err 0x00000020
ata6: hard resetting port
ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata6.00: configured for UDMA/133
ata6: EH complete
sd 5:0:0:0: [sde] 781422768 512-byte hardware sectors (400088 MB)
sd 5:0:0:0: [sde] Write Protect is off
sd 5:0:0:0: [sde] Mode Sense: 00 3a 00 00
sd 5:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
As I understand it now, your previous translation would fit here (or
not, as SErr differs?):
> Translation:
> "libata reset the link, and everything appeared okay,
> so it reissued the failed command and continued.
> No data loss."
But why was the port reset? There was no CRC error as before, was there?
What worries me is that it always happens for the same drive.
--
Tomasz Chmielewski
http://wpkg.org
* Re: sata_mv: hard resetting port
2007-11-15 10:26 ` Tomasz Chmielewski
@ 2007-11-15 14:31 ` Mark Lord
2007-11-15 19:22 ` Mark Lord
0 siblings, 1 reply; 10+ messages in thread
From: Mark Lord @ 2007-11-15 14:31 UTC (permalink / raw)
To: Tomasz Chmielewski; +Cc: Linux IDE
Tomasz Chmielewski wrote:
>
> And today the kernel (2.6.23.1) on the same machine has spoken to
> not-mere-mortals again:
>
> ata6: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0x6 frozen
> ata6: edma_err 0x00000020
..
Here, the messages fail us. The edma_err value says that there
should be a non-zero value in the SErr value. Except the messages
show zero there, meaning the registers were probably read in the
wrong sequence (some bits clear automatically on reads).
In any event, the messages that follow don't say anything about
I/O failing in any way, so again this is nothing to be concerned about
unless it happens frequently.
At this point, I would unplug/replug all of the SATA cables,
to ensure they have good connections.
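The mismatch described here (edma_err claiming an SError event while SErr reads zero) can be checked mechanically. The sketch below assumes the Gen II bit assignments as recalled from the EDMA_ERR_* definitions in drivers/ata/sata_mv.c. Treat the table as an assumption and verify it against your kernel source. The function names are hypothetical.

```python
# ASSUMED bit meanings, recalled from sata_mv.c (Gen II chips such as the
# MV88SX6081); only the bits relevant to this thread are listed.
EDMA_ERR_BITS = {
    0: "UDMA data parity error",
    1: "UDMA PRD parity error",
    2: "device error",
    3: "device disconnect",
    4: "device connected",
    5: "SError bits raised",
    7: "EDMA self-disable",
}
EDMA_ERR_SERR = 1 << 5


def decode_edma_err(value):
    """Names of the set bits, lowest first; unlisted bits shown by number."""
    return [EDMA_ERR_BITS.get(bit, f"bit {bit}")
            for bit in range(32) if value & (1 << bit)]


def serr_mismatch(edma_err, serr):
    """True when edma_err reports an SError event but SErr reads as zero."""
    return bool(edma_err & EDMA_ERR_SERR) and serr == 0


if __name__ == "__main__":
    # (edma_err, SErr) pairs taken from the log excerpts in this thread.
    for edma_err, serr in ((0x84, 0x100000), (0x20, 0x180000),
                           (0x10, 0x4010000), (0x20, 0x0)):
        note = "  <-- inconsistent" if serr_mismatch(edma_err, serr) else ""
        print(f"edma_err {edma_err:#04x} SErr {serr:#x}: "
              f"{', '.join(decode_edma_err(edma_err))}{note}")
```

Only the last pair, from the 2.6.23.1 report, is flagged. The earlier 0x20 event arrived with a non-zero SErr.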
> ata6: hard resetting port
> ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> ata6.00: configured for UDMA/133
> ata6: EH complete
> sd 5:0:0:0: [sde] 781422768 512-byte hardware sectors (400088 MB)
> sd 5:0:0:0: [sde] Write Protect is off
> sd 5:0:0:0: [sde] Mode Sense: 00 3a 00 00
> sd 5:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
>
>
> As I understand it now, your previous translation would fit here (or
> not, as SErr differs?):
>
> > Translation:
> > "libata reset the link, and everything appeared okay,
> > so it reissued the failed command and continued.
> > No data loss."
>
> But why was the port reset? There was no CRC error as before, was there?
..
I think the error-handling code is a bit heavy handed,
in that the port reset was not actually needed in the prior case either.
But this way it is simple, consistent, and does work.
It just prints too many messages.
> What worries me is that it always happens for the same drive.
..
Twiddle with the cabling for that drive, and it will probably behave.
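To judge whether the resets really do cluster on one drive, and whether reseating the cable helped, the log can be tallied per port. This is a small sketch, assuming syslog lines in the format shown in this thread. The function name and the sample list are illustrative.

```python
import re
from collections import Counter

# Matches lines such as "... kernel: ata6: hard resetting port".
RESET_RE = re.compile(r"\b(ata\d+): hard resetting port\b")


def count_hard_resets(lines):
    """Count 'hard resetting port' events per ATA port name."""
    counts = Counter()
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from the first message of this thread.
    sample = [
        "Nov 13 18:57:34 superthecus kernel: ata6: hard resetting port",
        "Nov 13 18:57:34 superthecus kernel: ata6: EH complete",
        "Nov 13 18:57:37 superthecus kernel: ata6: hard resetting port",
        "Nov 13 18:57:37 superthecus kernel: ata6: hard resetting port",
    ]
    for port, n in count_hard_resets(sample).most_common():
        print(f"{port}: {n} hard resets")
```

In practice the lines would come from the system log file rather than a hard-coded list. A count that keeps growing for one port after recabling would suggest looking at the drive or its power supply next.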
* Re: sata_mv: hard resetting port
2007-11-15 14:31 ` Mark Lord
@ 2007-11-15 19:22 ` Mark Lord
0 siblings, 0 replies; 10+ messages in thread
From: Mark Lord @ 2007-11-15 19:22 UTC (permalink / raw)
To: Tomasz Chmielewski; +Cc: Linux IDE, Eric D. Mudama
Eric D. Mudama wrote:
> On Nov 15, 2007 7:31 AM, Mark Lord <liml@rtr.ca> wrote:
>> Here, the messages fail us. The edma_err value says that there
>> should be a non-zero value in the SErr value. Except the messages
>> show zero there, meaning the registers were probably read in the
>> wrong sequence (some bits clear automatically on reads).
>
> Isn't that the reset signature?
..
No, it's just funny status. The chipset (Marvell) claims it saw
a SATA error. But the SATA error register is all-zeros, meaning "no error".
I think somebody forgot to save the latter before clearing it,
or somebody forgot to clear the former from an earlier error.
Either way, a nuisance, but no harm done here.
> Maybe due to insufficient power the drive decided to reboot itself.
>
> (What bits clear automatically on read, other than IRQ?)
..
Mmm.. none in this case (my mistake).