* Re: GA-MA790FX-DS5 SATA ahci NCQ erros on Jmicron 20360/20363 (JMB363) kernel 2.6.25-2 Debian/Lenny
[not found] <g8p6po$nm$1@ger.gmane.org>
@ 2008-08-23 22:38 ` Jeff Garzik
2008-08-23 23:27 ` Chr
2008-08-24 0:27 ` Sergey Spiridonov
0 siblings, 2 replies; 12+ messages in thread
From: Jeff Garzik @ 2008-08-23 22:38 UTC
To: Sergey Spiridonov; +Cc: linux-kernel, Linux IDE mailing list
Sergey Spiridonov wrote:
> Hi
>
> I got kernel errors [1] and [2] followed by a SATA reset under heavy load on
> the hard drive connected to the GA-MA790FX-DS5 onboard controller
> Jmicron 20360/20363 (JMB363) (here is lspci [3]). A hard drive connected
> to the other onboard controller (the AMD SB600 south bridge) works
> without problems.
>
> I got two 1TB Seagate hard disks, ST31000340AS and ST31000340NS. I
> connected one to Jmicron JMB363, another to SB600. After some testing
> with several instances of bonnie++ I got kernel errors [1] and [2].
> After this I swapped the hard disk connections: the one which was
> connected to JMB363 I connected to SB600 and vice versa. Errors, timeouts
> and hard drive resets always happened on the hard drive which was
> connected to the JMB363 (in the log file it is sdb). There are no errors if
> both drives are connected to the SB600.
>
> Here [4] is the complete dmesg output after the system has booted
> (before I get errors).
>
> I already replaced (took from a working PC) the power supply, memory, video
> card and DVD drive. I get the same problems with these devices too. So the
> problem must be the motherboard, software or CPU. The CPU seems to work O.K.
>
> It looks like the problem is the motherboard or the ahci ata driver. Does
> anybody have a clue about it? Is the JMB363 chip broken, or is the Linux
> driver broken?
>
> [1] http://hurd.homeunix.org/~sena/GA-MA790FX-DS5/dmesg-sata-errors.txt
> [2] http://hurd.homeunix.org/~sena/GA-MA790FX-DS5/dmesg-sata-errors2.txt
> [3] http://hurd.homeunix.org/~sena/GA-MA790FX-DS5/lspci.txt
> [4] http://hurd.homeunix.org/~sena/GA-MA790FX-DS5/dmesg-after-boot.txt
See http://ata.wiki.kernel.org/index.php/Libata_error_messages for an
introduction.
In general, tons of ATA bus errors and SError register bits mean that
problems are coming from the ATA bus, a.k.a. the SATA cable and its
related connections.
So... suspect bad cables, bad port connectors, cable interference,
motherboard-caused interference or grounding problems, power supply
problems.
Jeff
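For reading the "SError register bits" mentioned above, the following is a minimal Python sketch that turns a raw SErr value from a libata log line into bit names. The bit positions and short names are quoted from memory of the SERR_* constants in the kernel's include/linux/ata.h and the strings libata prints inside "SError: { ... }", so treat the table as an assumption to check against your kernel source. The helper name decode_serror is made up for this illustration.

```python
# Assumed mapping: bit position -> short name as printed by libata.
# Verify against include/linux/ata.h (SERR_*) for the kernel in use.
SERR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the names of all SError bits set in value, lowest bit first."""
    return [name for bit, name in sorted(SERR_BITS.items()) if value & (1 << bit)]


if __name__ == "__main__":
    # 0x4000000 is the SErr value that shows up in a log later in this thread.
    print(decode_serror(0x4000000))
```

With the table above, decode_serror(0x4000000) yields ['DevExch'], which agrees with the "SError: { DevExch }" line that libata prints for that value in the log posted later in this thread.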
* Re: GA-MA790FX-DS5 SATA ahci NCQ erros on Jmicron 20360/20363 (JMB363) kernel 2.6.25-2 Debian/Lenny
2008-08-23 22:38 ` GA-MA790FX-DS5 SATA ahci NCQ erros on Jmicron 20360/20363 (JMB363) kernel 2.6.25-2 Debian/Lenny Jeff Garzik
@ 2008-08-23 23:27 ` Chr
2008-08-24 0:27 ` Sergey Spiridonov
1 sibling, 0 replies; 12+ messages in thread
From: Chr @ 2008-08-23 23:27 UTC
To: Sergey Spiridonov; +Cc: Jeff Garzik, linux-kernel, Linux IDE mailing list
On Sunday 24 August 2008 00:38:36 Jeff Garzik wrote:
> See http://ata.wiki.kernel.org/index.php/Libata_error_messages for an
> introduction.
>
> In general, tons of ATA bus errors and SError register bits means that
> problems are coming from the ATA bus, a.k.a. the SATA cable and its
> related connections.
>
> So... suspect bad cables, bad port connectors, cable interference,
> motherboard-caused interference or grounding problems, power supply
> problems.
>
hmm, or something totally odd...
what happens if you do: (after you made a backup!)
"dd if=/dev/sdX of=/dev/null bs=1" (where X is your affected hdd)
Note:
The important bit is the small bs (blocksize) number.
You can throw in an O_DIRECT flag to disable the caches, or
if you have some "empty" partition space, you can "dd" into
it with a small blocksize too.
my seagate & even a samsung hd103uj don't like that and will spew
out the same sort of problems you have just posted... (but they work fine,
if I don't do nasty dd things!)
and unfortunately my md(raid1) seems to do lots of "small" reads & writes
when it starts to check/resync the whole 1TB array :-/.
Regards,
Chr
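The small-blocksize read described above can also be sketched in Python, for cases where a bounded run is preferable to letting dd walk the whole disk. This is only an illustration: the function name small_block_read and the max_bytes limit are assumptions, not part of the original suggestion, and the sketch does not set O_DIRECT, so the page cache and readahead may turn these tiny reads into larger requests at the device.

```python
import os


def small_block_read(path, block_size=1, max_bytes=1024 * 1024):
    """Read up to max_bytes from path in block_size chunks.

    Returns (bytes_read, read_calls). Reading stops at end of file or
    once max_bytes have been read, whichever comes first.
    """
    fd = os.open(path, os.O_RDONLY)
    total = 0
    calls = 0
    try:
        while total < max_bytes:
            chunk = os.read(fd, min(block_size, max_bytes - total))
            calls += 1
            if not chunk:  # empty read means end of file
                break
            total += len(chunk)
    finally:
        os.close(fd)
    return total, calls


if __name__ == "__main__":
    # Hypothetical usage: point this at the affected device (needs root),
    # e.g. small_block_read("/dev/sdX", block_size=1), then watch dmesg.
    print(small_block_read(__file__, block_size=1, max_bytes=64))
```

The read is strictly read-only, but the backup advice above still applies if the drive starts resetting under load.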
* Re: GA-MA790FX-DS5 SATA ahci NCQ erros on Jmicron 20360/20363 (JMB363) kernel 2.6.25-2 Debian/Lenny
2008-08-23 22:38 ` GA-MA790FX-DS5 SATA ahci NCQ erros on Jmicron 20360/20363 (JMB363) kernel 2.6.25-2 Debian/Lenny Jeff Garzik
2008-08-23 23:27 ` Chr
@ 2008-08-24 0:27 ` Sergey Spiridonov
2008-08-24 4:39 ` Jeff Garzik
2008-08-24 16:44 ` xerces8
1 sibling, 2 replies; 12+ messages in thread
From: Sergey Spiridonov @ 2008-08-24 0:27 UTC
To: linux-kernel; +Cc: linux-ide
Hi
Jeff Garzik wrote:
> So... suspect bad cables, bad port connectors, cable interference,
> motherboard-caused interference or grounding problems, power supply
> problems.
I did exchange the power supply and I did exchange the hard drives. The same
hard drive with the same cable works with SB600 and produces errors with
JMB363. So it looks like it is not a cable or hard drive problem. Maybe the
problem is the JMB363 port connector on the motherboard. How can I check it?
--
Best regards, Sergey Spiridonov
* Re: GA-MA790FX-DS5 SATA ahci NCQ erros on Jmicron 20360/20363 (JMB363) kernel 2.6.25-2 Debian/Lenny
2008-08-24 0:27 ` Sergey Spiridonov
@ 2008-08-24 4:39 ` Jeff Garzik
2008-08-24 23:31 ` Sergey Spiridonov
2008-08-24 16:44 ` xerces8
1 sibling, 1 reply; 12+ messages in thread
From: Jeff Garzik @ 2008-08-24 4:39 UTC
To: Sergey Spiridonov; +Cc: linux-kernel, linux-ide
Sergey Spiridonov wrote:
> Hi
>
> Jeff Garzik wrote:
>
>> So... suspect bad cables, bad port connectors, cable interference,
>> motherboard-caused interference or grounding problems, power supply
>> problems.
>
> I did exchange power supply and I did exchange hard drives. The same
> hard drive with the same cable works with SB600 and produces errors with
> JMB363. So looks like it is not cable or hard drive problem. May be the
> problem is JMB363 port connector on the motherboard. How can I check it?
Try new motherboard of same brand and model :/
In general, tons of ATA bus errors and SError complaints indicate some
sort of problem at the physical layer/level. It's always possible that
software is to blame, but bug report patterns so far tend to point to
hardware.
Jeff
* Re: GA-MA790FX-DS5 SATA ahci NCQ erros on Jmicron 20360/20363 (JMB363) kernel 2.6.25-2 Debian/Lenny
2008-08-24 0:27 ` Sergey Spiridonov
2008-08-24 4:39 ` Jeff Garzik
@ 2008-08-24 16:44 ` xerces8
2008-08-30 11:00 ` Tejun Heo
1 sibling, 1 reply; 12+ messages in thread
From: xerces8 @ 2008-08-24 16:44 UTC
To: Sergey Spiridonov, linux-ide; +Cc: linux-kernel
Sergey Spiridonov wrote:
> Jeff Garzik wrote:
>
> > So... suspect bad cables, bad port connectors, cable interference,
> > motherboard-caused interference or grounding problems, power supply
> > problems.
>
> I did exchange power supply and I did exchange hard drives. The same
> hard drive with the same cable works with SB600 and produces errors with
> JMB363. So looks like it is not cable or hard drive problem. May be the
> problem is JMB363 port connector on the motherboard. How can I check it?
Hi!
I have a JMB363 myself and it has its share of problems.
I would say it is buggy hardware. (why would they otherwise
release a new windows driver every week ? if not to workaround
bugs in HW ;-)
My WD MyBook Studio Edition 500 GB external eSATA drive does not
work on the JMB363 correctly no matter what I try. Both
under linux and windows. I think the best was 30 minutes of
(apparent) error free operation under windows.
If interested, I can supply logs, data etc.
(I have a bunch of drives to try).
Regards,
David
* Re: GA-MA790FX-DS5 SATA ahci NCQ erros on Jmicron 20360/20363 (JMB363) kernel 2.6.25-2 Debian/Lenny
2008-08-24 4:39 ` Jeff Garzik
@ 2008-08-24 23:31 ` Sergey Spiridonov
2008-08-25 14:52 ` xerces8
2008-09-17 12:07 ` Sergey Spiridonov
0 siblings, 2 replies; 12+ messages in thread
From: Sergey Spiridonov @ 2008-08-24 23:31 UTC
To: linux-ide; +Cc: linux-kernel
Hi
Jeff Garzik wrote:
> Try new motherboard of same brand and model :/
Well, it is probably the right time to try to get a replacement motherboard
from the store.
> In general, tons of ATA bus errors and SError complaints indicate some
> sort of problem at the physical layer/level. Its always possible that
> software is to blame, but bug report patterns so far tend to point to
> hardware.
I think I got indirect confirmation of a possible motherboard defect.
There are 2 SATA ports managed by the JMB363: GSATAII-1 and GSATAII-2. I
found out that kernel errors appear only if I use GSATAII-1. If I
connect my drive to GSATAII-2, there are no errors anymore.
Also, the Seagate testing utility SeaTools for DOS 2.07PG (bootable ISO)
hangs at startup even without a connected SATA drive, which also
indicates some motherboard problem (or, again, a utility problem). Seagate
support does not say anything except something like "it should work"...
Thanks everybody for support.
--
Best regards, Sergey Spiridonov
* Re: GA-MA790FX-DS5 SATA ahci NCQ erros on Jmicron 20360/20363 (JMB363) kernel 2.6.25-2 Debian/Lenny
2008-08-24 23:31 ` Sergey Spiridonov
@ 2008-08-25 14:52 ` xerces8
2008-09-17 12:07 ` Sergey Spiridonov
1 sibling, 0 replies; 12+ messages in thread
From: xerces8 @ 2008-08-25 14:52 UTC
To: Sergey Spiridonov, linux-ide; +Cc: linux-kernel
Sergey Spiridonov wrote:
> Seagate
> support does not tell anything except something like "it should work"...
I just got this* link from WD support, accompanied with the text:
"This sounds like an incompatibility perhaps with the ESATA controller.
Please see the link below for tested ESATA controllers."
* - http://www.wdc.com/en/products/resources/esataupgrade.asp
Regards,
David
* Re: GA-MA790FX-DS5 SATA ahci NCQ erros on Jmicron 20360/20363 (JMB363) kernel 2.6.25-2 Debian/Lenny
2008-08-24 16:44 ` xerces8
@ 2008-08-30 11:00 ` Tejun Heo
2008-08-30 18:13 ` xerces8
0 siblings, 1 reply; 12+ messages in thread
From: Tejun Heo @ 2008-08-30 11:00 UTC
To: xerces8; +Cc: Sergey Spiridonov, linux-ide, linux-kernel
xerces8 wrote:
> I have a JMB363 myself and it has its share of problems.
> I would say it is buggy hardware. (why would they otherwise
> release a new windows driver every week ? if not to workaround
> bugs in HW ;-)
Well, FWIW, JMB ahci's are one of my favorites and usually very well
behaved.
> My WD MyBook Studio Edition 500 GB external eSATA drive does not
> work on the JMB363 correctly no matter what I try. Both
> under linux and windows. I think the best was 30 minutes of
> (apparent) error free operation under windows.
This one is being discussed both with JMB and WD. It seems the bridge
chip used in the WD external drives is somehow incompatible with the
JMB ahci's. Don't know whose fault it is or how it can be worked
around yet. The issue is being tracked in the following bugzilla.
http://bugzilla.kernel.org/show_bug.cgi?id=9913
--
tejun
* Re: GA-MA790FX-DS5 SATA ahci NCQ erros on Jmicron 20360/20363 (JMB363) kernel 2.6.25-2 Debian/Lenny
2008-08-30 11:00 ` Tejun Heo
@ 2008-08-30 18:13 ` xerces8
2008-08-31 9:33 ` Tejun Heo
0 siblings, 1 reply; 12+ messages in thread
From: xerces8 @ 2008-08-30 18:13 UTC
To: Tejun Heo; +Cc: Sergey Spiridonov, linux-ide, linux-kernel
Tejun Heo wrote:
> xerces8 wrote:
> > My WD MyBook Studio Edition 500 GB external eSATA drive does not
> > work on the JMB363 correctly no matter what I try. Both
> > under linux and windows. I think the best was 30 minutes of
> > (apparent) error free operation under windows.
>
> This one is being discussed both with JMB and WD. It seems the bridge
> chip used in the WD external drives is somehow incompatible with the
> JMB ahci's. Don't know whose fault it is or how it can be worked
> around yet. The issue is being tracked in the following bugzilla.
>
> http://bugzilla.kernel.org/show_bug.cgi?id=9913
I know, I'm David Balažic (the last commenter on bug, besides you) ;-)
Regards,
David
* Re: GA-MA790FX-DS5 SATA ahci NCQ erros on Jmicron 20360/20363 (JMB363) kernel 2.6.25-2 Debian/Lenny
2008-08-30 18:13 ` xerces8
@ 2008-08-31 9:33 ` Tejun Heo
0 siblings, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2008-08-31 9:33 UTC
To: xerces8; +Cc: Sergey Spiridonov, linux-ide, linux-kernel
xerces8 wrote:
> Tejun Heo wrote:
>
>> xerces8 wrote:
>>> My WD MyBook Studio Edition 500 GB external eSATA drive does not
>>> work on the JMB363 correctly no matter what I try. Both
>>> under linux and windows. I think the best was 30 minutes of
>>> (apparent) error free operation under windows.
>> This one is being discussed both with JMB and WD. It seems the bridge
>> chip used in the WD external drives is somehow incompatible with the
>> JMB ahci's. Don't know whose fault it is or how it can be worked
>> around yet. The issue is being tracked in the following bugzilla.
>>
>> http://bugzilla.kernel.org/show_bug.cgi?id=9913
>
> I know, I'm David Balažic (the last commenter on bug, besides you) ;-)
Somehow I've been confusing people a lot lately. I asked my AMD contact
a few times about sata_nv problems somehow thinking AMD acquired NVidia
instead of ATI. :-)
--
tejun
* Re: GA-MA790FX-DS5 SATA ahci NCQ erros on Jmicron 20360/20363 (JMB363) kernel 2.6.25-2 Debian/Lenny
2008-08-24 23:31 ` Sergey Spiridonov
2008-08-25 14:52 ` xerces8
@ 2008-09-17 12:07 ` Sergey Spiridonov
2008-09-17 15:31 ` Krzysztof Halasa
1 sibling, 1 reply; 12+ messages in thread
From: Sergey Spiridonov @ 2008-09-17 12:07 UTC
To: linux-ide; +Cc: linux-kernel
Hi
Sergey Spiridonov wrote:
> Well, it is probably right time to try to get replacement motherboard
> from store.
I got a replacement motherboard. And I got exactly the same errors and
problems.
:( :( :(
Will try to live with this...
--
Best regards, Sergey Spiridonov
* Re: GA-MA790FX-DS5 SATA ahci NCQ erros on Jmicron 20360/20363 (JMB363) kernel 2.6.25-2 Debian/Lenny
2008-09-17 12:07 ` Sergey Spiridonov
@ 2008-09-17 15:31 ` Krzysztof Halasa
0 siblings, 0 replies; 12+ messages in thread
From: Krzysztof Halasa @ 2008-09-17 15:31 UTC
To: Sergey Spiridonov; +Cc: linux-kernel, linux-ide
Sergey Spiridonov <sena@hurd.homeunix.org> writes:
> I got a replacement motherboard. And I got absolutely same errors and
> problems.
FWIW I have an MSI (P45 Neo2) mobo with JMB363 (only PATA connected)
and its second AHCI port (ata8, not connected) gives me the following.
The 2.6.26.2 kernel doesn't see any problem:
Linux version 2.6.26.2 (gcc version 4.3.0 20080428 (Red Hat 4.3.0-8) SMP
ata7: SATA max UDMA/133 abar m8192@0xfeafe000 port 0xfeafe100 irq 16
ata8: SATA max UDMA/133 abar m8192@0xfeafe000 port 0xfeafe180 irq 16
ata7: SATA link down (SStatus 0 SControl 300)
ata8: SATA link down (SStatus 0 SControl 300)
ata9: PATA max UDMA/100 cmd 0xdc00 ctl 0xd880 bmdma 0xd400 irq 17
ata10: PATA max UDMA/100 cmd 0xd800 ctl 0xd480 bmdma 0xd408 irq 17
ata9.00: ATAPI: _NEC DVD_RW ND-4550A, 1.09, max UDMA/33
ata9.00: configured for UDMA/33
Now with 2.6.26.3:
ata7: SATA max UDMA/133 abar m8192@0xfeafe000 port 0xfeafe100 irq 16
ata8: SATA max UDMA/133 abar m8192@0xfeafe000 port 0xfeafe180 irq 16
ata7: SATA link down (SStatus 0 SControl 300)
ata8: SATA link down (SStatus 0 SControl 300)
ata9: PATA max UDMA/100 cmd 0xdc00 ctl 0xd880 bmdma 0xd400 irq 17
ata10: PATA max UDMA/100 cmd 0xd800 ctl 0xd480 bmdma 0xd408 irq 17
ata9.00: ATAPI: _NEC DVD_RW ND-4550A, 1.09, max UDMA/33
ata9.00: configured for UDMA/33
ata8: exception Emask 0x10 SAct 0x0 SErr 0x4000000 action 0xe frozen
ata8: irq_stat 0x00000040, connection status changed
ata8: SError: { DevExch }
ata8: hard resetting link
ata8: SATA link down (SStatus 0 SControl 300)
ata8: EH complete
ata8: exception Emask 0x10 SAct 0x0 SErr 0x4000000 action 0xe frozen
ata8: irq_stat 0x00000040, connection status changed
ata8: SError: { DevExch }
ata8: hard resetting link
ata8: SATA link down (SStatus 0 SControl 300)
ata8: EH complete
ata8: exception Emask 0x10 SAct 0x0 SErr 0x4000000 action 0xe frozen
ata8: irq_stat 0x00000040, connection status changed
ata8: SError: { DevExch }
ata8: hard resetting link
...
.config differences which could somehow be relevant(?):
-CONFIG_GART_IOMMU=y
-CONFIG_K8_NB=y
-CONFIG_AGP_AMD64=y
(I'm using SWIOTLB in both cases, it's P45/Core2 duo based machine).
Can't see anything between 2.6.26.2 and 2.6.26.3 which could cause
that. Perhaps I should bisect it anyway?
For now I have just disabled these AHCI devices, didn't know that it
was ok with 2.6.26.2 (just noticed it in the logs). Will look at it.
--
Krzysztof Halasa
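To compare ports or kernel versions from logs like the ones in this thread, it can help to count libata exception lines per port. The sketch below is a minimal Python example under stated assumptions: the regular expression is modelled only on the "ataN: exception Emask ..." lines quoted above (with an optional ".NN" device suffix), and count_exceptions is a made-up helper name, not an existing tool.

```python
import re
from collections import Counter

# Matches e.g. "ata8: exception Emask 0x10 ..." and "ata2.00: exception Emask ...".
# The port name (without any ".NN" device suffix) is captured in group 1.
EXCEPTION_RE = re.compile(r"\b(ata\d+)(?:\.\d+)?: exception Emask ")


def count_exceptions(log_text):
    """Return a Counter mapping port name (e.g. 'ata8') to exception count."""
    counts = Counter()
    for line in log_text.splitlines():
        match = EXCEPTION_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = """\
ata7: SATA link down (SStatus 0 SControl 300)
ata8: exception Emask 0x10 SAct 0x0 SErr 0x4000000 action 0xe frozen
ata8: SError: { DevExch }
ata8: hard resetting link
ata8: exception Emask 0x10 SAct 0x0 SErr 0x4000000 action 0xe frozen
"""
    for port, n in sorted(count_exceptions(sample).items()):
        print(port, n)
```

Feeding it the saved dmesg output from each configuration (for example one capture per SATA port or per kernel version) gives a quick per-port tally to compare.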