* AMD Au1100 problems (USB & Ethernet)
From: Sylvain Munaut @ 2005-08-05 19:39 UTC (permalink / raw)
To: linux-mips
Hello,
I've been trying to adapt Linux (the HEAD CVS version) to a
custom board based around an Au1100. To be more precise, the board
uses a "CPU module" (CSB650 from Cogent
http://www.cogcomp.com/csb_csb650.htm ) that's placed on a custom PCB.
I've compiled and booted a kernel successfully, and I see the messages on the
serial console. It's in big-endian mode, since the boot loader on the
card is big-endian only and I couldn't manage to get it to switch to
little-endian ...
Now, on to the problems:
* About USB. The first time I tried, it just hung, but I quickly found out
that it was because I hadn't routed the 48 MHz clock to the USB module. After
that, I had to slightly adapt the OHCI bus glue to enable the OHCI big-endian
mode. Now, when a USB stick is inserted, it gets detected,
I can mount it and read small files. But when I try to read bigger files
(just 1 or 2 MB), I get stuff like:
[4294743.146000] usb 1-1: reset full speed USB device using au1xxx-ohci and address 2
[4294743.618000] usb 1-1: reset full speed USB device using au1xxx-ohci and address 2
[4294743.891000] usb 1-1: reset full speed USB device using au1xxx-ohci and address 2
[4294744.151000] usb 1-1: reset full speed USB device using au1xxx-ohci and address 2
[4294744.328000] au1xxx-ohci au1xxx-ohci.0: bad entry 4b
[4294744.346000] au1xxx-ohci au1xxx-ohci.0: bad entry ac450000
[4294744.363000] au1xxx-ohci au1xxx-ohci.0: bad entry 8f820014
[4294744.381000] au1xxx-ohci au1xxx-ohci.0: bad entry 38210001
[4294744.495000] hub 1-0:1.0: port 1 disabled by hub (EMI?), re-enabling...
[4294744.515000] usb 1-1: USB disconnect, address 2
[4294745.532000] au1xxx-ohci au1xxx-ohci.0: IRQ INTR_SF lossage
[4294745.532000] usb 1-1: sg_complete, unlink --> -19
[4294745.532000] usb 1-1: sg_complete, unlink --> -19
Which means absolutely nothing to me ;( Has anyone got a clue?
I can't say for sure it's not hardware, but the CPU module is used by
others, and on the base board it's just a couple of differential pairs
with 90 ohm differential impedance, nothing more ...
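For readers wondering why the big-endian mode mentioned above matters: the CPU and the OHCI controller exchange descriptors through memory as 32-bit words, so both sides have to agree on byte order. A minimal Python sketch of what happens to one word when they disagree is below. The value 0x12345678 and the function name are purely illustrative, not taken from the driver, and this is not meant as a diagnosis of the log above.

```python
import struct


def read_as_little_endian(word_written_big_endian: int) -> int:
    """Return what a little-endian reader sees for a 32-bit word stored big-endian."""
    # Bytes in memory, in the order a big-endian CPU would store them.
    raw = struct.pack(">I", word_written_big_endian)
    # The same four bytes interpreted by a little-endian reader.
    (seen,) = struct.unpack("<I", raw)
    return seen


if __name__ == "__main__":
    value = 0x12345678  # hypothetical descriptor word, for illustration only
    print(f"written: {value:#010x}  read back: {read_as_little_endian(value):#010x}")
```

Applying the swap twice returns the original value, which is why a consistent setting on both sides (either order) works while a mismatch does not.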
* About Ethernet: it works, and I have network access. However, I have
two kinds of errors. On the RX side, I get quite a lot of "rx miss"
errors (when au1x00_eth debug is on). About 5% of packets are dropped.
That's not _too_ much of a problem as long as it doesn't increase. But
what can that be due to?
A more annoying problem is that I get a lot of :
[ 506.397000] NETDEV WATCHDOG: eth0: transmit timed out
[ 506.412000] eth0: au1000_tx_timeout: dev=8048b400
These are quite common when I transmit a lot,
and they completely ruin the transmission
(_really_ slow!).
Here are some stats from ifconfig:
RX packets:50496 errors:76 dropped:76 overruns:0 frame:0
TX packets:49573 errors:47 dropped:0 overruns:0 carrier:74
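As a quick sanity check on these counters, here is a small Python sketch that turns them into percentages. The numbers are copied from the ifconfig output above; the function name is just an illustration. It only covers this particular run, not the heavier receive tests.

```python
def error_rate_percent(errors: int, packets: int) -> float:
    """Errors as a percentage of the packet count reported by ifconfig."""
    if packets <= 0:
        raise ValueError("packets must be positive")
    return 100.0 * errors / packets


# Counters copied from the ifconfig output above.
RX_PACKETS, RX_ERRORS = 50496, 76
TX_PACKETS, TX_ERRORS = 49573, 47

if __name__ == "__main__":
    print(f"RX error rate: {error_rate_percent(RX_ERRORS, RX_PACKETS):.3f}%")
    print(f"TX error rate: {error_rate_percent(TX_ERRORS, TX_PACKETS):.3f}%")
```

For this run both rates come out well under 1%, so the roughly 5% figure must come from a different traffic pattern than the one captured here.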
Any insight / suggestion is appreciated, I'm getting desperate ;)
Sylvain
* Re: AMD Au1100 problems (USB & Ethernet)
From: Pete Popov @ 2005-08-05 19:47 UTC (permalink / raw)
To: Sylvain Munaut; +Cc: 'linux-mips@linux-mips.org'
I'm guessing the ethernet problems are hardware/board related. Take a
look at the PHY and make sure it's working OK.
Pete
* Re: AMD Au1100 problems (USB & Ethernet)
From: Pete Popov @ 2005-08-05 20:10 UTC (permalink / raw)
To: Michael Kelly; +Cc: Sylvain Munaut, 'linux-mips@linux-mips.org'
On Fri, 2005-08-05 at 15:58 -0400, Michael Kelly wrote:
> The error count is less than .15%, not 5%. This does not seem excessive.
> So, the question is what are these errors exactly. We have done internal
> testing, but there is no way to test with every cable and switch/hub
> combination.
Of course. I'm sure the CPU module itself is fine. I took a look at the
picture and it looks like the PHY is external so I'm guessing it's on
their custom PCB.
> If you could determine the actual errors (such as CRC, collision, etc) then we
> can try to determine where the errors are coming from. It may very well be
> HW, but it is a bit too early to make such a broad statement without more
> information.
Well, could be just a cable issue, hub, etc, but I'll put that in the HW
bucket as well :)
Pete
* Re: AMD Au1100 problems (USB & Ethernet)
From: Sylvain Munaut @ 2005-08-05 21:05 UTC (permalink / raw)
To: ppopov; +Cc: Michael Kelly, 'linux-mips@linux-mips.org'
Pete Popov wrote:
> On Fri, 2005-08-05 at 15:58 -0400, Michael Kelly wrote:
>
>>The error count is less than .15%, not 5%. This does not seem excessive.
>>So, the question is what are these errors exactly. We have done internal
>>testing, but there is no way to test with every cable and switch/hub
>>combination.
Yes, for that particular count, because I was mainly testing TX on that
particular boot (so the RX packets are mainly small ACKs). But when testing
heavy receive with big packets, it can climb higher.
> Of course. I'm sure the CPU module itself is fine. I took a look at the
> picture and it looks like the PHY is external so I'm guessing it's on
> their custom PCB.
>
The PHY is on the CPU module itself, it's a BCM5221.
>>If you could determine the actual errors (such as CRC, collision, etc) then we
>>can try to determine where the errors are coming from. It may very well be
>>HW, but it is a bit too early to make such a broad statement without more
>>information.
>
>
> Well, could be just a cable issue, hub, etc, but I'll put that in the HW
> bucket as well :)
The RX errors are reported as "rx miss" (RX_MISSED_FRAME set) which is
described as "Internal FIFO overrun". Maybe those are just OK and it's
just that it can't withstand the full 100 Mbps (the module is connected to a
10/100/1000 switch and the server is gigabit).
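To give a feel for the load involved, here is a rough Python sketch of the frame rate a saturated 100 Mbps link can deliver. It assumes standard Ethernet framing overhead (8 bytes of preamble/SFD plus a 12-byte inter-frame gap per frame); the function name is illustrative, and it says nothing about what the Au1100 MAC can actually sustain.

```python
PREAMBLE_AND_SFD_BYTES = 8   # sent before every frame
INTERFRAME_GAP_BYTES = 12    # mandatory idle time between frames


def max_frames_per_second(link_bps: float, frame_bytes: int) -> float:
    """Upper bound on frames/s for frames of frame_bytes (header + payload + FCS)."""
    wire_bits = (frame_bytes + PREAMBLE_AND_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / wire_bits


if __name__ == "__main__":
    for size in (64, 1518):  # minimum and maximum untagged frame sizes
        rate = max_frames_per_second(100e6, size)
        print(f"{size:>4}-byte frames: {rate:,.0f} frames/s")
```

With full-size frames the receiver has to drain its FIFO several thousand times per second; with minimum-size frames the bound is more than an order of magnitude higher.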
The TX errors are timeouts; how can I find more details about them?
Sylvain
* Re: AMD Au1100 problems (USB & Ethernet)
From: Pete Popov @ 2005-08-05 21:14 UTC (permalink / raw)
To: Sylvain Munaut; +Cc: Michael Kelly, 'linux-mips@linux-mips.org'
> The PHY is on the CPU module itself, it's a BCM5221.
I see.
> >>If you could determine the actual errors (such as CRC, collision, etc) then we
> >>can try to determine where the errors are coming from. It may very well be
> >>HW, but it is a bit too early to make such a broad statement without more
> >>information.
> >
> >
> > Well, could be just a cable issue, hub, etc, but I'll put that in the HW
> > bucket as well :)
>
> The RX errors are reported as "rx miss" (RX_MISSED_FRAME set) which is
> described as "Internal FIFO overrun". Maybe those are just OK and it's
> just that it can't withstand full 100Mbps (the module is connected on a
> 10/100/1000 switch and the server is gigabit).
No, I don't think that's normal.
> The TX errors are time-out, how can I find more details about that ?
If possible, eliminate the gig switch by replacing it with a small
10/100 switch. If the problems go away, then that's a big clue.
Take a look at what the bcm phy is auto-negotiating and make sure it
matches what the switch thinks it has negotiated. Although, the tx
timeouts should have nothing to do with mismatched auto negotiation...
but I see there are a bunch of "carrier" errors.
You of course tried a different cable, just in case?
Pete
* Re: AMD Au1100 problems (USB & Ethernet)
From: Sylvain Munaut @ 2005-08-06 10:42 UTC (permalink / raw)
To: linux-mips
Pete Popov wrote:
>>The RX errors are reported as "rx miss" (RX_MISSED_FRAME set) which is
>>described as "Internal FIFO overrun". Maybe those are just OK and it's
>>just that it can't withstand full 100Mbps (the module is connected on a
>>10/100/1000 switch and the server is gigabit).
>
> No, I don't think that's normal.
Maybe it has something to do with initialisation that I'm not doing
properly. The bootloader is uMon, not YaMon, so maybe something is
expected to be set up that I don't know of.
>>The TX errors are time-out, how can I find more details about that ?
>
>
> If possible, eliminate the gig switch by replacing it with a small
> 10/100 switch. If the problems go away, then that's a big clue.
I don't have a 10/100 switch, but I tried with a 10/100 hub and the results
are much the same. I just get a few more "rx runt" errors, which are due
to the hub.
> Take a look at what the bcm phy is auto-negotiating and make sure it
> matches what the switch thinks it has negotiated. Although, the tx
> timeouts should have nothing to do with mismatched auto negotiation...
> but I see there are a bunch of "carrier" errors.
The PHY reports 100 Mbps half duplex with the hub and 100 Mbps full duplex
with the switch, which looks correct.
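The check suggested here can be sketched in Python as follows: given what each side advertises, auto-negotiation should settle on the best mode common to both. The bit values follow the standard MII advertisement register layout (the same values as the Linux ADVERTISE_* constants). The actual register contents of the BCM5221 on this board are not shown in the thread, so the inputs below are hypothetical, and 100BASE-T4 is left out for brevity.

```python
# Standard MII auto-negotiation advertisement bits.
ADVERTISE_10HALF = 0x0020
ADVERTISE_10FULL = 0x0040
ADVERTISE_100HALF = 0x0080
ADVERTISE_100FULL = 0x0100

# Highest-priority mode first.
PRIORITY = [
    (ADVERTISE_100FULL, "100 Mbps full duplex"),
    (ADVERTISE_100HALF, "100 Mbps half duplex"),
    (ADVERTISE_10FULL, "10 Mbps full duplex"),
    (ADVERTISE_10HALF, "10 Mbps half duplex"),
]


def resolve_link(local_adv: int, partner_adv: int):
    """Return the best mode advertised by both ends, or None if there is none."""
    common = local_adv & partner_adv
    for bit, name in PRIORITY:
        if common & bit:
            return name
    return None


if __name__ == "__main__":
    everything = (ADVERTISE_10HALF | ADVERTISE_10FULL
                  | ADVERTISE_100HALF | ADVERTISE_100FULL)
    half_only = ADVERTISE_10HALF | ADVERTISE_100HALF  # hypothetical hub-like partner
    print("switch:", resolve_link(everything, everything))
    print("hub:   ", resolve_link(everything, half_only))
```

If the mode computed this way differs from what either end reports, that points to a negotiation mismatch; here the reported results are consistent with a full-duplex-capable switch and a half-duplex-only hub.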
By the way, it seems that after a timeout error, the au1000_timer isn't
restored correctly (I put a printk in it; before the errors it
prints every second, and never afterwards).
> You of course tried a different cable, just in case?
Sure, with 3 different cables in fact.
Sylvain