Linux MIPS Architecture development
* AMD Au1100 problems (USB & Ethernet)
@ 2005-08-05 19:39 Sylvain Munaut
  2005-08-05 19:47 ` Pete Popov
  0 siblings, 1 reply; 6+ messages in thread
From: Sylvain Munaut @ 2005-08-05 19:39 UTC (permalink / raw)
  To: linux-mips

Hello,

I've been trying to adapt Linux (the HEAD CVS version) to a
custom board based around an Au1100. To be more precise, the board
uses a "cpu module" (CSB650 from Cogent
http://www.cogcomp.com/csb_csb650.htm ) that's placed on a custom PCB.


I've compiled and booted a kernel successfully, and I see the messages on
the serial console. It's in big-endian mode since the boot loader on the
card is big-endian only and I couldn't manage to get it to switch to
little endian ...

Now, let's go on with the problems :

 * About USB. The first time I tried, it just hung, but I quickly found
out that it was because I hadn't routed the 48 MHz clock to the USB
module. After that, I had to slightly adapt the OHCI bus glue to enable
the OHCI big-endian mode. Now, when a USB stick is inserted, it gets
detected, and I can mount it and read small files. But when I try to
read bigger files (just 1 or 2 MB), I get stuff like:

[4294743.146000] usb 1-1: reset full speed USB device using au1xxx-ohci and address 2
[4294743.618000] usb 1-1: reset full speed USB device using au1xxx-ohci and address 2
[4294743.891000] usb 1-1: reset full speed USB device using au1xxx-ohci and address 2
[4294744.151000] usb 1-1: reset full speed USB device using au1xxx-ohci and address 2
[4294744.328000] au1xxx-ohci au1xxx-ohci.0: bad entry       4b
[4294744.346000] au1xxx-ohci au1xxx-ohci.0: bad entry ac450000
[4294744.363000] au1xxx-ohci au1xxx-ohci.0: bad entry 8f820014
[4294744.381000] au1xxx-ohci au1xxx-ohci.0: bad entry 38210001
[4294744.495000] hub 1-0:1.0: port 1 disabled by hub (EMI?), re-enabling...
[4294744.515000] usb 1-1: USB disconnect, address 2
[4294745.532000] au1xxx-ohci au1xxx-ohci.0: IRQ INTR_SF lossage
[4294745.532000] usb 1-1: sg_complete, unlink --> -19
[4294745.532000] usb 1-1: sg_complete, unlink --> -19



Which means absolutely nothing to me ;( Has anyone got a clue?
I can't say for sure it's not hardware, but the cpu module is used by
others, and on the base board it's just a couple of differential pairs
with 90 ohm differential impedance, nothing more ...
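
For what it's worth, the last three "bad entry" values in the log happen
to be valid MIPS I-type instruction encodings. The sketch below only
splits each word into its fields and also shows the same word
byte-swapped, since endianness is in play here. It is an inspection aid
under my own assumptions, not part of the au1xxx-ohci driver, and it
does not by itself identify the cause. The helper names and the small
opcode table are made up for illustration.

```python
# Illustrative helper, not driver code: split a 32-bit word into MIPS
# I-type fields to see whether it looks like an instruction.

# Small subset of MIPS32 primary opcodes; anything else is "unknown".
OPCODES = {0x0E: "xori", 0x23: "lw", 0x2B: "sw"}


def decode_itype(word):
    """Return (mnemonic, rs, rt, immediate) for a 32-bit word."""
    opcode = (word >> 26) & 0x3F   # bits 31..26
    rs = (word >> 21) & 0x1F       # bits 25..21
    rt = (word >> 16) & 0x1F       # bits 20..16
    imm = word & 0xFFFF            # bits 15..0
    return OPCODES.get(opcode, "unknown"), rs, rt, imm


def byteswap32(word):
    """Return the word as it would read with the opposite byte order."""
    return int.from_bytes(word.to_bytes(4, "big"), "little")


if __name__ == "__main__":
    # Values copied from the "bad entry" lines of the log.
    for entry in (0x0000004B, 0xAC450000, 0x8F820014, 0x38210001):
        name, rs, rt, imm = decode_itype(entry)
        print(f"{entry:08x}: {name} rs={rs} rt={rt} imm={imm:#x} "
              f"(byte-swapped: {byteswap32(entry):08x})")
```

If the values really are instructions, the controller or driver may be
looking at memory that is not a transfer descriptor at all, but that is
speculation until someone checks the pointers involved.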


 * About ethernet: It works, I have network access. However, I have
two kinds of errors. On the RX side, I get quite a lot of "rx miss"
errors (when au1x00_eth debug is on). About 5% of packets are dropped.
That's not _too_ much of a problem as long as it doesn't increase. But
what can that be due to?

A more annoying problem is that I get a lot of :
[  506.397000] NETDEV WATCHDOG: eth0: transmit timed out


[  506.412000] eth0: au1000_tx_timeout: dev=8048b400

These are quite common when I transmit a lot, and they completely ruin
the transmission (_really_ slow!).

Here are some stats from ifconfig:

          RX packets:50496 errors:76 dropped:76 overruns:0 frame:0


          TX packets:49573 errors:47 dropped:0 overruns:0 carrier:74
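
As a rough cross-check, the counters above can be turned into error
rates. This is a minimal sketch that only parses the two ifconfig lines
quoted in this message; the function names are made up for
illustration. For this particular capture the RX error rate comes out
at roughly 0.15%, which says nothing about runs with heavier receive
traffic.

```python
import re


def parse_counters(line):
    """Return the name:value counters from one ifconfig RX/TX line."""
    return {name: int(value) for name, value in re.findall(r"(\w+):(\d+)", line)}


def error_rate(counters):
    """Fraction of packets that were counted as errors."""
    return counters["errors"] / counters["packets"]


# The two lines as they appear in the ifconfig output above.
rx = parse_counters("RX packets:50496 errors:76 dropped:76 overruns:0 frame:0")
tx = parse_counters("TX packets:49573 errors:47 dropped:0 overruns:0 carrier:74")

if __name__ == "__main__":
    print(f"RX error rate: {error_rate(rx):.4%}")
    print(f"TX error rate: {error_rate(tx):.4%}")
    print(f"TX carrier errors: {tx['carrier']}")
```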





Any insight / suggestion is appreciated, I'm getting desperate ;)


	Sylvain


* Re: AMD Au1100 problems (USB & Ethernet)
  2005-08-05 19:39 AMD Au1100 problems (USB & Ethernet) Sylvain Munaut
@ 2005-08-05 19:47 ` Pete Popov
       [not found]   ` <6.2.0.14.2.20050805155414.044a0f00@mail.cogcomp.com>
  0 siblings, 1 reply; 6+ messages in thread
From: Pete Popov @ 2005-08-05 19:47 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: 'linux-mips@linux-mips.org'


I'm guessing the ethernet problems are hardware/board related. Take a
look at the PHY and make sure it's working OK.

Pete




* Re: AMD Au1100 problems (USB & Ethernet)
       [not found]   ` <6.2.0.14.2.20050805155414.044a0f00@mail.cogcomp.com>
@ 2005-08-05 20:10     ` Pete Popov
  2005-08-05 21:05       ` Sylvain Munaut
  0 siblings, 1 reply; 6+ messages in thread
From: Pete Popov @ 2005-08-05 20:10 UTC (permalink / raw)
  To: Michael Kelly; +Cc: Sylvain Munaut, 'linux-mips@linux-mips.org'

On Fri, 2005-08-05 at 15:58 -0400, Michael Kelly wrote:
> The error count is less than .15%, not 5%.  This does not seem excessive.
> So, the question is what are these errors exactly.  We have done internal
> testing, but there is no way to test with every cable and switch/hub 
> combination.

Of course. I'm sure the CPU module itself is fine. I took a look at the
picture and it looks like the PHY is external so I'm guessing it's on
their custom PCB.

> If you could determine the actual errors (such as CRC, collision, etc) then we
> can try to determine where the errors are coming from.  It may very well be
> HW, but it is a bit too early to make such a broad statement without more
> information.

Well, could be just a cable issue, hub, etc, but I'll put that in the HW
bucket as well :)

Pete


* Re: AMD Au1100 problems (USB & Ethernet)
  2005-08-05 20:10     ` Pete Popov
@ 2005-08-05 21:05       ` Sylvain Munaut
  2005-08-05 21:14         ` Pete Popov
  0 siblings, 1 reply; 6+ messages in thread
From: Sylvain Munaut @ 2005-08-05 21:05 UTC (permalink / raw)
  To: ppopov; +Cc: Michael Kelly, 'linux-mips@linux-mips.org'

Pete Popov wrote:
> On Fri, 2005-08-05 at 15:58 -0400, Michael Kelly wrote:
> 
>>The error count is less than .15%, not 5%.  This does not seem excessive.
>>So, the question is what are these errors exactly.  We have done internal
>>testing, but there is no way to test with every cable and switch/hub 
>>combination.

Yes, on that particular count, because I was mainly testing TX on that
particular boot (so the RX packets are mainly small ACKs). But when
testing heavy receive with big packets, it can climb.

> Of course. I'm sure the CPU module itself is fine. I took a look at the
> picture and it looks like the PHY is external so I'm guessing it's on
> their custom PCB.
> 

The PHY is on the CPU module itself, it's a BCM5221.


>>If you could determine the actual errors (such as CRC, collision, etc) then we
>>can try to determine where the errors are coming from.  It may very well be
>>HW, but it is a bit too early to make such a broad statement without more
>>information.
> 
> 
> Well, could be just a cable issue, hub, etc, but I'll put that in the HW
> bucket as well :)

The RX errors are reported as "rx miss" (RX_MISSED_FRAME set), which is
described as "Internal FIFO overrun". Maybe those are just OK and it's
just that it can't withstand full 100Mbps (the module is connected to a
10/100/1000 switch and the server is gigabit).

The TX errors are time-out, how can I find more details about that ?


	Sylvain


* Re: AMD Au1100 problems (USB & Ethernet)
  2005-08-05 21:05       ` Sylvain Munaut
@ 2005-08-05 21:14         ` Pete Popov
  2005-08-06 10:42           ` Sylvain Munaut
  0 siblings, 1 reply; 6+ messages in thread
From: Pete Popov @ 2005-08-05 21:14 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: Michael Kelly, 'linux-mips@linux-mips.org'


> The PHY is on the CPU module itself, it's a BCM5221.

I see.

> >>If you could determine the actual errors (such as CRC, collision, etc) then we
> >>can try to determine where the errors are coming from.  It may very well be
> >>HW, but it is a bit too early to make such a broad statement without more
> >>information.
> > 
> > 
> > Well, could be just a cable issue, hub, etc, but I'll put that in the HW
> > bucket as well :)
> 
> The RX errors are reported as "rx miss" (RX_MISSED_FRAME set), which is
> described as "Internal FIFO overrun". Maybe those are just OK and it's
> just that it can't withstand full 100Mbps (the module is connected to a
> 10/100/1000 switch and the server is gigabit).

No, I don't think that's normal.

> The TX errors are time-out, how can I find more details about that ?

If possible, eliminate the gig switch by replacing it with a small
10/100 switch. If the problems go away, then that's a big clue.

Take a look at what the bcm phy is auto-negotiating and make sure it
matches what the switch thinks it has negotiated. Although, the tx
timeouts should have nothing to do with mismatched auto negotiation...
but I see there are a bunch of "carrier" errors.

You of course tried a different cable, just in case?

Pete


* Re: AMD Au1100 problems (USB & Ethernet)
  2005-08-05 21:14         ` Pete Popov
@ 2005-08-06 10:42           ` Sylvain Munaut
  0 siblings, 0 replies; 6+ messages in thread
From: Sylvain Munaut @ 2005-08-06 10:42 UTC (permalink / raw)
  To: linux-mips

Pete Popov wrote:

>>The RX errors are reported as "rx miss" (RX_MISSED_FRAME set), which is
>>described as "Internal FIFO overrun". Maybe those are just OK and it's
>>just that it can't withstand full 100Mbps (the module is connected to a
>>10/100/1000 switch and the server is gigabit).
> 
> No, I don't think that's normal.

Maybe it has something to do with initialisation that I don't do
properly. The bootloader is uMon, not YaMon, so maybe something is
expected to be set up that I don't know of.


>>The TX errors are time-out, how can I find more details about that ?
> 
> 
> If possible, eliminate the gig switch by replacing it with a small
> 10/100 switch. If the problems go away, then that's a big clue.

I don't have a 10/100 switch, but I tried on a 10/100 hub and the
results are much the same. I just get a few more "rx runt" errors,
which are due to the hub.

> Take a look at what the bcm phy is auto-negotiating and make sure it
> matches what the switch thinks it has negotiated. Although, the tx
> timeouts should have nothing to do with mismatched auto negotiation...
> but I see there are a bunch of "carrier" errors.

Phy reports 100Mbps half duplex with the hub and 100Mbps full duplex
with the switch, which looks correct.
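
For anyone who wants to repeat the check from raw register dumps, here
is a minimal sketch of how the negotiated mode follows from the
standard MII advertisement (register 4) and link partner ability
(register 5) bits. The register values in the example are hypothetical,
not read from this board, and the sketch ignores 100BASE-T4. A hub that
does not autonegotiate may not fill in register 5 the same way, so
treat the second case as illustrative only.

```python
# Ability bits shared by MII registers 4 (advertisement) and 5 (link
# partner ability), listed highest priority first. 100BASE-T4 (0x0200)
# is left out of this sketch.
ABILITIES = [
    (0x0100, "100Mbps full duplex"),
    (0x0080, "100Mbps half duplex"),
    (0x0040, "10Mbps full duplex"),
    (0x0020, "10Mbps half duplex"),
]


def negotiated_mode(advertise, lpa):
    """Return the best mode both ends support, or None if there is none."""
    common = advertise & lpa
    for bit, name in ABILITIES:
        if common & bit:
            return name
    return None


if __name__ == "__main__":
    # Hypothetical values: the PHY advertises all four 10/100 modes.
    advertise = 0x01E1
    print(negotiated_mode(advertise, 0x45E1))  # partner offers everything
    print(negotiated_mode(advertise, 0x0081))  # partner offers 100 half only
```

The result should then be compared with what the switch itself reports
for the port.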

By the way, it seems that after a timeout error the au1000_timer isn't
restored correctly (I put a printk in it: before the errors it prints
every second, and after them it never prints again).

> You of course tried a different cable, just in case?

Sure, with three different cables in fact.


	Sylvain

