* MPC5200 ethernet communication stops unexpected
@ 2007-04-20 19:21 Eberhard Stoll
2007-04-27 14:35 ` tacitus
0 siblings, 1 reply; 14+ messages in thread
From: Eberhard Stoll @ 2007-04-20 19:21 UTC (permalink / raw)
To: linuxppc-embedded
Hello,
can someone help me?
I have a problem with Ethernet communication on our MPC5200 board, which
runs the DENX Linux kernel 2.4.25.
My problem is that our board suddenly and unexpectedly stops receiving and
transmitting Ethernet frames. No pings - nothing gets through the
Ethernet any more! The rest of the controller keeps running well.
This situation is very rare - only a setup with 6 controllers and a
special Ethernet communication load leads to this fault - in about
one day!
When I check how many tx and rx buffers are used in the BestComm buffer
descriptor rings (via the TaskBDInUse() call), which hold pointers to the
transmit and receive data, I see that all tx and rx buffers are in use!
When I look at the FEC Tx FIFO Status Register I get the value 0x00030000.
This means the FEC Tx FIFO is empty!
When I check the FEC Task Control registers I see that the Tx and Rx tasks
are enabled; the current pointer register points to 0xF000828C, which
belongs to the FEC Tx BestComm microcode.
When I now examine the status value of my currently active buffer
descriptor for the Tx task, it shows that this buffer belongs to BestComm.
In my opinion BestComm should transfer this buffer to the empty FEC Tx
FIFO - but it doesn't!
Now the question for me is: WHY?
Am I overlooking some bits in the registers/RAM locations which lead to
this situation?
I don't know much about BestComm/FEC, but my conclusion from this is that
BestComm is stuck somewhere for some reason. Is this a hardware
issue? Does someone know? Has anyone seen similar problems?
Can someone help me or check whether my conclusion is right (see the
registers and RAM locations below)? Or give me a hint where to look further.
Any help is very welcome!
Many Thanks,
Eberhard
PS: So far I have seen this problem only on MPC5200B processors. We use
BestComm API 2.2 (the newest DENX 2.4.x code for BestComm).
Here are some registers and RAM addresses I collected in this situation:
BTW: I saw the BestComm microcode in the 16k SRAM change after the first
Ethernet frames were sent. This seems very strange to me - but maybe it is
correct behaviour - I don't know. Does any of you know?
It's address 0xf00082e8 (in the FEC Tx BestComm microcode). After reset
(no frame sent) the RAM at this address contains 0x088AC398; after the
first(?) Ethernet frame it's 0x0C8AC398!
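To see exactly what differs between the two microcode words quoted above, they can be XORed. This is only a minimal sketch in Python (the variable names are illustrative and not part of any driver). It shows which bits changed, not what they mean to BestComm:

```python
# Values quoted above for SRAM address 0xf00082e8.
before = 0x088AC398  # after reset, no frame sent
after = 0x0C8AC398   # after the first(?) Ethernet frame

# XOR leaves a 1 in every bit position where the two words differ.
diff = before ^ after
changed_bits = [n for n in range(32) if (diff >> n) & 1]

print(f"diff = 0x{diff:08X}")
print("changed bit numbers (LSB = 0):", changed_bits)
```

Only one bit position differs between the two values, so whatever rewrites the word appears to toggle a single flag rather than replace the instruction.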
=== BestComm registers ===
taskBar 0xF0008000
currentPointer 0xF000828C
endPointer 0x00000000
variablePointer 0xF0008800
IntVect1 0x0000000F
IntVect2 0x0000000F
PtdCntrl 0x00000001
IntPend 0x00000000
IntMask 0xFFFFFFF3
tcr_0 0x0000
tcr_1 0x0000
tcr_2 0xE082
tcr_3 0xC383
tcr_4 0x0000
tcr_5 0x0000
tcr_6 0x0000
tcr_7 0x0000
tcr_8 0x0000
tcr_9 0x0000
tcr_a 0x0000
tcr_b 0x0000
tcr_c 0x0000
tcr_d 0x0000
tcr_e 0x0000
tcr_f 0x0000
IPR0 0x07
IPR1 0x00
IPR2 0x00
IPR3 0x04
IPR4 0x03
IPR5 0x06
IPR6 0x05
IPR7 0x00
IPR8 0x00
IPR9 0x00
IPR10 0x00
IPR11 0x00
IPR12 0x00
IPR13 0x00
IPR14 0x00
IPR15 0x00
IPR16 0x00
IPR17 0x00
IPR18 0x00
IPR19 0x00
IPR20 0x00
IPR21 0x00
IPR22 0x00
IPR23 0x00
IPR24 0x00
IPR25 0x00
IPR26 0x00
IPR27 0x00
IPR28 0x00
IPR29 0x00
IPR30 0x00
IPR31 0x00
task_size0 0x00000000
task_size1 0x00000000
MDEDebug 0x01000008
ADSDebug 0x00000000
Value1 0x00000000
Value2 0x00000000
Control 0x00000000
Status 0x00000000
EU00 0x04155519
EU01 0x00000000
EU02 0x00000000
EU03 0x00000000
EU04 0x00000000
EU05 0x00000000
EU06 0x00000000
EU07 0x00000000
EU10 0x00000000
EU11 0x00000000
EU12 0x00000000
EU13 0x00000000
EU14 0x00000000
EU15 0x00000000
EU16 0x00000000
EU17 0x00000000
EU20 0x00000000
EU21 0x00000000
EU22 0x00000000
EU23 0x00000000
EU24 0x00000000
EU25 0x00000000
EU26 0x00000000
EU27 0x00000000
EU30 0x00000000
EU31 0x00000000
EU32 0x00000000
EU33 0x00000000
EU34 0x00000000
EU35 0x00000000
EU36 0x00000000
EU37 0x00000000
=== Return values of BestComm Api Functions ===
t_tasknum = 2
TaskBDInUse(tx) = 256
TaskStatus(tx) = run (0x00008000)
TaskIntPending(tx) = 0
r_tasknum = 3
TaskBDInUse(rx) = 256
TaskStatus(rx) = run (0x00008000)
TaskIntPending(rx) = 0
TasksGetSramOffset = 0x00002500
task 0: stop int: 0x00000000
task 1: stop int: 0x00000000
task 2: run int: 0x00000000
task 3: run int: 0x00000000
task 4: stop int: 0x00000000
task 5: stop int: 0x00000000
task 6: stop int: 0x00000000
task 7: stop int: 0x00000000
task 8: stop int: 0x00000000
task 9: stop int: 0x00000000
task 10: stop int: 0x00000000
task 11: stop int: 0x00000000
task 12: stop int: 0x00000000
task 13: stop int: 0x00000000
task 14: stop int: 0x00000000
task 15: stop int: 0x00000000
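The run/stop list above can be cross-checked against the raw TCR and IntMask values from the register dump. The sketch below is hedged: it assumes the enable bit in each TCR is 0x8000 (the value TaskStatus() reports for "run" above) and that bit n of IntMask masks task n when set. Neither assumption is confirmed in this post.

```python
# TCR values copied from the BestComm register dump (index = task number).
tcr = [0x0000, 0x0000, 0xE082, 0xC383] + [0x0000] * 12

# Assumption: same bit that TaskStatus() returns for a running task.
ENABLE_MASK = 0x8000

enabled_tasks = [n for n, value in enumerate(tcr) if value & ENABLE_MASK]
print("enabled tasks:", enabled_tasks)

# IntMask from the dump. Assumption: a set bit n masks the interrupt of task n.
int_mask = 0xFFFFFFF3
unmasked_tasks = [n for n in range(16) if not (int_mask >> n) & 1]
print("tasks with unmasked interrupt:", unmasked_tasks)
```

Under these assumptions both checks point at tasks 2 and 3, which matches t_tasknum and r_tasknum.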
=== FEC Registers ===
fec base addr f0003000
fec_id 0x00000000
ievent 0x08000000
imask 0xF0FE0000
r_des_active 0x00000000
x_des_active 0x00000000
ecntrl 0xF0000002
mii_data 0x5F821200
mii_speed 0x0000001C
mib_control 0x40000000
r_cntrl 0x05EE0024
r_hash 0x8A000000
x_cntrl 0x00000004
paddr1 0x00E0BA90
paddr2 0x07DC8808
op_pause 0x00010020
iaddr1 0x00000000
iaddr2 0x00000000
gaddr1 0x00400000
gaddr2 0x00000000
x_wmrk 0x00000000
rfifo_status 0x214E0000
rfifo_cntrl 0x0F240000
rfifo_lrf_ptr 0x0000005D
rfifo_lwf_ptr 0x0000038D
rfifo_alarm 0x0000030C
rfifo_rdptr 0x0000005D
rfifo_wrptr 0x0000005D
tfifo_status 0x00030000
tfifo_cntrl 0x0F200000
tfifo_lrf_ptr 0x0000023C
tfifo_lwf_ptr 0x0000023C
tfifo_alarm 0x00000100
tfifo_rdptr 0x0000023C
tfifo_wrptr 0x0000023C
reset_cntrl 0x01000000
xmit_fsm 0x03000000
=== FEC driver vars ===
MBAR 0xF0000000
MBAR SIZE 0x10000000
queue_stopped 1
mpc5xxx_bdi_tx 73 (x49)
mpc5xxx_bdi_rx 97 (x61)
adr(tx_fifo_skb) c0233744
adr(tx_fifo_skb[0]) c0233744
adr(tx_fifo_skb[1]) c0233748
tx_fifo_skb[0] c2474f20
tx_fifo_skb[1] c24846e0
sizeof(tx_fifo_skb) 1024
sizeof(tx_fifo_skb[0]) 4
MPC5xxx_FEC_TBD_NUM 256
adr(rx_fifo_skb) c0233b44
sizeof(rx_fifo_skb) 1024
sizeof(rx_fifo_skb[0]) 4
MPC5xxx_FEC_RBD_NUM 256
full_duplex 1
tx_full 1
r_tasknum 3
t_tasknum 2
r_irq 24
t_irq 23
last_transmit_time 0
last_receive_time 0
phy_id 0x0015F442
phy_id_done 1
phy_status 0x00000000
phy_speed 28
sequence_done 0
link 0
duplex_change 0
link_up 0
old_status 0x00000000
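The driver variables above are consistent with a completely full tx ring. A small Python sketch of that check, using only numbers from the dumps (the variable names are illustrative):

```python
# Values copied from the dumps above.
sizeof_tx_fifo_skb = 1024       # sizeof(tx_fifo_skb)
sizeof_tx_fifo_skb_entry = 4    # sizeof(tx_fifo_skb[0])
tbd_num = 256                   # MPC5xxx_FEC_TBD_NUM
tx_bd_in_use = 256              # TaskBDInUse(tx)

# Number of skb pointers the array can hold, one per tx descriptor.
ring_entries = sizeof_tx_fifo_skb // sizeof_tx_fifo_skb_entry

# The ring is full when every descriptor is in use.
ring_full = tx_bd_in_use == tbd_num

print("ring entries:", ring_entries)
print("tx ring full:", ring_full)
```

This agrees with tx_full = 1 and queue_stopped = 1: the driver has no free tx descriptor left and has stopped the queue.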
=== BDHead Table ===
TASK #0
[0xC0232D74] = 0x00
[0xC0232D75] = 0x00
[0xC0232D76] = 0x00
[0xC0232D77] = 0x00
TASK #1
[0xC0232D78] = 0x00
[0xC0232D79] = 0x00
[0xC0232D7A] = 0x00
[0xC0232D7B] = 0x00
TASK #2 - tx task
[0xC0232D7C] = 0x00
[0xC0232D7D] = 0x00
[0xC0232D7E] = 0x00
[0xC0232D7F] = 0x49
--> current tx index: 0x49 -> 73
TASK #3 - rx task
[0xC0232D80] = 0x00
[0xC0232D81] = 0x00
[0xC0232D82] = 0x00
[0xC0232D83] = 0x61
--> current rx index: 0x61 -> 97
[0xC0232D84] = 0x00
[0xC0232D85] = 0x00
[0xC0232D86] = 0x00
...
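The byte-wise BDHead listing above can be folded back into the two indices. The sketch assumes each entry is one 32-bit big-endian word, which fits the four bytes shown per task but is an inference from the dump:

```python
# Bytes as listed in the BDHead table for the tx task (#2) and rx task (#3).
tx_head_bytes = bytes([0x00, 0x00, 0x00, 0x49])  # 0xC0232D7C..0xC0232D7F
rx_head_bytes = bytes([0x00, 0x00, 0x00, 0x61])  # 0xC0232D80..0xC0232D83

# Assumption: each entry is a single 32-bit big-endian word.
tx_index = int.from_bytes(tx_head_bytes, "big")
rx_index = int.from_bytes(rx_head_bytes, "big")

print("tx index:", tx_index)
print("rx index:", rx_index)
```

The results match mpc5xxx_bdi_tx and mpc5xxx_bdi_rx in the driver variables.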
=== TaskBDIdxTable ===
TASK#0
numBD [0xC0232DF4] = 0x0000
numPtr [0xC0232DF6] = 0x00
apiConfig [0xC0232DF7] = 0x00
BDTablePtr [0xC0232DF8] = 0x00000000
BDStartPtr [0xC0232DFC] = 0x00000000
currBDInUse[0xC0232E00] = 0x0000
[0xC0232E02] = 0x00
[0xC0232E03] = 0x00
TASK#1
[0xC0232E04] = 0x00
[0xC0232E05] = 0x00
[0xC0232E06] = 0x00
[0xC0232E07] = 0x00
[0xC0232E08] = 0x00
[0xC0232E09] = 0x00
[0xC0232E0A] = 0x00
[0xC0232E0B] = 0x00
[0xC0232E0C] = 0x00
[0xC0232E0D] = 0x00
[0xC0232E0E] = 0x00
[0xC0232E0F] = 0x00
[0xC0232E10] = 0x00
[0xC0232E11] = 0x00
[0xC0232E12] = 0x00
[0xC0232E13] = 0x00
TASK#2 - tx task
numBD [0xC0232E14] = 0x0100
numPtr [0xC0232E16] = 0x01
apiConfig [0xC0232E17] = 0x01
BDTablePtr [0xC0232E18] = 0xF0009D00
BDStartPtr [0xC0232E1C] = 0xF0008814
currBDInUse[0xC0232E20] = 0x0100
[0xC0232E22] = 0x00
[0xC0232E23] = 0x00
TASK#3 - rx task
numBD [0xC0232E24] = 0x0100
numPtr [0xC0232E26] = 0x01
apiConfig [0xC0232E27] = 0x00
BDTablePtr [0xC0232E28] = 0xF0009500
BDStartPtr [0xC0232E2C] = 0xF0008890
currBDInUse[0xC0232E30] = 0x0100
[0xC0232E32] = 0x00
[0xC0232E33] = 0x00
TASK#4
[0xC0232E34] = 0x00
[0xC0232E35] = 0x00
...
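The addresses in the TaskBDIdxTable dump suggest a 16-byte entry per task. The following Python sketch decodes the tx task entry with that layout. The field order and sizes are inferred from the addresses and labels in the dump, not taken from the BestComm API headers, so treat the format string as an assumption:

```python
import struct

# Assumed layout of one entry (big-endian, 16 bytes): u16 numBD, u8 numPtr,
# u8 apiConfig, u32 BDTablePtr, u32 BDStartPtr, u16 currBDInUse,
# then 2 unlabeled bytes that are skipped here.
ENTRY = struct.Struct(">HBBIIH2x")
TABLE_BASE = 0xC0232DF4  # address of the TASK#0 entry in the dump


def entry_address(task):
    """Address of a task's entry, assuming entries are contiguous."""
    return TABLE_BASE + task * ENTRY.size


# Raw bytes of the tx task (#2) entry as dumped at 0xC0232E14..0xC0232E23.
raw_tx = bytes.fromhex("0100 01 01 F0009D00 F0008814 0100 0000")
num_bd, num_ptr, api_config, bd_table, bd_start, in_use = ENTRY.unpack(raw_tx)

print(hex(entry_address(2)), num_bd, hex(bd_table), hex(bd_start), in_use)
```

With this layout the computed entry addresses for tasks 2 and 3 line up with the addresses shown in the dump.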
=== Tx Descriptor Ring ===
IDX 0x49 (73) is interesting because it is the active one (see BDHead Table).
So our descriptor is at address 0xF0009D00 + (0x49 * 0x8) = 0xF0009F48
IDX 0
[0xF0009D00] = 0x4C00003C
[0xF0009D04] = 0x02475922
IDX 1
[0xF0009D08] = 0x4C00003C
[0xF0009D0C] = 0x02475CA2
...
IDX 72
[0xF0009F40] = 0x4C00003C
[0xF0009F44] = 0x024665A2
IDX 73 - active descriptor
[0xF0009F48] = 0x4C00004E - owned by BestComm, should be transferred
[0xF0009F4C] = 0x0243F05E
IDX 74
[0xF0009F50] = 0x4C00004E
[0xF0009F54] = 0x0243F85E
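The descriptor address arithmetic used for this listing can be written as a small helper. The address part follows directly from the dump. The status decoding is only a sketch: it assumes that bit 0x40000000 marks a descriptor as handed over to BestComm and that the low 16 bits hold the frame length. Both masks are hypothetical names and values chosen for illustration.

```python
BD_TABLE = 0xF0009D00  # BDTablePtr of the tx task, from the TaskBDIdxTable dump
BD_SIZE = 0x8          # two 32-bit words per descriptor, as in the listing


def bd_address(index):
    """Address of the first word of tx descriptor `index`."""
    return BD_TABLE + index * BD_SIZE


# Assumptions, not confirmed by the post: ownership flag and length field.
READY_MASK = 0x40000000
LENGTH_MASK = 0x0000FFFF


def split_status(word):
    """Return (handed_to_bestcomm, length_in_bytes) for a status word."""
    return bool(word & READY_MASK), word & LENGTH_MASK


print(hex(bd_address(0x49)))     # address of the active descriptor
print(split_status(0x4C00004E))  # status word of IDX 73
print(split_status(0x4C00003C))  # status word of IDX 72
```

Under these assumptions IDX 72, 73 and 74 all carry the ownership flag, which would fit TaskBDInUse(tx) = 256: every descriptor in the ring is waiting for BestComm.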
* Re: MPC5200 ethernet communication stops unexpected
2007-04-20 19:21 Eberhard Stoll
@ 2007-04-27 14:35 ` tacitus
2007-04-27 15:42 ` Daniel Schnell
2007-04-27 17:09 ` Eberhard Stoll
0 siblings, 2 replies; 14+ messages in thread
From: tacitus @ 2007-04-27 14:35 UTC (permalink / raw)
To: linuxppc-embedded
Hello Eberhard,
I have the same problem running on an MPC5200B, but under MQX ...
But my first question: can we chat in German?
ciao
mark
* RE: MPC5200 ethernet communication stops unexpected
2007-04-27 14:35 ` tacitus
@ 2007-04-27 15:42 ` Daniel Schnell
2007-04-27 17:09 ` Eberhard Stoll
1 sibling, 0 replies; 14+ messages in thread
From: Daniel Schnell @ 2007-04-27 15:42 UTC (permalink / raw)
To: tacitus, linuxppc-embedded
Hi,
please post relevant topics on the mailing list in English. We are also
interested in anything concerning MPC5200B ethernet problems.
Daniel.
* Re: MPC5200 ethernet communication stops unexpected
2007-04-27 14:35 ` tacitus
2007-04-27 15:42 ` Daniel Schnell
@ 2007-04-27 17:09 ` Eberhard Stoll
2007-04-27 17:22 ` David Kanceruk
2007-04-30 20:49 ` Wolfgang Denk
1 sibling, 2 replies; 14+ messages in thread
From: Eberhard Stoll @ 2007-04-27 17:09 UTC (permalink / raw)
To: linuxppc-embedded
Hi,
> I've the same problem by running MPC5200B, but under MQX ...
>
Did you already check/compare some registers with mine?
Maybe it could be a hint of a hardware problem in the (Rev B) processor
- or a hint of two different drivers which have the same problem :-)
>> This situation is very rare - only a constellation with 6 Controllers
>> and a special ethernet communication load leads to this fault - in about
>> one day!
>>
Not any more! Now I can reproduce the fault within 10 to 30 minutes in my
office!
Try this:
1) Use two MPC5200 boards and a PC which can telnet into the
   controllers. I use 100M full-duplex settings for the link.
2) Now telnet into the controllers and start a flood ping on all
   controllers. E.g.:
       # ping -f 10.255.226.70
   on 10.255.226.71 and
       # ping -f 10.255.226.71
   on 10.255.226.70, so both controllers ping each other. You can
   watch the dots in your telnet session.
3) After some minutes (about 10 to 30) one of the two controllers
   doesn't respond any more. It shows a slowly blinking rx LED and the
   link is still ok, but the controller doesn't respond any more. In the
   telnet session on the still-working controller you can see running
   dots mixed with some 'E's. In the telnet session on the stalled
   controller you see that Ethernet is dead.
=====================================================
Congratulations, now you have reproduced the fault!
=====================================================
It seems that this fault only occurs if the ping is done through a telnet
session. Logging into the controller on the serial console and doing the
pings there won't trigger the fault (at least not so fast). This might be a
timing issue or have to do with Ethernet utilisation. ping -f prints
dots for transmitted Ethernet frames. In a telnet session, ping therefore
produces 2 packets at the same time: one for the ping and another for
the dot in telnet!
If I do a 'ping -f 10.255.226.71 > /dev/null', which suppresses the
output, it now runs for hours ...
Eberhard
______________________________________________________________________
This email has been scanned by the MessageLabs Email Security System.
For more information please visit http://www.messagelabs.com/email
______________________________________________________________________
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: MPC5200 ethernet communication stops unexpected
2007-04-27 17:09 ` Eberhard Stoll
@ 2007-04-27 17:22 ` David Kanceruk
[not found] ` <4632369E.8010000@berghof.com>
2007-04-30 20:49 ` Wolfgang Denk
1 sibling, 1 reply; 14+ messages in thread
From: David Kanceruk @ 2007-04-27 17:22 UTC (permalink / raw)
To: Eberhard Stoll; +Cc: linuxppc-embedded
Hi,
Do your boards use the MPC5200B or MPC5200? Also, do you ever see any
corrupted data? (I'm guessing you would have mentioned it if you did see
corrupted data)
Regards,
Dave
--
David Kanceruk
"The generation of random numbers is far too important to be left to
chance."
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: MPC5200 ethernet communication stops unexpected
[not found] ` <4632369E.8010000@berghof.com>
@ 2007-04-27 18:15 ` David Kanceruk
0 siblings, 0 replies; 14+ messages in thread
From: David Kanceruk @ 2007-04-27 18:15 UTC (permalink / raw)
To: Eberhard Stoll; +Cc: linuxppc-embedded
Hi Eberhard,
I have the problem with corrupted data on the MPC5200B. I traced the
problem to the BestComm driver. I see the correct data in the socket buffer
for the icmp reply but when BestComm puts the data into the Tx FIFO of the
fec, it sometimes puts the wrong byte in location 0 (which corrupts the
ethernet destination address) or location 62 (which corrupts the ping data).
The corrupted locations are always the same. Here are the contents of some
sample buffers I captured after gracefully stopping the fec transmitter (so
I could examine the FIFO without the fec doing a reset):
icmp csum = 7c2b
skb->data = 14
skb->dst->output = c0175924
neigh_hh_output calls = c0162b20
calling neigh->ops->queue_xmit c015c360
calling q->enqueue c0168034
dev->hard_start_xmit = c013ff3c
before - fec_hard_start_xmit 1 skb->data = 14 ----------------- this is at offset 62
after - fec_hard_start_xmit 1 skb->data = 14 ----------------- still has the correct value
start_addr = 00000160, end_addr = 000001C6, buf_len = 00000066
FEC IEVENT- 00000000
00003923 CD3F0050 C2263023 08004500 ----------------- this is what is in the fec Tx FIFO
005437E3 00004001 BF72C0A8 0102C0A8
01010000 7C2B064A 000405C4 27466579
00000809 0A0B0C0D 0E0F1011 12130015 ---------------- second last byte is 00 --- this is bad!
16171819 1A1B1C1D 1E1F2021 22232425
26272829 2A2B2C2D 2E2F3031 32333435
FEC IEVENT- 00000000
icmp csum = 3a2a
skb->data = 14
skb->dst->output = c0175924
neigh_hh_output calls = c0162b20
calling neigh->ops->queue_xmit c015c360
calling q->enqueue c0168034
dev->hard_start_xmit = c013ff3c
before - fec_hard_start_xmit 1 skb->data = 14
after - fec_hard_start_xmit 1 skb->data = 14
start_addr = 000001C6, end_addr = 0000022C, buf_len = 00000066
FEC IEVENT- 00000000
00003923 CD3F0050 C2263023 08004500
005437E4 00004001 BF71C0A8 0102C0A8
01010000 3A2A064A 000506C4 2746A679
00000809 0A0B0C0D 0E0F1011 12131415 ---------------- second last byte is 14 --- this is good this time!
16171819 1A1B1C1D 1E1F2021 22232425
26272829 2A2B2C2D 2E2F3031 32333435
FEC IEVENT- 00000000
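The corrupted position in the dumps above can also be located mechanically. The following is only a minimal sketch, not part of the driver: it compares the fourth 16-byte row (frame offsets 48..63) of the good and the bad FIFO dump, with the byte values copied from the listings above. The names first_mismatch() and corrupted_frame_offset() are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the index of the first byte where the two buffers differ,
 * or -1 if they are identical over len bytes. */
long first_mismatch(const uint8_t *expected, const uint8_t *actual,
                    size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (expected[i] != actual[i])
            return (long)i;
    }
    return -1;
}

/* Fourth 16-byte row of the two FIFO dumps (frame offsets 48..63). */
static const uint8_t row48_good[16] = {
    0x00, 0x00, 0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D,
    0x0E, 0x0F, 0x10, 0x11, 0x12, 0x13, 0x14, 0x15,
};
static const uint8_t row48_bad[16] = {
    0x00, 0x00, 0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D,
    0x0E, 0x0F, 0x10, 0x11, 0x12, 0x13, 0x00, 0x15,
};

/* Frame offset of the corrupted byte, or -1 if the rows match. */
long corrupted_frame_offset(void)
{
    long idx = first_mismatch(row48_good, row48_bad, sizeof(row48_good));
    return idx < 0 ? -1 : 48 + idx;
}
```

Only this row is compared because the two dumps belong to different packets, so the checksum and sequence fields in the earlier rows differ legitimately.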
I think we need to focus on how the BestComm driver works now. I wonder if
there are any experts out there that might know what could be wrong?
Best regards,
Dave
On 4/27/07, Eberhard Stoll <eberhard.stoll@berghof.com> wrote:
>
> Hi,
> > Do your boards use the MPC5200B or MPC5200? Also, do you ever see
> > any corrupted data? (I'm guessing you would have mentioned it if you
> > did see corrupted data)
> I use MPC5200B processors. With MPC5200(A) processors I don't get this
> error!
> I haven't noticed any corrupted data so far. But it could be happening:
> right now I'm sending only pings and sometimes get output like this on
> the console:
> -- 8< --
> ..............................................................................tcp_recheck_csum: seq 0x5881325f retransmit, csum 0x232b OK?
> .............................................tcp_recheck_csum: seq 0x58813dd6 retransmit, csum 0xd1ad OK?
> ...............tcp_recheck_csum: seq 0x58814286 retransmit, csum 0xee5b OK?
> .....................tcp_recheck_csum: seq 0x58814982 retransmit, csum 0x8528 OK?
> ..............................................tcp_recheck_csum: seq 0x58815601 retransmit, csum 0x6ee1 OK?
> -- 8< --
> but I don't know what it means or where it comes from. Does someone know?
>
> Eberhard
--
David Kanceruk
"The generation of random numbers is far too important to be left to
chance."
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: MPC5200 ethernet communication stops unexpected
2007-04-27 17:09 ` Eberhard Stoll
2007-04-27 17:22 ` David Kanceruk
@ 2007-04-30 20:49 ` Wolfgang Denk
1 sibling, 0 replies; 14+ messages in thread
From: Wolfgang Denk @ 2007-04-30 20:49 UTC (permalink / raw)
To: Eberhard Stoll; +Cc: linuxppc-embedded
In message <46322E3F.8050508@berghof.com> you wrote:
>
> It seems that this fault only occurs if the ping is done thru a telnet
> session. Logging into the controller on the serial console and doing the
> pings won't trigger the fault (at least not so fast). This might be a
In our experience, it's actually not the "ping" which is
triggering the problem, but the telnet output you see (i.e. TCP/IP
traffic).
You will probably see the same behaviour by running something like
"telnet target_ip chargen >/dev/null" instead of the flood ping
(assuming you have enabled the chargen service on the target).
> If i do a 'ping -f 10.255.226.71 > /dev/null', which suppresses the
> output now run for hours ...
See above.
Best regards,
Wolfgang Denk
--
DENX Software Engineering GmbH, CEO: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
Bradley's Bromide: If computers get too powerful, we can organize
them into a committee - that will do them in.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: MPC5200 ethernet communication stops unexpected
[not found] <1179234357.1668.10.camel@pc-hans>
@ 2007-05-15 15:03 ` David Kanceruk
2007-05-16 6:56 ` Sylvain Munaut
2007-05-21 7:43 ` Eberhard Stoll
0 siblings, 2 replies; 14+ messages in thread
From: David Kanceruk @ 2007-05-15 15:03 UTC (permalink / raw)
To: Hans Thielemans; +Cc: linuxppc-embedded
Hello Hans,
Our problem was with the FEC sending data with one or two
incorrect bytes when we switched from the MPC5200 to the MPC5200B. The
byte positions were always the same. The socket buffer has the correct
data before and after the DMA engine runs but the FEC TxFIFO does not
always match.
One solution to our problem was to make the following call prior to
starting the DMA:
flush_dcache_range((unsigned long)skb->data, (unsigned long)skb->data
+ skb->len);
The other solution was to set the BSDIS bit in the XLB config register
during initialization as follows:
xlb = (struct mpc52xx_xlb *)MPC5xxx_XLB;
out_be32(&xlb->config, in_be32(&xlb->config) | MPC52xx_XLB_CFG_BSDIS);
Either solution works for us. The BSDIS bit is a new feature in the
MPC5200B. The MPC5200 did not have this bit.
According to the Freescale documentation, (Application note AN3045,
for instance) setting this bit is supposed to "disable" BestComm bus
snooping. However, I have reason to believe the documentation is in
error. Everything I have observed seems to indicate that in the
MPC5200 BestComm bus snooping was always enabled or enabled via some
other means. In the MPC5200B it appears to be "disabled" at reset (not
"enabled" as the documentation states). This is why flushing the cache
manually is one solution. Since setting the BSDIS bit also fixes the
problem, it suggests that this actually "enables" BestComm bus
snooping instead of disabling it. In my mind, it could all boil down
to a simple documentation error.
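
Why a manual flush would help if snooping is not active can be shown with a toy model. This is only a sketch under simplifying assumptions: one small buffer, a write-back CPU cache, and a DMA engine that reads RAM without snooping. The struct and all function names are invented for the illustration, and toy_flush_dcache() merely stands in for flush_dcache_range(). It does not reproduce the single-byte corruption seen on real hardware.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BUF_LEN 8

/* Toy model: one buffer as seen in RAM and as seen in the CPU data cache. */
struct toy_system {
    uint8_t ram[BUF_LEN];     /* what a non-snooping DMA engine reads */
    uint8_t dcache[BUF_LEN];  /* what the CPU sees */
    bool    dirty;            /* cache holds data not yet written to RAM */
};

/* CPU store: with a write-back cache the data lands in the cache only. */
void cpu_write(struct toy_system *s, const uint8_t *src)
{
    memcpy(s->dcache, src, BUF_LEN);
    s->dirty = true;
}

/* Stand-in for flush_dcache_range(): push dirty data out to RAM. */
void toy_flush_dcache(struct toy_system *s)
{
    if (s->dirty) {
        memcpy(s->ram, s->dcache, BUF_LEN);
        s->dirty = false;
    }
}

/* DMA read without snooping: it sees RAM, never the CPU cache. */
void dma_read(const struct toy_system *s, uint8_t *dst)
{
    memcpy(dst, s->ram, BUF_LEN);
}

/* Returns true if the DMA engine sees exactly what the CPU wrote. */
bool dma_sees_cpu_data(bool flush_first)
{
    struct toy_system s = { .ram = {0}, .dcache = {0}, .dirty = false };
    const uint8_t pkt[BUF_LEN] = {
        0x0E, 0x0F, 0x10, 0x11, 0x12, 0x13, 0x14, 0x15
    };
    uint8_t fifo[BUF_LEN];

    cpu_write(&s, pkt);
    if (flush_first)
        toy_flush_dcache(&s);
    dma_read(&s, fifo);
    return memcmp(fifo, pkt, BUF_LEN) == 0;
}
```

In this model dma_sees_cpu_data(true) succeeds and dma_sees_cpu_data(false) fails, which matches the observation that flushing before starting the DMA hides the problem.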
Perhaps you are also experiencing a caching problem.
Best regards,
David Kanceruk
On 5/15/07, Hans Thielemans <hans.thielemans@metris.com> wrote:
> Hi David,
>
> I have a similar problem. I use the PSC for communication to a DSP.
> With the MPC5200 this has always worked. Now we got boards with the
> MPC5200B in place.
>
> The BestComm DMA seems to lose bits or bytes in the last word (32 bit) of a
> DMA block. Mostly it is one byte which becomes 0. The blocks are 256
> bytes and are written/read 32 bits at a time.
> The behavior is influenced by CPU activity and bus priorities. So far I
> have found no settings that are completely free of errors.
>
> Did you have any further progress?
>
> Regards
>
> Hans Thielemans
>
--
David Kanceruk
"The generation of random numbers is far too important to be left to chance."
^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: MPC5200 ethernet communication stops unexpected
@ 2007-05-15 15:16 Hans Thielemans
2007-05-15 15:40 ` David Kanceruk
0 siblings, 1 reply; 14+ messages in thread
From: Hans Thielemans @ 2007-05-15 15:16 UTC (permalink / raw)
To: David Kanceruk; +Cc: linuxppc-embedded
Hello David,
In this case, I am flushing the cache. In any case, these are empty
packets which are sent and never changed. The CPU creates the packet once
and it is then always reused. It is received correctly maybe 100000 times,
and then suddenly I see an error in the last word.
I also tried playing with the BSDIS and PLDIS bits, and with the bus
priorities. This influences the error rate, but the error is never really
gone.
As a hack I have now added an extra word after the packet, and have the
receiver ignore it. This seems to help, but I don't like it.
Regards
Hans
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: MPC5200 ethernet communication stops unexpected
2007-05-15 15:16 Hans Thielemans
@ 2007-05-15 15:40 ` David Kanceruk
0 siblings, 0 replies; 14+ messages in thread
From: David Kanceruk @ 2007-05-15 15:40 UTC (permalink / raw)
To: Hans Thielemans; +Cc: linuxppc-embedded
Hello Hans,
There sure seems to be strange behavior in the BestComm unit. Our
problem was always with the first byte or the 63rd byte on a buffer of
102 bytes.
Did you use a value of (1<<16) for the BSDIS bit? We also do an
ioremap when setting a pointer to the xlb.
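
To make the bit value concrete, here is a minimal user-space sketch of the read-modify-write on the config word. It assumes the (1<<16) value mentioned above; TOY_XLB_CFG_BSDIS and both helper names are invented for this example. In the kernel the word would be read and written through the ioremap'ed register with in_be32()/out_be32(), which this sketch leaves out.

```c
#include <stdint.h>

/* Assumed bit position, taken from the (1<<16) value mentioned above. */
#define TOY_XLB_CFG_BSDIS (1u << 16)

/* Return the config word with BSDIS set, leaving all other bits alone. */
uint32_t xlb_cfg_set_bsdis(uint32_t cfg)
{
    return cfg | TOY_XLB_CFG_BSDIS;
}

/* Check whether BSDIS is set in a config word read back from the register. */
int xlb_cfg_bsdis_is_set(uint32_t cfg)
{
    return (cfg & TOY_XLB_CFG_BSDIS) != 0;
}
```

Reading the register back after the write and passing the value to xlb_cfg_bsdis_is_set() would be one way to confirm that the bit really took effect.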
Dave
--
David Kanceruk
"The generation of random numbers is far too important to be left to chance."
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: MPC5200 ethernet communication stops unexpected
2007-05-15 15:03 ` MPC5200 ethernet communication stops unexpected David Kanceruk
@ 2007-05-16 6:56 ` Sylvain Munaut
2007-05-21 7:43 ` Eberhard Stoll
1 sibling, 0 replies; 14+ messages in thread
From: Sylvain Munaut @ 2007-05-16 6:56 UTC (permalink / raw)
To: David Kanceruk; +Cc: Hans Thielemans, linuxppc-embedded
David Kanceruk wrote:
> Hello Hans,
>
> Our problem was with the FEC sending data with one or two
> incorrect bytes when we switched from the MPC5200 to the MPC5200B. The
> byte positions were always the same. The socket buffer has the correct
> data before and after the DMA engine runs but the FEC TxFIFO does not
> always match.
>
> One solution to our problem was to make the following call prior to
> starting the DMA:
>
> flush_dcache_range((unsigned long)skb->data, (unsigned long)skb->data
> + skb->len);
>
> The other solution was to set the BSDIS bit in the XLB config register
> during initialization as follows:
>
> xlb = (struct mpc52xx_xlb *)MPC5xxx_XLB;
> out_be32(&xlb->config, in_be32(&xlb->config) | MPC52xx_XLB_CFG_BSDIS);
>
> Either solution works for us. The BSDIS bit is a new feature in the
> MPC5200B. The MPC5200 did not have this bit.
>
> According to the Freescale documentation, (Application note AN3045,
> for instance) setting this bit is supposed to "disable" BestComm bus
> snooping. However, I have reason to believe the documentation is in
> error. Everything I have observed seems to indicate that in the
> MPC5200 BestComm bus snooping was always enabled or enabled via some
> other means. In the MPC5200B it appears to be "disabled" at reset (not
> "enabled" as the documentation states). This is why flushing the cache
> manually is one solution. Since setting the BSDIS bit also fixes the
> problem, it suggests that this actually "enables" BestComm bus
> snooping instead of disabling it. In my mind, it could all boil down
> to a simple documentation error.
>
That problem is _very_ weird ...
From what I understand, BestComm XLB snooping means that when the
BestComm engine has some data cached internally and detects a write
to the address that data came from, it invalidates its cache.
But when the kernel writes data to the skb buffer, the data may partially
stay in cache, so there won't be any transaction at all on the XLB bus.
It is only when BestComm reads the skb that the core snoops the bus,
detects a read request for data it holds in cache, forces a retry of
the BestComm read, writes the data to memory (via XLB), and finally lets
BestComm retry the transaction to fetch the good data.
So I guess what "could" happen is this:
- The kernel allocates a skb, but it ends up at the same memory location
  as a "previous" one (or maybe in a directly following position, because
  of prefetch).
- You submit it to BestComm.
- When BestComm does the read, since the skb was used "just before", the
  line is still in cache but with the wrong data. Since the kernel just
  wrote the data, there has not yet been an XLB transaction, because the
  data is still in the CPU cache. BestComm thinks it has the data (no XLB
  write, so its cache was not invalidated), so it doesn't generate an XLB
  read. But if there is no XLB read, the core doesn't get a chance to
  snoop it and doesn't flush its cache ...
Although that doesn't explain why setting BSDIS high solves the problem,
nor why only 1 byte is wrong ...
Have you checked your XLB snoop window setting? And that core snooping
is enabled? Also that you don't use the "nap" power-saving feature of the
core? (It disables snooping altogether ...)
Sylvain
^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: MPC5200 ethernet communication stops unexpected
@ 2007-05-16 7:29 Hans Thielemans
2007-05-16 15:25 ` John Rigby
0 siblings, 1 reply; 14+ messages in thread
From: Hans Thielemans @ 2007-05-16 7:29 UTC (permalink / raw)
To: Sylvain Munaut, David Kanceruk; +Cc: linuxppc-embedded
Hello Sylvain and David
I think it is a more basic problem than just cache. The setup uses
psc2 and psc3 in codec32 mode to communicate with a DSP. Because the
MPC5200 had problems with the frame in slave mode (anomaly list), it is
used in master mode and sends empty packets of 256 bytes to keep the link
active, so the DSP can send the data. This is because the send and
receive clocks and frames are the same on the MPC5200 side.
The empty packet is a fixed packet in memory, so it is never overwritten
by the MPC5200 once the driver is initialized. So I cannot believe it is
a cache problem. The problem is always in the last 32-bit word (the last
4 bytes) of the packet. The error rate seems to be influenced by CPU
activity and bus priorities.
I have now changed the protocol to send 260 bytes and just drop the last
4 bytes at the receiver.
This way I had it running this night, transmitting 50 GB without a
single error.
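
The padding workaround can be sketched roughly as follows. This is an illustration only, not the actual driver code: the 256 and 4 byte sizes come from the description above, while the function names and the choice of zero as the padding value are assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAYLOAD_LEN 256  /* original packet size */
#define PAD_LEN       4  /* extra dummy word appended by the sender */
#define WIRE_LEN    (PAYLOAD_LEN + PAD_LEN)

/* Sender side: copy the payload and append a zeroed dummy word. */
void pad_packet(const uint8_t payload[PAYLOAD_LEN], uint8_t wire[WIRE_LEN])
{
    memcpy(wire, payload, PAYLOAD_LEN);
    memset(wire + PAYLOAD_LEN, 0, PAD_LEN);
}

/* Receiver side: keep the payload, ignore the trailing word.
 * Returns the number of payload bytes copied. */
size_t strip_padding(const uint8_t wire[WIRE_LEN],
                     uint8_t payload[PAYLOAD_LEN])
{
    memcpy(payload, wire, PAYLOAD_LEN);
    return PAYLOAD_LEN;
}
```

With this layout, any corruption that only hits the last 32-bit word of the DMA block lands in the dummy word and never reaches the payload.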
I would assume it has something to do with what the BestComm engine tasks
do at the end of a DMA block, and probably something with the bus access.
I tried several settings for the arbiter and bus configurations by
changing the registers from within the BDI2000 debugger. This changed the
behavior but gave no solution.
In the full system there are 6 BestComm tasks active: FEC rx and tx,
psc2 rx and tx and psc3 rx and tx.
Regards
Hans
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: MPC5200 ethernet communication stops unexpected
2007-05-16 7:29 Hans Thielemans
@ 2007-05-16 15:25 ` John Rigby
0 siblings, 0 replies; 14+ messages in thread
From: John Rigby @ 2007-05-16 15:25 UTC (permalink / raw)
To: Sylvain Munaut, Hans Thielemans; +Cc: David Kanceruk, linuxppc-embedded
Sylvain,
This is a shot in the dark but the fact that it is the last word that is
wrong reminds me of your question last week about the Gen_BD_TX
tasks:
> 2) From what I understand, this part processes everything except the last
> byte:
>
> 0xd9190300, /* LCDEXT: idx2 = idx2; idx2 > var12; idx2 += inc0 */
> 0xb8c5e009, /* LCD: idx3 = *(idx1 + var00000015); ; idx3 += inc1 */
> 0x03fec398, /* DRD1A: *idx0 = *idx3; FN=0 init=31 WS=3 RS=3 */
>
> and this processes the remaining (1 to 4) bytes:
>
> 0x9919826a, /* LCD: idx2 = idx2, idx3 = idx3; idx2 > var9; idx2 += inc5, idx3 += inc2 */
> 0x0feac398, /* DRD1A: *idx0 = *idx3; FN=0 TFD INT init=31 WS=1 RS=1 */
>
> But the first group stops when the remaining length <= 4 (it continues
> while idx2 > var12, and var12 = 4). So if the buffer size is a multiple
> of 4, the last 4 bytes will be processed 1 by 1. But that
> means they may not be written as a full word access, right? I'm writing
> to the AC97 FIFO and, from doing tests without DMA, not doing full
> 32-bit writes just doesn't work.
As I said just a shot in the dark.
John
On 5/16/07, Hans Thielemans <Hans.Thielemans@metris.com> wrote:
> Hello Sylvain and David
>
> I think it is a more basic problem than just cache. The setup uses
> psc2 and psc3 in codec32 mode to communicate with a DSP. Because the
> MPC5200 had problems with the frame in slave mode (anomaly list), it is
> used in master mode and sends empty packets of 256 bytes to keep the
> link active, so the DSP can send the data. This is because the send and
> receive clocks and frames are the same on the MPC5200 side.
>
> The empty packet is a fixed packet in memory, so it is never overwritten
> by the MPC5200 once the driver is initialized. So I cannot believe in a
> cache problem. The problem is always in the last 32-bit word (the last
> 4 bytes) of the packet. The error rate seems to be influenced by CPU
> activity and bus priorities.
>
> I have now changed the protocol to send 260 bytes and just drop the last
> 4 bytes at the receiver. This way I had it running overnight,
> transmitting 50 GB without a single error.
>
> I would assume it has something to do with the bestcomm engine tasks at
> the end of a DMA block, and probably something with the bus access. I
> tried several settings for the arbiter and bus configuration by changing
> the registers from within the BDI2000 debugger. This changed the
> behavior but gave no solution.
>
> In the full system there are 6 bestcomm tasks active: fec rx and tx,
> psc2 rx and tx and psc3 rx and tx.
>
> Regards
>
> Hans
>
>
>
> -----Original Message-----
> From: Sylvain Munaut [mailto:tnt@246tNt.com]
> Sent: woensdag 16 mei 2007 8:57
> To: David Kanceruk
> Cc: Hans Thielemans; linuxppc-embedded@ozlabs.org
> Subject: Re: MPC5200 ethernet communication stops unexpected
>
> David Kanceruk wrote:
> > Hello Hans,
> >
> > Our problem was with the FEC sending data with one or two
> > incorrect bytes when we switched from the MPC5200 to the MPC5200B. The
> > byte positions were always the same. The socket buffer has the correct
> > data before and after the DMA engine runs but the FEC TxFIFO does not
> > always match.
> >
> > One solution to our problem was to make the following call prior to
> > starting the DMA:
> >
> > flush_dcache_range((unsigned long)skb->data,
> >                    (unsigned long)skb->data + skb->len);
> >
> > The other solution was to set the BSDIS bit in the XLB config register
> > during initialization as follows:
> >
> > xlb = (struct mpc52xx_xlb *)MPC5xxx_XLB;
> > out_be32(&xlb->config,
> >          in_be32(&xlb->config) | MPC52xx_XLB_CFG_BSDIS);
> >
> > Either solution works for us. The BSDIS bit is a new feature in the
> > MPC5200B. The MPC5200 did not have this bit.
> >
> > According to the Freescale documentation (Application Note AN3045,
> > for instance), setting this bit is supposed to "disable" BestComm bus
> > snooping. However, I have reason to believe the documentation is in
> > error. Everything I have observed seems to indicate that in the
> > MPC5200 BestComm bus snooping was always enabled or enabled via some
> > other means. In the MPC5200B it appears to be "disabled" at reset (not
> > "enabled" as the documentation states). This is why flushing the cache
> > manually is one solution. Since setting the BSDIS bit also fixes the
> > problem, it suggests that this actually "enables" BestComm bus
> > snooping instead of disabling it. In my mind, it could all boil down
> > to a simple documentation error.
> >
> That problem is _very_ weird ...
>
> From what I understand, BestComm XLB snooping means that when the
> BestComm engine has some data cached internally and it detects a
> write to the address that data came from, it will invalidate its
> cache.
>
> But when the kernel writes data to the skb buffer, it may partially
> stay in cache, so there won't be any transaction at all on the XLB bus.
> It's when bestcomm reads the skb that the core snoops the bus, detects
> a read request for data it has in cache, forces a retry of the bestcomm
> read, writes the data to memory (via XLB), and finally lets bestcomm
> retry the transaction to fetch the good data.
>
> So I guess what "could" happen is that:
> - The kernel allocates an skb, but it ends up at the same memory
>   location as a "previous" one (or maybe in a directly following
>   position because of prefetch).
> - You submit it to bestcomm.
> - When bestcomm does the read, since the skb was used "just before",
>   the line is still in cache but with the wrong data. Since the kernel
>   just wrote the data, there was not yet an XLB transaction because the
>   data is still in the CPU cache.
> Bestcomm thinks it has the data (no XLB write, so its cache was not
> invalidated), so it doesn't generate an XLB read. But if there is no XLB
> read, the core doesn't get a chance to snoop it and doesn't flush its
> cache ...
>
> Although that doesn't explain why setting BSDIS high solves the problem,
> nor why there is only 1 byte wrong ...
>
> Have you checked your XLB snoop window setting? And that core snooping
> is enabled? Also that you don't use the "nap" power-saving feature of
> the core? (It disables snooping altogether ...)
>
>
> Sylvain
>
[-- Attachment #2: Type: text/html, Size: 9687 bytes --]
* Re: MPC5200 ethernet communication stops unexpected
2007-05-15 15:03 ` MPC5200 ethernet communication stops unexpected David Kanceruk
2007-05-16 6:56 ` Sylvain Munaut
@ 2007-05-21 7:43 ` Eberhard Stoll
1 sibling, 0 replies; 14+ messages in thread
From: Eberhard Stoll @ 2007-05-21 7:43 UTC (permalink / raw)
To: linuxppc-embedded
Hi,
> xlb = (struct mpc52xx_xlb *)MPC5xxx_XLB;
> out_be32(&xlb->config, in_be32(&xlb->config) | MPC52xx_XLB_CFG_BSDIS);
>
this modification seems to solve my problem, too (ethernet communication
stops unexpectedly / BestComm tx task deadlock(?))!
As I described before, ping -f triggered the problem within 20 minutes
every time. With this patch, my configuration to trigger the fault
ran from Friday to Monday (> 60 h) without any error!
So it seems to me this solved my problem, too!
Best regards,
Eberhard
end of thread, other threads:[~2007-05-21 7:45 UTC | newest]
Thread overview: 14+ messages
[not found] <1179234357.1668.10.camel@pc-hans>
2007-05-15 15:03 ` MPC5200 ethernet communication stops unexpected David Kanceruk
2007-05-16 6:56 ` Sylvain Munaut
2007-05-21 7:43 ` Eberhard Stoll
2007-05-16 7:29 Hans Thielemans
2007-05-16 15:25 ` John Rigby
-- strict thread matches above, loose matches on Subject: below --
2007-05-15 15:16 Hans Thielemans
2007-05-15 15:40 ` David Kanceruk
2007-04-20 19:21 Eberhard Stoll
2007-04-27 14:35 ` tacitus
2007-04-27 15:42 ` Daniel Schnell
2007-04-27 17:09 ` Eberhard Stoll
2007-04-27 17:22 ` David Kanceruk
[not found] ` <4632369E.8010000@berghof.com>
2007-04-27 18:15 ` David Kanceruk
2007-04-30 20:49 ` Wolfgang Denk