* [ISSUE: mv88e6xxx]: Down/Up link and not forwarding
[not found] <8e5e36d7-7618-2a4e-6aba-e65e41662d47@aoifes.com>
@ 2016-10-04 15:37 ` Jose Antonio Delgado Alfonso
2016-10-04 18:58 ` Florian Fainelli
0 siblings, 1 reply; 4+ messages in thread
From: Jose Antonio Delgado Alfonso @ 2016-10-04 15:37 UTC (permalink / raw)
To: netdev
We are working on an ARMv7 embedded system running kernel 4.1, with
patches applied to bring dsa/mv88e6xxx up to kernel version 4.3
(5acf4d0, Wed, 27 May 2015 15:32:15 -0700, "[PATCH] blk: rq_data_dir()
should not return a boolean.").
This is the layout of the system:
 +-------------------+  eth0
 |                  +--+
 |                  |  |
 |  Embedded system +--+
 |                   |
 |       ARMv7       |
 |                   |  Marvell 88E8057(sky2)  +-------------+
 |                  +--+                     +--+           +--+ eth1
 |                  |  +---------------------+  |           |  +------+
 |                  +--+      CPU port       +--+ mv88e6176 +--+
 +------+--+---------+                         |             |
emulated|  |                                   |             |
  GPIO  +--+                                 +--+           +--+ eth2
  MDIO   +-----------------------------------+  |           |  +------+
                                        MDIO +--+           +--+
                                               +-------------+
There is a bridge (br-lan) which includes eth0/eth1/eth2.
From time to time, we see the link go down and come back up within about 1 s.
The kernel logs the following messages:
[ 312.769399] dsa dsa@0 eth2: Link is Down
[ 312.773372] br-lan: port 3(eth2) entered disabled state
[ 312.947274] dsa dsa@0 eth2: link up, 100 Mb/s, full duplex, flow control disabled
[ 312.963807] br-lan: port 3(eth2) entered forwarding state
[ 312.969276] br-lan: port 3(eth2) entered forwarding state
[ 313.777815] dsa dsa@0 eth2: Link is Up - 100Mbps/Full - flow control rx/tx
[ 314.966277] br-lan: port 3(eth2) entered forwarding state
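The length of the outage can be read straight off the timestamps. A minimal sketch, assuming the log lines are available as text; the helper names `parse` and `outage_windows` are made up for this illustration, and each "Link is Down" is simply paired with the next "Link is Up" for the same interface:

```python
import re

# Log lines copied from the report above (wrapped lines rejoined).
LOG = """\
[ 312.769399] dsa dsa@0 eth2: Link is Down
[ 312.773372] br-lan: port 3(eth2) entered disabled state
[ 312.947274] dsa dsa@0 eth2: link up, 100 Mb/s, full duplex, flow control disabled
[ 312.963807] br-lan: port 3(eth2) entered forwarding state
[ 312.969276] br-lan: port 3(eth2) entered forwarding state
[ 313.777815] dsa dsa@0 eth2: Link is Up - 100Mbps/Full - flow control rx/tx
[ 314.966277] br-lan: port 3(eth2) entered forwarding state
"""

# "[  seconds.micros] message"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def parse(log):
    """Return a list of (timestamp_seconds, message) tuples."""
    events = []
    for line in log.splitlines():
        match = LINE_RE.match(line)
        if match:
            events.append((float(match.group(1)), match.group(2)))
    return events


def outage_windows(events, ifname):
    """Pair each 'Link is Down' with the next 'Link is Up' for ifname."""
    windows = []
    down_at = None
    for timestamp, message in events:
        if f"{ifname}: Link is Down" in message:
            down_at = timestamp
        elif f"{ifname}: Link is Up" in message and down_at is not None:
            windows.append((down_at, timestamp))
            down_at = None
    return windows


windows = outage_windows(parse(LOG), "eth2")
for down, up in windows:
    print(f"eth2 down for {up - down:.3f} s")  # prints "eth2 down for 1.008 s"
```

The lowercase "link up" line at 312.947274 is deliberately not treated as the end of the outage here; counting from it instead would give a window of roughly 0.18 s.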
Moreover, in a reboot loop test (boot the system, ping the unit and, if
it responds, reboot again), we found that after many reboots the
bridge stops forwarding packets.
Dumping the 88e6176 registers at that point shows the following:
    GLOBAL GLOBAL2    0    1    2    3    4    5    6
 0:   c820       0 de0f 5d0f 500f 500f 500f 4e07 4007
 1:      3       0   3e    3    3    3    3    3    3
 2:      0    ffff    0    0    0    0    0    0    0
 3:      0    ffff 1761 1761 1761 1761 1761 1761 1761
 4:   6000     258 373f  433  430  433  433  433  433
 5:   1000    c12f    0    0    0    0    0    0    0
 6:   c000    1f0f 101e 3005 3003 4001 5001 6001 7001
 7:      0    707f    0    0    0    0    0    0    0
 8:      0    7800 2480 2480 2480 2480 2480 2480 2480
 9:      0    1600    1    1    1    1    1    1    1
 a:    148       0    0    0    0    0    0    0    0
 b:   6000    1000    1    2    4    8   10   20   40
 c:      0      22    0    0    0    0    0    0    0
 d:   ffff     507    0    0    0    0    0    0    0
 e:   ffff      36    0    0    0    0    0    0    0
 f:   ffff     f00 dada dada dada dada dada dada dada
10:      0       0    0    0    0    0    0    0    0
11:      0       0    0    0    0    0    0    0    0
12:   5555       0    0    0    0    0    0    0    0
13:   5555       0  34d 8b18  54d    0    0    0    0
14:   aaaa     400    0    0    0    0    0    0    0
15:   aaaa       0    0    0    0    0    0    0    0
16:   ffff       0   33   33   33   33   33   33    0
17:   ffff       0    0    0    0    0    0    0    0
18:   fa41    1884 3210 3210 3210 3210 3210 3210 3210
19:      0     5e1 7654 7654 7654 7654 7654 7654 7654
1a:      0       0    0    0    0    0    0    0    0
1b:    1fc    f869 8000 8000 8000 8000 8000 8000 8000
1c:      0    4c00    0    0    0    0    0    0    0
1d:   5ce0       0    0    0    0    0    0    0    0
1e:      0       0    0    0    0    0    0    0    0
1f:      0       0    0    0    0    0    0    0    0
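Comparing a dump from a working unit against one from a failing unit is easier to do mechanically than by eye. A minimal sketch, assuming dumps in the text layout shown above; `parse_dump` and `diff_dumps` are hypothetical helper names, only three rows are copied here to keep it short, and the "good" dump is constructed for illustration (only the 00ff value for GLOBAL2 register 5 comes from the report):

```python
# A few rows copied from the dump above; the parser accepts the full dump too,
# and does not care whether columns are separated by one space or several.
DUMP = """\
GLOBAL GLOBAL2 0 1 2 3 4 5 6
0: c820 0 de0f 5d0f 500f 500f 500f 4e07 4007
5: 1000 c12f 0 0 0 0 0 0 0
1b: 1fc f869 8000 8000 8000 8000 8000 8000 8000
"""


def parse_dump(text):
    """Return {column_name: {register_index: value}}, parsing hex fields."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    columns = rows[0]
    regs = {name: {} for name in columns}
    for fields in rows[1:]:
        index = int(fields[0].rstrip(":"), 16)
        values = fields[1:]
        if len(values) != len(columns):
            raise ValueError(f"row {index:#x}: expected {len(columns)} values")
        for name, value in zip(columns, values):
            regs[name][index] = int(value, 16)
    return regs


def diff_dumps(good, bad):
    """Return (column, register, good_value, bad_value) for every mismatch."""
    diffs = []
    for column in good:
        for index, good_value in good[column].items():
            bad_value = bad[column][index]
            if good_value != bad_value:
                diffs.append((column, index, good_value, bad_value))
    return diffs


bad = parse_dump(DUMP)
# Illustrative "good" dump: identical except GLOBAL2 register 5 holds 00ff.
good = parse_dump(DUMP.replace("c12f", "ff"))

diffs = diff_dumps(good, bad)
for column, index, good_value, bad_value in diffs:
    print(f"{column} reg {index:#04x}: {good_value:04x} -> {bad_value:04x}")
```

With two real dumps, the same `diff_dumps` call would also surface any other register that differs in the cases where GLOBAL2 register 5 looks correct but forwarding is still broken.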
The main difference from a working unit is GLOBAL2 register 5. Right
after initialization, the driver sets this register to 00ff; when the
issue happens, its value is c12f.
We have a patch which allows us to set register values. If we change
c12f back to 00ff, ping works; otherwise it does not. We do not
know what is changing the register value. Apparently the driver is not.
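To see exactly which bits differ between the expected value 00ff and the observed value c12f, the two can be XORed. A minimal sketch; `changed_bits` is a hypothetical helper, and mapping the bit positions to named fields would need the 88E6176 datasheet, which is not attempted here:

```python
def changed_bits(before, after, width=16):
    """Return (set_bits, cleared_bits): positions that went 0->1 and 1->0."""
    diff = before ^ after
    set_bits = [b for b in range(width) if (diff >> b) & 1 and (after >> b) & 1]
    cleared_bits = [b for b in range(width) if (diff >> b) & 1 and (before >> b) & 1]
    return set_bits, cleared_bits


set_bits, cleared_bits = changed_bits(0x00FF, 0xC12F)
print(f"xor:     {0x00FF ^ 0xC12F:#06x}")  # prints "xor:     0xc1d0"
print(f"set:     {set_bits}")              # prints "set:     [8, 14, 15]"
print(f"cleared: {cleared_bits}")          # prints "cleared: [4, 6, 7]"
```

Six of the sixteen bits differ, three in each direction, which is worth keeping in mind when deciding whether this looks like a deliberate write or like corruption on the bus.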
Weirder still, sometimes GLOBAL2 register 5 is set to 00ff and the
bridge still does not forward packets. We have not worked out
which other register is involved.
Finally, the weirdest behaviour we are seeing is that the unit does not
detect a link change: register 0 of ports 1 and 2 does not update its
status.
Have you experienced a similar issue on your side?
Could those micro-outages be the cause of the bad setting in
GLOBAL2 register 5?
Has this issue been fixed in a newer Linux kernel version?
Thanks in advance.
* Re: [ISSUE: mv88e6xxx]: Down/Up link and not forwarding
2016-10-04 15:37 ` [ISSUE: mv88e6xxx]: Down/Up link and not forwarding Jose Antonio Delgado Alfonso
@ 2016-10-04 18:58 ` Florian Fainelli
2016-10-04 20:28 ` Andrew Lunn
0 siblings, 1 reply; 4+ messages in thread
From: Florian Fainelli @ 2016-10-04 18:58 UTC (permalink / raw)
To: Jose Antonio Delgado Alfonso, netdev, Andrew Lunn, Vivien Didelot
On October 4, 2016 8:37:13 AM PDT, Jose Antonio Delgado Alfonso <jose.delgado@aoifes.com> wrote:
>We are working in an ARMv7 embedded system running kernel 4.1 but
>including patches to upgrade dsa/mv88e6xxx to kernel version 4.3
>(5acf4d0, Wed, 27 May 2015 15:32:15 -0700) "[PATCH] blk: rq_data_dir()
>should not return a boolean."
>
>This is the schema of the system.
>
> +-------------------+  eth0
> |                  +--+
> |                  |  |
> |  Embedded system +--+
> |                   |
> |       ARMv7       |
> |                   |  Marvell 88E8057(sky2)  +-------------+
> |                  +--+                     +--+           +--+ eth1
> |                  |  +---------------------+  |           |  +------+
> |                  +--+      CPU port       +--+ mv88e6176 +--+
> +------+--+---------+                         |             |
>emulated|  |                                   |             |
>  GPIO  +--+                                 +--+           +--+ eth2
>  MDIO   +-----------------------------------+  |           |  +------+
>                                        MDIO +--+           +--+
>                                               +-------------+
>
>There is a bridge (br-lan) which includes eth0/eth1/eth2
Can you detail what eth0 and eth1 actually correspond to? The bridge layer denies adding DSA master network interfaces as bridge members as soon as they have tags enabled.
>
>From time to time, We are seeing a link down and up of about 1s.
>Following the message that kernel sends.
>
>[ 312.769399] dsa dsa@0 eth2: Link is Down
>[ 312.773372] br-lan: port 3(eth2) entered disabled state
>[ 312.947274] dsa dsa@0 eth2: link up, 100 Mb/s, full duplex, flow
>control disabled
>[ 312.963807] br-lan: port 3(eth2) entered forwarding state
>[ 312.969276] br-lan: port 3(eth2) entered forwarding state
>[ 313.777815] dsa dsa@0 eth2: Link is Up - 100Mbps/Full - flow control
>rx/tx
>[ 314.966277] br-lan: port 3(eth2) entered forwarding state
>
>Moreover, under a reboot loop test which consists in booting the
>system,
>ping the unit and, if it responds, reboot again, we found that the
>bridge does not forward packages after many reboots.
>Looking into 88e6176 registers we saw the following
>
> GLOBAL GLOBAL2 0 1 2 3 4 5 6
> 0: c820 0 de0f 5d0f 500f 500f 500f 4e07 4007
> 1: 3 0 3e 3 3 3 3 3 3
> 2: 0 ffff 0 0 0 0 0 0 0
> 3: 0 ffff 1761 1761 1761 1761 1761 1761 1761
> 4: 6000 258 373f 433 430 433 433 433 433
> 5: 1000 c12f 0 0 0 0 0 0 0
> 6: c000 1f0f 101e 3005 3003 4001 5001 6001 7001
> 7: 0 707f 0 0 0 0 0 0 0
> 8: 0 7800 2480 2480 2480 2480 2480 2480 2480
> 9: 0 1600 1 1 1 1 1 1 1
> a: 148 0 0 0 0 0 0 0 0
> b: 6000 1000 1 2 4 8 10 20 40
> c: 0 22 0 0 0 0 0 0 0
> d: ffff 507 0 0 0 0 0 0 0
> e: ffff 36 0 0 0 0 0 0 0
> f: ffff f00 dada dada dada dada dada dada dada
>10: 0 0 0 0 0 0 0 0 0
>11: 0 0 0 0 0 0 0 0 0
>12: 5555 0 0 0 0 0 0 0 0
>13: 5555 0 34d 8b18 54d 0 0 0 0
>14: aaaa 400 0 0 0 0 0 0 0
>15: aaaa 0 0 0 0 0 0 0 0
>16: ffff 0 33 33 33 33 33 33 0
>17: ffff 0 0 0 0 0 0 0 0
>18: fa41 1884 3210 3210 3210 3210 3210 3210 3210
>19: 0 5e1 7654 7654 7654 7654 7654 7654 7654
>1a: 0 0 0 0 0 0 0 0 0
>1b: 1fc f869 8000 8000 8000 8000 8000 8000 8000
>1c: 0 4c00 0 0 0 0 0 0 0
>1d: 5ce0 0 0 0 0 0 0 0 0
>1e: 0 0 0 0 0 0 0 0 0
>1f: 0 0 0 0 0 0 0 0 0
>
>The main difference is GLOBAL2 5th register. When the unit is just
>initialized, the driver sets this register to 00ff, however, when the
>issue happens, its value is c12f.
>We got a patch which allows us to set registers values. If we change
>c12f to 00ff the ping works, otherwise, ping does not work. We do not
>know who is changing the register value. Apparently, driver does not.
>
>Weirderif possible, sometimes even global2 5th register is set to 00ff
>and bridge does not forward packages either. We have not sorted out
>which other register is affecting.
>
>Finally, The weirdest behaviour we are seeing is that the unit does not
>detect a link change, register 0 of ports 1 and 2 do not update their
>status.
>
>Have you experienced a similar issue in your side?
>
>Is it possible that those micro-outage could be the reason of bad
>settings in Global2 5th register?
>
>Have you fixed this issues in a newer Linux kernel version?
Can you try reproducing this with the latest net-next tree?
--
Florian
* Re: [ISSUE: mv88e6xxx]: Down/Up link and not forwarding
2016-10-04 18:58 ` Florian Fainelli
@ 2016-10-04 20:28 ` Andrew Lunn
0 siblings, 0 replies; 4+ messages in thread
From: Andrew Lunn @ 2016-10-04 20:28 UTC (permalink / raw)
To: Florian Fainelli; +Cc: Jose Antonio Delgado Alfonso, netdev, Vivien Didelot
> >The main difference is GLOBAL2 5th register. When the unit is just
> >initialized, the driver sets this register to 00ff, however, when
> >the issue happens, its value is c12f.
You might want to hack the MDIO driver and get it to trap writes to
this register and give you a call stack.
Andrew
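A userspace model of that suggestion, as a minimal sketch: a wrapper around the bus records a call stack whenever a watched register is written. `FakeBus`, `TrappingBus`, `suspicious_caller` and the device address 0x1c used for GLOBAL2 are assumptions made for this illustration, not names taken from the driver. In the kernel, the equivalent check would sit in the MDIO bus write path and call something like `dump_stack()`.

```python
import traceback

GLOBAL2_ADDR = 0x1C  # assumed device address for GLOBAL2; adjust to the board
WATCHED_REG = 0x05   # the register whose value changes unexpectedly


class FakeBus:
    """Stand-in for the real MDIO bus: stores 16-bit values in a dict."""

    def __init__(self):
        self.regs = {}

    def read(self, addr, reg):
        return self.regs.get((addr, reg), 0)

    def write(self, addr, reg, value):
        self.regs[(addr, reg)] = value & 0xFFFF


class TrappingBus:
    """Forward all accesses, recording a call stack for watched writes."""

    def __init__(self, bus, watched):
        self._bus = bus
        self._watched = set(watched)  # {(addr, reg), ...}
        self.hits = []                # (addr, reg, value, stack_lines)

    def read(self, addr, reg):
        return self._bus.read(addr, reg)

    def write(self, addr, reg, value):
        if (addr, reg) in self._watched:
            # Drop the last frame, which is this write() method itself.
            stack = traceback.format_stack()[:-1]
            self.hits.append((addr, reg, value, stack))
        self._bus.write(addr, reg, value)


def suspicious_caller(bus):
    # Hypothetical code path that overwrites the register.
    bus.write(GLOBAL2_ADDR, WATCHED_REG, 0xC12F)


bus = TrappingBus(FakeBus(), watched={(GLOBAL2_ADDR, WATCHED_REG)})
bus.write(GLOBAL2_ADDR, WATCHED_REG, 0x00FF)  # models the init-time write
bus.write(GLOBAL2_ADDR, 0x04, 0x0258)         # unwatched register, no trap
suspicious_caller(bus)

for addr, reg, value, stack in bus.hits:
    print(f"write {value:#06x} to addr {addr:#04x} reg {reg:#04x} from:")
    print("".join(stack), end="")
```

If the trap never fires while the register still changes, the write is not coming through this bus path at all, which would point toward something outside the driver, such as noise on the bit-banged MDIO lines or the switch changing the value itself.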