dsa/mv88e6xxx: leaking packets on MV88E6341 switch

* dsa/mv88e6xxx: leaking packets on MV88E6341 switch
@ 2020-09-30 10:09 Peter Vollmer
  2020-09-30 10:28 ` Vladimir Oltean
  2020-09-30 19:19 ` Andrew Lunn
  0 siblings, 2 replies; 12+ messages in thread
From: Peter Vollmer @ 2020-09-30 10:09 UTC (permalink / raw)
  To: Network Development

Hi all,
I am currently investigating a packet-leak problem on an
armada-37xx + MV88E6341 switch (via SGMII) + MV88E1512 PHY (via
RGMII) platform. We are using the mainline 5.4.y kernel.

The switch and phy setup is defined in the flat device tree as follows:

&eth0 {
        phy-mode = "rgmii-id";
        phy = <&ethphy0>;
        status = "okay";
};

&eth1 {
        phy-mode = "sgmii";
        status = "okay";

        fixed-link {
                speed = <2500>;
                full-duplex;
        };
};

&mdio {
        reset-gpios = <&gpiosb 0 GPIO_ACTIVE_LOW>;
        reset-delay-us = <2>;

        ethphy0: ethernet-phy@0 {
                reg = <0x0>;
                status = "okay";
        };

        switch0: switch0@1 {
                compatible = "marvell,mv88e6085";
                #address-cells = <1>;
                #size-cells = <0>;
                reg = <1>;
                cpu-port = <5>;
                dsa,member = <0 0>;
                status = "okay";

                ports {
                        #address-cells = <0x1>;
                        #size-cells = <0x0>;

                        port@1 {
                                reg = <1>;
                                label = "lan0";
                                phy-handle = <&switch0phy1>;
                        };
                        port@2 {
                                reg = <2>;
                                label = "lan1";
                                phy-handle = <&switch0phy2>;
                        };

                        port@3 {
                                reg = <3>;
                                label = "lan2";
                                phy-handle = <&switch0phy3>;
                        };

                        port@4 {
                                reg = <4>;
                                label = "lan3";
                                phy-handle = <&switch0phy4>;
                        };

                        port@5 {
                                reg = <5>;
                                label = "cpu";
                                ethernet = <&eth1>;
                        };
                };

                mdio {
                        #address-cells = <1>;
                        #size-cells = <0>;

                        switch0phy1: switch0phy0@11 {
                                reg = <0x11>;
                        };
                        switch0phy2: switch0phy1@12 {
                                reg = <0x12>;
                        };
                        switch0phy3: switch0phy2@13 {
                                reg = <0x13>;
                        };
                        switch0phy4: switch0phy2@14 {
                                reg = <0x14>;
                        };
                };
        };
};

lan0..lan3 are members of the br0 bridge interface.

The problem is that for ICMP ping lan0-> eth0, ICMP ping request
packets are leaking (i.e. flooded)  to all other ports lan1..lan3,
while the ping reply eth0->lan0 arrives correctly at lan0 without any
leaked packets on lan1..lan3.
The problem temporarily goes away for ~280 seconds after I toggle the
multicast flag of the bridge interface (ifconfig br0 [-]multicast).
We also noticed asymmetric maximum network throughput: UDP traffic
lan0->eth0 is much slower than in the direction eth0->lan0.

My assumption is that in our case the SRC MAC address of the bridge
(or eth1) interface is not correctly learned by the switch, so it
floods the packets in the reverse direction to all ports (CPU port 5 and
the other lan ports). It seems that the DSA packets ingressing on CPU
port 5 (eth0->lan0) are sent as DSA MGMT frames, and those do not seem
to be used for address learning.

Is this a known effect for this kind of setup, and is there something
we can do about it ?

What would be the best way to debug this ? Is there a way to dump the
ATU MAC tables to see what's going on with the address learning ?

Many thanks and best regards

Peter

* Re: dsa/mv88e6xxx: leaking packets on MV88E6341 switch
  2020-09-30 10:09 dsa/mv88e6xxx: leaking packets on MV88E6341 switch Peter Vollmer
@ 2020-09-30 10:28 ` Vladimir Oltean
  2020-09-30 11:57   ` Peter Vollmer
  2020-09-30 19:19 ` Andrew Lunn
  1 sibling, 1 reply; 12+ messages in thread
From: Vladimir Oltean @ 2020-09-30 10:28 UTC (permalink / raw)
  To: Peter Vollmer; +Cc: Network Development

On Wed, Sep 30, 2020 at 12:09:03PM +0200, Peter Vollmer wrote:
> lan0..lan3 are members of the br0 bridge interface.

and so is eth0, I assume?

> The problem is that for ICMP ping lan0-> eth0, ICMP ping request
> packets are leaking (i.e. flooded)  to all other ports lan1..lan3,
> while the ping reply eth0->lan0 arrives correctly at lan0 without any
> leaked packets on lan1..lan3.

What are you pinging exactly, the IP of the eth0 interface, or a station
connected to the eth0 which is part of the same bridge as the lan ports?

Thanks,
-Vladimir

* Re: dsa/mv88e6xxx: leaking packets on MV88E6341 switch
  2020-09-30 10:28 ` Vladimir Oltean
@ 2020-09-30 11:57   ` Peter Vollmer
  0 siblings, 0 replies; 12+ messages in thread
From: Peter Vollmer @ 2020-09-30 11:57 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: Network Development

On Wed, Sep 30, 2020 at 01:28:35PM +0300, Vladimir Oltean wrote:
> On Wed, Sep 30, 2020 at 12:09:03PM +0200, Peter Vollmer wrote:
> > lan0..lan3 are members of the br0 bridge interface.
>
> and so is eth0, I assume?

No, eth0 is a dedicated interface with its own IP. We have routing between eth0 and br0.

root@mGuard:~# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq qlen 1024
    link/ether a8:74:1d:85:08:be brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1508 qdisc mq qlen 1024
    link/ether 00:a0:45:38:22:90 brd ff:ff:ff:ff:ff:ff
4: sit0@NONE: <NOARP> mtu 1480 qdisc noop qlen 1000
    link/sit 0.0.0.0 brd 0.0.0.0
5: lan0@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 qlen 1000
    link/ether a8:74:1d:85:08:bf brd ff:ff:ff:ff:ff:ff
6: lan1@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 qlen 1000
    link/ether a8:74:1d:85:08:c0 brd ff:ff:ff:ff:ff:ff
7: lan2@eth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue master br0 qlen 1000
    link/ether a8:74:1d:85:08:c1 brd ff:ff:ff:ff:ff:ff
8: lan3@eth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue master br0 qlen 1000
    link/ether a8:74:1d:85:08:c2 brd ff:ff:ff:ff:ff:ff
9: br0: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue qlen 1000
    link/ether a8:74:1d:85:08:bf brd ff:ff:ff:ff:ff:ff

root@mGuard:~# brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.a8741d8508bf       no              lan2
                                                        lan0
                                                        lan3
                                                        lan1


> > The problem is that for ICMP ping lan0-> eth0, ICMP ping request
> > packets are leaking (i.e. flooded)  to all other ports lan1..lan3,
> > while the ping reply eth0->lan0 arrives correctly at lan0 without any
> > leaked packets on lan1..lan3.
> 
> What are you pinging exactly, the IP of the eth0 interface, or a station
> connected to the eth0 which is part of the same bridge as the lan ports?
> 

I am pinging the address of a station connected to eth0 from a station
connected to switch port lan0.

Thanks,
Peter

* Re: dsa/mv88e6xxx: leaking packets on MV88E6341 switch
  2020-09-30 10:09 dsa/mv88e6xxx: leaking packets on MV88E6341 switch Peter Vollmer
  2020-09-30 10:28 ` Vladimir Oltean
@ 2020-09-30 19:19 ` Andrew Lunn
  2020-10-01  6:21   ` Peter Vollmer
  1 sibling, 1 reply; 12+ messages in thread
From: Andrew Lunn @ 2020-09-30 19:19 UTC (permalink / raw)
  To: Peter Vollmer; +Cc: Network Development

> What would be the best way to debug this ? Is there a way to dump the
> ATU MAC tables to see what's going on with the address learning ?

If you jump to net-next, and use

https://github.com/lunn/mv88e6xxx_dump

You can dump the full ATU from the switch.

bridge fdb show

can give you some idea what is going on, but it is less clear what is
in the hardware and what is in software.

   Andrew

* Re: dsa/mv88e6xxx: leaking packets on MV88E6341 switch
  2020-09-30 19:19 ` Andrew Lunn
@ 2020-10-01  6:21   ` Peter Vollmer
  2020-11-25 14:09     ` Peter Vollmer
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Vollmer @ 2020-10-01  6:21 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Network Development

On Wed, Sep 30, 2020 at 09:19:56PM +0200, Andrew Lunn wrote:
> > What would be the best way to debug this ? Is there a way to dump the
> > ATU MAC tables to see what's going on with the address learning ?
> 
> If you jump to net-next, and use
> 
> https://github.com/lunn/mv88e6xxx_dump
> 
> You can dump the full ATU from the switch.
> 
> bridge fdb show
> 
> can give you some idea what is going on, but it is less clear what is
> in the hardware and what is in software.

Thanks, I will try that.

  Peter



* Re: dsa/mv88e6xxx: leaking packets on MV88E6341 switch
  2020-10-01  6:21   ` Peter Vollmer
@ 2020-11-25 14:09     ` Peter Vollmer
  2020-11-26 21:41       ` Tobias Waldekranz
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Vollmer @ 2020-11-25 14:09 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Network Development

Hi,
I am still investigating the leaking packets problem we are having
with a setup of an armada-3720 SoC and an 88E6341 switch (connected
via CPU port 5, SGMII, C_MODE=0xB, 2500BASE-X). I have now jumped to the
net-next kernel (5.10.0-rc4) and can use the nice mv88e6xxx_dump
tool for switch register dumping.

The described packet leaking still occurs, in a setup of ports
lan0-lan3 (switch ports 1-4)  joined in a bridge br0.

Here is my setup: ports lan0-3 are DSA ports coming in through eth1;
eth0 is a single 88E1512 PHY connected via RGMII.

root@DUT:~# brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.fafb2fbbd4c6       no              lan0
                                                        lan1
                                                        lan2
                                                        lan3
root@DUT:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1024
    link/ether c2:49:bc:0d:a8:57 brd ff:ff:ff:ff:ff:ff
    inet 192.168.90.100/24 brd 192.168.90.255 scope global eth0
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1504 qdisc mq state UP group default qlen 1024
    link/ether fa:fb:2f:bb:d4:c6 brd ff:ff:ff:ff:ff:ff
4: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
    link/sit 0.0.0.0 brd 0.0.0.0
5: lan0@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP group default qlen 1000
    link/ether fa:fb:2f:bb:d4:c6 brd ff:ff:ff:ff:ff:ff
6: lan1@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP group default qlen 1000
    link/ether fa:fb:2f:bb:d4:c6 brd ff:ff:ff:ff:ff:ff
7: lan2@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP group default qlen 1000
    link/ether fa:fb:2f:bb:d4:c6 brd ff:ff:ff:ff:ff:ff
8: lan3@eth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue master br0 state LOWERLAYERDOWN group default qlen 1000
    link/ether fa:fb:2f:bb:d4:c6 brd ff:ff:ff:ff:ff:ff
9: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether fa:fb:2f:bb:d4:c6 brd ff:ff:ff:ff:ff:ff
    inet 172.16.4.1/16 brd 172.16.4.255 scope global br0
       valid_lft forever preferred_lft forever

- When pinging from client0 (connected to lan0) to the bridge IP, the ping
requests (only the requests) are also seen on client1, connected to
lan1.

- The other effect looks more suspicious: when pinging from br0 to the
IP of client0 connected to port lan0, after ~280 seconds client1
connected to lan1 will also see the ping replies of client0 (only the
replies). After another ~300 seconds this stops again. This repeats
in a cycle.

I have seen these problems since at least kernel version 5.4.y, but not with
the old linux-marvell kernel sources
(https://github.com/MarvellEmbeddedProcessors/linux-marvell.git).
Can somebody using this switch in SGMII mode perhaps reproduce this?

One thing I noticed is that due to .tag_protocol=DSA_TAG_PROTO_EDSA
for the 88E6341 switch, EgressMode (Port control register 0x04,
bits 13:12) is set to an unsupported value of 0x3 ("reserved for future
use" in the switch spec). See the value in row 04 Port control for
port 5 = 0x373f in the following dump:

root@mguard3:~# mv88e6xxx_dump --ports
Using device <mdio_bus/d0032004.mdio-mii:01>
                           0    1    2    3    4    5
00 Port status            0006 9e4f 9e4f 9e4f 100f 0f0b
01 Physical control       0003 0003 0003 0003 0003 20ff
02 Jamming control        ff00 0000 0000 0000 0000 0000
03 Switch ID              3410 3410 3410 3410 3410 3410
04 Port control           007c 043f 043f 043f 043c 373f
05 Port control 1         0000 0000 0000 0000 0000 0000
06 Port base VLAN map     007e 007c 007a 0076 006e 005f
07 Def VLAN ID & Prio     0001 0000 0000 0000 0000 0000
08 Port control 2         2080 0080 0080 0080 0080 0080
09 Egress rate control    0001 0001 0001 0001 0001 0001
0a Egress rate control 2  8000 0000 0000 0000 0000 0000
0b Port association vec   0001 0002 0004 0008 0010 0000
0c Port ATU control       0000 0000 0000 0000 0000 0000
0d Override               0000 0000 0000 0000 0000 0000
0e Policy control         0000 0000 0000 0000 0000 0000
0f Port ether type        9100 9100 9100 9100 9100 dada
10 Reserved               0000 0000 0000 0000 0000 0000
11 Reserved               0000 0000 0000 0000 0000 0000
12 Reserved               0000 0000 0000 0000 0000 0000
13 Reserved               0000 0000 0000 0000 0000 0000
14 Reserved               0000 0000 0000 0000 0000 0000
15 Reserved               0000 0000 0000 0000 0000 0000
16 LED control            0000 10eb 10eb 10eb 10eb 0000
17 Reserved               0000 0000 0000 0000 0000 0000
18 Tag remap low          3210 3210 3210 3210 3210 3210
19 Tag remap high         7654 7654 7654 7654 7654 7654
1a Reserved               0000 0000 0000 0000 5ea0 a100
1b Queue counters         8000 8000 8000 8000 8000 8000
1c Queue control          0000 0000 0000 0000 0000 0000
1d queue control 2        0000 0000 0000 0000 0000 0000
1e Cut through control    f000 f000 f000 f000 f000 f000
1f Debug counters         0000 0014 0015 0012 0000 0010

I tested setting .tag_protocol=DSA_TAG_PROTO_DSA for the 6341 switch
instead, resulting in a register setting of 04 Port control for port 5
= 0x053f (i.e. EgressMode=Unmodified mode, frames are transmitted
unmodified), which looks correct to me. It does not fix the above
problem, but the change seems to make sense anyhow. Should I send a
patch ?
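
To make the two register values easier to compare, here is a minimal
Python sketch that splits a Port control word into the fields discussed
above. The EgressMode position (bits 13:12) is taken from this message;
the FrameMode position (bits 9:8), the value names and the function name
are assumptions (believed to match the mv88e6xxx driver headers) and
should be checked against the datasheet.

```python
# Sketch: decode selected fields of the mv88e6xxx "Port control" register (0x04).
# EgressMode bits 13:12 are named in the thread; FrameMode bits 9:8 and the
# value names are assumptions, not taken from the 88E6341 datasheet.

EGRESS_MODE = {
    0: "unmodified",
    1: "untagged",
    2: "tagged",
    3: "reserved on 88E6341 per the thread (Ethertype DSA on older parts)",
}
FRAME_MODE = {0: "normal", 1: "DSA", 2: "provider", 3: "Ethertype DSA"}


def decode_port_control(value):
    """Return the raw EgressMode and FrameMode field values of a 16-bit word."""
    return {
        "egress_mode": (value >> 12) & 0x3,
        "frame_mode": (value >> 8) & 0x3,
    }


if __name__ == "__main__":
    # The two port 5 values reported in this message.
    for label, word in (("EDSA", 0x373F), ("DSA", 0x053F)):
        fields = decode_port_control(word)
        egress = fields["egress_mode"]
        frame = fields["frame_mode"]
        print(f"{label}: 0x{word:04x} "
              f"EgressMode={egress} ({EGRESS_MODE[egress]}) "
              f"FrameMode={frame} ({FRAME_MODE[frame]})")
```

Under these assumptions the only differences between 0x373f and 0x053f
are the EgressMode and FrameMode fields; the low byte (0x3f) is
identical.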

Thanks and best regards
  Peter

* Re: dsa/mv88e6xxx: leaking packets on MV88E6341 switch
  2020-11-25 14:09     ` Peter Vollmer
@ 2020-11-26 21:41       ` Tobias Waldekranz
  2020-11-26 22:23         ` Andrew Lunn
  2020-12-01  8:49         ` Peter Vollmer
  0 siblings, 2 replies; 12+ messages in thread
From: Tobias Waldekranz @ 2020-11-26 21:41 UTC (permalink / raw)
  To: Peter Vollmer, Andrew Lunn; +Cc: Network Development

On Wed, Nov 25, 2020 at 15:09, Peter Vollmer <peter.vollmer@gmail.com> wrote:
> Hi,
> I am still investigating the leaking packets problem we are having
> with a setup of an armada-3720 SOC and a 88E6341 switch ( connected
> via cpu port 5 , SGMII ,C_MODE=0xB, 2500 BASE-x). I now jumped to the
> net-next kernel (5.10.0-rc4) and can now use the nice mv88e6xxx_dump
> tool for switch register dumping.
>
> The described packet leaking still occurs, in a setup of ports
> lan0-lan3 (switch ports 1-4)  joined in a bridge br0.
>
> Here is my setup, ports lan0-3 are DSA ports coming in through eth1,
> eth0 is a single 88E1512 phy connected to RGMII
> root@DUT:~# brctl show
> bridge name     bridge id               STP enabled     interfaces
> br0             8000.fafb2fbbd4c6       no              lan0
>                                                         lan1
>                                                         lan2
>                                                         lan3
> root@DUT:~# ip a
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
> group default qlen 1000
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>     inet 127.0.0.1/8 scope host lo
>        valid_lft forever preferred_lft forever
> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
> group default qlen 1024
>     link/ether c2:49:bc:0d:a8:57 brd ff:ff:ff:ff:ff:ff
>     inet 192.168.90.100/24 brd 192.168.90.255 scope global eth0
>        valid_lft forever preferred_lft forever
> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1504 qdisc mq state UP
> group default qlen 1024
>     link/ether fa:fb:2f:bb:d4:c6 brd ff:ff:ff:ff:ff:ff
> 4: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
>     link/sit 0.0.0.0 brd 0.0.0.0
> 5: lan0@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
> master br0 state UP group default qlen 1000
>     link/ether fa:fb:2f:bb:d4:c6 brd ff:ff:ff:ff:ff:ff
> 6: lan1@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
> master br0 state UP group default qlen 1000
>     link/ether fa:fb:2f:bb:d4:c6 brd ff:ff:ff:ff:ff:ff
> 7: lan2@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
> master br0 state UP group default qlen 1000
>     link/ether fa:fb:2f:bb:d4:c6 brd ff:ff:ff:ff:ff:ff
> 8: lan3@eth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc
> noqueue master br0 state LOWERLAYERDOWN group default qlen 1000
>     link/ether fa:fb:2f:bb:d4:c6 brd ff:ff:ff:ff:ff:ff
> 9: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state
> UP group default qlen 1000
>     link/ether fa:fb:2f:bb:d4:c6 brd ff:ff:ff:ff:ff:ff
>     inet 172.16.4.1/16 brd 172.16.4.255 scope global br0
>        valid_lft forever preferred_lft forever
>
> - pinging from client0 (connected to lan0 ) to the bridge IP, the ping
> requests (only the requests) are also seen on client1 connected to
> lan1

This is the expected behavior of the current implementation I am
afraid. It stems from the fact that the CPU responds to the echo request
(or to any other request for that matter) with a FROM_CPU. This means
that no learning takes place, and the SA of br0 will thus never reach
the switch's FDB. So while client0 knows the MAC of br0, the switch
(very counter-intuitively) does not.

The result is that the unicast echo request sent by client0 is flooded
as unknown unicast by the switch. This way it reaches the CPU but also,
as you have discovered, all other ports that allow unknown unicast to
egress.
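
The mechanism described above can be illustrated with a toy model. This
is only a sketch of a generic learning switch in Python, not the
mv88e6xxx hardware or the DSA code; the class, the port names and the
MAC strings are all made up for illustration. The one rule it encodes
from the explanation is that a frame injected as FROM_CPU does not
trigger source-address learning.

```python
# Toy learning switch: shows why the CPU's MAC stays unknown when all
# CPU-originated frames are sent as FROM_CPU. Purely illustrative.

class ToySwitch:
    def __init__(self, ports):
        self.ports = set(ports)
        self.fdb = {}  # MAC address -> port it was learned on

    def ingress(self, port, src, dst):
        """Normal frame: learn the source, then forward or flood."""
        self.fdb[src] = port
        if dst in self.fdb:
            out = {self.fdb[dst]}
        else:
            out = set(self.ports)  # unknown unicast is flooded
        return out - {port}

    def from_cpu(self, src, egress_port):
        """FROM_CPU frame: the CPU names the egress port, nothing is learned."""
        return {egress_port}


if __name__ == "__main__":
    sw = ToySwitch(["lan0", "lan1", "cpu"])
    # client0 pings br0: the destination is unknown, so the request is flooded.
    print(sorted(sw.ingress("lan0", "client0-mac", "br0-mac")))
    # br0 answers with FROM_CPU: delivered to lan0 only, br0-mac not learned.
    print(sorted(sw.from_cpu("br0-mac", "lan0")))
    # The next request is flooded again, because the FDB still lacks br0-mac.
    print(sorted(sw.ingress("lan0", "client0-mac", "br0-mac")))
    print("br0-mac" in sw.fdb)
```

In this model the request reaches both the CPU port and lan1 on every
round, while the reply only ever egresses on lan0, which matches the
"only the requests" observation.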

> - the other effect looks more suspicious: when pinging from br0 to the
> IP of client0 connected to port lan0, after ~280 seconds client1
> connected to lan1 will also see the ping replies of client0 (only the
> replies). And after another ~300seconds this stops again. This repeats
> in a cycle .

I cannot account for the oscillating effect. In my system I see a
continuous stream of responses from client0 when tcpdumping on
client1. That said, 300s is the default age timeout, so I would start by
diffing the ATU when you are seeing replies on client1 and when you are
not.
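
One way to do that comparison is to save the text output of the dump
tool in both states and compare the two files line by line. The Python
sketch below does that; the sample lines are invented placeholders, not
the real ATU dump layout, and the helper name is hypothetical.

```python
# Sketch: compare two saved text dumps and report the lines unique to each.
# The dump lines used in the demo are placeholders, not real tool output.

def diff_dumps(before, after):
    """Return (lines only in 'before', lines only in 'after'), order kept."""
    before_lines = [line.strip() for line in before.splitlines() if line.strip()]
    after_lines = [line.strip() for line in after.splitlines() if line.strip()]
    before_set = set(before_lines)
    after_set = set(after_lines)
    only_before = [line for line in before_lines if line not in after_set]
    only_after = [line for line in after_lines if line not in before_set]
    return only_before, only_after


if __name__ == "__main__":
    # Hypothetical captures: one while client1 sees replies, one while it does not.
    leaking = "entry-A port 1\nentry-B port 2\n"
    quiet = "entry-A port 1\nentry-B port 2\nentry-C port 5\n"
    gone, new = diff_dumps(leaking, quiet)
    print("only while leaking:", gone)
    print("only while quiet:  ", new)
```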

The echo responses reach client1 for the same reason as above. It is
just that now that client0 is the pinged host, the responses are
addressed to br0's MAC, which will be classified as unknown unicast.

> I see these problems since at least kernel version 5.4.y, but not with
> the old linux-marvel kernel sources
> (https://github.com/MarvellEmbeddedProcessors/linux-marvell.git)
> Can somebody using this switch in SGMII mode perhaps reproduce this ?

My system is connected to the CPU over RGMII, but I would guess that
that has no impact on this issue. The CPU is not responsible for
flooding the packets to client1, the switch does that autonomously. If
you tcpdump with "-Q out" on your base interface, I bet you will only
see FROM_CPUs to the port that client0 is connected to.

> One thing I noticed is that due to .tag_protocol=DSA_TAG_PROTO_EDSA
> for the 88E6341 switch, EgressMode (port control 0x4 , bit13:12) is
> set to an unsupported value of 0x3 ("reserved for future use" in the
> switch spec). See the value in row 04 Port control for port 5 = 0x373f
> in the following dump:
>
> root@mguard3:~# mv88e6xxx_dump --ports
> Using device <mdio_bus/d0032004.mdio-mii:01>
>                            0    1    2    3    4    5
> 00 Port status            0006 9e4f 9e4f 9e4f 100f 0f0b
> 01 Physical control       0003 0003 0003 0003 0003 20ff
> 02 Jamming control        ff00 0000 0000 0000 0000 0000
> 03 Switch ID              3410 3410 3410 3410 3410 3410
> 04 Port control           007c 043f 043f 043f 043c 373f
> 05 Port control 1         0000 0000 0000 0000 0000 0000
> 06 Port base VLAN map     007e 007c 007a 0076 006e 005f
> 07 Def VLAN ID & Prio     0001 0000 0000 0000 0000 0000
> 08 Port control 2         2080 0080 0080 0080 0080 0080
> 09 Egress rate control    0001 0001 0001 0001 0001 0001
> 0a Egress rate control 2  8000 0000 0000 0000 0000 0000
> 0b Port association vec   0001 0002 0004 0008 0010 0000
> 0c Port ATU control       0000 0000 0000 0000 0000 0000
> 0d Override               0000 0000 0000 0000 0000 0000
> 0e Policy control         0000 0000 0000 0000 0000 0000
> 0f Port ether type        9100 9100 9100 9100 9100 dada
> 10 Reserved               0000 0000 0000 0000 0000 0000
> 11 Reserved               0000 0000 0000 0000 0000 0000
> 12 Reserved               0000 0000 0000 0000 0000 0000
> 13 Reserved               0000 0000 0000 0000 0000 0000
> 14 Reserved               0000 0000 0000 0000 0000 0000
> 15 Reserved               0000 0000 0000 0000 0000 0000
> 16 LED control            0000 10eb 10eb 10eb 10eb 0000
> 17 Reserved               0000 0000 0000 0000 0000 0000
> 18 Tag remap low          3210 3210 3210 3210 3210 3210
> 19 Tag remap high         7654 7654 7654 7654 7654 7654
> 1a Reserved               0000 0000 0000 0000 5ea0 a100
> 1b Queue counters         8000 8000 8000 8000 8000 8000
> 1c Queue control          0000 0000 0000 0000 0000 0000
> 1d queue control 2        0000 0000 0000 0000 0000 0000
> 1e Cut through control    f000 f000 f000 f000 f000 f000
> 1f Debug counters         0000 0014 0015 0012 0000 0010
>
> I tested setting .tag_protocol=DSA_TAG_PROTO_DSA for the 6341 switch
> instead, resulting in a register setting of 04 Port control for port 5
> = 0x053f (i.e. EgressMode=Unmodified mode, frames are transmitted
> unmodified), which looks correct to me. It does not fix the above
> problem, but the change seems to make sense anyhow. Should I send a
> patch ?

This is not up to me, but my guess is that Andrew would like a patch,
yes. On 6390X, I know for a fact that setting the EgressMode to 3 does
indeed produce the behavior that was supported in older devices (like
the 6352), but there is no reason not to change it to regular DSA.


* Re: dsa/mv88e6xxx: leaking packets on MV88E6341 switch
  2020-11-26 21:41       ` Tobias Waldekranz
@ 2020-11-26 22:23         ` Andrew Lunn
  2020-12-01  9:00           ` Peter Vollmer
  2020-12-01  8:49         ` Peter Vollmer
  1 sibling, 1 reply; 12+ messages in thread
From: Andrew Lunn @ 2020-11-26 22:23 UTC (permalink / raw)
  To: Tobias Waldekranz; +Cc: Peter Vollmer, Network Development

> > I tested setting .tag_protocol=DSA_TAG_PROTO_DSA for the 6341 switch
> > instead, resulting in a register setting of 04 Port control for port 5
> > = 0x053f (i.e. EgressMode=Unmodified mode, frames are transmitted
> > unmodified), which looks correct to me. It does not fix the above
> > problem, but the change seems to make sense anyhow. Should I send a
> > patch ?
> 
> This is not up to me, but my guess is that Andrew would like a patch,
> yes. On 6390X, I know for a fact that setting the EgressMode to 3 does
> indeed produce the behavior that was supported in older devices (like
> the 6352), but there is no reason not to change it to regular DSA.

As I already told Tobias, I had problems getting the 6390 working, and
this was one of the things I changed. I don't think I ever undid this
specific change to see how critical it is. But relying on
undocumented behaviour is not nice.

EDSA used to have the advantage that tcpdump understood it. But
thanks to the work Florian and Vivien did, tcpdump can now decode DSA just
as well as EDSA.

So please do submit a patch.

   Andrew

* Re: dsa/mv88e6xxx: leaking packets on MV88E6341 switch
  2020-11-26 22:23         ` Andrew Lunn
@ 2020-12-01  9:00           ` Peter Vollmer
  2020-12-01 13:58             ` Andrew Lunn
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Vollmer @ 2020-12-01  9:00 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Tobias Waldekranz, Network Development

On Thu, Nov 26, 2020 at 11:23:59PM +0100, Andrew Lunn wrote:
> > > I tested setting .tag_protocol=DSA_TAG_PROTO_DSA for the 6341 switch
> > > instead, resulting in a register setting of 04 Port control for port 5
> > > = 0x053f (i.e. EgressMode=Unmodified mode, frames are transmitted
> > > unmodified), which looks correct to me. It does not fix the above
> > > problem, but the change seems to make sense anyhow. Should I send a
> > > patch ?
> > 
> > This is not up to me, but my guess is that Andrew would like a patch,
> > yes. On 6390X, I know for a fact that setting the EgressMode to 3 does
> > indeed produce the behavior that was supported in older devices (like
> > the 6352), but there is no reason not to change it to regular DSA.
> 
> I already said to Tobias, i had problems getting the 6390 working, and
> this was one of the things i changed. I don't think i ever undid this
> specific change, to see how critical it is. But relying on
> undocumented behaviour is not nice.
> 
> EDSA used to have the advantages that tcpdump understood it. But
> thanks to work Florian and Vivien did, tcpdump can now decode DSA just
> as well as EDSA.
> 
> So please do submit a patch.

I checked both cases (EDSA, DSA) with tcpdump on eth1 (SGMII to the switch).
Both seem to work, and tcpdump recognizes two different formats: MEDSA for
DSA_TAG_PROTO_EDSA and "ethertype unknown (0x4018 (or 0xc018))" for
DSA_TAG_PROTO_DSA (due to an older tcpdump version, 4.9.3, I guess). Maybe I
can find out from our support whether DSA_TAG_PROTO_EDSA is supported
for the port config (0x4) register on the 6341 switch after all, or whether
it should be omitted.
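
The two "unknown ethertype" values are what one would expect from a
plain 4-byte DSA tag: the tag sits directly after the source MAC
address, so its first two bytes land where tcpdump looks for the
EtherType. Below is a small Python sketch that decodes those two bytes.
It assumes the bit layout used by the kernel's DSA tagger as far as it
is known here (command in the top two bits of byte 0, a "tagged" flag in
bit 5, the device in the low five bits, the port in the top five bits of
byte 1). The layout and the function name are assumptions and are not
confirmed anywhere in this thread.

```python
# Sketch: decode the first 16 bits of a (non-Ethertype) DSA tag.
# Bit layout is an assumption; verify against net/dsa/tag_dsa.c.

DSA_CMDS = {0: "TO_CPU", 1: "FROM_CPU", 2: "TO_SNIFFER", 3: "FORWARD"}


def decode_dsa_head(word):
    """Split the 16-bit value tcpdump printed as an 'ethertype'."""
    byte0 = (word >> 8) & 0xFF
    byte1 = word & 0xFF
    return {
        "cmd": DSA_CMDS[byte0 >> 6],
        "tagged": bool(byte0 & 0x20),
        "dev": byte0 & 0x1F,
        "port": byte1 >> 3,
    }


if __name__ == "__main__":
    for word in (0x4018, 0xC018):
        print(f"0x{word:04x}: {decode_dsa_head(word)}")
```

Under those assumptions 0x4018 reads as FROM_CPU and 0xc018 as FORWARD,
both for device 0, port 3.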

Thanks

  Peter

* Re: dsa/mv88e6xxx: leaking packets on MV88E6341 switch
  2020-12-01  9:00           ` Peter Vollmer
@ 2020-12-01 13:58             ` Andrew Lunn
  0 siblings, 0 replies; 12+ messages in thread
From: Andrew Lunn @ 2020-12-01 13:58 UTC (permalink / raw)
  To: Peter Vollmer; +Cc: Tobias Waldekranz, Network Development

Hi Peter

> Maybe I can get some information from our support if
> DSA_TAG_PROTO_EDSA is supported for the port config (0x4) register
> on the 6341 switch after all or if it should be omitted.

It would be nice to hear what Marvell says about this. It does seem an
odd thing to remove, so it could be a documentation issue.

    Andrew

* Re: dsa/mv88e6xxx: leaking packets on MV88E6341 switch
  2020-11-26 21:41       ` Tobias Waldekranz
  2020-11-26 22:23         ` Andrew Lunn
@ 2020-12-01  8:49         ` Peter Vollmer
  2020-12-01 12:29           ` Tobias Waldekranz
  1 sibling, 1 reply; 12+ messages in thread
From: Peter Vollmer @ 2020-12-01  8:49 UTC (permalink / raw)
  To: Tobias Waldekranz; +Cc: Andrew Lunn, Network Development

On Thu, Nov 26, 2020 at 10:41:44PM +0100, Tobias Waldekranz wrote:
> On Wed, Nov 25, 2020 at 15:09, Peter Vollmer <peter.vollmer@gmail.com> wrote:
> > - pinging from client0 (connected to lan0 ) to the bridge IP, the ping
> > requests (only the requests) are also seen on client1 connected to
> > lan1
> 
> This is the expected behavior of the current implementation I am
> afraid. It stems from the fact that the CPU responds to the echo request
> (or to any other request for that matter) with a FROM_CPU. This means
> that no learning takes place, and the SA of br0 will thus never reach
> the switch's FDB. So while client0 knows the MAC of br0, the switch
> (very counter-intuitively) does not.
> 
> The result is that the unicast echo request sent by client0 is flooded
> as unknown unicast by the switch. This way it reaches the CPU but also,
> as you have discovered, all other ports that allow unknown unicast to
> egress.
> 

Thanks for this explanation. Would there be a way to inject the br0 MAC
into the switch FDB using 'bridge fdb' or some other tool as a
workaround?
And is this behaviour the same with all other DSA-capable
switches (or at least the mv88e6xxx ones)? Will this change eventually,
after the implementation is complete?

Thanks and best regards

  Peter

* Re: dsa/mv88e6xxx: leaking packets on MV88E6341 switch
  2020-12-01  8:49         ` Peter Vollmer
@ 2020-12-01 12:29           ` Tobias Waldekranz
  0 siblings, 0 replies; 12+ messages in thread
From: Tobias Waldekranz @ 2020-12-01 12:29 UTC (permalink / raw)
  To: Peter Vollmer; +Cc: Andrew Lunn, Vladimir Oltean, Network Development

On Tue, Dec 01, 2020 at 09:49, Peter Vollmer <peter.vollmer@gmail.com> wrote:
> On Thu, Nov 26, 2020 at 10:41:44PM +0100, Tobias Waldekranz wrote:
>> On Wed, Nov 25, 2020 at 15:09, Peter Vollmer <peter.vollmer@gmail.com> wrote:
>> > - pinging from client0 (connected to lan0 ) to the bridge IP, the ping
>> > requests (only the requests) are also seen on client1 connected to
>> > lan1
>> 
>> This is the expected behavior of the current implementation I am
>> afraid. It stems from the fact that the CPU responds to the echo request
>> (or to any other request for that matter) with a FROM_CPU. This means
>> that no learning takes place, and the SA of br0 will thus never reach
>> the switch's FDB. So while client0 knows the MAC of br0, the switch
>> (very counter-intuitively) does not.
>> 
>> The result is that the unicast echo request sent by client0 is flooded
>> as unknown unicast by the switch. This way it reaches the CPU but also,
>> as you have discovered, all other ports that allow unknown unicast to
>> egress.
>> 
>
> Thanks for this explanation. Would there be a way to inject the br0 MAC
> into the switch FDB using 'bridge fdb' or some other tool as a
> workaround ?

Unfortunately not. DSA will only attempt to offload FDB entries on user
ports to the ATU at the moment. Vladimir has started work on a series
that would also offload addresses from "foreign" ports:

https://lore.kernel.org/netdev/20201108131953.2462644-1-olteanv@gmail.com/

His work could possibly be extended to include addresses added to the
bridge itself.

> And is this behaviour the same with all other DSA capable
> switches (or at least the mv88e6xxx ones)?  Will this change eventually 

For mv88e6xxx, yes. These devices will never perform learning on
FROM_CPU frames.

> after the implementation is complete ?

I sure hope so. There are multiple ways forward here. Vladimir's approach
of adding dynamically learned addresses as static entries is one way.

I would like to do some work to optimize multicast forwarding
performance that would also, as a side effect, solve this
problem: we would start sending FORWARD frames from the CPU for bridged
traffic, and thus the switch would be able to learn the location of the
source address.
