netdev.vger.kernel.org archive mirror
* Regression with improved multi chip isolation
@ 2022-03-06 19:15 Andrew Lunn
  2022-03-06 19:49 ` Vladimir Oltean
  2022-03-07  9:21 ` Tobias Waldekranz
  0 siblings, 2 replies; 4+ messages in thread
From: Andrew Lunn @ 2022-03-06 19:15 UTC
  To: Tobias Waldekranz; +Cc: netdev, Vladimir Oltean, emeric.dupont

Hi Tobias

I just found a regression with:

d352b20f4174a6bd998992329b773ab513232880 is the first bad commit
commit d352b20f4174a6bd998992329b773ab513232880
Author: Tobias Waldekranz <tobias@waldekranz.com>
Date:   Thu Feb 3 11:16:56 2022 +0100

    net: dsa: mv88e6xxx: Improve multichip isolation of standalone ports
    
    Given that standalone ports are now configured to bypass the ATU and
    forward all frames towards the upstream port, extend the ATU bypass to
    multichip systems.


I have a ZII devel B setup:

brctl addbr br0                                                                 
brctl addif br0 lan0                                                            
brctl addif br0 lan1                                                            
                                                                                
ip link set br0 up                                                              
ip link set lan0 up                                                             
ip link set lan1 up                                                             
                                                                                
ip link add link br0 name br0.11 type vlan id 11                                
ip link set br0.11 up                                                           
ip addr add 10.42.11.1/24 dev br0.11

As it happens, lan0 has link, and I run tcpdump on the link peer. lan1
does not have link.

I then ping 10.42.11.2.

I found that the ARP Requests (who-has 10.42.11.2 tell 10.42.11.1) are
getting dropped. I also see:

     p06_sw_in_filtered: 122
     p06_sw_out_filtered: 90
     p06_atu_member_violation: 0
     p06_atu_miss_violation: 0
     p06_atu_full_violation: 0
     p06_vtu_member_violation: 0
     p06_vtu_miss_violation: 121

port 6 is the CPU port. Both p06_vtu_miss_violation and
p06_sw_in_filtered are incrementing with each ARP Request broadcast
from the host.

The bridge should be VLAN-unaware: vlan_filtering is 0.

$ ip -d link show br0
16: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 8e:22:a0:47:66:f9 brd ff:ff:ff:ff:ff:ff promiscuity 0
    bridge forward_delay 1500 hello_time 200 max_age 2000 ageing_time 30000 stp_state 0 priority 32768 vlan_filtering 0 bridge_id 8000.8e:22:a0:47:66:f9 designated_root 8000.8e:22:a0:47:66:f9 root_port 0 root_path_cost 0 topology_change 0 topology_change_detected 0 hello_timer    0.00 tcn_timer    0.00 topology_change_timer    0.00 gc_timer  295.16 group_fwd_mask 0 group_address 01:80:c2:00:00:00 mcast_snooping 1 mcast_router 1 mcast_query_use_ifaddr 0 mcast_querier 0 mcast_hash_elasticity 16 mcast_hash_max 4096 mcast_last_member_count 2 mcast_startup_query_count 2 mcast_last_member_interval 100 mcast_membership_interval 26000 mcast_querier_interval 25500 mcast_query_interval 12500 mcast_query_response_interval 1000 mcast_startup_query_interval 3125 addrgenmode eui64 numtxqueues 1 gso_max_size 65536 gso_max_segs 65535

Thanks
	Andrew


* Re: Regression with improved multi chip isolation
  2022-03-06 19:15 Regression with improved multi chip isolation Andrew Lunn
@ 2022-03-06 19:49 ` Vladimir Oltean
  2022-03-06 20:27   ` Andrew Lunn
  2022-03-07  9:21 ` Tobias Waldekranz
  1 sibling, 1 reply; 4+ messages in thread
From: Vladimir Oltean @ 2022-03-06 19:49 UTC
  To: Andrew Lunn; +Cc: Tobias Waldekranz, netdev, emeric.dupont@zii.aero

On Sun, Mar 06, 2022 at 08:15:14PM +0100, Andrew Lunn wrote:
> Hi Tobias
>
> I just found a regression with:
>
> d352b20f4174a6bd998992329b773ab513232880 is the first bad commit
> commit d352b20f4174a6bd998992329b773ab513232880
> Author: Tobias Waldekranz <tobias@waldekranz.com>
> Date:   Thu Feb 3 11:16:56 2022 +0100
>
>     net: dsa: mv88e6xxx: Improve multichip isolation of standalone ports
>
>     Given that standalone ports are now configured to bypass the ATU and
>     forward all frames towards the upstream port, extend the ATU bypass to
>     multichip systems.
>
>
> I have a ZII devel B setup:
>
> brctl addbr br0
> brctl addif br0 lan0
> brctl addif br0 lan1
>
> ip link set br0 up
> ip link set lan0 up
> ip link set lan1 up
>
> ip link add link br0 name br0.11 type vlan id 11
> ip link set br0.11 up
> ip addr add 10.42.11.1/24 dev br0.11
>
> As it happens, lan0 has link, and I run tcpdump on the link peer. lan1
> does not have link.
>
> I then ping 10.42.11.2.
>
> I found that the ARP Requests (who-has 10.42.11.2 tell 10.42.11.1) are
> getting dropped. I also see:
>
>      p06_sw_in_filtered: 122
>      p06_sw_out_filtered: 90
>      p06_atu_member_violation: 0
>      p06_atu_miss_violation: 0
>      p06_atu_full_violation: 0
>      p06_vtu_member_violation: 0
>      p06_vtu_miss_violation: 121
>
> port 6 is the CPU port. Both p06_vtu_miss_violation and
> p06_sw_in_filtered are incrementing with each ARP Request broadcast
> from the host.
>
> The bridge should be vlan unaware, vlan_filtering is 0.
>
> $ ip -d link show br0
> 16: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
>     link/ether 8e:22:a0:47:66:f9 brd ff:ff:ff:ff:ff:ff promiscuity 0
>     bridge forward_delay 1500 hello_time 200 max_age 2000 ageing_time 30000 stp_state 0 priority 32768 vlan_filtering 0 bridge_id 8000.8e:22:a0:47:66:f9 designated_root 8000.8e:22:a0:47:66:f9 root_port 0 root_path_cost 0 topology_change 0 topology_change_detected 0 hello_timer    0.00 tcn_timer    0.00 topology_change_timer    0.00 gc_timer  295.16 group_fwd_mask 0 group_address 01:80:c2:00:00:00 mcast_snooping 1 mcast_router 1 mcast_query_use_ifaddr 0 mcast_querier 0 mcast_hash_elasticity 16 mcast_hash_max 4096 mcast_last_member_count 2 mcast_startup_query_count 2 mcast_last_member_interval 100 mcast_membership_interval 26000 mcast_querier_interval 25500 mcast_query_interval 12500 mcast_query_response_interval 1000 mcast_startup_query_interval 3125 addrgenmode eui64 numtxqueues 1 gso_max_size 65536 gso_max_segs 65535

This example of injecting traffic through br0.11 is interesting because
I think that Tobias' patch merely exposes a shortcoming of tag_dsa.c.
The tagger should inject packets into the switch in VLAN 4095
(MV88E6XXX_VID_BRIDGED), because the ports offload a VLAN-unaware bridge.
Yet my guess is that it probably does so in VID 11 - this can be seen
using tcpdump and analyzing the DSA header. Hence the problem. Tobias' patch
basically enables 802.1Q secure mode on DSA and CPU ports, so any VLAN
that is absent from the VTU (here 11) will be dropped.
Sadly I am not at my desk right now so I can't work on a patch or test
one, but if I'm right, I'd go in the direction of refactoring
dsa_xmit_ll() such that it considers "skb->protocol == htons(ETH_P_8021Q)"
only while the port is under a VLAN-aware bridge. Hopefully this makes
some sense.
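
To make the suggestion concrete, here is a minimal sketch of the VID
selection described above. It is not the code in tag_dsa.c and not a
proposed patch: sketch_tag_vid(), its parameters and SKETCH_VID_BRIDGED
are hypothetical names invented for illustration, and the fallback VID is
simply whatever the driver would use for the port (4095, i.e.
MV88E6XXX_VID_BRIDGED, for a VLAN-unaware bridge as in this report).

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for MV88E6XXX_VID_BRIDGED (VID 4095 per the thread). */
#define SKETCH_VID_BRIDGED 4095

/*
 * Hypothetical helper: pick the VID to place in the DSA tag on transmit.
 *
 * vlan_aware   - the port is under a VLAN-aware bridge (vlan_filtering 1)
 * has_8021q    - the frame carries an 802.1Q tag (skb->protocol check)
 * frame_vid    - VID found in that 802.1Q tag, if any
 * fallback_vid - VID to use when the frame's own tag must not be honoured
 */
static uint16_t sketch_tag_vid(bool vlan_aware, bool has_8021q,
			       uint16_t frame_vid, uint16_t fallback_vid)
{
	/* Only a VLAN-aware bridge lets the frame's tag choose the VLAN. */
	if (vlan_aware && has_8021q)
		return frame_vid & 0x0fff;

	/* Otherwise the 802.1Q tag is just payload as far as the switch goes. */
	return fallback_vid;
}
```

With the setup from the report (vlan_filtering 0, frame tagged with VID
11), sketch_tag_vid(false, true, 11, SKETCH_VID_BRIDGED) yields 4095, so
the frame would be classified to a VLAN that is present in the VTU
instead of triggering a VTU miss on VID 11.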


* Re: Regression with improved multi chip isolation
  2022-03-06 19:49 ` Vladimir Oltean
@ 2022-03-06 20:27   ` Andrew Lunn
  0 siblings, 0 replies; 4+ messages in thread
From: Andrew Lunn @ 2022-03-06 20:27 UTC
  To: Vladimir Oltean; +Cc: Tobias Waldekranz, netdev, emeric.dupont@zii.aero

> This example of injecting traffic through br0.11 is interesting because
> I think that Tobias' patch merely exposes a shortcoming of tag_dsa.c.
> The tagger should inject packets into the switch in VLAN 4095
> (MV88E6XXX_VID_BRIDGED), because the ports offload a VLAN-unaware bridge.
> Yet my guess is that it probably does so in VID 11 - this can be seen
> using tcpdump and analyzing the DSA header.

20:25:42.239330 5a:e0:ae:a2:f6:db (oui Unknown) > Broadcast, ethertype MEDSA (0xdada), length 50: Forward, tagged, dev.port:vlan 3.0:11, pri 0: ethertype ARP (0x0806) Request who-has 10.42.11.2 tell 10.42.11.1, length 28
20:25:43.279670 5a:e0:ae:a2:f6:db (oui Unknown) > Broadcast, ethertype MEDSA (0xdada), length 50: Forward, tagged, dev.port:vlan 3.0:11, pri 0: ethertype ARP (0x0806) Request who-has 10.42.11.2 tell 10.42.11.1, length 28
20:25:44.319299 5a:e0:ae:a2:f6:db (oui Unknown) > Broadcast, ethertype MEDSA (0xdada), length 50: Forward, tagged, dev.port:vlan 3.0:11, pri 0: ethertype ARP (0x0806) Request who-has 10.42.11.2 tell 10.42.11.1, length 28
20:25:45.359288 5a:e0:ae:a2:f6:db (oui Unknown) > Broadcast, ethertype MEDSA (0xdada), length 50: Forward, tagged, dev.port:vlan 3.0:11, pri 0: ethertype ARP (0x0806) Request who-has 10.42.11.2 tell 10.42.11.1, length 28
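
For anyone who wants to check the "dev.port:vlan 3.0:11" reading against
the raw bytes, the sketch below decodes the 4-byte DSA tag that follows
the EtherType 0xdada header. The bit layout is an assumption based on my
reading of how tag_dsa.c builds and parses the tag, so verify it against
the kernel source before relying on it. struct sketch_dsa_tag and
sketch_dsa_decode() are hypothetical names, and the byte values in the
note afterwards are derived from that layout, not taken from a capture.

```c
#include <stdint.h>

/* Fields of interest from a 4-byte Marvell DSA tag (assumed layout). */
struct sketch_dsa_tag {
	uint8_t cmd;      /* 3 is assumed to mean "Forward" */
	uint8_t tagged;   /* frame was 802.1Q tagged */
	uint8_t dev;      /* switch device number */
	uint8_t port;     /* port number on that device */
	uint8_t pri;      /* 802.1p priority */
	uint16_t vid;     /* VLAN ID */
};

/* Decode the four tag bytes h[0..3] into *t. */
static void sketch_dsa_decode(const uint8_t h[4], struct sketch_dsa_tag *t)
{
	t->cmd = h[0] >> 6;                 /* top two bits: command */
	t->tagged = (h[0] >> 5) & 0x01;     /* bit 5: tagged flag */
	t->dev = h[0] & 0x1f;               /* low five bits: device */
	t->port = (h[1] >> 3) & 0x1f;       /* top five bits: port */
	t->pri = h[2] >> 5;                 /* top three bits: priority */
	t->vid = (uint16_t)(((h[2] & 0x0f) << 8) | h[3]); /* 12-bit VID */
}
```

Under this layout the tag bytes e3 00 00 0b decode to Forward, tagged,
device 3, port 0, priority 0, VID 11, which matches the tcpdump lines
above.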

	Andrew


* Re: Regression with improved multi chip isolation
  2022-03-06 19:15 Regression with improved multi chip isolation Andrew Lunn
  2022-03-06 19:49 ` Vladimir Oltean
@ 2022-03-07  9:21 ` Tobias Waldekranz
  1 sibling, 0 replies; 4+ messages in thread
From: Tobias Waldekranz @ 2022-03-07  9:21 UTC
  To: Andrew Lunn; +Cc: netdev, Vladimir Oltean, emeric.dupont

On Sun, Mar 06, 2022 at 20:15, Andrew Lunn <andrew@lunn.ch> wrote:
> Hi Tobias
>
> I just found a regression with:
>
> d352b20f4174a6bd998992329b773ab513232880 is the first bad commit
> commit d352b20f4174a6bd998992329b773ab513232880
> Author: Tobias Waldekranz <tobias@waldekranz.com>
> Date:   Thu Feb 3 11:16:56 2022 +0100
>
>     net: dsa: mv88e6xxx: Improve multichip isolation of standalone ports
>     
>     Given that standalone ports are now configured to bypass the ATU and
>     forward all frames towards the upstream port, extend the ATU bypass to
>     multichip systems.

Sorry about that.

> I have a ZII devel B setup:
>
> brctl addbr br0                                                                 
> brctl addif br0 lan0                                                            
> brctl addif br0 lan1                                                            
>                                                                                 
> ip link set br0 up                                                              
> ip link set lan0 up                                                             
> ip link set lan1 up                                                             
>                                                                                 
> ip link add link br0 name br0.11 type vlan id 11                                
> ip link set br0.11 up                                                           
> ip addr add 10.42.11.1/24 dev br0.11
>
> As it happens, lan0 has link, and I run tcpdump on the link peer. lan1
> does not have link.
>
> I then ping 10.42.11.2.
>
> I found that the ARP Requests (who-has 10.42.11.2 tell 10.42.11.1) are
> getting dropped. I also see:
>
>      p06_sw_in_filtered: 122
>      p06_sw_out_filtered: 90
>      p06_atu_member_violation: 0
>      p06_atu_miss_violation: 0
>      p06_atu_full_violation: 0
>      p06_vtu_member_violation: 0
>      p06_vtu_miss_violation: 121
>
> port 6 is the CPU port. Both p06_vtu_miss_violation and
> p06_sw_in_filtered are incrementing with each ARP Request broadcast
> from the host.
>
> The bridge should be vlan unaware, vlan_filtering is 0.

Huh, a VLAN upper without filtering enabled; didn't consider that
use-case...

Vladimir has already correctly diagnosed the problem. I'm working on a
fix right now, which I aim to send later today.

