Re: MT7530 bug, forward broadcast and unknown frames to the correct CPU port

public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed

From: "Arınç ÜNAL" <arinc.unal@arinc9.com>
To: Vladimir Oltean <olteanv@gmail.com>
Cc: DENG Qingfang <dqfext@gmail.com>, Greg Ungerer <gerg@kernel.org>,
	Daniel Golle <daniel@makrotopia.org>,
	Richard van Schagen <richard@routerhints.com>,
	Richard van Schagen <vschagen@cs.com>,
	Frank Wunderlich <frank-w@public-files.de>,
	mithat.guner@xeront.com, erkin.bozoglu@xeront.com,
	bartel.eerdekens@constell8.be, netdev <netdev@vger.kernel.org>
Subject: Re: MT7530 bug, forward broadcast and unknown frames to the correct CPU port
Date: Sat, 29 Apr 2023 16:03:57 +0300	[thread overview]
Message-ID: <8d6a46a7-a769-4532-dd44-f230b705a675@arinc9.com> (raw)
In-Reply-To: <0183eb91-8517-f40f-c2bb-b229e45d6fa5@arinc9.com>

(The netdev mailing list didn't like the big attachments so I'm sending
this again with links to the documents instead.)

I'm answering to myself now that I know better.

On 28.04.2023 16:31, Arınç ÜNAL wrote:
> On 26.04.2023 23:54, Vladimir Oltean wrote:
>> On Sun, Apr 23, 2023 at 06:22:41PM +0300, Arınç ÜNAL wrote:
>>> Hey there folks,
>>>
>>> On mt753x_cpu_port_enable() there's code [0] that sets which port to 
>>> forward
>>> the broadcast, unknown multicast, and unknown unicast frames to. Since
>>> mt753x_cpu_port_enable() runs twice when both CPU ports are enabled, 
>>> port 6
>>> becomes the port to forward the frames to. But port 5 is the active 
>>> port, so
>>> no broadcast frames received from the user ports will be forwarded to 
>>> port
>>> 5. This breaks network connectivity when multiple ports are being 
>>> used as
>>> CPU ports.
>>>
>>> My testing shows that only after receiving a broadcast ARP frame from 
>>> port 5
>>> then forwarding it to the user port, the unicast frames received from 
>>> that
>>> user port will be forwarded to port 5. I tested this with ping.
>>>
>>> Forwarding broadcast and unknown unicast frames to the CPU port was done
>>> with commit 5a30833b9a16 ("net: dsa: mt7530: support MDB and bridge flag
>>> operations"). I suppose forwarding the broadcast frames only to the 
>>> CPU port
>>> is what "disable flooding" here means.
>>
>> Flooding means forwarding a packet that does not have a precise 
>> destination
>> (its MAC DA is not present in the FDB or MDB). Flooding is done towards
>> the ports that have flooding enabled.
>>
>>>
>>> It’s a mystery to me how the switch classifies multicast and unicast 
>>> frames
>>> as unknown. Bartel's testing showed LLDP frames fall under this 
>>> category.
>>
>> What is mysterious exactly? What's not in the FDB/MDB is unknown. And
>> DSA, unless the requirements from dsa_switch_supports_uc_filtering() and
>> dsa_switch_supports_mc_filtering() are satisfied, will not program MAC
>> addresses for host RX filtering to the CPU port(s).
>>
>> This switch apparently has the option to automatically learn from the 
>> MAC SA
>> of packets injected by software. That option is automatically enabled
>> unless MTK_HDR_XMIT_SA_DIS is set (which currently it never is).
>>
>> So when software sends a broadcast ARP frame from port 5, the switch
>> learns the MAC SA of this packet (which is the software MAC address of
>> the user port) and it associates it with port 5. So future traffic
>> destined to the user port's software MAC address now reaches port 5, the
>> active CPU port (and the real CPU port from DSA's perspective).
>>
>> Wait 5 minutes for the learned FDB entry to expire, and the problem will
>> probably be back.
> 
> Understood, thank you.
> 
>>
>> LLDP frames should not obey the same rules. They are sent to the MAC DA
>> of 01:80:c2:00:00:0e, which is in the link-local multicast address space
>> (hence the "LL" in the name), and which according to IEEE 802.1Q-2018 is
>> the "Nearest Bridge group address":
>>
>> | The Nearest Bridge group address is an address that no conformant TPMR
>> | component, S-VLAN component, C-VLAN component, or MAC Bridge can
>> | forward. PDUs transmitted using this destination address, or any of the
>> | other addresses that appear in all three tables, can therefore 
>> travel no
>> | further than those stations that can be reached via a single individual
>> | LAN from the originating station. Hence the Nearest Bridge group 
>> address
>> | is also known as the Individual LAN Scope group address.
>>
>> Removing a packet from the forwarding data plane and delivering it only
>> to the CPU is known as "trapping", and thus, it is not "flooding".
>>
>> The MAC SA learning trick will not make port 5 see LLDP frames, since
>> those are not targeted towards a unicast MAC address which could be
>> learned.
> 
> Got it. Bartel has already wrote code to trap the LLDP frames with :03 
> and :0E MAC DA to the CPU port by utilising the MT753X_RGAC2 (0x2C) 
> register.
> 
> I will make sure the code gets in after the issues with multiple CPUs 
> are dealt with.
> 
>>
>>>
>>> Until the driver supports changing the DSA conduit, unknown frames 
>>> should be
>>> forwarded to the active CPU port, not the numerically greater one. 
>>> Any ideas
>>> how to address this and the "disable flooding" case?
>>
>> I think I also signaled the reverse problem in the other thread:
>> https://lore.kernel.org/netdev/20230222193951.rjxgxmopyatyv2t7@skbuf/
> 
> /* BPDU to CPU port */
> dsa_switch_for_each_cpu_port(cpu_dp, ds) {
>      mt7530_rmw(priv, MT7531_CFC, MT7531_CPU_PMAP_MASK,
>              BIT(cpu_dp->index));
>      break;
> }
> 
> I don't really understand this. This doesn't seem to be related to 
> processing BPDUs. This looks something extra for MT7531, on top of 
> MT7530_MFC.
> 
> If both CPU ports are enabled, should BIT(5) and BIT(6) on 
> MT7531_CPU_PMAP_MASK be set to 1?

Yes, the MT7531 reference manual [0] points to that. Looks like the
MT7531 switch looks at the CPU ports set on the CPU_PMAP bits for the
BPDU_PORT_FW configuration on the MT753X_BPC register to work.

MT7531 has got the MT7530_MFC register too but it's slightly different.
The last 8 bits on the register are for IGMP/MLD Query Frame Flooding
Ports, instead of CPU_EN, CPU_PORT, MIRROR_EN, MIRROR_PORT on MT7530 [1].

Mirroring and CPU port configuration is handled on MT7531_CFC for
MT7531, so it makes sense.

> 
>>
>> Well, the most important step in fixing the problem would be to
>> politically decide which port should be the active CPU port in the case
>> of multiple choices, then to start fixing up the bits in the driver that
>> disagree with that. Having half the code think it's 5 and the other half
>> think it's 6 surely isn't any good.
>>
>> There was a discussion in the other thread with Frank that port 6 would
>> be somehow preferable if both are available, but I haven't seen 
>> convincing
>> enough arguments yet.
> 
> It seems to be fine for MT7530, but MT7531BE has got RGMII on port 5. It 
> would be much better to use port 6 which has got a 2.5G SGMII link.
> 
> The preferred_default_local_cpu_port operation should work properly 
> after we figure out the issue above and below.
> 
>>
>>>
>>> There's also this "set the CPU number" code that runs only for 
>>> MT7621. I'm
>>> not sure why this is needed or why it's only needed for MT7621. Greg, 
>>> could
>>> you shed some light on this since you added this code with commit
>>> ddda1ac116c8 ("net: dsa: mt7530: support the 7530 switch on the Mediatek
>>> MT7621 SoC")?
>>>
>>> There're more things to discuss after supporting changing the DSA 
>>> conduit,
>>> such as which CPU port to forward the unknown frames to, when user ports
>>> under different conduits receive unknown frames. What makes sense to 
>>> me is,
>>> if there are multiple CPU ports being used, forward the unknown 
>>> frames to
>>> port 6. This is already the case except the code runs twice. If not, 
>>> set it
>>> to whatever 'int port' is, which is the default behaviour already.
>>>
>>> [0] 
>>> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/drivers/net/dsa/mt7530.c#n1005
>>
>> I suspect you may not have run sufficient tests. When there are 2 CPU
>> ports, both of them should be candidates for flooding unknown traffic.
>> Don't worry, software won't see duplicates, because the user <-> CPU port
>> affinity setting should restrict forwarding of the flooded frames to a
>> single CPU port.
>>
>> You might be confused by several things about this:
>>
>>     /* Disable flooding by default */
>>     mt7530_rmw(priv, MT7530_MFC, BC_FFP_MASK | UNM_FFP_MASK | 
>> UNU_FFP_MASK,
>>            BC_FFP(BIT(port)) | UNM_FFP(BIT(port)) | UNU_FFP(BIT(port)));
>>
>> First, the comment. It means to say: "disable flooding on all ports
>> except for this CPU port".
>>
>> Then, the fact that it runs twice, unsetting flooding for the first CPU
>> port (5) and setting it for the second one (6). There should be no
>> hardware limitation there. Both BIT(5) and BIT(6) could be part of the
>> flood mask without any problem.
>>
>> Perhaps the issue is that MT7530_MFC should have been written to all
>> zeroes as a first step, and then, every mt753x_cpu_port_enable() call
>> enables flooding to the "int port" argument.
> 
> Thanks a lot! I'm just learning about the process of bit masking. If I 
> understand correctly, masks for BC, UNM, and UNU are defined to have 8 
> bits.
> 
> UNU_FFP(~0) means that all bits are set to 1. I am supposed to first 
> clear them, then set bit 5 and 6 to 1.
> 
> Before mt753x_cpu_port_enable():
> 
> mt7530_rmw(priv, MT7530_MFC, BC_FFP_MASK | UNM_FFP_MASK | UNU_FFP_MASK, 0);
> 
> For every mt753x_cpu_port_enable():
> 
> mt7530_rmw(priv, MT7530_MFC, BC_FFP_MASK | UNM_FFP_MASK | UNU_FFP_MASK,
>         (BC_FFP(BIT(port))) & BC_FFP_MASK | (UNM_FFP(BIT(port))) &
>         UNM_FFP_MASK | (UNU_FFP(BIT(port))) & UNU_FFP_MASK);

This is the final diff I'm going to submit to net.

diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index 4d5c5820e461..cc5fa641b026 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -1008,9 +1008,9 @@ mt753x_cpu_port_enable(struct dsa_switch *ds, int port)
  	mt7530_write(priv, MT7530_PVC_P(port),
  		     PORT_SPEC_TAG);
  
-	/* Disable flooding by default */
-	mt7530_rmw(priv, MT7530_MFC, BC_FFP_MASK | UNM_FFP_MASK | UNU_FFP_MASK,
-		   BC_FFP(BIT(port)) | UNM_FFP(BIT(port)) | UNU_FFP(BIT(port)));
+	/* Enable flooding on the CPU port */
+	mt7530_set(priv, MT7530_MFC, BC_FFP(BIT(port)) | UNM_FFP(BIT(port)) |
+		   UNU_FFP(BIT(port)));
  
  	/* Set CPU port number */
  	if (priv->id == ID_MT7621)
@@ -2225,6 +2225,10 @@ mt7530_setup(struct dsa_switch *ds)
  		/* Disable learning by default on all ports */
  		mt7530_set(priv, MT7530_PSC_P(i), SA_DIS);
  
+		/* Disable flooding on all ports */
+		mt7530_clear(priv, MT7530_MFC, BC_FFP(BIT(i)) | UNM_FFP(BIT(i)) |
+			     UNU_FFP(BIT(i)));
+
  		if (dsa_is_cpu_port(ds, i)) {
  			ret = mt753x_cpu_port_enable(ds, i);
  			if (ret)
@@ -2412,6 +2416,10 @@ mt7531_setup(struct dsa_switch *ds)
  
  		mt7530_set(priv, MT7531_DBG_CNT(i), MT7531_DIS_CLR);
  
+		/* Disable flooding on all ports */
+		mt7530_clear(priv, MT7530_MFC, BC_FFP(BIT(i)) | UNM_FFP(BIT(i)) |
+			     UNU_FFP(BIT(i)));
+
  		if (dsa_is_cpu_port(ds, i)) {
  			ret = mt753x_cpu_port_enable(ds, i);
  			if (ret)

[0]
https://drive.google.com/file/d/1aVdQz3rbKWjkvdga8-LQ-VFXjmHR8yf9/view?usp=sharing
[1]
https://github.com/vschagen/documents/blob/main/MT7621_ProgrammingGuide_GSW_v0_3.pdf

Arınç

next prev parent reply	other threads:[~2023-04-29 13:04 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-23 15:22 MT7530 bug, forward broadcast and unknown frames to the correct CPU port Arınç ÜNAL
2023-04-26 20:54 ` Vladimir Oltean
2023-04-28 13:31   ` Arınç ÜNAL
2023-04-29 13:03     ` Arınç ÜNAL [this message]
2023-04-29 17:35       ` Vladimir Oltean
2023-04-29 18:39         ` Arınç ÜNAL
2023-04-29 18:44           ` Arınç ÜNAL
2023-04-29 18:56           ` Vladimir Oltean
2023-04-29 19:52             ` Arınç ÜNAL
2023-05-01 10:09               ` Vladimir Oltean
2023-05-01 10:31                 ` Daniel Golle
2023-05-01 10:43                   ` Arınç ÜNAL
2023-05-10  8:59                     ` Arınç ÜNAL
2023-05-10 14:02                       ` Vladimir Oltean
2023-05-16 20:01                         ` Arınç ÜNAL
2023-05-17 15:52                           ` Vladimir Oltean

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:4d5c5820e46 dfblob:cc5fa641b02 )
 OR (
bs:"Re: MT7530 bug, forward broadcast and unknown frames to the correct CPU port" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8d6a46a7-a769-4532-dd44-f230b705a675@arinc9.com \
    --to=arinc.unal@arinc9.com \
    --cc=bartel.eerdekens@constell8.be \
    --cc=daniel@makrotopia.org \
    --cc=dqfext@gmail.com \
    --cc=erkin.bozoglu@xeront.com \
    --cc=frank-w@public-files.de \
    --cc=gerg@kernel.org \
    --cc=mithat.guner@xeront.com \
    --cc=netdev@vger.kernel.org \
    --cc=olteanv@gmail.com \
    --cc=richard@routerhints.com \
    --cc=vschagen@cs.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox