Re: Isolating DSA Slave Interfaces on a Bridge with Bridge Offloading

public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed

From: "Arınç ÜNAL" <arinc.unal@arinc9.com>
To: Vladimir Oltean <olteanv@gmail.com>
Cc: "Jakub Kicinski" <kuba@kernel.org>,
	"Florian Fainelli" <f.fainelli@gmail.com>,
	"Alvin Šipraga" <ALSI@bang-olufsen.dk>,
	"Linus Walleij" <linus.walleij@linaro.org>,
	"Andrew Lunn" <andrew@lunn.ch>,
	"Luiz Angelo Daros de Luca" <luizluca@gmail.com>,
	"DENG Qingfang" <dqfext@gmail.com>,
	erkin.bozoglu@xeront.com,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"Sean Wang" <sean.wang@mediatek.com>,
	"Mark Lee" <Mark-MC.Lee@mediatek.com>
Subject: Re: Isolating DSA Slave Interfaces on a Bridge with Bridge Offloading
Date: Sun, 10 Apr 2022 16:00:19 +0300	[thread overview]
Message-ID: <e427d390-0cd4-d322-1665-4176cd134884@arinc9.com> (raw)
In-Reply-To: <20220314140011.idmbyc33e7zfezu2@skbuf>

On 14/03/2022 17:00, Vladimir Oltean wrote:
> On Mon, Mar 14, 2022 at 04:13:47PM +0300, Arınç ÜNAL wrote:
>> Hey Vladimir,
>>
>> On 13/03/2022 16:32, Vladimir Oltean wrote:
>>> Hello Arınç,
>>>
>>> On Sun, Mar 13, 2022 at 02:23:47PM +0300, Arınç ÜNAL wrote:
>>>> Hi all,
>>>>
>>>> The company I work with has got a product with a Mediatek MT7530 switch.
>>>> They want to offer isolation for the switch ports on a bridge. I have run a
>>>> test on a slightly modified 5.17-rc1 kernel. These commands below should
>>>> prevent communication between the two interfaces:
>>>>
>>>> bridge link set dev sw0p4 isolated on
>>>>
>>>> bridge link set dev sw0p3 isolated on
>>>>
>>>> However, computers connected to each of these ports can still communicate
>>>> with each other. Bridge TX forwarding offload is implemented on the MT7530
>>>> DSA driver.
>>>>
>>>> What I understand is isolation works on the software and because of the
>>>> bridge offloading feature, the frames never reach the CPU where we can block
>>>> it.
>>>>
>>>> Two solutions I can think of:
>>>>
>>>> - Disable bridge offloading when isolation is enabled on a DSA slave
>>>> interface. Not the best solution but seems easy to implement.
>>>>
>>>> - When isolation is enabled on a DSA slave interface, do not mirror the
>>>> related FDB entries to the switch hardware so we can keep the bridge
>>>> offloading feature for other ports.
>>>>
>>>> I suppose this could only be achieved on switch specific DSA drivers so the
>>>> implementation would differ by each driver.
>>>>
>>>> Cheers.
>>>> Arınç
>>>
>>> To be clear, are you talking about a patched or unpatched upstream
>>> kernel? Because mt7530 doesn't implement bridge TX forwarding offload.
>>> This can be seen because it is missing the "*tx_fwd_offload = true;"
>>> line from its ->port_bridge_join handler (of course, not only that).
>>
>> I'm compiling the kernel using OpenWrt SDK so it's patched but mt7530 DSA
>> driver is untouched. I've seen this commit which made me think of that:
>> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/drivers/net/dsa/mt7530.c?id=b079922ba2acf072b23d82fa246a0d8de198f0a2
>>
>> I'm very inexperienced in coding so I must've made a wrong assumption.
>>
>> I've tested that mt7530 DSA driver delivers frames between switch ports
>> without the frames appearing on either the slave interfaces or the master
>> interface. I suppose I was confusing this feature with tx_fwd_offload?
>>
>> For example, the current state of the rtl8365mb DSA driver doesn't do this.
>> Each frame goes in and out of the DSA master interface to deliver frames
>> between the switch ports.
> 
> That is just "forwarding offload". It means "for each packet that a
> bridged switch port receives from the outside world, look up the
> hardware FDB and send the packet just to its intended destination. If
> there is no FDB entry, flood the packet in the entire forwarding domain
> and also to the CPU. The packet flooded to the CPU will have the
> skb->offload_fwd_mark bit set to true by the hardware driver to indicate
> to the bridge that flooding to this port's hardware domain has already
> been taken care of. Only flooding towards other (foreign) bridge ports
> needs to be done for this packet."
> 
> TX forwarding offload means "when the bridge wants to send a packet that
> needs to be replicated multiple times (either flooded or plain multicast)
> to swp0, swp1, swp2, swp3 (all in the same hardware forwarding domain),
> avoid replicating it in software (skb_clone) and send it just once to
> the device driver dealing with a port from that hardware domain (swp0).
> The driver will inject that single packet into the hardware in such a
> way that the packet is forwarded based on consulting the hardware FDB.
> If the hardware FDB and the bridge's software FDB are in sync, then
> (a) the packet was flooded by the bridge => the bridge has no FDB entry
>      for it. The hardware FDB has no entry for it either => the hardware
>      floods it too.
> (b) the packet was multicast => the bridge has an MDB entry, and so does
>      the hardware. The packet is forwarded to the ports in the MDB
>      entry."
> 
> This feature is basically the TX equivalent of the basic forwarding
> offload, hence the name. TX packets for which the bridge leaves the FDB
> lookup (and replication) responsibility to hardware have
> skb->offload_fwd_mark set to true, but by the bridge, this time.
> 
> I think it's very important to understand what the TX forwarding offload
> feature was meant to do and how it was meant to operate, because there
> was a discussion with Qingfang on this topic too, and it may be that
> mt7530 switch supports this in a different way.
> https://patchwork.kernel.org/project/netdevbpf/cover/20210703115705.1034112-1-vladimir.oltean@nxp.com/#24290759
> The approach of populating a port mask with more than 1 bit set in the
> tagger is complicated and may not even achieve all the benefits of TX
> forwarding offload. Forcing an FDB/MDB lookup is what is desirable,
> since typically this should also make those packets go through the
> address learning process (they will be treated as data plane packets).
> Address learning on the CPU port is desirable for packets send on behalf
> of a software bridge, and undesirable for packets sent from a standalone
> port.
> If you can enable address learning in the tagger for the packets that
> are sent with skb->offload_fwd_mark == true, then you should be able to
> remove ds->assisted_learning_on_cpu_port and still get the benefit
> described here:
> https://patchwork.kernel.org/project/netdevbpf/patch/20210106095136.224739-7-olteanv@gmail.com/
> 
>>>
>>> You are probably confused as to why is the BR_ISOLATED brport flag is
>>> not rejected by a switchdev interface when it will only lead to
>>> incorrect behavior.
>>>
>>> In fact we've had this discussion with Qingfang, who sent this patch to
>>> allow switchdev drivers to *at the very least* reject the flag, but
>>> didn't want to write up a correct commit description for the change:
>>> https://lore.kernel.org/all/CALW65jbotyW0MSOd-bd1TH_mkiBWhhRCQ29jgn+d12rXdj2pZA@mail.gmail.com/T/
>>> https://patchwork.kernel.org/project/netdevbpf/patch/20210811135247.1703496-1-dqfext@gmail.com/
>>> https://patchwork.kernel.org/project/netdevbpf/patch/20210812142213.2251697-1-dqfext@gmail.com/
>>
>> With v2 applied, when I issue the port isolation command, I get "RTNETLINK
>> answers: Not supported" as it's not implemented in mt7530 yet?
> 
> The entire idea is to get -EINVAL as returned from here, because
> BR_ISOLATED will be present in flags.mask:
> 
> static int
> mt7530_port_pre_bridge_flags(struct dsa_switch *ds, int port,
> 			     struct switchdev_brport_flags flags,
> 			     struct netlink_ext_ack *extack)
> {
> 	if (flags.mask & ~(BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD |
> 			   BR_BCAST_FLOOD))
> 		return -EINVAL;
> 
> 	return 0;
> }
> 
> Now, the entire error handling from br_switchdev_set_port_flag() is a
> bit of a disaster.
> https://elixir.bootlin.com/linux/latest/source/net/bridge/br_switchdev.c#L77
> 
> 	/* We run from atomic context here */
> 	err = call_switchdev_notifiers(SWITCHDEV_PORT_ATTR_SET, p->dev,
> 				       &info.info, extack);
> 	err = notifier_to_errno(err);
> 	if (err == -EOPNOTSUPP) // Drivers need to return -EINVAL here to distinguish themselves from "no one responded to this notifier", which is a non-error.
> 		return 0;
> 
> 	if (err) {
> 		if (extack && !extack->_msg)
> 			NL_SET_ERR_MSG_MOD(extack,
> 					   "bridge flag offload is not supported");
> 		return -EOPNOTSUPP; // However, any other error code passed from the driver, like -EINVAL, is ignored and replaced with -EOPNOTSUPP by the bridge.
> 	}
> 
> I think this is where the "Not supported" error message comes from, in
> your case. It would be good to double-check with prints, though.
> 
> Also, you might want to consider compiling iproute2 with libmnl support,
> to get extack messages from the kernel. Here it seems like you're
> missing "bridge flag offload is not supported", which would have been a
> clear indication that this is the code path that was taken.

I added a pr_info here as it looked easier to test this:

	/* We run from atomic context here */
	err = call_switchdev_notifiers(SWITCHDEV_PORT_ATTR_SET, p->dev,
				       &info.info, extack);
	err = notifier_to_errno(err);
	if (err == -EOPNOTSUPP)
		return 0;

	if (err) {
		if (extack && !extack->_msg)
			pr_info("Port flag is not supported");
			NL_SET_ERR_MSG_MOD(extack,
					   "bridge flag offload is not supported");
		pr_info("Port flag is not supported");
		return -EOPNOTSUPP;
	}

Looks like this is the expected result:

root@OpenWrt:/# bridge link set dev sw0p3 isolated on
[   55.862862] Port flag is not supported
RTNETLINK answers: Not supported
root@OpenWrt:/# bridge link set dev sw0p3 isolated on
[   55.862893] Port flag is not supported
[   58.492025] Port flag is not supported
RTNETLINK answers: Not supported

> 
>>> As a result, currently not even the correctness issue has still not yet
>>> been fixed, let alone having any driver act upon the feature correctly.
>>
>> It's unfortunate that this patch was put on hold just because of incomplete
>> commit description. In my eyes, this is a significant fix.
> 
> I agree here. You could send the patch yourself, after making some tests.

Will do.

Arınç

     prev parent reply	other threads:[~2022-04-10 13:00 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-13 11:23 Isolating DSA Slave Interfaces on a Bridge with Bridge Offloading Arınç ÜNAL
2022-03-13 13:32 ` Vladimir Oltean
2022-03-14 13:13   ` Arınç ÜNAL
2022-03-14 14:00     ` Vladimir Oltean
2022-04-10 13:00       ` Arınç ÜNAL [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e427d390-0cd4-d322-1665-4176cd134884@arinc9.com \
    --to=arinc.unal@arinc9.com \
    --cc=ALSI@bang-olufsen.dk \
    --cc=Mark-MC.Lee@mediatek.com \
    --cc=andrew@lunn.ch \
    --cc=dqfext@gmail.com \
    --cc=erkin.bozoglu@xeront.com \
    --cc=f.fainelli@gmail.com \
    --cc=kuba@kernel.org \
    --cc=linus.walleij@linaro.org \
    --cc=luizluca@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=olteanv@gmail.com \
    --cc=sean.wang@mediatek.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox