* [Bridge] [PATCH v2 2/3] net: mscc: Use NETIF_F_HW_BR_CAP
From: Horatiu Vultur @ 2019-08-26 8:11 UTC (permalink / raw)
To: roopa, nikolay, davem, UNGLinuxDriver, alexandre.belloni,
allan.nielsen, andrew, f.fainelli, netdev, linux-kernel, bridge
Cc: Horatiu Vultur
In-Reply-To: <1566807075-775-1-git-send-email-horatiu.vultur@microchip.com>
Enable NETIF_F_HW_BR_CAP feature for ocelot.
Because the HW can learn and flood the frames, so there is no need for SW
bridge to do this.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
---
drivers/net/ethernet/mscc/ocelot.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
index 4d1bce4..7d7c94b 100644
--- a/drivers/net/ethernet/mscc/ocelot.c
+++ b/drivers/net/ethernet/mscc/ocelot.c
@@ -2017,8 +2017,10 @@ int ocelot_probe_port(struct ocelot *ocelot, u8 port,
dev->ethtool_ops = &ocelot_ethtool_ops;
dev->hw_features |= NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_RXFCS |
- NETIF_F_HW_TC;
- dev->features |= NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_HW_TC;
+ NETIF_F_HW_TC | NETIF_F_HW_BR_CAP;
+ dev->features |= NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_HW_TC |
+ NETIF_F_HW_BR_CAP;
+ dev->priv_flags |= IFF_UNICAST_FLT;
memcpy(dev->dev_addr, ocelot->base_mac, ETH_ALEN);
dev->dev_addr[ETH_ALEN - 1] += port;
--
2.7.4
^ permalink raw reply related
* [Bridge] [PATCH v2 1/3] net: Add NETIF_HW_BR_CAP feature
From: Horatiu Vultur @ 2019-08-26 8:11 UTC (permalink / raw)
To: roopa, nikolay, davem, UNGLinuxDriver, alexandre.belloni,
allan.nielsen, andrew, f.fainelli, netdev, linux-kernel, bridge
Cc: Horatiu Vultur
In-Reply-To: <1566807075-775-1-git-send-email-horatiu.vultur@microchip.com>
This patch adds a netdev feature to allow the SW bridge not to set all the
slave interfaces in promisc mode if the HW is capable of learning and
flooading the frames.
The current implementation adds all the bridge ports in promisc mode. Even
if the HW has bridge capabilities(can learn and flood frames). Then all the
frames will be copy to the CPU even if there are cases where there is no
need for this.
For example in the following scenario:
+----------------------------------+
| SW BR |
+----------------------------------+
| | |
| | +------------------+
| | | HW BR |
| | +------------------+
| | | |
+ + + +
p1 p2 p3 p4
Case A: There is a SW bridge with the ports p1 and p2
Case B: There is a SW bridge with the ports p3 and p4.
Case C: THere is a SW bridge with the ports p2, p3 and p4.
For case A, the HW can't do learning and flooding of the frames. Therefore
all the frames need to be copied to the CPU to allow the SW bridge to
decide what do do with the frame(forward, flood, copy to the upper layers,
etc..).
For case B, there is HW support to learn and flood the frames. In this case
there is no point to send all the frames to the CPU(except for frames that
need to go to CPU and flooded frames if flooding is enabled). Because the
HW will already forward the frame to the correct network port, then the
SW bridge will not have anything to do. It would just use CPU cycles and
then drop the frame. The reason for dropping the frame is that the network
driver will set the flag fwd_offload_mark and then SW bridge will skip all
the ports that have the same parent_id as the port that received the frame.
Which is this case.
For case C, there is HW support to learn and flood frames for ports p3 and
p4 while p2 doesn't have HW support. In this case the port p2 needs to be
in promisc mode to allow SW bridge to do the learning and flooding of the
frames while ports p3 and p4 they don't need to be in promisc mode.
The ports p3 and p4 need to make sure that the CPU is in flood mask and
need to know which addresses can be access through SW bridge so it could
send those frames to CPU port. So it would allow the SW bridge to send to
the correct network port.
A workaround for all these cases is not to set the network port in
promisc mode if it is a bridge port, which is the case for mlxsw. Or not
to implement it at all, which is the case for ocelot. But the disadvantage
of this approach is that the network bridge ports can not be set in promisc
mode if there is a need to monitor all the traffic on that port using the
command 'ip link set dev swp promisc on'. This patch adds also support for
this case.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
---
include/linux/netdev_features.h | 6 ++++++
net/bridge/br_if.c | 11 ++++++++++-
net/core/ethtool.c | 1 +
3 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
index 4b19c54..b5a3463 100644
--- a/include/linux/netdev_features.h
+++ b/include/linux/netdev_features.h
@@ -78,6 +78,11 @@ enum {
NETIF_F_HW_TLS_TX_BIT, /* Hardware TLS TX offload */
NETIF_F_HW_TLS_RX_BIT, /* Hardware TLS RX offload */
+ NETIF_F_HW_BR_CAP_BIT, /* Hardware is capable to behave as a
+ * bridge to learn and switch frames
+ */
+
+
NETIF_F_GRO_HW_BIT, /* Hardware Generic receive offload */
NETIF_F_HW_TLS_RECORD_BIT, /* Offload TLS record */
@@ -150,6 +155,7 @@ enum {
#define NETIF_F_GSO_UDP_L4 __NETIF_F(GSO_UDP_L4)
#define NETIF_F_HW_TLS_TX __NETIF_F(HW_TLS_TX)
#define NETIF_F_HW_TLS_RX __NETIF_F(HW_TLS_RX)
+#define NETIF_F_HW_BR_CAP __NETIF_F(HW_BR_CAP)
/* Finds the next feature with the highest number of the range of start till 0.
*/
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 4fe30b1..93bfc55 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -161,7 +161,16 @@ void br_manage_promisc(struct net_bridge *br)
(br->auto_cnt == 1 && br_auto_port(p)))
br_port_clear_promisc(p);
else
- br_port_set_promisc(p);
+ /* If the HW has bridge capabilities to learn
+ * and flood the frames then there is no need
+ * to copy all the frames to the SW to do the
+ * same. Because the HW already switched the
+ * frame and then there is nothing to do for
+ * the SW bridge. The SW will just use CPU
+ * and it would drop the frame.
+ */
+ if (!(p->dev->features & NETIF_F_HW_BR_CAP))
+ br_port_set_promisc(p);
}
}
}
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 6288e69..10430fe 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -111,6 +111,7 @@ static const char netdev_features_strings[NETDEV_FEATURE_COUNT][ETH_GSTRING_LEN]
[NETIF_F_HW_TLS_RECORD_BIT] = "tls-hw-record",
[NETIF_F_HW_TLS_TX_BIT] = "tls-hw-tx-offload",
[NETIF_F_HW_TLS_RX_BIT] = "tls-hw-rx-offload",
+ [NETIF_F_HW_BR_CAP_BIT] = "bridge-capabilities-offload",
};
static const char
--
2.7.4
^ permalink raw reply related
* [Bridge] [PATCH v2 0/3] Add NETIF_F_HW_BR_CAP feature
From: Horatiu Vultur @ 2019-08-26 8:11 UTC (permalink / raw)
To: roopa, nikolay, davem, UNGLinuxDriver, alexandre.belloni,
allan.nielsen, andrew, f.fainelli, netdev, linux-kernel, bridge
Cc: Horatiu Vultur
When a network port is added to a bridge then the port is added in
promisc mode. Some HW that has bridge capabilities(can learn, forward,
flood etc the frames) they are disabling promisc mode in the network
driver when the port is added to the SW bridge.
This patch adds the feature NETIF_F_HW_BR_CAP so that the network ports
that have this feature will not be set in promisc mode when they are
added to a SW bridge.
In this way the HW that has bridge capabilities don't need to send all the
traffic to the CPU and can also implement the promisc mode and toggle it
using the command 'ip link set dev swp promisc on'
v1 -> v2
- rename feature to NETIF_F_HW_BR_CAP
- add better description in the commit message and in the code
- remove the check that all network driver have same netdev_ops and
just check for the feature NETIF_F_HW_BR_CAP when setting the network
port in promisc mode.
Horatiu Vultur (3):
net: Add NETIF_HW_BR_CAP feature
net: mscc: Use NETIF_F_HW_BR_CAP
net: mscc: Implement promisc mode.
drivers/net/ethernet/mscc/ocelot.c | 26 ++++++++++++++++++++++++--
include/linux/netdev_features.h | 6 ++++++
net/bridge/br_if.c | 11 ++++++++++-
net/core/ethtool.c | 1 +
4 files changed, 41 insertions(+), 3 deletions(-)
--
2.7.4
^ permalink raw reply
* Re: [Bridge] [PATCH] bridge:fragmented packets dropped by bridge
From: Jan Engelhardt @ 2019-08-26 7:59 UTC (permalink / raw)
To: Florian Westphal
Cc: Rundong Ge, yoshfuji, netdev, roopa, bridge, linux-kernel, kadlec,
nikolay, coreteam, netfilter-devel, kuznet, davem, pablo
In-Reply-To: <20190730123542.zrsrfvcy7t2n3d4g@breakpoint.cc>
On Tuesday 2019-07-30 14:35, Florian Westphal wrote:
>Rundong Ge <rdong.ge@gmail.com> wrote:
>> Given following setup:
>> -modprobe br_netfilter
>> -echo '1' > /proc/sys/net/bridge/bridge-nf-call-iptables
>> -brctl addbr br0
>> -brctl addif br0 enp2s0
>> -brctl addif br0 enp3s0
>> -brctl addif br0 enp6s0
>> -ifconfig enp2s0 mtu 1300
>> -ifconfig enp3s0 mtu 1500
>> -ifconfig enp6s0 mtu 1500
>> -ifconfig br0 up
>>
>> multi-port
>> mtu1500 - mtu1500|bridge|1500 - mtu1500
>> A | B
>> mtu1300
>
>How can a bridge forward a frame from A/B to mtu1300?
There might be a misunderstanding here judging from the shortness of this
thread.
I understood it such that the bridge ports (eth0,eth1) have MTU 1500, yet br0
(in essence the third bridge port if you so wish) itself has MTU 1300.
Therefore, frame forwarding from eth0 to eth1 should succeed, since the
1300-byte MTU is only relevant if the bridge decides the packet needs to be
locally delivered.
^ permalink raw reply
* Re: [Bridge] [PATCH] bridge:fragmented packets dropped by bridge
From: Rundong Ge @ 2019-08-26 2:45 UTC (permalink / raw)
To: Nikolay Aleksandrov
Cc: yoshfuji, netdev, Roopa Prabhu, bridge, Florian Westphal,
linux-kernel, kadlec, coreteam, netfilter-devel, kuznet, davem,
Pablo Neira Ayuso
In-Reply-To: <1dc87e69-628b-fd04-619a-8dbe5bdfa108@cumulusnetworks.com>
On Tue, Jul 30, 2019 at 8:41 PM Nikolay Aleksandrov
<nikolay@cumulusnetworks.com> wrote:
>
> On 30/07/2019 15:25, Rundong Ge wrote:
> > Given following setup:
> > -modprobe br_netfilter
> > -echo '1' > /proc/sys/net/bridge/bridge-nf-call-iptables
> > -brctl addbr br0
> > -brctl addif br0 enp2s0
> > -brctl addif br0 enp3s0
> > -brctl addif br0 enp6s0
> > -ifconfig enp2s0 mtu 1300
> > -ifconfig enp3s0 mtu 1500
> > -ifconfig enp6s0 mtu 1500
> > -ifconfig br0 up
> >
> > multi-port
> > mtu1500 - mtu1500|bridge|1500 - mtu1500
> > A | B
> > mtu1300
> >
> > With netfilter defragmentation/conntrack enabled, fragmented
> > packets from A will be defragmented in prerouting, and refragmented
> > at postrouting.
> > But in this scenario the bridge found the frag_max_size(1500) is
> > larger than the dst mtu stored in the fake_rtable whitch is
> > always equal to the bridge's mtu 1300, then packets will be dopped.
> >
> > This modifies ip_skb_dst_mtu to use the out dev's mtu instead
> > of bridge's mtu in bridge refragment.
> >
> > Signed-off-by: Rundong Ge <rdong.ge@gmail.com>
> > ---
> > include/net/ip.h | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/include/net/ip.h b/include/net/ip.h
> > index 29d89de..0512de3 100644
> > --- a/include/net/ip.h
> > +++ b/include/net/ip.h
> > @@ -450,6 +450,8 @@ static inline unsigned int ip_dst_mtu_maybe_forward(const struct dst_entry *dst,
> > static inline unsigned int ip_skb_dst_mtu(struct sock *sk,
> > const struct sk_buff *skb)
> > {
> > + if ((skb_dst(skb)->flags & DST_FAKE_RTABLE) && skb->dev)
> > + return min(skb->dev->mtu, IP_MAX_MTU);
> > if (!sk || !sk_fullsock(sk) || ip_sk_use_pmtu(sk)) {
> > bool forwarding = IPCB(skb)->flags & IPSKB_FORWARDED;
> >
> >
>
> I don't think this is correct, there's a reason why the bridge chooses the smallest
> possible MTU out of its members and this is simply a hack to circumvent it.
> If you really like to do so just set the bridge MTU manually, we've added support
> so it won't change automatically to the smallest, but then how do you pass packets
> 1500 -> 1300 in this setup ?
>
> You're talking about the frag_size check in br_nf_ip_fragment(), right ?
>
Hi Nikolay
My setup may not be common. And may I know if there is any reason to
use output port's MTU
to do the re-fragment check but then use the bridge's MTU to do the re-fragment?
Is it the expected behavior that the bridge's MTU will affect the
FORWARD traffic re-fragment,
because I used to think the bridge's MTU will only effect the OUTPUT
traffic sent from "br0".
And the modification in this patch will replace the MTU in the
fake_rtable which is only
used in the FORWARD re-fragment and won't affect the local traffic from "br0".
TKS
Raydodn
^ permalink raw reply
* Re: [Bridge] [PATCH 0/3] Add NETIF_F_HW_BRIDGE feature
From: Horatiu Vultur @ 2019-08-25 16:30 UTC (permalink / raw)
To: Jiri Pirko
Cc: alexandre.belloni, Florian Fainelli, nikolay, netdev, roopa,
bridge, linux-kernel, UNGLinuxDriver, allan.nielsen, davem
In-Reply-To: <20190824074204.GA15041@nanopsycho.orion>
The 08/24/2019 09:42, Jiri Pirko wrote:
> External E-Mail
>
>
> Sat, Aug 24, 2019 at 01:25:02AM CEST, f.fainelli@gmail.com wrote:
> >On 8/22/19 12:07 PM, Horatiu Vultur wrote:
> >> Current implementation of the SW bridge is setting the interfaces in
> >> promisc mode when they are added to bridge if learning of the frames is
> >> enabled.
> >> In case of Ocelot which has HW capabilities to switch frames, it is not
> >> needed to set the ports in promisc mode because the HW already capable of
> >> doing that. Therefore add NETIF_F_HW_BRIDGE feature to indicate that the
> >> HW has bridge capabilities. Therefore the SW bridge doesn't need to set
> >> the ports in promisc mode to do the switching.
> >
> >Then do not do anything when the ndo_set_rx_mode() for the ocelot
> >network device is called and indicates that IFF_PROMISC is set and that
> >your network port is a bridge port member. That is what mlxsw does AFAICT.
Yes, but then if you want to monitor all the traffic on a bridge port
you will not be able to do that. And this seems to be a limitation.
This is the case for mlxsw and ocelot(it doesn't implement at all
promisc mode) and might be others.
>
> Correct.
>
> >
> >As other pointed out, the Linux bridge implements a software bridge by
> >default, and because it needs to operate on a wide variety of network
> >devices, all with different capabilities, the easiest way to make sure
> >that all management (IGMP, BPDU, etc. ) as well as non-management
> >traffic can make it to the bridge ports, is to put the network devices
> >in promiscuous mode.
What if the HW can copy all the management traffic to the SW bridge and
HW knows to learn and flood frames. Then there is no point to set a
network port in promisc mode just because it is a bridge port member.
> >If this is suboptimal for you, you can take
> >shortcuts in your driver that do not hinder the overall functionality.
If I add this check, I don't see how any other network drivers will be
affected by this. If a network driver will start to use this then it
needs to know that the HW should be configure to include CPU in the
flood mask and to know which addresses can be reached through SW bridge.
> >
> >> This optimization takes places only if all the interfaces that are part
> >> of the bridge have this flag and have the same network driver.
> >>
> >> If the bridge interfaces is added in promisc mode then also the ports part
> >> of the bridge are set in promisc mode.
> >>
> >> Horatiu Vultur (3):
> >> net: Add HW_BRIDGE offload feature
> >> net: mscc: Use NETIF_F_HW_BRIDGE
> >> net: mscc: Implement promisc mode.
> >>
> >> drivers/net/ethernet/mscc/ocelot.c | 26 ++++++++++++++++++++++++--
> >> include/linux/netdev_features.h | 3 +++
> >> net/bridge/br_if.c | 29 ++++++++++++++++++++++++++++-
> >> net/core/ethtool.c | 1 +
> >> 4 files changed, 56 insertions(+), 3 deletions(-)
> >>
> >
> >
> >--
> >Florian
>
--
/Horatiu
^ permalink raw reply
* Re: [Bridge] [PATCH 1/3] net: Add HW_BRIDGE offload feature
From: Horatiu Vultur @ 2019-08-25 10:44 UTC (permalink / raw)
To: Florian Fainelli
Cc: Andrew Lunn, alexandre.belloni, nikolay, netdev, roopa, bridge,
linux-kernel, UNGLinuxDriver, allan.nielsen, davem
In-Reply-To: <afde1b82-2e4c-5b93-ff31-83cb80a0f7bd@gmail.com>
The 08/23/2019 16:30, Florian Fainelli wrote:
> External E-Mail
>
>
> On 8/23/19 5:39 AM, Horatiu Vultur wrote:
> > The 08/22/2019 22:08, Andrew Lunn wrote:
> >> External E-Mail
> >>
> >>
> >>> +/* Determin if the SW bridge can be offloaded to HW. Return true if all
> >>> + * the interfaces of the bridge have the feature NETIF_F_HW_SWITCHDEV set
> >>> + * and have the same netdev_ops.
> >>> + */
> >>
> >> Hi Horatiu
> >>
> >> Why do you need these restrictions. The HW bridge should be able to
> >> learn that a destination MAC address can be reached via the SW
> >> bridge. The software bridge can then forward it out the correct
> >> interface.
> >>
> >> Or are you saying your hardware cannot learn from frames which come
> >> from the CPU?
> >>
> >> Andrew
> >>
> > Hi Andrew,
> >
> > I do not believe that our HW can learn from frames which comes from the
> > CPU, at least not in the way they are injected today. But in case of Ocelot
> > (and the next chip we are working on), we have other issues in mixing with
> > foreign interfaces which is why we have the check in
> > ocelot_netdevice_dev_check.
> >
> > More important, as we responded to Nikolay, we properly introduced this
> > restriction for the wrong reasons.
> >
> > In SW bridge I will remove all these restrictions and only set ports in
> > promisc mode only if NETIF_F_HW_BRIDGE is not set.
> > Then in the network driver I can see if a foreign interface is added to
> > the bridge, and when that happens I can set the port in promisc mode.
> > Then the frames will be flooded to the SW bridge which eventually will
> > send to the foreign interface.
>
> Is that really necessary?
From what I see, it seems to be necessary.
> Is not the skb->fwd_offload_mark as well as
> the phys_switch_id supposed to tell that information to the bridge already?
Yes, the bridge is using the fwd_offload_mark to know that it should or
should not forward the frame. But in case that the network driver knows
that the SW bridge will not do anything with the frame, then there is no
point to send the frame to SW bridge just to use CPU cycles for dropping
the frame.
> --
> Florian
--
/Horatiu
^ permalink raw reply
* Re: [Bridge] [PATCH 0/3] Add NETIF_F_HW_BRIDGE feature
From: Stephen Hemminger @ 2019-08-24 12:05 UTC (permalink / raw)
To: Horatiu Vultur
Cc: alexandre.belloni, nikolay, netdev, roopa, bridge, linux-kernel,
UNGLinuxDriver, allan.nielsen, davem
In-Reply-To: <1566500850-6247-1-git-send-email-horatiu.vultur@microchip.com>
On Thu, 22 Aug 2019 21:07:27 +0200
Horatiu Vultur <horatiu.vultur@microchip.com> wrote:
> Current implementation of the SW bridge is setting the interfaces in
> promisc mode when they are added to bridge if learning of the frames is
> enabled.
> In case of Ocelot which has HW capabilities to switch frames, it is not
> needed to set the ports in promisc mode because the HW already capable of
> doing that. Therefore add NETIF_F_HW_BRIDGE feature to indicate that the
> HW has bridge capabilities. Therefore the SW bridge doesn't need to set
> the ports in promisc mode to do the switching.
> This optimization takes places only if all the interfaces that are part
> of the bridge have this flag and have the same network driver.
>
> If the bridge interfaces is added in promisc mode then also the ports part
> of the bridge are set in promisc mode.
>
> Horatiu Vultur (3):
> net: Add HW_BRIDGE offload feature
> net: mscc: Use NETIF_F_HW_BRIDGE
> net: mscc: Implement promisc mode.
>
> drivers/net/ethernet/mscc/ocelot.c | 26 ++++++++++++++++++++++++--
> include/linux/netdev_features.h | 3 +++
> net/bridge/br_if.c | 29 ++++++++++++++++++++++++++++-
> net/core/ethtool.c | 1 +
> 4 files changed, 56 insertions(+), 3 deletions(-)
>
IMHO there are already enough nerd knobs in bridge to support this.
If you hardware can't do real bridging, it is only doing MACVLAN so that
is what you should support instead.
^ permalink raw reply
* Re: [Bridge] [PATCH 1/3] net: Add HW_BRIDGE offload feature
From: Florian Fainelli @ 2019-08-23 23:30 UTC (permalink / raw)
To: Horatiu Vultur, Andrew Lunn
Cc: alexandre.belloni, nikolay, netdev, roopa, bridge, linux-kernel,
UNGLinuxDriver, allan.nielsen, davem
In-Reply-To: <20190823123929.ta4ikozz7jwkwbo2@soft-dev3.microsemi.net>
On 8/23/19 5:39 AM, Horatiu Vultur wrote:
> The 08/22/2019 22:08, Andrew Lunn wrote:
>> External E-Mail
>>
>>
>>> +/* Determin if the SW bridge can be offloaded to HW. Return true if all
>>> + * the interfaces of the bridge have the feature NETIF_F_HW_SWITCHDEV set
>>> + * and have the same netdev_ops.
>>> + */
>>
>> Hi Horatiu
>>
>> Why do you need these restrictions. The HW bridge should be able to
>> learn that a destination MAC address can be reached via the SW
>> bridge. The software bridge can then forward it out the correct
>> interface.
>>
>> Or are you saying your hardware cannot learn from frames which come
>> from the CPU?
>>
>> Andrew
>>
> Hi Andrew,
>
> I do not believe that our HW can learn from frames which comes from the
> CPU, at least not in the way they are injected today. But in case of Ocelot
> (and the next chip we are working on), we have other issues in mixing with
> foreign interfaces which is why we have the check in
> ocelot_netdevice_dev_check.
>
> More important, as we responded to Nikolay, we properly introduced this
> restriction for the wrong reasons.
>
> In SW bridge I will remove all these restrictions and only set ports in
> promisc mode only if NETIF_F_HW_BRIDGE is not set.
> Then in the network driver I can see if a foreign interface is added to
> the bridge, and when that happens I can set the port in promisc mode.
> Then the frames will be flooded to the SW bridge which eventually will
> send to the foreign interface.
Is that really necessary? Is not the skb->fwd_offload_mark as well as
the phys_switch_id supposed to tell that information to the bridge already?
--
Florian
^ permalink raw reply
* Re: [Bridge] [PATCH 0/3] Add NETIF_F_HW_BRIDGE feature
From: Florian Fainelli @ 2019-08-23 23:25 UTC (permalink / raw)
To: Horatiu Vultur, roopa, nikolay, davem, UNGLinuxDriver,
alexandre.belloni, allan.nielsen, netdev, linux-kernel, bridge
In-Reply-To: <1566500850-6247-1-git-send-email-horatiu.vultur@microchip.com>
On 8/22/19 12:07 PM, Horatiu Vultur wrote:
> Current implementation of the SW bridge is setting the interfaces in
> promisc mode when they are added to bridge if learning of the frames is
> enabled.
> In case of Ocelot which has HW capabilities to switch frames, it is not
> needed to set the ports in promisc mode because the HW already capable of
> doing that. Therefore add NETIF_F_HW_BRIDGE feature to indicate that the
> HW has bridge capabilities. Therefore the SW bridge doesn't need to set
> the ports in promisc mode to do the switching.
Then do not do anything when the ndo_set_rx_mode() for the ocelot
network device is called and indicates that IFF_PROMISC is set and that
your network port is a bridge port member. That is what mlxsw does AFAICT.
As other pointed out, the Linux bridge implements a software bridge by
default, and because it needs to operate on a wide variety of network
devices, all with different capabilities, the easiest way to make sure
that all management (IGMP, BPDU, etc. ) as well as non-management
traffic can make it to the bridge ports, is to put the network devices
in promiscuous mode. If this is suboptimal for you, you can take
shortcuts in your driver that do not hinder the overall functionality.
> This optimization takes places only if all the interfaces that are part
> of the bridge have this flag and have the same network driver.
>
> If the bridge interfaces is added in promisc mode then also the ports part
> of the bridge are set in promisc mode.
>
> Horatiu Vultur (3):
> net: Add HW_BRIDGE offload feature
> net: mscc: Use NETIF_F_HW_BRIDGE
> net: mscc: Implement promisc mode.
>
> drivers/net/ethernet/mscc/ocelot.c | 26 ++++++++++++++++++++++++--
> include/linux/netdev_features.h | 3 +++
> net/bridge/br_if.c | 29 ++++++++++++++++++++++++++++-
> net/core/ethtool.c | 1 +
> 4 files changed, 56 insertions(+), 3 deletions(-)
>
--
Florian
^ permalink raw reply
* Re: [Bridge] [PATCH 0/3] Add NETIF_F_HW_BRIDGE feature
From: Horatiu Vultur @ 2019-08-23 15:57 UTC (permalink / raw)
To: Andrew Lunn
Cc: alexandre.belloni, Nikolay Aleksandrov, netdev, roopa, bridge,
linux-kernel, UNGLinuxDriver, allan.nielsen, davem
In-Reply-To: <20190823132538.GO13020@lunn.ch>
Hi Andrew
The 08/23/2019 15:25, Andrew Lunn wrote:
> External E-Mail
>
>
> > > > Why do the devices have to be from the same driver ?
> > After seeing yours and Andrews comments I realize that I try to do two
> > things, but I have only explained one of them.
> >
> > Here is what I was trying to do:
> > A. Prevent ports from going into promisc mode, if it is not needed.
>
> The switch definition is promisc is a bit odd. You really need to
> split it into two use cases.
>
> The Linux interface is not a member of a bridge. In this case, promisc
> mode would mean all frames ingressing the port should be forwarded to
> the CPU. Without promisc, you can program the hardware to just accept
> frames with the interfaces MAC address. So this is just the usual
> behaviour of an interface.
Yes, this is well understood.
>
> When the interface is part of the bridge, then you can turn on all the
> learning and not forward frames to the CPU, unless the CPU asks for
> them. But you need to watch out for various flags. By default, you
> should flood to the CPU, unknown destinations to the CPU etc. But some
> of these can be turned off by flags.
>
> > B. Prevent adding the CPU to the flood-mask (in Ocelot we have a
> > flood-mask controlling who should be included when flooding due to
> > destination unknown).
>
> So destination unknown should be flooded to the CPU. The CPU might
> know where to send the frame.
Exactly the CPU should be in the flood mask by default. But if the
network driver knows that CPU will not forward it anywhere else, then it
is safe to remove the CPU from the flood mask. The network driver will
know this by monitoring which interfaces are added to the bridge.
>
> > To solve item "B", the network driver needs to detect if there is a
> > foreign interfaces added to the bridge. If that is the case then to add
> > the CPU port to the flooding mask otherwise no.
>
> It is not just a foreign interface. What about the MAC address on the
> bridge interface?
I think the network driver will get this event and it can install a
entry in the MAC table to copy the frames to the CPU.
>
> > > > This is too specific targeting some devices.
> > Maybe I was wrong to mention specific HW in the commit message. The
> > purpose of the patch was to add an optimization (not to copy all the
> > frames to the CPU) for HW that is capable of learning and flooding the
> > frames.
>
> To some extent, this is also tied to your hardware not learning MAC
> addresses from frames passed from the CPU. You should also consider
> fixing this. The SW bridge does send out notifications when it
> adds/removes MAC addresses to its tables. You probably want to receive
> this modifications, and use them to program your hardware to forward
> frames to the CPU when needed.
Yes we will fix this. We will listen to the notifications and then update
the HW so it would send those frames to CPU.
Maybe intially I should just resend this patch, with the changes that I
mention previously. And after that to send a new patch series with all
these remarks that you mention here Andrew.
>
> Andrew
>
--
/Horatiu
^ permalink raw reply
* Re: [Bridge] [PATCH 0/3] Add NETIF_F_HW_BRIDGE feature
From: Andrew Lunn @ 2019-08-23 13:25 UTC (permalink / raw)
To: Horatiu Vultur
Cc: alexandre.belloni, Nikolay Aleksandrov, netdev, roopa, bridge,
linux-kernel, UNGLinuxDriver, allan.nielsen, davem
In-Reply-To: <20190823122657.njk2tcgur2zu74i7@soft-dev3.microsemi.net>
> > > Why do the devices have to be from the same driver ?
> After seeing yours and Andrews comments I realize that I try to do two
> things, but I have only explained one of them.
>
> Here is what I was trying to do:
> A. Prevent ports from going into promisc mode, if it is not needed.
The switch definition is promisc is a bit odd. You really need to
split it into two use cases.
The Linux interface is not a member of a bridge. In this case, promisc
mode would mean all frames ingressing the port should be forwarded to
the CPU. Without promisc, you can program the hardware to just accept
frames with the interfaces MAC address. So this is just the usual
behaviour of an interface.
When the interface is part of the bridge, then you can turn on all the
learning and not forward frames to the CPU, unless the CPU asks for
them. But you need to watch out for various flags. By default, you
should flood to the CPU, unknown destinations to the CPU etc. But some
of these can be turned off by flags.
> B. Prevent adding the CPU to the flood-mask (in Ocelot we have a
> flood-mask controlling who should be included when flooding due to
> destination unknown).
So destination unknown should be flooded to the CPU. The CPU might
know where to send the frame.
> To solve item "B", the network driver needs to detect if there is a
> foreign interfaces added to the bridge. If that is the case then to add
> the CPU port to the flooding mask otherwise no.
It is not just a foreign interface. What about the MAC address on the
bridge interface?
> > > This is too specific targeting some devices.
> Maybe I was wrong to mention specific HW in the commit message. The
> purpose of the patch was to add an optimization (not to copy all the
> frames to the CPU) for HW that is capable of learning and flooding the
> frames.
To some extent, this is also tied to your hardware not learning MAC
addresses from frames passed from the CPU. You should also consider
fixing this. The SW bridge does send out notifications when it
adds/removes MAC addresses to its tables. You probably want to receive
this modifications, and use them to program your hardware to forward
frames to the CPU when needed.
Andrew
^ permalink raw reply
* Re: [Bridge] [PATCH 1/3] net: Add HW_BRIDGE offload feature
From: Horatiu Vultur @ 2019-08-23 12:39 UTC (permalink / raw)
To: Andrew Lunn
Cc: alexandre.belloni, nikolay, netdev, roopa, bridge, linux-kernel,
UNGLinuxDriver, allan.nielsen, davem
In-Reply-To: <20190822200817.GD21295@lunn.ch>
The 08/22/2019 22:08, Andrew Lunn wrote:
> External E-Mail
>
>
> > +/* Determin if the SW bridge can be offloaded to HW. Return true if all
> > + * the interfaces of the bridge have the feature NETIF_F_HW_SWITCHDEV set
> > + * and have the same netdev_ops.
> > + */
>
> Hi Horatiu
>
> Why do you need these restrictions. The HW bridge should be able to
> learn that a destination MAC address can be reached via the SW
> bridge. The software bridge can then forward it out the correct
> interface.
>
> Or are you saying your hardware cannot learn from frames which come
> from the CPU?
>
> Andrew
>
Hi Andrew,
I do not believe that our HW can learn from frames which comes from the
CPU, at least not in the way they are injected today. But in case of Ocelot
(and the next chip we are working on), we have other issues in mixing with
foreign interfaces which is why we have the check in
ocelot_netdevice_dev_check.
More important, as we responded to Nikolay, we properly introduced this
restriction for the wrong reasons.
In SW bridge I will remove all these restrictions and only set ports in
promisc mode only if NETIF_F_HW_BRIDGE is not set.
Then in the network driver I can see if a foreign interface is added to
the bridge, and when that happens I can set the port in promisc mode.
Then the frames will be flooded to the SW bridge which eventually will
send to the foreign interface.
--
/Horatiu
^ permalink raw reply
* Re: [Bridge] [PATCH 0/3] Add NETIF_F_HW_BRIDGE feature
From: Horatiu Vultur @ 2019-08-23 12:26 UTC (permalink / raw)
To: Nikolay Aleksandrov
Cc: alexandre.belloni, netdev, roopa, bridge, linux-kernel,
UNGLinuxDriver, allan.nielsen, davem
In-Reply-To: <b2c52206-82d1-ef28-aeec-a5dcdbe9df6c@cumulusnetworks.com>
The 08/23/2019 01:26, Nikolay Aleksandrov wrote:
> External E-Mail
>
>
> On 8/23/19 1:09 AM, Nikolay Aleksandrov wrote:
> > On 22/08/2019 22:07, Horatiu Vultur wrote:
> >> Current implementation of the SW bridge is setting the interfaces in
> >> promisc mode when they are added to bridge if learning of the frames is
> >> enabled.
> >> In case of Ocelot which has HW capabilities to switch frames, it is not
> >> needed to set the ports in promisc mode because the HW already capable of
> >> doing that. Therefore add NETIF_F_HW_BRIDGE feature to indicate that the
> >> HW has bridge capabilities. Therefore the SW bridge doesn't need to set
> >> the ports in promisc mode to do the switching.
> >> This optimization takes places only if all the interfaces that are part
> >> of the bridge have this flag and have the same network driver.
> >>
> >> If the bridge interfaces is added in promisc mode then also the ports part
> >> of the bridge are set in promisc mode.
> >>
> >> Horatiu Vultur (3):
> >> net: Add HW_BRIDGE offload feature
> >> net: mscc: Use NETIF_F_HW_BRIDGE
> >> net: mscc: Implement promisc mode.
> >>
> >> drivers/net/ethernet/mscc/ocelot.c | 26 ++++++++++++++++++++++++--
> >> include/linux/netdev_features.h | 3 +++
> >> net/bridge/br_if.c | 29 ++++++++++++++++++++++++++++-
> >> net/core/ethtool.c | 1 +
> >> 4 files changed, 56 insertions(+), 3 deletions(-)
> >>
> >
>
Hi Nikolay,
> Just to clarify:
> > IMO the name is misleading.
> - that's not mandatory or anything, just saying people might get confused when they see it
Naming is hard, I properly need to go back and find a better name once
the patch is more mature and the problem/purpose is better understood.
But you are right, this is a bad name, I will find a better one.
>
> > Why do the devices have to be from the same driver ?
After seeing yours and Andrews comments I realize that I try to do two
things, but I have only explained one of them.
Here is what I was trying to do:
A. Prevent ports from going into promisc mode, if it is not needed.
B. Prevent adding the CPU to the flood-mask (in Ocelot we have a
flood-mask controlling who should be included when flooding due to
destination unknown).
We have been thinking of these two as the same thing, but they are in
fact quite different. It is because of "B" we had to require all the
devices to be controlled by the same Switch.
For item "A" I do not think we need this restriction. In this patch we
will continue only focusing on item "A".
Sorry for the confusion. I will do a new patch that does not have this
restriction (which will also make it simpler).
It just needs to check for the flag NETIF_F_HW_BRIDGE and if it is not
set then set the port in promisc mode.
To solve item "B", the network driver needs to detect if there is a
foreign interfaces added to the bridge. If that is the case then to add
the CPU port to the flooding mask otherwise no. But this part will be in
a different patch series.
> > This is too specific targeting some devices.
Maybe I was wrong to mention specific HW in the commit message. The
purpose of the patch was to add an optimization (not to copy all the
frames to the CPU) for HW that is capable of learning and flooding the
frames.
I would expect that most switching HW would benefit from this.
> > The bridge should not care what's the port device, it should be the other way
Not sure I understand how that could be done. Are you suggesting that
the flag belongs to another structure? If you have something specific in
mind, then please let us know.
> That was mostly a rhetorical question, it's obvious why but please add an explanation
> at least in the commit message and please fix the typos in the comment. Also HW
> is capable of doing switching, this needs some clarification that the whole process
> stays in HW IIUC. More details here would be great.
Yes, I will add more details in the commit message and in code.
> > around, so adding device-specific code to the bridge is not ok. Isn't there a solution
> > where you can use NETDEV_JOIN and handle it all from your driver ?
> > Would all HW-learned entries be hidden from user-space in this case ?
Yes, they would. But if the network driver will call
'call_switchdev_notifiers' with SWITCHDEV_FDB_ADD_TO_BRIDGE and
SWITCHDEV_FDB_DEL_TO_BRIDGE then the SW will see these entries.
> >
> I.e. isn't there a way to do this without introducing a new feature flag ?
I do not have any better ideas.
>
>
--
/Horatiu
^ permalink raw reply
* Re: [Bridge] [PATCH 0/3] Add NETIF_F_HW_BRIDGE feature
From: David Miller @ 2019-08-22 22:32 UTC (permalink / raw)
To: horatiu.vultur
Cc: alexandre.belloni, nikolay, netdev, roopa, bridge, linux-kernel,
UNGLinuxDriver, allan.nielsen
In-Reply-To: <1566500850-6247-1-git-send-email-horatiu.vultur@microchip.com>
From: Horatiu Vultur <horatiu.vultur@microchip.com>
Date: Thu, 22 Aug 2019 21:07:27 +0200
> Current implementation of the SW bridge is setting the interfaces in
> promisc mode when they are added to bridge if learning of the frames is
> enabled.
> In case of Ocelot which has HW capabilities to switch frames, it is not
> needed to set the ports in promisc mode because the HW already capable of
> doing that. Therefore add NETIF_F_HW_BRIDGE feature to indicate that the
> HW has bridge capabilities. Therefore the SW bridge doesn't need to set
> the ports in promisc mode to do the switching.
> This optimization takes places only if all the interfaces that are part
> of the bridge have this flag and have the same network driver.
>
> If the bridge interfaces is added in promisc mode then also the ports part
> of the bridge are set in promisc mode.
This doesn't look right at all.
The Linux bridge provides a software bridge.
By default, all hardware must provide a hardware implementation of
that software bridge behavior.
Anything that deviates from that behavior has to be explicitly asked
for by the user by explicit config commands.
^ permalink raw reply
* Re: [Bridge] [PATCH 0/3] Add NETIF_F_HW_BRIDGE feature
From: Nikolay Aleksandrov @ 2019-08-22 22:26 UTC (permalink / raw)
To: Horatiu Vultur, roopa, davem, UNGLinuxDriver, alexandre.belloni,
allan.nielsen, netdev, linux-kernel, bridge
In-Reply-To: <1e16da88-08c5-abd5-0a3e-b8e6c3db134a@cumulusnetworks.com>
On 8/23/19 1:09 AM, Nikolay Aleksandrov wrote:
> On 22/08/2019 22:07, Horatiu Vultur wrote:
>> Current implementation of the SW bridge is setting the interfaces in
>> promisc mode when they are added to bridge if learning of the frames is
>> enabled.
>> In case of Ocelot which has HW capabilities to switch frames, it is not
>> needed to set the ports in promisc mode because the HW already capable of
>> doing that. Therefore add NETIF_F_HW_BRIDGE feature to indicate that the
>> HW has bridge capabilities. Therefore the SW bridge doesn't need to set
>> the ports in promisc mode to do the switching.
>> This optimization takes places only if all the interfaces that are part
>> of the bridge have this flag and have the same network driver.
>>
>> If the bridge interfaces is added in promisc mode then also the ports part
>> of the bridge are set in promisc mode.
>>
>> Horatiu Vultur (3):
>> net: Add HW_BRIDGE offload feature
>> net: mscc: Use NETIF_F_HW_BRIDGE
>> net: mscc: Implement promisc mode.
>>
>> drivers/net/ethernet/mscc/ocelot.c | 26 ++++++++++++++++++++++++--
>> include/linux/netdev_features.h | 3 +++
>> net/bridge/br_if.c | 29 ++++++++++++++++++++++++++++-
>> net/core/ethtool.c | 1 +
>> 4 files changed, 56 insertions(+), 3 deletions(-)
>>
>
Just to clarify:
> IMO the name is misleading.
- that's not mandatory or anything, just saying people might get confused when they see it
> Why do the devices have to be from the same driver ? This is too specific targeting some
> devices. The bridge should not care what's the port device, it should be the other way
That was mostly a rhetorical question, it's obvious why but please add an explanation
at least in the commit message and please fix the typos in the comment. Also HW
is capable of doing switching, this needs some clarification that the whole process
stays in HW IIUC. More details here would be great.
> around, so adding device-specific code to the bridge is not ok. Isn't there a solution
> where you can use NETDEV_JOIN and handle it all from your driver ?
> Would all HW-learned entries be hidden from user-space in this case ?
>
I.e. isn't there a way to do this without introducing a new feature flag ?
^ permalink raw reply
* Re: [Bridge] [PATCH 0/3] Add NETIF_F_HW_BRIDGE feature
From: Nikolay Aleksandrov @ 2019-08-22 22:09 UTC (permalink / raw)
To: Horatiu Vultur, roopa, davem, UNGLinuxDriver, alexandre.belloni,
allan.nielsen, netdev, linux-kernel, bridge
In-Reply-To: <1566500850-6247-1-git-send-email-horatiu.vultur@microchip.com>
On 22/08/2019 22:07, Horatiu Vultur wrote:
> Current implementation of the SW bridge is setting the interfaces in
> promisc mode when they are added to bridge if learning of the frames is
> enabled.
> In case of Ocelot which has HW capabilities to switch frames, it is not
> needed to set the ports in promisc mode because the HW already capable of
> doing that. Therefore add NETIF_F_HW_BRIDGE feature to indicate that the
> HW has bridge capabilities. Therefore the SW bridge doesn't need to set
> the ports in promisc mode to do the switching.
> This optimization takes places only if all the interfaces that are part
> of the bridge have this flag and have the same network driver.
>
> If the bridge interfaces is added in promisc mode then also the ports part
> of the bridge are set in promisc mode.
>
> Horatiu Vultur (3):
> net: Add HW_BRIDGE offload feature
> net: mscc: Use NETIF_F_HW_BRIDGE
> net: mscc: Implement promisc mode.
>
> drivers/net/ethernet/mscc/ocelot.c | 26 ++++++++++++++++++++++++--
> include/linux/netdev_features.h | 3 +++
> net/bridge/br_if.c | 29 ++++++++++++++++++++++++++++-
> net/core/ethtool.c | 1 +
> 4 files changed, 56 insertions(+), 3 deletions(-)
>
IMO the name is misleading.
Why do the devices have to be from the same driver ? This is too specific targeting some
devices. The bridge should not care what's the port device, it should be the other way
around, so adding device-specific code to the bridge is not ok. Isn't there a solution
where you can use NETDEV_JOIN and handle it all from your driver ?
Would all HW-learned entries be hidden from user-space in this case ?
^ permalink raw reply
* Re: [Bridge] [PATCH 1/3] net: Add HW_BRIDGE offload feature
From: Andrew Lunn @ 2019-08-22 20:08 UTC (permalink / raw)
To: Horatiu Vultur
Cc: alexandre.belloni, nikolay, netdev, roopa, bridge, linux-kernel,
UNGLinuxDriver, allan.nielsen, davem
In-Reply-To: <1566500850-6247-2-git-send-email-horatiu.vultur@microchip.com>
> +/* Determin if the SW bridge can be offloaded to HW. Return true if all
> + * the interfaces of the bridge have the feature NETIF_F_HW_SWITCHDEV set
> + * and have the same netdev_ops.
> + */
Hi Horatiu
Why do you need these restrictions. The HW bridge should be able to
learn that a destination MAC address can be reached via the SW
bridge. The software bridge can then forward it out the correct
interface.
Or are you saying your hardware cannot learn from frames which come
from the CPU?
Andrew
^ permalink raw reply
* [Bridge] [PATCH 3/3] net: mscc: Implement promisc mode.
From: Horatiu Vultur @ 2019-08-22 19:07 UTC (permalink / raw)
To: roopa, nikolay, davem, UNGLinuxDriver, alexandre.belloni,
allan.nielsen, netdev, linux-kernel, bridge
Cc: Horatiu Vultur
In-Reply-To: <1566500850-6247-1-git-send-email-horatiu.vultur@microchip.com>
Before when a port was added to a bridge then the port was added in
promisc mode. But because of the patches:
commit 6657c3d812dc5d ("net: Add HW_BRIDGE offload feature")
commit e2e3678c292f9c (net: mscc: Use NETIF_F_HW_BRIDGE")
the port is not needed to be in promisc mode to be part of the bridge.
So it is possible to togle the promisc mode of the port even if it is or
not part of the bridge.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
---
drivers/net/ethernet/mscc/ocelot.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
index c9cf2bee..9fa97fe 100644
--- a/drivers/net/ethernet/mscc/ocelot.c
+++ b/drivers/net/ethernet/mscc/ocelot.c
@@ -691,6 +691,25 @@ static void ocelot_set_rx_mode(struct net_device *dev)
__dev_mc_sync(dev, ocelot_mc_sync, ocelot_mc_unsync);
}
+static void ocelot_change_rx_flags(struct net_device *dev, int flags)
+{
+ struct ocelot_port *port = netdev_priv(dev);
+ struct ocelot *ocelot = port->ocelot;
+ u32 val;
+
+ if (!(flags & IFF_PROMISC))
+ return;
+
+ val = ocelot_read_gix(ocelot, ANA_PORT_CPU_FWD_CFG,
+ port->chip_port);
+ if (dev->flags & IFF_PROMISC)
+ val |= ANA_PORT_CPU_FWD_CFG_CPU_SRC_COPY_ENA;
+ else
+ val &= ~(ANA_PORT_CPU_FWD_CFG_CPU_SRC_COPY_ENA);
+
+ ocelot_write_gix(ocelot, val, ANA_PORT_CPU_FWD_CFG, port->chip_port);
+}
+
static int ocelot_port_get_phys_port_name(struct net_device *dev,
char *buf, size_t len)
{
@@ -1070,6 +1089,7 @@ static const struct net_device_ops ocelot_port_netdev_ops = {
.ndo_stop = ocelot_port_stop,
.ndo_start_xmit = ocelot_port_xmit,
.ndo_set_rx_mode = ocelot_set_rx_mode,
+ .ndo_change_rx_flags = ocelot_change_rx_flags,
.ndo_get_phys_port_name = ocelot_port_get_phys_port_name,
.ndo_set_mac_address = ocelot_port_set_mac_address,
.ndo_get_stats64 = ocelot_get_stats64,
--
2.7.4
^ permalink raw reply related
* [Bridge] [PATCH 2/3] net: mscc: Use NETIF_F_HW_BRIDGE
From: Horatiu Vultur @ 2019-08-22 19:07 UTC (permalink / raw)
To: roopa, nikolay, davem, UNGLinuxDriver, alexandre.belloni,
allan.nielsen, netdev, linux-kernel, bridge
Cc: Horatiu Vultur
In-Reply-To: <1566500850-6247-1-git-send-email-horatiu.vultur@microchip.com>
Enable HW_BRIDGE feature for ocelot. In this way the HW will do all the
switching of the frames so it is not needed for the ports to be in promisc
mode.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
---
drivers/net/ethernet/mscc/ocelot.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
index 4d1bce4..c9cf2bee 100644
--- a/drivers/net/ethernet/mscc/ocelot.c
+++ b/drivers/net/ethernet/mscc/ocelot.c
@@ -2017,8 +2017,10 @@ int ocelot_probe_port(struct ocelot *ocelot, u8 port,
dev->ethtool_ops = &ocelot_ethtool_ops;
dev->hw_features |= NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_RXFCS |
- NETIF_F_HW_TC;
- dev->features |= NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_HW_TC;
+ NETIF_F_HW_TC | NETIF_F_HW_BRIDGE;
+ dev->features |= NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_HW_TC |
+ NETIF_F_HW_BRIDGE;
+ dev->priv_flags |= IFF_UNICAST_FLT;
memcpy(dev->dev_addr, ocelot->base_mac, ETH_ALEN);
dev->dev_addr[ETH_ALEN - 1] += port;
--
2.7.4
^ permalink raw reply related
* [Bridge] [PATCH 1/3] net: Add HW_BRIDGE offload feature
From: Horatiu Vultur @ 2019-08-22 19:07 UTC (permalink / raw)
To: roopa, nikolay, davem, UNGLinuxDriver, alexandre.belloni,
allan.nielsen, netdev, linux-kernel, bridge
Cc: Horatiu Vultur
In-Reply-To: <1566500850-6247-1-git-send-email-horatiu.vultur@microchip.com>
This patch adds a netdev feature to configure the HW as a switch.
The purpose of this flag is to show that the hardware can do switching
of the frames.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
---
include/linux/netdev_features.h | 3 +++
net/bridge/br_if.c | 29 ++++++++++++++++++++++++++++-
net/core/ethtool.c | 1 +
3 files changed, 32 insertions(+), 1 deletion(-)
diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
index 4b19c54..34274de 100644
--- a/include/linux/netdev_features.h
+++ b/include/linux/netdev_features.h
@@ -78,6 +78,8 @@ enum {
NETIF_F_HW_TLS_TX_BIT, /* Hardware TLS TX offload */
NETIF_F_HW_TLS_RX_BIT, /* Hardware TLS RX offload */
+ NETIF_F_HW_BRIDGE_BIT, /* Bridge offload support */
+
NETIF_F_GRO_HW_BIT, /* Hardware Generic receive offload */
NETIF_F_HW_TLS_RECORD_BIT, /* Offload TLS record */
@@ -150,6 +152,7 @@ enum {
#define NETIF_F_GSO_UDP_L4 __NETIF_F(GSO_UDP_L4)
#define NETIF_F_HW_TLS_TX __NETIF_F(HW_TLS_TX)
#define NETIF_F_HW_TLS_RX __NETIF_F(HW_TLS_RX)
+#define NETIF_F_HW_BRIDGE __NETIF_F(HW_BRIDGE)
/* Finds the next feature with the highest number of the range of start till 0.
*/
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 4fe30b1..975a34c 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -127,6 +127,31 @@ static void br_port_clear_promisc(struct net_bridge_port *p)
p->flags &= ~BR_PROMISC;
}
+/* Determin if the SW bridge can be offloaded to HW. Return true if all
+ * the interfaces of the bridge have the feature NETIF_F_HW_SWITCHDEV set
+ * and have the same netdev_ops.
+ */
+static int br_hw_offload(struct net_bridge *br)
+{
+ const struct net_device_ops *ops = NULL;
+ struct net_bridge_port *p;
+
+ list_for_each_entry(p, &br->port_list, list) {
+ if (!(p->dev->hw_features & NETIF_F_HW_BRIDGE))
+ return 0;
+
+ if (!ops) {
+ ops = p->dev->netdev_ops;
+ continue;
+ }
+
+ if (ops != p->dev->netdev_ops)
+ return 0;
+ }
+
+ return 1;
+}
+
/* When a port is added or removed or when certain port flags
* change, this function is called to automatically manage
* promiscuity setting of all the bridge ports. We are always called
@@ -134,6 +159,7 @@ static void br_port_clear_promisc(struct net_bridge_port *p)
*/
void br_manage_promisc(struct net_bridge *br)
{
+ bool hw_offload = br_hw_offload(br);
struct net_bridge_port *p;
bool set_all = false;
@@ -161,7 +187,8 @@ void br_manage_promisc(struct net_bridge *br)
(br->auto_cnt == 1 && br_auto_port(p)))
br_port_clear_promisc(p);
else
- br_port_set_promisc(p);
+ if (!hw_offload)
+ br_port_set_promisc(p);
}
}
}
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 6288e69..4e224fe 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -111,6 +111,7 @@ static const char netdev_features_strings[NETDEV_FEATURE_COUNT][ETH_GSTRING_LEN]
[NETIF_F_HW_TLS_RECORD_BIT] = "tls-hw-record",
[NETIF_F_HW_TLS_TX_BIT] = "tls-hw-tx-offload",
[NETIF_F_HW_TLS_RX_BIT] = "tls-hw-rx-offload",
+ [NETIF_F_HW_BRIDGE_BIT] = "hw-bridge-offload",
};
static const char
--
2.7.4
^ permalink raw reply related
* [Bridge] [PATCH 0/3] Add NETIF_F_HW_BRIDGE feature
From: Horatiu Vultur @ 2019-08-22 19:07 UTC (permalink / raw)
To: roopa, nikolay, davem, UNGLinuxDriver, alexandre.belloni,
allan.nielsen, netdev, linux-kernel, bridge
Cc: Horatiu Vultur
Current implementation of the SW bridge is setting the interfaces in
promisc mode when they are added to bridge if learning of the frames is
enabled.
In case of Ocelot which has HW capabilities to switch frames, it is not
needed to set the ports in promisc mode because the HW already capable of
doing that. Therefore add NETIF_F_HW_BRIDGE feature to indicate that the
HW has bridge capabilities. Therefore the SW bridge doesn't need to set
the ports in promisc mode to do the switching.
This optimization takes places only if all the interfaces that are part
of the bridge have this flag and have the same network driver.
If the bridge interfaces is added in promisc mode then also the ports part
of the bridge are set in promisc mode.
Horatiu Vultur (3):
net: Add HW_BRIDGE offload feature
net: mscc: Use NETIF_F_HW_BRIDGE
net: mscc: Implement promisc mode.
drivers/net/ethernet/mscc/ocelot.c | 26 ++++++++++++++++++++++++--
include/linux/netdev_features.h | 3 +++
net/bridge/br_if.c | 29 ++++++++++++++++++++++++++++-
net/core/ethtool.c | 1 +
4 files changed, 56 insertions(+), 3 deletions(-)
--
2.7.4
^ permalink raw reply
* Re: [Bridge] [PATCH net-next v3 0/4] net: bridge: mdb: allow dump/add/del of host-joined entries
From: David Miller @ 2019-08-17 19:37 UTC (permalink / raw)
To: nikolay; +Cc: netdev, roopa, bridge
In-Reply-To: <20190817112213.27097-1-nikolay@cumulusnetworks.com>
From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Date: Sat, 17 Aug 2019 14:22:09 +0300
> This set makes the bridge dump host-joined mdb entries, they should be
> treated as normal entries since they take a slot and are aging out.
> We already have notifications for them but we couldn't dump them until
> now so they remained hidden. We dump them similar to how they're
> notified, in order to keep user-space compatibility with the dumped
> objects (e.g. iproute2 dumps mdbs in a format which can be fed into
> add/del commands) we allow host-joined groups also to be added/deleted via
> mdb commands. That can later be used for L2 mcast MAC manipulation as
> was recently discussed. Note that iproute2 changes are not necessary,
> this set will work with the current user-space mdb code.
>
> Patch 01 - a trivial comment move
> Patch 02 - factors out the mdb filling code so it can be
> re-used for the host-joined entries
> Patch 03 - dumps host-joined entries
> Patch 04 - allows manipulation of host-joined entries via standard mdb
> calls
>
> v3: fix compiler warning in patch 04 (DaveM)
> v2: change patch 04 to avoid double notification and improve host group
> manual removal if no ports are present in the group
Series applied.
^ permalink raw reply
* [Bridge] [PATCH net-next v3 4/4] net: bridge: mdb: allow add/delete for host-joined groups
From: Nikolay Aleksandrov @ 2019-08-17 11:22 UTC (permalink / raw)
To: netdev; +Cc: Nikolay Aleksandrov, roopa, bridge, davem
In-Reply-To: <20190817112213.27097-1-nikolay@cumulusnetworks.com>
Currently this is needed only for user-space compatibility, so similar
object adds/deletes as the dumped ones would succeed. Later it can be
used for L2 mcast MAC add/delete.
v3: fix compiler warning (DaveM)
v2: don't send a notification when used from user-space, arm the group
timer if no ports are left after host entry del
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
net/bridge/br_mdb.c | 78 +++++++++++++++++++++++++++------------
net/bridge/br_multicast.c | 30 +++++++++++----
net/bridge/br_private.h | 2 +
3 files changed, 80 insertions(+), 30 deletions(-)
diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index 985273425117..44594635a972 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -616,6 +616,19 @@ static int br_mdb_add_group(struct net_bridge *br, struct net_bridge_port *port,
return err;
}
+ /* host join */
+ if (!port) {
+ /* don't allow any flags for host-joined groups */
+ if (state)
+ return -EINVAL;
+ if (mp->host_joined)
+ return -EEXIST;
+
+ br_multicast_host_join(mp, false);
+
+ return 0;
+ }
+
for (pp = &mp->ports;
(p = mlock_dereference(*pp, br)) != NULL;
pp = &p->next) {
@@ -640,19 +653,21 @@ static int __br_mdb_add(struct net *net, struct net_bridge *br,
{
struct br_ip ip;
struct net_device *dev;
- struct net_bridge_port *p;
+ struct net_bridge_port *p = NULL;
int ret;
if (!netif_running(br->dev) || !br_opt_get(br, BROPT_MULTICAST_ENABLED))
return -EINVAL;
- dev = __dev_get_by_index(net, entry->ifindex);
- if (!dev)
- return -ENODEV;
+ if (entry->ifindex != br->dev->ifindex) {
+ dev = __dev_get_by_index(net, entry->ifindex);
+ if (!dev)
+ return -ENODEV;
- p = br_port_get_rtnl(dev);
- if (!p || p->br != br || p->state == BR_STATE_DISABLED)
- return -EINVAL;
+ p = br_port_get_rtnl(dev);
+ if (!p || p->br != br || p->state == BR_STATE_DISABLED)
+ return -EINVAL;
+ }
__mdb_entry_to_br_ip(entry, &ip);
@@ -667,9 +682,9 @@ static int br_mdb_add(struct sk_buff *skb, struct nlmsghdr *nlh,
{
struct net *net = sock_net(skb->sk);
struct net_bridge_vlan_group *vg;
+ struct net_bridge_port *p = NULL;
struct net_device *dev, *pdev;
struct br_mdb_entry *entry;
- struct net_bridge_port *p;
struct net_bridge_vlan *v;
struct net_bridge *br;
int err;
@@ -680,15 +695,19 @@ static int br_mdb_add(struct sk_buff *skb, struct nlmsghdr *nlh,
br = netdev_priv(dev);
- pdev = __dev_get_by_index(net, entry->ifindex);
- if (!pdev)
- return -ENODEV;
+ if (entry->ifindex != br->dev->ifindex) {
+ pdev = __dev_get_by_index(net, entry->ifindex);
+ if (!pdev)
+ return -ENODEV;
- p = br_port_get_rtnl(pdev);
- if (!p || p->br != br || p->state == BR_STATE_DISABLED)
- return -EINVAL;
+ p = br_port_get_rtnl(pdev);
+ if (!p || p->br != br || p->state == BR_STATE_DISABLED)
+ return -EINVAL;
+ vg = nbp_vlan_group(p);
+ } else {
+ vg = br_vlan_group(br);
+ }
- vg = nbp_vlan_group(p);
/* If vlan filtering is enabled and VLAN is not specified
* install mdb entry on all vlans configured on the port.
*/
@@ -727,6 +746,15 @@ static int __br_mdb_del(struct net_bridge *br, struct br_mdb_entry *entry)
if (!mp)
goto unlock;
+ /* host leave */
+ if (entry->ifindex == mp->br->dev->ifindex && mp->host_joined) {
+ br_multicast_host_leave(mp, false);
+ err = 0;
+ if (!mp->ports && netif_running(br->dev))
+ mod_timer(&mp->timer, jiffies);
+ goto unlock;
+ }
+
for (pp = &mp->ports;
(p = mlock_dereference(*pp, br)) != NULL;
pp = &p->next) {
@@ -759,9 +787,9 @@ static int br_mdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
{
struct net *net = sock_net(skb->sk);
struct net_bridge_vlan_group *vg;
+ struct net_bridge_port *p = NULL;
struct net_device *dev, *pdev;
struct br_mdb_entry *entry;
- struct net_bridge_port *p;
struct net_bridge_vlan *v;
struct net_bridge *br;
int err;
@@ -772,15 +800,19 @@ static int br_mdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
br = netdev_priv(dev);
- pdev = __dev_get_by_index(net, entry->ifindex);
- if (!pdev)
- return -ENODEV;
+ if (entry->ifindex != br->dev->ifindex) {
+ pdev = __dev_get_by_index(net, entry->ifindex);
+ if (!pdev)
+ return -ENODEV;
- p = br_port_get_rtnl(pdev);
- if (!p || p->br != br || p->state == BR_STATE_DISABLED)
- return -EINVAL;
+ p = br_port_get_rtnl(pdev);
+ if (!p || p->br != br || p->state == BR_STATE_DISABLED)
+ return -EINVAL;
+ vg = nbp_vlan_group(p);
+ } else {
+ vg = br_vlan_group(br);
+ }
- vg = nbp_vlan_group(p);
/* If vlan filtering is enabled and VLAN is not specified
* delete mdb entry on all vlans configured on the port.
*/
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 9b379e110129..ad12fe3fca8c 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -148,8 +148,7 @@ static void br_multicast_group_expired(struct timer_list *t)
if (!netif_running(br->dev) || timer_pending(&mp->timer))
goto out;
- mp->host_joined = false;
- br_mdb_notify(br->dev, NULL, &mp->addr, RTM_DELMDB, 0);
+ br_multicast_host_leave(mp, true);
if (mp->ports)
goto out;
@@ -512,6 +511,27 @@ static bool br_port_group_equal(struct net_bridge_port_group *p,
return ether_addr_equal(src, p->eth_addr);
}
+void br_multicast_host_join(struct net_bridge_mdb_entry *mp, bool notify)
+{
+ if (!mp->host_joined) {
+ mp->host_joined = true;
+ if (notify)
+ br_mdb_notify(mp->br->dev, NULL, &mp->addr,
+ RTM_NEWMDB, 0);
+ }
+ mod_timer(&mp->timer, jiffies + mp->br->multicast_membership_interval);
+}
+
+void br_multicast_host_leave(struct net_bridge_mdb_entry *mp, bool notify)
+{
+ if (!mp->host_joined)
+ return;
+
+ mp->host_joined = false;
+ if (notify)
+ br_mdb_notify(mp->br->dev, NULL, &mp->addr, RTM_DELMDB, 0);
+}
+
static int br_multicast_add_group(struct net_bridge *br,
struct net_bridge_port *port,
struct br_ip *group,
@@ -534,11 +554,7 @@ static int br_multicast_add_group(struct net_bridge *br,
goto err;
if (!port) {
- if (!mp->host_joined) {
- mp->host_joined = true;
- br_mdb_notify(br->dev, NULL, &mp->addr, RTM_NEWMDB, 0);
- }
- mod_timer(&mp->timer, now + br->multicast_membership_interval);
+ br_multicast_host_join(mp, true);
goto out;
}
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index b7a4942ff1b3..ce2ab14ee605 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -702,6 +702,8 @@ void br_multicast_get_stats(const struct net_bridge *br,
struct br_mcast_stats *dest);
void br_mdb_init(void);
void br_mdb_uninit(void);
+void br_multicast_host_join(struct net_bridge_mdb_entry *mp, bool notify);
+void br_multicast_host_leave(struct net_bridge_mdb_entry *mp, bool notify);
#define mlock_dereference(X, br) \
rcu_dereference_protected(X, lockdep_is_held(&br->multicast_lock))
--
2.21.0
^ permalink raw reply related
* [Bridge] [PATCH net-next v3 3/4] net: bridge: mdb: dump host-joined entries as well
From: Nikolay Aleksandrov @ 2019-08-17 11:22 UTC (permalink / raw)
To: netdev; +Cc: Nikolay Aleksandrov, roopa, bridge, davem
In-Reply-To: <20190817112213.27097-1-nikolay@cumulusnetworks.com>
Currently we dump only the port mdb entries but we can have host-joined
entries on the bridge itself and they should be treated as normal temp
mdbs, they're already notified:
$ bridge monitor all
[MDB]dev br0 port br0 grp ff02::8 temp
The group will not be shown in the bridge mdb output, but it takes 1 slot
and it's timing out. If it's only host-joined then the mdb show output
can even be empty.
After this patch we show the host-joined groups:
$ bridge mdb show
dev br0 port br0 grp ff02::8 temp
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
net/bridge/br_mdb.c | 41 +++++++++++++++++++++++++++++++----------
1 file changed, 31 insertions(+), 10 deletions(-)
diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index 77730983097e..985273425117 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -78,22 +78,35 @@ static void __mdb_entry_to_br_ip(struct br_mdb_entry *entry, struct br_ip *ip)
}
static int __mdb_fill_info(struct sk_buff *skb,
+ struct net_bridge_mdb_entry *mp,
struct net_bridge_port_group *p)
{
+ struct timer_list *mtimer;
struct nlattr *nest_ent;
struct br_mdb_entry e;
+ u8 flags = 0;
+ int ifindex;
memset(&e, 0, sizeof(e));
- __mdb_entry_fill_flags(&e, p->flags);
- e.ifindex = p->port->dev->ifindex;
- e.vid = p->addr.vid;
- if (p->addr.proto == htons(ETH_P_IP))
- e.addr.u.ip4 = p->addr.u.ip4;
+ if (p) {
+ ifindex = p->port->dev->ifindex;
+ mtimer = &p->timer;
+ flags = p->flags;
+ } else {
+ ifindex = mp->br->dev->ifindex;
+ mtimer = &mp->timer;
+ }
+
+ __mdb_entry_fill_flags(&e, flags);
+ e.ifindex = ifindex;
+ e.vid = mp->addr.vid;
+ if (mp->addr.proto == htons(ETH_P_IP))
+ e.addr.u.ip4 = mp->addr.u.ip4;
#if IS_ENABLED(CONFIG_IPV6)
- if (p->addr.proto == htons(ETH_P_IPV6))
- e.addr.u.ip6 = p->addr.u.ip6;
+ if (mp->addr.proto == htons(ETH_P_IPV6))
+ e.addr.u.ip6 = mp->addr.u.ip6;
#endif
- e.addr.proto = p->addr.proto;
+ e.addr.proto = mp->addr.proto;
nest_ent = nla_nest_start_noflag(skb,
MDBA_MDB_ENTRY_INFO);
if (!nest_ent)
@@ -102,7 +115,7 @@ static int __mdb_fill_info(struct sk_buff *skb,
if (nla_put_nohdr(skb, sizeof(e), &e) ||
nla_put_u32(skb,
MDBA_MDB_EATTR_TIMER,
- br_timer_value(&p->timer))) {
+ br_timer_value(mtimer))) {
nla_nest_cancel(skb, nest_ent);
return -EMSGSIZE;
}
@@ -139,12 +152,20 @@ static int br_mdb_fill_info(struct sk_buff *skb, struct netlink_callback *cb,
break;
}
+ if (mp->host_joined) {
+ err = __mdb_fill_info(skb, mp, NULL);
+ if (err) {
+ nla_nest_cancel(skb, nest2);
+ break;
+ }
+ }
+
for (pp = &mp->ports; (p = rcu_dereference(*pp)) != NULL;
pp = &p->next) {
if (!p->port)
continue;
- err = __mdb_fill_info(skb, p);
+ err = __mdb_fill_info(skb, mp, p);
if (err) {
nla_nest_cancel(skb, nest2);
goto out;
--
2.21.0
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox