From: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
To: Florian Fainelli <f.fainelli@gmail.com>
Cc: "Ido Schimmel" <idosch@mellanox.com>,
"Scott Feldman" <sfeldma@gmail.com>,
"Nikolay Aleksandrov" <razor@blackwall.org>,
Netdev <netdev@vger.kernel.org>, "Jiří Pírko" <jiri@resnulli.us>,
"David S. Miller" <davem@davemloft.net>,
"Nikolay Aleksandrov" <nikolay@cumulusnetworks.com>,
"Elad Raz" <eladr@mellanox.com>
Subject: Re: [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges
Date: Wed, 14 Oct 2015 20:07:41 -0400 [thread overview]
Message-ID: <20151015000741.GA1297@ketchup.lan> (raw)
In-Reply-To: <561ED270.60003@gmail.com>
On Oct. Wednesday 14 (42) 03:08 PM, Florian Fainelli wrote:
> On 14/10/15 11:51, Vivien Didelot wrote:
> > On Oct. Wednesday 14 (42) 08:42 PM, Ido Schimmel wrote:
> >> Wed, Oct 14, 2015 at 08:14:24PM IDT, sfeldma@gmail.com wrote:
> >>> On Wed, Oct 14, 2015 at 8:25 AM, Vivien Didelot
> >>> <vivien.didelot@savoirfairelinux.com> wrote:
> >>>> On Oct. Wednesday 14 (42) 09:14 AM, Ido Schimmel wrote:
> >>>>> Tue, Oct 13, 2015 at 05:32:26PM IDT, vivien.didelot@savoirfairelinux.com wrote:
> >>>>>> On Oct. Tuesday 13 (42) 11:31 AM, Ido Schimmel wrote:
> >>>>>>> Mon, Oct 12, 2015 at 08:36:25PM IDT, vivien.didelot@savoirfairelinux.com wrote:
> >>>>>>>> Hi guys,
> >>>>>>>>
> >>>>>>>> On Oct. Monday 12 (42) 02:01 PM, Nikolay Aleksandrov wrote:
> >>>>>>>>> From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> >>>>>>>>>
> >>>>>>>>> We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges.
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> >>>>>>>>> ---
> >>>>>>>>> net/switchdev/switchdev.c | 3 +++
> >>>>>>>>> 1 file changed, 3 insertions(+)
> >>>>>>>>>
> >>>>>>>>> diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
> >>>>>>>>> index 6e4a4f9ad927..256c596de896 100644
> >>>>>>>>> --- a/net/switchdev/switchdev.c
> >>>>>>>>> +++ b/net/switchdev/switchdev.c
> >>>>>>>>> @@ -720,6 +720,9 @@ static int switchdev_port_br_afspec(struct net_device *dev,
> >>>>>>>>> if (vlan.vid_begin)
> >>>>>>>>> return -EINVAL;
> >>>>>>>>> vlan.vid_begin = vinfo->vid;
> >>>>>>>>> + /* don't allow range of pvids */
> >>>>>>>>> + if (vlan.flags & BRIDGE_VLAN_INFO_PVID)
> >>>>>>>>> + return -EINVAL;
> >>>>>>>>> } else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) {
> >>>>>>>>> if (!vlan.vid_begin)
> >>>>>>>>> return -EINVAL;
> >>>>>>>>> --
> >>>>>>>>> 2.4.3
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> Yes the patch looks good, but it is a minor check though. I hope the
> >>>>>>>> subject of this thread is making sense.
> >>>>>>>>
> >>>>>>>> VLAN ranges seem to have been included for an UX purpose (so commands
> >>>>>>>> look like Cisco IOS). We don't want to change any existing interface, so
> >>>>>>>> we pushed that down to drivers, with the only valid reason that, maybe
> >>>>>>>> one day, an hardware can be capable of programming a range on a per-port
> >>>>>>>> basis.
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> That's actually what we are doing in mlxsw. We can do up to 256 entries in
> >>>>>>> one go. We've yet to submit this part.
> >>>>>>
> >>>>>> Perfect Ido, thanks for pointing this out! I'm OK with the range then.
> >>>>>>
> >>>>>> So there is now a very last question in my head for this, which is more
> >>>>>> a matter of kernel design. Should the user be aware of such underlying
> >>>>>> support? In other words, would it make sense to do this in a driver:
> >>>>>>
> >>>>>> foo_port_vlan_add(struct net_device *dev,
> >>>>>> struct switchdev_obj_port_vlan *vlan)
> >>>>>> {
> >>>>>> if (vlan->vid_begin != vlan->vid_end)
> >>>>>> return -ENOTSUPP; /* or something more relevant for user */
> >>>>>>
> >>>>>> return foo_port_single_vlan_add(dev, vlan->vid_begin);
> >>>>>> }
> >>>>>>
> >>>>>> So drivers keep being simple, and we can easily propagate the fact that
> >>>>>> one-or-all VLAN is not supportable, vs. the VLAN feature itself is not
> >>>>>> implemented and must be done in software.
> >>>>> I think that if you want to keep it simple, then Scott's advice from the
> >>>>> previous thread is the most appropriate one. I believe the hardware you
> >>>>> are using is simply not meant to support multiple 802.1Q bridges.
> >>>>
> >>>> You mean allowing only one Linux bridge over an hardware switch?
> >>>>
> >>>> It would for sure simplify how, as developers and users, we represent a
> >>>> physical switch. But I am not sure how to achieve that and I don't have
> >>>> strong opinions on this TBH.
> >>>
> >>> Hi Vivien, I think it's possible to keep switch ports on just one
> >>> bridge if we do a little bit of work on the NETDEV_CHANGEUPPER
> >>> notifier. This will give you the driver-level control you want. Do
> >>> you have time to investigate? The idea is:
> >>>
> >>> 1) In your driver's handler for NETDEV_CHANGEUPPER, if switch port is
> >>> being added to a second bridge,then return NOTIFY_BAD. Your driver
> >>> needs to track the bridge count.
> >>>
> >>> 2) In __netdev_upper_dev_link(), check the return code from the
> >>> call_netdevice_notifiers_info(NETDEV_CHANGEUPPER, ...) call, and if
> >>> NOTIFY_BAD, abort the linking operation (goto rollback_xxx).
> >>>
> >> Hi,
> >>
> >> We are doing something similar in mlxsw (not upstream yet). Jiri
> >> introduced PRE_CHANGEUPPER, which is called from the function you
> >> mentioned, but before the linking operation (so that you don't need to
> >> rollback).
> >>
> >> If the notification is about a linking operation and the master is a
> >> bridge different than the current one, then NOTIFY_BAD is returned.
> >
> > Great, I'll wait for this then.
> >
> > Scott, this is another good reason why we definitely need a simple
> > struct device per switch chip. In addition to the port net_device
> > registration, the netdev notifier is another exact same piece of code
> > that both Rocker and DSA implement.
> >
> >> Vivien, regarding your WAN interface question, this is something we
> >> currently don't do. We don't even flood traffic from bridged ports
> >> to CPU (although we can), as it can saturate the bus. Only control
> >> traffic is supposed to go there.
> >
> > I kinda answered it myself: a Linux bridge needs to remain a user
> > abstraction of a logical group of net_device. In other words, we must
> > allow physical distinct ports under the same bridge.
> >
> > Below is an example of a custom router with 2 chained switch chips sw0
> > and sw1, and what usage I believe we expect:
> >
> > [ Linux soft bridge "br0" which can accelerate VLAN, STP, etc. ]
> > (CPU) (WAN)
> > [ sw0p0 sw0p1 sw0p2 ] [ sw1p0 sw1p1 sw1p2 sw1p3 ] [ eth0 ] [ eth1 ]
> > `--DSA--' `-------'
>
> Your use case is something that is fairly common with PON/GPON devices,
> but AFAIR they typically implement this by having two sets of bridges,
> one which spans the WAN interface and a bunch of other ports, and
> another bridge which is LAN only (few ports + Wi-Fi typically). Usually
> this is under the same physical switch though, so this is all about
> partitioning physical ports and physical interfaces under logical groups.
>
> By definition the WAN and LAN domains are separate logical and broadcast
> domains, with separate admission control rules, STP etc. I do not think
> your "br0" example should span the WAN interface in this case. Also,
> with eth0 being the conduit interface, it cannot be allowed to be a
> bridge member, that's something I ended up fixing in the bridge layer,
> otherwise untagging does not work, but that is nitpicking.
>
> It still makes sense though allow the creation of a "br0" device which
> spans the entire set of ports and interfaces (except eth0), but not name
> eth1 "WAN", just treat it as an extension in that case.
>
> Sorry if that seems like f'ing flies, but having concise example should
> help make progress on these design issues ;)
You are right, my drawing wasn't that correct. What you just described
is more accurate.
Thanks,
-v
next prev parent reply other threads:[~2015-10-15 0:07 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-09 23:30 switchdev and VLAN ranges Vivien Didelot
2015-10-10 4:22 ` Scott Feldman
2015-10-10 16:33 ` Vivien Didelot
2015-10-10 18:10 ` Florian Fainelli
2015-10-10 19:47 ` Vivien Didelot
2015-10-10 7:49 ` Elad Raz
2015-10-10 10:36 ` Nikolay Aleksandrov
2015-10-11 7:12 ` Jiri Pirko
2015-10-11 10:49 ` [PATCH net-next] bridge: vlan: enforce no pvid flag in vlan ranges Nikolay Aleksandrov
2015-10-11 14:13 ` Jiri Pirko
2015-10-13 2:59 ` David Miller
2015-10-11 22:41 ` switchdev and VLAN ranges Vivien Didelot
2015-10-12 0:13 ` Nikolay Aleksandrov
2015-10-12 5:14 ` Scott Feldman
2015-10-12 10:15 ` Nikolay Aleksandrov
2015-10-12 12:01 ` [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges Nikolay Aleksandrov
2015-10-12 12:11 ` Elad Raz
2015-10-12 12:17 ` Jiri Pirko
2015-10-12 17:36 ` Vivien Didelot
2015-10-13 6:13 ` Scott Feldman
2015-10-13 8:31 ` Ido Schimmel
2015-10-13 14:32 ` Vivien Didelot
2015-10-14 6:14 ` Ido Schimmel
2015-10-14 15:25 ` Vivien Didelot
2015-10-14 17:14 ` Scott Feldman
2015-10-14 17:42 ` Ido Schimmel
2015-10-14 18:51 ` Vivien Didelot
2015-10-14 22:08 ` Florian Fainelli
2015-10-15 0:07 ` Vivien Didelot [this message]
2015-10-15 2:58 ` Scott Feldman
2015-10-15 7:28 ` Ido Schimmel
2015-10-13 11:42 ` David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151015000741.GA1297@ketchup.lan \
--to=vivien.didelot@savoirfairelinux.com \
--cc=davem@davemloft.net \
--cc=eladr@mellanox.com \
--cc=f.fainelli@gmail.com \
--cc=idosch@mellanox.com \
--cc=jiri@resnulli.us \
--cc=netdev@vger.kernel.org \
--cc=nikolay@cumulusnetworks.com \
--cc=razor@blackwall.org \
--cc=sfeldma@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).