Ethernet Bridge development
 help / color / mirror / Atom feed
From: Joseph Huang <joseph.huang.2024@gmail.com>
To: Nikolay Aleksandrov <razor@blackwall.org>,
	Vladimir Oltean <olteanv@gmail.com>
Cc: "Joseph Huang" <Joseph.Huang@garmin.com>,
	netdev@vger.kernel.org, "Andrew Lunn" <andrew@lunn.ch>,
	"Florian Fainelli" <f.fainelli@gmail.com>,
	"David S. Miller" <davem@davemloft.net>,
	"Eric Dumazet" <edumazet@google.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Roopa Prabhu" <roopa@nvidia.com>,
	"Linus Lüssing" <linus.luessing@c0d3.blue>,
	linux-kernel@vger.kernel.org, bridge@lists.linux.dev
Subject: Re: [PATCH RFC net-next 00/10] MC Flood disable and snooping
Date: Thu, 4 Apr 2024 18:16:12 -0400	[thread overview]
Message-ID: <804b7bf3-1b29-42c4-be42-4c23f1355aaf@gmail.com> (raw)
In-Reply-To: <065b803f-14a9-4013-8f11-712bb8d54848@blackwall.org>

On 4/2/2024 5:59 PM, Nikolay Aleksandrov wrote:
> On 4/2/24 23:46, Vladimir Oltean wrote:
>> On Tue, Apr 02, 2024 at 09:50:51PM +0300, Nikolay Aleksandrov wrote:
>>> On 4/2/24 20:43, Vladimir Oltean wrote:
>>>> Hi Nikolai,
>>>>
>>>> On Tue, Apr 02, 2024 at 12:28:38PM +0300, Nikolay Aleksandrov wrote:
>>>>> For the bridge patches:
>>>>> Nacked-by: Nikolay Aleksandrov <razor@blackwall.org>
>>>>>
>>>>> You cannot break the multicast flood flag to add support for a custom
>>>>> use-case. This is unacceptable. The current bridge behaviour is correct
>>>>> your patch 02 doesn't fix anything, you should configure the bridge
>>>>> properly to avoid all those problems, not break protocols.
>>>>>
>>>>> Your special use case can easily be solved by a user-space helper or
>>>>> eBPF and nftables. You can set the mcast flood flag and bypass the
>>>>> bridge for these packets. I basically said the same in 2021, if this is
>>>>> going to be in the bridge it should be hidden behind an option that is
>>>>> default off. But in my opinion adding an option to solve such special
>>>>> cases is undesirable, they can be easily solved with what's currently
>>>>> available.
>>>>
>>>> I appreciate your time is limited, but could you please translate your
>>>> suggestion, and detail your proposed alternative a bit, for those of us
>>>> who are not very familiar with IP multicast snooping?
>>>>
>>>
>>> My suggestion is not related to snooping really, but to the goal of
>>> patches 01-03. The bridge patches in this set are trying to forward
>>> traffic that is not supposed to be forwarded with the proposed
>>> configuration,
>> 
>> Correct up to a point. Reinterpreting the given user space configuration
>> and trying to make it do something else seems like a mistake, but in
>> principle one could also look at alternative bridge configurations like
>> the one I described here:
>> https://lore.kernel.org/netdev/20240402180805.yhhwj2f52sdc4dl2@skbuf/
>> 
>>> so that can be done by a user-space helper that installs
>>> rules to bypass the bridge specifically for those packets while
>>> monitoring the bridge state to implement a policy and manage these rules
>>> in order to keep snooping working.
>>>
>>>> Bypass the bridge for which packets? General IGMP/MLD queries? Wouldn't
>>>> that break snooping? And then do what with the packets, forward them in
>>>> another software layer than the bridge?
>>>>
>>>
>>> The ones that are not supposed to be forwarded in the proposed config
>>> and are needed for this use case (control traffic and link-local). Obviously
>>> to have proper snooping you'd need to manage these bypass
>>> rules and use them only while needed.
>> 
>> I think Joseph will end up in a situation where he needs IGMP control
>> messages both in the bridge data path and outside of it :)
>> 
> 
> My solution does not exclude such scenario. With all unregistered mcast
> disabled it will be handled the same as with this proposed solution.
> 
>> Also, your proposal eliminates the possibility of cooperating with a
>> hardware accelerator which can forward the IGMP messages where they need
>> to go.
>> 
>> As far as I understand, I don't think Joseph has a very "special" use case.
>> Disabling flooding of unregistered multicast in the data plane sounds
>> reasonable. There seems to be a gap in the bridge API, in that this
> 
> This we already have, but..
> 
>> operation also affects the control plane, which he is trying to fix with
>> this "force flooding", because of insufficiently fine grained control.
>> 
> 
> Right, this is the part that is more special, I'm not saying it's
> unreasonable. The proposition to make it optional and break it down to
> type of traffic sounds good to me.
> 
>>>> I also don't quite understand the suggestion of turning on mcast flooding:
>>>> isn't Joseph saying that he wants it off for the unregistered multicast
>>>> data traffic?
>>>
>>> Ah my bad, I meant to turn off flooding and bypass the bridge for those
>>> packets and ports while necessary, under necessary can be any policy
>>> that the user-space helper wants to implement.
>>>
>>> In any case, if this is going to be yet another kernel solution then it
>>> must be a new option that is default off, and doesn't break current mcast
>>> flood flag behaviour.
>> 
>> Yeah, maybe something like this, simple and with clear offload
>> semantics, as seen in existing hardware (not Marvell though):
>> 
>> mcast_flood == off:
>> - mcast_ipv4_ctrl_flood: don't care (maybe can force to "off")
>> - mcast_ipv4_data_flood: don't care
>> - mcast_ipv6_ctrl_flood: don't care
>> - mcast_ipv6_data_flood: don't care
>> - mcast_l2_flood: don't care
>> mcast_flood == on:
>> - Flood 224.0.0.x according to mcast_ipv4_ctrl_flood
>> - Flood all other IPv4 multicast according to mcast_ipv4_data_flood
>> - Flood ff02::/16 according to mcast_ipv6_ctrl_flood
>> - Flood all other IPv6 multicast according to mcast_ipv6_data_flood
>> - Flood L2 according to mcast_l2_flood

Did you mean

if mcast_flood == on (meaning mcast_flood is ENABLED)
- mcast_ipv4_ctrl_flood: don't care (since 224.0.0.x will be flooded anyway)
...

if mcast_flood == off (meaning mcast_flood is DISABLED)
- Flood 224.0.0.x according to mcast_ipv4_ctrl_flood
...

? Otherwise the problem is still not solved when mcast_flood is disabled.

> 
> Yep, sounds good to me. I was thinking about something in these lines
> as well if doing a kernel solution in order to make it simpler and more
> generic. The ctrl flood bits need to be handled more carefully to make
> sure they match only control traffic and not link-local data.

Do we consider 224.0.0.251 (mDNS) to be control or data? What qualifies 
as control I guess that's my question.

> I think the old option can be converted to use this fine-grained
> control, that is if anyone sets the old flood on/off then the flood
> mask gets set properly so we can do just 1 & in the fast path and avoid
> adding more tests. It will also make it symmetric - if it can override
> the on case, then it will be able to override the off case. And to be
> more explicit you can pass a mask variable to br_multicast_rcv() which
> will get populated and then you can pass it down to br_flood(). That
> will also avoid adding new bits to the skb's bridge CB.
> 
> 


  reply	other threads:[~2024-04-04 22:16 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-02  0:10 [PATCH RFC net-next 00/10] MC Flood disable and snooping Joseph Huang
2024-04-02  0:11 ` [PATCH RFC net-next 01/10] net: bridge: Flood Queries even when mc flood is disabled Joseph Huang
2024-04-02  0:11 ` [PATCH RFC net-next 02/10] net: bridge: Always multicast_flood Reports Joseph Huang
2024-04-03 15:52   ` Simon Horman
2024-04-02  0:11 ` [PATCH RFC net-next 03/10] net: bridge: Always flood local subnet mc packets Joseph Huang
2024-04-02  0:11 ` [PATCH RFC net-next 04/10] net: dsa: mv88e6xxx: Add all hosts mc addr to ATU Joseph Huang
2024-04-02 18:08   ` Vladimir Oltean
2024-04-02  0:11 ` [PATCH RFC net-next 05/10] net: dsa: Add support for PORT_MROUTER attribute Joseph Huang
2024-04-02  0:11 ` [PATCH RFC net-next 06/10] net: dsa: mv88e6xxx: Track soft bridge objects Joseph Huang
2024-04-02  0:11 ` [PATCH RFC net-next 07/10] net: dsa: mv88e6xxx: Track bridge mdb objects Joseph Huang
2024-04-02 12:23   ` Vladimir Oltean
2024-04-04 20:43     ` Joseph Huang
2024-04-05 11:07       ` Vladimir Oltean
2024-04-05 18:58         ` Joseph Huang
2024-04-29 22:07           ` Joseph Huang
2024-04-30  0:59             ` Vladimir Oltean
2024-04-30 16:27               ` Joseph Huang
2024-05-02 20:37                 ` Joseph Huang
2024-04-02  0:11 ` [PATCH RFC net-next 08/10] net: dsa: mv88e6xxx: Convert MAB to use bit flags Joseph Huang
2024-04-02  0:11 ` [PATCH RFC net-next 09/10] net: dsa: mv88e6xxx: Enable mc flood for mrouter port Joseph Huang
2024-04-03 15:49   ` Simon Horman
2024-04-02  0:11 ` [PATCH RFC net-next 10/10] net: dsa: mv88e6xxx: Offload " Joseph Huang
2024-04-02  9:28 ` [PATCH RFC net-next 00/10] MC Flood disable and snooping Nikolay Aleksandrov
2024-04-02 17:43   ` Vladimir Oltean
2024-04-02 18:50     ` Nikolay Aleksandrov
2024-04-02 20:46       ` Vladimir Oltean
2024-04-02 21:59         ` Nikolay Aleksandrov
2024-04-04 22:16           ` Joseph Huang [this message]
2024-04-05 10:20             ` Vladimir Oltean
2024-04-05 11:00               ` Nikolay Aleksandrov
2024-04-05 20:22                 ` Joseph Huang
2024-04-05 21:15                   ` Vladimir Oltean
2024-04-29 20:14                     ` Joseph Huang
2024-04-30  1:21                       ` Vladimir Oltean
2024-04-30 17:01                         ` Joseph Huang
2024-05-02 12:12                           ` Nikolay Aleksandrov
2025-02-26 20:20                             ` Linus Lüssing
2025-02-26 22:17                               ` Linus Lüssing
2024-04-02 12:43 ` Andrew Lunn
2024-04-04 21:35   ` Joseph Huang
2024-04-04 22:11     ` Andrew Lunn
2024-04-04 22:40       ` Joseph Huang
2024-04-05 13:09         ` Andrew Lunn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=804b7bf3-1b29-42c4-be42-4c23f1355aaf@gmail.com \
    --to=joseph.huang.2024@gmail.com \
    --cc=Joseph.Huang@garmin.com \
    --cc=andrew@lunn.ch \
    --cc=bridge@lists.linux.dev \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=f.fainelli@gmail.com \
    --cc=kuba@kernel.org \
    --cc=linus.luessing@c0d3.blue \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=olteanv@gmail.com \
    --cc=pabeni@redhat.com \
    --cc=razor@blackwall.org \
    --cc=roopa@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox