From: Ido Schimmel <idosch@nvidia.com>
To: Nikolay Aleksandrov <razor@blackwall.org>
Cc: Petr Machata <petrm@nvidia.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Roopa Prabhu <roopa@nvidia.com>,
	netdev@vger.kernel.org, bridge@lists.linux-foundation.org
Subject: Re: [PATCH net-next 07/16] net: bridge: Maintain number of MDB entries in net_bridge_mcast_port
Date: Mon, 30 Jan 2023 10:08:21 +0200
Message-ID: <Y9d69bP7tzp/2reQ@shredder>
In-Reply-To: <81821548-4839-e7ba-37b0-92966beb7930@blackwall.org>

On Sun, Jan 29, 2023 at 06:55:26PM +0200, Nikolay Aleksandrov wrote:
> On 26/01/2023 19:01, Petr Machata wrote:
> > The MDB maintained by the bridge is limited. When the bridge is configured
> > for IGMP / MLD snooping, a buggy or malicious client can easily exhaust its
> > capacity. In SW datapath, the capacity is configurable through the
> > IFLA_BR_MCAST_HASH_MAX parameter, but ultimately is finite. Obviously a
> > similar limit exists in the HW datapath for purposes of offloading.
> > 
> > In order to prevent the issue of unilateral exhaustion of MDB resources,
> > introduce two parameters in each of two contexts:
> > 
> > - Per-port and per-port-VLAN number of MDB entries that the port
> >   is member in.
> > 
> > - Per-port and (when BROPT_MCAST_VLAN_SNOOPING_ENABLED is enabled)
> >   per-port-VLAN maximum permitted number of MDB entries, or 0 for
> >   no limit.
> > 
> > The per-port multicast context is used for tracking of MDB entries for the
> > port as a whole. This is available for all bridges.
> > 
> > The per-port-VLAN multicast context is then only available on
> > VLAN-filtering bridges on VLANs that have multicast snooping on.
> > 
> > With these changes in place, it will be possible to configure MDB limit for
> > bridge as a whole, or any one port as a whole, or any single port-VLAN.
> > 
> > Note that unlike the global limit, exhaustion of the per-port and
> > per-port-VLAN maximums does not cause disablement of multicast snooping.
> > It is also permitted to configure the local limit larger than hash_max,
> > even though that is not useful.
> > 
> > In this patch, introduce only the accounting for number of entries, and the
> > max field itself, but not the means to toggle the max. The next patch
> > introduces the netlink APIs to toggle and read the values.
> > 
> > Note that the per-port-VLAN mcast_max_groups value gets reset when VLAN
> > snooping is enabled. The reason for this is that while VLAN snooping is
> > disabled, permanent entries can be added above the limit imposed by the
> > configured maximum. Under those circumstances, whatever caused the VLAN
> > context enablement, would need to be rolled back, adding a fair amount of
> > code that would be rarely hit and tricky to maintain. At the same time,
> > the feature that this would enable is IMHO not interesting: I posit that
> > the usefulness of keeping mcast_max_groups intact across
> > mcast_vlan_snooping toggles is marginal at best.
> > 
> 
> Hmm, I keep thinking about this one and I don't completely agree. It would be
> more user-friendly if the max count doesn't get reset when mcast snooping is toggled.
> Imposing order of operations (first enable snooping, then config max entries) isn't necessary
> and it makes sense for someone to first set the limit and then enable vlan snooping.
> Also it would be consistent with port max entries, I'd prefer if we have the same
> behaviour for port and vlan pmctxs. If we allow to set any maximum at any time we
> don't need to rollback anything, also we already always lookup vlans in br_multicast_port_vid_to_port_ctx()
> to check if snooping is enabled so we can keep the count correct regardless, the same as
> it's done for the ports. Keeping both limits with consistent semantics seems better to me.
> 
> WDYT ?

The current approach is strict and prevents user space from performing
configuration that does not make a lot of sense:

1. Setting the maximum to be less than the current count.

2. Increasing the port-VLAN count above port-VLAN maximum when VLAN
snooping is disabled (i.e., maximum is not enforced) and then enabling
VLAN snooping.

However, it is not consistent with similar existing behavior where the
kernel is more liberal. For example:

1. It is possible to set the global maximum to be less than the current
number of entries.

2. Other port-VLAN attributes are not reset when VLAN snooping is
toggled.

And it also results in order-of-operations problems like the one you
described.

So, it seems to me that we have more good reasons to not reset the
maximum than to reset it. Regardless of which path we take, it is
important to document the behavior in the man page (and in the commit
message, obviously) to avoid "bug reports" later on.
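
To make the more liberal semantics concrete, here is a minimal user-space
sketch. It is not the kernel code: struct mdb_ctx and every function name
below are hypothetical, and it only assumes what is discussed above, namely
that the maximum can be set at any time (even below the current count),
that entries are counted whether or not the limit is being enforced, and
that 0 means no limit.

```c
#include <stdbool.h>

/* Hypothetical model of a per-port or per-port-VLAN multicast context.
 * Field names are illustrative, not the kernel's.
 */
struct mdb_ctx {
	unsigned int n_groups;   /* current number of MDB entries */
	unsigned int max_groups; /* configured maximum, 0 means no limit */
};

/* Liberal setter: accept any value at any time, even one that is
 * already below n_groups. Nothing is reset or rolled back.
 */
static void mdb_ctx_set_max(struct mdb_ctx *ctx, unsigned int max)
{
	ctx->max_groups = max;
}

/* Count an entry without enforcing the limit, modelling entries added
 * while the limit is not enforced (e.g. VLAN snooping disabled).
 */
static void mdb_ctx_account(struct mdb_ctx *ctx)
{
	ctx->n_groups++;
}

/* Enforced add: refuse new entries while the count is at or above the
 * maximum. Existing entries above the maximum are left alone.
 */
static bool mdb_ctx_try_add(struct mdb_ctx *ctx)
{
	if (ctx->max_groups && ctx->n_groups >= ctx->max_groups)
		return false;
	ctx->n_groups++;
	return true;
}

/* Drop one entry from the count, guarding against underflow. */
static void mdb_ctx_del(struct mdb_ctx *ctx)
{
	if (ctx->n_groups)
		ctx->n_groups--;
}
```

In this model, a context that ends up above its maximum simply rejects
further additions until enough entries are deleted, which is the behavior
that would need to be spelled out in the man page.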

