From: "Huang, Joseph" <joseph.huang.at.garmin@gmail.com>
To: "Linus Lüssing" <linus.luessing@c0d3.blue>,
"Ido Schimmel" <idosch@nvidia.com>
Cc: Joseph Huang <Joseph.Huang@garmin.com>,
netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Simon Horman <horms@kernel.org>,
Andrew Lunn <andrew+netdev@lunn.ch>,
Nikolay Aleksandrov <razor@blackwall.org>,
David Ahern <dsahern@kernel.org>,
Stanislav Fomichev <sdf@fomichev.me>,
Kuniyuki Iwashima <kuniyu@google.com>,
Ahmed Zaki <ahmed.zaki@intel.com>,
Alexander Lobakin <aleksander.lobakin@intel.com>,
linux-kernel@vger.kernel.org, bridge@lists.linux.dev
Subject: Re: [PATCH net] net: bridge: Trigger host query on v6 addr valid
Date: Mon, 6 Oct 2025 11:43:02 -0400
Message-ID: <9cc66694-6fcd-4460-9bce-cdbcb0153a89@gmail.com>
In-Reply-To: <aOEu6uQ4pP4PJH-y@sellars>

On 10/4/2025 10:27 AM, Linus Lüssing wrote:
> On Wed, Sep 17, 2025 at 02:30:51PM +0300, Ido Schimmel wrote:
>> But before making changes, I want to better understand the problem you
>> are seeing. Is it specific to the offloaded data path? I believe the
>> problem was fixed in the software data path by this commit:
>
> Two issues I noticed recently, even without any hardware switch
> offloading, on plain soft bridges:
>
> 1) (Probably not the issue here? But just to avoid that this
> causes additional confusion:) we don't seem to properly converge to
> the lowest MAC address, which is a bug, a violation of the RFCs.
>
> If we received an IGMP/MLD query from a foreign host with an
> address like fe80::2 and selected it and then enable our own
> multicast querier with a lower address like fe80::1 on our bridge
> interface for example then we won't send our queries, won't reelect
> ourself. If I recall correctly. (Not too critical though, as at least we
> have a querier on the link. But I find the election code a bit
> confusing and I wouldn't dare to touch it without adding some tests.)
>
I agree that there might be some corner cases which the current election
code does not handle very well (one of them is outlined below).

> 2) Without Ido's suggested workaround when the bridge multicast snooping
> + querier is enabled before the IPv6 DAD has taken place then our
> first IGMP/MLD query will fizzle, not be transmitted.

This (#2) is what this patch is trying to address. With DAD enabled, the
first MLD Query is never transmitted. That essentially means that the
effective Robustness Variable is 1 rather than the default of 2 (which
is not very robust).

> However (at least for a non-hardware-offloaded) bridge as far as I
> recall this shouldn't create any multicast packet loss and should
> operate as "normal" with flooding multicast data packets first,
> with multicast snooping activating on multicast data
> after another IGMP/MLD querier interval has elapsed (default:
> 125 sec.)?
>
Some systems cannot afford to have multicast traffic flooded to them.
Think of resource-constrained, low-power sensors connected to a network
carrying high-volume multicast video traffic, for example. The flooded
traffic could easily choke the sensors and is essentially a DDoS attack
on them.

> Which indeed could be optimized and is confusing, this delay could
> be avoided. Is that the issue you mean, Joseph?
> (I'd consider it more an optimization, so for net-next, not
> net though.)
>
I'm not sure this should be categorized as an optimization. If we never
intended to send Startup Queries, that would be a different story. But if
we intend to send them and fail to do so, I think that should be
considered a bug.

>> In current implementation, :: always wins the election
>
> That would be news to me.
>
> RFC2710, section 5:
>
> To be valid, the Query message MUST come from a link-
> local IPv6 Source Address
>
> RFC3810, section 5.1.14, is even more explicit:
>
> 5.1.14. Source Addresses for Queries
>
> All MLDv2 Queries MUST be sent with a valid IPv6 link-local source
> address. If a node (router or host) receives a Query message with
> the IPv6 Source Address set to the unspecified address (::), or any
> other address that is not a valid IPv6 link-local address, it MUST
> silently discard the message and SHOULD log a warning.
>
> So :: can't be used as a source address for an MLD query.
> And since 2014 with "bridge: multicast: add sanity check for query source addresses"
> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6565b9eeef194afbb3beec80d6dd2447f4091f8c)
> we should be adhering to that requirement? Let me know if I'm missing
> something.
>
This is what I meant by ":: always wins":

In br_multicast_select_querier(),

	if (ipv6_addr_cmp(&saddr->src.ip6, &querier->addr.src.ip6) <= 0)
		goto update;

If querier->addr.src.ip6 is 0, nothing can be less than that, so "::
always wins".
However,

1. querier->addr.src.ip6 is (un)initialized(?) to 0 (I couldn't find the
   place where ip6_querier.addr is initialized)
2. Querier election cannot take place due to the comparison above, until
   the bridge selects itself first via br_multicast_select_own_querier()
3. The bridge only selects itself after the first successful Query is
   sent to the host
4. br_ip6_multicast_alloc_query() will fail if the v6 address is not
   valid yet
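
To make the sequence above concrete, here is a minimal userspace sketch.
It is a toy model, not the bridge code: struct toy_bridge and the toy_*()
helpers are made-up names, memcmp() stands in for ipv6_addr_cmp(), and it
models only the four steps listed above (timers and everything else in
the real election path are ignored).

```c
#include <stdbool.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Toy state for illustration only; not a kernel structure. */
struct toy_bridge {
	struct in6_addr querier;	/* selected querier, starts as :: (step 1) */
	struct in6_addr own;		/* the bridge's own link-local address */
	bool own_addr_valid;		/* false while DAD is still in progress */
};

static bool toy_init(struct toy_bridge *br, const char *own)
{
	memset(br, 0, sizeof(*br));
	return inet_pton(AF_INET6, own, &br->own) == 1;
}

/*
 * Steps 3 and 4: the bridge's own query can only be built once its
 * address is valid, and the bridge only selects itself after that
 * query has been sent.
 */
static bool toy_send_own_query(struct toy_bridge *br)
{
	if (!br->own_addr_valid)
		return false;
	br->querier = br->own;
	return true;
}

/*
 * Step 2: a received query replaces the selected querier only if its
 * source address compares <= the current one. While the current one
 * is still ::, no link-local source can pass this test.
 */
static bool toy_rx_query(struct toy_bridge *br, const char *src)
{
	struct in6_addr s;

	if (inet_pton(AF_INET6, src, &s) != 1)
		return false;
	if (memcmp(&s, &br->querier, sizeof(s)) > 0)
		return false;
	br->querier = s;
	return true;
}
```

In this model a query from fe80::1 is rejected for as long as the
selected querier is still ::, and is accepted only after the bridge
(say, fe80::5) has managed to send its own query and select itself.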
So, without this patch, a system would have to wait for

  31.25 seconds (until the second Query to the host is sent, after which
                 the bridge selects itself) +
  ~125 seconds (for the next Query from the real Querier to arrive)

before it can receive multicast traffic. For some embedded devices that's
a very long time (imagine turning on a TV and having to wait two and a
half minutes before it starts working).
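
The arithmetic behind those numbers, as a small sketch. The helper names
are made up for illustration, and it assumes the default Query Interval
of 125 seconds with a Startup Query Interval of one quarter of that; the
second term is only approximate, since it depends on when the other
querier sends its next Query.

```c
/* Default MLD Query Interval in seconds (the "~125 seconds" above). */
#define QUERY_INTERVAL_S 125.0

/*
 * Assumption: the Startup Query Interval is 1/4 of the Query Interval,
 * which is where the 31.25 seconds comes from.
 */
static double startup_query_interval_s(double query_interval_s)
{
	return query_interval_s / 4.0;
}

/*
 * Rough worst-case wait: one Startup Query Interval until the second
 * host Query succeeds, plus up to one full Query Interval until the
 * real querier's next Query arrives.
 */
static double worst_case_wait_s(double query_interval_s)
{
	return startup_query_interval_s(query_interval_s) + query_interval_s;
}
```

With the defaults, worst_case_wait_s(QUERY_INTERVAL_S) comes to 156.25
seconds, i.e. a little over two and a half minutes.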
Thanks,
Joseph
> For IPv4 and 0.0.0.0 this is a different story though... I'm not
> aware of a requirement in RFCs to avoid 0.0.0.0 in IGMP
> queries. And "intuitively" one would prefer 0.0.0.0 to be the
> least preferred querier address. But when taking the IGMP RFCs
> literally then 0.0.0.0 would be the lowest one and always win... And RFC4541
> unfortunately does not clarify the use of 0.0.0.0 for IGMP queries.
> Not quite sure what the common practice among other layer 2 multicast
> snooping implementations across other vendors is.
>
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0888d5f3c0f183ea6177355752ada433d370ac89
>>
>> And Linus is working [1][2] on reflecting it to device drivers so that
>> the hardware data path will act like the software data path and flood
>> unregistered multicast traffic to all the ports as long as no querier
>> was detected.
>
> Right, for hardware offloading bridges/switches I'm on it, next
> revision shouldn't take much longer...
>
> Regards, Linus
Thread overview: 7+ messages
2025-09-12 22:39 [PATCH net] net: bridge: Trigger host query on v6 addr valid Joseph Huang
2025-09-13 18:23 ` Ido Schimmel
2025-09-15 22:41 ` Huang, Joseph
2025-09-17 11:30 ` Ido Schimmel
2025-10-04 14:27 ` Linus Lüssing
2025-10-06 15:43 ` Huang, Joseph [this message]
2025-10-08 12:28 ` Ido Schimmel