netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Linus Lüssing" <linus.luessing@c0d3.blue>
To: Nikolay Aleksandrov <razor@blackwall.org>
Cc: Jakub Kicinski <kuba@kernel.org>,
	bridge@lists.linux.dev, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, Ido Schimmel <idosch@nvidia.com>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	Simon Horman <horms@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Eric Dumazet <edumazet@google.com>,
	"David S . Miller" <davem@davemloft.net>,
	Kuniyuki Iwashima <kuniyu@google.com>,
	Stanislav Fomichev <sdf@fomichev.me>,
	Xiao Liang <shaw.leon@gmail.com>
Subject: Re: [PATCH 0/9] net: bridge: reduce multicast checks in fast path
Date: Tue, 2 Sep 2025 22:16:04 +0200	[thread overview]
Message-ID: <aLdQhJoViBzxcWYE@sellars> (raw)
In-Reply-To: <bfb11627-64d5-42a0-911e-8be99e222396@blackwall.org>

Hi Nik, thanks for the suggestions and review again!


On Fri, Aug 29, 2025 at 07:23:24PM +0300, Nikolay Aleksandrov wrote:
> 
> a few notes for v2:
> - please use READ/WRTE_ONCE() for variables that are used without locking

Just to understand the issue, otherwise the data path would assume
an old inactive or active state for a prolonged time after toggling?
Or what would happen?


> - please make locking symmetric, I saw that br_multicast_open() expects the lock to be already held, while
>   __br_multicast_stop() takes it itself

I think that's what I tried before submitting. Initially wanted
to have the locking inside, but then it would deadlock on
br_multicast_toggle()->br_multicast_open()->... as this is the one
place where br_multicast_open() is called with the multicast spinlock
already held.

On the other hand, moving the spinlock out of / around
__br_multicast_stop() would lead to a sleeping-while-atomic bug
when calling timer_delete_sync() in there. And if I were to change
these to a timer_delete() I guess there is no way to do the sync
part after unlocking? There is no equivalent to something like the
flush_workqueue() / drain_workqueue() for workqueues, but for
simple timers instead, right?

I would also love to go for the first approach, taking the
spinlock inside of __br_multicast_open(). But not quite sure how
to best and safely change br_multicast_toggle() then.


> - is the mcast lock really necessary, would atomic ops do for this tracking?

Hm, not sure. We'd be checking multiple variables before toggling
the new brmctx->ip{4,6}_active. As the checked variables can
change from multiple contexts couldn't the following happen then?


Start: ip*_active = false, snooping_enabled = false,
       vlan_snooping_enabled = true, vlan{id:42}->snooping_enabled = false

Thread A)                     Thread B)
--------------------------------------------------------------------------
                              br_multicast_toggle(br, 1, ...)
			      -> loads vlan{id:42}->snooping_enabled: false
--------------------------------------------------------------------------
br_multicast_toggle_one_vlan(vlan{id:42}, true)
-> vlan->priv_flags: set per-vlan-mc-snooping to true
-> br_multicast_update_active()
   checks snooping_enabled: false
   -> keeping vlan's ip*_active at false
--------------------------------------------------------------------------
                              -> sets snooping_enabled: true
                              -> br_multicast_update_active()
			         -> checks vlan{id:42}->snooping_enabled:
				    false (from earlier load above)
                                 -> keeping vlan's ip*_active at false

Result: vlan's ip*_active is still false even though it should be
true now?

.o(Or were netlink and sysfs handlers somehow ensured to never run in
parallel?)


I'm not that versed with atomic's and explicit memory barriers,
could that prevent something like that even if multiple variables
are involved? Only used lockless atomic_t's for atomic_inc()/_dec() so far.
And otherwise used explicit locking.



> - can you provide the full view somewhere, how would this tracking be used? I fear
>   there might still be races.

My original switchdev integration plan so far was roughly still the same
as in the previous pull-request:

https://patchwork.kernel.org/project/netdevbpf/patch/20250522195952.29265-5-linus.luessing@c0d3.blue/

And using it in rtl83xx in OpenWrt like this:
https://github.com/openwrt/openwrt/pull/18780/commits/708bbc4b73bc90cd13839c613e6634aa5faeeace#diff-b5c9a9cc66e207d77fea5d063dca2cce20cf4ae3a28fc0a5d5eebd47a024d5c3

But haven't updated these yet onto this PR, will do that later.

  reply	other threads:[~2025-09-02 20:16 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-29  8:53 [PATCH 0/9] net: bridge: reduce multicast checks in fast path Linus Lüssing
2025-08-29  8:53 ` [PATCH 1/9] net: bridge: mcast: track active state, IGMP/MLD querier appearance Linus Lüssing
2025-08-29  8:53 ` [PATCH 2/9] net: bridge: mcast: track active state, foreign IGMP/MLD querier disappearance Linus Lüssing
2025-08-29  8:53 ` [PATCH 3/9] net: bridge: mcast: track active state, IPv6 address availability Linus Lüssing
2025-08-29  8:53 ` [PATCH 4/9] net: bridge: mcast: track active state, own MLD querier disappearance Linus Lüssing
2025-08-29  8:53 ` [PATCH 5/9] net: bridge: mcast: export ip{4,6}_active state to netlink Linus Lüssing
2025-08-29  8:53 ` [PATCH 6/9] net: bridge: mcast: use combined active state in fast/data path Linus Lüssing
2025-08-29  8:53 ` [PATCH 7/9] net: bridge: mcast: active state, if snooping is enabled Linus Lüssing
2025-08-29  8:53 ` [PATCH 8/9] net: bridge: mcast: track active state, bridge up/down Linus Lüssing
2025-08-29  8:53 ` [PATCH 9/9] net: bridge: mcast: add inactive state assertions Linus Lüssing
2025-08-29 15:47 ` [PATCH 0/9] net: bridge: reduce multicast checks in fast path Jakub Kicinski
2025-08-29 16:23   ` Nikolay Aleksandrov
2025-09-02 20:16     ` Linus Lüssing [this message]
2025-09-02 23:00       ` Linus Lüssing
2025-09-03 10:11         ` Nikolay Aleksandrov
2025-09-03 10:08       ` Nikolay Aleksandrov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aLdQhJoViBzxcWYE@sellars \
    --to=linus.luessing@c0d3.blue \
    --cc=andrew+netdev@lunn.ch \
    --cc=bridge@lists.linux.dev \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=idosch@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=kuniyu@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=razor@blackwall.org \
    --cc=sdf@fomichev.me \
    --cc=shaw.leon@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).