public inbox for netdev@vger.kernel.org
From: Nikolay Aleksandrov <razor@blackwall.org>
To: "Linus Lüssing" <linus.luessing@c0d3.blue>
Cc: bridge@lists.linux.dev, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	Ido Schimmel <idosch@nvidia.com>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	Simon Horman <horms@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Eric Dumazet <edumazet@google.com>,
	"David S . Miller" <davem@davemloft.net>,
	Kuniyuki Iwashima <kuniyu@google.com>,
	Stanislav Fomichev <sdf@fomichev.me>,
	Xiao Liang <shaw.leon@gmail.com>,
	shuah@kernel.org
Subject: Re: [PATCH net-next v3 03/14] net: bridge: mcast: avoid sleeping on bridge-down
Date: Mon, 2 Mar 2026 14:58:27 +0200
Message-ID: <aaWJc1JPqZThLZBe@penguin>
In-Reply-To: <20260302054008.21638-4-linus.luessing@c0d3.blue>

On Mon, Mar 02, 2026 at 06:39:57AM +0100, Linus Lüssing wrote:
> We later want to use the multicast lock when setting the bridge
> interface up or down, to be able to atomically both check all conditions
> to toggle the multicast active state and to subsequently toggle it.
> While most variables we check / contexts we check from are serialized
> (toggled variables through netlink/sysfs) the timer_pending() check is
> not and might run in parallel.
> 
> However so far we are not allowed to spinlock __br_multicast_stop() as
> its call to timer_delete_sync() might sleep. Therefore replacing the
> sleeping variant with the non-sleeping timer_shutdown(). It is sufficient
> to only wait for any timer callback to finish when we are freeing the
> multicast context.
> 
> While the disadvantage of using a non-syncing variant might lead us to
> race and still execute its timer callback just after timer_shutdown() was
> called timer_shutdown() also has the following additional advantage(s):
> It for one thing clears the callback function pointer and by that avoids
> rearming. For another a missing function pointer allows us to detect
> early in the timer callback if we, this timer, were just canceled.
> 
> In other words, this also allows us to make sure that once
> timer_shutdown() was called while we do potentially enter its timer
> callback briefly we never run the main task of this timer. Similar to
> what a timer_delete_sync() would have avoided, too. Except we are not
> waiting/sleeping/syncing in br_multicast_stop() but instead (in rare cases)
> briefly busy-wait/sync a bit later when grabbing the multicast spinlock
> in the timer callback.
> 
> This new check also makes the netif_running() check redundant/obsolete
> in these contexts.
> 
> Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
> ---
>  net/bridge/br_multicast.c | 128 +++++++++++++++++++++++++++-----------
>  net/bridge/br_private.h   |   5 ++
>  net/bridge/br_vlan.c      |   5 ++
>  3 files changed, 100 insertions(+), 38 deletions(-)
> 
> diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
> index 881d866d687a..b90f0e149c40 100644
> --- a/net/bridge/br_multicast.c
> +++ b/net/bridge/br_multicast.c
> @@ -1663,6 +1663,14 @@ static void br_multicast_router_expired(struct net_bridge_mcast_port *pmctx,
>  	spin_unlock(&br->multicast_lock);
>  }
>  
> +static bool br_multicast_is_stopping(struct net_bridge *br,
> +				     struct timer_list *timer)

Both parameters should be const.
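
To illustrate the const suggestion together with the early-exit check this
helper provides, here is a minimal user-space sketch. It is not kernel code:
struct fake_timer, struct fake_bridge and all fake_* helpers are hypothetical
stand-ins, the lock is modelled by a plain flag, and fake_timer_shutdown()
only mimics the one effect the patch description relies on (clearing the
callback pointer).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct fake_timer {
	void (*function)(struct fake_timer *t);
};

struct fake_bridge {
	bool lock_held;		/* models br->multicast_lock being held */
	int work_done;		/* counts how often the main timer work ran */
	struct fake_timer timer;
};

/* Placeholder callback, only used so that .function is non-NULL. */
static void fake_expired_cb(struct fake_timer *t)
{
	(void)t;
}

/* Read-only predicate: neither argument is modified, so both can be const. */
static bool fake_is_stopping(const struct fake_bridge *br,
			     const struct fake_timer *timer)
{
	assert(br->lock_held);	/* stands in for lockdep_assert_held_once() */
	return !timer->function;
}

/* Models the effect described in the patch: the callback pointer is cleared. */
static void fake_timer_shutdown(struct fake_timer *timer)
{
	timer->function = NULL;
}

/* Body of a timer callback: take the lock, bail out early if shut down. */
static void fake_query_expired(struct fake_bridge *br, struct fake_timer *timer)
{
	br->lock_held = true;
	if (!fake_is_stopping(br, timer))
		br->work_done++;	/* the "main task" of the timer */
	br->lock_held = false;
}
```

In this model, a late call into fake_query_expired() after
fake_timer_shutdown() still takes the lock but skips the main work, which
mirrors the behaviour the commit message describes.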

> +{
> +	lockdep_assert_held_once(&br->multicast_lock);
> +
> +	return !timer->function;
> +}
> +
>  static void br_ip4_multicast_router_expired(struct timer_list *t)
>  {
>  	struct net_bridge_mcast_port *pmctx = timer_container_of(pmctx, t,
> @@ -1698,7 +1706,8 @@ static void br_multicast_local_router_expired(struct net_bridge_mcast *brmctx,
>  					      struct timer_list *timer)
>  {
>  	spin_lock(&brmctx->br->multicast_lock);
> -	if (brmctx->multicast_router == MDB_RTR_TYPE_DISABLED ||
> +	if (br_multicast_is_stopping(brmctx->br, timer) ||
> +	    brmctx->multicast_router == MDB_RTR_TYPE_DISABLED ||
>  	    brmctx->multicast_router == MDB_RTR_TYPE_PERM ||
>  	    br_ip4_multicast_is_router(brmctx) ||
>  	    br_ip6_multicast_is_router(brmctx))
> @@ -1728,10 +1737,11 @@ static void br_ip6_multicast_local_router_expired(struct timer_list *t)
>  #endif
>  
>  static void br_multicast_querier_expired(struct net_bridge_mcast *brmctx,
> -					 struct bridge_mcast_own_query *query)
> +					 struct bridge_mcast_own_query *query,
> +					 struct timer_list *timer)
>  {
>  	spin_lock(&brmctx->br->multicast_lock);
> -	if (!netif_running(brmctx->br->dev) ||
> +	if (br_multicast_is_stopping(brmctx->br, timer) ||
>  	    br_multicast_ctx_vlan_global_disabled(brmctx) ||
>  	    !br_opt_get(brmctx->br, BROPT_MULTICAST_ENABLED))
>  		goto out;
> @@ -1747,7 +1757,7 @@ static void br_ip4_multicast_querier_expired(struct timer_list *t)
>  	struct net_bridge_mcast *brmctx = timer_container_of(brmctx, t,
>  							     ip4_other_query.timer);
>  
> -	br_multicast_querier_expired(brmctx, &brmctx->ip4_own_query);
> +	br_multicast_querier_expired(brmctx, &brmctx->ip4_own_query, t);
>  }
>  
>  #if IS_ENABLED(CONFIG_IPV6)
> @@ -1756,7 +1766,7 @@ static void br_ip6_multicast_querier_expired(struct timer_list *t)
>  	struct net_bridge_mcast *brmctx = timer_container_of(brmctx, t,
>  							     ip6_other_query.timer);
>  
> -	br_multicast_querier_expired(brmctx, &brmctx->ip6_own_query);
> +	br_multicast_querier_expired(brmctx, &brmctx->ip6_own_query, t);
>  }
>  #endif
>  
> @@ -4040,10 +4050,12 @@ int br_multicast_rcv(struct net_bridge_mcast **brmctx,
>  }
>  
>  static void br_multicast_query_expired(struct net_bridge_mcast *brmctx,
> -				       struct bridge_mcast_own_query *query)
> +				       struct bridge_mcast_own_query *query,
> +				       struct timer_list *timer)
>  {
>  	spin_lock(&brmctx->br->multicast_lock);
> -	if (br_multicast_ctx_vlan_disabled(brmctx))
> +	if (br_multicast_is_stopping(brmctx->br, timer) ||
> +	    br_multicast_ctx_vlan_disabled(brmctx))
>  		goto out;
>  
>  	if (query->startup_sent < brmctx->multicast_startup_query_count)
> @@ -4059,7 +4071,7 @@ static void br_ip4_multicast_query_expired(struct timer_list *t)
>  	struct net_bridge_mcast *brmctx = timer_container_of(brmctx, t,
>  							     ip4_own_query.timer);
>  
> -	br_multicast_query_expired(brmctx, &brmctx->ip4_own_query);
> +	br_multicast_query_expired(brmctx, &brmctx->ip4_own_query, t);
>  }
>  
>  #if IS_ENABLED(CONFIG_IPV6)
> @@ -4068,7 +4080,7 @@ static void br_ip6_multicast_query_expired(struct timer_list *t)
>  	struct net_bridge_mcast *brmctx = timer_container_of(brmctx, t,
>  							     ip6_own_query.timer);
>  
> -	br_multicast_query_expired(brmctx, &brmctx->ip6_own_query);
> +	br_multicast_query_expired(brmctx, &brmctx->ip6_own_query, t);
>  }
>  #endif
>  
> @@ -4111,29 +4123,30 @@ void br_multicast_ctx_init(struct net_bridge *br,
>  	seqcount_spinlock_init(&brmctx->ip6_querier.seq, &br->multicast_lock);
>  #endif
>  
> -	timer_setup(&brmctx->ip4_mc_router_timer,
> -		    br_ip4_multicast_local_router_expired, 0);
> -	timer_setup(&brmctx->ip4_other_query.timer,
> -		    br_ip4_multicast_querier_expired, 0);
> -	timer_setup(&brmctx->ip4_other_query.delay_timer,
> -		    br_multicast_query_delay_expired, 0);
> -	timer_setup(&brmctx->ip4_own_query.timer,
> -		    br_ip4_multicast_query_expired, 0);
> +	timer_setup(&brmctx->ip4_mc_router_timer, NULL, 0);
> +	timer_setup(&brmctx->ip4_other_query.timer, NULL, 0);
> +	timer_setup(&brmctx->ip4_other_query.delay_timer, NULL, 0);
> +	timer_setup(&brmctx->ip4_own_query.timer, NULL, 0);
>  #if IS_ENABLED(CONFIG_IPV6)
> -	timer_setup(&brmctx->ip6_mc_router_timer,
> -		    br_ip6_multicast_local_router_expired, 0);
> -	timer_setup(&brmctx->ip6_other_query.timer,
> -		    br_ip6_multicast_querier_expired, 0);
> -	timer_setup(&brmctx->ip6_other_query.delay_timer,
> -		    br_multicast_query_delay_expired, 0);
> -	timer_setup(&brmctx->ip6_own_query.timer,
> -		    br_ip6_multicast_query_expired, 0);
> +	timer_setup(&brmctx->ip6_mc_router_timer, NULL, 0);
> +	timer_setup(&brmctx->ip6_other_query.timer, NULL, 0);
> +	timer_setup(&brmctx->ip6_other_query.delay_timer, NULL, 0);
> +	timer_setup(&brmctx->ip6_own_query.timer, NULL, 0);
>  #endif
>  }
>  
>  void br_multicast_ctx_deinit(struct net_bridge_mcast *brmctx)
>  {
> -	__br_multicast_stop(brmctx);
> +	timer_shutdown_sync(&brmctx->ip4_mc_router_timer);
> +	timer_shutdown_sync(&brmctx->ip4_other_query.timer);
> +	timer_shutdown_sync(&brmctx->ip4_other_query.delay_timer);
> +	timer_shutdown_sync(&brmctx->ip4_own_query.timer);
> +#if IS_ENABLED(CONFIG_IPV6)
> +	timer_shutdown_sync(&brmctx->ip6_mc_router_timer);
> +	timer_shutdown_sync(&brmctx->ip6_other_query.timer);
> +	timer_shutdown_sync(&brmctx->ip6_other_query.delay_timer);
> +	timer_shutdown_sync(&brmctx->ip6_own_query.timer);
> +#endif
>  }
>  
>  void br_multicast_init(struct net_bridge *br)
> @@ -4213,9 +4226,27 @@ void br_multicast_leave_snoopers(struct net_bridge *br)
>  	br_ip6_multicast_leave_snoopers(br);
>  }
>  
> +void br_multicast_reset_timer_cbs(struct net_bridge_mcast *brmctx)
> +{
> +	lockdep_assert_held_once(&brmctx->br->multicast_lock);
> +
> +	brmctx->ip4_mc_router_timer.function = br_ip4_multicast_local_router_expired;
> +	brmctx->ip4_other_query.timer.function = br_ip4_multicast_querier_expired;
> +	brmctx->ip4_other_query.delay_timer.function = br_multicast_query_delay_expired;
> +	brmctx->ip4_own_query.timer.function = br_ip4_multicast_query_expired;
> +#if IS_ENABLED(CONFIG_IPV6)
> +	brmctx->ip6_mc_router_timer.function = br_ip6_multicast_local_router_expired;
> +	brmctx->ip6_other_query.timer.function = br_ip6_multicast_querier_expired;
> +	brmctx->ip6_other_query.delay_timer.function = br_multicast_query_delay_expired;
> +	brmctx->ip6_own_query.timer.function = br_ip6_multicast_query_expired;
> +#endif
> +}
> +
>  static void __br_multicast_open_query(struct net_bridge *br,
>  				      struct bridge_mcast_own_query *query)
>  {
> +	lockdep_assert_held_once(&br->multicast_lock);
> +
>  	query->startup_sent = 0;
>  
>  	if (!br_opt_get(br, BROPT_MULTICAST_ENABLED))
> @@ -4226,13 +4257,15 @@ static void __br_multicast_open_query(struct net_bridge *br,
>  
>  static void __br_multicast_open(struct net_bridge_mcast *brmctx)
>  {
> +	br_multicast_reset_timer_cbs(brmctx);
> +
>  	__br_multicast_open_query(brmctx->br, &brmctx->ip4_own_query);
>  #if IS_ENABLED(CONFIG_IPV6)
>  	__br_multicast_open_query(brmctx->br, &brmctx->ip6_own_query);
>  #endif
>  }
>  
> -void br_multicast_open(struct net_bridge *br)
> +static void br_multicast_open_locked(struct net_bridge *br)
>  {
>  	ASSERT_RTNL();
>  
> @@ -4256,17 +4289,26 @@ void br_multicast_open(struct net_bridge *br)
>  	}
>  }
>  
> +void br_multicast_open(struct net_bridge *br)
> +{
> +	spin_lock_bh(&br->multicast_lock);
> +	br_multicast_open_locked(br);
> +	spin_unlock_bh(&br->multicast_lock);
> +}
> +
>  static void __br_multicast_stop(struct net_bridge_mcast *brmctx)
>  {
> -	timer_delete_sync(&brmctx->ip4_mc_router_timer);
> -	timer_delete_sync(&brmctx->ip4_other_query.timer);
> -	timer_delete_sync(&brmctx->ip4_other_query.delay_timer);
> -	timer_delete_sync(&brmctx->ip4_own_query.timer);
> +	lockdep_assert_held_once(&brmctx->br->multicast_lock);
> +
> +	timer_shutdown(&brmctx->ip4_mc_router_timer);
> +	timer_shutdown(&brmctx->ip4_other_query.timer);
> +	timer_shutdown(&brmctx->ip4_other_query.delay_timer);
> +	timer_shutdown(&brmctx->ip4_own_query.timer);
>  #if IS_ENABLED(CONFIG_IPV6)
> -	timer_delete_sync(&brmctx->ip6_mc_router_timer);
> -	timer_delete_sync(&brmctx->ip6_other_query.timer);
> -	timer_delete_sync(&brmctx->ip6_other_query.delay_timer);
> -	timer_delete_sync(&brmctx->ip6_own_query.timer);
> +	timer_shutdown(&brmctx->ip6_mc_router_timer);
> +	timer_shutdown(&brmctx->ip6_other_query.timer);
> +	timer_shutdown(&brmctx->ip6_other_query.delay_timer);
> +	timer_shutdown(&brmctx->ip6_own_query.timer);
>  #endif
>  }
>  
> @@ -4317,12 +4359,12 @@ void br_multicast_toggle_one_vlan(struct net_bridge_vlan *vlan, bool on)
>  
>  		spin_lock_bh(&br->multicast_lock);
>  		vlan->priv_flags ^= BR_VLFLAG_MCAST_ENABLED;
> -		spin_unlock_bh(&br->multicast_lock);
>  
>  		if (on)
>  			__br_multicast_open(&vlan->br_mcast_ctx);
>  		else
>  			__br_multicast_stop(&vlan->br_mcast_ctx);
> +		spin_unlock_bh(&br->multicast_lock);
>  	} else {
>  		struct net_bridge_mcast *brmctx;
>  
> @@ -4380,6 +4422,7 @@ int br_multicast_toggle_vlan_snooping(struct net_bridge *br, bool on,
>  	if (!vg)
>  		return 0;
>  
> +	spin_lock_bh(&br->multicast_lock);
>  	br_opt_toggle(br, BROPT_MCAST_VLAN_SNOOPING_ENABLED, on);
>  
>  	/* disable/enable non-vlan mcast contexts based on vlan snooping */
> @@ -4387,6 +4430,8 @@ int br_multicast_toggle_vlan_snooping(struct net_bridge *br, bool on,
>  		__br_multicast_stop(&br->multicast_ctx);
>  	else
>  		__br_multicast_open(&br->multicast_ctx);
> +	spin_unlock_bh(&br->multicast_lock);
> +
>  	list_for_each_entry(p, &br->port_list, list) {
>  		if (on)
>  			br_multicast_disable_port_ctx(&p->multicast_ctx);
> @@ -4416,7 +4461,7 @@ bool br_multicast_toggle_global_vlan(struct net_bridge_vlan *vlan, bool on)
>  	return true;
>  }
>  
> -void br_multicast_stop(struct net_bridge *br)
> +static void br_multicast_stop_locked(struct net_bridge *br)
>  {
>  	ASSERT_RTNL();
>  
> @@ -4440,6 +4485,13 @@ void br_multicast_stop(struct net_bridge *br)
>  	}
>  }
>  
> +void br_multicast_stop(struct net_bridge *br)
> +{
> +	spin_lock_bh(&br->multicast_lock);
> +	br_multicast_stop_locked(br);
> +	spin_unlock_bh(&br->multicast_lock);
> +}
> +
>  void br_multicast_dev_del(struct net_bridge *br)
>  {
>  	struct net_bridge_mdb_entry *mp;
> @@ -4675,7 +4727,7 @@ int br_multicast_toggle(struct net_bridge *br, unsigned long val,
>  	if (!netif_running(br->dev))
>  		goto unlock;
>  
> -	br_multicast_open(br);
> +	br_multicast_open_locked(br);
>  	list_for_each_entry(port, &br->port_list, list)
>  		__br_multicast_enable_port_ctx(&port->multicast_ctx);
>  
> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
> index 4ab6a1f58116..a181a27aa559 100644
> --- a/net/bridge/br_private.h
> +++ b/net/bridge/br_private.h
> @@ -976,6 +976,7 @@ void br_multicast_disable_port(struct net_bridge_port *port);
>  void br_multicast_init(struct net_bridge *br);
>  void br_multicast_join_snoopers(struct net_bridge *br);
>  void br_multicast_leave_snoopers(struct net_bridge *br);
> +void br_multicast_reset_timer_cbs(struct net_bridge_mcast *brmctx);
>  void br_multicast_open(struct net_bridge *br);
>  void br_multicast_stop(struct net_bridge *br);
>  void br_multicast_dev_del(struct net_bridge *br);
> @@ -1416,6 +1417,10 @@ static inline void br_multicast_leave_snoopers(struct net_bridge *br)
>  {
>  }
>  
> +static inline void br_multicast_reset_timer_cbs(struct net_bridge_mcast *brmctx)
> +{
> +}
> +
>  static inline void br_multicast_open(struct net_bridge *br)
>  {
>  }
> diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
> index 326933b455b3..3facb4eda306 100644
> --- a/net/bridge/br_vlan.c
> +++ b/net/bridge/br_vlan.c
> @@ -325,7 +325,12 @@ static int __vlan_add(struct net_bridge_vlan *v, u16 flags,
>  			if (err && err != -EOPNOTSUPP)
>  				goto out;
>  		}
> +

Extra blank line here.

>  		br_multicast_ctx_init(br, v, &v->br_mcast_ctx);
> +
> +		spin_lock_bh(&br->multicast_lock);
> +		br_multicast_reset_timer_cbs(&v->br_mcast_ctx);
> +		spin_unlock_bh(&br->multicast_lock);

Have you tested this without CONFIG_BRIDGE_IGMP_SNOOPING defined?
I don't think it will compile.

Also, please avoid spilling the multicast lock outside of the mcast code.
In fact, why don't you move this into br_multicast_ctx_init()?
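
A rough user-space sketch of that idea follows. Everything in it is
hypothetical (FAKE_SNOOPING, struct fake_bridge, the fake_* functions), and it
assumes, for the sake of the example, that the lock member only exists when
snooping is built in. The init helper takes the lock itself and has an empty
stub for the disabled case, so the VLAN-side caller never touches the lock and
builds either way.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define FAKE_SNOOPING 1	/* set to 0 to model a build without snooping */

struct fake_timer {
	void (*function)(struct fake_timer *t);
};

struct fake_mcast_ctx {
	struct fake_timer own_query_timer;
};

struct fake_bridge {
#if FAKE_SNOOPING
	atomic_flag mcast_lock;	/* assumed to exist only with snooping enabled */
#endif
	struct fake_mcast_ctx ctx;
};

#if FAKE_SNOOPING
static void fake_query_expired(struct fake_timer *t)
{
	(void)t;
}

/* Multicast side: the lock is taken and released here, not by the caller. */
static void fake_mcast_ctx_init(struct fake_bridge *br,
				struct fake_mcast_ctx *ctx)
{
	assert(ctx != NULL);

	while (atomic_flag_test_and_set(&br->mcast_lock))
		;	/* spin until the flag is ours */
	ctx->own_query_timer.function = fake_query_expired;
	atomic_flag_clear(&br->mcast_lock);
}
#else
/* Stub for builds without snooping: no lock member is referenced. */
static inline void fake_mcast_ctx_init(struct fake_bridge *br,
				       struct fake_mcast_ctx *ctx)
{
	(void)br;
	(void)ctx;
}
#endif

/* VLAN side: no knowledge of the multicast lock is needed. */
static void fake_vlan_add(struct fake_bridge *br)
{
	fake_mcast_ctx_init(br, &br->ctx);
}
```

With FAKE_SNOOPING set to 0 the caller is unchanged and the struct simply has
no lock member, which is the property the open-coded locking in the caller
would not have.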

>  		v->priv_flags |= BR_VLFLAG_GLOBAL_MCAST_ENABLED;
>  	}
>  
> -- 
> 2.51.0
> 

Thread overview: 21+ messages
2026-03-02  5:39 [PATCH net-next v3 00/14] net: bridge: reduce multicast checks in fast path Linus Lüssing
2026-03-02  5:39 ` [PATCH net-next v3 01/14] net: bridge: mcast: export ip{4,6}_active state to netlink Linus Lüssing
2026-03-02  5:39 ` [PATCH net-next v3 02/14] net: bridge: mcast: track active state, adding tests Linus Lüssing
2026-03-02  8:08   ` Linus Lüssing
2026-03-02 14:32     ` Jakub Kicinski
2026-03-02  5:39 ` [PATCH net-next v3 03/14] net: bridge: mcast: avoid sleeping on bridge-down Linus Lüssing
2026-03-02 12:58   ` Nikolay Aleksandrov [this message]
2026-03-02  5:39 ` [PATCH net-next v3 04/14] net: bridge: mcast: track active state, IGMP/MLD querier appearance Linus Lüssing
2026-03-02  5:39 ` [PATCH net-next v3 05/14] net: bridge: mcast: track active state, foreign IGMP/MLD querier disappearance Linus Lüssing
2026-03-02  5:40 ` [PATCH net-next v3 06/14] net: bridge: mcast: track active state, IPv6 address availability Linus Lüssing
2026-03-02  5:40 ` [PATCH net-next v3 07/14] net: bridge: mcast: track active state, own MLD querier disappearance Linus Lüssing
2026-03-02  5:40 ` [PATCH net-next v3 08/14] net: bridge: mcast: track active state, if snooping is enabled Linus Lüssing
2026-03-02  5:40 ` [PATCH net-next v3 09/14] net: bridge: mcast: track active state, VLAN snooping Linus Lüssing
2026-03-02  5:40 ` [PATCH net-next v3 10/14] net: bridge: mcast: track active state, bridge up/down Linus Lüssing
2026-03-02  5:40 ` [PATCH net-next v3 11/14] net: bridge: mcast: track active state, prepare for outside lock reads Linus Lüssing
2026-03-02  5:40 ` [PATCH net-next v3 12/14] net: bridge: mcast: use combined active state in netlink Linus Lüssing
2026-03-02  5:40 ` [PATCH net-next v3 13/14] net: bridge: mcast: use combined active state in fast/data path Linus Lüssing
2026-03-02 13:02   ` Nikolay Aleksandrov
2026-03-02  5:40 ` [PATCH net-next v3 14/14] net: bridge: mcast: add inactive state assertions Linus Lüssing
2026-03-02 13:23   ` Nikolay Aleksandrov
2026-03-03 12:14 ` [PATCH net-next v3 00/14] net: bridge: reduce multicast checks in fast path Simon Horman
