Netdev List
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: razor@blackwall.org, idosch@nvidia.com
Cc: Jakub Kicinski <kuba@kernel.org>,
	edumazet@google.com, davem@davemloft.net, pabeni@redhat.com,
	horms@kernel.org, netdev@vger.kernel.org, eric.dumazet@gmail.com
Subject: Re: [PATCH v2 net-next 2/5] bridge: use atomic ops to read/change p->flags in br_netlink.c
Date: Thu, 11 Jun 2026 11:30:38 -0700	[thread overview]
Message-ID: <20260611183037.1948556-2-kuba@kernel.org> (raw)
In-Reply-To: <20260610161836.1275531-3-edumazet@google.com>

Ido, Nik, do we care about this?

---
bridge: use atomic ops to read/change p->flags in br_netlink.c

The change converts net/bridge/br_netlink.c reads of p->flags to
test_bit()/READ_ONCE() and rewrites br_setport() to compute a new
flags value in a local variable, then commit it with a cmpxchg()
retry loop.

> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
> index 7cb24de9c77d..12850f60bfcb 100644
> --- a/net/bridge/br_netlink.c
> +++ b/net/bridge/br_netlink.c

[ ... ]

> @@ -934,58 +934,64 @@ static int br_set_port_state(struct net_bridge_port *p, u8 state)
>  }
>  
>  /* Set/clear or port flags based on attribute */

[ ... ]

>  /* Process bridge protocol info on port */
>  static int br_setport(struct net_bridge_port *p, struct nlattr *tb[],
>  		      struct netlink_ext_ack *extack)
>  {
> -	unsigned long old_flags, changed_mask;
> +	unsigned long old_flags, flags, changed_mask;
>  	bool br_vlan_tunnel_old;
>  	int err;
>  
> -	old_flags = p->flags;
> +retry:
> +	flags = old_flags = READ_ONCE(p->flags);
>  	br_vlan_tunnel_old = (old_flags & BR_VLAN_TUNNEL) ? true : false;

[ ... ]

> +	if (!(flags & BR_PORT_MAB) && (old_flags & BR_PORT_MAB)) {
>  		struct net_bridge_fdb_flush_desc desc = {
>  			.flags = BIT(BR_FDB_LOCKED),
>  			.flags_mask = BIT(BR_FDB_LOCKED),
> @@ -995,15 +1001,16 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[],
>  		br_fdb_flush(p->br, &desc);
>  	}
>  
> -	changed_mask = old_flags ^ p->flags;
> +	changed_mask = old_flags ^ flags;
>  
> -	err = br_switchdev_set_port_flag(p, p->flags, changed_mask, extack);
> -	if (err) {
> -		p->flags = old_flags;
> +	err = br_switchdev_set_port_flag(p, flags, changed_mask, extack);
> +	if (err)
>  		return err;
> -	}
>  
> -	if (br_vlan_tunnel_old && !(p->flags & BR_VLAN_TUNNEL))
> +	if (cmpxchg(&p->flags, old_flags, flags) != old_flags)
> +		goto retry;
> +

[High]

Are the side effects in this loop safe to re-execute on a cmpxchg retry?

The retry path now wraps br_fdb_flush() and br_switchdev_set_port_flag()
inside the loop body:

	retry:
		flags = old_flags = READ_ONCE(p->flags);
		...
		if (!(flags & BR_PORT_MAB) && (old_flags & BR_PORT_MAB))
			br_fdb_flush(p->br, &desc);
		changed_mask = old_flags ^ flags;
		err = br_switchdev_set_port_flag(p, flags, changed_mask, extack);
		if (err)
			return err;
		if (cmpxchg(&p->flags, old_flags, flags) != old_flags)
			goto retry;
		...
		br_port_flags_change(p, changed_mask);

Can this retry actually be taken?  br_setport() runs under RTNL, but
several other writers modify p->flags without RTNL via set_bit() and
clear_bit(), for example BR_MRP_LOST_CONT_BIT and BR_MRP_LOST_IN_CONT_BIT
toggled from work-queue and softirq context in net/bridge/br_mrp_netlink.c
(br_mrp_ring_port_open() and br_mrp_in_port_open(), called from
br_mrp_test_work_expired() and br_mrp_mrm_process()).

Because cmpxchg compares the entire word, a concurrent set_bit() or
clear_bit() on any unrelated bit (such as the MRP bits above) makes the
cmpxchg fail and trigger goto retry.

When that happens, what is the effect on switchdev?

	err = br_switchdev_set_port_flag(p, flags, changed_mask, extack);

br_switchdev_set_port_flag() first calls call_switchdev_notifiers() with
SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS, then queues a SWITCHDEV_F_DEFER
attribute set for SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS:

	attr.id = SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS;
	attr.flags = SWITCHDEV_F_DEFER;
	err = switchdev_port_attr_set(p->dev, &attr, extack);

The deferred update from a first iteration is already queued and cannot
be cancelled when the retry loop queues a second one with a different
(val, mask) pair.  Drivers will see notifier callbacks for both
iterations.

For br_fdb_flush(), the BR_PORT_MAB transition

	if (!(flags & BR_PORT_MAB) && (old_flags & BR_PORT_MAB))
		br_fdb_flush(p->br, &desc);

can re-evaluate to true on a retry and flush FDB entries again.

Finally, br_port_flags_change() runs only once after the loop, with the
final iteration's changed_mask:

	br_port_flags_change(p, changed_mask);

If a bit such as BR_AUTO_MASK or BR_NEIGH_SUPPRESS was set in an
earlier iteration's changed_mask but not in the last one (because a
concurrent writer altered an unrelated bit between iterations), would
the switchdev driver have already been told about that transition while
the bridge software-side recompute in br_port_flags_change() is skipped?

The original code performed each of these side effects exactly once per
br_setport() call.  Should the side effects be moved outside the cmpxchg
loop so only the pure flags computation is retried?

>  	if (br_vlan_tunnel_old && !(flags & BR_VLAN_TUNNEL))
>  		nbp_vlan_tunnel_info_flush(p);
>  
>  	br_port_flags_change(p, changed_mask);
-- 
pw-bot: cr

  parent reply	other threads:[~2026-06-11 18:31 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-10 16:18 [PATCH v2 net-next 0/5] net: bridge: take care of p->flags accesses Eric Dumazet
2026-06-10 16:18 ` [PATCH v2 net-next 1/5] bridge: use atomic ops to read/change p->flags in sysfs Eric Dumazet
2026-06-11  4:46   ` Nikolay Aleksandrov
2026-06-11 18:11     ` Jakub Kicinski
2026-06-10 16:18 ` [PATCH v2 net-next 2/5] bridge: use atomic ops to read/change p->flags in br_netlink.c Eric Dumazet
2026-06-11  4:47   ` Nikolay Aleksandrov
2026-06-11 18:30   ` Jakub Kicinski [this message]
2026-06-11 20:01     ` Eric Dumazet
2026-06-10 16:18 ` [PATCH v2 net-next 3/5] net: bridge: use atomic ops to read/change p->flags (I) Eric Dumazet
2026-06-11  4:47   ` Nikolay Aleksandrov
2026-06-10 16:18 ` [PATCH v2 net-next 4/5] net: bridge: use atomic ops to read/change p->flags (II) Eric Dumazet
2026-06-11  4:47   ` Nikolay Aleksandrov
2026-06-10 16:18 ` [PATCH v2 net-next 5/5] net: bridge: use atomic ops to read/change p->flags (III) Eric Dumazet
2026-06-11  4:48   ` Nikolay Aleksandrov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260611183037.1948556-2-kuba@kernel.org \
    --to=kuba@kernel.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=eric.dumazet@gmail.com \
    --cc=horms@kernel.org \
    --cc=idosch@nvidia.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=razor@blackwall.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox