From: Jakub Kicinski <kuba@kernel.org>
To: razor@blackwall.org, idosch@nvidia.com
Cc: Jakub Kicinski <kuba@kernel.org>,
edumazet@google.com, davem@davemloft.net, pabeni@redhat.com,
horms@kernel.org, netdev@vger.kernel.org, eric.dumazet@gmail.com
Subject: Re: [PATCH v2 net-next 2/5] bridge: use atomic ops to read/change p->flags in br_netlink.c
Date: Thu, 11 Jun 2026 11:30:38 -0700 [thread overview]
Message-ID: <20260611183037.1948556-2-kuba@kernel.org> (raw)
In-Reply-To: <20260610161836.1275531-3-edumazet@google.com>
Ido, Nik, do we care about this?
---
bridge: use atomic ops to read/change p->flags in br_netlink.c
The change converts net/bridge/br_netlink.c reads of p->flags to
test_bit()/READ_ONCE() and rewrites br_setport() to compute a new
flags value in a local variable, then commit it with a cmpxchg()
retry loop.
> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
> index 7cb24de9c77d..12850f60bfcb 100644
> --- a/net/bridge/br_netlink.c
> +++ b/net/bridge/br_netlink.c
[ ... ]
> @@ -934,58 +934,64 @@ static int br_set_port_state(struct net_bridge_port *p, u8 state)
> }
>
> /* Set/clear or port flags based on attribute */
[ ... ]
> /* Process bridge protocol info on port */
> static int br_setport(struct net_bridge_port *p, struct nlattr *tb[],
> struct netlink_ext_ack *extack)
> {
> - unsigned long old_flags, changed_mask;
> + unsigned long old_flags, flags, changed_mask;
> bool br_vlan_tunnel_old;
> int err;
>
> - old_flags = p->flags;
> +retry:
> + flags = old_flags = READ_ONCE(p->flags);
> br_vlan_tunnel_old = (old_flags & BR_VLAN_TUNNEL) ? true : false;
[ ... ]
> + if (!(flags & BR_PORT_MAB) && (old_flags & BR_PORT_MAB)) {
> struct net_bridge_fdb_flush_desc desc = {
> .flags = BIT(BR_FDB_LOCKED),
> .flags_mask = BIT(BR_FDB_LOCKED),
> @@ -995,15 +1001,16 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[],
> br_fdb_flush(p->br, &desc);
> }
>
> - changed_mask = old_flags ^ p->flags;
> + changed_mask = old_flags ^ flags;
>
> - err = br_switchdev_set_port_flag(p, p->flags, changed_mask, extack);
> - if (err) {
> - p->flags = old_flags;
> + err = br_switchdev_set_port_flag(p, flags, changed_mask, extack);
> + if (err)
> return err;
> - }
>
> - if (br_vlan_tunnel_old && !(p->flags & BR_VLAN_TUNNEL))
> + if (cmpxchg(&p->flags, old_flags, flags) != old_flags)
> + goto retry;
> +
[High]
Are the side effects in this loop safe to re-execute on a cmpxchg retry?
The retry path now wraps br_fdb_flush() and br_switchdev_set_port_flag()
inside the loop body:
retry:
flags = old_flags = READ_ONCE(p->flags);
...
if (!(flags & BR_PORT_MAB) && (old_flags & BR_PORT_MAB))
br_fdb_flush(p->br, &desc);
changed_mask = old_flags ^ flags;
err = br_switchdev_set_port_flag(p, flags, changed_mask, extack);
if (err)
return err;
if (cmpxchg(&p->flags, old_flags, flags) != old_flags)
goto retry;
...
br_port_flags_change(p, changed_mask);
Can this retry actually be taken? br_setport() runs under RTNL, but
several other writers modify p->flags without RTNL via set_bit() and
clear_bit(), for example BR_MRP_LOST_CONT_BIT and BR_MRP_LOST_IN_CONT_BIT
toggled from work-queue and softirq context in net/bridge/br_mrp_netlink.c
(br_mrp_ring_port_open() and br_mrp_in_port_open(), called from
br_mrp_test_work_expired() and br_mrp_mrm_process()).
Because cmpxchg compares the entire word, a concurrent set_bit() or
clear_bit() on any unrelated bit (such as the MRP bits above) makes the
cmpxchg fail and trigger goto retry.
When that happens, what is the effect on switchdev?
err = br_switchdev_set_port_flag(p, flags, changed_mask, extack);
br_switchdev_set_port_flag() first calls call_switchdev_notifiers() with
SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS, then queues a SWITCHDEV_F_DEFER
attribute set for SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS:
attr.id = SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS;
attr.flags = SWITCHDEV_F_DEFER;
err = switchdev_port_attr_set(p->dev, &attr, extack);
The deferred update from a first iteration is already queued and cannot
be cancelled when the retry loop queues a second one with a different
(val, mask) pair. Drivers will see notifier callbacks for both
iterations.
For br_fdb_flush(), the BR_PORT_MAB transition
if (!(flags & BR_PORT_MAB) && (old_flags & BR_PORT_MAB))
br_fdb_flush(p->br, &desc);
can re-evaluate to true on a retry and flush FDB entries again.
Finally, br_port_flags_change() runs only once after the loop, with the
final iteration's changed_mask:
br_port_flags_change(p, changed_mask);
If a bit such as BR_AUTO_MASK or BR_NEIGH_SUPPRESS was set in an
earlier iteration's changed_mask but not in the last one (because a
concurrent writer altered an unrelated bit between iterations), would
the switchdev driver have already been told about that transition while
the bridge software-side recompute in br_port_flags_change() is skipped?
The original code performed each of these side effects exactly once per
br_setport() call. Should the side effects be moved outside the cmpxchg
loop so only the pure flags computation is retried?
> if (br_vlan_tunnel_old && !(flags & BR_VLAN_TUNNEL))
> nbp_vlan_tunnel_info_flush(p);
>
> br_port_flags_change(p, changed_mask);
--
pw-bot: cr
next prev parent reply other threads:[~2026-06-11 18:31 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-10 16:18 [PATCH v2 net-next 0/5] net: bridge: take care of p->flags accesses Eric Dumazet
2026-06-10 16:18 ` [PATCH v2 net-next 1/5] bridge: use atomic ops to read/change p->flags in sysfs Eric Dumazet
2026-06-11 4:46 ` Nikolay Aleksandrov
2026-06-11 18:11 ` Jakub Kicinski
2026-06-10 16:18 ` [PATCH v2 net-next 2/5] bridge: use atomic ops to read/change p->flags in br_netlink.c Eric Dumazet
2026-06-11 4:47 ` Nikolay Aleksandrov
2026-06-11 18:30 ` Jakub Kicinski [this message]
2026-06-11 20:01 ` Eric Dumazet
2026-06-10 16:18 ` [PATCH v2 net-next 3/5] net: bridge: use atomic ops to read/change p->flags (I) Eric Dumazet
2026-06-11 4:47 ` Nikolay Aleksandrov
2026-06-10 16:18 ` [PATCH v2 net-next 4/5] net: bridge: use atomic ops to read/change p->flags (II) Eric Dumazet
2026-06-11 4:47 ` Nikolay Aleksandrov
2026-06-10 16:18 ` [PATCH v2 net-next 5/5] net: bridge: use atomic ops to read/change p->flags (III) Eric Dumazet
2026-06-11 4:48 ` Nikolay Aleksandrov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260611183037.1948556-2-kuba@kernel.org \
--to=kuba@kernel.org \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=eric.dumazet@gmail.com \
--cc=horms@kernel.org \
--cc=idosch@nvidia.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=razor@blackwall.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox