From: Jakub Kicinski <kuba@kernel.org>
To: razor@blackwall.org, idosch@nvidia.com
Cc: Jakub Kicinski <kuba@kernel.org>,
edumazet@google.com, davem@davemloft.net, pabeni@redhat.com,
horms@kernel.org, netdev@vger.kernel.org, eric.dumazet@gmail.com
Subject: Re: [PATCH v2 net-next 2/5] bridge: use atomic ops to read/change p->flags in br_netlink.c
Date: Thu, 11 Jun 2026 11:30:38 -0700 [thread overview]
Message-ID: <20260611183037.1948556-2-kuba@kernel.org> (raw)
In-Reply-To: <20260610161836.1275531-3-edumazet@google.com>
Ido, Nik, do we care about this?
---
bridge: use atomic ops to read/change p->flags in br_netlink.c
The change converts net/bridge/br_netlink.c reads of p->flags to
test_bit()/READ_ONCE() and rewrites br_setport() to compute a new
flags value in a local variable, then commit it with a cmpxchg()
retry loop.
> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
> index 7cb24de9c77d..12850f60bfcb 100644
> --- a/net/bridge/br_netlink.c
> +++ b/net/bridge/br_netlink.c
[ ... ]
> @@ -934,58 +934,64 @@ static int br_set_port_state(struct net_bridge_port *p, u8 state)
> }
>
> /* Set/clear or port flags based on attribute */
[ ... ]
> /* Process bridge protocol info on port */
> static int br_setport(struct net_bridge_port *p, struct nlattr *tb[],
> struct netlink_ext_ack *extack)
> {
> - unsigned long old_flags, changed_mask;
> + unsigned long old_flags, flags, changed_mask;
> bool br_vlan_tunnel_old;
> int err;
>
> - old_flags = p->flags;
> +retry:
> + flags = old_flags = READ_ONCE(p->flags);
> br_vlan_tunnel_old = (old_flags & BR_VLAN_TUNNEL) ? true : false;
[ ... ]
> + if (!(flags & BR_PORT_MAB) && (old_flags & BR_PORT_MAB)) {
> struct net_bridge_fdb_flush_desc desc = {
> .flags = BIT(BR_FDB_LOCKED),
> .flags_mask = BIT(BR_FDB_LOCKED),
> @@ -995,15 +1001,16 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[],
> br_fdb_flush(p->br, &desc);
> }
>
> - changed_mask = old_flags ^ p->flags;
> + changed_mask = old_flags ^ flags;
>
> - err = br_switchdev_set_port_flag(p, p->flags, changed_mask, extack);
> - if (err) {
> - p->flags = old_flags;
> + err = br_switchdev_set_port_flag(p, flags, changed_mask, extack);
> + if (err)
> return err;
> - }
>
> - if (br_vlan_tunnel_old && !(p->flags & BR_VLAN_TUNNEL))
> + if (cmpxchg(&p->flags, old_flags, flags) != old_flags)
> + goto retry;
> +
[High]
Are the side effects in this loop safe to re-execute on a cmpxchg retry?
The retry path now wraps br_fdb_flush() and br_switchdev_set_port_flag()
inside the loop body:
retry:
flags = old_flags = READ_ONCE(p->flags);
...
if (!(flags & BR_PORT_MAB) && (old_flags & BR_PORT_MAB))
br_fdb_flush(p->br, &desc);
changed_mask = old_flags ^ flags;
err = br_switchdev_set_port_flag(p, flags, changed_mask, extack);
if (err)
return err;
if (cmpxchg(&p->flags, old_flags, flags) != old_flags)
goto retry;
...
br_port_flags_change(p, changed_mask);
Can this retry actually be taken? br_setport() runs under RTNL, but
several other writers modify p->flags without RTNL via set_bit() and
clear_bit(), for example BR_MRP_LOST_CONT_BIT and BR_MRP_LOST_IN_CONT_BIT
toggled from work-queue and softirq context in net/bridge/br_mrp_netlink.c
(br_mrp_ring_port_open() and br_mrp_in_port_open(), called from
br_mrp_test_work_expired() and br_mrp_mrm_process()).
Because cmpxchg compares the entire word, a concurrent set_bit() or
clear_bit() on any unrelated bit (such as the MRP bits above) makes the
cmpxchg fail and trigger goto retry.
When that happens, what is the effect on switchdev?
err = br_switchdev_set_port_flag(p, flags, changed_mask, extack);
br_switchdev_set_port_flag() first calls call_switchdev_notifiers() with
SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS, then queues a SWITCHDEV_F_DEFER
attribute set for SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS:
attr.id = SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS;
attr.flags = SWITCHDEV_F_DEFER;
err = switchdev_port_attr_set(p->dev, &attr, extack);
The deferred update from a first iteration is already queued and cannot
be cancelled when the retry loop queues a second one with a different
(val, mask) pair. Drivers will see notifier callbacks for both
iterations.
For br_fdb_flush(), the BR_PORT_MAB transition
if (!(flags & BR_PORT_MAB) && (old_flags & BR_PORT_MAB))
br_fdb_flush(p->br, &desc);
can re-evaluate to true on a retry and flush FDB entries again.
Finally, br_port_flags_change() runs only once after the loop, with the
final iteration's changed_mask:
br_port_flags_change(p, changed_mask);
If a bit such as BR_AUTO_MASK or BR_NEIGH_SUPPRESS was set in an
earlier iteration's changed_mask but not in the last one (because a
concurrent writer altered an unrelated bit between iterations), would
the switchdev driver have already been told about that transition while
the bridge software-side recompute in br_port_flags_change() is skipped?
The original code performed each of these side effects exactly once per
br_setport() call. Should the side effects be moved outside the cmpxchg
loop so only the pure flags computation is retried?
> if (br_vlan_tunnel_old && !(flags & BR_VLAN_TUNNEL))
> nbp_vlan_tunnel_info_flush(p);
>
> br_port_flags_change(p, changed_mask);
--
pw-bot: cr
next prev parent reply other threads:[~2026-06-11 18:31 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-10 16:18 [PATCH v2 net-next 0/5] net: bridge: take care of p->flags accesses Eric Dumazet
2026-06-10 16:18 ` [PATCH v2 net-next 1/5] bridge: use atomic ops to read/change p->flags in sysfs Eric Dumazet
2026-06-11 4:46 ` Nikolay Aleksandrov
2026-06-11 18:11 ` Jakub Kicinski
2026-06-10 16:18 ` [PATCH v2 net-next 2/5] bridge: use atomic ops to read/change p->flags in br_netlink.c Eric Dumazet
2026-06-11 4:47 ` Nikolay Aleksandrov
2026-06-11 18:30 ` Jakub Kicinski [this message]
2026-06-11 20:01 ` Eric Dumazet
2026-06-10 16:18 ` [PATCH v2 net-next 3/5] net: bridge: use atomic ops to read/change p->flags (I) Eric Dumazet
2026-06-11 4:47 ` Nikolay Aleksandrov
2026-06-10 16:18 ` [PATCH v2 net-next 4/5] net: bridge: use atomic ops to read/change p->flags (II) Eric Dumazet
2026-06-11 4:47 ` Nikolay Aleksandrov
2026-06-10 16:18 ` [PATCH v2 net-next 5/5] net: bridge: use atomic ops to read/change p->flags (III) Eric Dumazet
2026-06-11 4:48 ` Nikolay Aleksandrov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260611183037.1948556-2-kuba@kernel.org \
--to=kuba@kernel.org \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=eric.dumazet@gmail.com \
--cc=horms@kernel.org \
--cc=idosch@nvidia.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=razor@blackwall.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.