From: Vladimir Oltean <olteanv@gmail.com>
To: Tobias Waldekranz <tobias@waldekranz.com>
Cc: davem@davemloft.net, kuba@kernel.org, andrew@lunn.ch,
vivien.didelot@gmail.com, f.fainelli@gmail.com,
j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net,
netdev@vger.kernel.org
Subject: Re: [PATCH v2 net-next 2/4] net: dsa: Link aggregation support
Date: Tue, 1 Dec 2020 03:37:06 +0200 [thread overview]
Message-ID: <20201201013706.6clgrx2tnapywgxf@skbuf> (raw)
In-Reply-To: <20201130140610.4018-3-tobias@waldekranz.com>
Hi Tobias,
On Mon, Nov 30, 2020 at 03:06:08PM +0100, Tobias Waldekranz wrote:
> Monitor the following events and notify the driver when:
>
> - A DSA port joins/leaves a LAG.
> - A LAG, made up of DSA ports, joins/leaves a bridge.
> - A DSA port in a LAG is enabled/disabled (enabled meaning
> "distributing" in 802.3ad LACP terms).
>
> Each LAG interface to which a DSA port is attached is represented by a
> `struct dsa_lag` which is globally reachable from the switch tree and
> from each associated port.
>
> When a LAG joins a bridge, the DSA subsystem will treat that as each
> individual port joining the bridge. The driver may look at the port's
> LAG pointer to see if it is associated with any LAG, if that is
> required. This is analogue to how switchdev events are replicated out
> to all lower devices when reaching e.g. a LAG.
>
> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
> ---
> include/net/dsa.h | 97 +++++++++++++++++++++++++++++++
> net/dsa/dsa2.c | 51 ++++++++++++++++
> net/dsa/dsa_priv.h | 31 ++++++++++
> net/dsa/port.c | 141 +++++++++++++++++++++++++++++++++++++++++++++
> net/dsa/slave.c | 83 ++++++++++++++++++++++++--
> net/dsa/switch.c | 49 ++++++++++++++++
> 6 files changed, 446 insertions(+), 6 deletions(-)
>
> +static inline struct dsa_lag *dsa_lag_by_id(struct dsa_switch_tree *dst, int id)
> +{
> + if (!test_bit(id, dst->lags.busy))
> + return NULL;
> +
> + return &dst->lags.pool[id];
> +}
> +
> +static inline struct net_device *dsa_lag_dev_by_id(struct dsa_switch_tree *dst,
> + int id)
> +{
> + struct dsa_lag *lag = dsa_lag_by_id(dst, id);
> +
> + return lag ? rcu_dereference(lag->dev) : NULL;
> +}
> +
> +static inline struct dsa_lag *dsa_lag_by_dev(struct dsa_switch_tree *dst,
> + struct net_device *dev)
> +{
> + struct dsa_lag *lag;
> + int id;
> +
> + dsa_lag_foreach(id, dst) {
> + lag = dsa_lag_by_id(dst, id);
> +
> + if (rtnl_dereference(lag->dev) == dev)
> + return lag;
> + }
> +
> + return NULL;
> +}
>
> diff --git a/net/dsa/port.c b/net/dsa/port.c
> index 73569c9af3cc..c2332ee5f5c7 100644
> --- a/net/dsa/port.c
> +++ b/net/dsa/port.c
> @@ -193,6 +193,147 @@ void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br)
> dsa_port_set_state_now(dp, BR_STATE_FORWARDING);
> }
>
> +static struct dsa_lag *dsa_lag_get(struct dsa_switch_tree *dst,
> + struct net_device *dev)
> +{
> + struct dsa_lag *lag;
> + int id;
> +
> + lag = dsa_lag_by_dev(dst, dev);
> + if (lag) {
> + kref_get(&lag->refcount);
> + return lag;
> + }
> +
> + id = find_first_zero_bit(dst->lags.busy, dst->lags.num);
> + if (id >= dst->lags.num) {
> + WARN(1, "No LAGs available");
> + return NULL;
> + }
> +
> + set_bit(id, dst->lags.busy);
> +
> + lag = &dst->lags.pool[id];
> + kref_init(&lag->refcount);
> + lag->id = id;
> + INIT_LIST_HEAD(&lag->ports);
> + INIT_LIST_HEAD(&lag->tx_ports);
> +
> + rcu_assign_pointer(lag->dev, dev);
> + return lag;
> +}
> +
> +static void dsa_lag_release(struct kref *refcount)
> +{
> + struct dsa_lag *lag = container_of(refcount, struct dsa_lag, refcount);
> +
> + rcu_assign_pointer(lag->dev, NULL);
> + synchronize_rcu();
> + memset(lag, 0, sizeof(*lag));
> +}
What difference does it make if lag->dev is set to NULL right away or
after a grace period? Squeezing one last packet from that bonding interface?
Pointer updates are atomic operations on all architectures that the
kernel supports, and, as long as you use WRITE_ONCE and READ_ONCE memory
barriers, there should be no reason for RCU protection that I can see.
And unlike typical uses of RCU, you do not free lag->dev, because you do
not own lag->dev. Instead, the bonding interface pointed to by lag->dev
is going to be freed (in case of a deletion using ip link) after an RCU
grace period anyway. And the receive data path is under an RCU read-side
critical section anyway. So even if you set lag->dev to NULL using
WRITE_ONCE, the existing in-flight readers from the RX data path that
had called dsa_lag_dev_by_id() will still hold a reference to a valid
bonding interface.
next prev parent reply other threads:[~2020-12-01 1:38 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-30 14:06 [PATCH v2 net-next 0/4] net: dsa: Link aggregation support Tobias Waldekranz
2020-11-30 14:06 ` [PATCH v2 net-next 1/4] net: bonding: Notify ports about their initial state Tobias Waldekranz
2020-11-30 14:06 ` [PATCH v2 net-next 2/4] net: dsa: Link aggregation support Tobias Waldekranz
2020-12-01 1:37 ` Vladimir Oltean [this message]
2020-12-01 8:13 ` Tobias Waldekranz
2020-12-01 13:29 ` Vladimir Oltean
2020-12-01 14:22 ` Tobias Waldekranz
2020-12-01 14:03 ` Vladimir Oltean
2020-12-01 14:29 ` Tobias Waldekranz
2020-12-01 20:04 ` Vladimir Oltean
2020-12-01 21:48 ` Tobias Waldekranz
2020-12-01 22:23 ` Vladimir Oltean
2020-11-30 14:06 ` [PATCH v2 net-next 3/4] net: dsa: mv88e6xxx: " Tobias Waldekranz
2020-11-30 14:06 ` [PATCH v2 net-next 4/4] net: dsa: tag_dsa: Support reception of packets from LAG devices Tobias Waldekranz
2020-12-01 21:24 ` Vladimir Oltean
2020-12-01 22:31 ` Tobias Waldekranz
2020-12-02 0:33 ` Vladimir Oltean
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201201013706.6clgrx2tnapywgxf@skbuf \
--to=olteanv@gmail.com \
--cc=andrew@lunn.ch \
--cc=andy@greyhouse.net \
--cc=davem@davemloft.net \
--cc=f.fainelli@gmail.com \
--cc=j.vosburgh@gmail.com \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=tobias@waldekranz.com \
--cc=vfalico@gmail.com \
--cc=vivien.didelot@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox