Netdev List
 help / color / mirror / Atom feed
* [PATCH net-next] mlxsw: spectrum_dcb: Reject writes owned by an offloaded qdisc
@ 2026-05-18  5:56 Keno Fischer
  2026-05-19 15:14 ` Petr Machata
  0 siblings, 1 reply; 2+ messages in thread
From: Keno Fischer @ 2026-05-18  5:56 UTC (permalink / raw)
  To: netdev; +Cc: Ido Schimmel, Petr Machata, linux-kernel

Some bits of state in the mlxsw driver can be controlled by two
mechanisms. The first is the DCB netlink command set (RTM_SETDCB
and e.g. DCB_ATTR_IEEE_ETS/DCB_ATTR_IEEE_MAXRATE), reachable e.g. from
the `dcb` userspace utility. The second is via the qdisc framework
(reachable e.g. from the `tc` userspace utility). Because they touch
the same bit of state, it is currently possible for the ETS config
to get out of sync with what the qdisc expects. One hard-to-debug
consequence is that a config script becomes non-idempotent. E.g.,
consider

```
dcb ets set dev "${port}" prio-tc 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
tc qdisc replace dev "${port}" root handle 1: prio bands 8 \
	priomap 0 1 2 3 4 5 6 7
```

Here the user made a mistake and put the priomap in the wrong order.
The first time this code is run, the `qdisc` specification overwrites
the `prio-tc` specification. However, the second time this code is run,
the subsystem sees that the qdisc is identical and does nothing, causing
the `prio-tc` setting to stick and the final mapping to be different.
This can be very confusing, since the same config script creates two
different states.

I think there's two reasonable ways to fix this. One would be by
reflecting the ets state changes back into the qdisc, so that the
qdisc subsystem notices that it's been changed. However, I think it
is clearer to just forbid the non-qdisc API from trying to make
changes while the qdisc is offloaded.

This is similar to the existing check in mlxsw_sp_dcbnl_setbuffer,
which does essentially the same thing in reverse (buffers are only
explicitly managed when there is an offloaded qdisc, otherwise
they're managed explicitly).

I hope this saves the next person a few hours of debugging wondering
why their tc_no_buffer_discard_uc_tc_ counters are going up.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Keno Fischer <keno@juliahub.com>
---
 .../net/ethernet/mellanox/mlxsw/spectrum.h    |  2 +
 .../ethernet/mellanox/mlxsw/spectrum_dcb.c    | 10 +++++
 .../ethernet/mellanox/mlxsw/spectrum_qdisc.c  | 43 +++++++++++++++++++
 3 files changed, 55 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index b03ff9e044f9..ecf4525c42a4 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -1260,6 +1260,8 @@ int mlxsw_sp_setup_tc_block_qevent_early_drop(struct mlxsw_sp_port *mlxsw_sp_por
 					      struct flow_block_offload *f);
 int mlxsw_sp_setup_tc_block_qevent_mark(struct mlxsw_sp_port *mlxsw_sp_port,
 					struct flow_block_offload *f);
+bool mlxsw_sp_qdisc_has_prio_ets(struct mlxsw_sp_port *mlxsw_sp_port);
+bool mlxsw_sp_qdisc_has_tbf_subgroup(struct mlxsw_sp_port *mlxsw_sp_port);
 
 /* spectrum_fid.c */
 struct mlxsw_sp_fid *mlxsw_sp_fid_lookup_by_index(struct mlxsw_sp *mlxsw_sp,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_dcb.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_dcb.c
index f1ad937405a3..6e689d5de6fa 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_dcb.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_dcb.c
@@ -151,6 +151,11 @@ static int mlxsw_sp_dcbnl_ieee_setets(struct net_device *dev,
 	struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev);
 	int err;
 
+	if (mlxsw_sp_qdisc_has_prio_ets(mlxsw_sp_port)) {
+		netdev_err(dev, "ETS configuration is controlled by offloaded qdisc\n");
+		return -EBUSY;
+	}
+
 	err = mlxsw_sp_port_ets_validate(mlxsw_sp_port, ets);
 	if (err)
 		return err;
@@ -450,6 +455,11 @@ static int mlxsw_sp_dcbnl_ieee_setmaxrate(struct net_device *dev,
 	struct ieee_maxrate *my_maxrate = mlxsw_sp_port->dcb.maxrate;
 	int err, i;
 
+	if (mlxsw_sp_qdisc_has_tbf_subgroup(mlxsw_sp_port)) {
+		netdev_err(dev, "maxrate is controlled by offloaded qdisc\n");
+		return -EBUSY;
+	}
+
 	for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) {
 		err = mlxsw_sp_port_ets_maxrate_set(mlxsw_sp_port,
 						    MLXSW_REG_QEEC_HR_SUBGROUP,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
index 73519301b744..c2ce4f3f562e 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
@@ -2316,6 +2316,49 @@ int mlxsw_sp_setup_tc_block_qevent_mark(struct mlxsw_sp_port *mlxsw_sp_port,
 					      action_mask);
 }
 
+bool mlxsw_sp_qdisc_has_prio_ets(struct mlxsw_sp_port *mlxsw_sp_port)
+{
+	struct mlxsw_sp_qdisc_state *qdisc_state = mlxsw_sp_port->qdisc;
+	struct mlxsw_sp_qdisc *root_qdisc;
+	bool found;
+
+	mutex_lock(&qdisc_state->lock);
+	root_qdisc = &qdisc_state->root_qdisc;
+	found = root_qdisc->ops &&
+		    (root_qdisc->ops->type == MLXSW_SP_QDISC_PRIO ||
+		     root_qdisc->ops->type == MLXSW_SP_QDISC_ETS);
+	mutex_unlock(&qdisc_state->lock);
+
+	return found;
+}
+
+static struct mlxsw_sp_qdisc *
+mlxsw_sp_qdisc_walk_cb_find_subgroup_tbf(struct mlxsw_sp_qdisc *qdisc,
+					 void *data)
+{
+	struct mlxsw_sp_port *mlxsw_sp_port = (struct mlxsw_sp_port *)data;
+
+	if (qdisc->ops && qdisc->ops->type == MLXSW_SP_QDISC_TBF &&
+	    mlxsw_sp_qdisc_tbf_hr(mlxsw_sp_port, qdisc) ==
+	    MLXSW_REG_QEEC_HR_SUBGROUP)
+		return qdisc;
+	return NULL;
+}
+
+bool mlxsw_sp_qdisc_has_tbf_subgroup(struct mlxsw_sp_port *mlxsw_sp_port)
+{
+	struct mlxsw_sp_qdisc_state *qdisc_state = mlxsw_sp_port->qdisc;
+	bool found;
+
+	mutex_lock(&qdisc_state->lock);
+	found = mlxsw_sp_qdisc_walk(&qdisc_state->root_qdisc,
+				    mlxsw_sp_qdisc_walk_cb_find_subgroup_tbf,
+				    mlxsw_sp_port) != NULL;
+	mutex_unlock(&qdisc_state->lock);
+
+	return found;
+}
+
 int mlxsw_sp_tc_qdisc_init(struct mlxsw_sp_port *mlxsw_sp_port)
 {
 	struct mlxsw_sp_qdisc_state *qdisc_state;
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2026-05-19 16:03 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-18  5:56 [PATCH net-next] mlxsw: spectrum_dcb: Reject writes owned by an offloaded qdisc Keno Fischer
2026-05-19 15:14 ` Petr Machata

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox