* [PATCH net-next v2 0/6] net: dsa: mv88e6xxx: MQPRIO and 802.1Qat support
@ 2026-06-02 0:43 Luke Howard
2026-06-02 0:43 ` [PATCH net-next v2 1/6] net: bridge: mdb: add MDB_FLAGS_STREAM_RESERVED flag Luke Howard
` (5 more replies)
0 siblings, 6 replies; 30+ messages in thread
From: Luke Howard @ 2026-06-02 0:43 UTC
To: Jiri Pirko, Ivan Vecera, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Nikolay Aleksandrov,
Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan, Andrew Lunn,
Vladimir Oltean
Cc: netdev, linux-kernel, bridge, linux-kselftest, Max Hunter,
Kieran Tyrrell, Luke Howard
This patch series introduces the hooks necessary for a user space
implementation of the 802.1Qat Stream Reservation Protocol (SRP) to
enforce admission control of reserved streams using specific frame
priorities and destination MAC addresses. This is typically combined
with a traffic shaper such as the Credit Based Shaper (CBS) so that
bandwidth can be reserved for AVB/TSN streams.
The series adds a new MDB flag, MDB_FLAGS_STREAM_RESERVED, that marks a
multicast destination address as belonging to a "reserved" stream.
Ingress ports with the new BR_FILTER_STREAM_RESERVED flag set will
drop any frame whose 802.1p priority maps (via MQPRIO/TAPRIO) to
a non-zero traffic class and whose destination address lacks an
MDB entry with MDB_FLAGS_STREAM_RESERVED set.
802.1Qat admission control is implemented for both the software
bridge and the Marvell MV88E6XXX switch chips with AVB support.
New selftests cover the software bridge.
Whilst 802.1Qat does permit the use of SRP for unicast destination
addresses, this is relatively uncommon in practice and is not
supported by this patch series.
In order to fully support 802.1Qat with hardware offloading on the
MV88E6XXX switch chips, we added support for MQPRIO to those chips
which have more than one transmit queue, with the following modes:
* AVB mode (MQPRIO DCB mode), which enables AVB features such as
BR_FILTER_STREAM_RESERVED on supporting switch chips; traffic classes map
to AVB classes, with TC0 representing non-AVB traffic. This is the
only MQPRIO mode that supports 802.1Qat stream admission, but it can
also be used for general traffic classification by leaving
BR_FILTER_STREAM_RESERVED unset on the switch ports. However, it is limited
to three traffic classes, and the configuration is shared by all ports
(even on the 88E6390 family of chips).
* Channel mode (MQPRIO channel mode), which does not enable 802.1Qat/AVB
features but supports a more flexible priority to queue mapping,
particularly on the 88E6390 family of switches where the mapping can be
configured per port, rather than per chip. In this mode, traffic classes
map directly to switch QPris (i.e. queues).
Changes since v1:
- dropped CBS implementation (this is provided separately by Cedric
Jehasse's patch series [1], which this series requires for its
definitions of num_tx_queues and qav_info)
- added MQPRIO channel support, as well as per-port queue configuration
for the 6390 family
- admission control is configured using a bridge port flag rather than
a device tree entry
- software bridge support for admission control
- Link to v1: https://lore.kernel.org/all/cover.1779841530.git.lukeh@padl.com/
[1] https://lore.kernel.org/all/20260528-net-next-mv88e6xxx-cbs-v4-0-8bd13b906457@luminex.be/
Signed-off-by: Luke Howard <lukeh@padl.com>
---
Luke Howard (6):
net: bridge: mdb: add MDB_FLAGS_STREAM_RESERVED flag
net: bridge: convert mdb_entry host_joined to a flags field
net: bridge: add 802.1Qat stream reservation admission control
net: bridge: allow MDB_FLAGS_STREAM_RESERVED on host groups
net: dsa: mv88e6xxx: MQPRIO support
net: dsa: mv88e6xxx: honour MDB_FLAGS_STREAM_RESERVED for AVB streams
drivers/net/dsa/mv88e6xxx/Makefile | 3 +-
drivers/net/dsa/mv88e6xxx/avb.c | 221 +++++++
drivers/net/dsa/mv88e6xxx/avb.h | 79 +++
drivers/net/dsa/mv88e6xxx/chip.c | 507 +++++++++++++++-
drivers/net/dsa/mv88e6xxx/chip.h | 68 ++-
drivers/net/dsa/mv88e6xxx/global1.c | 28 +-
drivers/net/dsa/mv88e6xxx/global1.h | 6 +-
drivers/net/dsa/mv88e6xxx/global1_atu.c | 17 +
drivers/net/dsa/mv88e6xxx/global2.h | 2 +
drivers/net/dsa/mv88e6xxx/global2_avb.c | 121 ++++
drivers/net/dsa/mv88e6xxx/port.c | 18 +
drivers/net/dsa/mv88e6xxx/port.h | 2 +
include/linux/if_bridge.h | 1 +
include/net/switchdev.h | 4 +
include/uapi/linux/if_bridge.h | 7 +
include/uapi/linux/if_link.h | 9 +
net/bridge/Kconfig | 22 +
net/bridge/br_input.c | 61 +-
net/bridge/br_mdb.c | 103 ++--
net/bridge/br_multicast.c | 49 +-
net/bridge/br_netlink.c | 8 +-
net/bridge/br_private.h | 22 +-
net/bridge/br_switchdev.c | 22 +-
net/core/rtnetlink.c | 2 +-
tools/testing/selftests/net/forwarding/Makefile | 1 +
.../net/forwarding/bridge_mdb_stream_reserved.sh | 653 +++++++++++++++++++++
tools/testing/selftests/net/forwarding/config | 2 +
27 files changed, 1933 insertions(+), 105 deletions(-)
---
base-commit: 0906c117f81c2ae6e6dbfa82719f79c75e1c9325
change-id: 20260602-mv88e6xxx-8021qat-mqprio-46fc466d70e1
prerequisite-change-id: 20260430-net-next-mv88e6xxx-cbs-2121169caa68:v4
prerequisite-patch-id: 8ad59c43368d4639e0cabcc59a7f6e487560d3f7
prerequisite-patch-id: 90cce4d7dadbead4f10cdd0493129b88abf0be75
Best regards,
--
Luke Howard <lukeh@padl.com>
* [PATCH net-next v2 1/6] net: bridge: mdb: add MDB_FLAGS_STREAM_RESERVED flag
2026-06-02 0:43 [PATCH net-next v2 0/6] net: dsa: mv88e6xxx: MQPRIO and 802.1Qat support Luke Howard
@ 2026-06-02 0:43 ` Luke Howard
2026-06-02 0:43 ` [PATCH net-next v2 2/6] net: bridge: convert mdb_entry host_joined to a flags field Luke Howard
` (4 subsequent siblings)
5 siblings, 0 replies; 30+ messages in thread
From: Luke Howard @ 2026-06-02 0:43 UTC
To: Jiri Pirko, Ivan Vecera, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Nikolay Aleksandrov,
Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan, Andrew Lunn,
Vladimir Oltean
Cc: netdev, linux-kernel, bridge, linux-kselftest, Max Hunter,
Kieran Tyrrell, Luke Howard
Add a new MDB entry flag, MDB_FLAGS_STREAM_RESERVED, that user space
can set on RTM_NEWMDB to mark a multicast destination as belonging
to a reserved stream (typically, one managed by the IEEE 802.1Q Stream
Reservation Protocol).
The flag is settable via the new nested attribute MDBE_ATTR_FLAGS, an
NLA_U32 bitmask whose accepted bits are presently restricted to
MDB_FLAGS_STREAM_RESERVED by NLA_POLICY_MASK(). As with the other
per-port group attributes, it is rejected for host groups.
The flag is stored on the port group and propagated through switchdev
so it is visible to drivers.
MDB entries with this flag would typically be managed by a user space SRP
service, which would also be responsible for configuring a traffic shaper
on the egress port.
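The checks described above can be sketched in plain user-space C. This is
only an illustration, not the kernel code: struct mdb_request and
mdb_request_to_pg_flags() are hypothetical names, and the value used for
MDB_PG_FLAGS_PERMANENT is assumed because this patch does not show it. The
two STREAM_RESERVED bit values are taken from the diff below.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values as they appear in this patch. */
#define MDB_FLAGS_STREAM_RESERVED    (1u << 5) /* uAPI bit */
#define MDB_PG_FLAGS_STREAM_RESERVED (1u << 6) /* internal port-group bit */
/* Assumed value, for illustration only; not shown in this patch. */
#define MDB_PG_FLAGS_PERMANENT       (1u << 0)

/* Hypothetical summary of an RTM_NEWMDB request after attribute parsing. */
struct mdb_request {
	bool has_port;       /* false for a host group */
	bool permanent;      /* entry state is MDB_PERMANENT */
	bool has_flags_attr; /* MDBE_ATTR_FLAGS was supplied */
	uint32_t attr_flags; /* value carried by MDBE_ATTR_FLAGS */
};

/* Translate a request into port-group flags; 0 on success, -EINVAL if not. */
int mdb_request_to_pg_flags(const struct mdb_request *req, uint8_t *pg_flags)
{
	uint8_t flags = 0;

	if (req->permanent)
		flags |= MDB_PG_FLAGS_PERMANENT;

	if (req->has_flags_attr) {
		/* Same effect as the NLA_POLICY_MASK() policy: no unknown bits. */
		if (req->attr_flags & ~MDB_FLAGS_STREAM_RESERVED)
			return -EINVAL;
		/* In this patch the attribute is rejected for host groups. */
		if (!req->has_port)
			return -EINVAL;
		if (req->attr_flags & MDB_FLAGS_STREAM_RESERVED)
			flags |= MDB_PG_FLAGS_STREAM_RESERVED;
	}

	/* Stream-reserved entries must be permanent. */
	if ((flags & MDB_PG_FLAGS_STREAM_RESERVED) && !req->permanent)
		return -EINVAL;

	*pg_flags = flags;
	return 0;
}
```

Note that the uAPI bit and the internal port-group bit sit at different
positions, so the value is translated rather than copied.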
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Luke Howard <lukeh@padl.com>
---
include/net/switchdev.h | 4 +++
include/uapi/linux/if_bridge.h | 6 ++++
net/bridge/br_mdb.c | 74 ++++++++++++++++++++++++------------------
net/bridge/br_private.h | 2 ++
net/bridge/br_switchdev.c | 17 ++++++----
5 files changed, 65 insertions(+), 38 deletions(-)
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index ee500706496b0..03d176708b768 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -111,10 +111,14 @@ struct switchdev_obj_port_vlan {
container_of((OBJ), struct switchdev_obj_port_vlan, obj)
/* SWITCHDEV_OBJ_ID_PORT_MDB */
+
+#define SWITCHDEV_MDB_F_STREAM_RESERVED BIT(0)
+
struct switchdev_obj_port_mdb {
struct switchdev_obj obj;
unsigned char addr[ETH_ALEN];
u16 vid;
+ u32 flags;
};
#define SWITCHDEV_OBJ_PORT_MDB(OBJ) \
diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
index 21a700c02ef76..01955a575528c 100644
--- a/include/uapi/linux/if_bridge.h
+++ b/include/uapi/linux/if_bridge.h
@@ -705,6 +705,7 @@ struct br_mdb_entry {
#define MDB_FLAGS_STAR_EXCL (1 << 2)
#define MDB_FLAGS_BLOCKED (1 << 3)
#define MDB_FLAGS_OFFLOAD_FAILED (1 << 4)
+#define MDB_FLAGS_STREAM_RESERVED (1 << 5)
__u8 flags;
__u16 vid;
struct {
@@ -746,6 +747,10 @@ enum {
/* [MDBA_SET_ENTRY_ATTRS] = {
* [MDBE_ATTR_xxx]
* ...
+ * [MDBE_ATTR_FLAGS]
+ * u32, a mask of MDB_FLAGS_* values to set on the entry. Valid only
+ * for port-group entries; currently only MDB_FLAGS_STREAM_RESERVED
+ * may be set from user space.
* }
*/
enum {
@@ -760,6 +765,7 @@ enum {
MDBE_ATTR_IFINDEX,
MDBE_ATTR_SRC_VNI,
MDBE_ATTR_STATE_MASK,
+ MDBE_ATTR_FLAGS,
__MDBE_ATTR_MAX,
};
#define MDBE_ATTR_MAX (__MDBE_ATTR_MAX - 1)
diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index e0c7020b12f5f..3ddfbd536edb4 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -146,6 +146,8 @@ static void __mdb_entry_fill_flags(struct br_mdb_entry *e, unsigned char flags)
e->flags |= MDB_FLAGS_BLOCKED;
if (flags & MDB_PG_FLAGS_OFFLOAD_FAILED)
e->flags |= MDB_FLAGS_OFFLOAD_FAILED;
+ if (flags & MDB_PG_FLAGS_STREAM_RESERVED)
+ e->flags |= MDB_FLAGS_STREAM_RESERVED;
}
static void __mdb_entry_to_br_ip(struct br_mdb_entry *entry, struct br_ip *ip,
@@ -664,6 +666,7 @@ static const struct nla_policy br_mdbe_attrs_pol[MDBE_ATTR_MAX + 1] = {
MCAST_INCLUDE),
[MDBE_ATTR_SRC_LIST] = NLA_POLICY_NESTED(br_mdbe_src_list_pol),
[MDBE_ATTR_RTPROT] = NLA_POLICY_MIN(NLA_U8, RTPROT_STATIC),
+ [MDBE_ATTR_FLAGS] = NLA_POLICY_MASK(NLA_U32, MDB_FLAGS_STREAM_RESERVED),
};
static bool is_valid_mdb_source(struct nlattr *attr, __be16 proto,
@@ -739,14 +742,13 @@ __br_mdb_choose_context(struct net_bridge *br,
static int br_mdb_replace_group_sg(const struct br_mdb_config *cfg,
struct net_bridge_mdb_entry *mp,
struct net_bridge_port_group *pg,
- struct net_bridge_mcast *brmctx,
- unsigned char flags)
+ struct net_bridge_mcast *brmctx)
{
unsigned long now = jiffies;
- pg->flags = flags;
+ pg->flags = cfg->pg_flags;
pg->rt_protocol = cfg->rt_protocol;
- if (!(flags & MDB_PG_FLAGS_PERMANENT) && !cfg->src_entry)
+ if (!(cfg->pg_flags & MDB_PG_FLAGS_PERMANENT) && !cfg->src_entry)
mod_timer(&pg->timer,
now + brmctx->multicast_membership_interval);
else
@@ -760,7 +762,6 @@ static int br_mdb_replace_group_sg(const struct br_mdb_config *cfg,
static int br_mdb_add_group_sg(const struct br_mdb_config *cfg,
struct net_bridge_mdb_entry *mp,
struct net_bridge_mcast *brmctx,
- unsigned char flags,
struct netlink_ext_ack *extack)
{
struct net_bridge_port_group __rcu **pp;
@@ -775,20 +776,19 @@ static int br_mdb_add_group_sg(const struct br_mdb_config *cfg,
NL_SET_ERR_MSG_MOD(extack, "(S, G) group is already joined by port");
return -EEXIST;
}
- return br_mdb_replace_group_sg(cfg, mp, p, brmctx,
- flags);
+ return br_mdb_replace_group_sg(cfg, mp, p, brmctx);
}
if ((unsigned long)p->key.port < (unsigned long)cfg->p)
break;
}
- p = br_multicast_new_port_group(cfg->p, &cfg->group, *pp, flags, NULL,
- MCAST_INCLUDE, cfg->rt_protocol, extack);
+ p = br_multicast_new_port_group(cfg->p, &cfg->group, *pp, cfg->pg_flags,
+ NULL, MCAST_INCLUDE, cfg->rt_protocol, extack);
if (unlikely(!p))
return -ENOMEM;
rcu_assign_pointer(*pp, p);
- if (!(flags & MDB_PG_FLAGS_PERMANENT) && !cfg->src_entry)
+ if (!(cfg->pg_flags & MDB_PG_FLAGS_PERMANENT) && !cfg->src_entry)
mod_timer(&p->timer,
now + brmctx->multicast_membership_interval);
br_mdb_notify(cfg->br->dev, mp, p, RTM_NEWMDB);
@@ -818,7 +818,6 @@ static int br_mdb_add_group_src_fwd(const struct br_mdb_config *cfg,
struct net_bridge_mdb_entry *sgmp;
struct br_mdb_config sg_cfg;
struct br_ip sg_ip;
- u8 flags = 0;
sg_ip = cfg->group;
sg_ip.src = src_ip->src;
@@ -828,12 +827,8 @@ static int br_mdb_add_group_src_fwd(const struct br_mdb_config *cfg,
return PTR_ERR(sgmp);
}
- if (cfg->entry->state == MDB_PERMANENT)
- flags |= MDB_PG_FLAGS_PERMANENT;
- if (cfg->filter_mode == MCAST_EXCLUDE)
- flags |= MDB_PG_FLAGS_BLOCKED;
-
memset(&sg_cfg, 0, sizeof(sg_cfg));
+
sg_cfg.br = cfg->br;
sg_cfg.p = cfg->p;
sg_cfg.entry = cfg->entry;
@@ -842,7 +837,11 @@ static int br_mdb_add_group_src_fwd(const struct br_mdb_config *cfg,
sg_cfg.filter_mode = MCAST_INCLUDE;
sg_cfg.rt_protocol = cfg->rt_protocol;
sg_cfg.nlflags = cfg->nlflags;
- return br_mdb_add_group_sg(&sg_cfg, sgmp, brmctx, flags, extack);
+ sg_cfg.pg_flags = cfg->pg_flags;
+ if (cfg->filter_mode == MCAST_EXCLUDE)
+ sg_cfg.pg_flags |= MDB_PG_FLAGS_BLOCKED;
+
+ return br_mdb_add_group_sg(&sg_cfg, sgmp, brmctx, extack);
}
static int br_mdb_add_group_src(const struct br_mdb_config *cfg,
@@ -953,7 +952,6 @@ static int br_mdb_replace_group_star_g(const struct br_mdb_config *cfg,
struct net_bridge_mdb_entry *mp,
struct net_bridge_port_group *pg,
struct net_bridge_mcast *brmctx,
- unsigned char flags,
struct netlink_ext_ack *extack)
{
unsigned long now = jiffies;
@@ -963,10 +961,10 @@ static int br_mdb_replace_group_star_g(const struct br_mdb_config *cfg,
if (err)
return err;
- pg->flags = flags;
+ pg->flags = cfg->pg_flags;
pg->filter_mode = cfg->filter_mode;
pg->rt_protocol = cfg->rt_protocol;
- if (!(flags & MDB_PG_FLAGS_PERMANENT) &&
+ if (!(cfg->pg_flags & MDB_PG_FLAGS_PERMANENT) &&
cfg->filter_mode == MCAST_EXCLUDE)
mod_timer(&pg->timer,
now + brmctx->multicast_membership_interval);
@@ -984,7 +982,6 @@ static int br_mdb_replace_group_star_g(const struct br_mdb_config *cfg,
static int br_mdb_add_group_star_g(const struct br_mdb_config *cfg,
struct net_bridge_mdb_entry *mp,
struct net_bridge_mcast *brmctx,
- unsigned char flags,
struct netlink_ext_ack *extack)
{
struct net_bridge_port_group __rcu **pp;
@@ -1000,15 +997,14 @@ static int br_mdb_add_group_star_g(const struct br_mdb_config *cfg,
NL_SET_ERR_MSG_MOD(extack, "(*, G) group is already joined by port");
return -EEXIST;
}
- return br_mdb_replace_group_star_g(cfg, mp, p, brmctx,
- flags, extack);
+ return br_mdb_replace_group_star_g(cfg, mp, p, brmctx, extack);
}
if ((unsigned long)p->key.port < (unsigned long)cfg->p)
break;
}
- p = br_multicast_new_port_group(cfg->p, &cfg->group, *pp, flags, NULL,
- cfg->filter_mode, cfg->rt_protocol,
+ p = br_multicast_new_port_group(cfg->p, &cfg->group, *pp, cfg->pg_flags,
+ NULL, cfg->filter_mode, cfg->rt_protocol,
extack);
if (unlikely(!p))
return -ENOMEM;
@@ -1018,7 +1014,7 @@ static int br_mdb_add_group_star_g(const struct br_mdb_config *cfg,
goto err_del_port_group;
rcu_assign_pointer(*pp, p);
- if (!(flags & MDB_PG_FLAGS_PERMANENT) &&
+ if (!(cfg->pg_flags & MDB_PG_FLAGS_PERMANENT) &&
cfg->filter_mode == MCAST_EXCLUDE)
mod_timer(&p->timer,
now + brmctx->multicast_membership_interval);
@@ -1046,7 +1042,6 @@ static int br_mdb_add_group(const struct br_mdb_config *cfg,
struct net_bridge *br = cfg->br;
struct net_bridge_mcast *brmctx;
struct br_ip group = cfg->group;
- unsigned char flags = 0;
brmctx = __br_mdb_choose_context(br, entry, extack);
if (!brmctx)
@@ -1069,13 +1064,10 @@ static int br_mdb_add_group(const struct br_mdb_config *cfg,
return 0;
}
- if (entry->state == MDB_PERMANENT)
- flags |= MDB_PG_FLAGS_PERMANENT;
-
if (br_multicast_is_star_g(&group))
- return br_mdb_add_group_star_g(cfg, mp, brmctx, flags, extack);
+ return br_mdb_add_group_star_g(cfg, mp, brmctx, extack);
else
- return br_mdb_add_group_sg(cfg, mp, brmctx, flags, extack);
+ return br_mdb_add_group_sg(cfg, mp, brmctx, extack);
}
static int __br_mdb_add(const struct br_mdb_config *cfg,
@@ -1225,6 +1217,15 @@ static int br_mdb_config_attrs_init(struct nlattr *set_attrs,
cfg->rt_protocol = nla_get_u8(mdb_attrs[MDBE_ATTR_RTPROT]);
}
+ if (mdb_attrs[MDBE_ATTR_FLAGS]) {
+ if (!cfg->p) {
+ NL_SET_ERR_MSG_MOD(extack, "Flags cannot be set for host groups");
+ return -EINVAL;
+ }
+ if (nla_get_u32(mdb_attrs[MDBE_ATTR_FLAGS]) & MDB_FLAGS_STREAM_RESERVED)
+ cfg->pg_flags |= MDB_PG_FLAGS_STREAM_RESERVED;
+ }
+
return 0;
}
@@ -1280,6 +1281,9 @@ static int br_mdb_config_init(struct br_mdb_config *cfg, struct net_device *dev,
return -EINVAL;
}
+ if (cfg->entry->state == MDB_PERMANENT)
+ cfg->pg_flags |= MDB_PG_FLAGS_PERMANENT;
+
if (tb[MDBA_SET_ENTRY_ATTRS])
return br_mdb_config_attrs_init(tb[MDBA_SET_ENTRY_ATTRS], cfg,
extack);
@@ -1307,6 +1311,12 @@ int br_mdb_add(struct net_device *dev, struct nlattr *tb[], u16 nlmsg_flags,
return err;
err = -EINVAL;
+ if ((cfg.pg_flags & MDB_PG_FLAGS_STREAM_RESERVED) &&
+ cfg.entry->state != MDB_PERMANENT) {
+ NL_SET_ERR_MSG_MOD(extack, "stream_reserved entries must be permanent");
+ goto out;
+ }
+
/* host join errors which can happen before creating the group */
if (!cfg.p && !br_group_is_l2(&cfg.group)) {
/* don't allow any flags for host-joined IP groups */
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 02671e648dac7..6a2dabd6f4bfb 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -111,6 +111,7 @@ struct br_mdb_config {
struct br_mdb_src_entry *src_entries;
int num_src_entries;
u8 rt_protocol;
+ u8 pg_flags;
};
#endif
@@ -317,6 +318,7 @@ struct net_bridge_fdb_flush_desc {
#define MDB_PG_FLAGS_STAR_EXCL BIT(3)
#define MDB_PG_FLAGS_BLOCKED BIT(4)
#define MDB_PG_FLAGS_OFFLOAD_FAILED BIT(5)
+#define MDB_PG_FLAGS_STREAM_RESERVED BIT(6)
#define PG_SRC_ENT_LIMIT 32
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index ee3ad9dfbab99..c46d8e49ce990 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -547,7 +547,8 @@ static void br_switchdev_mdb_complete(struct net_device *dev, int err, void *pri
}
static void br_switchdev_mdb_populate(struct switchdev_obj_port_mdb *mdb,
- const struct net_bridge_mdb_entry *mp)
+ const struct net_bridge_mdb_entry *mp,
+ const struct net_bridge_port_group *pg)
{
if (mp->addr.proto == htons(ETH_P_IP))
ip_eth_mc_map(mp->addr.dst.ip4, mdb->addr);
@@ -559,6 +560,9 @@ static void br_switchdev_mdb_populate(struct switchdev_obj_port_mdb *mdb,
ether_addr_copy(mdb->addr, mp->addr.dst.mac_addr);
mdb->vid = mp->addr.vid;
+ mdb->flags = 0;
+ if (pg && (pg->flags & MDB_PG_FLAGS_STREAM_RESERVED))
+ mdb->flags |= SWITCHDEV_MDB_F_STREAM_RESERVED;
}
static void br_switchdev_host_mdb_one(struct net_device *dev,
@@ -574,7 +578,7 @@ static void br_switchdev_host_mdb_one(struct net_device *dev,
},
};
- br_switchdev_mdb_populate(&mdb, mp);
+ br_switchdev_mdb_populate(&mdb, mp, NULL);
switch (type) {
case RTM_NEWMDB:
@@ -621,6 +625,7 @@ static int br_switchdev_mdb_queue_one(struct list_head *mdb_list,
unsigned long action,
enum switchdev_obj_id id,
const struct net_bridge_mdb_entry *mp,
+ const struct net_bridge_port_group *pg,
struct net_device *orig_dev)
{
struct switchdev_obj_port_mdb mdb = {
@@ -631,7 +636,7 @@ static int br_switchdev_mdb_queue_one(struct list_head *mdb_list,
};
struct switchdev_obj_port_mdb *pmdb;
- br_switchdev_mdb_populate(&mdb, mp);
+ br_switchdev_mdb_populate(&mdb, mp, pg);
if (action == SWITCHDEV_PORT_OBJ_ADD &&
switchdev_port_obj_act_is_deferred(dev, action, &mdb.obj)) {
@@ -670,7 +675,7 @@ void br_switchdev_mdb_notify(struct net_device *dev,
if (!pg)
return br_switchdev_host_mdb(dev, mp, type);
- br_switchdev_mdb_populate(&mdb, mp);
+ br_switchdev_mdb_populate(&mdb, mp, pg);
mdb.obj.orig_dev = pg->key.port->dev;
switch (type) {
@@ -739,7 +744,7 @@ br_switchdev_mdb_replay(struct net_device *br_dev, struct net_device *dev,
if (mp->host_joined) {
err = br_switchdev_mdb_queue_one(&mdb_list, dev, action,
SWITCHDEV_OBJ_ID_HOST_MDB,
- mp, br_dev);
+ mp, NULL, br_dev);
if (err) {
spin_unlock_bh(&br->multicast_lock);
goto out_free_mdb;
@@ -753,7 +758,7 @@ br_switchdev_mdb_replay(struct net_device *br_dev, struct net_device *dev,
err = br_switchdev_mdb_queue_one(&mdb_list, dev, action,
SWITCHDEV_OBJ_ID_PORT_MDB,
- mp, dev);
+ mp, p, dev);
if (err) {
spin_unlock_bh(&br->multicast_lock);
goto out_free_mdb;
--
2.43.0
* [PATCH net-next v2 2/6] net: bridge: convert mdb_entry host_joined to a flags field
2026-06-02 0:43 [PATCH net-next v2 0/6] net: dsa: mv88e6xxx: MQPRIO and 802.1Qat support Luke Howard
2026-06-02 0:43 ` [PATCH net-next v2 1/6] net: bridge: mdb: add MDB_FLAGS_STREAM_RESERVED flag Luke Howard
@ 2026-06-02 0:43 ` Luke Howard
2026-06-03 7:38 ` Nikolay Aleksandrov
2026-06-02 0:43 ` [PATCH net-next v2 3/6] net: bridge: add 802.1Qat stream reservation admission control Luke Howard
` (3 subsequent siblings)
5 siblings, 1 reply; 30+ messages in thread
From: Luke Howard @ 2026-06-02 0:43 UTC
To: Jiri Pirko, Ivan Vecera, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Nikolay Aleksandrov,
Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan, Andrew Lunn,
Vladimir Oltean
Cc: netdev, linux-kernel, bridge, linux-kselftest, Max Hunter,
Kieran Tyrrell, Luke Howard
Replace the bool host_joined in struct net_bridge_mdb_entry with a u8
flags field and a BRIDGE_MDBE_F_HOST_JOINED bit.
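The conversion follows the usual bool-to-bit pattern. A minimal user-space
sketch of the four idioms used throughout the diff (test, set, clear, and
copying just this bit from another entry) is shown here; struct mdb_entry
and the helper names are hypothetical stand-ins, and only the bit value is
taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BRIDGE_MDBE_F_HOST_JOINED (1u << 0) /* BIT(0) in the patch */

/* Hypothetical, reduced stand-in for struct net_bridge_mdb_entry. */
struct mdb_entry {
	uint8_t flags;
};

/* Was: mp->host_joined */
bool mdbe_host_joined(const struct mdb_entry *mp)
{
	return (mp->flags & BRIDGE_MDBE_F_HOST_JOINED) != 0;
}

/* Was: mp->host_joined = true */
void mdbe_set_host_joined(struct mdb_entry *mp)
{
	mp->flags |= BRIDGE_MDBE_F_HOST_JOINED;
}

/* Was: mp->host_joined = false */
void mdbe_clear_host_joined(struct mdb_entry *mp)
{
	mp->flags &= (uint8_t)~BRIDGE_MDBE_F_HOST_JOINED;
}

/* Was: dst->host_joined = src->host_joined. Other bits in dst are kept. */
void mdbe_copy_host_joined(struct mdb_entry *dst, const struct mdb_entry *src)
{
	dst->flags &= (uint8_t)~BRIDGE_MDBE_F_HOST_JOINED;
	dst->flags |= src->flags & BRIDGE_MDBE_F_HOST_JOINED;
}
```

The copy helper clears the bit before OR-ing it in, so a joined destination
is correctly reset when the source is not joined, and any further flag bits
that share the field are left alone.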
Signed-off-by: Luke Howard <lukeh@padl.com>
---
net/bridge/br_input.c | 2 +-
net/bridge/br_mdb.c | 14 ++++++++------
net/bridge/br_multicast.c | 26 ++++++++++++++------------
net/bridge/br_private.h | 4 +++-
net/bridge/br_switchdev.c | 2 +-
5 files changed, 27 insertions(+), 21 deletions(-)
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 470615675bdc0..5787066b1f4cb 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -188,7 +188,7 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
mdst = br_mdb_entry_skb_get(brmctx, skb, vid);
if ((mdst || BR_INPUT_SKB_CB_MROUTERS_ONLY(skb)) &&
br_multicast_querier_exists(brmctx, eth_hdr(skb), mdst)) {
- if ((mdst && mdst->host_joined) ||
+ if ((mdst && (mdst->flags & BRIDGE_MDBE_F_HOST_JOINED)) ||
br_multicast_is_router(brmctx, skb) ||
br->dev->flags & IFF_ALLMULTI) {
local_rcv = true;
diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index 3ddfbd536edb4..b95ca72ec6347 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -344,7 +344,7 @@ static int br_mdb_fill_info(struct sk_buff *skb, struct netlink_callback *cb,
break;
}
- if (!s_pidx && mp->host_joined) {
+ if (!s_pidx && (mp->flags & BRIDGE_MDBE_F_HOST_JOINED)) {
err = __mdb_fill_info(skb, mp, NULL);
if (err) {
nla_nest_cancel(skb, nest2);
@@ -1053,7 +1053,8 @@ static int br_mdb_add_group(const struct br_mdb_config *cfg,
/* host join */
if (!port) {
- if (mp->host_joined && !(cfg->nlflags & NLM_F_REPLACE)) {
+ if ((mp->flags & BRIDGE_MDBE_F_HOST_JOINED) &&
+ !(cfg->nlflags & NLM_F_REPLACE)) {
NL_SET_ERR_MSG_MOD(extack, "Group is already joined by host");
return -EEXIST;
}
@@ -1381,7 +1382,8 @@ static int __br_mdb_del(const struct br_mdb_config *cfg)
goto unlock;
/* host leave */
- if (entry->ifindex == mp->br->dev->ifindex && mp->host_joined) {
+ if (entry->ifindex == mp->br->dev->ifindex &&
+ (mp->flags & BRIDGE_MDBE_F_HOST_JOINED)) {
br_multicast_host_leave(mp, false);
err = 0;
br_mdb_notify(br->dev, mp, NULL, RTM_DELMDB);
@@ -1619,7 +1621,7 @@ br_mdb_get_reply_alloc(const struct net_bridge_mdb_entry *mp)
/* MDBA_MDB_ENTRY */
nla_total_size(0);
- if (mp->host_joined)
+ if (mp->flags & BRIDGE_MDBE_F_HOST_JOINED)
nlmsg_size += rtnl_mdb_nlmsg_pg_size(NULL);
for (pg = mlock_dereference(mp->ports, mp->br); pg;
@@ -1658,7 +1660,7 @@ static int br_mdb_get_reply_fill(struct sk_buff *skb,
goto cancel;
}
- if (mp->host_joined) {
+ if (mp->flags & BRIDGE_MDBE_F_HOST_JOINED) {
err = __mdb_fill_info(skb, mp, NULL);
if (err)
goto cancel;
@@ -1702,7 +1704,7 @@ int br_mdb_get(struct net_device *dev, struct nlattr *tb[], u32 portid, u32 seq,
spin_lock_bh(&br->multicast_lock);
mp = br_mdb_ip_get(br, &group);
- if (!mp || (!mp->ports && !mp->host_joined)) {
+ if (!mp || (!mp->ports && !(mp->flags & BRIDGE_MDBE_F_HOST_JOINED))) {
NL_SET_ERR_MSG_MOD(extack, "MDB entry not found");
err = -ENOENT;
goto unlock;
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 5d6fdfb43c046..4107bf7bd271f 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -391,13 +391,13 @@ static void br_multicast_sg_host_state(struct net_bridge_mdb_entry *star_mp,
if (WARN_ON(!br_multicast_is_star_g(&star_mp->addr)))
return;
- if (!star_mp->host_joined)
+ if (!(star_mp->flags & BRIDGE_MDBE_F_HOST_JOINED))
return;
sg_mp = br_mdb_ip_get(star_mp->br, &sg->key.addr);
if (!sg_mp)
return;
- sg_mp->host_joined = true;
+ sg_mp->flags |= BRIDGE_MDBE_F_HOST_JOINED;
}
/* set the host_joined state of all of *,G's S,G entries */
@@ -425,7 +425,8 @@ static void br_multicast_star_g_host_state(struct net_bridge_mdb_entry *star_mp)
sg_mp = br_mdb_ip_get(br, &sg_ip);
if (!sg_mp)
continue;
- sg_mp->host_joined = star_mp->host_joined;
+ sg_mp->flags &= ~BRIDGE_MDBE_F_HOST_JOINED;
+ sg_mp->flags |= star_mp->flags & BRIDGE_MDBE_F_HOST_JOINED;
}
}
}
@@ -453,7 +454,7 @@ static void br_multicast_sg_del_exclude_ports(struct net_bridge_mdb_entry *sgmp)
* we treat it as EXCLUDE {}, so for an S,G it's considered a
* STAR_EXCLUDE entry and we can safely leave it
*/
- sgmp->host_joined = false;
+ sgmp->flags &= ~BRIDGE_MDBE_F_HOST_JOINED;
for (pp = &sgmp->ports;
(p = mlock_dereference(*pp, sgmp->br)) != NULL;) {
@@ -824,7 +825,8 @@ void br_multicast_del_pg(struct net_bridge_mdb_entry *mp,
hlist_add_head(&pg->mcast_gc.gc_node, &br->mcast_gc_list);
queue_work(system_long_wq, &br->mcast_gc_work);
- if (!mp->ports && !mp->host_joined && netif_running(br->dev))
+ if (!mp->ports && !(mp->flags & BRIDGE_MDBE_F_HOST_JOINED) &&
+ netif_running(br->dev))
mod_timer(&mp->timer, jiffies);
}
@@ -1470,8 +1472,8 @@ void br_multicast_del_port_group(struct net_bridge_port_group *p)
void br_multicast_host_join(const struct net_bridge_mcast *brmctx,
struct net_bridge_mdb_entry *mp, bool notify)
{
- if (!mp->host_joined) {
- mp->host_joined = true;
+ if (!(mp->flags & BRIDGE_MDBE_F_HOST_JOINED)) {
+ mp->flags |= BRIDGE_MDBE_F_HOST_JOINED;
if (br_multicast_is_star_g(&mp->addr))
br_multicast_star_g_host_state(mp);
if (notify)
@@ -1486,10 +1488,10 @@ void br_multicast_host_join(const struct net_bridge_mcast *brmctx,
void br_multicast_host_leave(struct net_bridge_mdb_entry *mp, bool notify)
{
- if (!mp->host_joined)
+ if (!(mp->flags & BRIDGE_MDBE_F_HOST_JOINED))
return;
- mp->host_joined = false;
+ mp->flags &= ~BRIDGE_MDBE_F_HOST_JOINED;
if (br_multicast_is_star_g(&mp->addr))
br_multicast_star_g_host_state(mp);
if (notify)
@@ -3537,7 +3539,7 @@ static void br_ip4_multicast_query(struct net_bridge_mcast *brmctx,
max_delay *= brmctx->multicast_last_member_count;
- if (mp->host_joined &&
+ if ((mp->flags & BRIDGE_MDBE_F_HOST_JOINED) &&
(timer_pending(&mp->timer) ?
time_after(mp->timer.expires, now + max_delay) :
timer_delete_sync_try(&mp->timer) >= 0))
@@ -3626,7 +3628,7 @@ static int br_ip6_multicast_query(struct net_bridge_mcast *brmctx,
goto out;
max_delay *= brmctx->multicast_last_member_count;
- if (mp->host_joined &&
+ if ((mp->flags & BRIDGE_MDBE_F_HOST_JOINED) &&
(timer_pending(&mp->timer) ?
time_after(mp->timer.expires, now + max_delay) :
timer_delete_sync_try(&mp->timer) >= 0))
@@ -3722,7 +3724,7 @@ br_multicast_leave_group(struct net_bridge_mcast *brmctx,
brmctx->multicast_last_member_interval;
if (!pmctx) {
- if (mp->host_joined &&
+ if ((mp->flags & BRIDGE_MDBE_F_HOST_JOINED) &&
(timer_pending(&mp->timer) ?
time_after(mp->timer.expires, time) :
timer_delete_sync_try(&mp->timer) >= 0)) {
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 6a2dabd6f4bfb..1e0eefaf50dd1 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -373,12 +373,14 @@ struct net_bridge_port_group {
struct rcu_head rcu;
};
+#define BRIDGE_MDBE_F_HOST_JOINED BIT(0)
+
struct net_bridge_mdb_entry {
struct rhash_head rhnode;
struct net_bridge *br;
struct net_bridge_port_group __rcu *ports;
struct br_ip addr;
- bool host_joined;
+ u8 flags;
struct timer_list timer;
struct hlist_node mdb_node;
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index c46d8e49ce990..39535f1a6b8ce 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -741,7 +741,7 @@ br_switchdev_mdb_replay(struct net_device *br_dev, struct net_device *dev,
struct net_bridge_port_group __rcu * const *pp;
const struct net_bridge_port_group *p;
- if (mp->host_joined) {
+ if (mp->flags & BRIDGE_MDBE_F_HOST_JOINED) {
err = br_switchdev_mdb_queue_one(&mdb_list, dev, action,
SWITCHDEV_OBJ_ID_HOST_MDB,
mp, NULL, br_dev);
--
2.43.0
* [PATCH net-next v2 3/6] net: bridge: add 802.1Qat stream reservation admission control
2026-06-02 0:43 [PATCH net-next v2 0/6] net: dsa: mv88e6xxx: MQPRIO and 802.1Qat support Luke Howard
2026-06-02 0:43 ` [PATCH net-next v2 1/6] net: bridge: mdb: add MDB_FLAGS_STREAM_RESERVED flag Luke Howard
2026-06-02 0:43 ` [PATCH net-next v2 2/6] net: bridge: convert mdb_entry host_joined to a flags field Luke Howard
@ 2026-06-02 0:43 ` Luke Howard
2026-06-02 1:28 ` Luke Howard
2026-06-03 7:35 ` Nikolay Aleksandrov
2026-06-02 0:43 ` [PATCH net-next v2 4/6] net: bridge: allow MDB_FLAGS_STREAM_RESERVED on host groups Luke Howard
` (2 subsequent siblings)
5 siblings, 2 replies; 30+ messages in thread
From: Luke Howard @ 2026-06-02 0:43 UTC
To: Jiri Pirko, Ivan Vecera, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Nikolay Aleksandrov,
Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan, Andrew Lunn,
Vladimir Oltean
Cc: netdev, linux-kernel, bridge, linux-kselftest, Max Hunter,
Kieran Tyrrell, Luke Howard
Add the BR_FILTER_STREAM_RESERVED bridge port flag, gated by
CONFIG_BRIDGE_8021Q_SRP, which may be used to enforce 802.1Qat admission
control on ports that have it set.
A frame received by a port with the flag set, and whose 802.1p priority
maps (via an MQPRIO/TAPRIO Qdisc on the bridge) to a non-zero traffic
class, is admitted only if it belongs to a reserved stream. A reserved
stream consists of multicast frames whose MDB entry has
MDB_FLAGS_STREAM_RESERVED set. Unicast and broadcast frames that map to a
non-zero traffic class are dropped. Non-admitted frames are dropped after
source address learning.
Multicast snooping must be enabled on the bridge for admission control
to function correctly: with snooping disabled, no MDB entries exist and
all SR-class multicast frames on SR-filtered ports would be dropped.
There is no support for reserved unicast streams at present; although
permitted by 802.1Qat (SRP) they are rarely used in practice.
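The admission rule above reduces to a small predicate. The following
user-space sketch is an illustration under stated assumptions rather than
the code added to br_input.c: struct frame_info and srp_admit() are
hypothetical names, and the traffic class and MDB lookup results are assumed
to have been computed already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-frame summary; field names are illustrative only. */
struct frame_info {
	uint8_t traffic_class;    /* TC that the 802.1p priority maps to */
	bool is_multicast;        /* multicast (not broadcast) destination */
	bool mdb_stream_reserved; /* MDB entry has MDB_FLAGS_STREAM_RESERVED */
};

/*
 * Return true if the frame may be forwarded.
 * port_filter models BR_FILTER_STREAM_RESERVED on the ingress port.
 */
bool srp_admit(bool port_filter, const struct frame_info *f)
{
	/* Ports without the flag are unaffected. */
	if (!port_filter)
		return true;

	/* Traffic class 0 carries non-reserved traffic and always passes. */
	if (f->traffic_class == 0)
		return true;

	/* Non-zero TC: only multicast frames of a reserved stream pass. */
	return f->is_multicast && f->mdb_stream_reserved;
}
```

With snooping disabled there are no MDB entries, so mdb_stream_reserved
would be false for every frame and all multicast traffic in a non-zero
traffic class would be dropped on filtered ports, as noted above.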
The decision not to allow a configurable mapping of traffic classes was
made in the interest of simplicity and to keep the number of additional
instructions in the forwarding path to a minimum. It is anticipated that
this will suffice for the common AVB/TSN use case.
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Luke Howard <lukeh@padl.com>
---
include/linux/if_bridge.h | 1 +
include/uapi/linux/if_link.h | 9 +
net/bridge/Kconfig | 22 +
net/bridge/br_input.c | 59 ++-
net/bridge/br_netlink.c | 8 +-
net/bridge/br_private.h | 3 +-
net/bridge/br_switchdev.c | 3 +-
net/core/rtnetlink.c | 2 +-
tools/testing/selftests/net/forwarding/Makefile | 1 +
.../net/forwarding/bridge_mdb_stream_reserved.sh | 536 +++++++++++++++++++++
tools/testing/selftests/net/forwarding/config | 2 +
11 files changed, 641 insertions(+), 5 deletions(-)
diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index ec9ffea1e46ed..a5e91bace0464 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -62,6 +62,7 @@ struct br_ip_list {
#define BR_PORT_MAB BIT(22)
#define BR_NEIGH_VLAN_SUPPRESS BIT(23)
#define BR_NEIGH_FORWARD_GRAT BIT(24)
+#define BR_FILTER_STREAM_RESERVED BIT(25)
#define BR_DEFAULT_AGEING_TIME (300 * HZ)
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 46413392b402c..80f5a2b4162c8 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -1106,6 +1106,14 @@ enum {
* backup port that has VLAN tunnel mapping enabled (via the
* *IFLA_BRPORT_VLAN_TUNNEL* option). Setting a value of 0 (default) has
* the effect of not attaching any ID.
+ *
+ * @IFLA_BRPORT_FILTER_STREAM_RESERVED
+ * Controls whether the port enforces 802.1Qat stream reservation
+ * admission control. When enabled, a frame received on the port whose
+ * 802.1p priority maps (via an MQPRIO/TAPRIO Qdisc on the bridge) to a
+ * non-zero traffic class is dropped at ingress unless it belongs to a
+ * reserved stream, i.e. it is multicast and its destination address has a
+ * stream-reserved MDB entry. The flag is off by default.
*/
enum {
IFLA_BRPORT_UNSPEC,
@@ -1154,6 +1162,7 @@ enum {
IFLA_BRPORT_NEIGH_VLAN_SUPPRESS,
IFLA_BRPORT_BACKUP_NHID,
IFLA_BRPORT_NEIGH_FORWARD_GRAT,
+ IFLA_BRPORT_FILTER_STREAM_RESERVED,
__IFLA_BRPORT_MAX
};
#define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
diff --git a/net/bridge/Kconfig b/net/bridge/Kconfig
index 318715c8fc9bc..7e46b791922a6 100644
--- a/net/bridge/Kconfig
+++ b/net/bridge/Kconfig
@@ -47,6 +47,28 @@ config BRIDGE_IGMP_SNOOPING
If unsure, say Y.
+config BRIDGE_8021Q_SRP
+ bool "802.1Qat Stream Reservation Protocol (SRP) admission control"
+ depends on BRIDGE
+ default n
+ help
+ If you say Y here, then the Ethernet bridge will enforce 802.1Qat
+ stream reservation admission control in software on ingress ports
+ that have the BR_FILTER_STREAM_RESERVED flag set: a frame whose
+ 802.1p priority maps (via an MQPRIO/TAPRIO Qdisc on the bridge) to
+ a non-zero traffic class is dropped unless it is multicast and its
+ destination address has MDB_FLAGS_STREAM_RESERVED set on its MDB entry.
+
+ This option only controls software enforcement. The
+ BR_FILTER_STREAM_RESERVED port flag and MDB_FLAGS_STREAM_RESERVED
+ MDB flag are always accepted from user space and propagated via
+ switchdev so that hardware-offloading switches can enforce
+ admission control even when this option is disabled.
+
+ Say N to exclude software enforcement and reduce the binary size.
+
+ If unsure, say N.
+
config BRIDGE_VLAN_FILTERING
bool "VLAN filtering"
depends on BRIDGE
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 5787066b1f4cb..2e8aa19a9b542 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -11,6 +11,7 @@
#include <linux/kernel.h>
#include <linux/netdevice.h>
#include <linux/etherdevice.h>
+#include <linux/if_vlan.h>
#include <linux/netfilter_bridge.h>
#ifdef CONFIG_NETFILTER_FAMILY_BRIDGE
#include <net/netfilter/nf_queue.h>
@@ -72,6 +73,57 @@ static int br_pass_frame_up(struct sk_buff *skb, bool promisc)
br_netif_receive_skb);
}
+#ifdef CONFIG_BRIDGE_8021Q_SRP
+/* Return true if the bridge has an MQPRIO/TAPRIO Qdisc that maps the
+ * frame's VLAN PCP to a non-zero traffic class.
+ */
+static bool br_skb_is_sr_class(const struct net_bridge *br,
+ const struct sk_buff *skb)
+{
+ if (!skb_vlan_tag_present(skb) || !netdev_get_num_tc(br->dev))
+ return false;
+
+ return netdev_get_prio_tc_map(br->dev, skb_vlan_tag_get_prio(skb)) != 0;
+}
+
+/* 802.1Qat admission control: a frame whose priority maps to a non-zero
+ * TC and which ingresses a port with BR_FILTER_STREAM_RESERVED is admitted
+ * only if it belongs to a reserved stream. Only multicast can be a reserved
+ * stream: either via an MDB port-group member with MDB_PG_FLAGS_STREAM_RESERVED,
+ * or via a host-group entry marked BRIDGE_MDBE_F_HOST_STREAM_RESERVED.
+ */
+static bool br_sr_admission_denied(const struct net_bridge_port *p,
+ const struct sk_buff *skb,
+ const struct net_bridge_mdb_entry *mdst)
+{
+ const struct net_bridge_port_group *pg;
+
+ if (!(p->flags & BR_FILTER_STREAM_RESERVED) ||
+ !br_skb_is_sr_class(p->br, skb))
+ return false;
+
+ if (!mdst)
+ return true;
+
+ if (mdst->flags & BRIDGE_MDBE_F_HOST_STREAM_RESERVED)
+ return false;
+
+ for (pg = rcu_dereference(mdst->ports); pg;
+ pg = rcu_dereference(pg->next))
+ if (pg->flags & MDB_PG_FLAGS_STREAM_RESERVED)
+ return false;
+
+ return true;
+}
+#else
+static inline bool br_sr_admission_denied(const struct net_bridge_port *p,
+ const struct sk_buff *skb,
+ const struct net_bridge_mdb_entry *mdst)
+{
+ return false;
+}
+#endif
+
/* note: already called with rcu_read_lock */
int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
{
@@ -183,9 +235,14 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
br_do_suppress_nd(skb, br, vid, p, msg);
}
+ mdst = pkt_type == BR_PKT_MULTICAST ?
+ br_mdb_entry_skb_get(brmctx, skb, vid) : NULL;
+
+ if (br_sr_admission_denied(p, skb, mdst))
+ goto drop;
+
switch (pkt_type) {
case BR_PKT_MULTICAST:
- mdst = br_mdb_entry_skb_get(brmctx, skb, vid);
if ((mdst || BR_INPUT_SKB_CB_MROUTERS_ONLY(skb)) &&
br_multicast_querier_exists(brmctx, eth_hdr(skb), mdst)) {
if ((mdst && (mdst->flags & BRIDGE_MDBE_F_HOST_JOINED)) ||
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index a104b25c871d2..99e2a19255773 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -191,6 +191,7 @@ static inline size_t br_port_info_size(void)
+ nla_total_size(1) /* IFLA_BRPORT_MAB */
+ nla_total_size(1) /* IFLA_BRPORT_NEIGH_VLAN_SUPPRESS */
+ nla_total_size(1) /* IFLA_BRPORT_NEIGH_FORWARD_GRAT */
+ + nla_total_size(1) /* IFLA_BRPORT_FILTER_STREAM_RESERVED */
+ nla_total_size(sizeof(struct ifla_bridge_id)) /* IFLA_BRPORT_ROOT_ID */
+ nla_total_size(sizeof(struct ifla_bridge_id)) /* IFLA_BRPORT_BRIDGE_ID */
+ nla_total_size(sizeof(u16)) /* IFLA_BRPORT_DESIGNATED_PORT */
@@ -285,7 +286,9 @@ static int br_port_fill_attrs(struct sk_buff *skb,
nla_put_u8(skb, IFLA_BRPORT_NEIGH_VLAN_SUPPRESS,
!!(p->flags & BR_NEIGH_VLAN_SUPPRESS)) ||
nla_put_u8(skb, IFLA_BRPORT_NEIGH_FORWARD_GRAT,
- !!(p->flags & BR_NEIGH_FORWARD_GRAT)))
+ !!(p->flags & BR_NEIGH_FORWARD_GRAT)) ||
+ nla_put_u8(skb, IFLA_BRPORT_FILTER_STREAM_RESERVED,
+ !!(p->flags & BR_FILTER_STREAM_RESERVED)))
return -EMSGSIZE;
timerval = br_timer_value(&p->message_age_timer);
@@ -906,6 +909,7 @@ static const struct nla_policy br_port_policy[IFLA_BRPORT_MAX + 1] = {
[IFLA_BRPORT_NEIGH_VLAN_SUPPRESS] = NLA_POLICY_MAX(NLA_U8, 1),
[IFLA_BRPORT_BACKUP_NHID] = { .type = NLA_U32 },
[IFLA_BRPORT_NEIGH_FORWARD_GRAT] = NLA_POLICY_MAX(NLA_U8, 1),
+ [IFLA_BRPORT_FILTER_STREAM_RESERVED] = NLA_POLICY_MAX(NLA_U8, 1),
};
/* Change the state of the port and notify spanning tree */
@@ -976,6 +980,8 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[],
BR_NEIGH_VLAN_SUPPRESS);
br_set_port_flag(p, tb, IFLA_BRPORT_NEIGH_FORWARD_GRAT,
BR_NEIGH_FORWARD_GRAT);
+ br_set_port_flag(p, tb, IFLA_BRPORT_FILTER_STREAM_RESERVED,
+ BR_FILTER_STREAM_RESERVED);
if ((p->flags & BR_PORT_MAB) &&
(!(p->flags & BR_PORT_LOCKED) || !(p->flags & BR_LEARNING))) {
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 1e0eefaf50dd1..4ae050ae4826e 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -373,7 +373,8 @@ struct net_bridge_port_group {
struct rcu_head rcu;
};
-#define BRIDGE_MDBE_F_HOST_JOINED BIT(0)
+#define BRIDGE_MDBE_F_HOST_JOINED BIT(0)
+#define BRIDGE_MDBE_F_HOST_STREAM_RESERVED BIT(1)
struct net_bridge_mdb_entry {
struct rhash_head rhnode;
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index 39535f1a6b8ce..7b531d483817c 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -76,7 +76,8 @@ bool nbp_switchdev_allowed_egress(const struct net_bridge_port *p,
/* Flags that can be offloaded to hardware */
#define BR_PORT_FLAGS_HW_OFFLOAD (BR_LEARNING | BR_FLOOD | BR_PORT_MAB | \
BR_MCAST_FLOOD | BR_BCAST_FLOOD | BR_PORT_LOCKED | \
- BR_HAIRPIN_MODE | BR_ISOLATED | BR_MULTICAST_TO_UNICAST)
+ BR_HAIRPIN_MODE | BR_ISOLATED | BR_MULTICAST_TO_UNICAST | \
+ BR_FILTER_STREAM_RESERVED)
int br_switchdev_set_port_flag(struct net_bridge_port *p,
unsigned long flags,
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 652dd008955a9..8ad7f1d0357b2 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -63,7 +63,7 @@
#include "dev.h"
#define RTNL_MAX_TYPE 50
-#define RTNL_SLAVE_MAX_TYPE 45
+#define RTNL_SLAVE_MAX_TYPE 46
struct rtnl_link {
rtnl_doit_func doit;
diff --git a/tools/testing/selftests/net/forwarding/Makefile b/tools/testing/selftests/net/forwarding/Makefile
index bbaf4d937dd8b..3899551db05b9 100644
--- a/tools/testing/selftests/net/forwarding/Makefile
+++ b/tools/testing/selftests/net/forwarding/Makefile
@@ -10,6 +10,7 @@ TEST_PROGS := \
bridge_mdb_host.sh \
bridge_mdb_max.sh \
bridge_mdb_port_down.sh \
+ bridge_mdb_stream_reserved.sh \
bridge_mld.sh \
bridge_port_isolation.sh \
bridge_sticky_fdb.sh \
diff --git a/tools/testing/selftests/net/forwarding/bridge_mdb_stream_reserved.sh b/tools/testing/selftests/net/forwarding/bridge_mdb_stream_reserved.sh
new file mode 100755
index 0000000000000..a21dc2ec3e95c
--- /dev/null
+++ b/tools/testing/selftests/net/forwarding/bridge_mdb_stream_reserved.sh
@@ -0,0 +1,536 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+# Test 802.1Qat stream reservation admission control. A bridge port with the
+# BR_FILTER_STREAM_RESERVED flag set (bridge link set ... filter_stream_reserved
+# on) polices multicast it receives: a frame whose 802.1p priority maps (via an
+# mqprio/TC configuration on the bridge netdev) to a non-zero traffic class is
+# admitted only if its destination is a reserved stream, i.e. has an MDB entry
+# with the stream_reserved flag (the allow-list, typically maintained by an SRP
+# daemon). Other SR-class multicast is dropped at ingress, so it reaches neither
+# the host nor any port. TC 0 traffic, and traffic on ports without the flag,
+# is unaffected.
+#
+# +------------------------+
+# | H1 (vrf) - talker |
+# | + $h1 |
+# +----|-------------------+
+# | PCP-tagged mcast
+# +-----------------------------------|------------------------------------+
+# | SW $swp1 (filter_stream_reserved) BR0 (802.1q, mqprio) |
+# | + |
+# | + $swp2 (listener) + $swp3 (listener) |
+# +------------------|-------------------------|---------------------------+
+# | |
+# +--------------|---------+ +-----------|------------+
+# | H2 (vrf) - listener | | H3 (vrf) - listener |
+# | + $h2 | | + $h3 |
+# +------------------------+ +------------------------+
+
+ALL_TESTS="
+ cfg_test
+ fwd_sr_member_test
+ fwd_foreign_blocked_test
+ fwd_unicast_blocked_test
+ fwd_flag_gates_test
+ fwd_tc_toggle_test
+ fwd_flag_toggle_test
+ fwd_sr_ipv6_test
+"
+
+NUM_NETIFS=6
+source lib.sh
+source tc_common.sh
+
+# GRP is the stream-reserved group; GRP2 is a plain group that both swp2 and
+# swp3 join, used to show that a foreign SR-class group is dropped at ingress.
+GRP=239.1.1.1
+GRP_DMAC=01:00:5e:01:01:01
+GRP2=239.1.1.2
+GRP2_DMAC=01:00:5e:01:01:02
+GRP3=239.1.1.3
+GRP3_DMAC=01:00:5e:01:01:03
+# IPv6 (MLD) groups: GRP6 is stream-reserved, GRP6B is a plain group.
+GRP6=ff0e::1
+GRP6_DMAC=33:33:00:00:00:01
+GRP6B=ff0e::2
+GRP6B_DMAC=33:33:00:00:00:02
+# Source for the (S, G) configuration check.
+SRC=192.0.2.10
+# PCP 3 is SR class A; the mqprio map below sends it to TC 1.
+SR_PCP=3
+BE_PCP=0
+VID=10
+
+h1_create()
+{
+ simple_if_init $h1
+ vlan_create $h1 $VID v$h1 192.0.2.1/28
+}
+
+h1_destroy()
+{
+ vlan_destroy $h1 $VID
+ simple_if_fini $h1
+}
+
+h2_create()
+{
+ simple_if_init $h2
+ vlan_create $h2 $VID v$h2 192.0.2.2/28
+}
+
+h2_destroy()
+{
+ vlan_destroy $h2 $VID
+ simple_if_fini $h2
+}
+
+h3_create()
+{
+ simple_if_init $h3
+ vlan_create $h3 $VID v$h3 192.0.2.3/28
+}
+
+h3_destroy()
+{
+ vlan_destroy $h3 $VID
+ simple_if_fini $h3
+}
+
+switch_create()
+{
+ # The bridge must have multiple TX queues so that an mqprio qdisc (which
+ # populates the netdev prio->tc map the SR filter consults) can be
+ # attached, and a multicast querier so that the bridge forwards
+ # selectively.
+ ip link add name br0 numtxqueues 8 numrxqueues 8 type bridge \
+ vlan_filtering 1 vlan_default_pvid 0 \
+ mcast_snooping 1 mcast_igmp_version 3 mcast_mld_version 2 \
+ mcast_querier 1
+ bridge vlan add vid $VID dev br0 self
+ ip link set dev br0 up
+
+ # A link-local address lets the bridge act as the IPv6 (MLD) querier,
+ # mirroring the IGMP querier.
+ ip address add fe80::1/64 dev br0 nodad
+
+ local swp
+ for swp in $swp1 $swp2 $swp3; do
+ ip link set dev $swp master br0
+ ip link set dev $swp up
+ bridge vlan add vid $VID dev $swp
+ done
+
+ # PCP $SR_PCP -> TC 1, everything else -> TC 0 (software mode).
+ tc qdisc add dev br0 root handle 100: mqprio num_tc 2 \
+ map 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 \
+ queues 1@0 1@1 hw 0
+
+ tc qdisc add dev $h2 clsact
+ tc qdisc add dev $h3 clsact
+
+ # Wait for the bridge's own querier to become active.
+ sleep 10
+}
+
+switch_destroy()
+{
+ tc qdisc del dev $h3 clsact
+ tc qdisc del dev $h2 clsact
+ tc qdisc del dev br0 root handle 100: mqprio 2>/dev/null
+
+ local swp
+ for swp in $swp3 $swp2 $swp1; do
+ bridge vlan del vid $VID dev $swp
+ ip link set dev $swp down
+ ip link set dev $swp nomaster
+ done
+
+ ip link set dev br0 down
+ bridge vlan del vid $VID dev br0 self
+ ip link del dev br0
+}
+
+setup_prepare()
+{
+ h1=${NETIFS[p1]}
+ swp1=${NETIFS[p2]}
+ swp2=${NETIFS[p3]}
+ h2=${NETIFS[p4]}
+ swp3=${NETIFS[p5]}
+ h3=${NETIFS[p6]}
+
+ vrf_prepare
+ forwarding_enable
+
+ h1_create
+ h2_create
+ h3_create
+ switch_create
+}
+
+cleanup()
+{
+ pre_cleanup
+
+ switch_destroy
+ h3_destroy
+ h2_destroy
+ h1_destroy
+
+ forwarding_restore
+ vrf_cleanup
+}
+
+# Arm or disarm SR-class admission control on a bridge port
+# (BR_FILTER_STREAM_RESERVED).
+sr_filter()
+{
+ local dev=$1 onoff=$2
+
+ bridge link set dev $dev filter_stream_reserved $onoff
+}
+
+# Probe whether the running kernel and iproute2 understand the MDB flag and the
+# port flag. If not, the whole suite is skipped, so it is safe to invoke on an
+# unpatched system.
+stream_reserved_supported()
+{
+ bridge mdb add dev br0 port $swp2 grp $GRP permanent vid $VID \
+ stream_reserved 2>/dev/null
+ if [[ $? -ne 0 ]]; then
+ return 1
+ fi
+ bridge mdb del dev br0 port $swp2 grp $GRP permanent vid $VID
+
+ sr_filter $swp1 on 2>/dev/null || return 1
+ sr_filter $swp1 off
+ return 0
+}
+
+cfg_test()
+{
+ RET=0
+
+ # stream_reserved entries must be permanent.
+ bridge mdb add dev br0 port $swp2 grp $GRP vid $VID \
+ stream_reserved 2>/dev/null
+ check_fail $? "non-permanent stream_reserved port entry accepted"
+
+ # The flag must be rejected on host groups.
+ bridge mdb add dev br0 port br0 grp $GRP permanent vid $VID \
+ stream_reserved 2>/dev/null
+ check_fail $? "stream_reserved accepted on a host group"
+
+ # Add a port group with the flag and confirm it is reflected in dump.
+ bridge mdb add dev br0 port $swp2 grp $GRP permanent vid $VID \
+ stream_reserved
+ check_err $? "Failed to add stream_reserved entry"
+ bridge -d mdb show dev br0 | grep -q "$GRP.*stream_reserved"
+ check_err $? "stream_reserved flag not shown in dump"
+
+ # The other state flags must not be disturbed: a permanent entry stays
+ # permanent and carries no group timer when stream_reserved is set.
+ bridge -d mdb get dev br0 grp $GRP vid $VID | grep -q "permanent"
+ check_err $? "stream_reserved entry not kept \"permanent\""
+ bridge -d -s mdb get dev br0 grp $GRP vid $VID | grep -q " 0.00"
+ check_err $? "\"permanent\" stream_reserved entry has a pending group timer"
+
+ # The flag is also accepted, and reported, on a source-specific (S, G).
+ bridge mdb add dev br0 port $swp2 grp $GRP3 src $SRC permanent vid $VID \
+ stream_reserved
+ check_err $? "stream_reserved rejected on an (S, G) entry"
+ bridge -d mdb show dev br0 | grep "$SRC" | grep -q stream_reserved
+ check_err $? "stream_reserved flag not shown on (S, G) entry"
+ bridge mdb del dev br0 port $swp2 grp $GRP3 src $SRC vid $VID
+
+ # Replacing without the flag must clear it.
+ bridge mdb replace dev br0 port $swp2 grp $GRP permanent vid $VID
+ bridge -d mdb show dev br0 | grep -q "$GRP.*stream_reserved"
+ check_fail $? "stream_reserved flag not cleared on replace"
+
+ bridge mdb del dev br0 port $swp2 grp $GRP permanent vid $VID
+
+ # The port flag round-trips through netlink and is shown in the dump.
+ sr_filter $swp1 on
+ check_err $? "Failed to set filter_stream_reserved on a port"
+ bridge -d link show dev $swp1 | grep -q "filter_stream_reserved on"
+ check_err $? "filter_stream_reserved not shown in link dump"
+ sr_filter $swp1 off
+
+ log_test "MDB stream_reserved configuration"
+}
+
+rx_filter_install()
+{
+ local dev=$1 pref=$2 grp=$3 ethtype=${4:-ipv4}
+
+ tc filter add dev $dev ingress protocol 802.1q pref $pref handle $pref \
+ flower vlan_ethtype $ethtype vlan_id $VID dst_ip $grp action drop
+}
+
+rx_filter_uninstall()
+{
+ local dev=$1 pref=$2
+
+ tc filter del dev $dev ingress protocol 802.1q pref $pref handle $pref \
+ flower
+}
+
+send_mc()
+{
+ local grp=$1 dmac=$2 pcp=$3
+
+ $MZ $h1 -a own -b $dmac -c 1 -p 64 \
+ -A 192.0.2.1 -B $grp -t udp -Q $pcp:$VID -q
+}
+
+send_mc6()
+{
+ local grp=$1 dmac=$2 pcp=$3
+
+ $MZ -6 $h1 -a own -b $dmac -c 1 -p 64 \
+ -A 2001:db8:1::1 -B $grp -t udp -Q $pcp:$VID -q
+}
+
+# An arbitrary unicast DA: the bridge floods it as unknown unicast, so it
+# reaches h2 unless dropped at ingress.
+UC_DMAC=00:de:ad:be:ef:02
+
+send_uc()
+{
+ local dip=$1 pcp=$2
+
+ $MZ $h1 -a own -b $UC_DMAC -c 1 -p 64 \
+ -A 192.0.2.1 -B $dip -t udp -Q $pcp:$VID -q
+}
+
+# An SR-class frame for a reserved stream is admitted on a filtering port and
+# delivered to the stream's member.
+fwd_sr_member_test()
+{
+ RET=0
+
+ sr_filter $swp1 on
+ bridge mdb add dev br0 port $swp2 grp $GRP permanent vid $VID \
+ stream_reserved
+ rx_filter_install $h2 1 $GRP
+
+ send_mc $GRP $GRP_DMAC $SR_PCP
+ tc_check_packets "dev $h2 ingress" 1 1
+ check_err $? "reserved-stream SR-class frame not admitted to its member"
+
+ rx_filter_uninstall $h2 1
+ bridge mdb del dev br0 port $swp2 grp $GRP permanent vid $VID
+ sr_filter $swp1 off
+
+ log_test "MDB stream_reserved member delivery"
+}
+
+# swp1 filters SR-class ingress. A foreign (non-reserved) group GRP2 at SR class
+# is dropped at ingress, reaching neither listener, while a best-effort (TC 0)
+# frame is admitted and delivered to both.
+fwd_foreign_blocked_test()
+{
+ RET=0
+
+ sr_filter $swp1 on
+ bridge mdb add dev br0 port $swp2 grp $GRP2 permanent vid $VID
+ bridge mdb add dev br0 port $swp3 grp $GRP2 permanent vid $VID
+
+ rx_filter_install $h2 2 $GRP2
+ rx_filter_install $h3 2 $GRP2
+
+ # SR-class: dropped at ingress, reaches neither listener.
+ send_mc $GRP2 $GRP2_DMAC $SR_PCP
+ tc_check_packets "dev $h2 ingress" 2 0
+ check_err $? "foreign SR-class frame leaked to a listener"
+ tc_check_packets "dev $h3 ingress" 2 0
+ check_err $? "foreign SR-class frame leaked to a listener"
+
+ # Best-effort (TC 0): unaffected, delivered to both.
+ send_mc $GRP2 $GRP2_DMAC $BE_PCP
+ tc_check_packets "dev $h2 ingress" 2 1
+ check_err $? "best-effort frame not delivered"
+ tc_check_packets "dev $h3 ingress" 2 1
+ check_err $? "best-effort frame not delivered"
+
+ rx_filter_uninstall $h3 2
+ rx_filter_uninstall $h2 2
+
+ bridge mdb del dev br0 port $swp3 grp $GRP2 permanent vid $VID
+ bridge mdb del dev br0 port $swp2 grp $GRP2 permanent vid $VID
+ sr_filter $swp1 off
+
+ log_test "MDB stream_reserved blocks foreign SR-class group at ingress"
+}
+
+# Unicast cannot belong to a reserved stream, so an SR-class unicast frame is
+# dropped at a filtering ingress port (otherwise it would consume the AVB
+# queue's reserved bandwidth). A best-effort unicast frame is unaffected.
+fwd_unicast_blocked_test()
+{
+ RET=0
+
+ sr_filter $swp1 on
+ rx_filter_install $h2 5 192.0.2.2
+
+ send_uc 192.0.2.2 $SR_PCP
+ tc_check_packets "dev $h2 ingress" 5 0
+ check_err $? "SR-class unicast leaked through a filtering ingress port"
+
+ send_uc 192.0.2.2 $BE_PCP
+ tc_check_packets "dev $h2 ingress" 5 1
+ check_err $? "best-effort unicast not delivered"
+
+ rx_filter_uninstall $h2 5
+ sr_filter $swp1 off
+
+ log_test "MDB stream_reserved blocks SR-class unicast at ingress"
+}
+
+# Filtering is gated by the ingress port flag, not by the presence of a
+# reserved stream: with a reserved stream registered but the flag clear, a
+# foreign SR-class group is forwarded; setting the flag then blocks it.
+fwd_flag_gates_test()
+{
+ RET=0
+
+ bridge mdb add dev br0 port $swp2 grp $GRP permanent vid $VID \
+ stream_reserved
+ bridge mdb add dev br0 port $swp2 grp $GRP2 permanent vid $VID
+
+ rx_filter_install $h2 2 $GRP2
+
+ # Flag clear (default): the reserved stream does not engage the gate.
+ send_mc $GRP2 $GRP2_DMAC $SR_PCP
+ tc_check_packets "dev $h2 ingress" 2 1
+ check_err $? "SR-class frame blocked with filter flag clear"
+
+ # Flag set on the ingress port: the foreign group is now dropped.
+ sr_filter $swp1 on
+ send_mc $GRP2 $GRP2_DMAC $SR_PCP
+ tc_check_packets "dev $h2 ingress" 2 1
+ check_err $? "foreign SR-class frame leaked after filter flag set"
+
+ rx_filter_uninstall $h2 2
+ bridge mdb del dev br0 port $swp2 grp $GRP2 permanent vid $VID
+ bridge mdb del dev br0 port $swp2 grp $GRP permanent vid $VID
+ sr_filter $swp1 off
+
+ log_test "MDB stream_reserved gated by port flag, not membership"
+}
+
+# The gate only engages while the prio->tc map has a non-zero class. With the
+# mqprio qdisc removed, the foreign group is admitted again.
+fwd_tc_toggle_test()
+{
+ RET=0
+
+ sr_filter $swp1 on
+ bridge mdb add dev br0 port $swp2 grp $GRP2 permanent vid $VID
+
+ rx_filter_install $h2 2 $GRP2
+
+ send_mc $GRP2 $GRP2_DMAC $SR_PCP
+ tc_check_packets "dev $h2 ingress" 2 0
+ check_err $? "foreign SR-class frame leaked while TC enabled"
+
+ # Drop the TC configuration; the prio->tc map is gone, gate is inert.
+ tc qdisc del dev br0 root handle 100: mqprio
+
+ send_mc $GRP2 $GRP2_DMAC $SR_PCP
+ tc_check_packets "dev $h2 ingress" 2 1
+ check_err $? "frame not delivered after TC configuration removed"
+
+ tc qdisc add dev br0 root handle 100: mqprio num_tc 2 \
+ map 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 \
+ queues 1@0 1@1 hw 0
+
+ rx_filter_uninstall $h2 2
+
+ bridge mdb del dev br0 port $swp2 grp $GRP2 permanent vid $VID
+ sr_filter $swp1 off
+
+ log_test "MDB stream_reserved gate follows TC configuration"
+}
+
+# Clearing the port flag stops the port filtering, so the previously blocked
+# group is admitted again.
+fwd_flag_toggle_test()
+{
+ RET=0
+
+ sr_filter $swp1 on
+ bridge mdb add dev br0 port $swp2 grp $GRP2 permanent vid $VID
+
+ rx_filter_install $h2 2 $GRP2
+
+ send_mc $GRP2 $GRP2_DMAC $SR_PCP
+ tc_check_packets "dev $h2 ingress" 2 0
+ check_err $? "foreign SR-class frame leaked while ingress filtering"
+
+ # Disarm the filter on swp1.
+ sr_filter $swp1 off
+
+ send_mc $GRP2 $GRP2_DMAC $SR_PCP
+ tc_check_packets "dev $h2 ingress" 2 1
+ check_err $? "frame not delivered after filter flag cleared"
+
+ rx_filter_uninstall $h2 2
+
+ bridge mdb del dev br0 port $swp2 grp $GRP2 permanent vid $VID
+
+ log_test "MDB stream_reserved filtering disabled on port flag clear"
+}
+
+# The semantics are protocol-independent: a foreign IPv6/MLD group at SR class
+# is dropped at ingress, while best-effort is delivered.
+fwd_sr_ipv6_test()
+{
+ RET=0
+
+ sr_filter $swp1 on
+ bridge mdb add dev br0 port $swp2 grp $GRP6B permanent vid $VID
+ bridge mdb add dev br0 port $swp3 grp $GRP6B permanent vid $VID
+
+ rx_filter_install $h2 2 $GRP6B ipv6
+ rx_filter_install $h3 2 $GRP6B ipv6
+
+ send_mc6 $GRP6B $GRP6B_DMAC $SR_PCP
+ tc_check_packets "dev $h2 ingress" 2 0
+ check_err $? "foreign SR-class IPv6 frame leaked to a listener"
+ tc_check_packets "dev $h3 ingress" 2 0
+ check_err $? "foreign SR-class IPv6 frame leaked to a listener"
+
+ send_mc6 $GRP6B $GRP6B_DMAC $BE_PCP
+ tc_check_packets "dev $h2 ingress" 2 1
+ check_err $? "best-effort IPv6 frame not delivered"
+ tc_check_packets "dev $h3 ingress" 2 1
+ check_err $? "best-effort IPv6 frame not delivered"
+
+ rx_filter_uninstall $h3 2
+ rx_filter_uninstall $h2 2
+
+ bridge mdb del dev br0 port $swp3 grp $GRP6B permanent vid $VID
+ bridge mdb del dev br0 port $swp2 grp $GRP6B permanent vid $VID
+ sr_filter $swp1 off
+
+ log_test "MDB stream_reserved blocks foreign SR-class IPv6 group at ingress"
+}
+
+trap cleanup EXIT
+
+setup_prepare
+setup_wait
+
+if ! stream_reserved_supported; then
+ log_test_skip "MDB stream_reserved" \
+ "kernel or iproute2 lacks MDB_FLAGS_STREAM_RESERVED support"
+ exit $EXIT_STATUS
+fi
+
+tests_run
+
+exit $EXIT_STATUS
diff --git a/tools/testing/selftests/net/forwarding/config b/tools/testing/selftests/net/forwarding/config
index 75a6c3d3c1da3..d1fe9ec41340e 100644
--- a/tools/testing/selftests/net/forwarding/config
+++ b/tools/testing/selftests/net/forwarding/config
@@ -1,5 +1,6 @@
CONFIG_BPF_SYSCALL=y
CONFIG_BRIDGE=m
+CONFIG_BRIDGE_8021Q_SRP=y
CONFIG_BRIDGE_IGMP_SNOOPING=y
CONFIG_BRIDGE_VLAN_FILTERING=y
CONFIG_CGROUP_BPF=y
@@ -40,6 +41,7 @@ CONFIG_NET_L3_MASTER_DEV=y
CONFIG_NET_NS=y
CONFIG_NET_SCH_ETS=m
CONFIG_NET_SCH_INGRESS=m
+CONFIG_NET_SCH_MQPRIO=m
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_TBF=m
--
2.43.0
* [PATCH net-next v2 4/6] net: bridge: allow MDB_FLAGS_STREAM_RESERVED on host groups
2026-06-02 0:43 [PATCH net-next v2 0/6] net: dsa: mv8ee6xxx: MQPRIO and 802.1Qat support Luke Howard
` (2 preceding siblings ...)
2026-06-02 0:43 ` [PATCH net-next v2 3/6] net: bridge: add 802.1Qat stream reservation admission control Luke Howard
@ 2026-06-02 0:43 ` Luke Howard
2026-06-02 0:43 ` [PATCH net-next v2 5/6] net: dsa: mv88e6xxx: MQPRIO support Luke Howard
2026-06-02 0:43 ` [PATCH net-next v2 6/6] net: dsa: mv88e6xxx: honour MDB_FLAGS_STREAM_RESERVED for AVB streams Luke Howard
5 siblings, 0 replies; 30+ messages in thread
From: Luke Howard @ 2026-06-02 0:43 UTC (permalink / raw)
To: Jiri Pirko, Ivan Vecera, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Nikolay Aleksandrov,
Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan, Andrew Lunn,
Vladimir Oltean
Cc: netdev, linux-kernel, bridge, linux-kselftest, Max Hunter,
Kieran Tyrrell, Luke Howard
Allow the local bridge host to declare itself a reserved stream listener
for an MDB group, for example on a device which is both an AVB end station
and bridge.
Only MDB_FLAGS_STREAM_RESERVED is accepted on host groups; the other
MDB_FLAGS_* bits remain port-group-only.
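
As a rough illustration, the host-flag handling this patch introduces (the
KEEP/CLEAR/SET operation passed to the host join, and the clearing of both
bits on leave) can be modelled in user space as below. The MODEL_* and
model_* names are hypothetical and only mirror the flag bits and enum added
by the patch; timers and notifications are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirrors of the host flag bits on an MDB entry. */
#define MODEL_HOST_JOINED          (1u << 0)
#define MODEL_HOST_STREAM_RESERVED (1u << 1)

/* How a host join treats the stream-reserved bit. */
enum model_sr_op {
	MODEL_SR_KEEP,  /* data-path join: leave the reservation as-is */
	MODEL_SR_CLEAR, /* netlink add/replace without stream_reserved */
	MODEL_SR_SET,   /* netlink add/replace with stream_reserved */
};

/* Return the entry flags after a host join with the given operation. */
static uint8_t model_host_join(uint8_t flags, enum model_sr_op op)
{
	assert(op == MODEL_SR_KEEP || op == MODEL_SR_CLEAR ||
	       op == MODEL_SR_SET);

	flags |= MODEL_HOST_JOINED;
	if (op == MODEL_SR_SET)
		flags |= MODEL_HOST_STREAM_RESERVED;
	else if (op == MODEL_SR_CLEAR)
		flags &= (uint8_t)~MODEL_HOST_STREAM_RESERVED;
	return flags;
}

/* Return the entry flags after a host leave: both host bits are cleared. */
static uint8_t model_host_leave(uint8_t flags)
{
	return flags & (uint8_t)~(MODEL_HOST_JOINED |
				  MODEL_HOST_STREAM_RESERVED);
}
```

The point of the KEEP operation is that a snooped IGMP/MLD report for a group
the host has already reserved must not silently drop the reservation.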
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Luke Howard <lukeh@padl.com>
---
include/uapi/linux/if_bridge.h | 7 +-
net/bridge/br_input.c | 2 +-
net/bridge/br_mdb.c | 21 +++-
net/bridge/br_multicast.c | 37 ++++--
net/bridge/br_private.h | 15 ++-
.../net/forwarding/bridge_mdb_stream_reserved.sh | 125 ++++++++++++++++++++-
6 files changed, 182 insertions(+), 25 deletions(-)
diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
index 01955a575528c..989d13a866be4 100644
--- a/include/uapi/linux/if_bridge.h
+++ b/include/uapi/linux/if_bridge.h
@@ -748,9 +748,10 @@ enum {
* [MDBE_ATTR_xxx]
* ...
* [MDBE_ATTR_FLAGS]
- * u32, a mask of MDB_FLAGS_* values to set on the entry. Valid only
- * for port-group entries; currently only MDB_FLAGS_STREAM_RESERVED
- * may be set from user space.
+ * u32, a mask of MDB_FLAGS_* values to set on the entry. Currently
+ * only MDB_FLAGS_STREAM_RESERVED may be set from user space, and is
+ * accepted on both port-group and host-group entries (on the latter
+ * it declares the local bridge host as a reserved-stream listener).
* }
*/
enum {
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 2e8aa19a9b542..649b819906bf8 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -105,7 +105,7 @@ static bool br_sr_admission_denied(const struct net_bridge_port *p,
if (!mdst)
return true;
- if (mdst->flags & BRIDGE_MDBE_F_HOST_STREAM_RESERVED)
+ if ((mdst->flags & BRIDGE_MDBE_F_HOST_MASK) == BRIDGE_MDBE_F_HOST_MASK)
return false;
for (pg = rcu_dereference(mdst->ports); pg;
diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index b95ca72ec6347..93127a8ea54f7 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -250,6 +250,9 @@ static int __mdb_fill_info(struct sk_buff *skb,
} else {
ifindex = mp->br->dev->ifindex;
mtimer = &mp->timer;
+ if (mp->flags & BRIDGE_MDBE_F_HOST_STREAM_RESERVED)
+ flags = MDB_PG_FLAGS_PERMANENT |
+ MDB_PG_FLAGS_STREAM_RESERVED;
}
__mdb_entry_fill_flags(&e, flags);
@@ -1059,7 +1062,10 @@ static int br_mdb_add_group(const struct br_mdb_config *cfg,
return -EEXIST;
}
- br_multicast_host_join(brmctx, mp, false);
+ br_multicast_host_join(brmctx, mp,
+ cfg->pg_flags & MDB_PG_FLAGS_STREAM_RESERVED ?
+ BR_MCAST_SR_SET : BR_MCAST_SR_CLEAR,
+ false);
br_mdb_notify(br->dev, mp, NULL, RTM_NEWMDB);
return 0;
@@ -1219,11 +1225,14 @@ static int br_mdb_config_attrs_init(struct nlattr *set_attrs,
}
if (mdb_attrs[MDBE_ATTR_FLAGS]) {
- if (!cfg->p) {
- NL_SET_ERR_MSG_MOD(extack, "Flags cannot be set for host groups");
+ u32 attr_flags = nla_get_u32(mdb_attrs[MDBE_ATTR_FLAGS]);
+
+ if (!cfg->p && (attr_flags & ~MDB_FLAGS_STREAM_RESERVED)) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Only stream_reserved may be set on host groups");
return -EINVAL;
}
- if (nla_get_u32(mdb_attrs[MDBE_ATTR_FLAGS]) & MDB_FLAGS_STREAM_RESERVED)
+ if (attr_flags & MDB_FLAGS_STREAM_RESERVED)
cfg->pg_flags |= MDB_PG_FLAGS_STREAM_RESERVED;
}
@@ -1320,8 +1329,8 @@ int br_mdb_add(struct net_device *dev, struct nlattr *tb[], u16 nlmsg_flags,
/* host join errors which can happen before creating the group */
if (!cfg.p && !br_group_is_l2(&cfg.group)) {
- /* don't allow any flags for host-joined IP groups */
- if (cfg.entry->state) {
+ if (cfg.entry->state &&
+ !(cfg.pg_flags & MDB_PG_FLAGS_STREAM_RESERVED)) {
NL_SET_ERR_MSG_MOD(extack, "Flags are not allowed for host groups");
goto out;
}
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 4107bf7bd271f..e3fc61bb63092 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -397,10 +397,10 @@ static void br_multicast_sg_host_state(struct net_bridge_mdb_entry *star_mp,
sg_mp = br_mdb_ip_get(star_mp->br, &sg->key.addr);
if (!sg_mp)
return;
- sg_mp->flags |= BRIDGE_MDBE_F_HOST_JOINED;
+ sg_mp->flags |= star_mp->flags & BRIDGE_MDBE_F_HOST_MASK;
}
-/* set the host_joined state of all of *,G's S,G entries */
+/* set the host state of all of *,G's S,G entries */
static void br_multicast_star_g_host_state(struct net_bridge_mdb_entry *star_mp)
{
struct net_bridge *br = star_mp->br;
@@ -425,8 +425,8 @@ static void br_multicast_star_g_host_state(struct net_bridge_mdb_entry *star_mp)
sg_mp = br_mdb_ip_get(br, &sg_ip);
if (!sg_mp)
continue;
- sg_mp->flags &= ~BRIDGE_MDBE_F_HOST_JOINED;
- sg_mp->flags |= star_mp->flags & BRIDGE_MDBE_F_HOST_JOINED;
+ sg_mp->flags &= ~BRIDGE_MDBE_F_HOST_MASK;
+ sg_mp->flags |= star_mp->flags & BRIDGE_MDBE_F_HOST_MASK;
}
}
}
@@ -454,7 +454,7 @@ static void br_multicast_sg_del_exclude_ports(struct net_bridge_mdb_entry *sgmp)
* we treat it as EXCLUDE {}, so for an S,G it's considered a
* STAR_EXCLUDE entry and we can safely leave it
*/
- sgmp->flags &= ~BRIDGE_MDBE_F_HOST_JOINED;
+ sgmp->flags &= ~BRIDGE_MDBE_F_HOST_MASK;
for (pp = &sgmp->ports;
(p = mlock_dereference(*pp, sgmp->br)) != NULL;) {
@@ -1470,10 +1470,18 @@ void br_multicast_del_port_group(struct net_bridge_port_group *p)
}
void br_multicast_host_join(const struct net_bridge_mcast *brmctx,
- struct net_bridge_mdb_entry *mp, bool notify)
+ struct net_bridge_mdb_entry *mp,
+ enum br_mcast_sr_op sr_op, bool notify)
{
- if (!(mp->flags & BRIDGE_MDBE_F_HOST_JOINED)) {
- mp->flags |= BRIDGE_MDBE_F_HOST_JOINED;
+ u8 old_flags = mp->flags;
+
+ mp->flags |= BRIDGE_MDBE_F_HOST_JOINED;
+ if (sr_op == BR_MCAST_SR_SET)
+ mp->flags |= BRIDGE_MDBE_F_HOST_STREAM_RESERVED;
+ else if (sr_op == BR_MCAST_SR_CLEAR)
+ mp->flags &= ~BRIDGE_MDBE_F_HOST_STREAM_RESERVED;
+
+ if ((mp->flags ^ old_flags) & BRIDGE_MDBE_F_HOST_MASK) {
if (br_multicast_is_star_g(&mp->addr))
br_multicast_star_g_host_state(mp);
if (notify)
@@ -1483,6 +1491,14 @@ void br_multicast_host_join(const struct net_bridge_mcast *brmctx,
if (br_group_is_l2(&mp->addr))
return;
+ /* Host stream-reserved entries are permanent and have no timer; drop
+ * any timer left from an earlier non-reserved host join.
+ */
+ if (mp->flags & BRIDGE_MDBE_F_HOST_STREAM_RESERVED) {
+ timer_delete(&mp->timer);
+ return;
+ }
+
mod_timer(&mp->timer, jiffies + brmctx->multicast_membership_interval);
}
@@ -1491,7 +1507,8 @@ void br_multicast_host_leave(struct net_bridge_mdb_entry *mp, bool notify)
if (!(mp->flags & BRIDGE_MDBE_F_HOST_JOINED))
return;
- mp->flags &= ~BRIDGE_MDBE_F_HOST_JOINED;
+ mp->flags &= ~(BRIDGE_MDBE_F_HOST_JOINED |
+ BRIDGE_MDBE_F_HOST_STREAM_RESERVED);
if (br_multicast_is_star_g(&mp->addr))
br_multicast_star_g_host_state(mp);
if (notify)
@@ -1520,7 +1537,7 @@ __br_multicast_add_group(struct net_bridge_mcast *brmctx,
return ERR_CAST(mp);
if (!pmctx) {
- br_multicast_host_join(brmctx, mp, true);
+ br_multicast_host_join(brmctx, mp, BR_MCAST_SR_KEEP, true);
goto out;
}
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 4ae050ae4826e..fbb7a8156f347 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -375,6 +375,18 @@ struct net_bridge_port_group {
#define BRIDGE_MDBE_F_HOST_JOINED BIT(0)
#define BRIDGE_MDBE_F_HOST_STREAM_RESERVED BIT(1)
+#define BRIDGE_MDBE_F_HOST_MASK \
+ (BRIDGE_MDBE_F_HOST_JOINED | BRIDGE_MDBE_F_HOST_STREAM_RESERVED)
+
+/* How a host join treats BRIDGE_MDBE_F_HOST_STREAM_RESERVED. Only the MDB
+ * netlink path administers the flag (SET/CLEAR); data-path joins must leave an
+ * existing reservation intact (KEEP).
+ */
+enum br_mcast_sr_op {
+ BR_MCAST_SR_KEEP,
+ BR_MCAST_SR_CLEAR,
+ BR_MCAST_SR_SET,
+};
struct net_bridge_mdb_entry {
struct rhash_head rhnode;
@@ -1049,7 +1061,8 @@ int br_mdb_dump(struct net_device *dev, struct sk_buff *skb,
int br_mdb_get(struct net_device *dev, struct nlattr *tb[], u32 portid, u32 seq,
struct netlink_ext_ack *extack);
void br_multicast_host_join(const struct net_bridge_mcast *brmctx,
- struct net_bridge_mdb_entry *mp, bool notify);
+ struct net_bridge_mdb_entry *mp,
+ enum br_mcast_sr_op sr_op, bool notify);
void br_multicast_host_leave(struct net_bridge_mdb_entry *mp, bool notify);
void br_multicast_star_g_handle_mode(struct net_bridge_port_group *pg,
u8 filter_mode);
diff --git a/tools/testing/selftests/net/forwarding/bridge_mdb_stream_reserved.sh b/tools/testing/selftests/net/forwarding/bridge_mdb_stream_reserved.sh
index a21dc2ec3e95c..4c5933455037a 100755
--- a/tools/testing/selftests/net/forwarding/bridge_mdb_stream_reserved.sh
+++ b/tools/testing/selftests/net/forwarding/bridge_mdb_stream_reserved.sh
@@ -30,6 +30,8 @@
ALL_TESTS="
cfg_test
fwd_sr_member_test
+ fwd_sr_host_member_test
+ fwd_sr_host_persistence_test
fwd_foreign_blocked_test
fwd_unicast_blocked_test
fwd_flag_gates_test
@@ -217,11 +219,43 @@ cfg_test()
bridge mdb add dev br0 port $swp2 grp $GRP vid $VID \
stream_reserved 2>/dev/null
check_fail $? "non-permanent stream_reserved port entry accepted"
-
- # The flag must be rejected on host groups.
- bridge mdb add dev br0 port br0 grp $GRP permanent vid $VID \
+ bridge mdb add dev br0 port br0 grp $GRP vid $VID \
stream_reserved 2>/dev/null
- check_fail $? "stream_reserved accepted on a host group"
+ check_fail $? "non-permanent stream_reserved host group accepted"
+
+ # A plain (non-SR) host join is still accepted, must not be permanent,
+ # and toggles cleanly with stream_reserved on replace.
+ bridge mdb add dev br0 port br0 grp $GRP vid $VID
+ check_err $? "plain host join rejected"
+ bridge mdb add dev br0 port br0 grp $GRP permanent vid $VID 2>/dev/null
+ check_fail $? "permanent flag accepted on a plain host group"
+ bridge -d mdb show dev br0 | grep "port br0" | grep "$GRP" | \
+ grep -q "stream_reserved"
+ check_fail $? "stream_reserved unexpectedly set on a plain host join"
+ bridge mdb replace dev br0 port br0 grp $GRP permanent vid $VID \
+ stream_reserved
+ check_err $? "Failed to replace plain host join with stream_reserved"
+ bridge mdb replace dev br0 port br0 grp $GRP vid $VID
+ check_err $? "Failed to replace stream_reserved host group with plain"
+ bridge -d mdb show dev br0 | grep "port br0" | grep "$GRP" | \
+ grep -q "stream_reserved"
+ check_fail $? "stream_reserved not cleared on host group replace"
+ bridge mdb del dev br0 port br0 grp $GRP vid $VID
+
+ # permanent + stream_reserved is accepted on host groups and the
+ # entry is dumped as both permanent and stream_reserved.
+ bridge mdb add dev br0 port br0 grp $GRP permanent vid $VID \
+ stream_reserved
+ check_err $? "stream_reserved rejected on a host group"
+ bridge -d mdb show dev br0 | grep "port br0" | grep "$GRP" | \
+ grep -q "stream_reserved"
+ check_err $? "stream_reserved flag not shown on host group"
+ bridge -d mdb get dev br0 grp $GRP vid $VID | grep -q permanent
+ check_err $? "host stream_reserved entry not reported as permanent"
+ bridge -d -s mdb get dev br0 grp $GRP vid $VID | grep "port br0" | \
+ grep -q " 0.00"
+ check_err $? "host stream_reserved entry has a pending group timer"
+ bridge mdb del dev br0 port br0 grp $GRP permanent vid $VID
# Add a port group with the flag and confirm it is reflected in dump.
bridge mdb add dev br0 port $swp2 grp $GRP permanent vid $VID \
@@ -328,6 +362,89 @@ fwd_sr_member_test()
log_test "MDB stream_reserved member delivery"
}
+# An SR-class frame for a group the local bridge host has joined with
+# stream_reserved is delivered to the host (passed up via br0); without the
+# flag set on the host join, the same frame is denied at ingress.
+fwd_sr_host_member_test()
+{
+ RET=0
+
+ tc qdisc add dev br0 clsact
+ sr_filter $swp1 on
+
+ # Host join WITHOUT stream_reserved: SR-class frame must be dropped.
+ # A plain host-joined IP group cannot be permanent.
+ bridge mdb add dev br0 port br0 grp $GRP vid $VID
+ rx_filter_install br0 6 $GRP
+
+ send_mc $GRP $GRP_DMAC $SR_PCP
+ tc_check_packets "dev br0 ingress" 6 0
+ check_err $? "SR-class frame delivered to host without stream_reserved"
+
+ send_mc $GRP $GRP_DMAC $BE_PCP
+ tc_check_packets "dev br0 ingress" 6 1
+ check_err $? "best-effort frame not delivered to host"
+
+ # Replace host join WITH stream_reserved: SR-class frame admitted.
+ bridge mdb replace dev br0 port br0 grp $GRP permanent vid $VID \
+ stream_reserved
+ check_err $? "Failed to replace host group with stream_reserved"
+
+ send_mc $GRP $GRP_DMAC $SR_PCP
+ tc_check_packets "dev br0 ingress" 6 2
+ check_err $? "reserved-stream SR-class frame not delivered to host"
+
+ rx_filter_uninstall br0 6
+ bridge mdb del dev br0 port br0 grp $GRP permanent vid $VID
+ sr_filter $swp1 off
+ tc qdisc del dev br0 clsact
+
+ log_test "MDB stream_reserved host listener delivery"
+}
+
+# A permanent + stream_reserved host group has no group timer and must
+# outlive the membership interval, even when promoted from a plain (timer
+# armed) host join, which must not leave a stale group timer behind.
+fwd_sr_host_persistence_test()
+{
+ RET=0
+
+ ip link set dev br0 type bridge mcast_membership_interval 200
+
+ bridge mdb add dev br0 port br0 grp $GRP permanent vid $VID \
+ stream_reserved
+ check_err $? "Failed to add permanent stream_reserved host group"
+
+ sleep 3
+
+ bridge mdb get dev br0 grp $GRP vid $VID &>/dev/null
+ check_err $? "host stream_reserved entry expired"
+
+ bridge mdb del dev br0 port br0 grp $GRP permanent vid $VID
+
+ # A plain host join arms the group timer; promoting it to
+ # stream_reserved must cancel that timer, otherwise the reservation is
+ # torn down when the stale timer expires.
+ bridge mdb add dev br0 port br0 grp $GRP vid $VID
+ check_err $? "plain host join rejected"
+ bridge mdb replace dev br0 port br0 grp $GRP permanent vid $VID \
+ stream_reserved
+ check_err $? "Failed to promote plain host join to stream_reserved"
+ bridge -d -s mdb get dev br0 grp $GRP vid $VID | grep "port br0" | \
+ grep -q " 0.00"
+ check_err $? "stale group timer left after promotion to stream_reserved"
+
+ sleep 3
+
+ bridge mdb get dev br0 grp $GRP vid $VID &>/dev/null
+ check_err $? "promoted stream_reserved entry expired"
+
+ bridge mdb del dev br0 port br0 grp $GRP permanent vid $VID
+ ip link set dev br0 type bridge mcast_membership_interval 26000
+
+ log_test "MDB stream_reserved host entry persistence"
+}
+
# swp1 filters SR-class ingress. A foreign (non-reserved) group GRP2 at SR class
# is dropped at ingress, reaching neither listener, while a best-effort (TC 0)
# frame is admitted and delivered to both.
--
2.43.0
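
For readers following patch 4/6, the flag handling that the reworked
br_multicast_host_join() performs can be modelled outside the kernel.
What follows is a minimal sketch, not the bridge code itself: the names
host_join_flags, enum sr_op and the HOST_* macros are hypothetical
stand-ins for the kernel identifiers, and the timer handling and netlink
notification are left out. It only shows how KEEP/CLEAR/SET combine with
the XOR against the old flags to decide whether the host state changed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values mirror BRIDGE_MDBE_F_HOST_* in the br_private.h hunk;
 * the shortened names are illustrative only.
 */
#define HOST_JOINED          (1u << 0)
#define HOST_STREAM_RESERVED (1u << 1)
#define HOST_MASK            (HOST_JOINED | HOST_STREAM_RESERVED)

/* Same ordering as enum br_mcast_sr_op in the patch. */
enum sr_op { SR_KEEP, SR_CLEAR, SR_SET };

/* Hypothetical model of the flag update: always mark the host as joined,
 * apply the requested stream-reserved operation, and report whether any
 * host-visible bit changed (the case in which the patch propagates state
 * and notifies).
 */
static bool host_join_flags(uint8_t *flags, enum sr_op op)
{
	uint8_t old = *flags;

	assert(op == SR_KEEP || op == SR_CLEAR || op == SR_SET);

	*flags |= HOST_JOINED;
	if (op == SR_SET)
		*flags |= HOST_STREAM_RESERVED;
	else if (op == SR_CLEAR)
		*flags &= (uint8_t)~HOST_STREAM_RESERVED;

	return ((*flags ^ old) & HOST_MASK) != 0;
}
```

In this model a data-path join (SR_KEEP) on an entry that is already
joined and reserved reports no change, which is the property the
BR_MCAST_SR_KEEP comment in br_private.h asks for.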
* [PATCH net-next v2 5/6] net: dsa: mv88e6xxx: MQPRIO support
2026-06-02 0:43 [PATCH net-next v2 0/6] net: dsa: mv8ee6xxx: MQPRIO and 802.1Qat support Luke Howard
` (3 preceding siblings ...)
2026-06-02 0:43 ` [PATCH net-next v2 4/6] net: bridge: allow MDB_FLAGS_STREAM_RESERVED on host groups Luke Howard
@ 2026-06-02 0:43 ` Luke Howard
2026-06-02 12:00 ` Cedric Jehasse
2026-06-02 0:43 ` [PATCH net-next v2 6/6] net: dsa: mv88e6xxx: honour MDB_FLAGS_STREAM_RESERVED for AVB streams Luke Howard
5 siblings, 1 reply; 30+ messages in thread
From: Luke Howard @ 2026-06-02 0:43 UTC (permalink / raw)
To: Jiri Pirko, Ivan Vecera, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Nikolay Aleksandrov,
Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan, Andrew Lunn,
Vladimir Oltean
Cc: netdev, linux-kernel, bridge, linux-kselftest, Max Hunter,
Kieran Tyrrell, Luke Howard
Add MQPRIO traffic class offload for the Marvell 6352 and 6390 families
of switches, in two modes.
In DCB mode the switch is configured for AVB. Three traffic classes are
supported: legacy (TC0), low (TC1) and high (TC2), corresponding to
non-AVB, AVB Class B and AVB Class A traffic. A single Ethernet frame
priority may be mapped to each AVB class; the remaining "legacy"
(non-AVB) frame priorities are distributed across the queues assigned
to TC0 by the MQPRIO policy.
In channel mode any frame priority may be mapped to any queue, without
AVB semantics. On the 6390 family the frame priority to queue priority
mapping is programmed per-port, so each port may have an independent
policy.
The AVB (DCB mode) policy is held in switch-global registers and so is
necessarily per-switch rather than per-port: HW offload can only be
enabled across multiple ports if the policy on each enabled port is the
same. This restriction does not apply to channel mode on devices with
per-port priority maps.
While in AVB mode, a port with the BR_FILTER_STREAM_RESERVED bridge flag
set is placed in enhanced AVB mode, which discards AVB-priority frames
whose destination is not a reserved stream.
Signed-off-by: Luke Howard <lukeh@padl.com>
---
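
As an aid to checking the per-chip avb_pri_map defaults against the
field layout in avb.h, the packing of the AVB Global Config word can be
sketched in plain C. This is an illustration under assumptions, not
driver code: avb_pri_map_pack is a hypothetical helper, and the shifts
are written out by hand from the GENMASK() definitions (HI FPri 14:12,
HI QPri 10:8, LO FPri 6:4, LO QPri 2:0) instead of using FIELD_PREP().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack the four 3-bit fields of the AVB Global
 * Config register as laid out by the masks in avb.h.
 */
static uint16_t avb_pri_map_pack(unsigned int hi_fpri, unsigned int hi_qpri,
				 unsigned int lo_fpri, unsigned int lo_qpri)
{
	/* Each field is 3 bits wide. */
	assert(hi_fpri < 8 && hi_qpri < 8 && lo_fpri < 8 && lo_qpri < 8);

	return (uint16_t)((hi_fpri << 12) | (hi_qpri << 8) |
			  (lo_fpri << 4) | lo_qpri);
}
```

With these shifts, avb_pri_map_pack(5, 3, 4, 2) gives 0x5342 and
avb_pri_map_pack(3, 7, 2, 6) gives 0x3726, matching the comments next
to the 6352/6341 and 6390 defaults in chip.c.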
drivers/net/dsa/mv88e6xxx/Makefile | 3 +-
drivers/net/dsa/mv88e6xxx/avb.c | 221 +++++++++++++++++
drivers/net/dsa/mv88e6xxx/avb.h | 79 +++++++
drivers/net/dsa/mv88e6xxx/chip.c | 406 +++++++++++++++++++++++++++++++-
drivers/net/dsa/mv88e6xxx/chip.h | 68 +++++-
drivers/net/dsa/mv88e6xxx/global1.c | 28 ++-
drivers/net/dsa/mv88e6xxx/global1.h | 6 +-
drivers/net/dsa/mv88e6xxx/global1_atu.c | 17 ++
drivers/net/dsa/mv88e6xxx/global2.h | 2 +
drivers/net/dsa/mv88e6xxx/global2_avb.c | 121 ++++++++++
drivers/net/dsa/mv88e6xxx/port.c | 18 ++
drivers/net/dsa/mv88e6xxx/port.h | 2 +
12 files changed, 959 insertions(+), 12 deletions(-)
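
The way DCB mode spreads the legacy frame priorities over the TC0
queues can also be worked through in isolation. The sketch below
follows the arithmetic of mv88e6xxx_validate_tc_mqprio_avb() but is a
simplified, hypothetical model: struct mqprio_sketch and
avb_build_pri_map are made-up names, and all of the validation (queue
range, one queue per AVB TC, one FPri per AVB TC) is assumed to have
passed already.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_FPRI   8	/* IEEE_8021Q_MAX_PRIORITIES */
#define NUM_AVB_TC 3	/* legacy (TC0), low (TC1), high (TC2) */

/* Hypothetical, trimmed-down view of the MQPRIO options used here. */
struct mqprio_sketch {
	uint8_t prio_tc_map[NUM_FPRI];	/* frame priority -> traffic class */
	uint8_t offset[NUM_AVB_TC];	/* first queue of each TC */
	uint8_t count[NUM_AVB_TC];	/* number of queues in each TC */
};

/* Build the frame-priority -> queue-priority map. The two AVB classes
 * each take one FPri, so NUM_FPRI - 2 legacy FPris are divided, in
 * ascending order, over the TC0 queues; AVB FPris go to their TC's
 * single queue.
 */
static void avb_build_pri_map(const struct mqprio_sketch *q,
			      uint8_t map[NUM_FPRI])
{
	unsigned int per_qpri, seen = 0, fpri;

	assert(q->count[0] > 0);

	/* DIV_ROUND_UP(NUM_FPRI - 2, count[0]) */
	per_qpri = (NUM_FPRI - 2 + q->count[0] - 1) / q->count[0];

	for (fpri = 0; fpri < NUM_FPRI; fpri++) {
		unsigned int tc = q->prio_tc_map[fpri];

		assert(tc < NUM_AVB_TC);

		if (tc == 0)
			map[fpri] = (uint8_t)(q->offset[0] + seen++ / per_qpri);
		else
			map[fpri] = q->offset[tc];
	}
}
```

For a four-queue chip with TC0 on queues 0-1, TC1 on queue 2 and TC2 on
queue 3, and frame priorities 4 and 5 assigned to TC1 and TC2, this
yields the map 0 0 0 1 2 3 1 1 for frame priorities 0 through 7.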
diff --git a/drivers/net/dsa/mv88e6xxx/Makefile b/drivers/net/dsa/mv88e6xxx/Makefile
index b0b08c6f159c6..6123b431e255e 100644
--- a/drivers/net/dsa/mv88e6xxx/Makefile
+++ b/drivers/net/dsa/mv88e6xxx/Makefile
@@ -1,6 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_NET_DSA_MV88E6XXX) += mv88e6xxx.o
-mv88e6xxx-objs := chip.o
+mv88e6xxx-objs := avb.o
+mv88e6xxx-objs += chip.o
mv88e6xxx-objs += devlink.o
mv88e6xxx-objs += global1.o
mv88e6xxx-objs += global1_atu.o
diff --git a/drivers/net/dsa/mv88e6xxx/avb.c b/drivers/net/dsa/mv88e6xxx/avb.c
new file mode 100644
index 0000000000000..d992ae560454c
--- /dev/null
+++ b/drivers/net/dsa/mv88e6xxx/avb.c
@@ -0,0 +1,221 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Marvell 88E6xxx Switch AVB support
+ *
+ * Copyright (c) 2024-2026 PADL Software Pty Ltd
+ */
+
+#include "avb.h"
+#include "chip.h"
+#include "global1.h"
+#include "global2.h"
+#include "port.h"
+
+static int mv88e6xxx_qav_read(struct mv88e6xxx_chip *chip, int addr,
+ u16 *data, int len)
+{
+ if (!chip->info->ops->avb_ops->qav_read)
+ return -EOPNOTSUPP;
+
+ return chip->info->ops->avb_ops->qav_read(chip, addr, data, len);
+}
+
+static int mv88e6xxx_qav_write(struct mv88e6xxx_chip *chip, int addr, u16 data)
+{
+ if (!chip->info->ops->avb_ops->qav_write)
+ return -EOPNOTSUPP;
+
+ return chip->info->ops->avb_ops->qav_write(chip, addr, data);
+}
+
+static int mv88e6xxx_avb_write(struct mv88e6xxx_chip *chip, int addr, u16 data)
+{
+ if (!chip->info->ops->avb_ops->avb_write)
+ return -EOPNOTSUPP;
+
+ return chip->info->ops->avb_ops->avb_write(chip, addr, data);
+}
+
+static int mv88e6xxx_port_avb_read(struct mv88e6xxx_chip *chip, int port,
+ int addr, u16 *data, int len)
+{
+ if (!chip->info->ops->avb_ops->port_avb_read)
+ return -EOPNOTSUPP;
+
+ return chip->info->ops->avb_ops->port_avb_read(chip, port, addr,
+ data, len);
+}
+
+static int mv88e6xxx_port_avb_write(struct mv88e6xxx_chip *chip, int port,
+ int addr, u16 data)
+{
+ if (!chip->info->ops->avb_ops->port_avb_write)
+ return -EOPNOTSUPP;
+
+ return chip->info->ops->avb_ops->port_avb_write(chip, port, addr, data);
+}
+
+static int mv88e6xxx_qav_set_iso_ptr(struct mv88e6xxx_chip *chip, u16 threshold)
+{
+ u16 data;
+ int err;
+
+ err = mv88e6xxx_qav_read(chip, MV88E6XXX_QAV_CFG, &data, 1);
+ if (err)
+ return err;
+
+ data &= ~(MV88E6XXX_QAV_CFG_GLOBAL_ISO_PTR_MASK);
+ data |= MV88E6XXX_QAV_CFG_GLOBAL_ISO_PTR_SET(threshold);
+
+ return mv88e6xxx_qav_write(chip, MV88E6XXX_QAV_CFG, data);
+}
+
+int mv88e6xxx_avb_set_port_avb_mode(struct mv88e6xxx_chip *chip,
+ int port, enum mv88e6xxx_avb_mode mode)
+{
+ u16 data;
+ int err;
+
+ err = mv88e6xxx_port_avb_read(chip, port, MV88E6XXX_PORT_AVB_CFG, &data, 1);
+ if (err)
+ return err;
+
+ data &= ~(MV88E6XXX_PORT_AVB_CFG_AVB_MODE |
+ MV88E6XXX_PORT_AVB_CFG_AVB_FILTER_BAD_AVB |
+ MV88E6XXX_PORT_AVB_CFG_AVB_DISCARD_BAD);
+
+ /* Enhanced mode additionally discards AVB-priority frames whose
+ * destination is not a reserved-stream (AVB_NRL) ATU entry, reserving
+ * the AVB traffic classes for reserved streams.
+ */
+ switch (mode) {
+ case MV88E6XXX_AVB_MODE_DISABLED:
+ data |= MV88E6XXX_PORT_AVB_CFG_AVB_MODE_LEGACY;
+ break;
+ case MV88E6XXX_AVB_MODE_STANDARD:
+ data |= MV88E6XXX_PORT_AVB_CFG_AVB_MODE_STANDARD;
+ break;
+ case MV88E6XXX_AVB_MODE_ENHANCED:
+ data |= MV88E6XXX_PORT_AVB_CFG_AVB_MODE_ENHANCED |
+ MV88E6XXX_PORT_AVB_CFG_AVB_FILTER_BAD_AVB |
+ MV88E6XXX_PORT_AVB_CFG_AVB_DISCARD_BAD;
+ break;
+ }
+
+ return mv88e6xxx_port_avb_write(chip, port, MV88E6XXX_PORT_AVB_CFG, data);
+}
+
+static u8 mv88e6xxx_mqprio_tc_fpri(const struct tc_mqprio_qopt *qopt, int tc)
+{
+ u8 fpri;
+
+ for (fpri = 0; fpri < IEEE_8021Q_MAX_PRIORITIES; fpri++)
+ if (qopt->prio_tc_map[fpri] == tc)
+ return fpri;
+
+ return 0;
+}
+
+static u16 mv88e6xxx_avb_pri_map_to_reg(const struct tc_mqprio_qopt *qopt)
+{
+ u8 hi_fpri = mv88e6xxx_mqprio_tc_fpri(qopt, MV88E6XXX_AVB_TC_HI);
+ u8 lo_fpri = mv88e6xxx_mqprio_tc_fpri(qopt, MV88E6XXX_AVB_TC_LO);
+ u8 hi_qpri = qopt->offset[MV88E6XXX_AVB_TC_HI];
+ u8 lo_qpri = qopt->offset[MV88E6XXX_AVB_TC_LO];
+
+ return MV88E6XXX_AVB_CFG_AVB_HI_FPRI_SET(hi_fpri) |
+ MV88E6XXX_AVB_CFG_AVB_HI_QPRI_SET(hi_qpri) |
+ MV88E6XXX_AVB_CFG_AVB_LO_FPRI_SET(lo_fpri) |
+ MV88E6XXX_AVB_CFG_AVB_LO_QPRI_SET(lo_qpri);
+}
+
+int mv88e6xxx_avb_enable(struct mv88e6xxx_chip *chip,
+ struct tc_mqprio_qopt_offload *mqprio)
+{
+ const struct mv88e6xxx_qav_info *qav = chip->info->qav;
+ enum mv88e6xxx_avb_mode mode;
+ int err, port;
+
+ if (!qav)
+ return -EOPNOTSUPP;
+
+ err = mv88e6xxx_qav_set_iso_ptr(chip, mv88e6xxx_num_ports(chip) << 6);
+ if (err)
+ return err;
+
+ /* interpret AVB_NRL bits in the ATU as STREAM_RESERVED */
+ err = mv88e6xxx_g1_atu_set_mac_avb(chip, true);
+ if (err)
+ goto err_iso_ptr;
+
+ err = mv88e6xxx_avb_write(chip, MV88E6XXX_AVB_CFG_AVB,
+ mv88e6xxx_avb_pri_map_to_reg(&mqprio->qopt));
+ if (err)
+ goto err_mac_avb;
+
+ /* A port with BR_FILTER_STREAM_RESERVED set uses enhanced AVB mode to
+ * reserve its AVB queues for reserved streams.
+ */
+ for (port = 0; port < mv88e6xxx_num_ports(chip); port++) {
+ if (!dsa_is_user_port(chip->ds, port))
+ continue;
+
+ mode = (chip->tc_policy.avb_enhanced_port_mask & BIT(port)) ?
+ MV88E6XXX_AVB_MODE_ENHANCED : MV88E6XXX_AVB_MODE_STANDARD;
+
+ err = mv88e6xxx_avb_set_port_avb_mode(chip, port, mode);
+ if (err)
+ goto err_port_mode;
+ }
+
+ return 0;
+
+err_port_mode:
+ while (port-- > 0) {
+ if (!dsa_is_user_port(chip->ds, port))
+ continue;
+
+ mv88e6xxx_avb_set_port_avb_mode(chip, port, MV88E6XXX_AVB_MODE_DISABLED);
+ }
+ mv88e6xxx_avb_write(chip, MV88E6XXX_AVB_CFG_AVB, qav->avb_pri_map);
+err_mac_avb:
+ mv88e6xxx_g1_atu_set_mac_avb(chip, false);
+err_iso_ptr:
+ mv88e6xxx_qav_set_iso_ptr(chip, 0);
+
+ return err;
+}
+
+int mv88e6xxx_avb_disable(struct mv88e6xxx_chip *chip)
+{
+ const struct mv88e6xxx_qav_info *qav = chip->info->qav;
+ int err, port;
+
+ if (!qav)
+ return -EOPNOTSUPP;
+
+ for (port = 0; port < mv88e6xxx_num_ports(chip); port++) {
+ if (!dsa_is_user_port(chip->ds, port))
+ continue;
+
+ err = mv88e6xxx_avb_set_port_avb_mode(chip, port,
+ MV88E6XXX_AVB_MODE_DISABLED);
+ if (err)
+ return err;
+ }
+
+ err = mv88e6xxx_avb_write(chip, MV88E6XXX_AVB_CFG_AVB, qav->avb_pri_map);
+ if (err)
+ return err;
+
+ err = mv88e6xxx_g1_atu_set_mac_avb(chip, false);
+ if (err)
+ return err;
+
+ err = mv88e6xxx_qav_set_iso_ptr(chip, 0);
+ if (err)
+ return err;
+
+ return 0;
+}
+
diff --git a/drivers/net/dsa/mv88e6xxx/avb.h b/drivers/net/dsa/mv88e6xxx/avb.h
new file mode 100644
index 0000000000000..0bd92724af4e2
--- /dev/null
+++ b/drivers/net/dsa/mv88e6xxx/avb.h
@@ -0,0 +1,79 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Marvell 88E6xxx Switch AVB support
+ *
+ * Copyright (c) 2024-2026 PADL Software Pty Ltd
+ */
+
+#ifndef _MV88E6XXX_AVB_H
+#define _MV88E6XXX_AVB_H
+
+#include "chip.h"
+
+/* Global AVB registers */
+
+/* Offset 0x00: AVB Global Config */
+
+#define MV88E6XXX_AVB_CFG_AVB 0x00
+
+#define MV88E6XXX_AVB_CFG_AVB_HI_FPRI_MASK GENMASK(14, 12)
+#define MV88E6XXX_AVB_CFG_AVB_HI_FPRI_SET(p) FIELD_PREP(MV88E6XXX_AVB_CFG_AVB_HI_FPRI_MASK, p)
+
+#define MV88E6XXX_AVB_CFG_AVB_LO_FPRI_MASK GENMASK(6, 4)
+#define MV88E6XXX_AVB_CFG_AVB_LO_FPRI_SET(p) FIELD_PREP(MV88E6XXX_AVB_CFG_AVB_LO_FPRI_MASK, p)
+
+#define MV88E6XXX_AVB_CFG_AVB_HI_QPRI_MASK GENMASK(10, 8)
+#define MV88E6XXX_AVB_CFG_AVB_HI_QPRI_SET(p) FIELD_PREP(MV88E6XXX_AVB_CFG_AVB_HI_QPRI_MASK, p)
+
+#define MV88E6XXX_AVB_CFG_AVB_LO_QPRI_MASK GENMASK(2, 0)
+#define MV88E6XXX_AVB_CFG_AVB_LO_QPRI_SET(p) FIELD_PREP(MV88E6XXX_AVB_CFG_AVB_LO_QPRI_MASK, p)
+
+/* Global Qav registers */
+#define MV88E6XXX_QAV_CFG 0x00
+
+#define MV88E6XXX_QAV_CFG_GLOBAL_ISO_PTR_MASK GENMASK(9, 0)
+#define MV88E6XXX_QAV_CFG_GLOBAL_ISO_PTR_GET(x) FIELD_GET(MV88E6XXX_QAV_CFG_GLOBAL_ISO_PTR_MASK, x)
+#define MV88E6XXX_QAV_CFG_GLOBAL_ISO_PTR_SET(x) FIELD_PREP(MV88E6XXX_QAV_CFG_GLOBAL_ISO_PTR_MASK, x)
+
+/* allow mgmt frames in isochronous pointer pool */
+#define MV88E6XXX_QAV_CFG_ADMIT_MGMT 0x8000
+
+/* Per-port AVB registers */
+
+/* Offset 0x00: AVB Port Config */
+#define MV88E6XXX_PORT_AVB_CFG 0x00
+#define MV88E6XXX_PORT_AVB_CFG_AVB_MODE GENMASK(15, 14)
+/* all frames legacy (non-AVB) unless overridden */
+#define MV88E6XXX_PORT_AVB_CFG_AVB_MODE_LEGACY 0x0000
+/* AVB frames indicated by priority */
+#define MV88E6XXX_PORT_AVB_CFG_AVB_MODE_STANDARD 0x4000
+/* STANDARD && ATU has STATIC_AVB_NRL bit set */
+#define MV88E6XXX_PORT_AVB_CFG_AVB_MODE_ENHANCED 0x8000
+/* ENHANCED && source port in destination port vector */
+#define MV88E6XXX_PORT_AVB_CFG_AVB_MODE_SECURE 0xc000
+
+#define MV88E6XXX_PORT_AVB_CFG_AVB_OVERRIDE 0x2000
+#define MV88E6XXX_PORT_AVB_CFG_AVB_FILTER_BAD_AVB 0x1000
+#define MV88E6XXX_PORT_AVB_CFG_AVB_TUNNEL 0x0800
+#define MV88E6XXX_PORT_AVB_CFG_AVB_DISCARD_BAD 0x0400
+
+int mv88e6xxx_avb_enable(struct mv88e6xxx_chip *chip,
+ struct tc_mqprio_qopt_offload *mqprio);
+int mv88e6xxx_avb_disable(struct mv88e6xxx_chip *chip);
+
+/**
+ * enum mv88e6xxx_avb_mode - Current AVB mode
+ * @MV88E6XXX_AVB_MODE_DISABLED: No AVB TCs (DCB Qdisc) configured
+ * @MV88E6XXX_AVB_MODE_STANDARD: AVB configured, BR_FILTER_STREAM_RESERVED unset
+ * @MV88E6XXX_AVB_MODE_ENHANCED: AVB configured, BR_FILTER_STREAM_RESERVED set
+ */
+enum mv88e6xxx_avb_mode {
+ MV88E6XXX_AVB_MODE_DISABLED = 0,
+ MV88E6XXX_AVB_MODE_STANDARD,
+ MV88E6XXX_AVB_MODE_ENHANCED,
+};
+
+int mv88e6xxx_avb_set_port_avb_mode(struct mv88e6xxx_chip *chip, int port,
+ enum mv88e6xxx_avb_mode mode);
+
+#endif /* _MV88E6XXX_AVB_H */
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 2596e05681b43..db79302c2b84d 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -34,6 +34,7 @@
#include <net/dsa.h>
#include <net/pkt_sched.h>
+#include "avb.h"
#include "chip.h"
#include "devlink.h"
#include "global1.h"
@@ -1615,7 +1616,7 @@ static int mv88e6xxx_pri_setup(struct mv88e6xxx_chip *chip)
int err;
if (chip->info->ops->ieee_pri_map) {
- err = chip->info->ops->ieee_pri_map(chip);
+ err = chip->info->ops->ieee_pri_map(chip, NULL);
if (err)
return err;
}
@@ -5531,6 +5532,7 @@ static const struct mv88e6xxx_ops mv88e6390_ops = {
.port_set_cmode = mv88e6390_port_set_cmode,
.port_setup_message_port = mv88e6xxx_setup_message_port,
.port_set_scheduling_mode = mv88e6390_port_set_scheduling_mode,
+ .port_ieee_pri_map = mv88e6390_port_ieee_pri_map,
.stats_snapshot = mv88e6390_g1_stats_snapshot,
.stats_set_histogram = mv88e6390_g1_stats_set_histogram,
.stats_get_sset_count = mv88e6320_stats_get_sset_count,
@@ -5596,6 +5598,7 @@ static const struct mv88e6xxx_ops mv88e6390x_ops = {
.port_set_cmode = mv88e6390x_port_set_cmode,
.port_setup_message_port = mv88e6xxx_setup_message_port,
.port_set_scheduling_mode = mv88e6390_port_set_scheduling_mode,
+ .port_ieee_pri_map = mv88e6390_port_ieee_pri_map,
.stats_snapshot = mv88e6390_g1_stats_snapshot,
.stats_set_histogram = mv88e6390_g1_stats_set_histogram,
.stats_get_sset_count = mv88e6320_stats_get_sset_count,
@@ -5653,6 +5656,7 @@ static const struct mv88e6xxx_ops mv88e6393x_ops = {
.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
.port_set_scheduling_mode = mv88e6390_port_set_scheduling_mode,
+ .port_ieee_pri_map = mv88e6390_port_ieee_pri_map,
.port_pause_limit = mv88e6390_port_pause_limit,
.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
@@ -5700,6 +5704,10 @@ static const struct mv88e6xxx_qav_info mv88e6352_qav_info = {
.rate_mask = GENMASK(14, 0),
.hi_limit_mask = GENMASK(14, 0),
.queue_mask = GENMASK(3, 0),
+ /* legacy (all queues), lo (queue 1/2), hi (queue 2/3) */
+ .avb_queue_mask = { GENMASK(3, 0), GENMASK(2, 1), GENMASK(3, 2) },
+ /* HI FPri 5/QPri 3, LO FPri 4/QPri 2 */
+ .avb_pri_map = 0x5342,
};
static const struct mv88e6xxx_qav_info mv88e6341_qav_info = {
@@ -5707,6 +5715,10 @@ static const struct mv88e6xxx_qav_info mv88e6341_qav_info = {
.rate_mask = GENMASK(15, 0),
.hi_limit_mask = GENMASK(14, 0),
.queue_mask = GENMASK(3, 0),
+ /* legacy (all queues), lo (queue 1/2), hi (queue 2/3) */
+ .avb_queue_mask = { GENMASK(3, 0), GENMASK(2, 1), GENMASK(3, 2) },
+ /* HI FPri 5/QPri 3, LO FPri 4/QPri 2 */
+ .avb_pri_map = 0x5342,
};
static const struct mv88e6xxx_qav_info mv88e6390_qav_info = {
@@ -5714,6 +5726,10 @@ static const struct mv88e6xxx_qav_info mv88e6390_qav_info = {
.rate_mask = GENMASK(15, 0),
.hi_limit_mask = GENMASK(13, 0),
.queue_mask = GENMASK(7, 0),
+ /* AVB traffic allowed on all queues */
+ .avb_queue_mask = { GENMASK(7, 0), GENMASK(7, 0), GENMASK(7, 0) },
+ /* HI FPri 3/QPri 7, LO FPri 2/QPri 6 */
+ .avb_pri_map = 0x3726,
};
static const struct mv88e6xxx_info mv88e6xxx_table[] = {
@@ -6878,7 +6894,8 @@ static int mv88e6xxx_port_pre_bridge_flags(struct dsa_switch *ds, int port,
const struct mv88e6xxx_ops *ops;
if (flags.mask & ~(BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD |
- BR_BCAST_FLOOD | BR_PORT_LOCKED | BR_PORT_MAB))
+ BR_BCAST_FLOOD | BR_PORT_LOCKED | BR_PORT_MAB |
+ BR_FILTER_STREAM_RESERVED))
return -EINVAL;
ops = chip->info->ops;
@@ -6889,6 +6906,30 @@ static int mv88e6xxx_port_pre_bridge_flags(struct dsa_switch *ds, int port,
if ((flags.mask & BR_MCAST_FLOOD) && !ops->port_set_mcast_flood)
return -EINVAL;
+ if ((flags.mask & BR_FILTER_STREAM_RESERVED) && !chip->info->qav)
+ return -EINVAL;
+
+ return 0;
+}
+
+static int mv88e6xxx_port_set_avb_mode(struct mv88e6xxx_chip *chip, int port,
+ bool enhanced)
+{
+ int err;
+
+ if (chip->tc_policy.tc_mode == MV88E6XXX_TC_MODE_AVB) {
+ err = mv88e6xxx_avb_set_port_avb_mode(chip, port,
+ enhanced ? MV88E6XXX_AVB_MODE_ENHANCED :
+ MV88E6XXX_AVB_MODE_STANDARD);
+ if (err)
+ return err;
+ }
+
+ if (enhanced)
+ chip->tc_policy.avb_enhanced_port_mask |= BIT(port);
+ else
+ chip->tc_policy.avb_enhanced_port_mask &= ~BIT(port);
+
return 0;
}
@@ -6949,6 +6990,14 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port,
if (err)
goto out;
}
+
+ if (flags.mask & BR_FILTER_STREAM_RESERVED) {
+ bool enhanced = !!(flags.val & BR_FILTER_STREAM_RESERVED);
+
+ err = mv88e6xxx_port_set_avb_mode(chip, port, enhanced);
+ if (err)
+ goto out;
+ }
out:
mv88e6xxx_reg_unlock(chip);
@@ -7236,6 +7285,355 @@ static int mv88e6xxx_crosschip_lag_leave(struct dsa_switch *ds, int sw_index,
return err_sync ? : err_pvt;
}
+static int mv88e6xxx_tc_query_caps(struct tc_query_caps_base *base)
+{
+ switch (base->type) {
+ case TC_SETUP_QDISC_MQPRIO: {
+ struct tc_mqprio_caps *caps = base->caps;
+
+ caps->validate_queue_counts = true;
+
+ return 0;
+ }
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
+static int mv88e6xxx_validate_tc_mqprio_avb(const struct mv88e6xxx_chip *chip,
+ const struct tc_mqprio_qopt_offload *mqprio,
+ u8 ieee_pri_map[IEEE_8021Q_MAX_PRIORITIES])
+{
+ const struct mv88e6xxx_qav_info *qav = chip->info->qav;
+ const struct tc_mqprio_qopt *qopt = &mqprio->qopt;
+ int tc0_qcount, tc0_base_qpri, tc0_fpri_per_qpri;
+ struct netlink_ext_ack *extack = mqprio->extack;
+ u8 ieee_pri_map_set = 0;
+ int tc, fpri;
+
+ if (!qav || !chip->info->ops->avb_ops) {
+ NL_SET_ERR_MSG_MOD(extack, "chip does not support MQPRIO DCB offload");
+ return -EOPNOTSUPP;
+ } else if (mqprio->shaper != TC_MQPRIO_SHAPER_DCB) {
+ NL_SET_ERR_MSG_MOD(extack, "only DCB shaper is supported for AVB mode");
+ return -EOPNOTSUPP;
+ } else if (qopt->num_tc > MV88E6XXX_AVB_TC_MAX + 1) {
+ NL_SET_ERR_MSG_MOD(extack, "too many traffic classes for AVB mode");
+ return -EOPNOTSUPP;
+ }
+
+ /* Validate and map TCs to QPri */
+ for (tc = MV88E6XXX_AVB_TC_LEGACY; tc < qopt->num_tc; tc++) {
+ if (qopt->offset[tc] + qopt->count[tc] > chip->info->num_tx_queues) {
+ NL_SET_ERR_MSG_FMT_MOD(extack, "queue %d out of range",
+ qopt->offset[tc] + qopt->count[tc] - 1);
+ return -EOPNOTSUPP;
+ }
+
+ if (tc == MV88E6XXX_AVB_TC_LEGACY) {
+ if (qopt->count[tc] == 0) {
+ NL_SET_ERR_MSG_MOD(extack, "TC0 must have at least one queue");
+ return -ERANGE;
+ }
+
+ tc0_base_qpri = qopt->offset[tc];
+ tc0_fpri_per_qpri = DIV_ROUND_UP(IEEE_8021Q_MAX_PRIORITIES - 2,
+ qopt->count[tc]);
+ } else if (qopt->count[tc] != 1) {
+ NL_SET_ERR_MSG_FMT_MOD(extack, "only one queue supported for TC%d", tc);
+ return -EOPNOTSUPP;
+ } else if ((qav->avb_queue_mask[tc] & BIT(qopt->offset[tc])) == 0) {
+ NL_SET_ERR_MSG_FMT_MOD(extack, "queue %d not valid for TC%d",
+ qopt->offset[tc], tc);
+ return -EOPNOTSUPP;
+ }
+ }
+
+ /* Validate and map FPri to QPri: AVB FPris are mapped to a single QPri,
+ * with the remaining legacy (TC0) FPris being distributed amongst the
+ * remaining QPris.
+ */
+ for (fpri = 0, tc0_qcount = 0; fpri < IEEE_8021Q_MAX_PRIORITIES; fpri++) {
+ tc = qopt->prio_tc_map[fpri];
+
+ if (tc == MV88E6XXX_AVB_TC_LEGACY) {
+ ieee_pri_map[fpri] = tc0_base_qpri + (tc0_qcount++ / tc0_fpri_per_qpri);
+ continue;
+ }
+
+ if (ieee_pri_map_set & BIT(tc)) {
+ NL_SET_ERR_MSG_FMT_MOD(extack,
+ "only one frame priority can be mapped to TC%d", tc);
+ return -EOPNOTSUPP;
+ }
+
+ ieee_pri_map_set |= BIT(tc);
+ ieee_pri_map[fpri] = qopt->offset[tc];
+ }
+
+ if (ieee_pri_map_set != GENMASK(MV88E6XXX_AVB_TC_HI, MV88E6XXX_AVB_TC_LO)) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "both TC1 and TC2 must have 802.1p priorities assigned");
+ return -EOPNOTSUPP;
+ }
+
+ return qopt->num_tc;
+}
+
+static int mv88e6xxx_validate_tc_mqprio_qpri(const struct mv88e6xxx_chip *chip,
+ const struct tc_mqprio_qopt_offload *mqprio,
+ u8 ieee_pri_map[IEEE_8021Q_MAX_PRIORITIES])
+{
+ const struct tc_mqprio_qopt *qopt = &mqprio->qopt;
+ struct netlink_ext_ack *extack = mqprio->extack;
+ int tc, qpri, fpri;
+
+ if (mqprio->flags & (TC_MQPRIO_F_MIN_RATE | TC_MQPRIO_F_MAX_RATE)) {
+ NL_SET_ERR_MSG_MOD(extack, "per-queue rate limiting is not supported");
+ return -EOPNOTSUPP;
+ }
+
+ for (tc = 0; tc < qopt->num_tc; tc++) {
+ if (qopt->offset[tc] + qopt->count[tc] > chip->info->num_tx_queues) {
+ NL_SET_ERR_MSG_FMT_MOD(extack, "queue %d out of range",
+ qopt->offset[tc] + qopt->count[tc] - 1);
+ return -EOPNOTSUPP;
+ }
+ }
+
+ for (fpri = 0; fpri < IEEE_8021Q_MAX_PRIORITIES; fpri++) {
+ qpri = qopt->prio_tc_map[fpri];
+ ieee_pri_map[fpri] = qopt->offset[qpri];
+ }
+
+ return qopt->num_tc;
+}
+
+static inline enum mv88e6xxx_tc_mode
+mv88e6xxx_mqprio_get_tc_mode(const struct tc_mqprio_qopt_offload *mqprio)
+{
+ switch (mqprio->mode) {
+ case TC_MQPRIO_MODE_DCB:
+ return MV88E6XXX_TC_MODE_AVB;
+ case TC_MQPRIO_MODE_CHANNEL:
+ return MV88E6XXX_TC_MODE_QPRI;
+ default:
+ return MV88E6XXX_TC_MODE_NONE;
+ }
+}
+
+static int mv88e6xxx_validate_tc_mqprio(const struct mv88e6xxx_chip *chip,
+ const struct tc_mqprio_qopt_offload *mqprio,
+ enum mv88e6xxx_tc_mode *tc_mode,
+ u8 ieee_pri_map[IEEE_8021Q_MAX_PRIORITIES])
+{
+ const struct tc_mqprio_qopt *qopt = &mqprio->qopt;
+ struct netlink_ext_ack *extack = mqprio->extack;
+ int err;
+
+ if (qopt->num_tc == 0) {
+ *tc_mode = chip->tc_policy.tc_mode;
+ return 0;
+ }
+
+ if (qopt->hw != TC_MQPRIO_HW_OFFLOAD_TCS) {
+ NL_SET_ERR_MSG_MOD(extack, "only full TC hardware offload is supported");
+ return -EOPNOTSUPP;
+ } else if (mqprio->preemptible_tcs) {
+ NL_SET_ERR_MSG_MOD(extack, "frame preemption is not supported");
+ return -EOPNOTSUPP;
+ }
+
+ *tc_mode = mv88e6xxx_mqprio_get_tc_mode(mqprio);
+
+ switch (*tc_mode) {
+ case MV88E6XXX_TC_MODE_AVB:
+ err = mv88e6xxx_validate_tc_mqprio_avb(chip, mqprio, ieee_pri_map);
+ break;
+ case MV88E6XXX_TC_MODE_QPRI:
+ err = mv88e6xxx_validate_tc_mqprio_qpri(chip, mqprio, ieee_pri_map);
+ break;
+ default:
+ err = -EOPNOTSUPP;
+ break;
+ }
+
+ return err;
+}
+
+static int mv88e6xxx_set_port_ieee_pri_map(struct mv88e6xxx_chip *chip,
+ int port, const u8 *ieee_pri_map)
+{
+ if (!chip->info->ops->port_ieee_pri_map)
+ return -EOPNOTSUPP;
+
+ return chip->info->ops->port_ieee_pri_map(chip, port, ieee_pri_map);
+}
+
+static int mv88e6xxx_set_ieee_pri_map(struct mv88e6xxx_chip *chip,
+ const u8 *ieee_pri_map)
+{
+ int port, err;
+
+ if (chip->info->ops->ieee_pri_map)
+ return chip->info->ops->ieee_pri_map(chip, ieee_pri_map);
+
+ if (!chip->info->ops->port_ieee_pri_map)
+ return -EOPNOTSUPP;
+
+ for (port = 0; port < mv88e6xxx_num_ports(chip); port++) {
+ if (!dsa_is_user_port(chip->ds, port))
+ continue;
+
+ err = chip->info->ops->port_ieee_pri_map(chip, port, ieee_pri_map);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+static inline bool mv88e6xxx_tc_mode_map_equal(struct mv88e6xxx_chip *chip,
+ enum mv88e6xxx_tc_mode tc_mode,
+ const u8 ieee_pri_map[IEEE_8021Q_MAX_PRIORITIES])
+{
+ if (memcmp(ieee_pri_map, chip->tc_policy.ieee_pri_map,
+ IEEE_8021Q_MAX_PRIORITIES))
+ return false;
+
+ if (chip->tc_policy.tc_mode != tc_mode)
+ return false;
+
+ return true;
+}
+
+static void mv88e6xxx_mqprio_update_policy(struct mv88e6xxx_tc_policy *pol,
+ int port, int num_tc,
+ enum mv88e6xxx_tc_mode tc_mode)
+{
+ if (num_tc) {
+ pol->tc_port_mask |= BIT(port);
+ pol->tc_mode = tc_mode;
+ } else {
+ pol->tc_port_mask &= ~BIT(port);
+ if (!pol->tc_port_mask)
+ pol->tc_mode = MV88E6XXX_TC_MODE_NONE;
+ }
+}
+
+static int mv88e6xxx_mqprio_netdev_set_tc(struct net_device *user,
+ const struct tc_mqprio_qopt *qopt,
+ int num_tc)
+{
+ int err, tc;
+
+ err = netdev_set_num_tc(user, num_tc);
+ if (err)
+ return err;
+
+ for (tc = 0; tc < num_tc; tc++) {
+ err = netdev_set_tc_queue(user, tc, qopt->count[tc],
+ qopt->offset[tc]);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+static int mv88e6xxx_setup_tc_mqprio(struct dsa_switch *ds, int port,
+ struct tc_mqprio_qopt_offload *mqprio)
+{
+ struct netlink_ext_ack *extack = mqprio->extack;
+ u8 ieee_pri_map[IEEE_8021Q_MAX_PRIORITIES];
+ struct mv88e6xxx_chip *chip = ds->priv;
+ struct mv88e6xxx_tc_policy *pol;
+ enum mv88e6xxx_tc_mode tc_mode;
+ struct net_device *user;
+ bool can_update_pol;
+ bool per_port_pol;
+ int num_tc, err;
+
+ if (!dsa_is_user_port(ds, port))
+ return -EINVAL;
+
+ num_tc = mv88e6xxx_validate_tc_mqprio(chip, mqprio, &tc_mode, ieee_pri_map);
+ if (num_tc < 0)
+ return num_tc;
+
+ user = dsa_to_port(ds, port)->user;
+
+ per_port_pol = (tc_mode == MV88E6XXX_TC_MODE_QPRI &&
+ chip->info->ops->port_ieee_pri_map);
+
+ mv88e6xxx_reg_lock(chip);
+
+ pol = &chip->tc_policy;
+
+ if (num_tc && pol->tc_mode && pol->tc_mode != tc_mode) {
+ NL_SET_ERR_MSG_MOD(extack, "all switch ports must use the same MQPRIO mode");
+ err = -EOPNOTSUPP;
+ goto err_unlock;
+ }
+
+ can_update_pol = per_port_pol ||
+ !pol->tc_port_mask || pol->tc_port_mask == BIT(port);
+ if (!can_update_pol && num_tc &&
+ !mv88e6xxx_tc_mode_map_equal(chip, tc_mode, ieee_pri_map)) {
+ NL_SET_ERR_MSG_MOD(extack, "only a single priority mapping supported per switch");
+ err = -EOPNOTSUPP;
+ goto err_unlock;
+ }
+
+ err = mv88e6xxx_mqprio_netdev_set_tc(user, &mqprio->qopt, num_tc);
+ if (err)
+ goto err_reset_tc;
+
+ if (can_update_pol) {
+ const u8 *map = num_tc ? ieee_pri_map : NULL;
+
+ if (per_port_pol)
+ err = mv88e6xxx_set_port_ieee_pri_map(chip, port, map);
+ else
+ err = mv88e6xxx_set_ieee_pri_map(chip, map);
+ if (err) {
+ NL_SET_ERR_MSG_FMT_MOD(extack, "p%d: failed to %s priority mapping",
+ port, num_tc ? "enable" : "disable");
+ goto err_reset_tc;
+ }
+
+ if (tc_mode == MV88E6XXX_TC_MODE_AVB) {
+ err = num_tc ? mv88e6xxx_avb_enable(chip, mqprio)
+ : mv88e6xxx_avb_disable(chip);
+ if (err) {
+ NL_SET_ERR_MSG_FMT_MOD(extack, "failed to %s AVB",
+ num_tc ? "enable" : "disable");
+ goto err_reset_pri_map;
+ }
+ }
+ }
+
+ mv88e6xxx_mqprio_update_policy(pol, port, num_tc, tc_mode);
+
+ if (num_tc && can_update_pol && !per_port_pol)
+ memcpy(pol->ieee_pri_map, ieee_pri_map, sizeof(ieee_pri_map));
+ else if (!pol->tc_port_mask)
+ memset(pol->ieee_pri_map, 0, sizeof(ieee_pri_map));
+
+ mv88e6xxx_reg_unlock(chip);
+
+ return 0;
+
+err_reset_pri_map:
+ mv88e6xxx_set_ieee_pri_map(chip, NULL);
+err_reset_tc:
+ netdev_reset_tc(user);
+err_unlock:
+ mv88e6xxx_reg_unlock(chip);
+
+ return err;
+}
+
static int mv88e6xxx_setup_tc_cbs(struct dsa_switch *ds, int port,
struct tc_cbs_qopt_offload *cbs)
{
@@ -7324,6 +7722,10 @@ static int mv88e6xxx_port_setup_tc(struct dsa_switch *ds, int port,
enum tc_setup_type type, void *type_data)
{
switch (type) {
+ case TC_QUERY_CAPS:
+ return mv88e6xxx_tc_query_caps(type_data);
+ case TC_SETUP_QDISC_MQPRIO:
+ return mv88e6xxx_setup_tc_mqprio(ds, port, type_data);
case TC_SETUP_QDISC_CBS:
return mv88e6xxx_setup_tc_cbs(ds, port, type_data);
default:
diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
index afcf88fd02a03..11f895ad99f3f 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.h
+++ b/drivers/net/dsa/mv88e6xxx/chip.h
@@ -8,6 +8,7 @@
#ifndef _MV88E6XXX_CHIP_H
#define _MV88E6XXX_CHIP_H
+#include <linux/dcbnl.h> /* for IEEE_8021Q_MAX_PRIORITIES */
#include <linux/idr.h>
#include <linux/if_vlan.h>
#include <linux/irq.h>
@@ -19,6 +20,7 @@
#include <linux/ptp_clock_kernel.h>
#include <linux/timecounter.h>
#include <net/dsa.h>
+#include <net/pkt_sched.h>
#define EDSA_HLEN 8
#define MV88E6XXX_N_FID 4096
@@ -252,6 +254,45 @@ struct mv88e6xxx_port_hwtstamp {
struct kernel_hwtstamp_config tstamp_config;
};
+/**
+ * enum mv88e6xxx_avb_tc - Traffic class values for AVB mode
+ * @MV88E6XXX_AVB_TC_LEGACY: Non-AVB traffic
+ * @MV88E6XXX_AVB_TC_LO: Low priority AVB (Class B)
+ * @MV88E6XXX_AVB_TC_HI: High priority AVB (Class A)
+ */
+enum mv88e6xxx_avb_tc {
+ MV88E6XXX_AVB_TC_LEGACY = 0,
+ MV88E6XXX_AVB_TC_LO = 1,
+ MV88E6XXX_AVB_TC_HI = 2,
+ MV88E6XXX_AVB_TC_MAX = MV88E6XXX_AVB_TC_HI,
+};
+
+/**
+ * enum mv88e6xxx_tc_mode - Current MQPRIO mode
+ * @MV88E6XXX_TC_MODE_NONE: No MQPRIO Qdisc configured
+ * @MV88E6XXX_TC_MODE_AVB: DCB Qdisc configured, TCs are AVB classes
+ * @MV88E6XXX_TC_MODE_QPRI: Channel Qdisc configured, TCs are QPris
+ */
+enum mv88e6xxx_tc_mode {
+ MV88E6XXX_TC_MODE_NONE = 0,
+ MV88E6XXX_TC_MODE_AVB,
+ MV88E6XXX_TC_MODE_QPRI,
+};
+
+struct mv88e6xxx_tc_policy {
+ /* Current MQPRIO mode */
+ enum mv88e6xxx_tc_mode tc_mode;
+
+ /* Ports with MQPRIO TC installed */
+ u16 tc_port_mask;
+
+ /* Ports in enhanced AVB mode (BR_FILTER_STREAM_RESERVED set) */
+ u16 avb_enhanced_port_mask;
+
+ /* FPri/QPri mapping */
+ u8 ieee_pri_map[IEEE_8021Q_MAX_PRIORITIES];
+};
+
enum mv88e6xxx_policy_mapping {
MV88E6XXX_POLICY_MAPPING_DA,
MV88E6XXX_POLICY_MAPPING_SA,
@@ -463,6 +504,9 @@ struct mv88e6xxx_chip {
/* TCAM entries */
struct mv88e6xxx_tcam tcam;
+ /* Global MQPRIO traffic class configuration */
+ struct mv88e6xxx_tc_policy tc_policy;
+
/* Global2 scratch register config data3 */
u8 g2_scratch_config3;
};
@@ -512,7 +556,10 @@ struct mv88e6xxx_ops {
*/
int (*setup_errata)(struct mv88e6xxx_chip *chip);
- int (*ieee_pri_map)(struct mv88e6xxx_chip *chip);
+ /* Setup IEEE FPri to QPri mapping. ieee_pri_map is NULL to reset,
+ * or an array of IEEE_8021Q_MAX_PRIORITIES frame priorities.
+ */
+ int (*ieee_pri_map)(struct mv88e6xxx_chip *chip, const u8 *ieee_pri_map);
int (*ip_pri_map)(struct mv88e6xxx_chip *chip);
/* Ingress Rate Limit unit (IRL) operations */
@@ -629,6 +676,9 @@ struct mv88e6xxx_ops {
phy_interface_t mode);
int (*port_get_cmode)(struct mv88e6xxx_chip *chip, int port, u8 *cmode);
+ int (*port_ieee_pri_map)(struct mv88e6xxx_chip *chip, int port,
+ const u8 *ieee_pri_map);
+
/* LED control */
int (*port_setup_leds)(struct mv88e6xxx_chip *chip, int port);
@@ -777,6 +827,20 @@ struct mv88e6xxx_avb_ops {
/* Access port-scoped 802.1Qav registers */
int (*port_qav_write)(struct mv88e6xxx_chip *chip, int port, int addr,
u16 data);
+
+ /* Access global Class Shaping and Pacing registers */
+ int (*qav_read)(struct mv88e6xxx_chip *chip, int addr, u16 *data,
+ int len);
+ int (*qav_write)(struct mv88e6xxx_chip *chip, int addr, u16 data);
+
+ /* Access port-scoped Audio Video Bridging registers */
+ int (*port_avb_read)(struct mv88e6xxx_chip *chip, int port, int addr,
+ u16 *data, int len);
+ int (*port_avb_write)(struct mv88e6xxx_chip *chip, int port, int addr,
+ u16 data);
+
+ /* Access global Audio Video Bridging registers */
+ int (*avb_write)(struct mv88e6xxx_chip *chip, int addr, u16 data);
};
struct mv88e6xxx_ptp_ops {
@@ -817,6 +881,8 @@ struct mv88e6xxx_qav_info {
u16 rate_mask; /* QPri Rate valid bits mask */
u16 hi_limit_mask; /* Qpri Hi Limit bits mask*/
u8 queue_mask; /* supported queues bitmask */
+ u8 avb_queue_mask[MV88E6XXX_AVB_TC_MAX + 1]; /* AVB supported queues bitmask */
+ u16 avb_pri_map; /* default AVB FPri to QPri map */
};
static inline bool mv88e6xxx_has_stu(struct mv88e6xxx_chip *chip)
diff --git a/drivers/net/dsa/mv88e6xxx/global1.c b/drivers/net/dsa/mv88e6xxx/global1.c
index 9820cd5967574..e9a3db0a7c2c6 100644
--- a/drivers/net/dsa/mv88e6xxx/global1.c
+++ b/drivers/net/dsa/mv88e6xxx/global1.c
@@ -356,16 +356,32 @@ int mv88e6085_g1_ip_pri_map(struct mv88e6xxx_chip *chip)
/* Offset 0x18: IEEE-PRI Register */
-int mv88e6085_g1_ieee_pri_map(struct mv88e6xxx_chip *chip)
+static int mv88e6xxx_g1_set_ieee_pri_map(struct mv88e6xxx_chip *chip,
+ const u8 *map)
{
- /* Reset the IEEE Tag priorities to defaults */
- return mv88e6xxx_g1_write(chip, MV88E6XXX_G1_IEEE_PRI, 0xfa41);
+ u16 val = 0;
+ u8 fpri;
+
+ for (fpri = 0; fpri < IEEE_8021Q_MAX_PRIORITIES; fpri++)
+ val |= (map[fpri] & 0x3) << (2 * fpri);
+
+ return mv88e6xxx_g1_write(chip, MV88E6XXX_G1_IEEE_PRI, val);
}
-int mv88e6250_g1_ieee_pri_map(struct mv88e6xxx_chip *chip)
+static const u8 mv88e6085_default_ieee_pri_map[] = { 1, 0, 0, 1, 2, 2, 3, 3 };
+
+int mv88e6085_g1_ieee_pri_map(struct mv88e6xxx_chip *chip, const u8 *map)
{
- /* Reset the IEEE Tag priorities to defaults */
- return mv88e6xxx_g1_write(chip, MV88E6XXX_G1_IEEE_PRI, 0xfa50);
+ return mv88e6xxx_g1_set_ieee_pri_map(chip,
+ map ? map : mv88e6085_default_ieee_pri_map);
+}
+
+static const u8 mv88e6250_default_ieee_pri_map[] = { 0, 0, 1, 1, 2, 2, 3, 3 };
+
+int mv88e6250_g1_ieee_pri_map(struct mv88e6xxx_chip *chip, const u8 *map)
+{
+ return mv88e6xxx_g1_set_ieee_pri_map(chip,
+ map ? map : mv88e6250_default_ieee_pri_map);
}
/* Offset 0x1a: Monitor Control */
diff --git a/drivers/net/dsa/mv88e6xxx/global1.h b/drivers/net/dsa/mv88e6xxx/global1.h
index 3dbb7a1b8fe11..9456ee6f65165 100644
--- a/drivers/net/dsa/mv88e6xxx/global1.h
+++ b/drivers/net/dsa/mv88e6xxx/global1.h
@@ -111,6 +111,7 @@
/* Offset 0x0A: ATU Control Register */
#define MV88E6XXX_G1_ATU_CTL 0x0a
+#define MV88E6XXX_G1_ATU_CTL_MAC_AVB 0x8000
#define MV88E6XXX_G1_ATU_CTL_LEARN2ALL 0x0008
#define MV88E6161_G1_ATU_CTL_HASH_MASK 0x0003
@@ -310,8 +311,8 @@ int mv88e6390_g1_mgmt_rsvd2cpu(struct mv88e6xxx_chip *chip);
int mv88e6085_g1_ip_pri_map(struct mv88e6xxx_chip *chip);
-int mv88e6085_g1_ieee_pri_map(struct mv88e6xxx_chip *chip);
-int mv88e6250_g1_ieee_pri_map(struct mv88e6xxx_chip *chip);
+int mv88e6085_g1_ieee_pri_map(struct mv88e6xxx_chip *chip, const u8 *map);
+int mv88e6250_g1_ieee_pri_map(struct mv88e6xxx_chip *chip, const u8 *map);
int mv88e6185_g1_set_cascade_port(struct mv88e6xxx_chip *chip, int port);
@@ -322,6 +323,7 @@ int mv88e6390_g1_rmu_disable(struct mv88e6xxx_chip *chip);
int mv88e6xxx_g1_set_device_number(struct mv88e6xxx_chip *chip, int index);
int mv88e6xxx_g1_atu_set_learn2all(struct mv88e6xxx_chip *chip, bool learn2all);
+int mv88e6xxx_g1_atu_set_mac_avb(struct mv88e6xxx_chip *chip, bool mac_avb);
int mv88e6xxx_g1_atu_set_age_time(struct mv88e6xxx_chip *chip,
unsigned int msecs);
int mv88e6xxx_g1_atu_getnext(struct mv88e6xxx_chip *chip, u16 fid,
diff --git a/drivers/net/dsa/mv88e6xxx/global1_atu.c b/drivers/net/dsa/mv88e6xxx/global1_atu.c
index c47f068f56b32..429a1ee44e47d 100644
--- a/drivers/net/dsa/mv88e6xxx/global1_atu.c
+++ b/drivers/net/dsa/mv88e6xxx/global1_atu.c
@@ -41,6 +41,23 @@ int mv88e6xxx_g1_atu_set_learn2all(struct mv88e6xxx_chip *chip, bool learn2all)
return mv88e6xxx_g1_write(chip, MV88E6XXX_G1_ATU_CTL, val);
}
+int mv88e6xxx_g1_atu_set_mac_avb(struct mv88e6xxx_chip *chip, bool mac_avb)
+{
+ u16 val;
+ int err;
+
+ err = mv88e6xxx_g1_read(chip, MV88E6XXX_G1_ATU_CTL, &val);
+ if (err)
+ return err;
+
+ if (mac_avb)
+ val |= MV88E6XXX_G1_ATU_CTL_MAC_AVB;
+ else
+ val &= ~MV88E6XXX_G1_ATU_CTL_MAC_AVB;
+
+ return mv88e6xxx_g1_write(chip, MV88E6XXX_G1_ATU_CTL, val);
+}
+
int mv88e6xxx_g1_atu_set_age_time(struct mv88e6xxx_chip *chip,
unsigned int msecs)
{
diff --git a/drivers/net/dsa/mv88e6xxx/global2.h b/drivers/net/dsa/mv88e6xxx/global2.h
index c2b9baf0a9371..bc7bc167a3c51 100644
--- a/drivers/net/dsa/mv88e6xxx/global2.h
+++ b/drivers/net/dsa/mv88e6xxx/global2.h
@@ -176,9 +176,11 @@
#define MV88E6352_G2_AVB_CMD_PORT_TAIGLOBAL 0xe
#define MV88E6165_G2_AVB_CMD_PORT_PTPGLOBAL 0xf
#define MV88E6352_G2_AVB_CMD_PORT_PTPGLOBAL 0xf
+#define MV88E6352_G2_AVB_CMD_PORT_AVBGLOBAL 0xf
#define MV88E6390_G2_AVB_CMD_PORT_MASK 0x1f00
#define MV88E6390_G2_AVB_CMD_PORT_TAIGLOBAL 0x1e
#define MV88E6390_G2_AVB_CMD_PORT_PTPGLOBAL 0x1f
+#define MV88E6390_G2_AVB_CMD_PORT_AVBGLOBAL 0x1f
#define MV88E6352_G2_AVB_CMD_BLOCK_PTP 0
#define MV88E6352_G2_AVB_CMD_BLOCK_AVB 1
#define MV88E6352_G2_AVB_CMD_BLOCK_QAV 2
diff --git a/drivers/net/dsa/mv88e6xxx/global2_avb.c b/drivers/net/dsa/mv88e6xxx/global2_avb.c
index 6b54e275d21ab..fe1607bc78734 100644
--- a/drivers/net/dsa/mv88e6xxx/global2_avb.c
+++ b/drivers/net/dsa/mv88e6xxx/global2_avb.c
@@ -119,6 +119,27 @@ static int mv88e6352_g2_avb_port_qav_write(struct mv88e6xxx_chip *chip,
return mv88e6xxx_g2_avb_write(chip, writeop, data);
}
+static int mv88e6352_g2_avb_port_avb_read(struct mv88e6xxx_chip *chip,
+ int port, int addr, u16 *data,
+ int len)
+{
+ u16 readop = (len == 1 ? MV88E6352_G2_AVB_CMD_OP_READ :
+ MV88E6352_G2_AVB_CMD_OP_READ_INCR) |
+ (port << 8) | (MV88E6352_G2_AVB_CMD_BLOCK_AVB << 5) |
+ addr;
+
+ return mv88e6xxx_g2_avb_read(chip, readop, data, len);
+}
+
+static int mv88e6352_g2_avb_port_avb_write(struct mv88e6xxx_chip *chip,
+ int port, int addr, u16 data)
+{
+ u16 writeop = MV88E6352_G2_AVB_CMD_OP_WRITE | (port << 8) |
+ (MV88E6352_G2_AVB_CMD_BLOCK_AVB << 5) | addr;
+
+ return mv88e6xxx_g2_avb_write(chip, writeop, data);
+}
+
static int mv88e6352_g2_avb_ptp_read(struct mv88e6xxx_chip *chip, int addr,
u16 *data, int len)
{
@@ -151,6 +172,38 @@ static int mv88e6352_g2_avb_tai_write(struct mv88e6xxx_chip *chip, int addr,
addr, data);
}
+static int mv88e6352_g2_avb_qav_read(struct mv88e6xxx_chip *chip, int addr,
+ u16 *data, int len)
+{
+ u16 readop = (len == 1 ? MV88E6352_G2_AVB_CMD_OP_READ :
+ MV88E6352_G2_AVB_CMD_OP_READ_INCR) |
+ (MV88E6352_G2_AVB_CMD_PORT_AVBGLOBAL << 8) |
+ (MV88E6352_G2_AVB_CMD_BLOCK_QAV << 5) |
+ addr;
+
+ return mv88e6xxx_g2_avb_read(chip, readop, data, len);
+}
+
+static int mv88e6352_g2_avb_qav_write(struct mv88e6xxx_chip *chip, int addr,
+ u16 data)
+{
+ u16 writeop = MV88E6352_G2_AVB_CMD_OP_WRITE |
+ (MV88E6352_G2_AVB_CMD_PORT_AVBGLOBAL << 8) |
+ (MV88E6352_G2_AVB_CMD_BLOCK_QAV << 5) | addr;
+
+ return mv88e6xxx_g2_avb_write(chip, writeop, data);
+}
+
+static int mv88e6352_g2_avb_avb_write(struct mv88e6xxx_chip *chip, int addr,
+ u16 data)
+{
+ u16 writeop = MV88E6352_G2_AVB_CMD_OP_WRITE |
+ (MV88E6352_G2_AVB_CMD_PORT_AVBGLOBAL << 8) |
+ (MV88E6352_G2_AVB_CMD_BLOCK_AVB << 5) | addr;
+
+ return mv88e6xxx_g2_avb_write(chip, writeop, data);
+}
+
const struct mv88e6xxx_avb_ops mv88e6352_avb_ops = {
.port_ptp_read = mv88e6352_g2_avb_port_ptp_read,
.port_ptp_write = mv88e6352_g2_avb_port_ptp_write,
@@ -159,6 +212,11 @@ const struct mv88e6xxx_avb_ops mv88e6352_avb_ops = {
.tai_read = mv88e6352_g2_avb_tai_read,
.tai_write = mv88e6352_g2_avb_tai_write,
.port_qav_write = mv88e6352_g2_avb_port_qav_write,
+ .qav_read = mv88e6352_g2_avb_qav_read,
+ .qav_write = mv88e6352_g2_avb_qav_write,
+ .port_avb_read = mv88e6352_g2_avb_port_avb_read,
+ .port_avb_write = mv88e6352_g2_avb_port_avb_write,
+ .avb_write = mv88e6352_g2_avb_avb_write,
};
static int mv88e6165_g2_avb_tai_read(struct mv88e6xxx_chip *chip, int addr,
@@ -185,6 +243,11 @@ const struct mv88e6xxx_avb_ops mv88e6165_avb_ops = {
.tai_read = mv88e6165_g2_avb_tai_read,
.tai_write = mv88e6165_g2_avb_tai_write,
.port_qav_write = mv88e6352_g2_avb_port_qav_write,
+ .qav_read = mv88e6352_g2_avb_qav_read,
+ .qav_write = mv88e6352_g2_avb_qav_write,
+ .port_avb_read = mv88e6352_g2_avb_port_avb_read,
+ .port_avb_write = mv88e6352_g2_avb_port_avb_write,
+ .avb_write = mv88e6352_g2_avb_avb_write,
};
static int mv88e6390_g2_avb_port_ptp_read(struct mv88e6xxx_chip *chip,
@@ -217,6 +280,27 @@ static int mv88e6390_g2_avb_port_qav_write(struct mv88e6xxx_chip *chip,
return mv88e6xxx_g2_avb_write(chip, writeop, data);
}
+static int mv88e6390_g2_avb_port_avb_read(struct mv88e6xxx_chip *chip,
+ int port, int addr, u16 *data,
+ int len)
+{
+ u16 readop = (len == 1 ? MV88E6390_G2_AVB_CMD_OP_READ :
+ MV88E6390_G2_AVB_CMD_OP_READ_INCR) |
+ (port << 8) | (MV88E6352_G2_AVB_CMD_BLOCK_AVB << 5) |
+ addr;
+
+ return mv88e6xxx_g2_avb_read(chip, readop, data, len);
+}
+
+static int mv88e6390_g2_avb_port_avb_write(struct mv88e6xxx_chip *chip,
+ int port, int addr, u16 data)
+{
+ u16 writeop = MV88E6390_G2_AVB_CMD_OP_WRITE | (port << 8) |
+ (MV88E6352_G2_AVB_CMD_BLOCK_AVB << 5) | addr;
+
+ return mv88e6xxx_g2_avb_write(chip, writeop, data);
+}
+
static int mv88e6390_g2_avb_ptp_read(struct mv88e6xxx_chip *chip, int addr,
u16 *data, int len)
{
@@ -249,6 +333,38 @@ static int mv88e6390_g2_avb_tai_write(struct mv88e6xxx_chip *chip, int addr,
addr, data);
}
+static int mv88e6390_g2_avb_qav_read(struct mv88e6xxx_chip *chip, int addr,
+ u16 *data, int len)
+{
+ u16 readop = (len == 1 ? MV88E6390_G2_AVB_CMD_OP_READ :
+ MV88E6390_G2_AVB_CMD_OP_READ_INCR) |
+ (MV88E6390_G2_AVB_CMD_PORT_AVBGLOBAL << 8) |
+ (MV88E6352_G2_AVB_CMD_BLOCK_QAV << 5) |
+ addr;
+
+ return mv88e6xxx_g2_avb_read(chip, readop, data, len);
+}
+
+static int mv88e6390_g2_avb_qav_write(struct mv88e6xxx_chip *chip, int addr,
+ u16 data)
+{
+ u16 writeop = MV88E6390_G2_AVB_CMD_OP_WRITE |
+ (MV88E6390_G2_AVB_CMD_PORT_AVBGLOBAL << 8) |
+ (MV88E6352_G2_AVB_CMD_BLOCK_QAV << 5) | addr;
+
+ return mv88e6xxx_g2_avb_write(chip, writeop, data);
+}
+
+static int mv88e6390_g2_avb_avb_write(struct mv88e6xxx_chip *chip, int addr,
+ u16 data)
+{
+ u16 writeop = MV88E6390_G2_AVB_CMD_OP_WRITE |
+ (MV88E6390_G2_AVB_CMD_PORT_AVBGLOBAL << 8) |
+ (MV88E6352_G2_AVB_CMD_BLOCK_AVB << 5) | addr;
+
+ return mv88e6xxx_g2_avb_write(chip, writeop, data);
+}
+
const struct mv88e6xxx_avb_ops mv88e6390_avb_ops = {
.port_ptp_read = mv88e6390_g2_avb_port_ptp_read,
.port_ptp_write = mv88e6390_g2_avb_port_ptp_write,
@@ -257,4 +373,9 @@ const struct mv88e6xxx_avb_ops mv88e6390_avb_ops = {
.tai_read = mv88e6390_g2_avb_tai_read,
.tai_write = mv88e6390_g2_avb_tai_write,
.port_qav_write = mv88e6390_g2_avb_port_qav_write,
+ .qav_read = mv88e6390_g2_avb_qav_read,
+ .qav_write = mv88e6390_g2_avb_qav_write,
+ .port_avb_read = mv88e6390_g2_avb_port_avb_read,
+ .port_avb_write = mv88e6390_g2_avb_port_avb_write,
+ .avb_write = mv88e6390_g2_avb_avb_write,
};
diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c
index 758b8d41f4853..e2defc817fed3 100644
--- a/drivers/net/dsa/mv88e6xxx/port.c
+++ b/drivers/net/dsa/mv88e6xxx/port.c
@@ -1651,6 +1651,24 @@ int mv88e6390_port_tag_remap(struct mv88e6xxx_chip *chip, int port)
return 0;
}
+int mv88e6390_port_ieee_pri_map(struct mv88e6xxx_chip *chip, int port, const u8 *map)
+{
+ u8 fpri, qpri;
+ int err;
+
+ for (fpri = 0; fpri < IEEE_8021Q_MAX_PRIORITIES; fpri++) {
+ qpri = map ? (map[fpri] & 0x7) : fpri;
+
+ err = mv88e6xxx_port_ieeepmt_write(chip, port,
+ MV88E6390_PORT_IEEE_PRIO_MAP_TABLE_INGRESS_PCP,
+ fpri, (fpri | qpri << 4));
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
/* Offset 0x0E: Policy Control Register */
static int
diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h
index a380f356eb83d..5d46fbea16f05 100644
--- a/drivers/net/dsa/mv88e6xxx/port.h
+++ b/drivers/net/dsa/mv88e6xxx/port.h
@@ -535,6 +535,8 @@ int mv88e6xxx_port_set_8021q_mode(struct mv88e6xxx_chip *chip, int port,
u16 mode);
int mv88e6095_port_tag_remap(struct mv88e6xxx_chip *chip, int port);
int mv88e6390_port_tag_remap(struct mv88e6xxx_chip *chip, int port);
+int mv88e6390_port_ieee_pri_map(struct mv88e6xxx_chip *chip, int port,
+ const u8 *map);
int mv88e6xxx_port_set_egress_mode(struct mv88e6xxx_chip *chip, int port,
enum mv88e6xxx_egress_mode mode);
int mv88e6085_port_set_frame_mode(struct mv88e6xxx_chip *chip, int port,
--
2.43.0
* [PATCH net-next v2 6/6] net: dsa: mv88e6xxx: honour MDB_FLAGS_STREAM_RESERVED for AVB streams
2026-06-02 0:43 [PATCH net-next v2 0/6] net: dsa: mv8ee6xxx: MQPRIO and 802.1Qat support Luke Howard
` (4 preceding siblings ...)
2026-06-02 0:43 ` [PATCH net-next v2 5/6] net: dsa: mv88e6xxx: MQPRIO support Luke Howard
@ 2026-06-02 0:43 ` Luke Howard
5 siblings, 0 replies; 30+ messages in thread
From: Luke Howard @ 2026-06-02 0:43 UTC (permalink / raw)
To: Jiri Pirko, Ivan Vecera, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Nikolay Aleksandrov,
Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan, Andrew Lunn,
Vladimir Oltean
Cc: netdev, linux-kernel, bridge, linux-kselftest, Max Hunter,
Kieran Tyrrell, Luke Howard
Map the MDB_FLAGS_STREAM_RESERVED flag to MC_STATIC_AVB_NRL when adding
MDB entries to, or removing them from, the ATU. The presence or absence of
this flag must match any existing ATU entry for the destination address, as
the ATU entry type applies to all destination ports.
A port in enhanced mode, requested by BR_FILTER_STREAM_RESERVED, will only
admit frames with priorities associated with AVB traffic classes for these
destinations.
Note: MC_STATIC_AVB_NRL entries persist independently of the MQPRIO mode.
If port ingress rate limiting (PIRL) support is ever added, this should
be revisited.
Signed-off-by: Luke Howard <lukeh@padl.com>
---
drivers/net/dsa/mv88e6xxx/chip.c | 101 ++++++++++++++++++++++++++++++---------
1 file changed, 78 insertions(+), 23 deletions(-)
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index db79302c2b84d..ef3cb1cca134d 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -2269,6 +2269,35 @@ static bool mv88e6xxx_port_db_find(struct mv88e6xxx_chip *chip,
return entry.state && ether_addr_equal(entry.mac, addr);
}
+static int mv88e6xxx_port_db_loadpurge_entry(struct mv88e6xxx_chip *chip,
+ int port, u16 fid,
+ const unsigned char *addr,
+ struct mv88e6xxx_atu_entry *entry,
+ u8 state)
+{
+ /* Initialize a fresh ATU entry if it isn't found */
+ if (!entry->state || !ether_addr_equal(entry->mac, addr)) {
+ memset(entry, 0, sizeof(*entry));
+ ether_addr_copy(entry->mac, addr);
+ }
+
+ /* Purge the ATU entry only if no port is using it anymore */
+ if (!state) {
+ entry->portvec &= ~BIT(port);
+ if (!entry->portvec)
+ entry->state = 0;
+ } else {
+ if (state == MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC)
+ entry->portvec = BIT(port);
+ else
+ entry->portvec |= BIT(port);
+
+ entry->state = state;
+ }
+
+ return mv88e6xxx_g1_atu_loadpurge(chip, fid, entry);
+}
+
static int mv88e6xxx_port_db_load_purge(struct mv88e6xxx_chip *chip, int port,
const unsigned char *addr, u16 vid,
u8 state)
@@ -2281,27 +2310,8 @@ static int mv88e6xxx_port_db_load_purge(struct mv88e6xxx_chip *chip, int port,
if (err)
return err;
- /* Initialize a fresh ATU entry if it isn't found */
- if (!entry.state || !ether_addr_equal(entry.mac, addr)) {
- memset(&entry, 0, sizeof(entry));
- ether_addr_copy(entry.mac, addr);
- }
-
- /* Purge the ATU entry only if no port is using it anymore */
- if (!state) {
- entry.portvec &= ~BIT(port);
- if (!entry.portvec)
- entry.state = 0;
- } else {
- if (state == MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC)
- entry.portvec = BIT(port);
- else
- entry.portvec |= BIT(port);
-
- entry.state = state;
- }
-
- return mv88e6xxx_g1_atu_loadpurge(chip, fid, &entry);
+ return mv88e6xxx_port_db_loadpurge_entry(chip, port, fid, addr, &entry,
+ state);
}
static int mv88e6xxx_policy_apply(struct mv88e6xxx_chip *chip, int port,
@@ -6781,16 +6791,61 @@ static int mv88e6xxx_change_tag_protocol(struct dsa_switch *ds,
return err;
}
+static bool mv88e6xxx_atu_mc_entry_changed(const struct mv88e6xxx_atu_entry *existing,
+ int port,
+ const struct switchdev_obj_port_mdb *mdb,
+ u8 state)
+{
+ if (!ether_addr_equal(existing->mac, mdb->addr))
+ return false;
+
+ if (existing->state != MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC &&
+ existing->state != MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC_AVB_NRL)
+ return false;
+
+ if (existing->state == state)
+ return false;
+
+ if (!(existing->portvec & ~BIT(port)))
+ return false;
+
+ return true;
+}
+
static int mv88e6xxx_port_mdb_add(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_mdb *mdb,
struct dsa_db db)
{
struct mv88e6xxx_chip *chip = ds->priv;
+ struct mv88e6xxx_atu_entry existing;
+ u8 state;
+ u16 fid;
int err;
mv88e6xxx_reg_lock(chip);
- err = mv88e6xxx_port_db_load_purge(chip, port, mdb->addr, mdb->vid,
- MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC);
+
+ /* Note that AVB_NRL entries persist independently of the MQPRIO mode;
+ * as ingress rate control is not used by this driver, this is safe.
+ */
+ if ((mdb->flags & SWITCHDEV_MDB_F_STREAM_RESERVED) && chip->info->qav)
+ state = MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC_AVB_NRL;
+ else
+ state = MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC;
+
+ err = mv88e6xxx_port_db_get(chip, mdb->addr, mdb->vid, &fid, &existing);
+ if (err)
+ goto out;
+
+ if (mv88e6xxx_atu_mc_entry_changed(&existing, port, mdb, state)) {
+ dev_info_ratelimited(chip->dev,
+ "p%d: cannot offload MDB %pM: stream-reserved flag conflicts with existing entry\n",
+ port, mdb->addr);
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = mv88e6xxx_port_db_loadpurge_entry(chip, port, fid, mdb->addr,
+ &existing, state);
if (err)
goto out;
--
2.43.0
* Re: [PATCH net-next v2 3/6] net: bridge: add 802.1Qat stream reservation admission control
2026-06-02 0:43 ` [PATCH net-next v2 3/6] net: bridge: add 802.1Qat stream reservation admission control Luke Howard
@ 2026-06-02 1:28 ` Luke Howard
2026-06-03 7:35 ` Nikolay Aleksandrov
1 sibling, 0 replies; 30+ messages in thread
From: Luke Howard @ 2026-06-02 1:28 UTC (permalink / raw)
To: Jiri Pirko, Ivan Vecera, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Nikolay Aleksandrov,
Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan, Andrew Lunn,
Vladimir Oltean
Cc: netdev, linux-kernel, bridge, linux-kselftest, Max Hunter,
Kieran Tyrrell
> On 2 Jun 2026, at 10:43 am, Luke Howard <lukeh@padl.com> wrote:
>
> + if (mdst->flags & BRIDGE_MDBE_F_HOST_STREAM_RESERVED)
> + return false;
This flag should not have appeared until patch 4/6 (allow MDB_FLAGS_STREAM_RESERVED on host groups).
Will be fixed in the next revision.
* Re: [PATCH net-next v2 5/6] net: dsa: mv88e6xxx: MQPRIO support
2026-06-02 0:43 ` [PATCH net-next v2 5/6] net: dsa: mv88e6xxx: MQPRIO support Luke Howard
@ 2026-06-02 12:00 ` Cedric Jehasse
2026-06-02 21:12 ` Luke Howard
[not found] ` <808529B1-E40A-4E54-A654-86F1B6D1FA66@padl.com>
0 siblings, 2 replies; 30+ messages in thread
From: Cedric Jehasse @ 2026-06-02 12:00 UTC (permalink / raw)
To: Luke Howard
Cc: Jiri Pirko, Ivan Vecera, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Nikolay Aleksandrov,
Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan, Andrew Lunn,
Vladimir Oltean, netdev, linux-kernel, bridge, linux-kselftest,
Max Hunter, Kieran Tyrrell
On Tue, Jun 02, 2026 at 10:43:50AM +1000, Luke Howard wrote:
> +static int mv88e6xxx_setup_tc_mqprio(struct dsa_switch *ds, int port,
> + struct tc_mqprio_qopt_offload *mqprio)
> +{
> + struct netlink_ext_ack *extack = mqprio->extack;
> + u8 ieee_pri_map[IEEE_8021Q_MAX_PRIORITIES];
> + struct mv88e6xxx_chip *chip = ds->priv;
> + struct mv88e6xxx_tc_policy *pol;
> + enum mv88e6xxx_tc_mode tc_mode;
> + struct net_device *user;
> + bool can_update_pol;
> + bool per_port_pol;
> + int num_tc, err;
> +
> + if (!dsa_is_user_port(ds, port))
> + return -EINVAL;
> +
> + num_tc = mv88e6xxx_validate_tc_mqprio(chip, mqprio, &tc_mode, ieee_pri_map);
> + if (num_tc < 0)
> + return num_tc;
> +
> + user = dsa_to_port(ds, port)->user;
> +
> + per_port_pol = (tc_mode == MV88E6XXX_TC_MODE_QPRI &&
> + chip->info->ops->port_ieee_pri_map);
> +
> + mv88e6xxx_reg_lock(chip);
> +
> + pol = &chip->tc_policy;
> +
> + if (num_tc && pol->tc_mode && pol->tc_mode != tc_mode) {
> + NL_SET_ERR_MSG_MOD(extack, "all switch ports must use the same MQPRIO mode");
> + err = -EOPNOTSUPP;
> + goto err_unlock;
> + }
> +
> + can_update_pol = per_port_pol ||
> + !pol->tc_port_mask || pol->tc_port_mask == BIT(port);
> + if (!can_update_pol && num_tc &&
> + !mv88e6xxx_tc_mode_map_equal(chip, tc_mode, ieee_pri_map)) {
> + NL_SET_ERR_MSG_MOD(extack, "only a single priority mapping supported per switch");
> + err = -EOPNOTSUPP;
> + goto err_unlock;
> + }
> +
> + err = mv88e6xxx_mqprio_netdev_set_tc(user, &mqprio->qopt, num_tc);
> + if (err)
> + goto err_reset_tc;
> +
> + if (can_update_pol) {
> + const u8 *map = num_tc ? ieee_pri_map : NULL;
> +
> + if (per_port_pol)
> + err = mv88e6xxx_set_port_ieee_pri_map(chip, port, map);
> + else
> + err = mv88e6xxx_set_ieee_pri_map(chip, map);
In the case of per-port priority mapping, I don't know if this is working as
expected, as the IEEE priority mapping is done at ingress.
E.g. I think if MQPRIO channel mode is used to configure a PCP-to-queue mapping
on port 1 and a different mapping on port 2, traffic received on port 1 that
gets forwarded to port 2 and egresses port 2 will end up in the queue
configured by the mapping on port 1. As mqprio is an egress qdisc, I don't
think that's expected.
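A minimal sketch of the concern above, written as a standalone user-space model rather than driver code: each port carries its own PCP-to-queue table, but the queue is chosen when the frame is received, so the ingress port's table decides the egress queue. The names (struct port_model, classify_at_ingress) and the table contents are hypothetical and only illustrate the ordering problem.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PCP 8

/* Hypothetical model of a port with its own PCP-to-queue table. */
struct port_model {
	uint8_t pcp_to_queue[NUM_PCP];
};

/*
 * The queue is selected when the frame is received, so only the
 * ingress port's table is consulted. The egress port's table is
 * deliberately unused, which is the behaviour being questioned.
 */
static uint8_t classify_at_ingress(const struct port_model *ingress,
				   const struct port_model *egress,
				   uint8_t pcp)
{
	(void)egress;
	assert(pcp < NUM_PCP);

	return ingress->pcp_to_queue[pcp];
}

/* Port 1 maps PCP 5 to queue 3; port 2 is configured to map it to queue 1. */
static const struct port_model port1 = {
	.pcp_to_queue = { 0, 0, 1, 1, 2, 3, 3, 3 },
};

static const struct port_model port2 = {
	.pcp_to_queue = { 0, 0, 0, 0, 1, 1, 2, 2 },
};
```

With these tables, classify_at_ingress(&port1, &port2, 5) yields queue 3, not the queue 1 that the table on port 2 asks for.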
I have a patch that hasn't been submitted to the mailing list yet which
implements support for the dcb app pcp-prio command. This is also done by
configuring the IEEE priority mapping table.
Cedric
* Re: [PATCH net-next v2 5/6] net: dsa: mv88e6xxx: MQPRIO support
2026-06-02 12:00 ` Cedric Jehasse
@ 2026-06-02 21:12 ` Luke Howard
2026-06-02 23:48 ` Luke Howard
2026-06-03 2:09 ` Luke Howard
[not found] ` <808529B1-E40A-4E54-A654-86F1B6D1FA66@padl.com>
1 sibling, 2 replies; 30+ messages in thread
From: Luke Howard @ 2026-06-02 21:12 UTC (permalink / raw)
To: Cedric Jehasse
Cc: Jiri Pirko, Ivan Vecera, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Nikolay Aleksandrov,
Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan, Andrew Lunn,
Vladimir Oltean, netdev, linux-kernel, bridge, linux-kselftest,
Max Hunter, Kieran Tyrrell
> In case of per port priority mapping, i don't know if this is working as
> expected, as the IEEE priority mapping is done at ingress.
> Eg. i think if MQPRIO channel mode is used to configure a pcp to queue mapping
> on port 1 and a different mapping on port 2. Traffic received on port 1 that
> gets forwarded to port 2 and egresses port 2 will end up in the queue
> configured by the mapping on port 1. As mqprio is an egress qdisc, i don't
> think that's expected.
Good point. It’s interesting that there is egress mapping on the 6390, but only for FPri to DSCP. The Frame Priority Table set ordinal one would expect for egress QPri mapping is “reserved for future use”. (It would be nice if it were an undocumented feature.)
I will remove the per-port mapping for the 6390 so all ports share the same mapping as they do on the 6352.
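For reference, a standalone sketch of how a shared mapping is encoded in the global IEEE-PRI register, mirroring mv88e6xxx_g1_set_ieee_pri_map() from patch 5/6: two bits of queue priority per frame priority, with FPri 0 in the lowest bits. This assumes the 6352 is among the chips using that global register; the helper name pack_ieee_pri_map and the NUM_PRIORITIES macro are hypothetical, and the code is user-space C rather than driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_PRIORITIES 8	/* stands in for IEEE_8021Q_MAX_PRIORITIES */

/* Pack one 2-bit queue priority per frame priority, FPri 0 in bits 1:0. */
static uint16_t pack_ieee_pri_map(const uint8_t *map)
{
	uint16_t val = 0;
	unsigned int fpri;

	assert(map != NULL);

	for (fpri = 0; fpri < NUM_PRIORITIES; fpri++)
		val |= (uint16_t)((map[fpri] & 0x3) << (2 * fpri));

	return val;
}

/* Default tables copied from patch 5/6. */
static const uint8_t default_6085_map[NUM_PRIORITIES] = { 1, 0, 0, 1, 2, 2, 3, 3 };
static const uint8_t default_6250_map[NUM_PRIORITIES] = { 0, 0, 1, 1, 2, 2, 3, 3 };
```

Packing the two default tables this way reproduces the 0xfa41 and 0xfa50 constants that the patch replaces.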
> I have a patch that hasn't been submitted to the mailing list yet which
> implements support for the dcb app pcp-prio command. This is also done by
> configuring the IEEE priority mapping table.
Definitely happy to take a look if and when you submit, but I do plan to continue with this patch series (originally proposed as an RFC in September 2025).
Cheers,
Luke
* Re: [PATCH net-next v2 5/6] net: dsa: mv88e6xxx: MQPRIO support
2026-06-02 21:12 ` Luke Howard
@ 2026-06-02 23:48 ` Luke Howard
2026-06-02 23:55 ` Andrew Lunn
2026-06-03 2:09 ` Luke Howard
1 sibling, 1 reply; 30+ messages in thread
From: Luke Howard @ 2026-06-02 23:48 UTC (permalink / raw)
To: Cedric Jehasse
Cc: Jiri Pirko, Ivan Vecera, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Nikolay Aleksandrov,
Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan, Andrew Lunn,
Vladimir Oltean, netdev, linux-kernel, bridge, linux-kselftest,
Max Hunter, Kieran Tyrrell
> On 3 Jun 2026, at 7:12 am, Luke Howard <lukeh@padl.com> wrote:
>
>> In case of per port priority mapping, i don't know if this is working as
>> expected, as the IEEE priority mapping is done at ingress.
>> Eg. i think if MQPRIO channel mode is used to configure a pcp to queue mapping
>> on port 1 and a different mapping on port 2. Traffic received on port 1 that
>> gets forwarded to port 2 and egresses port 2 will end up in the queue
>> configured by the mapping on port 1. As mqprio is an egress qdisc, i don't
>> think that's expected.
>
> Good point, it’s interesting there is egress mapping on the 6390 but only for FPri to DSCP. The Frame Priority Table set ordinal one would expect for egress QPri mapping is “reserved for future use”. (Be nice if it were an undocumented feature.)
>
> I will remove the per-port mapping for the 6390 so all ports share the same mapping as they do on the 6352.
There is actually a slight impedance mismatch here which I hadn’t previously considered.
MQPRIO is a per-port Qdisc, but the FPri/QPri mappings on the switches are global (even, as you point out, on the 6390 as far as egress queues are concerned). Whilst we do validate that any MQPRIO-configured port has the same mapping, we can’t do anything about ports on which MQPRIO has not been configured (otherwise we would never be able to configure the first port). These ports implicitly inherit the per-switch mapping.
I think this is acceptable because, if one has not configured MQPRIO, one should have no expectation about which egress queue traffic ends up in. But there’s an alternative, more invasive, solution where the MQPRIO configuration is attached to the bridge itself, and is copied to (but not writable on) the user ports. This would require dsa_switch_ops changes.
Luke
* Re: [PATCH net-next v2 5/6] net: dsa: mv88e6xxx: MQPRIO support
2026-06-02 23:48 ` Luke Howard
@ 2026-06-02 23:55 ` Andrew Lunn
2026-06-03 0:15 ` Luke Howard
0 siblings, 1 reply; 30+ messages in thread
From: Andrew Lunn @ 2026-06-02 23:55 UTC (permalink / raw)
To: Luke Howard
Cc: Cedric Jehasse, Jiri Pirko, Ivan Vecera, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Nikolay Aleksandrov, Ido Schimmel, Andrew Lunn, David Ahern,
Shuah Khan, Vladimir Oltean, netdev, linux-kernel, bridge,
linux-kselftest, Max Hunter, Kieran Tyrrell
> But there’s an alternative, more invasive, solution where the MQPRIO
> configuration is attached to the bridge itself.
What about ports which are not attached to a bridge? They are just
standalone, have an IP address of their own, etc.
Andrew
* Re: [PATCH net-next v2 5/6] net: dsa: mv88e6xxx: MQPRIO support
2026-06-02 23:55 ` Andrew Lunn
@ 2026-06-03 0:15 ` Luke Howard
2026-06-03 1:40 ` Luke Howard
0 siblings, 1 reply; 30+ messages in thread
From: Luke Howard @ 2026-06-03 0:15 UTC (permalink / raw)
To: Andrew Lunn
Cc: Cedric Jehasse, Jiri Pirko, Ivan Vecera, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Nikolay Aleksandrov, Ido Schimmel, Andrew Lunn, David Ahern,
Shuah Khan, Vladimir Oltean, netdev, linux-kernel, bridge,
linux-kselftest, Max Hunter, Kieran Tyrrell
> On 3 Jun 2026, at 9:55 am, Andrew Lunn <andrew@lunn.ch> wrote:
>
>> But there’s an alternative, more invasive, solution where the MQPRIO
>> configuration is attached to the bridge itself.
>
> What about ports which are not attached to a bridge? They are just
> standalone, have an IP address of their own, etc.
That is a good point. So, barring a switch feature I’ve not yet found, I think the only options are either to drop MQPRIO offload, or to accept that ports with no MQPRIO mapping inherit the per-switch mapping. Arguably that’s the case today anyway (each chip has its own default frame priority to queue mapping), so the user should have no expectation of queue assignment on a port that hasn’t been configured.
Luke
* Re: [PATCH net-next v2 5/6] net: dsa: mv88e6xxx: MQPRIO support
2026-06-03 0:15 ` Luke Howard
@ 2026-06-03 1:40 ` Luke Howard
2026-06-03 2:41 ` Andrew Lunn
0 siblings, 1 reply; 30+ messages in thread
From: Luke Howard @ 2026-06-03 1:40 UTC (permalink / raw)
To: Andrew Lunn
Cc: Cedric Jehasse, Jiri Pirko, Ivan Vecera, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Nikolay Aleksandrov, Ido Schimmel, Andrew Lunn, David Ahern,
Shuah Khan, Vladimir Oltean, netdev, linux-kernel, bridge,
linux-kselftest, Max Hunter, Kieran Tyrrell
> On 3 Jun 2026, at 10:15 am, Luke Howard <lukeh@padl.com> wrote:
>
>> On 3 Jun 2026, at 9:55 am, Andrew Lunn <andrew@lunn.ch> wrote:
>>
>>> But there’s an alternative, more invasive, solution where the MQPRIO
>>> configuration is attached to the bridge itself.
>>
>> What about ports which are not attached to a bridge? They are just
>> standalone, have an IP address of their own, etc.
>
> That is a good point. So, barring a switch feature I’ve not yet found, I think the only options are either to drop MQPRIO offload, or to accept that ports with no MQPRIO mapping inherit the per-switch mapping. Arguably that’s the case today anyway (each chip has its own default frame priority to queue mapping), so the user should have no expectation of queue assignment on a port that hasn’t been configured.
We could add (e.g.) bridge_setup_tc to dsa_switch_ops, which (in the mv88e6xxx implementation) could validate the bridge contained all user ports. But it would not be possible to block a port from leaving as port_bridge_leave cannot return an error, and tearing MQPRIO config down silently would be a different sort of bad.
* Re: [PATCH net-next v2 5/6] net: dsa: mv88e6xxx: MQPRIO support
2026-06-02 21:12 ` Luke Howard
2026-06-02 23:48 ` Luke Howard
@ 2026-06-03 2:09 ` Luke Howard
2026-06-03 3:30 ` Luke Howard
1 sibling, 1 reply; 30+ messages in thread
From: Luke Howard @ 2026-06-03 2:09 UTC (permalink / raw)
To: Cedric Jehasse
Cc: Jiri Pirko, Ivan Vecera, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Nikolay Aleksandrov,
Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan, Andrew Lunn,
Vladimir Oltean, netdev, linux-kernel, bridge, linux-kselftest,
Max Hunter, Kieran Tyrrell
> On 3 Jun 2026, at 7:12 am, Luke Howard <lukeh@padl.com> wrote:
>
>> In case of per port priority mapping, i don't know if this is working as
>> expected, as the IEEE priority mapping is done at ingress.
>> Eg. i think if MQPRIO channel mode is used to configure a pcp to queue mapping
>> on port 1 and a different mapping on port 2. Traffic received on port 1 that
>> gets forwarded to port 2 and egresses port 2 will end up in the queue
>> configured by the mapping on port 1. As mqprio is an egress qdisc, i don't
>> think that's expected.
>
> Good point, it’s interesting there is egress mapping on the 6390 but only for FPri to DSCP. The Frame Priority Table set ordinal one would expect for egress QPri mapping is “reserved for future use”. (Be nice if it were an undocumented feature.)
I misread the data sheet. Egress PCP to FPRI mapping is possible per-port on the 6390 family. This doesn’t resolve the other issue (MQPRIO-less ports on the 6352 family). But it is good news and I will revise accordingly in the next patch revision.
* Re: [PATCH net-next v2 5/6] net: dsa: mv88e6xxx: MQPRIO support
2026-06-03 1:40 ` Luke Howard
@ 2026-06-03 2:41 ` Andrew Lunn
2026-06-03 3:29 ` Luke Howard
0 siblings, 1 reply; 30+ messages in thread
From: Andrew Lunn @ 2026-06-03 2:41 UTC (permalink / raw)
To: Luke Howard
Cc: Cedric Jehasse, Jiri Pirko, Ivan Vecera, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Nikolay Aleksandrov, Ido Schimmel, Andrew Lunn, David Ahern,
Shuah Khan, Vladimir Oltean, netdev, linux-kernel, bridge,
linux-kselftest, Max Hunter, Kieran Tyrrell
> We could add (e.g.) bridge_setup_tc to dsa_switch_ops, which (in the
> mv88e6xxx implementation) could validate the bridge contained all
> user ports. But it would not be possible to block a port from
> leaving as port_bridge_leave cannot return an error, and tearing
> MQPRIO config down silently would be a different sort of bad.
There can be multiple bridges, even as far as one bridge per port.
For switch wide properties, you basically have to allow the first user
to configure it, refcount additional users as they get added and removed,
and only allow the last user to change it.
The alternative is to return -EOPNOTSUPP, and let the kernel do it in
software, if a user wants something different to the global
setting. The hardware is only there to accelerate what Linux can
already do in software. That is the model we use.
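To make the refcount model concrete, here is a minimal userspace sketch. It is not code from this series or from mv88e6xxx: struct global_map and the global_map_*() helpers are hypothetical names, and the table simply stands in for whatever switch-wide priority-to-queue mapping the hardware has. The first user programs the table, later users must ask for the same thing, and only a sole remaining user may change it. Anything else gets -EOPNOTSUPP so the stack can fall back to software.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define NUM_PRIO 8

/* Hypothetical model of one switch-wide priority-to-queue table. */
struct global_map {
	unsigned char prio_to_queue[NUM_PRIO];
	unsigned int users;	/* ports currently relying on the table */
};

/* A port requests the mapping 'want'. The first user programs the table;
 * later users must match it exactly, otherwise the request is refused.
 */
static int global_map_get(struct global_map *gm,
			  const unsigned char want[NUM_PRIO])
{
	if (gm->users == 0)
		memcpy(gm->prio_to_queue, want, NUM_PRIO);
	else if (memcmp(gm->prio_to_queue, want, NUM_PRIO) != 0)
		return -EOPNOTSUPP;

	gm->users++;
	return 0;
}

/* Only a sole remaining user may rewrite the shared table. */
static int global_map_change(struct global_map *gm,
			     const unsigned char want[NUM_PRIO])
{
	if (gm->users != 1)
		return -EOPNOTSUPP;

	memcpy(gm->prio_to_queue, want, NUM_PRIO);
	return 0;
}

/* Drop one reference; returns true once no user is left. */
static bool global_map_put(struct global_map *gm)
{
	if (gm->users > 0)
		gm->users--;
	return gm->users == 0;
}
```

Note the sketch says nothing about ports that never call global_map_get(): they simply see whatever the table currently holds.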
Andrew
* Re: [PATCH net-next v2 5/6] net: dsa: mv88e6xxx: MQPRIO support
2026-06-03 2:41 ` Andrew Lunn
@ 2026-06-03 3:29 ` Luke Howard
2026-06-04 6:26 ` Luke Howard
0 siblings, 1 reply; 30+ messages in thread
From: Luke Howard @ 2026-06-03 3:29 UTC (permalink / raw)
To: Andrew Lunn
Cc: Cedric Jehasse, Jiri Pirko, Ivan Vecera, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Nikolay Aleksandrov, Ido Schimmel, Andrew Lunn, David Ahern,
Shuah Khan, Vladimir Oltean, netdev, linux-kernel, bridge,
linux-kselftest, Max Hunter, Kieran Tyrrell
> On 3 Jun 2026, at 12:41 pm, Andrew Lunn <andrew@lunn.ch> wrote:
>
>> We could add (e.g.) bridge_setup_tc to dsa_switch_ops, which (in the
>> mv88e6xxx implementation) could validate the bridge contained all
>> user ports. But it would not be possible to block a port from
>> leaving as port_bridge_leave cannot return an error, and tearing
>> MQPRIO config down silently would be a different sort of bad.
>
> There can be multiple bridges, even as far as one bridge per port.
Right, per the above mv88e6xxx would return -EOPNOTSUPP if the bridge did not contain all DSA user ports. But without a means to block the removal of ports from bridges it would not be possible to maintain this invariant.
> For switch wide properties, you basically have to allow the first user
> to configure it, refcount additional users as they get added and removed,
> and only allow the last user to change it.
Indeed, that’s what this series does. But it comes with the discussed caveat that ports without MQPRIO (whether or not they are part of a bridge) implicitly inherit the global FPri to QPri mapping. I think that is acceptable given that users should have no expectations about the default mapping, but others may disagree.
> The alternative is to return -EOPNOTSUPP, and let the kernel do it in
> software, if a user wants something different to the global
> setting. The hardware is only there to accelerate what Linux can
> already do in software. That is the model we use.
Yes, this is how it is implemented.
Luke
* Re: [PATCH net-next v2 5/6] net: dsa: mv88e6xxx: MQPRIO support
2026-06-03 2:09 ` Luke Howard
@ 2026-06-03 3:30 ` Luke Howard
0 siblings, 0 replies; 30+ messages in thread
From: Luke Howard @ 2026-06-03 3:30 UTC (permalink / raw)
To: Cedric Jehasse
Cc: Jiri Pirko, Ivan Vecera, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Nikolay Aleksandrov,
Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan, Andrew Lunn,
Vladimir Oltean, netdev, linux-kernel, bridge, linux-kselftest,
Max Hunter, Kieran Tyrrell
>> Good point, it’s interesting there is egress mapping on the 6390 but only for FPri to DSCP. The Frame Priority Table set ordinal one would expect for egress QPri mapping is “reserved for future use”. (Be nice if it were an undocumented feature.)
>
> I misread the data sheet. Egress PCP to FPRI mapping is possible per-port on the 6390 family. This doesn’t resolve the other issue (MQPRIO-less ports on the 6352 family). But it is good news and I will revise accordingly in the next patch revision.
I did not misread the data sheet. The 6390 family does not support egress FPri to QPri mapping. It supports egress FPri to _PCP_ mapping (i.e. the internal and wire representations of the priority can differ). Clearly needed a third coffee.
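To illustrate the distinction, a small userspace model follows. It is only a sketch of the behaviour as described in this thread, not register-level code: struct sw_model, sw_egress() and the table layout are made up for illustration. Queue selection uses one switch-wide FPri to QPri table, while each port has its own egress FPri to PCP table, so a per-port change alters the tag on the wire but never the queue.

```c
#include <stdint.h>

#define NUM_PRIO 8
#define NUM_PORTS 2

/* Hypothetical model: one switch-wide FPri -> QPri table, plus a per-port
 * egress FPri -> PCP table as described above for the 6390 family.
 */
struct sw_model {
	uint8_t fpri_to_qpri[NUM_PRIO];
	uint8_t egress_fpri_to_pcp[NUM_PORTS][NUM_PRIO];
};

struct egress_result {
	uint8_t queue;		/* egress queue the frame is placed in */
	uint8_t wire_pcp;	/* PCP value written into the VLAN tag */
};

/* Caller must pass port < NUM_PORTS; fpri is masked to 0..7. */
static struct egress_result sw_egress(const struct sw_model *sw,
				      unsigned int port, uint8_t fpri)
{
	struct egress_result r = {
		.queue = sw->fpri_to_qpri[fpri & 7],
		.wire_pcp = sw->egress_fpri_to_pcp[port][fpri & 7],
	};

	return r;
}
```

With FPri 3 mapped to queue 1 globally and only port 1 rewriting FPri 3 to PCP 5, both ports in this model still enqueue the frame in queue 1; only the wire PCP differs.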
Luke
* Re: [PATCH net-next v2 3/6] net: bridge: add 802.1Qat stream reservation admission control
2026-06-02 0:43 ` [PATCH net-next v2 3/6] net: bridge: add 802.1Qat stream reservation admission control Luke Howard
2026-06-02 1:28 ` Luke Howard
@ 2026-06-03 7:35 ` Nikolay Aleksandrov
2026-06-04 5:39 ` Luke Howard
1 sibling, 1 reply; 30+ messages in thread
From: Nikolay Aleksandrov @ 2026-06-03 7:35 UTC (permalink / raw)
To: Luke Howard, Jiri Pirko, Ivan Vecera, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan, Andrew Lunn,
Vladimir Oltean
Cc: netdev, linux-kernel, bridge, linux-kselftest, Max Hunter,
Kieran Tyrrell
On 02/06/2026 03:43, Luke Howard wrote:
> Add the BR_FILTER_STREAM_RESERVED bridge port flag, gated by
> CONFIG_BRIDGE_8021Q_SRP, which may be used to enforce 802.1Qat admission
> control on ports that have it set.
>
> A frame received by a port with the flag set, and whose 802.1p priority
> maps (via an MQPRIO/TAPRIO Qdisc on the bridge) to a non-zero traffic
> class, is admitted only if it belongs to a reserved stream. Reserved
> streams are multicast frames whose MDB entry has FLAGS_STREAM_RESERVED
> set. Unicast and broadcast frames sharing this priority are dropped.
>
> Non-admitted frames are dropped after source address learning.
>
> Multicast snooping must be enabled on the bridge for admission control
> to function correctly: with snooping disabled, no MDB entries exist and
> all SR-class multicast frames on SR-filtered ports would be dropped.
>
> There is no support for reserved unicast streams at present; although
> permitted by 802.1Qat (SRP) they are rarely used in practice.
>
> The choice to not allow configurable mapping of traffic classes was made
> in the interest of simplicity and keeping the number of additional
> instructions in the forwarding path to a minimum. It is anticipated this
> will suffice for the common AVB/TSN use case.
>
> Assisted-by: Claude:claude-opus-4-8
> Signed-off-by: Luke Howard <lukeh@padl.com>
> ---
> include/linux/if_bridge.h | 1 +
> include/uapi/linux/if_link.h | 9 +
> net/bridge/Kconfig | 22 +
> net/bridge/br_input.c | 59 ++-
> net/bridge/br_netlink.c | 8 +-
> net/bridge/br_private.h | 3 +-
> net/bridge/br_switchdev.c | 3 +-
> net/core/rtnetlink.c | 2 +-
> tools/testing/selftests/net/forwarding/Makefile | 1 +
> .../net/forwarding/bridge_mdb_stream_reserved.sh | 536 +++++++++++++++++++++
> tools/testing/selftests/net/forwarding/config | 2 +
> 11 files changed, 641 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
> index ec9ffea1e46ed..a5e91bace0464 100644
> --- a/include/linux/if_bridge.h
> +++ b/include/linux/if_bridge.h
> @@ -62,6 +62,7 @@ struct br_ip_list {
> #define BR_PORT_MAB BIT(22)
> #define BR_NEIGH_VLAN_SUPPRESS BIT(23)
> #define BR_NEIGH_FORWARD_GRAT BIT(24)
> +#define BR_FILTER_STREAM_RESERVED BIT(25)
>
> #define BR_DEFAULT_AGEING_TIME (300 * HZ)
>
> diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
> index 46413392b402c..80f5a2b4162c8 100644
> --- a/include/uapi/linux/if_link.h
> +++ b/include/uapi/linux/if_link.h
> @@ -1106,6 +1106,14 @@ enum {
> * backup port that has VLAN tunnel mapping enabled (via the
> * *IFLA_BRPORT_VLAN_TUNNEL* option). Setting a value of 0 (default) has
> * the effect of not attaching any ID.
> + *
> + * @IFLA_BRPORT_FILTER_STREAM_RESERVED
> + * Controls whether the port enforces 802.1Qat stream reservation
> + * admission control. When enabled, a frame received on the port whose
> + * 802.1p priority maps (via an MQPRIO/TAPRIO Qdisc on the bridge) to a
> + * non-zero traffic class is dropped at ingress unless it belongs to a
> + * reserved stream, i.e. it is multicast and its destination address has a
> + * stream-reserved MDB entry. The flag is off by default.
> */
> enum {
> IFLA_BRPORT_UNSPEC,
> @@ -1154,6 +1162,7 @@ enum {
> IFLA_BRPORT_NEIGH_VLAN_SUPPRESS,
> IFLA_BRPORT_BACKUP_NHID,
> IFLA_BRPORT_NEIGH_FORWARD_GRAT,
> + IFLA_BRPORT_FILTER_STREAM_RESERVED,
> __IFLA_BRPORT_MAX
> };
> #define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
> diff --git a/net/bridge/Kconfig b/net/bridge/Kconfig
> index 318715c8fc9bc..7e46b791922a6 100644
> --- a/net/bridge/Kconfig
> +++ b/net/bridge/Kconfig
> @@ -47,6 +47,28 @@ config BRIDGE_IGMP_SNOOPING
>
> If unsure, say Y.
>
> +config BRIDGE_8021Q_SRP
> + bool "802.1Qat Stream Reservation Protocol (SRP) admission control"
> + depends on BRIDGE
> + default n
While it's ok to have a kconfig, from experience almost all distros will
enable this, so you have to better optimize the software fast-path and
make this feature almost a no-op when not enabled, but more below..
This is also a very small feature and I'm not yet convinced it has to be
in the bridge at all, you can drop the kconfig.
> + help
> + If you say Y here, then the Ethernet bridge will enforce 802.1Qat
> + stream reservation admission control in software on ingress ports
> + that have the BR_FILTER_STREAM_RESERVED flag set: a frame whose
> + 802.1p priority maps (via an MQPRIO/TAPRIO Qdisc on the bridge) to
> + a non-zero traffic class is dropped unless it is multicast and its
> + destination address has MDB_FLAGS_STREAM_RESERVED set on its MDB entry.
> +
> + This option only controls software enforcement. The
> + BR_FILTER_STREAM_RESERVED port flag and MDB_FLAGS_STREAM_RESERVED
> + MDB flag are always accepted from user space and propagated via
> + switchdev so that hardware-offloading switches can enforce
> + admission control even when this option is disabled.
> +
> + Say N to exclude software enforcement and reduce the binary size.
> +
> + If unsure, say N.
> +
> config BRIDGE_VLAN_FILTERING
> bool "VLAN filtering"
> depends on BRIDGE
> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
> index 5787066b1f4cb..2e8aa19a9b542 100644
> --- a/net/bridge/br_input.c
> +++ b/net/bridge/br_input.c
> @@ -11,6 +11,7 @@
> #include <linux/kernel.h>
> #include <linux/netdevice.h>
> #include <linux/etherdevice.h>
> +#include <linux/if_vlan.h>
> #include <linux/netfilter_bridge.h>
> #ifdef CONFIG_NETFILTER_FAMILY_BRIDGE
> #include <net/netfilter/nf_queue.h>
> @@ -72,6 +73,57 @@ static int br_pass_frame_up(struct sk_buff *skb, bool promisc)
> br_netif_receive_skb);
> }
>
> +#ifdef CONFIG_BRIDGE_8021Q_SRP
> +/* Return true if the bridge has an MQPRIO/TAPRIO Qdisc that maps the
> + * frame's VLAN PCP to a non-zero traffic class.
> + */
> +static bool br_skb_is_sr_class(const struct net_bridge *br,
> + const struct sk_buff *skb)
> +{
> + if (!skb_vlan_tag_present(skb) || !netdev_get_num_tc(br->dev))
> + return false;
> +
> + return netdev_get_prio_tc_map(br->dev, skb_vlan_tag_get_prio(skb)) != 0;
> +}
> +
> +/* 802.1Qat admission control: a frame whose priority maps to a non-zero
> + * TC and which ingresses a port with BR_FILTER_STREAM_RESERVED is admitted
> + * only if it belongs to a reserved stream. Only multicast can be a reserved
> + * stream: either via an MDB port-group member with MDB_PG_FLAGS_STREAM_RESERVED,
> + * or via a host-group entry marked BRIDGE_MDBE_F_HOST_STREAM_RESERVED.
> + */
So looking at how this is implemented, why not put most of it in TC?
It is testing for skb class, for tc qdisc, I don't see a reason for it
to be in the bridge at all. You can filter the mcast groups and simulate
the "reserved" flag, it has to be set manually anyway. Adding new
tests in the bridge software fast-path just for this is a waste.
> +static bool br_sr_admission_denied(const struct net_bridge_port *p,
> + const struct sk_buff *skb,
> + const struct net_bridge_mdb_entry *mdst)
> +{
avoid double negatives
> + const struct net_bridge_port_group *pg;
> +
> + if (!(p->flags & BR_FILTER_STREAM_RESERVED) ||
> + !br_skb_is_sr_class(p->br, skb))
> + return false;
> +
> + if (!mdst)
> + return true;
> +
> + if (mdst->flags & BRIDGE_MDBE_F_HOST_STREAM_RESERVED)
> + return false;
> +
> + for (pg = rcu_dereference(mdst->ports); pg;
> + pg = rcu_dereference(pg->next))
> + if (pg->flags & MDB_PG_FLAGS_STREAM_RESERVED)
> + return false;
> +
> + return true;
> +}
> +#else
> +static inline bool br_sr_admission_denied(const struct net_bridge_port *p,
> + const struct sk_buff *skb,
> + const struct net_bridge_mdb_entry *mdst)
> +{
> + return false;
> +}
such definitions live in br_private.h, not in .c files, also no inlines in .c
files
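For what it's worth, the decision in the hunk above reads more easily with a positively named predicate. Below is a standalone userspace sketch of that rule, under the assumption that I have read the quoted code correctly. The struct names and sr_admitted() are hypothetical and are not the bridge's real types; the single stream_reserved flag stands in for both the port-group and host-group flags.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_PRIO 8

/* Hypothetical stand-ins for the state the quoted hunk consults. */
struct br_model {
	uint8_t num_tc;			/* 0: no mqprio/taprio on the bridge */
	uint8_t prio_tc_map[NUM_PRIO];	/* 802.1p priority -> traffic class */
};

struct port {
	const struct br_model *br;
	bool filter_stream_reserved;
};

struct frame {
	bool vlan_tagged;
	uint8_t pcp;			/* 802.1p priority, 0..7 */
	bool multicast;
};

struct mdb_entry {
	bool stream_reserved;		/* port group or host entry is reserved */
};

/* True if the frame's priority maps to a non-zero traffic class. */
static bool frame_is_sr_class(const struct br_model *br, const struct frame *f)
{
	if (!f->vlan_tagged || br->num_tc == 0)
		return false;

	return br->prio_tc_map[f->pcp & 7] != 0;
}

/* Positive form of the check: true means "forward the frame". */
static bool sr_admitted(const struct port *p, const struct frame *f,
			const struct mdb_entry *mdst)
{
	if (!p->filter_stream_reserved || !frame_is_sr_class(p->br, f))
		return true;

	/* SR-class traffic is admitted only for reserved multicast streams. */
	return f->multicast && mdst != NULL && mdst->stream_reserved;
}
```

Using the selftest's map (PCP 3 to TC 1, everything else to TC 0), a PCP 3 multicast frame is admitted only when its MDB entry is reserved, PCP 3 unicast is always refused on a filtering port, and PCP 0 traffic is untouched.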
> +#endif
> +
> /* note: already called with rcu_read_lock */
> int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
> {
> @@ -183,9 +235,14 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
> br_do_suppress_nd(skb, br, vid, p, msg);
> }
>
> + mdst = pkt_type == BR_PKT_MULTICAST ?
> + br_mdb_entry_skb_get(brmctx, skb, vid) : NULL;
> +
> + if (br_sr_admission_denied(p, skb, mdst))
> + goto drop;
> +
so I don't see the point of this code, why not just add the new check below?
it's ugly like this
> switch (pkt_type) {
> case BR_PKT_MULTICAST:
> - mdst = br_mdb_entry_skb_get(brmctx, skb, vid);
> if ((mdst || BR_INPUT_SKB_CB_MROUTERS_ONLY(skb)) &&
> br_multicast_querier_exists(brmctx, eth_hdr(skb), mdst)) {
> if ((mdst && (mdst->flags & BRIDGE_MDBE_F_HOST_JOINED)) ||
> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
> index a104b25c871d2..99e2a19255773 100644
> --- a/net/bridge/br_netlink.c
> +++ b/net/bridge/br_netlink.c
> @@ -191,6 +191,7 @@ static inline size_t br_port_info_size(void)
> + nla_total_size(1) /* IFLA_BRPORT_MAB */
> + nla_total_size(1) /* IFLA_BRPORT_NEIGH_VLAN_SUPPRESS */
> + nla_total_size(1) /* IFLA_BRPORT_NEIGH_FORWARD_GRAT */
> + + nla_total_size(1) /* IFLA_BRPORT_FILTER_STREAM_RESERVED */
> + nla_total_size(sizeof(struct ifla_bridge_id)) /* IFLA_BRPORT_ROOT_ID */
> + nla_total_size(sizeof(struct ifla_bridge_id)) /* IFLA_BRPORT_BRIDGE_ID */
> + nla_total_size(sizeof(u16)) /* IFLA_BRPORT_DESIGNATED_PORT */
> @@ -285,7 +286,9 @@ static int br_port_fill_attrs(struct sk_buff *skb,
> nla_put_u8(skb, IFLA_BRPORT_NEIGH_VLAN_SUPPRESS,
> !!(p->flags & BR_NEIGH_VLAN_SUPPRESS)) ||
> nla_put_u8(skb, IFLA_BRPORT_NEIGH_FORWARD_GRAT,
> - !!(p->flags & BR_NEIGH_FORWARD_GRAT)))
> + !!(p->flags & BR_NEIGH_FORWARD_GRAT)) ||
> + nla_put_u8(skb, IFLA_BRPORT_FILTER_STREAM_RESERVED,
> + !!(p->flags & BR_FILTER_STREAM_RESERVED)))
> return -EMSGSIZE;
>
> timerval = br_timer_value(&p->message_age_timer);
> @@ -906,6 +909,7 @@ static const struct nla_policy br_port_policy[IFLA_BRPORT_MAX + 1] = {
> [IFLA_BRPORT_NEIGH_VLAN_SUPPRESS] = NLA_POLICY_MAX(NLA_U8, 1),
> [IFLA_BRPORT_BACKUP_NHID] = { .type = NLA_U32 },
> [IFLA_BRPORT_NEIGH_FORWARD_GRAT] = NLA_POLICY_MAX(NLA_U8, 1),
> + [IFLA_BRPORT_FILTER_STREAM_RESERVED] = NLA_POLICY_MAX(NLA_U8, 1),
> };
>
> /* Change the state of the port and notify spanning tree */
> @@ -976,6 +980,8 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[],
> BR_NEIGH_VLAN_SUPPRESS);
> br_set_port_flag(p, tb, IFLA_BRPORT_NEIGH_FORWARD_GRAT,
> BR_NEIGH_FORWARD_GRAT);
> + br_set_port_flag(p, tb, IFLA_BRPORT_FILTER_STREAM_RESERVED,
> + BR_FILTER_STREAM_RESERVED);
>
> if ((p->flags & BR_PORT_MAB) &&
> (!(p->flags & BR_PORT_LOCKED) || !(p->flags & BR_LEARNING))) {
> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
> index 1e0eefaf50dd1..4ae050ae4826e 100644
> --- a/net/bridge/br_private.h
> +++ b/net/bridge/br_private.h
> @@ -373,7 +373,8 @@ struct net_bridge_port_group {
> struct rcu_head rcu;
> };
>
> -#define BRIDGE_MDBE_F_HOST_JOINED BIT(0)
> +#define BRIDGE_MDBE_F_HOST_JOINED BIT(0)
> +#define BRIDGE_MDBE_F_HOST_STREAM_RESERVED BIT(1)
>
> struct net_bridge_mdb_entry {
> struct rhash_head rhnode;
> diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
> index 39535f1a6b8ce..7b531d483817c 100644
> --- a/net/bridge/br_switchdev.c
> +++ b/net/bridge/br_switchdev.c
> @@ -76,7 +76,8 @@ bool nbp_switchdev_allowed_egress(const struct net_bridge_port *p,
> /* Flags that can be offloaded to hardware */
> #define BR_PORT_FLAGS_HW_OFFLOAD (BR_LEARNING | BR_FLOOD | BR_PORT_MAB | \
> BR_MCAST_FLOOD | BR_BCAST_FLOOD | BR_PORT_LOCKED | \
> - BR_HAIRPIN_MODE | BR_ISOLATED | BR_MULTICAST_TO_UNICAST)
> + BR_HAIRPIN_MODE | BR_ISOLATED | BR_MULTICAST_TO_UNICAST | \
> + BR_FILTER_STREAM_RESERVED)
>
> int br_switchdev_set_port_flag(struct net_bridge_port *p,
> unsigned long flags,
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index 652dd008955a9..8ad7f1d0357b2 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -63,7 +63,7 @@
> #include "dev.h"
>
> #define RTNL_MAX_TYPE 50
> -#define RTNL_SLAVE_MAX_TYPE 45
> +#define RTNL_SLAVE_MAX_TYPE 46
>
> struct rtnl_link {
> rtnl_doit_func doit;
> diff --git a/tools/testing/selftests/net/forwarding/Makefile b/tools/testing/selftests/net/forwarding/Makefile
> index bbaf4d937dd8b..3899551db05b9 100644
> --- a/tools/testing/selftests/net/forwarding/Makefile
> +++ b/tools/testing/selftests/net/forwarding/Makefile
> @@ -10,6 +10,7 @@ TEST_PROGS := \
> bridge_mdb_host.sh \
> bridge_mdb_max.sh \
> bridge_mdb_port_down.sh \
> + bridge_mdb_stream_reserved.sh \
> bridge_mld.sh \
> bridge_port_isolation.sh \
> bridge_sticky_fdb.sh \
> diff --git a/tools/testing/selftests/net/forwarding/bridge_mdb_stream_reserved.sh b/tools/testing/selftests/net/forwarding/bridge_mdb_stream_reserved.sh
> new file mode 100755
> index 0000000000000..a21dc2ec3e95c
> --- /dev/null
> +++ b/tools/testing/selftests/net/forwarding/bridge_mdb_stream_reserved.sh
this should be a separate patch
> @@ -0,0 +1,536 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +
> +# Test 802.1Qat stream reservation admission control. A bridge port with the
> +# BR_FILTER_STREAM_RESERVED flag set (bridge link set ... filter_stream_reserved
> +# on) polices multicast it receives: a frame whose 802.1p priority maps (via an
> +# mqprio/TC configuration on the bridge netdev) to a non-zero traffic class is
> +# admitted only if its destination is a reserved stream, i.e. has an MDB entry
> +# with the stream_reserved flag (the allow-list, typically maintained by an SRP
> +# daemon). Other SR-class multicast is dropped at ingress, so it reaches neither
> +# the host nor any port. TC 0 traffic, and traffic on ports without the flag,
> +# is unaffected.
> +#
> +# +------------------------+
> +# | H1 (vrf) - talker |
> +# | + $h1 |
> +# +----|-------------------+
> +# | PCP-tagged mcast
> +# +-----------------------------------|------------------------------------+
> +# | SW $swp1 (filter_stream_reserved) BR0 (802.1q, mqprio) |
> +# | + |
> +# | + $swp2 (listener) + $swp3 (listener) |
> +# +------------------|-------------------------|---------------------------+
> +# | |
> +# +--------------|---------+ +-----------|------------+
> +# | H2 (vrf) - listener | | H3 (vrf) - listener |
> +# | + $h2 | | + $h3 |
> +# +------------------------+ +------------------------+
> +
> +ALL_TESTS="
> + cfg_test
> + fwd_sr_member_test
> + fwd_foreign_blocked_test
> + fwd_unicast_blocked_test
> + fwd_flag_gates_test
> + fwd_tc_toggle_test
> + fwd_flag_toggle_test
> + fwd_sr_ipv6_test
> +"
> +
> +NUM_NETIFS=6
> +source lib.sh
> +source tc_common.sh
> +
> +# GRP is the stream-reserved group; GRP2 is a plain group that both swp2 and
> +# swp3 join, used to show that a foreign SR-class group is dropped at ingress.
> +GRP=239.1.1.1
> +GRP_DMAC=01:00:5e:01:01:01
> +GRP2=239.1.1.2
> +GRP2_DMAC=01:00:5e:01:01:02
> +GRP3=239.1.1.3
> +GRP3_DMAC=01:00:5e:01:01:03
> +# IPv6 (MLD) groups: GRP6 is stream-reserved, GRP6B is a plain group.
> +GRP6=ff0e::1
> +GRP6_DMAC=33:33:00:00:00:01
> +GRP6B=ff0e::2
> +GRP6B_DMAC=33:33:00:00:00:02
> +# Source for the (S, G) configuration check.
> +SRC=192.0.2.10
> +# PCP 3 is SR class A; the mqprio map below sends it to TC 1.
> +SR_PCP=3
> +BE_PCP=0
> +VID=10
> +
> +h1_create()
> +{
> + simple_if_init $h1
> + vlan_create $h1 $VID v$h1 192.0.2.1/28
> +}
> +
> +h1_destroy()
> +{
> + vlan_destroy $h1 $VID
> + simple_if_fini $h1
> +}
> +
> +h2_create()
> +{
> + simple_if_init $h2
> + vlan_create $h2 $VID v$h2 192.0.2.2/28
> +}
> +
> +h2_destroy()
> +{
> + vlan_destroy $h2 $VID
> + simple_if_fini $h2
> +}
> +
> +h3_create()
> +{
> + simple_if_init $h3
> + vlan_create $h3 $VID v$h3 192.0.2.3/28
> +}
> +
> +h3_destroy()
> +{
> + vlan_destroy $h3 $VID
> + simple_if_fini $h3
> +}
> +
> +switch_create()
> +{
> + # The bridge must have multiple TX queues so that an mqprio qdisc (which
> + # populates the netdev prio->tc map the SR filter consults) can be
> + # attached, and a multicast querier so that the bridge forwards
> + # selectively.
> + ip link add name br0 numtxqueues 8 numrxqueues 8 type bridge \
> + vlan_filtering 1 vlan_default_pvid 0 \
> + mcast_snooping 1 mcast_igmp_version 3 mcast_mld_version 2 \
> + mcast_querier 1
> + bridge vlan add vid $VID dev br0 self
> + ip link set dev br0 up
> +
> + # A link-local address lets the bridge act as the IPv6 (MLD) querier,
> + # mirroring the IGMP querier.
> + ip address add fe80::1/64 dev br0 nodad
> +
> + local swp
> + for swp in $swp1 $swp2 $swp3; do
> + ip link set dev $swp master br0
> + ip link set dev $swp up
> + bridge vlan add vid $VID dev $swp
> + done
> +
> + # PCP $SR_PCP -> TC 1, everything else -> TC 0 (software mode).
> + tc qdisc add dev br0 root handle 100: mqprio num_tc 2 \
> + map 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 \
> + queues 1@0 1@1 hw 0
> +
> + tc qdisc add dev $h2 clsact
> + tc qdisc add dev $h3 clsact
> +
> + # Wait for the bridge's own querier to become active.
> + sleep 10
> +}
> +
> +switch_destroy()
> +{
> + tc qdisc del dev $h3 clsact
> + tc qdisc del dev $h2 clsact
> + tc qdisc del dev br0 root handle 100: mqprio 2>/dev/null
> +
> + local swp
> + for swp in $swp3 $swp2 $swp1; do
> + bridge vlan del vid $VID dev $swp
> + ip link set dev $swp down
> + ip link set dev $swp nomaster
> + done
> +
> + ip link set dev br0 down
> + bridge vlan del vid $VID dev br0 self
> + ip link del dev br0
> +}
> +
> +setup_prepare()
> +{
> + h1=${NETIFS[p1]}
> + swp1=${NETIFS[p2]}
> + swp2=${NETIFS[p3]}
> + h2=${NETIFS[p4]}
> + swp3=${NETIFS[p5]}
> + h3=${NETIFS[p6]}
> +
> + vrf_prepare
> + forwarding_enable
> +
> + h1_create
> + h2_create
> + h3_create
> + switch_create
> +}
> +
> +cleanup()
> +{
> + pre_cleanup
> +
> + switch_destroy
> + h3_destroy
> + h2_destroy
> + h1_destroy
> +
> + forwarding_restore
> + vrf_cleanup
> +}
> +
> +# Arm or disarm SR-class admission control on a bridge port
> +# (BR_FILTER_STREAM_RESERVED).
> +sr_filter()
> +{
> + local dev=$1 onoff=$2
> +
> + bridge link set dev $dev filter_stream_reserved $onoff
> +}
> +
> +# Probe whether the running kernel and iproute2 understand the MDB flag and the
> +# port flag. If not, the whole suite is skipped, so it is safe to invoke on an
> +# unpatched system.
> +stream_reserved_supported()
> +{
> + bridge mdb add dev br0 port $swp2 grp $GRP permanent vid $VID \
> + stream_reserved 2>/dev/null
> + if [[ $? -ne 0 ]]; then
> + return 1
> + fi
> + bridge mdb del dev br0 port $swp2 grp $GRP permanent vid $VID
> +
> + sr_filter $swp1 on 2>/dev/null || return 1
> + sr_filter $swp1 off
> + return 0
> +}
> +
> +cfg_test()
> +{
> + RET=0
> +
> + # stream_reserved entries must be permanent.
> + bridge mdb add dev br0 port $swp2 grp $GRP vid $VID \
> + stream_reserved 2>/dev/null
> + check_fail $? "non-permanent stream_reserved port entry accepted"
> +
> + # The flag must be rejected on host groups.
> + bridge mdb add dev br0 port br0 grp $GRP permanent vid $VID \
> + stream_reserved 2>/dev/null
> + check_fail $? "stream_reserved accepted on a host group"
> +
> + # Add a port group with the flag and confirm it is reflected in dump.
> + bridge mdb add dev br0 port $swp2 grp $GRP permanent vid $VID \
> + stream_reserved
> + check_err $? "Failed to add stream_reserved entry"
> + bridge -d mdb show dev br0 | grep -q "$GRP.*stream_reserved"
> + check_err $? "stream_reserved flag not shown in dump"
> +
> + # The other state flags must not be disturbed: a permanent entry stays
> + # permanent and carries no group timer when stream_reserved is set.
> + bridge -d mdb get dev br0 grp $GRP vid $VID | grep -q "permanent"
> + check_err $? "stream_reserved entry not kept \"permanent\""
> + bridge -d -s mdb get dev br0 grp $GRP vid $VID | grep -q " 0.00"
> + check_err $? "\"permanent\" stream_reserved entry has a pending group timer"
> +
> + # The flag is also accepted, and reported, on a source-specific (S, G).
> + bridge mdb add dev br0 port $swp2 grp $GRP3 src $SRC permanent vid $VID \
> + stream_reserved
> + check_err $? "stream_reserved rejected on an (S, G) entry"
> + bridge -d mdb show dev br0 | grep "$SRC" | grep -q stream_reserved
> + check_err $? "stream_reserved flag not shown on (S, G) entry"
> + bridge mdb del dev br0 port $swp2 grp $GRP3 src $SRC vid $VID
> +
> + # Replacing without the flag must clear it.
> + bridge mdb replace dev br0 port $swp2 grp $GRP permanent vid $VID
> + bridge -d mdb show dev br0 | grep -q "$GRP.*stream_reserved"
> + check_fail $? "stream_reserved flag not cleared on replace"
> +
> + bridge mdb del dev br0 port $swp2 grp $GRP permanent vid $VID
> +
> + # The port flag round-trips through netlink and is shown in the dump.
> + sr_filter $swp1 on
> + check_err $? "Failed to set filter_stream_reserved on a port"
> + bridge -d link show dev $swp1 | grep -q "filter_stream_reserved on"
> + check_err $? "filter_stream_reserved not shown in link dump"
> + sr_filter $swp1 off
> +
> + log_test "MDB stream_reserved configuration"
> +}
> +
> +rx_filter_install()
> +{
> + local dev=$1 pref=$2 grp=$3 ethtype=${4:-ipv4}
> +
> + tc filter add dev $dev ingress protocol 802.1q pref $pref handle $pref \
> + flower vlan_ethtype $ethtype vlan_id $VID dst_ip $grp action drop
> +}
> +
> +rx_filter_uninstall()
> +{
> + local dev=$1 pref=$2
> +
> + tc filter del dev $dev ingress protocol 802.1q pref $pref handle $pref \
> + flower
> +}
> +
> +send_mc()
> +{
> + local grp=$1 dmac=$2 pcp=$3
> +
> + $MZ $h1 -a own -b $dmac -c 1 -p 64 \
> + -A 192.0.2.1 -B $grp -t udp -Q $pcp:$VID -q
> +}
> +
> +send_mc6()
> +{
> + local grp=$1 dmac=$2 pcp=$3
> +
> + $MZ -6 $h1 -a own -b $dmac -c 1 -p 64 \
> + -A 2001:db8:1::1 -B $grp -t udp -Q $pcp:$VID -q
> +}
> +
> +# An arbitrary unicast DA: the bridge floods it as unknown unicast, so it
> +# reaches h2 unless dropped at ingress.
> +UC_DMAC=00:de:ad:be:ef:02
> +
> +send_uc()
> +{
> + local dip=$1 pcp=$2
> +
> + $MZ $h1 -a own -b $UC_DMAC -c 1 -p 64 \
> + -A 192.0.2.1 -B $dip -t udp -Q $pcp:$VID -q
> +}
> +
> +# An SR-class frame for a reserved stream is admitted on a filtering port and
> +# delivered to the stream's member.
> +fwd_sr_member_test()
> +{
> + RET=0
> +
> + sr_filter $swp1 on
> + bridge mdb add dev br0 port $swp2 grp $GRP permanent vid $VID \
> + stream_reserved
> + rx_filter_install $h2 1 $GRP
> +
> + send_mc $GRP $GRP_DMAC $SR_PCP
> + tc_check_packets "dev $h2 ingress" 1 1
> + check_err $? "reserved-stream SR-class frame not admitted to its member"
> +
> + rx_filter_uninstall $h2 1
> + bridge mdb del dev br0 port $swp2 grp $GRP permanent vid $VID
> + sr_filter $swp1 off
> +
> + log_test "MDB stream_reserved member delivery"
> +}
> +
> +# swp1 filters SR-class ingress. A foreign (non-reserved) group GRP2 at SR class
> +# is dropped at ingress, reaching neither listener, while a best-effort (TC 0)
> +# frame is admitted and delivered to both.
> +fwd_foreign_blocked_test()
> +{
> + RET=0
> +
> + sr_filter $swp1 on
> + bridge mdb add dev br0 port $swp2 grp $GRP2 permanent vid $VID
> + bridge mdb add dev br0 port $swp3 grp $GRP2 permanent vid $VID
> +
> + rx_filter_install $h2 2 $GRP2
> + rx_filter_install $h3 2 $GRP2
> +
> + # SR-class: dropped at ingress, reaches neither listener.
> + send_mc $GRP2 $GRP2_DMAC $SR_PCP
> + tc_check_packets "dev $h2 ingress" 2 0
> + check_err $? "foreign SR-class frame leaked to a listener"
> + tc_check_packets "dev $h3 ingress" 2 0
> + check_err $? "foreign SR-class frame leaked to a listener"
> +
> + # Best-effort (TC 0): unaffected, delivered to both.
> + send_mc $GRP2 $GRP2_DMAC $BE_PCP
> + tc_check_packets "dev $h2 ingress" 2 1
> + check_err $? "best-effort frame not delivered"
> + tc_check_packets "dev $h3 ingress" 2 1
> + check_err $? "best-effort frame not delivered"
> +
> + rx_filter_uninstall $h3 2
> + rx_filter_uninstall $h2 2
> +
> + bridge mdb del dev br0 port $swp3 grp $GRP2 permanent vid $VID
> + bridge mdb del dev br0 port $swp2 grp $GRP2 permanent vid $VID
> + sr_filter $swp1 off
> +
> + log_test "MDB stream_reserved blocks foreign SR-class group at ingress"
> +}
> +
> +# Unicast cannot belong to a reserved stream, so an SR-class unicast frame is
> +# dropped at a filtering ingress port (otherwise it would consume the AVB
> +# queue's reserved bandwidth). A best-effort unicast frame is unaffected.
> +fwd_unicast_blocked_test()
> +{
> + RET=0
> +
> + sr_filter $swp1 on
> + rx_filter_install $h2 5 192.0.2.2
> +
> + send_uc 192.0.2.2 $SR_PCP
> + tc_check_packets "dev $h2 ingress" 5 0
> + check_err $? "SR-class unicast leaked through a filtering ingress port"
> +
> + send_uc 192.0.2.2 $BE_PCP
> + tc_check_packets "dev $h2 ingress" 5 1
> + check_err $? "best-effort unicast not delivered"
> +
> + rx_filter_uninstall $h2 5
> + sr_filter $swp1 off
> +
> + log_test "MDB stream_reserved blocks SR-class unicast at ingress"
> +}
> +
> +# Filtering is gated by the ingress port flag, not by the presence of a
> +# reserved stream: with a reserved stream registered but the flag clear, a
> +# foreign SR-class group is forwarded; setting the flag then blocks it.
> +fwd_flag_gates_test()
> +{
> + RET=0
> +
> + bridge mdb add dev br0 port $swp2 grp $GRP permanent vid $VID \
> + stream_reserved
> + bridge mdb add dev br0 port $swp2 grp $GRP2 permanent vid $VID
> +
> + rx_filter_install $h2 2 $GRP2
> +
> + # Flag clear (default): the reserved stream does not engage the gate.
> + send_mc $GRP2 $GRP2_DMAC $SR_PCP
> + tc_check_packets "dev $h2 ingress" 2 1
> + check_err $? "SR-class frame blocked with filter flag clear"
> +
> + # Flag set on the ingress port: the foreign group is now dropped.
> + sr_filter $swp1 on
> + send_mc $GRP2 $GRP2_DMAC $SR_PCP
> + tc_check_packets "dev $h2 ingress" 2 1
> + check_err $? "foreign SR-class frame leaked after filter flag set"
> +
> + rx_filter_uninstall $h2 2
> + bridge mdb del dev br0 port $swp2 grp $GRP2 permanent vid $VID
> + bridge mdb del dev br0 port $swp2 grp $GRP permanent vid $VID
> + sr_filter $swp1 off
> +
> + log_test "MDB stream_reserved gated by port flag, not membership"
> +}
> +
> +# The gate only engages while the prio->tc map has a non-zero class. With the
> +# mqprio qdisc removed, the foreign group is admitted again.
> +fwd_tc_toggle_test()
> +{
> + RET=0
> +
> + sr_filter $swp1 on
> + bridge mdb add dev br0 port $swp2 grp $GRP2 permanent vid $VID
> +
> + rx_filter_install $h2 2 $GRP2
> +
> + send_mc $GRP2 $GRP2_DMAC $SR_PCP
> + tc_check_packets "dev $h2 ingress" 2 0
> + check_err $? "foreign SR-class frame leaked while TC enabled"
> +
> + # Drop the TC configuration; the prio->tc map is gone, gate is inert.
> + tc qdisc del dev br0 root handle 100: mqprio
> +
> + send_mc $GRP2 $GRP2_DMAC $SR_PCP
> + tc_check_packets "dev $h2 ingress" 2 1
> + check_err $? "frame not delivered after TC configuration removed"
> +
> + tc qdisc add dev br0 root handle 100: mqprio num_tc 2 \
> + map 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 \
> + queues 1@0 1@1 hw 0
> +
> + rx_filter_uninstall $h2 2
> +
> + bridge mdb del dev br0 port $swp2 grp $GRP2 permanent vid $VID
> + sr_filter $swp1 off
> +
> + log_test "MDB stream_reserved gate follows TC configuration"
> +}
> +
> +# Clearing the port flag stops the port filtering, so the previously blocked
> +# group is admitted again.
> +fwd_flag_toggle_test()
> +{
> + RET=0
> +
> + sr_filter $swp1 on
> + bridge mdb add dev br0 port $swp2 grp $GRP2 permanent vid $VID
> +
> + rx_filter_install $h2 2 $GRP2
> +
> + send_mc $GRP2 $GRP2_DMAC $SR_PCP
> + tc_check_packets "dev $h2 ingress" 2 0
> + check_err $? "foreign SR-class frame leaked while ingress filtering"
> +
> + # Disarm the filter on swp1.
> + sr_filter $swp1 off
> +
> + send_mc $GRP2 $GRP2_DMAC $SR_PCP
> + tc_check_packets "dev $h2 ingress" 2 1
> + check_err $? "frame not delivered after filter flag cleared"
> +
> + rx_filter_uninstall $h2 2
> +
> + bridge mdb del dev br0 port $swp2 grp $GRP2 permanent vid $VID
> +
> + log_test "MDB stream_reserved filtering disabled on port flag clear"
> +}
> +
> +# The semantics are protocol-independent: a foreign IPv6/MLD group at SR class
> +# is dropped at ingress, while best-effort is delivered.
> +fwd_sr_ipv6_test()
> +{
> + RET=0
> +
> + sr_filter $swp1 on
> + bridge mdb add dev br0 port $swp2 grp $GRP6B permanent vid $VID
> + bridge mdb add dev br0 port $swp3 grp $GRP6B permanent vid $VID
> +
> + rx_filter_install $h2 2 $GRP6B ipv6
> + rx_filter_install $h3 2 $GRP6B ipv6
> +
> + send_mc6 $GRP6B $GRP6B_DMAC $SR_PCP
> + tc_check_packets "dev $h2 ingress" 2 0
> + check_err $? "foreign SR-class IPv6 frame leaked to a listener"
> + tc_check_packets "dev $h3 ingress" 2 0
> + check_err $? "foreign SR-class IPv6 frame leaked to a listener"
> +
> + send_mc6 $GRP6B $GRP6B_DMAC $BE_PCP
> + tc_check_packets "dev $h2 ingress" 2 1
> + check_err $? "best-effort IPv6 frame not delivered"
> + tc_check_packets "dev $h3 ingress" 2 1
> + check_err $? "best-effort IPv6 frame not delivered"
> +
> + rx_filter_uninstall $h3 2
> + rx_filter_uninstall $h2 2
> +
> + bridge mdb del dev br0 port $swp3 grp $GRP6B permanent vid $VID
> + bridge mdb del dev br0 port $swp2 grp $GRP6B permanent vid $VID
> + sr_filter $swp1 off
> +
> + log_test "MDB stream_reserved blocks foreign SR-class IPv6 group at ingress"
> +}
> +
> +trap cleanup EXIT
> +
> +setup_prepare
> +setup_wait
> +
> +if ! stream_reserved_supported; then
> + log_test_skip "MDB stream_reserved" \
> + "kernel or iproute2 lacks MDB_FLAGS_STREAM_RESERVED support"
> + exit $EXIT_STATUS
> +fi
> +
> +tests_run
> +
> +exit $EXIT_STATUS
> diff --git a/tools/testing/selftests/net/forwarding/config b/tools/testing/selftests/net/forwarding/config
> index 75a6c3d3c1da3..d1fe9ec41340e 100644
> --- a/tools/testing/selftests/net/forwarding/config
> +++ b/tools/testing/selftests/net/forwarding/config
> @@ -1,5 +1,6 @@
> CONFIG_BPF_SYSCALL=y
> CONFIG_BRIDGE=m
> +CONFIG_BRIDGE_8021Q_SRP=y
> CONFIG_BRIDGE_IGMP_SNOOPING=y
> CONFIG_BRIDGE_VLAN_FILTERING=y
> CONFIG_CGROUP_BPF=y
> @@ -40,6 +41,7 @@ CONFIG_NET_L3_MASTER_DEV=y
> CONFIG_NET_NS=y
> CONFIG_NET_SCH_ETS=m
> CONFIG_NET_SCH_INGRESS=m
> +CONFIG_NET_SCH_MQPRIO=m
> CONFIG_NET_SCH_PRIO=m
> CONFIG_NET_SCH_RED=m
> CONFIG_NET_SCH_TBF=m
>
* Re: [PATCH net-next v2 2/6] net: bridge: convert mdb_entry host_joined to a flags field
2026-06-02 0:43 ` [PATCH net-next v2 2/6] net: bridge: convert mdb_entry host_joined to a flags field Luke Howard
@ 2026-06-03 7:38 ` Nikolay Aleksandrov
0 siblings, 0 replies; 30+ messages in thread
From: Nikolay Aleksandrov @ 2026-06-03 7:38 UTC (permalink / raw)
To: Luke Howard, Jiri Pirko, Ivan Vecera, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan, Andrew Lunn,
Vladimir Oltean
Cc: netdev, linux-kernel, bridge, linux-kselftest, Max Hunter,
Kieran Tyrrell
On 02/06/2026 03:43, Luke Howard wrote:
> Replace the bool host_joined in struct net_bridge_mdb_entry with a u8
> flags field and a BRIDGE_MDBE_F_HOST_JOINED bit.
>
> Signed-off-by: Luke Howard <lukeh@padl.com>
> ---
> net/bridge/br_input.c | 2 +-
> net/bridge/br_mdb.c | 14 ++++++++------
> net/bridge/br_multicast.c | 26 ++++++++++++++------------
> net/bridge/br_private.h | 4 +++-
> net/bridge/br_switchdev.c | 2 +-
> 5 files changed, 27 insertions(+), 21 deletions(-)
>
It would be best for these new flags to be an unsigned long, manipulated
with test_bit()/set_bit(); otherwise KCSAN won't be happy.
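A minimal user-space sketch of the suggestion above, assuming the flags field becomes an unsigned long bitmap addressed by bit number. The model_*() helpers are stand-ins built on C11 atomics, not the kernel's set_bit()/clear_bit()/test_bit(), and MDBE_HOST_JOINED_BIT and struct mdb_entry_model are hypothetical names used only for illustration. The motivation is presumably that the flags are read on the forwarding path while being updated elsewhere, which a plain read-modify-write on a u8 does not make safe.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit number; the patch as posted defines a BIT(0) mask instead. */
enum { MDBE_HOST_JOINED_BIT = 0 };

/* Simplified stand-in for struct net_bridge_mdb_entry. */
struct mdb_entry_model {
	atomic_ulong flags;
};

/* User-space stand-ins for the kernel's set_bit()/clear_bit()/test_bit(). */
static inline void model_set_bit(unsigned int nr, atomic_ulong *addr)
{
	assert(nr < sizeof(unsigned long) * CHAR_BIT);
	atomic_fetch_or(addr, 1UL << nr);
}

static inline void model_clear_bit(unsigned int nr, atomic_ulong *addr)
{
	assert(nr < sizeof(unsigned long) * CHAR_BIT);
	atomic_fetch_and(addr, ~(1UL << nr));
}

static inline bool model_test_bit(unsigned int nr, atomic_ulong *addr)
{
	assert(nr < sizeof(unsigned long) * CHAR_BIT);
	return (atomic_load(addr) >> nr) & 1UL;
}

/*
 * Same shape as br_multicast_host_join(), except that it returns true if the
 * state changed (the kernel function returns void). The test-then-set pair is
 * not atomic as a whole; writers are assumed to be serialized by a lock.
 */
static bool model_host_join(struct mdb_entry_model *mp)
{
	if (model_test_bit(MDBE_HOST_JOINED_BIT, &mp->flags))
		return false;
	model_set_bit(MDBE_HOST_JOINED_BIT, &mp->flags);
	return true;
}

/* Counterpart modeled on br_multicast_host_leave(). */
static bool model_host_leave(struct mdb_entry_model *mp)
{
	if (!model_test_bit(MDBE_HOST_JOINED_BIT, &mp->flags))
		return false;
	model_clear_bit(MDBE_HOST_JOINED_BIT, &mp->flags);
	return true;
}
```

With this layout each flag update is a single atomic operation on the word, so a lockless reader never observes a torn or half-updated flags value.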
> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
> index 470615675bdc0..5787066b1f4cb 100644
> --- a/net/bridge/br_input.c
> +++ b/net/bridge/br_input.c
> @@ -188,7 +188,7 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
> mdst = br_mdb_entry_skb_get(brmctx, skb, vid);
> if ((mdst || BR_INPUT_SKB_CB_MROUTERS_ONLY(skb)) &&
> br_multicast_querier_exists(brmctx, eth_hdr(skb), mdst)) {
> - if ((mdst && mdst->host_joined) ||
> + if ((mdst && (mdst->flags & BRIDGE_MDBE_F_HOST_JOINED)) ||
> br_multicast_is_router(brmctx, skb) ||
> br->dev->flags & IFF_ALLMULTI) {
> local_rcv = true;
> diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
> index 3ddfbd536edb4..b95ca72ec6347 100644
> --- a/net/bridge/br_mdb.c
> +++ b/net/bridge/br_mdb.c
> @@ -344,7 +344,7 @@ static int br_mdb_fill_info(struct sk_buff *skb, struct netlink_callback *cb,
> break;
> }
>
> - if (!s_pidx && mp->host_joined) {
> + if (!s_pidx && (mp->flags & BRIDGE_MDBE_F_HOST_JOINED)) {
> err = __mdb_fill_info(skb, mp, NULL);
> if (err) {
> nla_nest_cancel(skb, nest2);
> @@ -1053,7 +1053,8 @@ static int br_mdb_add_group(const struct br_mdb_config *cfg,
>
> /* host join */
> if (!port) {
> - if (mp->host_joined && !(cfg->nlflags & NLM_F_REPLACE)) {
> + if ((mp->flags & BRIDGE_MDBE_F_HOST_JOINED) &&
> + !(cfg->nlflags & NLM_F_REPLACE)) {
> NL_SET_ERR_MSG_MOD(extack, "Group is already joined by host");
> return -EEXIST;
> }
> @@ -1381,7 +1382,8 @@ static int __br_mdb_del(const struct br_mdb_config *cfg)
> goto unlock;
>
> /* host leave */
> - if (entry->ifindex == mp->br->dev->ifindex && mp->host_joined) {
> + if (entry->ifindex == mp->br->dev->ifindex &&
> + (mp->flags & BRIDGE_MDBE_F_HOST_JOINED)) {
> br_multicast_host_leave(mp, false);
> err = 0;
> br_mdb_notify(br->dev, mp, NULL, RTM_DELMDB);
> @@ -1619,7 +1621,7 @@ br_mdb_get_reply_alloc(const struct net_bridge_mdb_entry *mp)
> /* MDBA_MDB_ENTRY */
> nla_total_size(0);
>
> - if (mp->host_joined)
> + if (mp->flags & BRIDGE_MDBE_F_HOST_JOINED)
> nlmsg_size += rtnl_mdb_nlmsg_pg_size(NULL);
>
> for (pg = mlock_dereference(mp->ports, mp->br); pg;
> @@ -1658,7 +1660,7 @@ static int br_mdb_get_reply_fill(struct sk_buff *skb,
> goto cancel;
> }
>
> - if (mp->host_joined) {
> + if (mp->flags & BRIDGE_MDBE_F_HOST_JOINED) {
> err = __mdb_fill_info(skb, mp, NULL);
> if (err)
> goto cancel;
> @@ -1702,7 +1704,7 @@ int br_mdb_get(struct net_device *dev, struct nlattr *tb[], u32 portid, u32 seq,
> spin_lock_bh(&br->multicast_lock);
>
> mp = br_mdb_ip_get(br, &group);
> - if (!mp || (!mp->ports && !mp->host_joined)) {
> + if (!mp || (!mp->ports && !(mp->flags & BRIDGE_MDBE_F_HOST_JOINED))) {
> NL_SET_ERR_MSG_MOD(extack, "MDB entry not found");
> err = -ENOENT;
> goto unlock;
> diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
> index 5d6fdfb43c046..4107bf7bd271f 100644
> --- a/net/bridge/br_multicast.c
> +++ b/net/bridge/br_multicast.c
> @@ -391,13 +391,13 @@ static void br_multicast_sg_host_state(struct net_bridge_mdb_entry *star_mp,
>
> if (WARN_ON(!br_multicast_is_star_g(&star_mp->addr)))
> return;
> - if (!star_mp->host_joined)
> + if (!(star_mp->flags & BRIDGE_MDBE_F_HOST_JOINED))
> return;
>
> sg_mp = br_mdb_ip_get(star_mp->br, &sg->key.addr);
> if (!sg_mp)
> return;
> - sg_mp->host_joined = true;
> + sg_mp->flags |= BRIDGE_MDBE_F_HOST_JOINED;
> }
>
> /* set the host_joined state of all of *,G's S,G entries */
> @@ -425,7 +425,8 @@ static void br_multicast_star_g_host_state(struct net_bridge_mdb_entry *star_mp)
> sg_mp = br_mdb_ip_get(br, &sg_ip);
> if (!sg_mp)
> continue;
> - sg_mp->host_joined = star_mp->host_joined;
> + sg_mp->flags &= ~BRIDGE_MDBE_F_HOST_JOINED;
> + sg_mp->flags |= star_mp->flags & BRIDGE_MDBE_F_HOST_JOINED;
> }
> }
> }
> @@ -453,7 +454,7 @@ static void br_multicast_sg_del_exclude_ports(struct net_bridge_mdb_entry *sgmp)
> * we treat it as EXCLUDE {}, so for an S,G it's considered a
> * STAR_EXCLUDE entry and we can safely leave it
> */
> - sgmp->host_joined = false;
> + sgmp->flags &= ~BRIDGE_MDBE_F_HOST_JOINED;
>
> for (pp = &sgmp->ports;
> (p = mlock_dereference(*pp, sgmp->br)) != NULL;) {
> @@ -824,7 +825,8 @@ void br_multicast_del_pg(struct net_bridge_mdb_entry *mp,
> hlist_add_head(&pg->mcast_gc.gc_node, &br->mcast_gc_list);
> queue_work(system_long_wq, &br->mcast_gc_work);
>
> - if (!mp->ports && !mp->host_joined && netif_running(br->dev))
> + if (!mp->ports && !(mp->flags & BRIDGE_MDBE_F_HOST_JOINED) &&
> + netif_running(br->dev))
> mod_timer(&mp->timer, jiffies);
> }
>
> @@ -1470,8 +1472,8 @@ void br_multicast_del_port_group(struct net_bridge_port_group *p)
> void br_multicast_host_join(const struct net_bridge_mcast *brmctx,
> struct net_bridge_mdb_entry *mp, bool notify)
> {
> - if (!mp->host_joined) {
> - mp->host_joined = true;
> + if (!(mp->flags & BRIDGE_MDBE_F_HOST_JOINED)) {
> + mp->flags |= BRIDGE_MDBE_F_HOST_JOINED;
> if (br_multicast_is_star_g(&mp->addr))
> br_multicast_star_g_host_state(mp);
> if (notify)
> @@ -1486,10 +1488,10 @@ void br_multicast_host_join(const struct net_bridge_mcast *brmctx,
>
> void br_multicast_host_leave(struct net_bridge_mdb_entry *mp, bool notify)
> {
> - if (!mp->host_joined)
> + if (!(mp->flags & BRIDGE_MDBE_F_HOST_JOINED))
> return;
>
> - mp->host_joined = false;
> + mp->flags &= ~BRIDGE_MDBE_F_HOST_JOINED;
> if (br_multicast_is_star_g(&mp->addr))
> br_multicast_star_g_host_state(mp);
> if (notify)
> @@ -3537,7 +3539,7 @@ static void br_ip4_multicast_query(struct net_bridge_mcast *brmctx,
>
> max_delay *= brmctx->multicast_last_member_count;
>
> - if (mp->host_joined &&
> + if ((mp->flags & BRIDGE_MDBE_F_HOST_JOINED) &&
> (timer_pending(&mp->timer) ?
> time_after(mp->timer.expires, now + max_delay) :
> timer_delete_sync_try(&mp->timer) >= 0))
> @@ -3626,7 +3628,7 @@ static int br_ip6_multicast_query(struct net_bridge_mcast *brmctx,
> goto out;
>
> max_delay *= brmctx->multicast_last_member_count;
> - if (mp->host_joined &&
> + if ((mp->flags & BRIDGE_MDBE_F_HOST_JOINED) &&
> (timer_pending(&mp->timer) ?
> time_after(mp->timer.expires, now + max_delay) :
> timer_delete_sync_try(&mp->timer) >= 0))
> @@ -3722,7 +3724,7 @@ br_multicast_leave_group(struct net_bridge_mcast *brmctx,
> brmctx->multicast_last_member_interval;
>
> if (!pmctx) {
> - if (mp->host_joined &&
> + if ((mp->flags & BRIDGE_MDBE_F_HOST_JOINED) &&
> (timer_pending(&mp->timer) ?
> time_after(mp->timer.expires, time) :
> timer_delete_sync_try(&mp->timer) >= 0)) {
> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
> index 6a2dabd6f4bfb..1e0eefaf50dd1 100644
> --- a/net/bridge/br_private.h
> +++ b/net/bridge/br_private.h
> @@ -373,12 +373,14 @@ struct net_bridge_port_group {
> struct rcu_head rcu;
> };
>
> +#define BRIDGE_MDBE_F_HOST_JOINED BIT(0)
> +
> struct net_bridge_mdb_entry {
> struct rhash_head rhnode;
> struct net_bridge *br;
> struct net_bridge_port_group __rcu *ports;
> struct br_ip addr;
> - bool host_joined;
> + u8 flags;
>
> struct timer_list timer;
> struct hlist_node mdb_node;
> diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
> index c46d8e49ce990..39535f1a6b8ce 100644
> --- a/net/bridge/br_switchdev.c
> +++ b/net/bridge/br_switchdev.c
> @@ -741,7 +741,7 @@ br_switchdev_mdb_replay(struct net_device *br_dev, struct net_device *dev,
> struct net_bridge_port_group __rcu * const *pp;
> const struct net_bridge_port_group *p;
>
> - if (mp->host_joined) {
> + if (mp->flags & BRIDGE_MDBE_F_HOST_JOINED) {
> err = br_switchdev_mdb_queue_one(&mdb_list, dev, action,
> SWITCHDEV_OBJ_ID_HOST_MDB,
> mp, NULL, br_dev);
>
* Re: [PATCH net-next v2 3/6] net: bridge: add 802.1Qat stream reservation admission control
2026-06-03 7:35 ` Nikolay Aleksandrov
@ 2026-06-04 5:39 ` Luke Howard
2026-06-05 12:53 ` Cedric Jehasse
0 siblings, 1 reply; 30+ messages in thread
From: Luke Howard @ 2026-06-04 5:39 UTC (permalink / raw)
To: Nikolay Aleksandrov
Cc: Jiri Pirko, Ivan Vecera, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Ido Schimmel,
Andrew Lunn, David Ahern, Shuah Khan, Andrew Lunn,
Vladimir Oltean, netdev, linux-kernel, bridge, linux-kselftest,
Max Hunter, Kieran Tyrrell, Cedric Jehasse
On 3 Jun 2026, at 5:35 pm, Nikolay Aleksandrov <razor@blackwall.org> wrote:
> So looking at how this is implemented, why not put most of it in TC?
> It is testing for skb class, for tc qdisc, I don't see a reason for it
> to be in the bridge at all. You can filter the mcast groups and simulate
> the "reserved" flag, it has to be set manually anyway. Adding new
> tests in the bridge software fast-path just for this is a waste.
This makes sense. Maybe it’s better to use flower+TCAM to implement this, and have our SRP daemon manage flower entries in addition to MDB entries. The mv88e6xxx driver’s TCAM support is recent (hi Cedric!) and does not yet appear to support the 6352 or matching on PCP/DA, but one presumes this is possible to add.
Alternatively, responding to your comments, the bridge could mark frames with a “stream reserved” DA with a new tc_skb_ext bit, which would be surfaced to flower similarly to l2_miss. The rest of the policy would live in the flower rule. MDB_FLAGS_STREAM_RESERVED would remain but IFLA_BRPORT_FILTER_STREAM_RESERVED could go (we would dynamically enable the equivalent in mv88e6xxx by refcounting MDB_FLAGS_STREAM_RESERVED entries).
Luke
* Re: [PATCH net-next v2 5/6] net: dsa: mv88e6xxx: MQPRIO support
2026-06-03 3:29 ` Luke Howard
@ 2026-06-04 6:26 ` Luke Howard
0 siblings, 0 replies; 30+ messages in thread
From: Luke Howard @ 2026-06-04 6:26 UTC (permalink / raw)
To: Andrew Lunn
Cc: Cedric Jehasse, Jiri Pirko, Ivan Vecera, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Nikolay Aleksandrov, Ido Schimmel, Andrew Lunn, David Ahern,
Shuah Khan, Vladimir Oltean, netdev, linux-kernel, bridge,
linux-kselftest, Max Hunter, Kieran Tyrrell
> Indeed, that’s what this series does. But it comes with the discussed caveat that ports without MQPRIO (whether or not they are part of a bridge) implicitly inherit the global FPri to QPri mapping. I think that is acceptable given that users should have no expectations about the default mapping, but others may disagree.
It turns out one can set the egress QPri with a TCAM entry. So we may be able to correctly implement MQPRIO with an entry that matches on the PCP and sets the QPri and DPV bits. We’d need to be careful that the flower API couldn’t remove one of these entries.
Luke
* Re: [PATCH net-next v2 5/6] net: dsa: mv88e6xxx: MQPRIO support
[not found] ` <808529B1-E40A-4E54-A654-86F1B6D1FA66@padl.com>
@ 2026-06-04 8:36 ` Cedric Jehasse
0 siblings, 0 replies; 30+ messages in thread
From: Cedric Jehasse @ 2026-06-04 8:36 UTC (permalink / raw)
To: Luke Howard
Cc: Jiri Pirko, Ivan Vecera, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Nikolay Aleksandrov,
Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan, Andrew Lunn,
Vladimir Oltean, netdev, linux-kernel, bridge, linux-kselftest,
Max Hunter, Kieran Tyrrell
On Thu, Jun 04, 2026 at 07:36:07AM +1000, Luke Howard wrote:
> I realise I misunderstood you: this is about exposing the 6390’s egress FPri to PCP mapping via DSA/DCB rewrite? That’s really cool, and orthogonal to MQPRIO; apologies for my confusion.
No, it's about configuring the ingress pcp to QPri mappings.
The patch has been submitted to the mailing list:
https://lore.kernel.org/netdev/20260604-net-next-mv88e6xxx-pcp-prio-v1-0-f9d10fe6cdc8@luminex.be/
Cedric
* Re: [PATCH net-next v2 3/6] net: bridge: add 802.1Qat stream reservation admission control
2026-06-04 5:39 ` Luke Howard
@ 2026-06-05 12:53 ` Cedric Jehasse
2026-06-05 14:44 ` Andrew Lunn
2026-06-05 22:36 ` Luke Howard
0 siblings, 2 replies; 30+ messages in thread
From: Cedric Jehasse @ 2026-06-05 12:53 UTC (permalink / raw)
To: Luke Howard, Nikolay Aleksandrov
Cc: Nikolay Aleksandrov, Jiri Pirko, Ivan Vecera, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan, Andrew Lunn,
Vladimir Oltean, netdev, linux-kernel, bridge, linux-kselftest,
Max Hunter, Kieran Tyrrell
On Thu, Jun 04, 2026 at 03:39:25PM +1000, Luke Howard wrote:
> On 3 Jun 2026, at 5:35 pm, Nikolay Aleksandrov <razor@blackwall.org> wrote:
> > So looking at how this is implemented, why not put most of it in TC?
> > It is testing for skb class, for tc qdisc, I don't see a reason for it
> > to be in the bridge at all. You can filter the mcast groups and simulate
> > the "reserved" flag, it has to be set manually anyway. Adding new
> > tests in the bridge software fast-path just for this is a waste.
>
> This makes sense. Maybe it’s better to use flower+TCAM to implement this, and have our SRP daemon manage flower entries in addition to MDB entries. The mv88e6xxx driver’s TCAM support is recent (hi Cedric!) and does not yet appear to support the 6352 or matching on PCP/DA, but one presumes this is possible to add.
>
> Alternatively, responding to your comments, the bridge could mark frames with a “stream reserved” DA with a new tc_skb_ext bit, which would be surfaced to flower similarly to l2_miss. The rest of the policy would live in the flower rule. MDB_FLAGS_STREAM_RESERVED would remain but IFLA_BRPORT_FILTER_STREAM_RESERVED could go (we would dynamically enable the equivalent in mv88e6xxx by refcounting MDB_FLAGS_STREAM_RESERVED entries).
>
I don't know if this can be done in TC.
For the SRP use case, the Marvell switch has bits in its MAC entries to specify
that an entry is an AVB entry. This looks vendor-specific, and other vendors
don't have it.
But the concept comes from the 802.1Q spec. The spec describes different types
of FDB entries. One of these is Dynamic Reservation Entries, which are created
from the Stream Reservation Protocol.
The issue here is the Marvell switch has an implementation where we need to
know if an entry is a Dynamic Reservation Entry. But the linux bridge has a
simplified version of the FDB described in 802.1Q.
By adding a way to distinguish between the type of FDB entries, it's possible
to program a Marvell which has this distinction between Dynamic Reservation
Entries (or AVB entries) and other entries.
Cedric
* Re: [PATCH net-next v2 3/6] net: bridge: add 802.1Qat stream reservation admission control
2026-06-05 12:53 ` Cedric Jehasse
@ 2026-06-05 14:44 ` Andrew Lunn
2026-06-06 8:02 ` Luke Howard
2026-06-05 22:36 ` Luke Howard
1 sibling, 1 reply; 30+ messages in thread
From: Andrew Lunn @ 2026-06-05 14:44 UTC (permalink / raw)
To: Cedric Jehasse
Cc: Luke Howard, Nikolay Aleksandrov, Jiri Pirko, Ivan Vecera,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan,
Vladimir Oltean, netdev, linux-kernel, bridge, linux-kselftest,
Max Hunter, Kieran Tyrrell
> The issue here is the Marvell switch has an implementation where we need to
> know if an entry is a Dynamic Reservation Entry. But the linux bridge has a
> simplified version of the FDB described in 802.1Q.
> By adding a way to distinguish between the type of FDB entries, it's possible
> to program a Marvell which has this distinction between Dynamic Reservation
> Entries (or AVB entries) and other entries.
So it sounds like you first need to work on the software
implementation and expand the simplified version with the features you
need. You can then add support to accelerate this by offloading it to
the hardware.
Andrew
* Re: [PATCH net-next v2 3/6] net: bridge: add 802.1Qat stream reservation admission control
2026-06-05 12:53 ` Cedric Jehasse
2026-06-05 14:44 ` Andrew Lunn
@ 2026-06-05 22:36 ` Luke Howard
1 sibling, 0 replies; 30+ messages in thread
From: Luke Howard @ 2026-06-05 22:36 UTC (permalink / raw)
To: Cedric Jehasse
Cc: Nikolay Aleksandrov, Jiri Pirko, Ivan Vecera, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan, Andrew Lunn,
Vladimir Oltean, netdev, linux-kernel, bridge, linux-kselftest,
Max Hunter, Kieran Tyrrell
> But the concept comes from the 802.1Q spec. The spec describes different types
> of FDB entries. One of these is Dynamic Reservation Entries, which are created
> from the Stream Reservation Protocol.
> The issue here is the Marvell switch has an implementation where we need to
> know if an entry is a Dynamic Reservation Entry. But the linux bridge has a
> simplified version of the FDB described in 802.1Q.
> By adding a way to distinguish between the type of FDB entries, it's possible
> to program a Marvell which has this distinction between Dynamic Reservation
> Entries (or AVB entries) and other entries.
If we only care about the multicast case, and we can depend on snooping being enabled and flooding disabled, then I think Dynamic Reservation Entries collapse to permanent MDB entries. Unicast is trickier.
The issue of reclassifying (or dropping) frames that share a SRP class PCP is separate. I think it could be done with a tc-flower entry that reclassifies all ingress traffic with SRP PCPs (e.g. see 802.1Q Table 6-5) and then a per-stream egress entry that sets the queue and priority.
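The two-stage idea in the paragraph above can be modeled in user space as follows. This is only a sketch under stated assumptions, not tc-flower configuration and not driver code: it assumes PCP 3 is the only SR-class priority (matching the selftest's mqprio map, where priority 3 is the one mapped to a non-zero traffic class), that unreserved SR-class traffic is regenerated to priority 0, and that each stream is identified by its destination address alone. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: PCP 3 is the only SR-class priority in this model. */
#define SRP_PCP_MASK	(1u << 3)
#define BEST_EFFORT_PCP	0u

/* Hypothetical per-stream egress entry: DA match, then set priority and queue. */
struct stream_egress_rule {
	uint8_t da[6];
	unsigned int pcp;
	unsigned int queue;
};

/* Stage 1 (ingress): any frame arriving with an SR-class PCP is regenerated. */
static unsigned int srp_ingress_regen(unsigned int pcp)
{
	pcp &= 7u;	/* PCP is a 3-bit field */
	return (SRP_PCP_MASK & (1u << pcp)) ? BEST_EFFORT_PCP : pcp;
}

/*
 * Stage 2 (egress): a frame whose DA matches a reserved stream gets that
 * stream's priority and queue. Returns false, leaving the outputs untouched,
 * when no rule matches.
 */
static bool srp_egress_apply(const struct stream_egress_rule *rules, size_t n,
			     const uint8_t da[6], unsigned int *pcp,
			     unsigned int *queue)
{
	for (size_t i = 0; i < n; i++) {
		if (!memcmp(rules[i].da, da, 6)) {
			*pcp = rules[i].pcp;
			*queue = rules[i].queue;
			return true;
		}
	}
	return false;
}
```

In this model a frame for a reserved stream ends up back at the SR priority and queue, while foreign traffic that merely used the SR-class PCP stays at best effort rather than being dropped.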
* Re: [PATCH net-next v2 3/6] net: bridge: add 802.1Qat stream reservation admission control
2026-06-05 14:44 ` Andrew Lunn
@ 2026-06-06 8:02 ` Luke Howard
2026-06-06 8:21 ` Nikolay Aleksandrov
0 siblings, 1 reply; 30+ messages in thread
From: Luke Howard @ 2026-06-06 8:02 UTC (permalink / raw)
To: Andrew Lunn
Cc: Cedric Jehasse, Nikolay Aleksandrov, Jiri Pirko, Ivan Vecera,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan,
Vladimir Oltean, netdev, linux-kernel, bridge, linux-kselftest,
Max Hunter, Kieran Tyrrell
> So it sounds like you first need to work on the software
> implementation and expand the simplified version with the features you
> need. You can then add support to accelerate this by offloading it to
> the hardware.
Agreed.
The definition of Dynamic Reservation Entries in 802.1Q (clause 8.8.7) might support the addition of a new MDB (or even FDB) entry state to the kernel:
- add MDB_DYNAMIC_RESERVATION (a state, not a flag);
- the software bridge only _classifies_ packets against MDB_DYNAMIC_RESERVATION entries, and only when MDB is authoritative. Classification sets dynamic_reservation_hit on tc_skb_ext;
- dynamic_reservation_hit is visible to the flow dissector so can be used for policy enforcement.
Advantages:
- the new MDB state maps well to 802.1Q;
- minimal changes to bridge;
- actual policy (reclassify, drop, etc) is left to the user;
- doesn’t require a new tc-flower entry for each stream DA;
- entry state maps 1:1 to mv88e6xxx AVB_NRL ATU EntryState.
Disadvantages:
- no unicast support, although potentially can be extended (there are some subtleties);
- mv88e6xxx support would either require rich enough tc-flower support in TCAM, intercepting TCA_FLOWER_DYNAMIC_RESERVATION_HIT and mapping to native AVB admission control, or a per-port devlink parameter (not so nice).
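The split proposed above, where the bridge only classifies and the policy lives elsewhere, can be sketched as a small user-space model. Only MDB_DYNAMIC_RESERVATION and dynamic_reservation_hit come from the proposal; the other entry states, the struct layouts, the mdb_authoritative parameter, and the drop policy are assumptions made for illustration, and none of this is the actual bridge or flower code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Entry states; only the dynamic reservation state is named in the proposal. */
enum resv_state {
	RESV_STATE_TEMPORARY,
	RESV_STATE_PERMANENT,
	RESV_STATE_DYNAMIC_RESERVATION,	/* models MDB_DYNAMIC_RESERVATION */
};

/* Hypothetical, simplified MDB entry keyed by destination address and VLAN. */
struct resv_mdb_entry {
	uint8_t da[6];
	uint16_t vid;
	enum resv_state state;
};

/* Stand-in for the proposed dynamic_reservation_hit bit on tc_skb_ext. */
struct resv_frame_meta {
	bool dynamic_reservation_hit;
};

enum resv_verdict { RESV_PASS, RESV_DROP };

static const struct resv_mdb_entry *
resv_mdb_lookup(const struct resv_mdb_entry *tbl, size_t n,
		const uint8_t da[6], uint16_t vid)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].vid == vid && !memcmp(tbl[i].da, da, 6))
			return &tbl[i];
	return NULL;
}

/* Bridge side: classification only, and only when the MDB is authoritative. */
static void resv_classify(const struct resv_mdb_entry *tbl, size_t n,
			  const uint8_t da[6], uint16_t vid,
			  bool mdb_authoritative, struct resv_frame_meta *meta)
{
	const struct resv_mdb_entry *e;

	meta->dynamic_reservation_hit = false;
	if (!mdb_authoritative)
		return;
	e = resv_mdb_lookup(tbl, n, da, vid);
	meta->dynamic_reservation_hit =
		e && e->state == RESV_STATE_DYNAMIC_RESERVATION;
}

/* Policy side (one possible user choice): drop SR-class frames that missed. */
static enum resv_verdict resv_policy(const struct resv_frame_meta *meta,
				     unsigned int pcp, unsigned int sr_pcp)
{
	if (pcp == sr_pcp && !meta->dynamic_reservation_hit)
		return RESV_DROP;
	return RESV_PASS;
}
```

Because resv_policy() is separate from resv_classify(), a different policy (reprioritise instead of drop, for example) can replace it without touching the classification step, which is the point of leaving policy to the user.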
This is implemented and working with the software bridge, but I still haven’t quite figured out the right mapping for mv88e6xxx.
Luke
PS. I previously incorrectly asserted that 802.1Q required dropping frames with AVB/SRP PCPs but without valid dynamic reservation entries. 802.1Q discusses priority mapping in clause 6.9.4 for traffic from SRP boundary ports (those not participating in SRP). In practice I think this should all be policy, e.g. you might want to reprioritise valid DSCP traffic from within an SRP domain, or drop instead of reprioritise, etc.
* Re: [PATCH net-next v2 3/6] net: bridge: add 802.1Qat stream reservation admission control
2026-06-06 8:02 ` Luke Howard
@ 2026-06-06 8:21 ` Nikolay Aleksandrov
2026-06-06 21:49 ` Luke Howard
0 siblings, 1 reply; 30+ messages in thread
From: Nikolay Aleksandrov @ 2026-06-06 8:21 UTC (permalink / raw)
To: Luke Howard, Andrew Lunn
Cc: Cedric Jehasse, Jiri Pirko, Ivan Vecera, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan,
Vladimir Oltean, netdev, linux-kernel, bridge, linux-kselftest,
Max Hunter, Kieran Tyrrell
On 06/06/2026 11:02, Luke Howard wrote:
>
>> So it sounds like you first need to work on the software
>> implementation and expand the simplified version with the features you
>> need. You can then add support to accelerate this by offloading it to
>> the hardware.
>
> Agreed.
>
> The definition of Dynamic Reservation Entries in 802.1Q (clause 8.8.7) might support the addition of a new MDB (or even FDB) entry state to the kernel:
>
> - add MDB_DYNAMIC_RESERVATION (a state, not a flag);
> - the software bridge only _classifies_ packets against MDB_DYNAMIC_RESERVATION entries, and only when MDB is authoritative. Classification sets dynamic_reservation_hit on tc_skb_ext;
> - dynamic_reservation_hit is visible to the flow dissector so can be used for policy enforcement.
>
See, saying the bridge has to classify doesn't sound right. Why not do the
classification where such operations are usually done, e.g. tc?
You have to manually designate these entries anyway.
> Advantages:
>
> - the new MDB state maps well to 802.1Q;
> - minimal changes to bridge;
> - actual policy (reclassify, drop, etc) is left to the user;
> - doesn’t require a new tc-flower entry for each stream DA;
> - entry state maps 1:1 to mv88e6xxx AVB_NRL ATU EntryState.
>
> Disadvantages:
>
> - no unicast support, although potentially can be extended (there are some subtleties);
> - mv88e6xxx support would either require rich enough tc-flower support in TCAM, intercepting TCA_FLOWER_DYNAMIC_RESERVATION_HIT and mapping to native AVB admission control, or a per-port devlink parameter (not so nice).
>
> This is implemented and working with the software bridge, I still haven’t quite figured out the right mapping for mv88e6xxx.
>
> Luke
>
> PS. I previously incorrectly asserted that 802.1Q required dropping frames with AVB/SRP PCPs but without valid dynamic reservation entries. 802.1Q discusses priority mapping in clause 6.9.4 for traffic from SRP boundary ports (those not participating in SRP). In practice I think this should all be policy, e.g. you might want to reprioritise valid DSCP traffic from within an SRP domain, or drop instead of reprioritise, etc.
* Re: [PATCH net-next v2 3/6] net: bridge: add 802.1Qat stream reservation admission control
2026-06-06 8:21 ` Nikolay Aleksandrov
@ 2026-06-06 21:49 ` Luke Howard
2026-06-06 22:14 ` Nikolay Aleksandrov
0 siblings, 1 reply; 30+ messages in thread
From: Luke Howard @ 2026-06-06 21:49 UTC (permalink / raw)
To: Nikolay Aleksandrov
Cc: Andrew Lunn, Cedric Jehasse, Jiri Pirko, Ivan Vecera,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan,
Vladimir Oltean, netdev, linux-kernel, bridge, linux-kselftest,
Max Hunter, Kieran Tyrrell
> On 6 Jun 2026, at 6:21 pm, Nikolay Aleksandrov <razor@blackwall.org> wrote:
>
> On 06/06/2026 11:02, Luke Howard wrote:
>> The definition of Dynamic Reservation Entries in 802.1Q (clause 8.8.7) might support the addition of a new MDB (or even FDB) entry state to the kernel:
>> - add MDB_DYNAMIC_RESERVATION (a state, not a flag);
>> - the software bridge only _classifies_ packets against MDB_DYNAMIC_RESERVATION entries, and only when MDB is authoritative. Classification sets dynamic_reservation_hit on tc_skb_ext;
>> - dynamic_reservation_hit is visible to the flow dissector so can be used for policy enforcement.
>
> See, saying the bridge has to classify doesn't sound right. Why not do the
> classification where such operations are usually done, e.g. tc?
> You have to manually designate these entries anyway.
s/classify/mark, i.e. marking a forwarding bit for tc to match, a la l2_miss.
tc can’t see into the MDB to tell if a DA has a dynamic reservation entry so, without an explicit DRE bit, the SRP daemon would need to maintain a flower permit filter per DRE. Not needing this allows the user to set a single policy filter prior to starting SRP, e.g.:
tc filter add dev lan0 egress protocol 802.1Q pref 1 handle 1 flower vlan_prio 3 dynamic_reservation_hit 0 action drop
It also maps cleanly to chips that support 802.1Qav with priority regeneration or filtering, but which can’t support tc-flower.
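For readers following the thread, the mark-then-match model described above can be sketched in user space. This is a toy Python model, not kernel code and not taken from the patch series: `Frame`, `bridge_mark` and `policy_drop` are hypothetical names, the MDB is reduced to a set of destination addresses assumed to carry the proposed dynamic-reservation state, PCP 3 is borrowed from the tc example as the SR-class priority, and the MAC addresses are made up.

```python
from dataclasses import dataclass

SR_CLASS_PCPS = {3}  # assumption: PCP 3 is the SR class, as in the tc example


@dataclass
class Frame:
    dst_mac: str
    pcp: int
    dynamic_reservation_hit: bool = False  # models the proposed per-packet bit


def bridge_mark(frame, reserved_groups):
    """Bridge step: only mark the frame, never drop it."""
    frame.dynamic_reservation_hit = frame.dst_mac.lower() in reserved_groups
    return frame


def policy_drop(frame):
    """Policy step: a single rule, 'SR-class PCP and no reservation hit => drop'."""
    return frame.pcp in SR_CLASS_PCPS and not frame.dynamic_reservation_hit


reserved = {"91:e0:f0:00:0e:80"}  # hypothetical stream DA with a reservation entry
admitted = bridge_mark(Frame("91:E0:F0:00:0E:80", pcp=3), reserved)
rogue = bridge_mark(Frame("91:e0:f0:00:0e:81", pcp=3), reserved)
best_effort = bridge_mark(Frame("01:00:5e:00:00:fb", pcp=0), reserved)
```

In this model the policy rule never changes as streams come and go; only the `reserved` set does, which is the property the single-filter argument relies on.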
* Re: [PATCH net-next v2 3/6] net: bridge: add 802.1Qat stream reservation admission control
2026-06-06 21:49 ` Luke Howard
@ 2026-06-06 22:14 ` Nikolay Aleksandrov
0 siblings, 0 replies; 30+ messages in thread
From: Nikolay Aleksandrov @ 2026-06-06 22:14 UTC (permalink / raw)
To: Luke Howard
Cc: Andrew Lunn, Cedric Jehasse, Jiri Pirko, Ivan Vecera,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Ido Schimmel, Andrew Lunn, David Ahern, Shuah Khan,
Vladimir Oltean, netdev, linux-kernel, bridge, linux-kselftest,
Max Hunter, Kieran Tyrrell
On Sun, Jun 07, 2026 at 07:49:26AM +1000, Luke Howard wrote:
>
>
> > On 6 Jun 2026, at 6:21 pm, Nikolay Aleksandrov <razor@blackwall.org> wrote:
> >
> > On 06/06/2026 11:02, Luke Howard wrote:
> >> The definition of Dynamic Reservation Entries in 802.1Q (clause 8.8.7) might support the addition of a new MDB (or even FDB) entry state to the kernel:
> >> - add MDB_DYNAMIC_RESERVATION (a state, not a flag);
> >> - the software bridge only _classifies_ packets against MDB_DYNAMIC_RESERVATION entries, and only when MDB is authoritative. Classification sets dynamic_reservation_hit on tc_skb_ext;
> >> - dynamic_reservation_hit is visible to the flow dissector so can be used for policy enforcement.
> >
> > See, saying the bridge has to classify doesn't sound right. Why not do the
> > classification where such operations are usually done, e.g. tc?
> > You have to manually designate these entries anyway.
>
> s/classify/mark, i.e. marking a forwarding bit for tc to match, a la l2_miss.
>
> tc can’t see into the MDB to tell if a DA has a dynamic reservation entry so, without an explicit DRE bit, the SRP daemon would need to maintain a flower permit filter per DRE. Not needing this allows the user to set a single policy filter prior to starting SRP, e.g.:
>
> tc filter add dev lan0 egress protocol 802.1Q pref 1 handle 1 flower vlan_prio 3 dynamic_reservation_hit 0 action drop
>
> It also maps cleanly to chips that support 802.1Qav with priority regeneration or filtering, but which can’t support tc-flower.
Yeah, that was an expected answer and I've seen such claims multiple times.
Just because it is convenient to add it in the bridge, does not make it the
right software model. There are layers that do filtering, marking and manipulation;
this must be done at such a layer. If you have to create a new table with the entries
filled there then that is what your user-space software must do, or come up with a
better alternative. There are also bridge netfilter chains that can do packet
filtering and manipulation, which might be an option.
For the MDB to have a dynamic reservation entry means someone must've added it,
these are not dynamically learned, so you can just as well build the table
in a more appropriate place which can tag or filter the packet.
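One possible reading of this suggestion is the per-stream filter table mentioned earlier in the thread, maintained by the SRP daemon itself. The sketch below is an illustration only, not something either participant posted: `srp_filter_commands` is a hypothetical helper, the `pref` values, PCP 3, device name and stream addresses are assumptions, and it presumes a `clsact` qdisc is already attached to the port. It uses only existing tc-flower keys (`vlan_prio`, `dst_mac`) and builds command strings without running them.

```python
def srp_filter_commands(dev, stream_das, pcp=3, direction="egress"):
    """Return tc command lines: one permit per stream DA, then a catch-all drop."""
    base = f"tc filter add dev {dev} {direction} protocol 802.1Q"
    cmds = [
        # Lower pref is evaluated first, so reserved streams are passed
        # before the catch-all drop for the same priority is reached.
        f"{base} pref 10 flower vlan_prio {pcp} dst_mac {da} action pass"
        for da in sorted(stream_das)
    ]
    cmds.append(f"{base} pref 20 flower vlan_prio {pcp} action drop")
    return cmds


cmds = srp_filter_commands("lan0", {"91:e0:f0:00:0e:80", "91:e0:f0:00:0e:81"})
for line in cmds:
    print(line)
```

The trade-off debated in the thread is visible here: the daemon must add or remove one permit filter per reservation, whereas the marking approach keeps a single static filter.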
Thread overview: 30+ messages
2026-06-02 0:43 [PATCH net-next v2 0/6] net: dsa: mv8ee6xxx: MQPRIO and 802.1Qat support Luke Howard
2026-06-02 0:43 ` [PATCH net-next v2 1/6] net: bridge: mdb: add MDB_FLAGS_STREAM_RESERVED flag Luke Howard
2026-06-02 0:43 ` [PATCH net-next v2 2/6] net: bridge: convert mdb_entry host_joined to a flags field Luke Howard
2026-06-03 7:38 ` Nikolay Aleksandrov
2026-06-02 0:43 ` [PATCH net-next v2 3/6] net: bridge: add 802.1Qat stream reservation admission control Luke Howard
2026-06-02 1:28 ` Luke Howard
2026-06-03 7:35 ` Nikolay Aleksandrov
2026-06-04 5:39 ` Luke Howard
2026-06-05 12:53 ` Cedric Jehasse
2026-06-05 14:44 ` Andrew Lunn
2026-06-06 8:02 ` Luke Howard
2026-06-06 8:21 ` Nikolay Aleksandrov
2026-06-06 21:49 ` Luke Howard
2026-06-06 22:14 ` Nikolay Aleksandrov
2026-06-05 22:36 ` Luke Howard
2026-06-02 0:43 ` [PATCH net-next v2 4/6] net: bridge: allow MDB_FLAGS_STREAM_RESERVED on host groups Luke Howard
2026-06-02 0:43 ` [PATCH net-next v2 5/6] net: dsa: mv88e6xxx: MQPRIO support Luke Howard
2026-06-02 12:00 ` Cedric Jehasse
2026-06-02 21:12 ` Luke Howard
2026-06-02 23:48 ` Luke Howard
2026-06-02 23:55 ` Andrew Lunn
2026-06-03 0:15 ` Luke Howard
2026-06-03 1:40 ` Luke Howard
2026-06-03 2:41 ` Andrew Lunn
2026-06-03 3:29 ` Luke Howard
2026-06-04 6:26 ` Luke Howard
2026-06-03 2:09 ` Luke Howard
2026-06-03 3:30 ` Luke Howard
[not found] ` <808529B1-E40A-4E54-A654-86F1B6D1FA66@padl.com>
2026-06-04 8:36 ` Cedric Jehasse
2026-06-02 0:43 ` [PATCH net-next v2 6/6] net: dsa: mv88e6xxx: honour MDB_FLAGS_STREAM_RESERVED for AVB streams Luke Howard