* [PATCH net-next 0/3] net: move netdev_compute_master_upper_features to ndo_set_features
@ 2026-03-10 7:45 Hangbin Liu
2026-03-10 7:45 ` [PATCH net-next 1/3] net: use ndo_set_features to set offload features for bonding/bridge/team Hangbin Liu
` (4 more replies)
0 siblings, 5 replies; 15+ messages in thread
From: Hangbin Liu @ 2026-03-10 7:45 UTC (permalink / raw)
To: Jay Vosburgh, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Jiri Pirko, Nikolay Aleksandrov,
Ido Schimmel, Simon Horman, Sridhar Samudrala
Cc: netdev, linux-kernel, bridge, Hangbin Liu
Currently, master devices (bonding, bridge, team) manually call
netdev_compute_master_upper_features() scattered throughout their port
add/remove operations. This approach requires each driver to remember
to update features at the right times and leads to code duplication.
The series moves netdev_compute_master_upper_features() to callback
ndo_set_features so that the offload compute could automatically
invoked during feature updates when upper/lower device relationships
change. This centralizes the feature computation flow and removes the
burden from individual drivers.
---
Hangbin Liu (3):
net: use ndo_set_features to set offload features for bonding/bridge/team
failover: use ndo_set_features for failover offload compute
net: no need to disable LRO specifically
drivers/net/bonding/bond_main.c | 14 +++++----
drivers/net/net_failover.c | 67 +++++------------------------------------
drivers/net/team/team_core.c | 15 ++++-----
include/net/net_failover.h | 7 -----
net/8021q/vlan.c | 2 --
net/bridge/br_device.c | 7 +++++
net/bridge/br_if.c | 6 ----
net/core/dev.c | 8 +++--
net/hsr/hsr_slave.c | 1 -
9 files changed, 37 insertions(+), 90 deletions(-)
---
base-commit: 52ede1bce557c66309f41ac29dd190be23ca9129
change-id: 20260310-offload_compute-4c0bafa2e022
Best regards,
--
Hangbin Liu <liuhangbin@gmail.com>
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH net-next 1/3] net: use ndo_set_features to set offload features for bonding/bridge/team
2026-03-10 7:45 [PATCH net-next 0/3] net: move netdev_compute_master_upper_features to ndo_set_features Hangbin Liu
@ 2026-03-10 7:45 ` Hangbin Liu
2026-03-10 7:45 ` [PATCH net-next 2/3] failover: use ndo_set_features for failover offload compute Hangbin Liu
` (3 subsequent siblings)
4 siblings, 0 replies; 15+ messages in thread
From: Hangbin Liu @ 2026-03-10 7:45 UTC (permalink / raw)
To: Jay Vosburgh, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Jiri Pirko, Nikolay Aleksandrov,
Ido Schimmel, Simon Horman, Sridhar Samudrala
Cc: netdev, linux-kernel, bridge, Hangbin Liu
Convert bonding, bridge, and team drivers to use ndo_set_features callback
instead of manually calling netdev_compute_master_upper_features() during
port add/remove operations.
This change centralizes the feature computation flow:
Before:
- netdev_master_upper_dev_link()
- netdev_compute_master_upper_features() <- called manually
- compute offload features
- netdev_change_features()
- __netdev_update_features()
- update other features
After:
- netdev_master_upper_dev_link()
- __netdev_upper_dev_link()
- netdev_change_features()
- __netdev_update_features()
- ndo_set_features()
- netdev_compute_master_upper_features()
This ensures offload features are computed automatically when
upper/lower device links change, removing the need for manual
feature computation calls in the driver code.
The netdev_change_features() call in team_uninit() is also removed
since it calls team_port_del() for each port, which now triggers
feature updates automatically.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
drivers/net/bonding/bond_main.c | 11 ++++++++---
drivers/net/team/team_core.c | 12 ++++++++----
net/bridge/br_device.c | 7 +++++++
net/bridge/br_if.c | 4 ----
net/core/dev.c | 8 ++++++--
5 files changed, 29 insertions(+), 13 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 0ccf928eecde..4d32b06079ee 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1509,6 +1509,12 @@ static netdev_features_t bond_fix_features(struct net_device *dev,
return features;
}
+static int bond_set_features(struct net_device *dev, netdev_features_t features)
+{
+ netdev_compute_master_upper_features(dev, true);
+ return 0;
+}
+
static void bond_setup_by_slave(struct net_device *bond_dev,
struct net_device *slave_dev)
{
@@ -2228,7 +2234,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
}
bond->slave_cnt++;
- netdev_compute_master_upper_features(bond->dev, true);
bond_set_carrier(bond);
/* Needs to be called before bond_select_active_slave(), which will
@@ -2483,7 +2488,6 @@ static int __bond_release_one(struct net_device *bond_dev,
call_netdevice_notifiers(NETDEV_RELEASE, bond->dev);
}
- netdev_compute_master_upper_features(bond->dev, true);
if (!(bond_dev->features & NETIF_F_VLAN_CHALLENGED) &&
(old_features & NETIF_F_VLAN_CHALLENGED))
slave_info(bond_dev, slave_dev, "last VLAN challenged slave left bond - VLAN blocking is removed\n");
@@ -3975,7 +3979,7 @@ static int bond_slave_netdev_event(unsigned long event,
case NETDEV_FEAT_CHANGE:
if (!bond->notifier_ctx) {
bond->notifier_ctx = true;
- netdev_compute_master_upper_features(bond->dev, true);
+ netdev_change_features(bond->dev);
bond->notifier_ctx = false;
}
break;
@@ -5897,6 +5901,7 @@ static const struct net_device_ops bond_netdev_ops = {
.ndo_add_slave = bond_enslave,
.ndo_del_slave = bond_release,
.ndo_fix_features = bond_fix_features,
+ .ndo_set_features = bond_set_features,
.ndo_features_check = passthru_features_check,
.ndo_get_xmit_slave = bond_xmit_get_slave,
.ndo_sk_get_lower_dev = bond_sk_get_lower_dev,
diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index b7282f5c9632..4906ea3717b1 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1247,7 +1247,6 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
port->index = -1;
list_add_tail_rcu(&port->list, &team->port_list);
team_port_enable(team, port);
- netdev_compute_master_upper_features(team->dev, true);
__team_port_change_port_added(port, !!netif_oper_up(port_dev));
__team_options_change_check(team);
@@ -1337,7 +1336,6 @@ static int team_port_del(struct team *team, struct net_device *port_dev, bool un
}
kfree_rcu(port, rcu);
netdev_info(dev, "Port device %s removed\n", portname);
- netdev_compute_master_upper_features(team->dev, true);
return 0;
}
@@ -1645,7 +1643,6 @@ static void team_uninit(struct net_device *dev)
team_mcast_rejoin_fini(team);
team_notify_peers_fini(team);
team_queue_override_fini(team);
- netdev_change_features(dev);
}
static void team_destructor(struct net_device *dev)
@@ -1972,6 +1969,12 @@ static netdev_features_t team_fix_features(struct net_device *dev,
return features;
}
+static int team_set_features(struct net_device *dev, netdev_features_t features)
+{
+ netdev_compute_master_upper_features(dev, true);
+ return 0;
+}
+
static int team_change_carrier(struct net_device *dev, bool new_carrier)
{
struct team *team = netdev_priv(dev);
@@ -2007,6 +2010,7 @@ static const struct net_device_ops team_netdev_ops = {
.ndo_add_slave = team_add_slave,
.ndo_del_slave = team_del_slave,
.ndo_fix_features = team_fix_features,
+ .ndo_set_features = team_set_features,
.ndo_change_carrier = team_change_carrier,
.ndo_features_check = passthru_features_check,
};
@@ -2944,7 +2948,7 @@ static int team_device_event(struct notifier_block *unused,
case NETDEV_FEAT_CHANGE:
if (!port->team->notifier_ctx) {
port->team->notifier_ctx = true;
- netdev_compute_master_upper_features(port->team->dev, true);
+ netdev_change_features(port->team->dev);
port->team->notifier_ctx = false;
}
break;
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index f7502e62dd35..f82082050e36 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -289,6 +289,12 @@ static netdev_features_t br_fix_features(struct net_device *dev,
return br_features_recompute(br, features);
}
+static int br_set_features(struct net_device *dev, netdev_features_t features)
+{
+ netdev_compute_master_upper_features(dev, false);
+ return 0;
+}
+
#ifdef CONFIG_NET_POLL_CONTROLLER
static void br_poll_controller(struct net_device *br_dev)
{
@@ -456,6 +462,7 @@ static const struct net_device_ops br_netdev_ops = {
.ndo_add_slave = br_add_slave,
.ndo_del_slave = br_del_slave,
.ndo_fix_features = br_fix_features,
+ .ndo_set_features = br_set_features,
.ndo_fdb_add = br_fdb_add,
.ndo_fdb_del = br_fdb_delete,
.ndo_fdb_del_bulk = br_fdb_delete_bulk,
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index d39571e13744..030248bc94c5 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -680,8 +680,6 @@ int br_add_if(struct net_bridge *br, struct net_device *dev,
br_mtu_auto_adjust(br);
- netdev_compute_master_upper_features(br->dev, false);
-
kobject_uevent(&p->kobj, KOBJ_ADD);
return 0;
@@ -734,8 +732,6 @@ int br_del_if(struct net_bridge *br, struct net_device *dev)
if (changed_addr)
call_netdevice_notifiers(NETDEV_CHANGEADDR, br->dev);
- netdev_compute_master_upper_features(br->dev, false);
-
return 0;
}
diff --git a/net/core/dev.c b/net/core/dev.c
index 6fc9350f0be8..ce1d9c73e6ce 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -8919,6 +8919,9 @@ static int __netdev_upper_dev_link(struct net_device *dev,
__netdev_walk_all_upper_dev(upper_dev, __netdev_update_lower_level,
priv);
+ /* re-compute all features after adding link */
+ netdev_change_features(upper_dev);
+
return 0;
rollback:
@@ -9011,6 +9014,9 @@ static void __netdev_upper_dev_unlink(struct net_device *dev,
__netdev_update_lower_level(upper_dev, priv);
__netdev_walk_all_upper_dev(upper_dev, __netdev_update_lower_level,
priv);
+
+ /* re-compute all features after removing link */
+ netdev_change_features(upper_dev);
}
/**
@@ -12873,8 +12879,6 @@ void netdev_compute_master_upper_features(struct net_device *dev, bool update_he
netif_set_tso_max_segs(dev, tso_max_segs);
netif_set_tso_max_size(dev, tso_max_size);
-
- netdev_change_features(dev);
}
EXPORT_SYMBOL(netdev_compute_master_upper_features);
--
Git-155)
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH net-next 2/3] failover: use ndo_set_features for failover offload compute
2026-03-10 7:45 [PATCH net-next 0/3] net: move netdev_compute_master_upper_features to ndo_set_features Hangbin Liu
2026-03-10 7:45 ` [PATCH net-next 1/3] net: use ndo_set_features to set offload features for bonding/bridge/team Hangbin Liu
@ 2026-03-10 7:45 ` Hangbin Liu
2026-03-10 7:45 ` [PATCH net-next 3/3] net: no need to disable LRO specifically Hangbin Liu
` (2 subsequent siblings)
4 siblings, 0 replies; 15+ messages in thread
From: Hangbin Liu @ 2026-03-10 7:45 UTC (permalink / raw)
To: Jay Vosburgh, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Jiri Pirko, Nikolay Aleksandrov,
Ido Schimmel, Simon Horman, Sridhar Samudrala
Cc: netdev, linux-kernel, bridge, Hangbin Liu
Convert net_failover to use the ndo_set_features callback and calls
netdev_compute_master_upper_features() to compute new offload flags
during slave registration/unregistration.
This simplifies the failover code significantly by removing the custom
feature computation logic and relying on the centralized feature update
mechanism. The callback is automatically invoked during feature updates
when upper/lower device links change.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
drivers/net/net_failover.c | 67 ++++++----------------------------------------
include/net/net_failover.h | 7 -----
2 files changed, 8 insertions(+), 66 deletions(-)
diff --git a/drivers/net/net_failover.c b/drivers/net/net_failover.c
index d0361aaf25ef..10041a271bf5 100644
--- a/drivers/net/net_failover.c
+++ b/drivers/net/net_failover.c
@@ -300,6 +300,12 @@ static int net_failover_vlan_rx_kill_vid(struct net_device *dev, __be16 proto,
return 0;
}
+static int net_failover_set_features(struct net_device *dev, netdev_features_t features)
+{
+ netdev_compute_master_upper_features(dev, true);
+ return 0;
+}
+
static const struct net_device_ops failover_dev_ops = {
.ndo_open = net_failover_open,
.ndo_stop = net_failover_close,
@@ -312,6 +318,7 @@ static const struct net_device_ops failover_dev_ops = {
.ndo_vlan_rx_kill_vid = net_failover_vlan_rx_kill_vid,
.ndo_validate_addr = eth_validate_addr,
.ndo_features_check = passthru_features_check,
+ .ndo_set_features = net_failover_set_features,
};
#define FAILOVER_NAME "net_failover"
@@ -373,61 +380,6 @@ static rx_handler_result_t net_failover_handle_frame(struct sk_buff **pskb)
return RX_HANDLER_ANOTHER;
}
-static void net_failover_compute_features(struct net_device *dev)
-{
- netdev_features_t vlan_features = FAILOVER_VLAN_FEATURES &
- NETIF_F_ALL_FOR_ALL;
- netdev_features_t enc_features = FAILOVER_ENC_FEATURES;
- unsigned short max_hard_header_len = ETH_HLEN;
- unsigned int dst_release_flag = IFF_XMIT_DST_RELEASE |
- IFF_XMIT_DST_RELEASE_PERM;
- struct net_failover_info *nfo_info = netdev_priv(dev);
- struct net_device *primary_dev, *standby_dev;
-
- primary_dev = rcu_dereference(nfo_info->primary_dev);
- if (primary_dev) {
- vlan_features =
- netdev_increment_features(vlan_features,
- primary_dev->vlan_features,
- FAILOVER_VLAN_FEATURES);
- enc_features =
- netdev_increment_features(enc_features,
- primary_dev->hw_enc_features,
- FAILOVER_ENC_FEATURES);
-
- dst_release_flag &= primary_dev->priv_flags;
- if (primary_dev->hard_header_len > max_hard_header_len)
- max_hard_header_len = primary_dev->hard_header_len;
- }
-
- standby_dev = rcu_dereference(nfo_info->standby_dev);
- if (standby_dev) {
- vlan_features =
- netdev_increment_features(vlan_features,
- standby_dev->vlan_features,
- FAILOVER_VLAN_FEATURES);
- enc_features =
- netdev_increment_features(enc_features,
- standby_dev->hw_enc_features,
- FAILOVER_ENC_FEATURES);
-
- dst_release_flag &= standby_dev->priv_flags;
- if (standby_dev->hard_header_len > max_hard_header_len)
- max_hard_header_len = standby_dev->hard_header_len;
- }
-
- dev->vlan_features = vlan_features;
- dev->hw_enc_features = enc_features | NETIF_F_GSO_ENCAP_ALL;
- dev->hard_header_len = max_hard_header_len;
-
- dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
- if (dst_release_flag == (IFF_XMIT_DST_RELEASE |
- IFF_XMIT_DST_RELEASE_PERM))
- dev->priv_flags |= IFF_XMIT_DST_RELEASE;
-
- netdev_change_features(dev);
-}
-
static void net_failover_lower_state_changed(struct net_device *slave_dev,
struct net_device *primary_dev,
struct net_device *standby_dev)
@@ -550,7 +502,6 @@ static int net_failover_slave_register(struct net_device *slave_dev,
}
net_failover_lower_state_changed(slave_dev, primary_dev, standby_dev);
- net_failover_compute_features(failover_dev);
call_netdevice_notifiers(NETDEV_JOIN, slave_dev);
@@ -621,8 +572,6 @@ static int net_failover_slave_unregister(struct net_device *slave_dev,
dev_put(slave_dev);
- net_failover_compute_features(failover_dev);
-
netdev_info(failover_dev, "failover %s slave:%s unregistered\n",
slave_is_standby ? "standby" : "primary", slave_dev->name);
@@ -736,7 +685,7 @@ struct failover *net_failover_create(struct net_device *standby_dev)
/* Don't allow failover devices to change network namespaces. */
failover_dev->netns_immutable = true;
- failover_dev->hw_features = FAILOVER_VLAN_FEATURES |
+ failover_dev->hw_features = MASTER_UPPER_DEV_VLAN_FEATURES |
NETIF_F_HW_VLAN_CTAG_TX |
NETIF_F_HW_VLAN_CTAG_RX |
NETIF_F_HW_VLAN_CTAG_FILTER;
diff --git a/include/net/net_failover.h b/include/net/net_failover.h
index b12a1c469d1c..f0f038d113a0 100644
--- a/include/net/net_failover.h
+++ b/include/net/net_failover.h
@@ -30,11 +30,4 @@ struct net_failover_info {
struct failover *net_failover_create(struct net_device *standby_dev);
void net_failover_destroy(struct failover *failover);
-#define FAILOVER_VLAN_FEATURES (NETIF_F_HW_CSUM | NETIF_F_SG | \
- NETIF_F_FRAGLIST | NETIF_F_ALL_TSO | \
- NETIF_F_HIGHDMA | NETIF_F_LRO)
-
-#define FAILOVER_ENC_FEATURES (NETIF_F_HW_CSUM | NETIF_F_SG | \
- NETIF_F_RXCSUM | NETIF_F_ALL_TSO)
-
#endif /* _NET_FAILOVER_H */
--
Git-155)
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH net-next 3/3] net: no need to disable LRO specifically
2026-03-10 7:45 [PATCH net-next 0/3] net: move netdev_compute_master_upper_features to ndo_set_features Hangbin Liu
2026-03-10 7:45 ` [PATCH net-next 1/3] net: use ndo_set_features to set offload features for bonding/bridge/team Hangbin Liu
2026-03-10 7:45 ` [PATCH net-next 2/3] failover: use ndo_set_features for failover offload compute Hangbin Liu
@ 2026-03-10 7:45 ` Hangbin Liu
2026-03-10 17:02 ` [syzbot ci] Re: net: move netdev_compute_master_upper_features to ndo_set_features syzbot ci
2026-03-12 9:46 ` [PATCH net-next 0/3] " Paolo Abeni
4 siblings, 0 replies; 15+ messages in thread
From: Hangbin Liu @ 2026-03-10 7:45 UTC (permalink / raw)
To: Jay Vosburgh, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Jiri Pirko, Nikolay Aleksandrov,
Ido Schimmel, Simon Horman, Sridhar Samudrala
Cc: netdev, linux-kernel, bridge, Hangbin Liu
In commit "net: use ndo_set_features to set offload features for
bonding/bridge/team" we called netdev_change_features() in
__netdev_upper_dev_link(), which will disable LRO automatically on lower
device if upper device doesn't have LRO enabled. So we don't need to
dev_disable_lro() again after netdev_upper_dev_link().
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
drivers/net/bonding/bond_main.c | 3 ---
drivers/net/team/team_core.c | 3 ---
net/8021q/vlan.c | 2 --
net/bridge/br_if.c | 2 --
net/hsr/hsr_slave.c | 1 -
5 files changed, 11 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 4d32b06079ee..5b117acb1138 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2177,9 +2177,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
}
#endif
- if (!(bond_dev->features & NETIF_F_LRO))
- dev_disable_lro(slave_dev);
-
res = netdev_rx_handler_register(slave_dev, bond_handle_frame,
new_slave);
if (res) {
diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index 4906ea3717b1..73d4f68a7ccd 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1191,9 +1191,6 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
goto err_enable_netpoll;
}
- if (!(dev->features & NETIF_F_LRO))
- dev_disable_lro(port_dev);
-
err = netdev_rx_handler_register(port_dev, team_handle_frame,
port);
if (err) {
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 2b74ed56eb16..fda3a80e9340 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -193,8 +193,6 @@ int register_vlan_dev(struct net_device *dev, struct netlink_ext_ack *extack)
vlan_group_set_device(grp, vlan->vlan_proto, vlan_id, dev);
grp->nr_vlan_devs++;
- netdev_update_features(dev);
-
return 0;
out_unregister_netdev:
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 030248bc94c5..0aa653a1e651 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -620,8 +620,6 @@ int br_add_if(struct net_bridge *br, struct net_device *dev,
if (err)
goto err5;
- dev_disable_lro(dev);
-
list_add_rcu(&p->list, &br->port_list);
nbp_update_port_count(br);
diff --git a/net/hsr/hsr_slave.c b/net/hsr/hsr_slave.c
index 44f83c8c56a7..4b6ab185392b 100644
--- a/net/hsr/hsr_slave.c
+++ b/net/hsr/hsr_slave.c
@@ -169,7 +169,6 @@ static int hsr_portdev_setup(struct hsr_priv *hsr, struct net_device *dev,
res = netdev_rx_handler_register(dev, hsr_handle_frame, port);
if (res)
goto fail_rx_handler;
- dev_disable_lro(dev);
return 0;
--
Git-155)
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [syzbot ci] Re: net: move netdev_compute_master_upper_features to ndo_set_features
2026-03-10 7:45 [PATCH net-next 0/3] net: move netdev_compute_master_upper_features to ndo_set_features Hangbin Liu
` (2 preceding siblings ...)
2026-03-10 7:45 ` [PATCH net-next 3/3] net: no need to disable LRO specifically Hangbin Liu
@ 2026-03-10 17:02 ` syzbot ci
2026-03-10 19:17 ` Sabrina Dubroca
2026-03-12 9:46 ` [PATCH net-next 0/3] " Paolo Abeni
4 siblings, 1 reply; 15+ messages in thread
From: syzbot ci @ 2026-03-10 17:02 UTC (permalink / raw)
To: andrew, bridge, davem, edumazet, horms, idosch, jiri, jv, kuba,
linux-kernel, liuhangbin, netdev, pabeni, razor,
sridhar.samudrala
Cc: syzbot, syzkaller-bugs
syzbot ci has tested the following series
[v1] net: move netdev_compute_master_upper_features to ndo_set_features
https://lore.kernel.org/all/20260310-offload_compute-v1-0-3df79c09ea65@gmail.com
* [PATCH net-next 1/3] net: use ndo_set_features to set offload features for bonding/bridge/team
* [PATCH net-next 2/3] failover: use ndo_set_features for failover offload compute
* [PATCH net-next 3/3] net: no need to disable LRO specifically
and found the following issue:
WARNING in rtmsg_ifinfo_build_skb
Full report is available here:
https://ci.syzbot.org/series/d5001c4a-a51e-49c1-9106-624836e43ec2
***
WARNING in rtmsg_ifinfo_build_skb
tree: net-next
URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/netdev/net-next.git
base: 52ede1bce557c66309f41ac29dd190be23ca9129
arch: amd64
compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
config: https://ci.syzbot.org/builds/281e7e02-f6af-4b4a-845e-d1d5842a9301/config
batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active
hsr_slave_0: entered promiscuous mode
hsr_slave_1: entered promiscuous mode
------------[ cut here ]------------
err == -EMSGSIZE
WARNING: net/core/rtnetlink.c:4421 at rtmsg_ifinfo_build_skb+0x218/0x260, CPU#0: syz-executor/6496
Modules linked in:
CPU: 0 UID: 0 PID: 6496 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:rtmsg_ifinfo_build_skb+0x218/0x260
Code: f6 ba 01 00 00 00 89 e9 e8 25 15 3a 00 4c 89 f0 48 83 c4 30 5b 41 5c 41 5d 41 5e 41 5f 5d e9 7f 3a 2e 02 cc e8 49 3b 42 f8 90 <0f> 0b 90 eb 90 89 d9 80 e1 07 fe c1 38 c1 0f 8c 95 fe ff ff 48 89
RSP: 0018:ffffc9000637e9a0 EFLAGS: 00010293
RAX: ffffffff89835e27 RBX: 0000000000000000 RCX: ffff8881b80a57c0
RDX: 0000000000000000 RSI: 00000000ffffffa6 RDI: 00000000ffffffa6
RBP: 00000000ffffffa6 R08: 0000000000000004 R09: 0000000000000004
R10: fffff52000c6fcdc R11: 0000000000000000 R12: 1ffff110235ddc21
R13: 0000000000000000 R14: ffff8881133dc780 R15: ffff88811aeee000
FS: 0000555557c4a500(0000) GS:ffff88818de65000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055555e1838c8 CR3: 0000000168b80000 CR4: 00000000000006f0
Call Trace:
<TASK>
rtnetlink_event+0x1b7/0x270
notifier_call_chain+0x1be/0x400
netdev_change_features+0x95/0xe0
__netdev_upper_dev_link+0xb20/0xc80
netdev_upper_dev_link+0xb0/0x100
macsec_newlink+0xb11/0x1200
rtnl_newlink_create+0x329/0xb70
rtnl_newlink+0x1666/0x1be0
rtnetlink_rcv_msg+0x7d5/0xbe0
netlink_rcv_skb+0x232/0x4b0
netlink_unicast+0x80f/0x9b0
netlink_sendmsg+0x813/0xb40
__sys_sendto+0x672/0x710
__x64_sys_sendto+0xde/0x100
do_syscall_64+0x14d/0xf80
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ffadf757917
Code: 48 89 fa 4c 89 df e8 a8 56 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
RSP: 002b:00007ffd0eb79f80 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000555557c4a500 RCX: 00007ffadf757917
RDX: 0000000000000054 RSI: 00007ffae0544670 RDI: 0000000000000003
RBP: 0000000000000001 R08: 00007ffd0eb79fe4 R09: 000000000000000c
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000003
R13: 0000000000000000 R14: 00007ffae0544670 R15: 0000000000000000
</TASK>
***
If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
Tested-by: syzbot@syzkaller.appspotmail.com
---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [syzbot ci] Re: net: move netdev_compute_master_upper_features to ndo_set_features
2026-03-10 17:02 ` [syzbot ci] Re: net: move netdev_compute_master_upper_features to ndo_set_features syzbot ci
@ 2026-03-10 19:17 ` Sabrina Dubroca
2026-03-11 0:47 ` Hangbin Liu
0 siblings, 1 reply; 15+ messages in thread
From: Sabrina Dubroca @ 2026-03-10 19:17 UTC (permalink / raw)
To: syzbot ci
Cc: andrew, bridge, davem, edumazet, horms, idosch, jiri, jv, kuba,
linux-kernel, liuhangbin, netdev, pabeni, razor,
sridhar.samudrala, syzbot, syzkaller-bugs
2026-03-10, 10:02:09 -0700, syzbot ci wrote:
> batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active
> hsr_slave_0: entered promiscuous mode
> hsr_slave_1: entered promiscuous mode
> ------------[ cut here ]------------
> err == -EMSGSIZE
> WARNING: net/core/rtnetlink.c:4421 at rtmsg_ifinfo_build_skb+0x218/0x260, CPU#0: syz-executor/6496
I'm not sure this one is caused by this series, but either way,
reviewing if_nlmsg_size/rtnl_fill_ifinfo for mismatches is really
unpleasant :/
Things I see in rtnl_fill_ifinfo but don't find in if_nlmsg_size:
- IFLA_PARENT_DEV_NAME
- IFLA_PARENT_DEV_BUS_NAME
(both from 00e77ed8e64d ("rtnetlink: add
IFLA_PARENT_[DEV|DEV_BUS]_NAME"), which doesn't include a change to
if_nlmsg_size)
- rtnl_link_slave_info_fill also outputs IFLA_INFO_SLAVE_KIND + the
IFLA_INFO_SLAVE_DATA nest, but rtnl_link_get_slave_info_data_size
only counts the nest, and its caller (rtnl_link_get_size) doesn't
have anything more about the slave info. This may be what syzbot is
tripping on here.
But there's a
+ nla_total_size(4) /* IFLA_WEIGHT */
that doesn't get filled anywhere.
> Modules linked in:
> CPU: 0 UID: 0 PID: 6496 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(full)
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> RIP: 0010:rtmsg_ifinfo_build_skb+0x218/0x260
> Code: f6 ba 01 00 00 00 89 e9 e8 25 15 3a 00 4c 89 f0 48 83 c4 30 5b 41 5c 41 5d 41 5e 41 5f 5d e9 7f 3a 2e 02 cc e8 49 3b 42 f8 90 <0f> 0b 90 eb 90 89 d9 80 e1 07 fe c1 38 c1 0f 8c 95 fe ff ff 48 89
> RSP: 0018:ffffc9000637e9a0 EFLAGS: 00010293
> RAX: ffffffff89835e27 RBX: 0000000000000000 RCX: ffff8881b80a57c0
> RDX: 0000000000000000 RSI: 00000000ffffffa6 RDI: 00000000ffffffa6
> RBP: 00000000ffffffa6 R08: 0000000000000004 R09: 0000000000000004
> R10: fffff52000c6fcdc R11: 0000000000000000 R12: 1ffff110235ddc21
> R13: 0000000000000000 R14: ffff8881133dc780 R15: ffff88811aeee000
> FS: 0000555557c4a500(0000) GS:ffff88818de65000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000055555e1838c8 CR3: 0000000168b80000 CR4: 00000000000006f0
> Call Trace:
> <TASK>
> rtnetlink_event+0x1b7/0x270
> notifier_call_chain+0x1be/0x400
> netdev_change_features+0x95/0xe0
> __netdev_upper_dev_link+0xb20/0xc80
> netdev_upper_dev_link+0xb0/0x100
> macsec_newlink+0xb11/0x1200
> rtnl_newlink_create+0x329/0xb70
> rtnl_newlink+0x1666/0x1be0
--
Sabrina
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [syzbot ci] Re: net: move netdev_compute_master_upper_features to ndo_set_features
2026-03-10 19:17 ` Sabrina Dubroca
@ 2026-03-11 0:47 ` Hangbin Liu
2026-03-11 21:18 ` Sabrina Dubroca
0 siblings, 1 reply; 15+ messages in thread
From: Hangbin Liu @ 2026-03-11 0:47 UTC (permalink / raw)
To: Sabrina Dubroca
Cc: syzbot ci, andrew, bridge, davem, edumazet, horms, idosch, jiri,
jv, kuba, linux-kernel, netdev, pabeni, razor, sridhar.samudrala,
syzbot, syzkaller-bugs
On Tue, Mar 10, 2026 at 08:17:01PM +0100, Sabrina Dubroca wrote:
> 2026-03-10, 10:02:09 -0700, syzbot ci wrote:
> > batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active
> > hsr_slave_0: entered promiscuous mode
> > hsr_slave_1: entered promiscuous mode
> > ------------[ cut here ]------------
> > err == -EMSGSIZE
> > WARNING: net/core/rtnetlink.c:4421 at rtmsg_ifinfo_build_skb+0x218/0x260, CPU#0: syz-executor/6496
>
> I'm not sure this one is caused by this series, but either way,
rtnetlink_event+0x1b7/0x270
notifier_call_chain+0x1be/0x400
netdev_change_features+0x95/0xe0
__netdev_upper_dev_link+0xb20/0xc80
netdev_upper_dev_link+0xb0/0x100
This patch calls netdev_change_features() after __netdev_upper_dev_link(),
Which trigger a NETDEV_FEAT_CHANGE notify and calls rtmsg_ifinfo_event()
to fill the new link info. Maybe the event is a bit early and macsec has
data not ready?
Thanks
Hangbin
> reviewing if_nlmsg_size/rtnl_fill_ifinfo for mismatches is really
> unpleasant :/
>
> Things I see in rtnl_fill_ifinfo but don't find in if_nlmsg_size:
> - IFLA_PARENT_DEV_NAME
> - IFLA_PARENT_DEV_BUS_NAME
> (both from 00e77ed8e64d ("rtnetlink: add
> IFLA_PARENT_[DEV|DEV_BUS]_NAME"), which doesn't include a change to
> if_nlmsg_size)
> - rtnl_link_slave_info_fill also outputs IFLA_INFO_SLAVE_KIND + the
> IFLA_INFO_SLAVE_DATA nest, but rtnl_link_get_slave_info_data_size
> only counts the nest, and its caller (rtnl_link_get_size) doesn't
> have anything more about the slave info. This may be what syzbot is
> tripping on here.
>
>
> But there's a
>
> + nla_total_size(4) /* IFLA_WEIGHT */
>
> that doesn't get filled anywhere.
>
>
> > Modules linked in:
> > CPU: 0 UID: 0 PID: 6496 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(full)
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > RIP: 0010:rtmsg_ifinfo_build_skb+0x218/0x260
> > Code: f6 ba 01 00 00 00 89 e9 e8 25 15 3a 00 4c 89 f0 48 83 c4 30 5b 41 5c 41 5d 41 5e 41 5f 5d e9 7f 3a 2e 02 cc e8 49 3b 42 f8 90 <0f> 0b 90 eb 90 89 d9 80 e1 07 fe c1 38 c1 0f 8c 95 fe ff ff 48 89
> > RSP: 0018:ffffc9000637e9a0 EFLAGS: 00010293
> > RAX: ffffffff89835e27 RBX: 0000000000000000 RCX: ffff8881b80a57c0
> > RDX: 0000000000000000 RSI: 00000000ffffffa6 RDI: 00000000ffffffa6
> > RBP: 00000000ffffffa6 R08: 0000000000000004 R09: 0000000000000004
> > R10: fffff52000c6fcdc R11: 0000000000000000 R12: 1ffff110235ddc21
> > R13: 0000000000000000 R14: ffff8881133dc780 R15: ffff88811aeee000
> > FS: 0000555557c4a500(0000) GS:ffff88818de65000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 000055555e1838c8 CR3: 0000000168b80000 CR4: 00000000000006f0
> > Call Trace:
> > <TASK>
> > rtnetlink_event+0x1b7/0x270
> > notifier_call_chain+0x1be/0x400
> > netdev_change_features+0x95/0xe0
> > __netdev_upper_dev_link+0xb20/0xc80
> > netdev_upper_dev_link+0xb0/0x100
> > macsec_newlink+0xb11/0x1200
> > rtnl_newlink_create+0x329/0xb70
> > rtnl_newlink+0x1666/0x1be0
>
> --
> Sabrina
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [syzbot ci] Re: net: move netdev_compute_master_upper_features to ndo_set_features
2026-03-11 0:47 ` Hangbin Liu
@ 2026-03-11 21:18 ` Sabrina Dubroca
2026-03-12 9:40 ` Paolo Abeni
0 siblings, 1 reply; 15+ messages in thread
From: Sabrina Dubroca @ 2026-03-11 21:18 UTC (permalink / raw)
To: Hangbin Liu
Cc: syzbot ci, andrew, bridge, davem, edumazet, horms, idosch, jiri,
jv, kuba, linux-kernel, netdev, pabeni, razor, sridhar.samudrala,
syzbot, syzkaller-bugs
2026-03-11, 00:47:41 +0000, Hangbin Liu wrote:
> On Tue, Mar 10, 2026 at 08:17:01PM +0100, Sabrina Dubroca wrote:
> > 2026-03-10, 10:02:09 -0700, syzbot ci wrote:
> > > batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active
> > > hsr_slave_0: entered promiscuous mode
> > > hsr_slave_1: entered promiscuous mode
> > > ------------[ cut here ]------------
> > > err == -EMSGSIZE
> > > WARNING: net/core/rtnetlink.c:4421 at rtmsg_ifinfo_build_skb+0x218/0x260, CPU#0: syz-executor/6496
> >
> > I'm not sure this one is caused by this series, but either way,
>
>
> rtnetlink_event+0x1b7/0x270
> notifier_call_chain+0x1be/0x400
> netdev_change_features+0x95/0xe0
> __netdev_upper_dev_link+0xb20/0xc80
> netdev_upper_dev_link+0xb0/0x100
>
>
> This patch calls netdev_change_features() after __netdev_upper_dev_link(),
> Which trigger a NETDEV_FEAT_CHANGE notify and calls rtmsg_ifinfo_event()
> to fill the new link info. Maybe the event is a bit early and macsec has
> data not ready?
But this would still mean that there's a mismatch between
if_nlmsg_size() and rtnl_fill_ifinfo(), and your patch is only
revealing it.
I'll send fixes for the stuff I mentioned, no idea if that's what
syzbot saw since we don't have a repro.
--
Sabrina
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [syzbot ci] Re: net: move netdev_compute_master_upper_features to ndo_set_features
2026-03-11 21:18 ` Sabrina Dubroca
@ 2026-03-12 9:40 ` Paolo Abeni
2026-03-12 11:13 ` Sabrina Dubroca
0 siblings, 1 reply; 15+ messages in thread
From: Paolo Abeni @ 2026-03-12 9:40 UTC (permalink / raw)
To: Sabrina Dubroca, Hangbin Liu
Cc: syzbot ci, andrew, bridge, davem, edumazet, horms, idosch, jiri,
jv, kuba, linux-kernel, netdev, razor, sridhar.samudrala, syzbot,
syzkaller-bugs
On 3/11/26 10:18 PM, Sabrina Dubroca wrote:
> 2026-03-11, 00:47:41 +0000, Hangbin Liu wrote:
>> On Tue, Mar 10, 2026 at 08:17:01PM +0100, Sabrina Dubroca wrote:
>>> 2026-03-10, 10:02:09 -0700, syzbot ci wrote:
>>>> batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active
>>>> hsr_slave_0: entered promiscuous mode
>>>> hsr_slave_1: entered promiscuous mode
>>>> ------------[ cut here ]------------
>>>> err == -EMSGSIZE
>>>> WARNING: net/core/rtnetlink.c:4421 at rtmsg_ifinfo_build_skb+0x218/0x260, CPU#0: syz-executor/6496
>>>
>>> I'm not sure this one is caused by this series, but either way,
>>
>>
>> rtnetlink_event+0x1b7/0x270
>> notifier_call_chain+0x1be/0x400
>> netdev_change_features+0x95/0xe0
>> __netdev_upper_dev_link+0xb20/0xc80
>> netdev_upper_dev_link+0xb0/0x100
>>
>>
>> This patch calls netdev_change_features() after __netdev_upper_dev_link(),
>> Which trigger a NETDEV_FEAT_CHANGE notify and calls rtmsg_ifinfo_event()
>> to fill the new link info. Maybe the event is a bit early and macsec has
>> data not ready?
>
> But this would still mean that there's a mismatch between
> if_nlmsg_size() and rtnl_fill_ifinfo(), and your patch is only
> revealing it.
>
> I'll send fixes for the stuff I mentioned, no idea if that's what
> syzbot saw since we don't have a repro.
It looks like even the nipa CI is reproducing the issue, i.e.:
https://netdev-ctrl.bots.linux.dev/logs/vmksft/net-dbg/results/554921/17-rtnetlink-sh/
more failures here:
https://netdev.bots.linux.dev/contest.html?pw-n=0&branch=net-next-2026-03-12--06-00&pw-n=0&pass=0
the fail in mascsec-offload looks quite deterministic, could you please
have a look?
/P
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 0/3] net: move netdev_compute_master_upper_features to ndo_set_features
2026-03-10 7:45 [PATCH net-next 0/3] net: move netdev_compute_master_upper_features to ndo_set_features Hangbin Liu
` (3 preceding siblings ...)
2026-03-10 17:02 ` [syzbot ci] Re: net: move netdev_compute_master_upper_features to ndo_set_features syzbot ci
@ 2026-03-12 9:46 ` Paolo Abeni
4 siblings, 0 replies; 15+ messages in thread
From: Paolo Abeni @ 2026-03-12 9:46 UTC (permalink / raw)
To: Hangbin Liu, Jay Vosburgh, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Jiri Pirko, Nikolay Aleksandrov,
Ido Schimmel, Simon Horman, Sridhar Samudrala
Cc: netdev, linux-kernel, bridge
On 3/10/26 8:45 AM, Hangbin Liu wrote:
> Currently, master devices (bonding, bridge, team) manually call
> netdev_compute_master_upper_features() scattered throughout their port
> add/remove operations. This approach requires each driver to remember
> to update features at the right times and leads to code duplication.
>
> The series moves netdev_compute_master_upper_features() to callback
> ndo_set_features so that the offload compute could automatically
> invoked during feature updates when upper/lower device relationships
> change. This centralizes the feature computation flow and removes the
> burden from individual drivers.
>
> ---
> Hangbin Liu (3):
> net: use ndo_set_features to set offload features for bonding/bridge/team
> failover: use ndo_set_features for failover offload compute
> net: no need to disable LRO specifically
>
> drivers/net/bonding/bond_main.c | 14 +++++----
> drivers/net/net_failover.c | 67 +++++------------------------------------
> drivers/net/team/team_core.c | 15 ++++-----
> include/net/net_failover.h | 7 -----
> net/8021q/vlan.c | 2 --
> net/bridge/br_device.c | 7 +++++
> net/bridge/br_if.c | 6 ----
> net/core/dev.c | 8 +++--
> net/hsr/hsr_slave.c | 1 -
> 9 files changed, 37 insertions(+), 90 deletions(-)
I'm dropping this series from PW due to the self-tests failures. I guess
it could be restored later if it turns out the real problem is elsewhere.
/P
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [syzbot ci] Re: net: move netdev_compute_master_upper_features to ndo_set_features
2026-03-12 9:40 ` Paolo Abeni
@ 2026-03-12 11:13 ` Sabrina Dubroca
2026-03-12 14:34 ` Hangbin Liu
0 siblings, 1 reply; 15+ messages in thread
From: Sabrina Dubroca @ 2026-03-12 11:13 UTC (permalink / raw)
To: Paolo Abeni
Cc: Hangbin Liu, syzbot ci, andrew, bridge, davem, edumazet, horms,
idosch, jiri, jv, kuba, linux-kernel, netdev, razor,
sridhar.samudrala, syzbot, syzkaller-bugs
2026-03-12, 10:40:43 +0100, Paolo Abeni wrote:
> On 3/11/26 10:18 PM, Sabrina Dubroca wrote:
> > 2026-03-11, 00:47:41 +0000, Hangbin Liu wrote:
> >> On Tue, Mar 10, 2026 at 08:17:01PM +0100, Sabrina Dubroca wrote:
> >>> 2026-03-10, 10:02:09 -0700, syzbot ci wrote:
> >>>> batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active
> >>>> hsr_slave_0: entered promiscuous mode
> >>>> hsr_slave_1: entered promiscuous mode
> >>>> ------------[ cut here ]------------
> >>>> err == -EMSGSIZE
> >>>> WARNING: net/core/rtnetlink.c:4421 at rtmsg_ifinfo_build_skb+0x218/0x260, CPU#0: syz-executor/6496
> >>>
> >>> I'm not sure this one is caused by this series, but either way,
> >>
> >>
> >> rtnetlink_event+0x1b7/0x270
> >> notifier_call_chain+0x1be/0x400
> >> netdev_change_features+0x95/0xe0
> >> __netdev_upper_dev_link+0xb20/0xc80
> >> netdev_upper_dev_link+0xb0/0x100
> >>
> >>
> >> This patch calls netdev_change_features() after __netdev_upper_dev_link(),
> >> Which trigger a NETDEV_FEAT_CHANGE notify and calls rtmsg_ifinfo_event()
> >> to fill the new link info. Maybe the event is a bit early and macsec has
> >> data not ready?
> >
> > But this would still mean that there's a mismatch between
> > if_nlmsg_size() and rtnl_fill_ifinfo(), and your patch is only
> > revealing it.
> >
> > I'll send fixes for the stuff I mentioned, no idea if that's what
> > syzbot saw since we don't have a repro.
>
> It looks like even the nipa CI is reproducing the issue, i.e.:
>
> https://netdev-ctrl.bots.linux.dev/logs/vmksft/net-dbg/results/554921/17-rtnetlink-sh/
>
> more failures here:
>
> https://netdev.bots.linux.dev/contest.html?pw-n=0&branch=net-next-2026-03-12--06-00&pw-n=0&pass=0
>
> the fail in mascsec-offload looks quite deterministic, could you please
> have a look?
Ah crap, sorry Hangbin, you were right. macsec_fill_info() returns
-EMSGSIZE when the key length is unexpected, and at this point we
haven't set it to its proper value yet.
Bandaid solution would be:
diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index f6cad0746a02..0f7ef7bfbdde 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -4337,7 +4337,7 @@ static int macsec_fill_info(struct sk_buff *skb,
csid = secy->xpn ? MACSEC_CIPHER_ID_GCM_AES_XPN_256 : MACSEC_CIPHER_ID_GCM_AES_256;
break;
default:
- goto nla_put_failure;
+ return 0;
}
if (nla_put_sci(skb, IFLA_MACSEC_SCI, secy->sci,
Proper fix (so that the notification we're sending during
upper_dev_link has full linkinfo) would be to move
netdev_upper_dev_link() to after macsec_changelink_common() and fix up
the error handling. I don't see anything in macsec_add_dev or
macsec_changelink_common that needs the device to be linked. But
anyway it doesn't make sense for macsec_fill_info to return -EMSGSIZE
on invalid data, so the "bandaid" should be included as well.
Should this be part of this series (either just the "bandaid" or the
"proper fix"+bandaid), since we never saw a problem before?
--
Sabrina
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [syzbot ci] Re: net: move netdev_compute_master_upper_features to ndo_set_features
2026-03-12 11:13 ` Sabrina Dubroca
@ 2026-03-12 14:34 ` Hangbin Liu
2026-03-12 15:58 ` Sabrina Dubroca
0 siblings, 1 reply; 15+ messages in thread
From: Hangbin Liu @ 2026-03-12 14:34 UTC (permalink / raw)
To: Sabrina Dubroca
Cc: Paolo Abeni, syzbot ci, andrew, bridge, davem, edumazet, horms,
idosch, jiri, jv, kuba, linux-kernel, netdev, razor,
sridhar.samudrala, syzbot, syzkaller-bugs
On Thu, Mar 12, 2026 at 12:13:52PM +0100, Sabrina Dubroca wrote:
> > >> This patch calls netdev_change_features() after __netdev_upper_dev_link(),
> > >> Which trigger a NETDEV_FEAT_CHANGE notify and calls rtmsg_ifinfo_event()
> > >> to fill the new link info. Maybe the event is a bit early and macsec has
> > >> data not ready?
> > >
> > > But this would still mean that there's a mismatch between
> > > if_nlmsg_size() and rtnl_fill_ifinfo(), and your patch is only
> > > revealing it.
> > >
> > > I'll send fixes for the stuff I mentioned, no idea if that's what
> > > syzbot saw since we don't have a repro.
> >
> > It looks like even the nipa CI is reproducing the issue, i.e.:
> >
> > https://netdev-ctrl.bots.linux.dev/logs/vmksft/net-dbg/results/554921/17-rtnetlink-sh/
> >
> > more failures here:
> >
> > https://netdev.bots.linux.dev/contest.html?pw-n=0&branch=net-next-2026-03-12--06-00&pw-n=0&pass=0
> >
> > the fail in mascsec-offload looks quite deterministic, could you please
> > have a look?
>
> Ah crap, sorry Hangbin, you were right. macsec_fill_info() returns
> -EMSGSIZE when the key length is unexpected, and at this point we
> haven't set it to its proper value yet.
>
> Bandaid solution would be:
>
> diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
> index f6cad0746a02..0f7ef7bfbdde 100644
> --- a/drivers/net/macsec.c
> +++ b/drivers/net/macsec.c
> @@ -4337,7 +4337,7 @@ static int macsec_fill_info(struct sk_buff *skb,
> csid = secy->xpn ? MACSEC_CIPHER_ID_GCM_AES_XPN_256 : MACSEC_CIPHER_ID_GCM_AES_256;
> break;
> default:
> - goto nla_put_failure;
> + return 0;
> }
>
> if (nla_put_sci(skb, IFLA_MACSEC_SCI, secy->sci,
>
>
> Proper fix (so that the notification we're sending during
> upper_dev_link has full linkinfo) would be to move
> netdev_upper_dev_link() to after macsec_changelink_common() and fix up
> the error handling. I don't see anything in macsec_add_dev or
> macsec_changelink_common that needs the device to be linked. But
If we move the netdev_upper_dev_link() after macsec_changelink_common(),
we will not goto nla_put_failure via default, right?
> anyway it doesn't make sense for macsec_fill_info to return -EMSGSIZE
> on invalid data, so the "bandaid" should be included as well.
>
> Should this be part of this series (either just the "bandaid" or the
> "proper fix"+bandaid), since we never saw a problem before?
Since macsec need the "bandaid" fix either way. How about you post the
"bandaid" fix to net. And I include the "proper fix" in this series for
net-next?
BTW, here is my draft patch, would you like to review it first?
diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index f6cad0746a02..1f8da4c291c6 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -4161,10 +4161,6 @@ static int macsec_newlink(struct net_device *dev,
lockdep_set_class(&dev->addr_list_lock,
&macsec_netdev_addr_lock_key);
- err = netdev_upper_dev_link(real_dev, dev, extack);
- if (err < 0)
- goto unregister;
-
/* need to be already registered so that ->init has run and
* the MAC addr is set
*/
@@ -4177,12 +4173,12 @@ static int macsec_newlink(struct net_device *dev,
if (rx_handler && sci_exists(real_dev, sci)) {
err = -EBUSY;
- goto unlink;
+ goto unregister;
}
err = macsec_add_dev(dev, sci, icv_len);
if (err)
- goto unlink;
+ goto unregister;
if (data) {
err = macsec_changelink_common(dev, data);
@@ -4190,6 +4186,10 @@ static int macsec_newlink(struct net_device *dev,
goto del_dev;
}
+ err = netdev_upper_dev_link(real_dev, dev, extack);
+ if (err < 0)
+ goto unlink;
+
/* If h/w offloading is available, propagate to the device */
if (macsec_is_offloaded(macsec)) {
const struct macsec_ops *ops;
@@ -4200,7 +4200,7 @@ static int macsec_newlink(struct net_device *dev,
ctx.secy = &macsec->secy;
err = macsec_offload(ops->mdo_add_secy, &ctx);
if (err)
- goto del_dev;
+ goto unlink;
macsec->insert_tx_tag =
macsec_needs_tx_tag(macsec, ops);
@@ -4209,7 +4209,7 @@ static int macsec_newlink(struct net_device *dev,
err = register_macsec_dev(real_dev, dev);
if (err < 0)
- goto del_dev;
+ goto unlink;
netdev_update_features(dev);
netif_stacked_transfer_operstate(real_dev, dev);
@@ -4219,10 +4219,10 @@ static int macsec_newlink(struct net_device *dev,
return 0;
-del_dev:
- macsec_del_dev(macsec);
unlink:
netdev_upper_dev_unlink(real_dev, dev);
+del_dev:
+ macsec_del_dev(macsec);
unregister:
unregister_netdevice(dev);
return err;
Thanks
Hangbin
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [syzbot ci] Re: net: move netdev_compute_master_upper_features to ndo_set_features
2026-03-12 14:34 ` Hangbin Liu
@ 2026-03-12 15:58 ` Sabrina Dubroca
2026-03-12 16:47 ` Paolo Abeni
0 siblings, 1 reply; 15+ messages in thread
From: Sabrina Dubroca @ 2026-03-12 15:58 UTC (permalink / raw)
To: Hangbin Liu
Cc: Paolo Abeni, syzbot ci, andrew, bridge, davem, edumazet, horms,
idosch, jiri, jv, kuba, linux-kernel, netdev, razor,
sridhar.samudrala, syzbot, syzkaller-bugs
2026-03-12, 14:34:44 +0000, Hangbin Liu wrote:
> On Thu, Mar 12, 2026 at 12:13:52PM +0100, Sabrina Dubroca wrote:
> > Proper fix (so that the notification we're sending during
> > upper_dev_link has full linkinfo) would be to move
> > netdev_upper_dev_link() to after macsec_changelink_common() and fix up
> > the error handling. I don't see anything in macsec_add_dev or
> > macsec_changelink_common that needs the device to be linked. But
>
> If we move the netdev_upper_dev_link() after macsec_changelink_common(),
> we will not goto nla_put_failure via default, right?
Yes.
> > anyway it doesn't make sense for macsec_fill_info to return -EMSGSIZE
> > on invalid data, so the "bandaid" should be included as well.
> >
> > Should this be part of this series (either just the "bandaid" or the
> > "proper fix"+bandaid), since we never saw a problem before?
>
> Since macsec need the "bandaid" fix either way. How about you post the
> "bandaid" fix to net. And I include the "proper fix" in this series for
> net-next?
But I don't think it's needed in net. Am I missing a codepath (before
your series) where macsec_fill_info could be called for the new device
before macsec_newlink returns? If not, it doesn't really qualify as a
fix, that's why I was asking Paolo.
> BTW, here is my draft patch, would you like to review it first?
>
> diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
> index f6cad0746a02..1f8da4c291c6 100644
> --- a/drivers/net/macsec.c
> +++ b/drivers/net/macsec.c
> @@ -4161,10 +4161,6 @@ static int macsec_newlink(struct net_device *dev,
> lockdep_set_class(&dev->addr_list_lock,
> &macsec_netdev_addr_lock_key);
>
> - err = netdev_upper_dev_link(real_dev, dev, extack);
> - if (err < 0)
> - goto unregister;
> -
> /* need to be already registered so that ->init has run and
> * the MAC addr is set
> */
> @@ -4177,12 +4173,12 @@ static int macsec_newlink(struct net_device *dev,
>
> if (rx_handler && sci_exists(real_dev, sci)) {
> err = -EBUSY;
> - goto unlink;
> + goto unregister;
> }
>
> err = macsec_add_dev(dev, sci, icv_len);
> if (err)
> - goto unlink;
> + goto unregister;
>
> if (data) {
> err = macsec_changelink_common(dev, data);
> @@ -4190,6 +4186,10 @@ static int macsec_newlink(struct net_device *dev,
> goto del_dev;
> }
>
> + err = netdev_upper_dev_link(real_dev, dev, extack);
> + if (err < 0)
> + goto unlink;
This one shouldn't be "goto unlink" since we haven't linked yet. I'm
not sure netdev_upper_dev_unlink can handle "not linked yet" devices.
Otherwise this looks ok.
--
Sabrina
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [syzbot ci] Re: net: move netdev_compute_master_upper_features to ndo_set_features
2026-03-12 15:58 ` Sabrina Dubroca
@ 2026-03-12 16:47 ` Paolo Abeni
2026-03-12 17:07 ` Sabrina Dubroca
0 siblings, 1 reply; 15+ messages in thread
From: Paolo Abeni @ 2026-03-12 16:47 UTC (permalink / raw)
To: Sabrina Dubroca, Hangbin Liu
Cc: syzbot ci, andrew, bridge, davem, edumazet, horms, idosch, jiri,
jv, kuba, linux-kernel, netdev, razor, sridhar.samudrala, syzbot,
syzkaller-bugs
On 3/12/26 4:58 PM, Sabrina Dubroca wrote:
> 2026-03-12, 14:34:44 +0000, Hangbin Liu wrote:
>> On Thu, Mar 12, 2026 at 12:13:52PM +0100, Sabrina Dubroca wrote:
>>> Proper fix (so that the notification we're sending during
>>> upper_dev_link has full linkinfo) would be to move
>>> netdev_upper_dev_link() to after macsec_changelink_common() and fix up
>>> the error handling. I don't see anything in macsec_add_dev or
>>> macsec_changelink_common that needs the device to be linked. But
>>
>> If we move the netdev_upper_dev_link() after macsec_changelink_common(),
>> we will not goto nla_put_failure via default, right?
>
> Yes.
>
>>> anyway it doesn't make sense for macsec_fill_info to return -EMSGSIZE
>>> on invalid data, so the "bandaid" should be included as well.
>>>
>>> Should this be part of this series (either just the "bandaid" or the
>>> "proper fix"+bandaid), since we never saw a problem before?
>>
>> Since macsec need the "bandaid" fix either way. How about you post the
>> "bandaid" fix to net. And I include the "proper fix" in this series for
>> net-next?
>
> But I don't think it's needed in net. Am I missing a codepath (before
> your series) where macsec_fill_info could be called for the new device
> before macsec_newlink returns? If not, it doesn't really qualify as a
> fix, that's why I was asking Paolo.
FWIW, I don't see a codepath calling into rtmsg_ifinfo_build_skb()
before device initialization, so I would be fine targeting net-next with
the EMSGSIZE-related change.
Side note, it looks like that the WARN() in the rnnetlink code here
helped identifying a real problem and correctly returning 0 when the
key_len is not yet initialized will silence it forever, what about
preserving a warning for this kind of race? something alike:
---
diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index f6cad0746a02..82974d4fa3f6 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -4337,7 +4337,8 @@ static int macsec_fill_info(struct sk_buff *skb,
csid = secy->xpn ? MACSEC_CIPHER_ID_GCM_AES_XPN_256 :
MACSEC_CIPHER_ID_GCM_AES_256;
break;
default:
- goto nla_put_failure;
+ WARN_ON_ONCE(1);
+ return 0;
}
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [syzbot ci] Re: net: move netdev_compute_master_upper_features to ndo_set_features
2026-03-12 16:47 ` Paolo Abeni
@ 2026-03-12 17:07 ` Sabrina Dubroca
0 siblings, 0 replies; 15+ messages in thread
From: Sabrina Dubroca @ 2026-03-12 17:07 UTC (permalink / raw)
To: Paolo Abeni
Cc: Hangbin Liu, syzbot ci, andrew, bridge, davem, edumazet, horms,
idosch, jiri, jv, kuba, linux-kernel, netdev, razor,
sridhar.samudrala, syzbot, syzkaller-bugs
2026-03-12, 17:47:45 +0100, Paolo Abeni wrote:
>
>
> On 3/12/26 4:58 PM, Sabrina Dubroca wrote:
> > 2026-03-12, 14:34:44 +0000, Hangbin Liu wrote:
> >> On Thu, Mar 12, 2026 at 12:13:52PM +0100, Sabrina Dubroca wrote:
> >>> Proper fix (so that the notification we're sending during
> >>> upper_dev_link has full linkinfo) would be to move
> >>> netdev_upper_dev_link() to after macsec_changelink_common() and fix up
> >>> the error handling. I don't see anything in macsec_add_dev or
> >>> macsec_changelink_common that needs the device to be linked. But
> >>
> >> If we move the netdev_upper_dev_link() after macsec_changelink_common(),
> >> we will not goto nla_put_failure via default, right?
> >
> > Yes.
> >
> >>> anyway it doesn't make sense for macsec_fill_info to return -EMSGSIZE
> >>> on invalid data, so the "bandaid" should be included as well.
> >>>
> >>> Should this be part of this series (either just the "bandaid" or the
> >>> "proper fix"+bandaid), since we never saw a problem before?
> >>
> >> Since macsec need the "bandaid" fix either way. How about you post the
> >> "bandaid" fix to net. And I include the "proper fix" in this series for
> >> net-next?
> >
> > But I don't think it's needed in net. Am I missing a codepath (before
> > your series) where macsec_fill_info could be called for the new device
> > before macsec_newlink returns? If not, it doesn't really qualify as a
> > fix, that's why I was asking Paolo.
>
> FWIW, I don't see a codepath calling into rtmsg_ifinfo_build_skb()
> before device initialization, so I would be fine targeting net-next with
> the EMSGSIZE-related change.
>
> Side note, it looks like that the WARN() in the rnnetlink code here
> helped identifying a real problem and correctly returning 0 when the
> key_len is not yet initialized will silence it forever, what about
> preserving a warning for this kind of race? something alike:
Right, I was thinking about putting a DEBUG_NET_WARN_ON_ONCE there.
--
Sabrina
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2026-03-12 17:07 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-10 7:45 [PATCH net-next 0/3] net: move netdev_compute_master_upper_features to ndo_set_features Hangbin Liu
2026-03-10 7:45 ` [PATCH net-next 1/3] net: use ndo_set_features to set offload features for bonding/bridge/team Hangbin Liu
2026-03-10 7:45 ` [PATCH net-next 2/3] failover: use ndo_set_features for failover offload compute Hangbin Liu
2026-03-10 7:45 ` [PATCH net-next 3/3] net: no need to disable LRO specifically Hangbin Liu
2026-03-10 17:02 ` [syzbot ci] Re: net: move netdev_compute_master_upper_features to ndo_set_features syzbot ci
2026-03-10 19:17 ` Sabrina Dubroca
2026-03-11 0:47 ` Hangbin Liu
2026-03-11 21:18 ` Sabrina Dubroca
2026-03-12 9:40 ` Paolo Abeni
2026-03-12 11:13 ` Sabrina Dubroca
2026-03-12 14:34 ` Hangbin Liu
2026-03-12 15:58 ` Sabrina Dubroca
2026-03-12 16:47 ` Paolo Abeni
2026-03-12 17:07 ` Sabrina Dubroca
2026-03-12 9:46 ` [PATCH net-next 0/3] " Paolo Abeni
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox