* [PATCH net-next 0/9] net: hold instance lock during NETDEV_UP/REGISTER/UNREGISTER
@ 2025-03-25 21:30 Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 1/9] net: switch to netif_disable_lro in inetdev_init Stanislav Fomichev
` (8 more replies)
0 siblings, 9 replies; 21+ messages in thread
From: Stanislav Fomichev @ 2025-03-25 21:30 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, kuba, pabeni
Solving the issue reported by Cosmin in [0] requires consistent
lock during NETDEV_UP/REGISTER/UNREGISTER notifiers. This series
addresses that (along with some other fixes in net/ipv4/devinet.c
and net/ipv6/addrconf.c) and appends the patches from Jakub
that were conditional on locked NETDEV_UNREGISTER.
0: https://lore.kernel.org/netdev/700fa36b94cbd57cfea2622029b087643c80cbc9.camel@nvidia.com/
Jakub Kicinski (3):
net: designate XSK pool pointers in queues as "ops protected"
netdev: add "ops compat locking" helpers
netdev: don't hold rtnl_lock over nl queue info get when possible
Stanislav Fomichev (6):
net: switch to netif_disable_lro in inetdev_init
net: hold instance lock during NETDEV_REGISTER/UP/UNREGISTER
net: use netif_disable_lro in ipv6_add_dev
net: dummy: request ops lock
net: release instance lock during NETDEV_UNREGISTER for bond/team
docs: net: document netdev notifier expectations
Documentation/networking/netdevices.rst | 18 ++++
drivers/net/bonding/bond_main.c | 2 +
drivers/net/dummy.c | 1 +
drivers/net/team/team_core.c | 2 +
include/linux/netdevice.h | 2 +
include/net/netdev_lock.h | 16 ++++
include/net/netdev_rx_queue.h | 6 +-
net/core/dev.c | 111 ++++++++++++++++++++----
net/core/dev.h | 16 +++-
net/core/netdev-genl.c | 18 ++--
net/ipv4/devinet.c | 2 +-
net/ipv6/addrconf.c | 17 +++-
net/xdp/xsk_buff_pool.c | 7 +-
13 files changed, 178 insertions(+), 40 deletions(-)
--
2.48.1
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH net-next 1/9] net: switch to netif_disable_lro in inetdev_init
2025-03-25 21:30 [PATCH net-next 0/9] net: hold instance lock during NETDEV_UP/REGISTER/UNREGISTER Stanislav Fomichev
@ 2025-03-25 21:30 ` Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 2/9] net: hold instance lock during NETDEV_REGISTER/UP/UNREGISTER Stanislav Fomichev
` (7 subsequent siblings)
8 siblings, 0 replies; 21+ messages in thread
From: Stanislav Fomichev @ 2025-03-25 21:30 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, kuba, pabeni, Cosmin Ratiu
Cosmin reports the following deadlock:
dump_stack_lvl+0x62/0x90
print_deadlock_bug+0x274/0x3b0
__lock_acquire+0x1229/0x2470
lock_acquire+0xb7/0x2b0
__mutex_lock+0xa6/0xd20
dev_disable_lro+0x20/0x80
inetdev_init+0x12f/0x1f0
inetdev_event+0x48b/0x870
notifier_call_chain+0x38/0xf0
netif_change_net_namespace+0x72e/0x9f0
do_setlink.isra.0+0xd5/0x1220
rtnl_newlink+0x7ea/0xb50
rtnetlink_rcv_msg+0x459/0x5e0
netlink_rcv_skb+0x54/0x100
netlink_unicast+0x193/0x270
netlink_sendmsg+0x204/0x450
Switch to netif_disable_lro which assumes the caller holds the instance
lock. inetdev_init is called for blackhole device (which sw device and
doesn't grab instance lock) and from REGISTER/UNREGISTER notifiers.
We already hold the instance lock for REGISTER notifier during
netns change and we'll soon hold the lock during other paths.
Reported-by: Cosmin Ratiu <cratiu@nvidia.com>
Fixes: ad7c7b2172c3 ("net: hold netdev instance lock during sysfs operations")
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
net/core/dev.c | 1 +
net/ipv4/devinet.c | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index b597cc27a115..8df428fc6924 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1771,6 +1771,7 @@ void netif_disable_lro(struct net_device *dev)
netdev_unlock_ops(lower_dev);
}
}
+EXPORT_SYMBOL(netif_disable_lro);
/**
* dev_disable_gro_hw - disable HW Generic Receive Offload on a device
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 754f60fb6e25..77e5705ac799 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -281,7 +281,7 @@ static struct in_device *inetdev_init(struct net_device *dev)
if (!in_dev->arp_parms)
goto out_kfree;
if (IPV4_DEVCONF(in_dev->cnf, FORWARDING))
- dev_disable_lro(dev);
+ netif_disable_lro(dev);
/* Reference in_dev->dev */
netdev_hold(dev, &in_dev->dev_tracker, GFP_KERNEL);
/* Account for reference dev->ip_ptr (below) */
--
2.48.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH net-next 2/9] net: hold instance lock during NETDEV_REGISTER/UP/UNREGISTER
2025-03-25 21:30 [PATCH net-next 0/9] net: hold instance lock during NETDEV_UP/REGISTER/UNREGISTER Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 1/9] net: switch to netif_disable_lro in inetdev_init Stanislav Fomichev
@ 2025-03-25 21:30 ` Stanislav Fomichev
2025-03-26 15:03 ` Cosmin Ratiu
2025-03-25 21:30 ` [PATCH net-next 3/9] net: use netif_disable_lro in ipv6_add_dev Stanislav Fomichev
` (6 subsequent siblings)
8 siblings, 1 reply; 21+ messages in thread
From: Stanislav Fomichev @ 2025-03-25 21:30 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, kuba, pabeni, Cosmin Ratiu
Callers of inetdev_init can come from several places with inconsistent
expectation about netdev instance lock. Grab instance lock during
REGISTER (plus UP) and UNREGISTER netdev notifiers.
Take extra care in the path that re-registers the notifiers during
net namespace move (all the dance with extra 'lock' argument).
WARNING: CPU: 10 PID: 1479 at ./include/net/netdev_lock.h:54
__netdev_update_features+0x65f/0xca0
__warn+0x81/0x180
__netdev_update_features+0x65f/0xca0
report_bug+0x156/0x180
handle_bug+0x4f/0x90
exc_invalid_op+0x13/0x60
asm_exc_invalid_op+0x16/0x20
__netdev_update_features+0x65f/0xca0
netif_disable_lro+0x30/0x1d0
inetdev_init+0x12f/0x1f0
inetdev_event+0x48b/0x870
notifier_call_chain+0x38/0xf0
register_netdevice+0x741/0x8b0
register_netdev+0x1f/0x40
mlx5e_probe+0x4e3/0x8e0 [mlx5_core]
auxiliary_bus_probe+0x3f/0x90
really_probe+0xc3/0x3a0
__driver_probe_device+0x80/0x150
driver_probe_device+0x1f/0x90
__device_attach_driver+0x7d/0x100
bus_for_each_drv+0x80/0xd0
__device_attach+0xb4/0x1c0
bus_probe_device+0x91/0xa0
device_add+0x657/0x870
Reported-by: Cosmin Ratiu <cratiu@nvidia.com>
Fixes: ad7c7b2172c3 ("net: hold netdev instance lock during sysfs operations")
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
net/core/dev.c | 61 ++++++++++++++++++++++++++++++++++----------------
1 file changed, 42 insertions(+), 19 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index 8df428fc6924..bbcf302b53a8 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1829,6 +1829,8 @@ static int call_netdevice_register_notifiers(struct notifier_block *nb,
{
int err;
+ netdev_ops_assert_locked(dev);
+
err = call_netdevice_notifier(nb, NETDEV_REGISTER, dev);
err = notifier_to_errno(err);
if (err)
@@ -1842,24 +1844,34 @@ static int call_netdevice_register_notifiers(struct notifier_block *nb,
}
static void call_netdevice_unregister_notifiers(struct notifier_block *nb,
- struct net_device *dev)
+ struct net_device *dev,
+ bool lock)
{
if (dev->flags & IFF_UP) {
call_netdevice_notifier(nb, NETDEV_GOING_DOWN,
dev);
call_netdevice_notifier(nb, NETDEV_DOWN, dev);
}
+ if (lock)
+ netdev_lock_ops(dev);
call_netdevice_notifier(nb, NETDEV_UNREGISTER, dev);
+ if (lock)
+ netdev_unlock_ops(dev);
}
static int call_netdevice_register_net_notifiers(struct notifier_block *nb,
- struct net *net)
+ struct net *net,
+ bool lock)
{
struct net_device *dev;
int err;
for_each_netdev(net, dev) {
+ if (lock)
+ netdev_lock_ops(dev);
err = call_netdevice_register_notifiers(nb, dev);
+ if (lock)
+ netdev_unlock_ops(dev);
if (err)
goto rollback;
}
@@ -1867,17 +1879,18 @@ static int call_netdevice_register_net_notifiers(struct notifier_block *nb,
rollback:
for_each_netdev_continue_reverse(net, dev)
- call_netdevice_unregister_notifiers(nb, dev);
+ call_netdevice_unregister_notifiers(nb, dev, lock);
return err;
}
static void call_netdevice_unregister_net_notifiers(struct notifier_block *nb,
- struct net *net)
+ struct net *net,
+ bool lock)
{
struct net_device *dev;
for_each_netdev(net, dev)
- call_netdevice_unregister_notifiers(nb, dev);
+ call_netdevice_unregister_notifiers(nb, dev, lock);
}
static int dev_boot_phase = 1;
@@ -1914,7 +1927,7 @@ int register_netdevice_notifier(struct notifier_block *nb)
goto unlock;
for_each_net(net) {
__rtnl_net_lock(net);
- err = call_netdevice_register_net_notifiers(nb, net);
+ err = call_netdevice_register_net_notifiers(nb, net, true);
__rtnl_net_unlock(net);
if (err)
goto rollback;
@@ -1928,7 +1941,7 @@ int register_netdevice_notifier(struct notifier_block *nb)
rollback:
for_each_net_continue_reverse(net) {
__rtnl_net_lock(net);
- call_netdevice_unregister_net_notifiers(nb, net);
+ call_netdevice_unregister_net_notifiers(nb, net, true);
__rtnl_net_unlock(net);
}
@@ -1965,7 +1978,7 @@ int unregister_netdevice_notifier(struct notifier_block *nb)
for_each_net(net) {
__rtnl_net_lock(net);
- call_netdevice_unregister_net_notifiers(nb, net);
+ call_netdevice_unregister_net_notifiers(nb, net, true);
__rtnl_net_unlock(net);
}
@@ -1978,7 +1991,8 @@ EXPORT_SYMBOL(unregister_netdevice_notifier);
static int __register_netdevice_notifier_net(struct net *net,
struct notifier_block *nb,
- bool ignore_call_fail)
+ bool ignore_call_fail,
+ bool lock)
{
int err;
@@ -1988,7 +2002,7 @@ static int __register_netdevice_notifier_net(struct net *net,
if (dev_boot_phase)
return 0;
- err = call_netdevice_register_net_notifiers(nb, net);
+ err = call_netdevice_register_net_notifiers(nb, net, lock);
if (err && !ignore_call_fail)
goto chain_unregister;
@@ -2000,7 +2014,8 @@ static int __register_netdevice_notifier_net(struct net *net,
}
static int __unregister_netdevice_notifier_net(struct net *net,
- struct notifier_block *nb)
+ struct notifier_block *nb,
+ bool lock)
{
int err;
@@ -2008,7 +2023,7 @@ static int __unregister_netdevice_notifier_net(struct net *net,
if (err)
return err;
- call_netdevice_unregister_net_notifiers(nb, net);
+ call_netdevice_unregister_net_notifiers(nb, net, lock);
return 0;
}
@@ -2032,7 +2047,7 @@ int register_netdevice_notifier_net(struct net *net, struct notifier_block *nb)
int err;
rtnl_net_lock(net);
- err = __register_netdevice_notifier_net(net, nb, false);
+ err = __register_netdevice_notifier_net(net, nb, false, true);
rtnl_net_unlock(net);
return err;
@@ -2061,7 +2076,7 @@ int unregister_netdevice_notifier_net(struct net *net,
int err;
rtnl_net_lock(net);
- err = __unregister_netdevice_notifier_net(net, nb);
+ err = __unregister_netdevice_notifier_net(net, nb, true);
rtnl_net_unlock(net);
return err;
@@ -2072,8 +2087,8 @@ static void __move_netdevice_notifier_net(struct net *src_net,
struct net *dst_net,
struct notifier_block *nb)
{
- __unregister_netdevice_notifier_net(src_net, nb);
- __register_netdevice_notifier_net(dst_net, nb, true);
+ __unregister_netdevice_notifier_net(src_net, nb, false);
+ __register_netdevice_notifier_net(dst_net, nb, true, false);
}
static void rtnl_net_dev_lock(struct net_device *dev)
@@ -2119,7 +2134,7 @@ int register_netdevice_notifier_dev_net(struct net_device *dev,
int err;
rtnl_net_dev_lock(dev);
- err = __register_netdevice_notifier_net(dev_net(dev), nb, false);
+ err = __register_netdevice_notifier_net(dev_net(dev), nb, false, true);
if (!err) {
nn->nb = nb;
list_add(&nn->list, &dev->net_notifier_list);
@@ -2138,7 +2153,7 @@ int unregister_netdevice_notifier_dev_net(struct net_device *dev,
rtnl_net_dev_lock(dev);
list_del(&nn->list);
- err = __unregister_netdevice_notifier_net(dev_net(dev), nb);
+ err = __unregister_netdevice_notifier_net(dev_net(dev), nb, true);
rtnl_net_dev_unlock(dev);
return err;
@@ -11047,7 +11062,9 @@ int register_netdevice(struct net_device *dev)
memcpy(dev->perm_addr, dev->dev_addr, dev->addr_len);
/* Notify protocols, that a new device appeared. */
+ netdev_lock_ops(dev);
ret = call_netdevice_notifiers(NETDEV_REGISTER, dev);
+ netdev_unlock_ops(dev);
ret = notifier_to_errno(ret);
if (ret) {
/* Expect explicit free_netdev() on failure */
@@ -11180,8 +11197,11 @@ static struct net_device *netdev_wait_allrefs_any(struct list_head *list)
rtnl_lock();
/* Rebroadcast unregister notification */
- list_for_each_entry(dev, list, todo_list)
+ list_for_each_entry(dev, list, todo_list) {
+ netdev_lock_ops(dev);
call_netdevice_notifiers(NETDEV_UNREGISTER, dev);
+ netdev_unlock_ops(dev);
+ }
__rtnl_unlock();
rcu_barrier();
@@ -11972,7 +11992,9 @@ void unregister_netdevice_many_notify(struct list_head *head,
/* Notify protocols, that we are about to destroy
* this device. They should clean all the things.
*/
+ netdev_lock_ops(dev);
call_netdevice_notifiers(NETDEV_UNREGISTER, dev);
+ netdev_unlock_ops(dev);
if (!dev->rtnl_link_ops ||
dev->rtnl_link_state == RTNL_LINK_INITIALIZED)
@@ -12069,6 +12091,7 @@ int netif_change_net_namespace(struct net_device *dev, struct net *net,
int err, new_nsid;
ASSERT_RTNL();
+ netdev_ops_assert_locked(dev);
/* Don't allow namespace local devices to be moved. */
err = -EINVAL;
--
2.48.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH net-next 3/9] net: use netif_disable_lro in ipv6_add_dev
2025-03-25 21:30 [PATCH net-next 0/9] net: hold instance lock during NETDEV_UP/REGISTER/UNREGISTER Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 1/9] net: switch to netif_disable_lro in inetdev_init Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 2/9] net: hold instance lock during NETDEV_REGISTER/UP/UNREGISTER Stanislav Fomichev
@ 2025-03-25 21:30 ` Stanislav Fomichev
2025-03-26 7:33 ` kernel test robot
2025-03-25 21:30 ` [PATCH net-next 4/9] net: dummy: request ops lock Stanislav Fomichev
` (5 subsequent siblings)
8 siblings, 1 reply; 21+ messages in thread
From: Stanislav Fomichev @ 2025-03-25 21:30 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, kuba, pabeni, Cosmin Ratiu
ipv6_add_dev might call dev_disable_lro which unconditionally grabs
instance lock, so it will deadlock during NETDEV_REGISTER. Switch
to netif_disable_lro.
Make sure all callers hold the instance lock as well.
Cc: Cosmin Ratiu <cratiu@nvidia.com>
Fixes: ad7c7b2172c3 ("net: hold netdev instance lock during sysfs operations")
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
include/linux/netdevice.h | 1 +
net/core/dev.h | 1 -
net/ipv6/addrconf.c | 17 +++++++++++++----
3 files changed, 14 insertions(+), 5 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index fa79145518d1..b2b4e31806d5 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3386,6 +3386,7 @@ struct net_device *dev_get_by_index(struct net *net, int ifindex);
struct net_device *__dev_get_by_index(struct net *net, int ifindex);
struct net_device *netdev_get_by_index(struct net *net, int ifindex,
netdevice_tracker *tracker, gfp_t gfp);
+struct net_device *netdev_get_by_index_lock(struct net *net, int ifindex);
struct net_device *netdev_get_by_name(struct net *net, const char *name,
netdevice_tracker *tracker, gfp_t gfp);
struct net_device *dev_get_by_index_rcu(struct net *net, int ifindex);
diff --git a/net/core/dev.h b/net/core/dev.h
index 7ee203395d8e..8d35860f2e89 100644
--- a/net/core/dev.h
+++ b/net/core/dev.h
@@ -28,7 +28,6 @@ struct napi_struct *
netdev_napi_by_id_lock(struct net *net, unsigned int napi_id);
struct net_device *dev_get_by_napi_id(unsigned int napi_id);
-struct net_device *netdev_get_by_index_lock(struct net *net, int ifindex);
struct net_device *__netdev_put_lock(struct net_device *dev);
struct net_device *
netdev_xa_find_lock(struct net *net, struct net_device *dev,
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index ac8cc1076536..665184d2adce 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -80,6 +80,7 @@
#include <net/netlink.h>
#include <net/pkt_sched.h>
#include <net/l3mdev.h>
+#include <net/netdev_lock.h>
#include <linux/if_tunnel.h>
#include <linux/rtnetlink.h>
#include <linux/netconf.h>
@@ -377,6 +378,7 @@ static struct inet6_dev *ipv6_add_dev(struct net_device *dev)
int err = -ENOMEM;
ASSERT_RTNL();
+ netdev_ops_assert_locked(dev);
if (dev->mtu < IPV6_MIN_MTU && dev != blackhole_netdev)
return ERR_PTR(-EINVAL);
@@ -402,7 +404,7 @@ static struct inet6_dev *ipv6_add_dev(struct net_device *dev)
return ERR_PTR(err);
}
if (ndev->cnf.forwarding)
- dev_disable_lro(dev);
+ netif_disable_lro(dev);
/* We refer to the device */
netdev_hold(dev, &ndev->dev_tracker, GFP_KERNEL);
@@ -3151,11 +3153,12 @@ int addrconf_add_ifaddr(struct net *net, void __user *arg)
cfg.plen = ireq.ifr6_prefixlen;
rtnl_net_lock(net);
- dev = __dev_get_by_index(net, ireq.ifr6_ifindex);
+ dev = netdev_get_by_index_lock(net, ireq.ifr6_ifindex);
if (dev)
err = inet6_addr_add(net, dev, &cfg, 0, 0, NULL);
else
err = -ENODEV;
+ netdev_unlock(dev);
rtnl_net_unlock(net);
return err;
}
@@ -5022,11 +5025,11 @@ inet6_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh,
rtnl_net_lock(net);
- dev = __dev_get_by_index(net, ifm->ifa_index);
+ dev = netdev_get_by_index_lock(net, ifm->ifa_index);
if (!dev) {
NL_SET_ERR_MSG_MOD(extack, "Unable to find the interface");
err = -ENODEV;
- goto unlock;
+ goto unlock_rtnl;
}
idev = ipv6_find_idev(dev);
@@ -5065,6 +5068,8 @@ inet6_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh,
in6_ifa_put(ifa);
unlock:
+ netdev_unlock(dev);
+unlock_rtnl:
rtnl_net_unlock(net);
return err;
@@ -6503,7 +6508,9 @@ static int addrconf_sysctl_addr_gen_mode(const struct ctl_table *ctl, int write,
if (idev->cnf.addr_gen_mode != new_val) {
WRITE_ONCE(idev->cnf.addr_gen_mode, new_val);
+ netdev_lock_ops(idev->dev);
addrconf_init_auto_addrs(idev->dev);
+ netdev_unlock_ops(idev->dev);
}
} else if (&net->ipv6.devconf_all->addr_gen_mode == ctl->data) {
struct net_device *dev;
@@ -6515,7 +6522,9 @@ static int addrconf_sysctl_addr_gen_mode(const struct ctl_table *ctl, int write,
idev->cnf.addr_gen_mode != new_val) {
WRITE_ONCE(idev->cnf.addr_gen_mode,
new_val);
+ netdev_lock_ops(idev->dev);
addrconf_init_auto_addrs(idev->dev);
+ netdev_unlock_ops(idev->dev);
}
}
}
--
2.48.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH net-next 4/9] net: dummy: request ops lock
2025-03-25 21:30 [PATCH net-next 0/9] net: hold instance lock during NETDEV_UP/REGISTER/UNREGISTER Stanislav Fomichev
` (2 preceding siblings ...)
2025-03-25 21:30 ` [PATCH net-next 3/9] net: use netif_disable_lro in ipv6_add_dev Stanislav Fomichev
@ 2025-03-25 21:30 ` Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 5/9] net: release instance lock during NETDEV_UNREGISTER for bond/team Stanislav Fomichev
` (4 subsequent siblings)
8 siblings, 0 replies; 21+ messages in thread
From: Stanislav Fomichev @ 2025-03-25 21:30 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, kuba, pabeni
Even though dummy device doesn't really need an instance lock,
a lot of selftests use dummy so it's useful to have extra
expose to the instance lock on NIPA. Request the instance/ops
locking.
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
drivers/net/dummy.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index a4938c6a5ebb..d6bdad4baadd 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -105,6 +105,7 @@ static void dummy_setup(struct net_device *dev)
dev->netdev_ops = &dummy_netdev_ops;
dev->ethtool_ops = &dummy_ethtool_ops;
dev->needs_free_netdev = true;
+ dev->request_ops_lock = true;
/* Fill in device structure with ethernet-generic values. */
dev->flags |= IFF_NOARP;
--
2.48.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH net-next 5/9] net: release instance lock during NETDEV_UNREGISTER for bond/team
2025-03-25 21:30 [PATCH net-next 0/9] net: hold instance lock during NETDEV_UP/REGISTER/UNREGISTER Stanislav Fomichev
` (3 preceding siblings ...)
2025-03-25 21:30 ` [PATCH net-next 4/9] net: dummy: request ops lock Stanislav Fomichev
@ 2025-03-25 21:30 ` Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 6/9] docs: net: document netdev notifier expectations Stanislav Fomichev
` (3 subsequent siblings)
8 siblings, 0 replies; 21+ messages in thread
From: Stanislav Fomichev @ 2025-03-25 21:30 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, kuba, pabeni, Taehee Yoo
Running NETDEV_UNREGISTER under instance lock might be problematic for
teaming/bonding [0] because they take their own lock and the ordering
is reverse in the notifiers path. Release the instance lock in the notifiers
and let the existing code paths take the lock in the correct
order.
0: https://lore.kernel.org/netdev/CAMArcTW+5Lk0EWCaHOsUhf+p31S8yAZyQvi3C8zeRF3TxnC9Fg@mail.gmail.com/
Reported-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
drivers/net/bonding/bond_main.c | 2 ++
drivers/net/team/team_core.c | 2 ++
2 files changed, 4 insertions(+)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index e7f576d52311..7a1b160c5f23 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4022,10 +4022,12 @@ static int bond_slave_netdev_event(unsigned long event,
switch (event) {
case NETDEV_UNREGISTER:
+ netdev_unlock_ops(slave_dev);
if (bond_dev->type != ARPHRD_ETHER)
bond_release_and_destroy(bond_dev, slave_dev);
else
__bond_release_one(bond_dev, slave_dev, false, true);
+ netdev_lock_ops(slave_dev);
break;
case NETDEV_UP:
case NETDEV_CHANGE:
diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index d8fc0c79745d..4a1815f50015 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -2997,7 +2997,9 @@ static int team_device_event(struct notifier_block *unused,
!!netif_oper_up(port->dev));
break;
case NETDEV_UNREGISTER:
+ netdev_unlock_ops(dev);
team_del_slave(port->team->dev, dev);
+ netdev_lock_ops(dev);
break;
case NETDEV_FEAT_CHANGE:
if (!port->team->notifier_ctx) {
--
2.48.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH net-next 6/9] docs: net: document netdev notifier expectations
2025-03-25 21:30 [PATCH net-next 0/9] net: hold instance lock during NETDEV_UP/REGISTER/UNREGISTER Stanislav Fomichev
` (4 preceding siblings ...)
2025-03-25 21:30 ` [PATCH net-next 5/9] net: release instance lock during NETDEV_UNREGISTER for bond/team Stanislav Fomichev
@ 2025-03-25 21:30 ` Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 7/9] net: designate XSK pool pointers in queues as "ops protected" Stanislav Fomichev
` (2 subsequent siblings)
8 siblings, 0 replies; 21+ messages in thread
From: Stanislav Fomichev @ 2025-03-25 21:30 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, kuba, pabeni
We don't have a consistent state yet, but document where we think
we are and where we wanna be.
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
Documentation/networking/netdevices.rst | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst
index ebb868f50ac2..89337cfec36e 100644
--- a/Documentation/networking/netdevices.rst
+++ b/Documentation/networking/netdevices.rst
@@ -343,6 +343,24 @@ there are two sets of interfaces: ``dev_xxx`` and ``netif_xxx`` (e.g.,
acquiring the instance lock themselves, while the ``netif_xxx`` functions
assume that the driver has already acquired the instance lock.
+Notifiers and netdev instance lock
+==================================
+
+For device drivers that implement shaping or queue management APIs,
+some of the notifiers (``enum netdev_cmd``) are running under the netdev
+instance lock.
+
+Currently only the following notifiers are running under the instance lock:
+* ``NETDEV_REGISTER``
+* ``NETDEV_UP``
+* ``NETDEV_UNREGISTER``
+
+There are no clear expectations for the remaining notifiers. Notifiers not on
+the list may run with or without the instance lock, potentially even invoking
+the same notifier type with and without the lock from different code paths.
+The goal is to eventually ensure that all (or most, with a few documented
+exceptions) notifiers run under the instance lock.
+
NETDEV_INTERNAL symbol namespace
================================
--
2.48.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH net-next 7/9] net: designate XSK pool pointers in queues as "ops protected"
2025-03-25 21:30 [PATCH net-next 0/9] net: hold instance lock during NETDEV_UP/REGISTER/UNREGISTER Stanislav Fomichev
` (5 preceding siblings ...)
2025-03-25 21:30 ` [PATCH net-next 6/9] docs: net: document netdev notifier expectations Stanislav Fomichev
@ 2025-03-25 21:30 ` Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 8/9] netdev: add "ops compat locking" helpers Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 9/9] netdev: don't hold rtnl_lock over nl queue info get when possible Stanislav Fomichev
8 siblings, 0 replies; 21+ messages in thread
From: Stanislav Fomichev @ 2025-03-25 21:30 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, kuba, pabeni
From: Jakub Kicinski <kuba@kernel.org>
Read accesses go via xsk_get_pool_from_qid(), the call coming
from the core and gve look safe (other "ops locked" drivers
don't support XSK).
Write accesses go via xsk_reg_pool_at_qid() and xsk_clear_pool_at_qid().
Former is already under the ops lock, latter needs to be locked when
coming from the workqueue via xp_clear_dev().
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
include/linux/netdevice.h | 1 +
include/net/netdev_rx_queue.h | 6 +++---
net/xdp/xsk_buff_pool.c | 7 ++++++-
3 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index b2b4e31806d5..b3d1c1922ec0 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -688,6 +688,7 @@ struct netdev_queue {
/* Subordinate device that the queue has been assigned to */
struct net_device *sb_dev;
#ifdef CONFIG_XDP_SOCKETS
+ /* "ops protected", see comment about net_device::lock */
struct xsk_buff_pool *pool;
#endif
diff --git a/include/net/netdev_rx_queue.h b/include/net/netdev_rx_queue.h
index b2238b551dce..8cdcd138b33f 100644
--- a/include/net/netdev_rx_queue.h
+++ b/include/net/netdev_rx_queue.h
@@ -20,12 +20,12 @@ struct netdev_rx_queue {
struct net_device *dev;
netdevice_tracker dev_tracker;
+ /* All fields below are "ops protected",
+ * see comment about net_device::lock
+ */
#ifdef CONFIG_XDP_SOCKETS
struct xsk_buff_pool *pool;
#endif
- /* NAPI instance for the queue
- * "ops protected", see comment about net_device::lock
- */
struct napi_struct *napi;
struct pp_memory_provider_params mp_params;
} ____cacheline_aligned_in_smp;
diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
index 25a76c5ce0f1..c7e50fd86c6a 100644
--- a/net/xdp/xsk_buff_pool.c
+++ b/net/xdp/xsk_buff_pool.c
@@ -279,9 +279,14 @@ static void xp_release_deferred(struct work_struct *work)
{
struct xsk_buff_pool *pool = container_of(work, struct xsk_buff_pool,
work);
+ struct net_device *netdev = pool->netdev;
rtnl_lock();
- xp_clear_dev(pool);
+ if (netdev) {
+ netdev_lock_ops(netdev);
+ xp_clear_dev(pool);
+ netdev_unlock_ops(netdev);
+ }
rtnl_unlock();
if (pool->fq) {
--
2.48.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH net-next 8/9] netdev: add "ops compat locking" helpers
2025-03-25 21:30 [PATCH net-next 0/9] net: hold instance lock during NETDEV_UP/REGISTER/UNREGISTER Stanislav Fomichev
` (6 preceding siblings ...)
2025-03-25 21:30 ` [PATCH net-next 7/9] net: designate XSK pool pointers in queues as "ops protected" Stanislav Fomichev
@ 2025-03-25 21:30 ` Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 9/9] netdev: don't hold rtnl_lock over nl queue info get when possible Stanislav Fomichev
8 siblings, 0 replies; 21+ messages in thread
From: Stanislav Fomichev @ 2025-03-25 21:30 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, kuba, pabeni
From: Jakub Kicinski <kuba@kernel.org>
Add helpers to "lock a netdev in a backward-compatible way",
which for ops-locked netdevs will mean take the instance lock.
For drivers which haven't opted into the ops locking we'll take
rtnl_lock.
The scoped foreach is dropping and re-taking the lock for each
device, even if prev and next are both under rtnl_lock.
I hope that's fine since we expect that netdev nl to be mostly
supported by modern drivers, and modern drivers should also
opt into the instance locking.
Note that these helpers are mostly needed for queue related state,
because drivers modify queue config in their ops in a non-atomic
way. Or differently put, queue changes don't have a clear-cut API
like NAPI configuration. Any state that can should just use the
instance lock directly, not the "compat" hacks.
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
include/net/netdev_lock.h | 16 +++++++++++++
net/core/dev.c | 49 +++++++++++++++++++++++++++++++++++++++
net/core/dev.h | 15 ++++++++++++
3 files changed, 80 insertions(+)
diff --git a/include/net/netdev_lock.h b/include/net/netdev_lock.h
index 1c0c9a94cc22..76cbf5a449b6 100644
--- a/include/net/netdev_lock.h
+++ b/include/net/netdev_lock.h
@@ -64,6 +64,22 @@ netdev_ops_assert_locked_or_invisible(const struct net_device *dev)
netdev_ops_assert_locked(dev);
}
+static inline void netdev_lock_ops_compat(struct net_device *dev)
+{
+ if (netdev_need_ops_lock(dev))
+ netdev_lock(dev);
+ else
+ rtnl_lock();
+}
+
+static inline void netdev_unlock_ops_compat(struct net_device *dev)
+{
+ if (netdev_need_ops_lock(dev))
+ netdev_unlock(dev);
+ else
+ rtnl_unlock();
+}
+
static inline int netdev_lock_cmp_fn(const struct lockdep_map *a,
const struct lockdep_map *b)
{
diff --git a/net/core/dev.c b/net/core/dev.c
index bbcf302b53a8..3589cd4471c8 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1051,6 +1051,18 @@ struct net_device *__netdev_put_lock(struct net_device *dev)
return dev;
}
+static struct net_device *__netdev_put_lock_ops_compat(struct net_device *dev)
+{
+ netdev_lock_ops_compat(dev);
+ if (dev->reg_state > NETREG_REGISTERED) {
+ netdev_unlock_ops_compat(dev);
+ dev_put(dev);
+ return NULL;
+ }
+ dev_put(dev);
+ return dev;
+}
+
/**
* netdev_get_by_index_lock() - find a device by its ifindex
* @net: the applicable net namespace
@@ -1073,6 +1085,18 @@ struct net_device *netdev_get_by_index_lock(struct net *net, int ifindex)
return __netdev_put_lock(dev);
}
+struct net_device *
+netdev_get_by_index_lock_ops_compat(struct net *net, int ifindex)
+{
+ struct net_device *dev;
+
+ dev = dev_get_by_index(net, ifindex);
+ if (!dev)
+ return NULL;
+
+ return __netdev_put_lock_ops_compat(dev);
+}
+
struct net_device *
netdev_xa_find_lock(struct net *net, struct net_device *dev,
unsigned long *index)
@@ -1098,6 +1122,31 @@ netdev_xa_find_lock(struct net *net, struct net_device *dev,
} while (true);
}
+struct net_device *
+netdev_xa_find_lock_ops_compat(struct net *net, struct net_device *dev,
+ unsigned long *index)
+{
+ if (dev)
+ netdev_unlock_ops_compat(dev);
+
+ do {
+ rcu_read_lock();
+ dev = xa_find(&net->dev_by_index, index, ULONG_MAX, XA_PRESENT);
+ if (!dev) {
+ rcu_read_unlock();
+ return NULL;
+ }
+ dev_hold(dev);
+ rcu_read_unlock();
+
+ dev = __netdev_put_lock_ops_compat(dev);
+ if (dev)
+ return dev;
+
+ (*index)++;
+ } while (true);
+}
+
static DEFINE_SEQLOCK(netdev_rename_lock);
void netdev_copy_name(struct net_device *dev, char *name)
diff --git a/net/core/dev.h b/net/core/dev.h
index 8d35860f2e89..e7446b25bcde 100644
--- a/net/core/dev.h
+++ b/net/core/dev.h
@@ -40,6 +40,21 @@ DEFINE_FREE(netdev_unlock, struct net_device *, if (_T) netdev_unlock(_T));
(var_name = netdev_xa_find_lock(net, var_name, &ifindex)); \
ifindex++)
+struct net_device *
+netdev_get_by_index_lock_ops_compat(struct net *net, int ifindex);
+struct net_device *
+netdev_xa_find_lock_ops_compat(struct net *net, struct net_device *dev,
+ unsigned long *index);
+
+DEFINE_FREE(netdev_unlock_ops_compat, struct net_device *,
+ if (_T) netdev_unlock_ops_compat(_T));
+
+#define for_each_netdev_lock_ops_compat_scoped(net, var_name, ifindex) \
+ for (struct net_device *var_name __free(netdev_unlock_ops_compat) = NULL; \
+ (var_name = netdev_xa_find_lock_ops_compat(net, var_name, \
+ &ifindex)); \
+ ifindex++)
+
#ifdef CONFIG_PROC_FS
int __init dev_proc_init(void);
#else
--
2.48.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH net-next 9/9] netdev: don't hold rtnl_lock over nl queue info get when possible
2025-03-25 21:30 [PATCH net-next 0/9] net: hold instance lock during NETDEV_UP/REGISTER/UNREGISTER Stanislav Fomichev
` (7 preceding siblings ...)
2025-03-25 21:30 ` [PATCH net-next 8/9] netdev: add "ops compat locking" helpers Stanislav Fomichev
@ 2025-03-25 21:30 ` Stanislav Fomichev
8 siblings, 0 replies; 21+ messages in thread
From: Stanislav Fomichev @ 2025-03-25 21:30 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, kuba, pabeni
From: Jakub Kicinski <kuba@kernel.org>
Netdev queue dump accesses: NAPI, memory providers, XSk pointers.
All three are "ops protected" now, switch to the op compat locking.
rtnl lock does not have to be taken for "ops locked" devices.
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
net/core/netdev-genl.c | 18 +++++++-----------
1 file changed, 7 insertions(+), 11 deletions(-)
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index fd1cfa9707dc..39f52a311f07 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -481,18 +481,15 @@ int netdev_nl_queue_get_doit(struct sk_buff *skb, struct genl_info *info)
if (!rsp)
return -ENOMEM;
- rtnl_lock();
-
- netdev = netdev_get_by_index_lock(genl_info_net(info), ifindex);
+ netdev = netdev_get_by_index_lock_ops_compat(genl_info_net(info),
+ ifindex);
if (netdev) {
err = netdev_nl_queue_fill(rsp, netdev, q_id, q_type, info);
- netdev_unlock(netdev);
+ netdev_unlock_ops_compat(netdev);
} else {
err = -ENODEV;
}
- rtnl_unlock();
-
if (err)
goto err_free_msg;
@@ -541,17 +538,17 @@ int netdev_nl_queue_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
if (info->attrs[NETDEV_A_QUEUE_IFINDEX])
ifindex = nla_get_u32(info->attrs[NETDEV_A_QUEUE_IFINDEX]);
- rtnl_lock();
if (ifindex) {
- netdev = netdev_get_by_index_lock(net, ifindex);
+ netdev = netdev_get_by_index_lock_ops_compat(net, ifindex);
if (netdev) {
err = netdev_nl_queue_dump_one(netdev, skb, info, ctx);
- netdev_unlock(netdev);
+ netdev_unlock_ops_compat(netdev);
} else {
err = -ENODEV;
}
} else {
- for_each_netdev_lock_scoped(net, netdev, ctx->ifindex) {
+ for_each_netdev_lock_ops_compat_scoped(net, netdev,
+ ctx->ifindex) {
err = netdev_nl_queue_dump_one(netdev, skb, info, ctx);
if (err < 0)
break;
@@ -559,7 +556,6 @@ int netdev_nl_queue_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
ctx->txq_idx = 0;
}
}
- rtnl_unlock();
return err;
}
--
2.48.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH net-next 3/9] net: use netif_disable_lro in ipv6_add_dev
2025-03-25 21:30 ` [PATCH net-next 3/9] net: use netif_disable_lro in ipv6_add_dev Stanislav Fomichev
@ 2025-03-26 7:33 ` kernel test robot
0 siblings, 0 replies; 21+ messages in thread
From: kernel test robot @ 2025-03-26 7:33 UTC (permalink / raw)
To: Stanislav Fomichev, netdev
Cc: oe-kbuild-all, davem, edumazet, kuba, pabeni, Cosmin Ratiu
Hi Stanislav,
kernel test robot noticed the following build errors:
[auto build test ERROR on net-next/main]
url: https://github.com/intel-lab-lkp/linux/commits/Stanislav-Fomichev/net-switch-to-netif_disable_lro-in-inetdev_init/20250326-053539
base: net-next/main
patch link: https://lore.kernel.org/r/20250325213056.332902-4-sdf%40fomichev.me
patch subject: [PATCH net-next 3/9] net: use netif_disable_lro in ipv6_add_dev
config: csky-randconfig-002-20250326 (https://download.01.org/0day-ci/archive/20250326/202503261535.GURSr7ti-lkp@intel.com/config)
compiler: csky-linux-gcc (GCC) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250326/202503261535.GURSr7ti-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202503261535.GURSr7ti-lkp@intel.com/
All errors (new ones prefixed by >>, old ones prefixed by <<):
>> ERROR: modpost: "netdev_get_by_index_lock" [net/ipv6/ipv6.ko] undefined!
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH net-next 2/9] net: hold instance lock during NETDEV_REGISTER/UP/UNREGISTER
2025-03-25 21:30 ` [PATCH net-next 2/9] net: hold instance lock during NETDEV_REGISTER/UP/UNREGISTER Stanislav Fomichev
@ 2025-03-26 15:03 ` Cosmin Ratiu
2025-03-26 15:23 ` Stanislav Fomichev
2025-03-26 15:24 ` Cosmin Ratiu
0 siblings, 2 replies; 21+ messages in thread
From: Cosmin Ratiu @ 2025-03-26 15:03 UTC (permalink / raw)
To: netdev@vger.kernel.org, sdf@fomichev.me
Cc: edumazet@google.com, davem@davemloft.net, kuba@kernel.org,
pabeni@redhat.com
On Tue, 2025-03-25 at 14:30 -0700, Stanislav Fomichev wrote:
> @@ -2072,8 +2087,8 @@ static void
> __move_netdevice_notifier_net(struct net *src_net,
> struct net *dst_net,
> struct notifier_block *nb)
> {
> - __unregister_netdevice_notifier_net(src_net, nb);
> - __register_netdevice_notifier_net(dst_net, nb, true);
> + __unregister_netdevice_notifier_net(src_net, nb, false);
> + __register_netdevice_notifier_net(dst_net, nb, true, false);
> }
I tested with your (and the rest of Jakub's) patches.
The problem with this approach is that when a netdev's net is changed,
its lock will be acquired, but the notifiers for ALL netdevs in the old
and the new namespace will be called, which will result in correct
behavior for that device and lockdep_assert_held failure for all
others.
Cosmin.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH net-next 2/9] net: hold instance lock during NETDEV_REGISTER/UP/UNREGISTER
2025-03-26 15:03 ` Cosmin Ratiu
@ 2025-03-26 15:23 ` Stanislav Fomichev
2025-03-26 15:37 ` Cosmin Ratiu
2025-03-26 15:24 ` Cosmin Ratiu
1 sibling, 1 reply; 21+ messages in thread
From: Stanislav Fomichev @ 2025-03-26 15:23 UTC (permalink / raw)
To: Cosmin Ratiu
Cc: netdev@vger.kernel.org, sdf@fomichev.me, edumazet@google.com,
davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com
On 03/26, Cosmin Ratiu wrote:
> On Tue, 2025-03-25 at 14:30 -0700, Stanislav Fomichev wrote:
> > @@ -2072,8 +2087,8 @@ static void
> > __move_netdevice_notifier_net(struct net *src_net,
> > struct net *dst_net,
> > struct notifier_block *nb)
> > {
> > - __unregister_netdevice_notifier_net(src_net, nb);
> > - __register_netdevice_notifier_net(dst_net, nb, true);
> > + __unregister_netdevice_notifier_net(src_net, nb, false);
> > + __register_netdevice_notifier_net(dst_net, nb, true, false);
> > }
>
> I tested with your (and the rest of Jakub's) patches.
> The problem with this approach is that when a netdev's net is changed,
> its lock will be acquired, but the notifiers for ALL netdevs in the old
> and the new namespace will be called, which will result in correct
> behavior for that device and lockdep_assert_held failure for all
> others.
Perfect, thanks for testing! At least we don't deadlock anymore, that's
progress :-) So looks like we need to do something like the following
below, maybe you can give it a run on your side? Since we don't
have any locking hierarchy (yet), we should be able to lock all
other netdevs besides the one that's already locked at netlink level.
diff --git a/net/core/dev.c b/net/core/dev.c
index afee19245401..125af0fc25d3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1895,32 +1895,32 @@ static int call_netdevice_register_notifiers(struct notifier_block *nb,
static void call_netdevice_unregister_notifiers(struct notifier_block *nb,
struct net_device *dev,
- bool lock)
+ struct net_device *locked)
{
if (dev->flags & IFF_UP) {
call_netdevice_notifier(nb, NETDEV_GOING_DOWN,
dev);
call_netdevice_notifier(nb, NETDEV_DOWN, dev);
}
- if (lock)
+ if (dev != locked)
netdev_lock_ops(dev);
call_netdevice_notifier(nb, NETDEV_UNREGISTER, dev);
- if (lock)
+ if (dev != locked)
netdev_unlock_ops(dev);
}
static int call_netdevice_register_net_notifiers(struct notifier_block *nb,
struct net *net,
- bool lock)
+ struct net_device *locked)
{
struct net_device *dev;
int err;
for_each_netdev(net, dev) {
- if (lock)
+ if (locked != dev)
netdev_lock_ops(dev);
err = call_netdevice_register_notifiers(nb, dev);
- if (lock)
+ if (locked != dev)
netdev_unlock_ops(dev);
if (err)
goto rollback;
@@ -1929,18 +1929,18 @@ static int call_netdevice_register_net_notifiers(struct notifier_block *nb,
rollback:
for_each_netdev_continue_reverse(net, dev)
- call_netdevice_unregister_notifiers(nb, dev, lock);
+ call_netdevice_unregister_notifiers(nb, dev, locked);
return err;
}
static void call_netdevice_unregister_net_notifiers(struct notifier_block *nb,
struct net *net,
- bool lock)
+ struct net_device *locked)
{
struct net_device *dev;
for_each_netdev(net, dev)
- call_netdevice_unregister_notifiers(nb, dev, lock);
+ call_netdevice_unregister_notifiers(nb, dev, locked);
}
static int dev_boot_phase = 1;
@@ -1977,7 +1977,7 @@ int register_netdevice_notifier(struct notifier_block *nb)
goto unlock;
for_each_net(net) {
__rtnl_net_lock(net);
- err = call_netdevice_register_net_notifiers(nb, net, true);
+ err = call_netdevice_register_net_notifiers(nb, net, NULL);
__rtnl_net_unlock(net);
if (err)
goto rollback;
@@ -1991,7 +1991,7 @@ int register_netdevice_notifier(struct notifier_block *nb)
rollback:
for_each_net_continue_reverse(net) {
__rtnl_net_lock(net);
- call_netdevice_unregister_net_notifiers(nb, net, true);
+ call_netdevice_unregister_net_notifiers(nb, net, NULL);
__rtnl_net_unlock(net);
}
@@ -2028,7 +2028,7 @@ int unregister_netdevice_notifier(struct notifier_block *nb)
for_each_net(net) {
__rtnl_net_lock(net);
- call_netdevice_unregister_net_notifiers(nb, net, true);
+ call_netdevice_unregister_net_notifiers(nb, net, NULL);
__rtnl_net_unlock(net);
}
@@ -2042,7 +2042,7 @@ EXPORT_SYMBOL(unregister_netdevice_notifier);
static int __register_netdevice_notifier_net(struct net *net,
struct notifier_block *nb,
bool ignore_call_fail,
- bool lock)
+ struct net_device *locked)
{
int err;
@@ -2052,7 +2052,7 @@ static int __register_netdevice_notifier_net(struct net *net,
if (dev_boot_phase)
return 0;
- err = call_netdevice_register_net_notifiers(nb, net, lock);
+ err = call_netdevice_register_net_notifiers(nb, net, locked);
if (err && !ignore_call_fail)
goto chain_unregister;
@@ -2065,7 +2065,7 @@ static int __register_netdevice_notifier_net(struct net *net,
static int __unregister_netdevice_notifier_net(struct net *net,
struct notifier_block *nb,
- bool lock)
+ struct net_device *locked)
{
int err;
@@ -2073,7 +2073,7 @@ static int __unregister_netdevice_notifier_net(struct net *net,
if (err)
return err;
- call_netdevice_unregister_net_notifiers(nb, net, lock);
+ call_netdevice_unregister_net_notifiers(nb, net, locked);
return 0;
}
@@ -2097,7 +2097,7 @@ int register_netdevice_notifier_net(struct net *net, struct notifier_block *nb)
int err;
rtnl_net_lock(net);
- err = __register_netdevice_notifier_net(net, nb, false, true);
+ err = __register_netdevice_notifier_net(net, nb, false, NULL);
rtnl_net_unlock(net);
return err;
@@ -2126,7 +2126,7 @@ int unregister_netdevice_notifier_net(struct net *net,
int err;
rtnl_net_lock(net);
- err = __unregister_netdevice_notifier_net(net, nb, true);
+ err = __unregister_netdevice_notifier_net(net, nb, NULL);
rtnl_net_unlock(net);
return err;
@@ -2135,10 +2135,11 @@ EXPORT_SYMBOL(unregister_netdevice_notifier_net);
static void __move_netdevice_notifier_net(struct net *src_net,
struct net *dst_net,
- struct notifier_block *nb)
+ struct notifier_block *nb,
+ struct net_device *locked)
{
- __unregister_netdevice_notifier_net(src_net, nb, false);
- __register_netdevice_notifier_net(dst_net, nb, true, false);
+ __unregister_netdevice_notifier_net(src_net, nb, locked);
+ __register_netdevice_notifier_net(dst_net, nb, true, locked);
}
static void rtnl_net_dev_lock(struct net_device *dev)
@@ -2184,7 +2185,7 @@ int register_netdevice_notifier_dev_net(struct net_device *dev,
int err;
rtnl_net_dev_lock(dev);
- err = __register_netdevice_notifier_net(dev_net(dev), nb, false, true);
+ err = __register_netdevice_notifier_net(dev_net(dev), nb, false, NULL);
if (!err) {
nn->nb = nb;
list_add(&nn->list, &dev->net_notifier_list);
@@ -2203,7 +2204,7 @@ int unregister_netdevice_notifier_dev_net(struct net_device *dev,
rtnl_net_dev_lock(dev);
list_del(&nn->list);
- err = __unregister_netdevice_notifier_net(dev_net(dev), nb, true);
+ err = __unregister_netdevice_notifier_net(dev_net(dev), nb, NULL);
rtnl_net_dev_unlock(dev);
return err;
@@ -2216,7 +2217,7 @@ static void move_netdevice_notifiers_dev_net(struct net_device *dev,
struct netdev_net_notifier *nn;
list_for_each_entry(nn, &dev->net_notifier_list, list)
- __move_netdevice_notifier_net(dev_net(dev), net, nn->nb);
+ __move_netdevice_notifier_net(dev_net(dev), net, nn->nb, dev);
}
/**
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH net-next 2/9] net: hold instance lock during NETDEV_REGISTER/UP/UNREGISTER
2025-03-26 15:03 ` Cosmin Ratiu
2025-03-26 15:23 ` Stanislav Fomichev
@ 2025-03-26 15:24 ` Cosmin Ratiu
2025-03-26 17:43 ` Stanislav Fomichev
1 sibling, 1 reply; 21+ messages in thread
From: Cosmin Ratiu @ 2025-03-26 15:24 UTC (permalink / raw)
To: netdev@vger.kernel.org, sdf@fomichev.me
Cc: edumazet@google.com, davem@davemloft.net, kuba@kernel.org,
pabeni@redhat.com
On Wed, 2025-03-26 at 15:03 +0000, Cosmin Ratiu wrote:
> On Tue, 2025-03-25 at 14:30 -0700, Stanislav Fomichev wrote:
> > @@ -2072,8 +2087,8 @@ static void
> > __move_netdevice_notifier_net(struct net *src_net,
> > struct net *dst_net,
> > struct notifier_block
> > *nb)
> > {
> > - __unregister_netdevice_notifier_net(src_net, nb);
> > - __register_netdevice_notifier_net(dst_net, nb, true);
> > + __unregister_netdevice_notifier_net(src_net, nb, false);
> > + __register_netdevice_notifier_net(dst_net, nb, true,
> > false);
> > }
>
> I tested with your (and the rest of Jakub's) patches.
> The problem with this approach is that when a netdev's net is
> changed,
> its lock will be acquired, but the notifiers for ALL netdevs in the
> old
> and the new namespace will be called, which will result in correct
> behavior for that device and lockdep_assert_held failure for all
> others.
But a thing I've learned many years ago about locking is that locks
should protect data, not code. Shouldn't we avoid locking deep call
hierarchies (like notifiers) with the instance lock and instead focus
on 1) what fields need to be protected by the lock and 2) reduce
critical section length for those fields.
That plus reference counting usually does the trick and should avoid
these ugly deadlocks.
Cosmin.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH net-next 2/9] net: hold instance lock during NETDEV_REGISTER/UP/UNREGISTER
2025-03-26 15:23 ` Stanislav Fomichev
@ 2025-03-26 15:37 ` Cosmin Ratiu
2025-03-26 17:49 ` Stanislav Fomichev
0 siblings, 1 reply; 21+ messages in thread
From: Cosmin Ratiu @ 2025-03-26 15:37 UTC (permalink / raw)
To: stfomichev@gmail.com
Cc: pabeni@redhat.com, edumazet@google.com, netdev@vger.kernel.org,
davem@davemloft.net, sdf@fomichev.me, kuba@kernel.org
On Wed, 2025-03-26 at 08:23 -0700, Stanislav Fomichev wrote:
> @@ -2028,7 +2028,7 @@ int unregister_netdevice_notifier(struct
> notifier_block *nb)
>
> for_each_net(net) {
> __rtnl_net_lock(net);
> - call_netdevice_unregister_net_notifiers(nb, net,
> true);
> + call_netdevice_unregister_net_notifiers(nb, net,
> NULL);
> __rtnl_net_unlock(net);
> }
I tested. The deadlock is back now, because dev != NULL and if the lock
is held (like in the below stack), the mutex_lock will be attempted
again:
WARNING: possible recursive locking detected
6.14.0-rc6+ #148 Tainted: G W
--------------------------------------------
ip/1766 is trying to acquire lock:
ffff888110e18c80 (&dev->lock){+.+.}-{4:4}, at:
call_netdevice_unregister_notifiers+0x7d/0x140
but task is already holding lock:
ffff888130ae0c80 (&dev->lock){+.+.}-{4:4}, at:
do_setlink.isra.0+0x5b/0x1220
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&dev->lock);
lock(&dev->lock);
*** DEADLOCK ***
May be due to missing lock nesting notation
2 locks held by ip/1766:
#0: ffffffff82df18c8 (rtnl_mutex){+.+.}-{4:4}, at:
rtnl_newlink+0x35d/0xb50
#1: ffff888130ae0c80 (&dev->lock){+.+.}-{4:4}, at:
do_setlink.isra.0+0x5b/0x1220
stack backtrace:
print_deadlock_bug+0x274/0x3b0
__lock_acquire+0x1229/0x2470
lock_acquire+0xb7/0x2b0
__mutex_lock+0xa6/0xd20
call_netdevice_unregister_notifiers+0x7d/0x140
netif_change_net_namespace+0x4ba/0xa90
do_setlink.isra.0+0xd5/0x1220
Cosmin.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH net-next 2/9] net: hold instance lock during NETDEV_REGISTER/UP/UNREGISTER
2025-03-26 15:24 ` Cosmin Ratiu
@ 2025-03-26 17:43 ` Stanislav Fomichev
0 siblings, 0 replies; 21+ messages in thread
From: Stanislav Fomichev @ 2025-03-26 17:43 UTC (permalink / raw)
To: Cosmin Ratiu
Cc: netdev@vger.kernel.org, sdf@fomichev.me, edumazet@google.com,
davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com
On 03/26, Cosmin Ratiu wrote:
> On Wed, 2025-03-26 at 15:03 +0000, Cosmin Ratiu wrote:
> > On Tue, 2025-03-25 at 14:30 -0700, Stanislav Fomichev wrote:
> > > @@ -2072,8 +2087,8 @@ static void
> > > __move_netdevice_notifier_net(struct net *src_net,
> > > struct net *dst_net,
> > > struct notifier_block
> > > *nb)
> > > {
> > > - __unregister_netdevice_notifier_net(src_net, nb);
> > > - __register_netdevice_notifier_net(dst_net, nb, true);
> > > + __unregister_netdevice_notifier_net(src_net, nb, false);
> > > + __register_netdevice_notifier_net(dst_net, nb, true,
> > > false);
> > > }
> >
> > I tested with your (and the rest of Jakub's) patches.
> > The problem with this approach is that when a netdev's net is
> > changed,
> > its lock will be acquired, but the notifiers for ALL netdevs in the
> > old
> > and the new namespace will be called, which will result in correct
> > behavior for that device and lockdep_assert_held failure for all
> > others.
>
> But a thing I've learned many years ago about locking is that locks
> should protect data, not code. Shouldn't we avoid locking deep call
> hierarchies (like notifiers) with the instance lock and instead focus
> on 1) what fields need to be protected by the lock and 2) reduce
> critical section length for those fields.
>
> That plus reference counting usually does the trick and should avoid
> these ugly deadlocks.
We want the operations to look atomic from the userspace if possible.
So the whole device is either moved or not, some other thread should
not be able to change, say, mtu mid-way.
And we do try to clarify what's specifically protected in terms of data:
https://web.git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/include/linux/netdevice.h#n2494
But the notifiers are super tricky. There are years of natural growth
with the assumption of a single rtnl lock :-(
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH net-next 2/9] net: hold instance lock during NETDEV_REGISTER/UP/UNREGISTER
2025-03-26 15:37 ` Cosmin Ratiu
@ 2025-03-26 17:49 ` Stanislav Fomichev
2025-03-26 20:37 ` Stanislav Fomichev
2025-03-26 20:57 ` Cosmin Ratiu
0 siblings, 2 replies; 21+ messages in thread
From: Stanislav Fomichev @ 2025-03-26 17:49 UTC (permalink / raw)
To: Cosmin Ratiu
Cc: pabeni@redhat.com, edumazet@google.com, netdev@vger.kernel.org,
davem@davemloft.net, sdf@fomichev.me, kuba@kernel.org
On 03/26, Cosmin Ratiu wrote:
> On Wed, 2025-03-26 at 08:23 -0700, Stanislav Fomichev wrote:
> > @@ -2028,7 +2028,7 @@ int unregister_netdevice_notifier(struct
> > notifier_block *nb)
> >
> > for_each_net(net) {
> > __rtnl_net_lock(net);
> > - call_netdevice_unregister_net_notifiers(nb, net,
> > true);
> > + call_netdevice_unregister_net_notifiers(nb, net,
> > NULL);
> > __rtnl_net_unlock(net);
> > }
>
> I tested. The deadlock is back now, because dev != NULL and if the lock
> is held (like in the below stack), the mutex_lock will be attempted
> again:
I think I'm missing something. In this case I'm not sure why the original
"fix" worked.
You, presumably, use mlx5? And you just move this single device into
a new netns? Or there is a couple of other mlx5 devices still hanging in
the root netns?
I'll try to take a look more at register_netdevice_notifier_net under
mlx5..
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH net-next 2/9] net: hold instance lock during NETDEV_REGISTER/UP/UNREGISTER
2025-03-26 17:49 ` Stanislav Fomichev
@ 2025-03-26 20:37 ` Stanislav Fomichev
2025-03-26 20:57 ` Cosmin Ratiu
1 sibling, 0 replies; 21+ messages in thread
From: Stanislav Fomichev @ 2025-03-26 20:37 UTC (permalink / raw)
To: Cosmin Ratiu
Cc: pabeni@redhat.com, edumazet@google.com, netdev@vger.kernel.org,
davem@davemloft.net, sdf@fomichev.me, kuba@kernel.org
On 03/26, Stanislav Fomichev wrote:
> On 03/26, Cosmin Ratiu wrote:
> > On Wed, 2025-03-26 at 08:23 -0700, Stanislav Fomichev wrote:
> > > @@ -2028,7 +2028,7 @@ int unregister_netdevice_notifier(struct
> > > notifier_block *nb)
> > >
> > > for_each_net(net) {
> > > __rtnl_net_lock(net);
> > > - call_netdevice_unregister_net_notifiers(nb, net,
> > > true);
> > > + call_netdevice_unregister_net_notifiers(nb, net,
> > > NULL);
> > > __rtnl_net_unlock(net);
> > > }
> >
> > I tested. The deadlock is back now, because dev != NULL and if the lock
> > is held (like in the below stack), the mutex_lock will be attempted
> > again:
>
> I think I'm missing something. In this case I'm not sure why the original
> "fix" worked.
>
> You, presumably, use mlx5? And you just move this single device into
> a new netns? Or there is a couple of other mlx5 devices still hanging in
> the root netns?
>
> I'll try to take a look more at register_netdevice_notifier_net under
> mlx5..
I have a feeling that it's a spurious warning, the lock addresses
are different:
ip/1766 is trying to acquire lock:
ffff888110e18c80 (&dev->lock){+.+.}-{4:4}, at:
call_netdevice_unregister_notifiers+0x7d/0x140
but task is already holding lock:
ffff888130ae0c80 (&dev->lock){+.+.}-{4:4}, at:
do_setlink.isra.0+0x5b/0x1220
Can you try to apply the following on top of previous patch? At least
to confirm whether it matches my understanding.. We might also stick
with that unless we find a better option.
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 3506024c2453..e3d8d6c9bf03 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -40,6 +40,7 @@
#include <linux/if_bridge.h>
#include <linux/filter.h>
#include <net/netdev_queues.h>
+#include <net/netdev_lock.h>
#include <net/page_pool/types.h>
#include <net/pkt_sched.h>
#include <net/xdp_sock_drv.h>
@@ -5454,6 +5455,7 @@ static void mlx5e_build_nic_netdev(struct net_device *netdev)
netdev->netdev_ops = &mlx5e_netdev_ops;
netdev->xdp_metadata_ops = &mlx5e_xdp_metadata_ops;
netdev->xsk_tx_metadata_ops = &mlx5e_xsk_tx_metadata_ops;
+ netdev_lockdep_set_classes(netdev);
mlx5e_dcbnl_build_netdev(netdev);
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH net-next 2/9] net: hold instance lock during NETDEV_REGISTER/UP/UNREGISTER
2025-03-26 17:49 ` Stanislav Fomichev
2025-03-26 20:37 ` Stanislav Fomichev
@ 2025-03-26 20:57 ` Cosmin Ratiu
2025-03-26 21:18 ` Cosmin Ratiu
1 sibling, 1 reply; 21+ messages in thread
From: Cosmin Ratiu @ 2025-03-26 20:57 UTC (permalink / raw)
To: stfomichev@gmail.com
Cc: kuba@kernel.org, edumazet@google.com, netdev@vger.kernel.org,
davem@davemloft.net, pabeni@redhat.com, sdf@fomichev.me
On Wed, 2025-03-26 at 10:49 -0700, Stanislav Fomichev wrote:
> On 03/26, Cosmin Ratiu wrote:
> > On Wed, 2025-03-26 at 08:23 -0700, Stanislav Fomichev wrote:
> > > @@ -2028,7 +2028,7 @@ int unregister_netdevice_notifier(struct
> > > notifier_block *nb)
> > >
> > > for_each_net(net) {
> > > __rtnl_net_lock(net);
> > > - call_netdevice_unregister_net_notifiers(nb, net,
> > > true);
> > > + call_netdevice_unregister_net_notifiers(nb, net,
> > > NULL);
> > > __rtnl_net_unlock(net);
> > > }
> >
> > I tested. The deadlock is back now, because dev != NULL and if the
> > lock
> > is held (like in the below stack), the mutex_lock will be attempted
> > again:
>
> I think I'm missing something. In this case I'm not sure why the
> original
> "fix" worked.
I was misinterpreting the unregister_netdevice_notifier, it's not
reached from netif_change_net_namespace. Sorry for the confusion.
It seems this is not a deadlock by reentrance, just lockdep seeing a
*potential* deadlock because the two locks have the same lock class.
And theoretically, you can deadlock if two concurrent
netif_change_net_namespace on two different devices attempt to lock
things in the wrong order. Ah, the joys of granular locking...
Am I missing some locking annotation patch? A quick search in net-next
turned out nothing.
> You, presumably, use mlx5? And you just move this single device into
> a new netns? Or there is a couple of other mlx5 devices still hanging
> in
> the root netns?
>
> I'll try to take a look more at register_netdevice_notifier_net under
> mlx5..
I see there are two mlx5-exclusive functions,
register_netdevice_notifier_net and
register_netdevice_notifier_dev_net, which I'm not yet too clear how
are used, but those don't come into play here.
For more context: I'm adding support for the new locking behavior in
mlx5, and running multiple tests which add VFs & representors and play
with netdevs & net namespaces in general.
Cosmin.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH net-next 2/9] net: hold instance lock during NETDEV_REGISTER/UP/UNREGISTER
2025-03-26 20:57 ` Cosmin Ratiu
@ 2025-03-26 21:18 ` Cosmin Ratiu
2025-03-26 22:02 ` Stanislav Fomichev
0 siblings, 1 reply; 21+ messages in thread
From: Cosmin Ratiu @ 2025-03-26 21:18 UTC (permalink / raw)
To: stfomichev@gmail.com
Cc: kuba@kernel.org, edumazet@google.com, netdev@vger.kernel.org,
davem@davemloft.net, pabeni@redhat.com, sdf@fomichev.me
On Wed, 2025-03-26 at 21:57 +0100, Cosmin Ratiu wrote:
>
> Am I missing some locking annotation patch? A quick search in net-
> next
> turned out nothing.
Soon after sending the previous email, I found
netdev_lockdep_set_classes and saw that it disables deadlock checking
for the instance lock. With it in place, it works.
I also saw your other email immediately after...
With that in place, things seem to work fine without further warnings
for a few quick tests.
However, it seems that this approach is dangerous, there is the
possibility of an actual deadlock with two concurrent
netif_change_net_namespace when the RTNL is removed from that path.
Cosmin.
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH net-next 2/9] net: hold instance lock during NETDEV_REGISTER/UP/UNREGISTER
2025-03-26 21:18 ` Cosmin Ratiu
@ 2025-03-26 22:02 ` Stanislav Fomichev
0 siblings, 0 replies; 21+ messages in thread
From: Stanislav Fomichev @ 2025-03-26 22:02 UTC (permalink / raw)
To: Cosmin Ratiu
Cc: kuba@kernel.org, edumazet@google.com, netdev@vger.kernel.org,
davem@davemloft.net, pabeni@redhat.com, sdf@fomichev.me
On 03/26, Cosmin Ratiu wrote:
> On Wed, 2025-03-26 at 21:57 +0100, Cosmin Ratiu wrote:
> >
> > Am I missing some locking annotation patch? A quick search in net-
> > next
> > turned out nothing.
>
> Soon after sending the previous email, I found
> netdev_lockdep_set_classes and saw that it disables deadlock checking
> for the instance lock. With it in place, it works.
> I also saw your other email immediately after...
>
> With that in place, things seem to work fine without further warnings
> for a few quick tests.
>
> However, it seems that this approach is dangerous, there is the
> possibility of an actual deadlock with two concurrent
> netif_change_net_namespace when the RTNL is removed from that path.
Yeah, netdev_lockdep_set_classes is not pretty, but it should do until
we solve the locking for the layering devices. Which is another can
of worms I don't want to open in the current release. We want to be
in a somewhat consistent state before jumping to the rest (dropping
rtnl for ethtool, properly fixing notifiers, upper/lower locking).
For the netlink path, we are very unlikely to remove rtnl, so let's
deal with the ordering when and if we get there.
Thanks again for testing btw, lmk if you hit any other issues, I want
to unblock your (Saeed's) queue management changes.. I'll try to post
a v2 later today or tomorrow.
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2025-03-26 22:02 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-25 21:30 [PATCH net-next 0/9] net: hold instance lock during NETDEV_UP/REGISTER/UNREGISTER Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 1/9] net: switch to netif_disable_lro in inetdev_init Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 2/9] net: hold instance lock during NETDEV_REGISTER/UP/UNREGISTER Stanislav Fomichev
2025-03-26 15:03 ` Cosmin Ratiu
2025-03-26 15:23 ` Stanislav Fomichev
2025-03-26 15:37 ` Cosmin Ratiu
2025-03-26 17:49 ` Stanislav Fomichev
2025-03-26 20:37 ` Stanislav Fomichev
2025-03-26 20:57 ` Cosmin Ratiu
2025-03-26 21:18 ` Cosmin Ratiu
2025-03-26 22:02 ` Stanislav Fomichev
2025-03-26 15:24 ` Cosmin Ratiu
2025-03-26 17:43 ` Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 3/9] net: use netif_disable_lro in ipv6_add_dev Stanislav Fomichev
2025-03-26 7:33 ` kernel test robot
2025-03-25 21:30 ` [PATCH net-next 4/9] net: dummy: request ops lock Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 5/9] net: release instance lock during NETDEV_UNREGISTER for bond/team Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 6/9] docs: net: document netdev notifier expectations Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 7/9] net: designate XSK pool pointers in queues as "ops protected" Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 8/9] netdev: add "ops compat locking" helpers Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 9/9] netdev: don't hold rtnl_lock over nl queue info get when possible Stanislav Fomichev
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).