* [PATCH v1 net-next 00/14] net: Support per-netns device unregistration
From: Kuniyuki Iwashima @ 2026-07-01 21:41 UTC
  To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

The biggest blocker to per-netns RTNL is netdev unregistration.

It starts within a single netns, but it can eventually involve
multiple namespaces.

There are three types of such cross-netns devices:

  1. Paired devices (e.g., netkit, veth, vxcan)
     -> Unregistering one device also deletes its peer, which
        may reside in another netns.

  2. Tunnel devices (e.g., bareudp, geneve)
     -> Destroying a netns removes devices in another netns if
        their backend sockets reside in the dying netns.

  3. Stacked devices (e.g., ipvlan, macvlan)
     -> Removing the lower device also removes multiple upper
        devices, each of which may reside in a different netns.

While the first two device types require at most two rtnl_net_lock()s,
the stacked type has no upper limit.  This makes it impossible to
freeze all necessary namespaces in advance.
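
To make the asymmetry concrete, here is a minimal userspace sketch,
not kernel code.  struct model_dev and count_netns_to_lock() are
hypothetical names invented for illustration.  The sketch only counts
how many distinct namespaces one unregistration would have to lock.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_SEEN 64	/* arbitrary bound for this model only */

/* Hypothetical model of a device; not a kernel structure. */
struct model_dev {
	int netns_id;			/* netns the device lives in */
	struct model_dev *peer;		/* paired device, or NULL */
	struct model_dev **uppers;	/* stacked upper devices, or NULL */
	size_t nr_uppers;
};

/* Record a netns id if it has not been seen yet. */
static void model_note_netns(int *seen, size_t *n, int id)
{
	size_t i;

	for (i = 0; i < *n; i++)
		if (seen[i] == id)
			return;

	assert(*n < MODEL_MAX_SEEN);
	seen[(*n)++] = id;
}

/* Count the distinct namespaces touched when unregistering dev. */
static size_t count_netns_to_lock(const struct model_dev *dev)
{
	int seen[MODEL_MAX_SEEN];
	size_t i, n = 0;

	model_note_netns(seen, &n, dev->netns_id);

	if (dev->peer)
		model_note_netns(seen, &n, dev->peer->netns_id);

	for (i = 0; i < dev->nr_uppers; i++)
		model_note_netns(seen, &n, dev->uppers[i]->netns_id);

	return n;
}
```

In this model a paired device never yields more than 2, while a lower
device yields 1 plus the number of distinct namespaces of its uppers,
which is not known until the upper list is walked.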

This series introduces per-netns work, initially suggested at
NetConf 2024, to delegate the unregistration of such cross-netns
devices.

  https://netdev.bots.linux.dev/netconf/2024/kuniyu.pdf#page=62

The first half of the series wraps NETDEV_UNREGISTER (in core) with
per-netns RTNL, adds a helper for per-netns device unregistration,
and forces per-netns device unregistration in the core code when
CONFIG_DEBUG_NET_SMALL_RTNL=y.
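
The delegation idea can be sketched as a userspace model.  This is an
assumption-laden illustration, not the code in this series: the
model_* names, the fixed-size table, and the integer "locked" flag
that stands in for per-netns RTNL are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_NETNS	8	/* arbitrary bounds for this model only */
#define MODEL_MAX_WORK	16

/* Hypothetical model of a device; not a kernel structure. */
struct model_dev {
	int netns_id;
	int unregistered;
};

/* Hypothetical model of a netns with its own lock and work list. */
struct model_netns {
	int locked;				/* stands in for per-netns RTNL */
	struct model_dev *work[MODEL_MAX_WORK];	/* devices awaiting unregistration */
	size_t nr_work;
};

static struct model_netns model_netns_tbl[MODEL_MAX_NETNS];

/* Queue dev on the work list of the netns it lives in. */
static void model_queue_unregister(struct model_dev *dev)
{
	struct model_netns *ns;

	assert(dev->netns_id >= 0 && dev->netns_id < MODEL_MAX_NETNS);
	ns = &model_netns_tbl[dev->netns_id];

	assert(ns->nr_work < MODEL_MAX_WORK);
	ns->work[ns->nr_work++] = dev;
}

/*
 * Run the queued work of one netns while holding only that netns's
 * lock.  Returns the number of devices unregistered.
 */
static size_t model_run_work(int netns_id)
{
	struct model_netns *ns;
	size_t i, done;

	assert(netns_id >= 0 && netns_id < MODEL_MAX_NETNS);
	ns = &model_netns_tbl[netns_id];

	assert(!ns->locked);
	ns->locked = 1;

	done = ns->nr_work;
	for (i = 0; i < ns->nr_work; i++) {
		/* Every queued device belongs to the locked netns. */
		assert(ns->work[i]->netns_id == netns_id);
		ns->work[i]->unregistered = 1;
	}
	ns->nr_work = 0;

	ns->locked = 0;
	return done;
}
```

The point of the model is that the caller never needs more than one
netns lock at a time, no matter how many namespaces the queued devices
are spread across.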

The latter half picks one driver of each type (veth, bareudp, ipvlan)
and converts it to support per-netns device unregistration,
although the operations are **still serialised under RTNL** for now.

Please note that this series focuses only on the device unregistration
paths.  For example, ASSERT_RTNL() calls remain in other paths, and
Sashiko may point them out, but they are out of scope.

This is only the first step; more incremental changes are needed
before RTNL can be removed completely.

With this series, unregistering a lower device (veth0 below) removes
its upper devices (ipvl2, ipvl3) in other namespaces via per-netns
work, which runs under a different PID (440 below) from the caller of
rtnl_dellink() (2010 below).  The lower device (veth0) is freed only
after all upper ipvlan devices have called netdev_put() in
ipvlan_uninit().

  # ip netns add ns1
  # ip netns add ns2
  # ip netns add ns3
  # ip -n ns1 link add veth0 type veth peer veth1
  # ip -n ns2 link add ipvl2 link veth0 link-netns ns1 type ipvlan mode l2
  # ip -n ns3 link add ipvl3 link veth0 link-netns ns1 type ipvlan mode l2
  # ip -n ns1 link del veth0

  # bpftrace -e '#include <linux/netdevice.h>
  kprobe:ipvlan_uninit,
  kprobe:veth_dellink,
  kprobe:free_netdev {
      $dev = (struct net_device *)arg0;
      printf("PID: %d | DEV: %s%s\n", pid, $dev->name, kstack());
  }'

  PID: 2010 | DEV: veth0
          veth_dellink+5
          rtnl_dellink+1213
          rtnetlink_rcv_msg+1791
  ...
  PID: 440 | DEV: ipvl2
          ipvlan_uninit+5
          unregister_netdevice_many_notify+7129
          unregister_netdevice_many_net+1050
          rtnl_net_work_func+136
  ...
  PID: 440 | DEV: ipvl2
          free_netdev+5
          netdev_run_todo+4798
          process_scheduled_works+2538
  ...
  PID: 440 | DEV: ipvl3
          ipvlan_uninit+5
          unregister_netdevice_many_notify+7129
          unregister_netdevice_many_net+1050
          rtnl_net_work_func+136
          process_scheduled_works+2538
  ...
  PID: 2010 | DEV: veth0
          free_netdev+5
          netdev_run_todo+4798
          rtnl_dellink+1507
          rtnetlink_rcv_msg+1791
  ...
  PID: 440 | DEV: ipvl3
          free_netdev+5
          netdev_run_todo+4798
          process_scheduled_works+2538
  ...
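
The ordering in the trace above (veth0 is freed only after both
ipvlan_uninit() calls) can be modelled in userspace with a plain
reference count.  This is a simplified sketch with hypothetical
model_* names.  In particular, it frees the device on the last put,
whereas the trace shows the real free_netdev() call for veth0 coming
from netdev_run_todo() in the rtnl_dellink() path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a lower device; not a kernel structure. */
struct model_lower {
	int refcnt;		/* references held by upper devices */
	bool unregistering;	/* unregistration has been requested */
	bool freed;
};

/* Free the device once it is unregistering and unreferenced. */
static void model_try_free(struct model_lower *lower)
{
	if (lower->unregistering && lower->refcnt == 0)
		lower->freed = true;
}

/* An upper device takes a reference on its lower device. */
static void model_hold(struct model_lower *lower)
{
	assert(!lower->freed);
	lower->refcnt++;
}

/* An upper device drops its reference (as netdev_put() does in the trace). */
static void model_put(struct model_lower *lower)
{
	assert(lower->refcnt > 0);
	lower->refcnt--;
	model_try_free(lower);
}

/* Request unregistration; the free is deferred while references remain. */
static void model_unregister(struct model_lower *lower)
{
	lower->unregistering = true;
	model_try_free(lower);
}
```

With two uppers holding references, model_unregister() alone does not
free the lower device; it is freed only after the second model_put().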


Kuniyuki Iwashima (14):
  rtnetlink: Lock sock_net(skb->sk) in rtnl_newlink().
  rtnetlink: Call unregister_netdevice_many() only once in
    rtnl_link_unregister().
  rtnetlink: Add per-netns rtnl_work.
  net: Wrap default_device_exit_net() with __rtnl_net_lock().
  net: Hold __rtnl_net_lock() in netdev_wait_allrefs_any().
  net: Add per-netns netdev unregistration infra.
  net: Call unregister_netdevice_many() per netns.
  veth: Support per-netns device unregistration.
  bareudp: Protect bareudp_list with mutex.
  bareudp: Support per-netns netdev unregistration.
  ipvlan: Convert ipvl_port.count to refcount_t.
  ipvlan: Synchronise ipvlan_init() and ipvlan_uninit() for the same
    lower dev.
  ipvlan: Protect ipvl_port.ipvlans with mutex.
  ipvlan: Support per-netns netdev unregistration.

 drivers/net/bareudp.c            |  43 ++++++++-
 drivers/net/ipvlan/ipvlan.h      |  18 +++-
 drivers/net/ipvlan/ipvlan_main.c | 153 +++++++++++++++++++++++++------
 drivers/net/ipvlan/ipvtap.c      |  16 ++--
 drivers/net/veth.c               |  34 ++++---
 include/linux/netdevice.h        |  22 +++++
 include/linux/rtnetlink.h        |   8 ++
 include/net/net_namespace.h      |   3 +
 net/core/dev.c                   | 129 +++++++++++++++++++++++++-
 net/core/net_namespace.c         |   4 +
 net/core/rtnetlink.c             |  57 ++++++++++--
 11 files changed, 418 insertions(+), 69 deletions(-)

-- 
2.55.0.rc0.799.gd6f94ed593-goog

