From: Kuniyuki Iwashima <kuniyu@google.com>
To: "David S . Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>,
Paolo Abeni <pabeni@redhat.com>,
Andrew Lunn <andrew+netdev@lunn.ch>
Cc: Simon Horman <horms@kernel.org>,
Kuniyuki Iwashima <kuniyu@google.com>,
Kuniyuki Iwashima <kuni1840@gmail.com>,
netdev@vger.kernel.org
Subject: [PATCH v2 net-next 03/14] rtnetlink: Add per-netns rtnl_work.
Date: Fri, 3 Jul 2026 00:09:14 +0000
Message-ID: <20260703001009.1572444-4-kuniyu@google.com>
In-Reply-To: <20260703001009.1572444-1-kuniyu@google.com>

The biggest blocker to per-netns RTNL is netdev unregistration.

Unregistration starts within a single netns (e.g., during a device
lookup or netns dismantle), but it can eventually involve multiple
namespaces, such as when upper ipvlan devices reside in different
netns.

Since the other namespaces are only discovered along the way, we
cannot acquire all the required rtnl_net_lock()s beforehand.

When we encounter such a cross-netns device, we must delegate the
unregistration to the work of the netns where the device actually
resides.

Let's add per-netns rtnl_work to support the deferred netdev
unregistration.

For now, rtnl_net_work_func() only acquires and releases
rtnl_net_lock() for its netns.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
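Note: the delegation described in the commit message can be sketched
in userspace as below. This is only an illustrative model, not kernel
code and not the implementation added later in the series. Every
toy_* name, the nr_devices/nr_deferred counters, and the boolean
"lock" are assumptions made for the example. The queue runs
synchronously in a single thread, whereas a real workqueue runs items
concurrently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Toy work item: a callback plus a singly linked queue link. */
struct toy_work {
	void (*func)(struct toy_work *work);
	struct toy_work *next;
	bool pending;
};

/* Hypothetical model of struct net, not the kernel structure. */
struct toy_net {
	bool rtnl_locked;		/* models net->rtnl_mutex */
	struct toy_work rtnl_work;	/* models net->rtnl_work */
	int nr_devices;			/* devices registered in this netns */
	int nr_deferred;		/* unregistrations handed over by other netns */
};

static struct toy_work *wq_head;	/* models rtnl_net_wq as a FIFO */
static struct toy_work **wq_tail = &wq_head;
static int locks_held, max_locks_held;	/* track lock nesting depth */

static void toy_net_lock(struct toy_net *net)
{
	assert(!net->rtnl_locked);
	net->rtnl_locked = true;
	if (++locks_held > max_locks_held)
		max_locks_held = locks_held;
}

static void toy_net_unlock(struct toy_net *net)
{
	assert(net->rtnl_locked);
	net->rtnl_locked = false;
	locks_held--;
}

/* Like queue_work(): returns false if the item is already pending. */
static bool toy_net_queue_work(struct toy_net *net)
{
	struct toy_work *work = &net->rtnl_work;

	if (work->pending)
		return false;
	work->pending = true;
	work->next = NULL;
	*wq_tail = work;
	wq_tail = &work->next;
	return true;
}

/* Like flush_workqueue(), but runs each queued item inline, in order. */
static void toy_net_flush_workqueue(void)
{
	while (wq_head) {
		struct toy_work *work = wq_head;

		wq_head = work->next;
		if (!wq_head)
			wq_tail = &wq_head;
		work->pending = false;
		work->func(work);
	}
}

/* Work handler: finish deferred unregistrations under this netns' own lock. */
static void toy_net_work_func(struct toy_work *work)
{
	struct toy_net *net = container_of(work, struct toy_net, rtnl_work);

	toy_net_lock(net);
	net->nr_devices -= net->nr_deferred;
	net->nr_deferred = 0;
	toy_net_unlock(net);
}

static void toy_net_init(struct toy_net *net, int nr_devices)
{
	*net = (struct toy_net){ .nr_devices = nr_devices };
	net->rtnl_work.func = toy_net_work_func;
}

/* Unregister one device in @net whose upper device lives in @upper_net. */
static void toy_unregister(struct toy_net *net, struct toy_net *upper_net)
{
	toy_net_lock(net);
	net->nr_devices--;		/* local device: handled right away */
	/* Assumption: concurrent code would need to protect this handoff. */
	upper_net->nr_deferred++;	/* cross-netns device: hand it over */
	toy_net_queue_work(upper_net);
	toy_net_unlock(net);
}
```

The point of the model is that toy_unregister() never takes the lock
of the second netns while holding the first one. The second netns
finishes the job later from its own work item, under its own lock.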
include/linux/rtnetlink.h | 8 ++++++++
include/net/net_namespace.h | 1 +
net/core/net_namespace.c | 1 +
net/core/rtnetlink.c | 26 ++++++++++++++++++++++++++
4 files changed, 36 insertions(+)
diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index ea39dd23a197..95729339e7a5 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -115,6 +115,10 @@ bool rtnl_net_is_locked(struct net *net);
bool lockdep_rtnl_net_is_held(struct net *net);
+void rtnl_net_queue_work(struct net *net);
+void rtnl_net_flush_workqueue(void);
+void rtnl_net_work_func(struct work_struct *work);
+
#define rcu_dereference_rtnl_net(net, p) \
rcu_dereference_check(p, lockdep_rtnl_net_is_held(net))
#define rtnl_net_dereference(net, p) \
@@ -150,6 +154,10 @@ static inline void ASSERT_RTNL_NET(struct net *net)
ASSERT_RTNL();
}
+static inline void rtnl_net_flush_workqueue(void)
+{
+}
+
#define rcu_dereference_rtnl_net(net, p) \
rcu_dereference_rtnl(p)
#define rtnl_net_dereference(net, p) \
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 80de5e98a66d..a989019af5f7 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -197,6 +197,7 @@ struct net {
#ifdef CONFIG_DEBUG_NET_SMALL_RTNL
/* Move to a better place when the config guard is removed. */
struct mutex rtnl_mutex;
+ struct work_struct rtnl_work;
#endif
#if IS_ENABLED(CONFIG_VSOCKETS)
struct netns_vsock vsock;
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index d9dafe24f57e..d1aeff9de580 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -422,6 +422,7 @@ static __net_init int preinit_net(struct net *net, struct user_namespace *user_n
#ifdef CONFIG_DEBUG_NET_SMALL_RTNL
mutex_init(&net->rtnl_mutex);
lock_set_cmp_fn(&net->rtnl_mutex, rtnl_net_lock_cmp_fn, NULL);
+ INIT_WORK(&net->rtnl_work, rtnl_net_work_func);
#endif
INIT_LIST_HEAD(&net->ptype_all);
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 7207da002fb5..7959519e7375 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -273,6 +273,26 @@ bool lockdep_rtnl_net_is_held(struct net *net)
return lockdep_rtnl_is_held() && lockdep_is_held(&net->rtnl_mutex);
}
EXPORT_SYMBOL(lockdep_rtnl_net_is_held);
+
+static struct workqueue_struct *rtnl_net_wq;
+
+void rtnl_net_queue_work(struct net *net)
+{
+ queue_work(rtnl_net_wq, &net->rtnl_work);
+}
+
+void rtnl_net_flush_workqueue(void)
+{
+ flush_workqueue(rtnl_net_wq);
+}
+
+void rtnl_net_work_func(struct work_struct *work)
+{
+ struct net *net = container_of(work, struct net, rtnl_work);
+
+ rtnl_net_lock(net);
+ rtnl_net_unlock(net);
+}
#else
static int rtnl_net_cmp_locks(const struct net *net_a, const struct net *net_b)
{
@@ -7226,4 +7246,10 @@ void __init rtnetlink_init(void)
register_netdevice_notifier(&rtnetlink_dev_notifier);
rtnl_register_many(rtnetlink_rtnl_msg_handlers);
+
+#ifdef CONFIG_DEBUG_NET_SMALL_RTNL
+ rtnl_net_wq = create_workqueue("rtnl_net");
+ if (!rtnl_net_wq)
+ panic("Could not create rtnl_net workq");
+#endif
}
--
2.55.0.rc0.799.gd6f94ed593-goog
Thread overview: 15+ messages
2026-07-03 0:09 [PATCH v2 net-next 00/14] net: Support per-netns device unregistration Kuniyuki Iwashima
2026-07-03 0:09 ` [PATCH v2 net-next 01/14] rtnetlink: Lock sock_net(skb->sk) in rtnl_newlink() Kuniyuki Iwashima
2026-07-03 0:09 ` [PATCH v2 net-next 02/14] rtnetlink: Call unregister_netdevice_many() only once in rtnl_link_unregister() Kuniyuki Iwashima
2026-07-03 0:09 ` [PATCH v2 net-next 03/14] rtnetlink: Add per-netns rtnl_work. Kuniyuki Iwashima [this message]
2026-07-03 0:09 ` [PATCH v2 net-next 04/14] net: Wrap default_device_exit_net() with __rtnl_net_lock() Kuniyuki Iwashima
2026-07-03 0:09 ` [PATCH v2 net-next 05/14] net: Hold __rtnl_net_lock() in netdev_wait_allrefs_any() Kuniyuki Iwashima
2026-07-03 0:09 ` [PATCH v2 net-next 06/14] net: Add per-netns netdev unregistration infra Kuniyuki Iwashima
2026-07-03 0:09 ` [PATCH v2 net-next 07/14] net: Call unregister_netdevice_many() per netns Kuniyuki Iwashima
2026-07-03 0:09 ` [PATCH v2 net-next 08/14] veth: Support per-netns device unregistration Kuniyuki Iwashima
2026-07-03 0:09 ` [PATCH v2 net-next 09/14] bareudp: Protect bareudp_list with mutex Kuniyuki Iwashima
2026-07-03 0:09 ` [PATCH v2 net-next 10/14] bareudp: Support per-netns netdev unregistration Kuniyuki Iwashima
2026-07-03 0:09 ` [PATCH v2 net-next 11/14] ipvlan: Convert ipvl_port.count to refcount_t Kuniyuki Iwashima
2026-07-03 0:09 ` [PATCH v2 net-next 12/14] ipvlan: Synchronise ipvlan_init() and ipvlan_uninit() for the same lower dev Kuniyuki Iwashima
2026-07-03 0:09 ` [PATCH v2 net-next 13/14] ipvlan: Protect ipvl_port.ipvlans with mutex Kuniyuki Iwashima
2026-07-03 0:09 ` [PATCH v2 net-next 14/14] ipvlan: Support per-netns netdev unregistration Kuniyuki Iwashima