Netdev List
* [PATCH v4 net-next 0/5] rtnetlink: RTNL avoidance in rtnl_getlink() and rtnl_dump_ifinfo()
@ 2026-05-22 17:29 Eric Dumazet
  2026-05-22 17:29 ` [PATCH v4 net-next 1/5] rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list() Eric Dumazet
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Eric Dumazet @ 2026-05-22 17:29 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, netdev, eric.dumazet,
	Eric Dumazet

Many shell scripts invoke iproute2 commands specifying a device by
its name.

This series improves their performance by avoiding RTNL acquisition
for their (repeated) name->index conversions.

v3: insert patch 2/3 in the series (Jakub reported a KASAN splat)
v4: Addressed Sashiko's feedback.
    added 2 patches for rtnl_dump_ifinfo().

Eric Dumazet (5):
  rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list()
  net: defer netdev_name_node_alt_flush() call to netdev_run_todo()
  rtnetlink: do not acquire RTNL in rtnl_getlink() with
    RTEXT_FILTER_NAME_ONLY
  rtnetlink: do not assume RTNL is held in link_master_filtered()
  rtnetlink: add RTEXT_FILTER_NAME_ONLY support to rtnl_dump_ifinfo()

 net/core/dev.c       |   4 +-
 net/core/rtnetlink.c | 131 ++++++++++++++++++++++++++++++-------------
 2 files changed, 95 insertions(+), 40 deletions(-)

-- 
2.54.0.746.g67dd491aae-goog



* [PATCH v4 net-next 1/5] rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list()
  2026-05-22 17:29 [PATCH v4 net-next 0/5] rtnetlink: RTNL avoidance in rtnl_getlink() and rtnl_dump_ifinfo() Eric Dumazet
@ 2026-05-22 17:29 ` Eric Dumazet
  2026-05-22 17:29 ` [PATCH v4 net-next 2/5] net: defer netdev_name_node_alt_flush() call to netdev_run_todo() Eric Dumazet
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2026-05-22 17:29 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, netdev, eric.dumazet,
	Eric Dumazet

Avoid corrupting a netlink message and confusing user space in the
very unlikely case that rtnl_fill_prop_list() produces a very big
nested element.

This is extremely unlikely, because rtnl_prop_list_size()
provisions nla_total_size(ALTIFNAMSIZ) per altname.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/rtnetlink.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 3d40ebe035b37ae0f38fb81f918eb76742371ef1..3dfa28927c7f92f906a0d89b7a1812b975d13854 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1971,12 +1971,14 @@ static int rtnl_fill_prop_list(struct sk_buff *skb,
 	if (ret <= 0)
 		goto nest_cancel;
 
-	nla_nest_end(skb, prop_list);
+	if (nla_nest_end_safe(skb, prop_list) < 0)
+		goto nest_cancel;
+
 	return 0;
 
 nest_cancel:
 	nla_nest_cancel(skb, prop_list);
-	return ret;
+	return -EMSGSIZE;
 }
 
 static int rtnl_fill_proto_down(struct sk_buff *skb,
-- 
2.54.0.746.g67dd491aae-goog
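
A user-space sketch of the return-value contract touched by this hunk, not
kernel code. The context line `if (ret <= 0) goto nest_cancel;` suggests the
altname filler can return 0, presumably for a device with no altnames (an
assumption, since that helper is not shown in the diff). Before the change
the cancel path returned `ret`, so that case yielded 0; as posted it yields
-EMSGSIZE. All function names and the `room` parameter below are
hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the altname filler: returns the number of
 * altnames emitted (0 when there are none) or -EMSGSIZE when they do
 * not fit in the remaining room.
 */
static int fill_altnames_model(int altname_count, int room)
{
	if (altname_count > room)
		return -EMSGSIZE;
	return altname_count;
}

/* Tail before the patch: the cancel path returns ret unchanged. */
static int fill_prop_list_before(int altname_count, int room)
{
	int ret = fill_altnames_model(altname_count, room);

	if (ret <= 0)
		return ret;		/* nest cancelled, 0 if nothing to emit */
	return 0;
}

/* Tail as posted: the cancel path always returns -EMSGSIZE. */
static int fill_prop_list_as_posted(int altname_count, int room)
{
	int ret = fill_altnames_model(altname_count, room);

	if (ret <= 0)
		return -EMSGSIZE;	/* also taken when ret == 0 */
	return 0;
}

static void prop_list_contract_demo(void)
{
	/* Device without altnames: the two tails disagree. */
	assert(fill_prop_list_before(0, 4) == 0);
	assert(fill_prop_list_as_posted(0, 4) == -EMSGSIZE);

	/* Device with altnames that fit: both succeed. */
	assert(fill_prop_list_before(2, 4) == 0);
	assert(fill_prop_list_as_posted(2, 4) == 0);
}
```

If that reading of the hunk is right, keeping the old behaviour for the
empty case would mean returning -EMSGSIZE only when `ret < 0` or when
nla_nest_end_safe() fails, and 0 when `ret == 0`.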



* [PATCH v4 net-next 2/5] net: defer netdev_name_node_alt_flush() call to netdev_run_todo()
  2026-05-22 17:29 [PATCH v4 net-next 0/5] rtnetlink: RTNL avoidance in rtnl_getlink() and rtnl_dump_ifinfo() Eric Dumazet
  2026-05-22 17:29 ` [PATCH v4 net-next 1/5] rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list() Eric Dumazet
@ 2026-05-22 17:29 ` Eric Dumazet
  2026-05-22 17:30 ` [PATCH v4 net-next 3/5] rtnetlink: do not acquire RTNL in rtnl_getlink() with RTEXT_FILTER_NAME_ONLY Eric Dumazet
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2026-05-22 17:29 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, netdev, eric.dumazet,
	Eric Dumazet

In the following patch, we want to call rtnl_fill_prop_list() without
RTNL being held, but after a device reference has been taken.

We need to free altnames in netdev_run_todo() instead of
unregister_netdevice_many_notify().

Freeing will only happen once all device references
have been released.

Note that dev->name_node serves as the anchor for altnames,
and thus must also be freed in netdev_run_todo().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/core/dev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 26ac8eb9b259d489159c7ab5a2b206d425110b3b..2d795f3f569be00361809823fd3e59fb1871919c 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -11738,6 +11738,8 @@ void netdev_run_todo(void)
 		WARN_ON(rcu_access_pointer(dev->ip_ptr));
 		WARN_ON(rcu_access_pointer(dev->ip6_ptr));
 
+		netdev_name_node_alt_flush(dev);
+		netdev_name_node_free(dev->name_node);
 		netdev_do_free_pcpu_stats(dev);
 		if (dev->priv_destructor)
 			dev->priv_destructor(dev);
@@ -12451,8 +12453,6 @@ void unregister_netdevice_many_notify(struct list_head *head,
 		dev_uc_flush(dev);
 		dev_mc_flush(dev);
 
-		netdev_name_node_alt_flush(dev);
-		netdev_name_node_free(dev->name_node);
 
 		netdev_rss_contexts_free(dev);
 
-- 
2.54.0.746.g67dd491aae-goog
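
The ordering argument in this commit message can be sketched in user space.
This is a toy model under stated assumptions, not kernel code: `struct
dev_model` and every function below are hypothetical, and the real
netdev_run_todo() waits for references to drop instead of returning early.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a device and its altname list. */
struct dev_model {
	int refs;		/* outstanding device references */
	bool altnames_live;	/* false once the altname list is freed */
};

/* Ordering before the patch: unregistration frees altnames at once. */
static void unregister_old(struct dev_model *d)
{
	d->altnames_live = false;
}

/* Ordering after the patch: unregistration leaves altnames alone. */
static void unregister_new(struct dev_model *d)
{
	(void)d;
}

/* Models the todo step: free only when no reference remains.
 * Returns true once the altnames have been freed.
 */
static bool run_todo_new(struct dev_model *d)
{
	if (d->refs > 0)
		return false;
	d->altnames_live = false;
	return true;
}

static void deferred_free_demo(void)
{
	struct dev_model before = { .refs = 1, .altnames_live = true };
	struct dev_model after = { .refs = 1, .altnames_live = true };

	/* Old ordering: a reader holding a reference loses the list. */
	unregister_old(&before);
	assert(!before.altnames_live);

	/* New ordering: the list survives until the reference is dropped. */
	unregister_new(&after);
	assert(!run_todo_new(&after));
	assert(after.altnames_live);
	after.refs--;
	assert(run_todo_new(&after));
	assert(!after.altnames_live);
}
```

The point of the model is only that a reader holding a device reference can
walk the altname list without RTNL once the free is tied to the last
reference going away.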



* [PATCH v4 net-next 3/5] rtnetlink: do not acquire RTNL in rtnl_getlink() with RTEXT_FILTER_NAME_ONLY
  2026-05-22 17:29 [PATCH v4 net-next 0/5] rtnetlink: RTNL avoidance in rtnl_getlink() and rtnl_dump_ifinfo() Eric Dumazet
  2026-05-22 17:29 ` [PATCH v4 net-next 1/5] rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list() Eric Dumazet
  2026-05-22 17:29 ` [PATCH v4 net-next 2/5] net: defer netdev_name_node_alt_flush() call to netdev_run_todo() Eric Dumazet
@ 2026-05-22 17:30 ` Eric Dumazet
  2026-05-22 17:30 ` [PATCH v4 net-next 4/5] rtnetlink: do not assume RTNL is held in link_master_filtered() Eric Dumazet
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2026-05-22 17:30 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, netdev, eric.dumazet,
	Eric Dumazet

When RTEXT_FILTER_NAME_ONLY is requested, rtnl_fill_ifinfo()
only dumps device attributes that do not need RTNL protection.

Many shell scripts invoke iproute2 commands specifying a device by
its name. After this patch, they will no longer add RTNL pressure.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/rtnetlink.c | 94 +++++++++++++++++++++++++++++++-------------
 1 file changed, 67 insertions(+), 27 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 3dfa28927c7f92f906a0d89b7a1812b975d13854..c342b22528e4478a61f22e204a3934ba1a48cb3c 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -2068,7 +2068,6 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb,
 	struct nlmsghdr *nlh;
 	struct Qdisc *qdisc;
 
-	ASSERT_RTNL();
 	nlh = nlmsg_put(skb, pid, seq, type, sizeof(*ifm), flags);
 	if (nlh == NULL)
 		return -EMSGSIZE;
@@ -2091,6 +2090,7 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb,
 	if (ext_filter_mask & RTEXT_FILTER_NAME_ONLY)
 		goto end;
 
+	ASSERT_RTNL();
 	if (tgt_netnsid >= 0 &&
 	    nla_put_s32(skb, IFLA_TARGET_NETNSID, tgt_netnsid))
 		goto nla_put_failure;
@@ -3468,6 +3468,21 @@ static struct net_device *rtnl_dev_get(struct net *net,
 	return __dev_get_by_name(net, ifname);
 }
 
+static struct net_device *rtnl_dev_get_rcu(struct net *net,
+					   struct nlattr *tb[])
+{
+	char ifname[ALTIFNAMSIZ];
+
+	if (tb[IFLA_IFNAME])
+		nla_strscpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
+	else if (tb[IFLA_ALT_IFNAME])
+		nla_strscpy(ifname, tb[IFLA_ALT_IFNAME], ALTIFNAMSIZ);
+	else
+		return NULL;
+
+	return dev_get_by_name_rcu(net, ifname);
+}
+
 static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 			struct netlink_ext_ack *extack)
 {
@@ -4187,14 +4202,16 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 			struct netlink_ext_ack *extack)
 {
 	struct net *net = sock_net(skb->sk);
+	struct nlattr *tb[IFLA_MAX + 1];
+	netdevice_tracker dev_tracker;
+	struct net_device *dev = NULL;
 	struct net *tgt_net = net;
+	u32 ext_filter_mask = 0;
 	struct ifinfomsg *ifm;
-	struct nlattr *tb[IFLA_MAX+1];
-	struct net_device *dev = NULL;
 	struct sk_buff *nskb;
 	int netnsid = -1;
+	bool need_rtnl;
 	int err;
-	u32 ext_filter_mask = 0;
 
 	err = rtnl_valid_getlink_req(skb, nlh, tb, extack);
 	if (err < 0)
@@ -4214,43 +4231,65 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 	if (tb[IFLA_EXT_MASK])
 		ext_filter_mask = nla_get_u32(tb[IFLA_EXT_MASK]);
 
-	err = -EINVAL;
 	ifm = nlmsg_data(nlh);
-	if (ifm->ifi_index > 0)
-		dev = __dev_get_by_index(tgt_net, ifm->ifi_index);
-	else if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME])
-		dev = rtnl_dev_get(tgt_net, tb);
-	else
+	rcu_read_lock();
+	if (ifm->ifi_index > 0) {
+		dev = dev_get_by_index_rcu(tgt_net, ifm->ifi_index);
+	} else if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME]) {
+		dev = rtnl_dev_get_rcu(tgt_net, tb);
+	} else {
+		rcu_read_unlock();
+		err = -EINVAL;
 		goto out;
+	}
+	netdev_hold(dev, &dev_tracker, GFP_ATOMIC);
+	rcu_read_unlock();
 
 	err = -ENODEV;
 	if (dev == NULL)
 		goto out;
 
+	need_rtnl = !(ext_filter_mask & RTEXT_FILTER_NAME_ONLY);
+
+retry:
+	if (need_rtnl) {
+		rtnl_lock();
+		/* Synchronize the carrier state so we don't report a state
+		 * that we're not actually going to honour immediately; if
+		 * the driver just did a carrier off->on transition, we can
+		 * only TX if link watch work has run, but without this we'd
+		 * already report carrier on, even if it doesn't work yet.
+		 */
+		linkwatch_sync_dev(dev);
+	}
+
 	err = -ENOBUFS;
 	nskb = nlmsg_new_large(if_nlmsg_size(dev, ext_filter_mask));
-	if (nskb == NULL)
-		goto out;
+	if (nskb)
+		err = rtnl_fill_ifinfo(nskb, dev, net,
+				       RTM_NEWLINK, NETLINK_CB(skb).portid,
+				       nlh->nlmsg_seq, 0, 0, ext_filter_mask,
+				       0, NULL, 0, netnsid, GFP_KERNEL);
 
-	/* Synchronize the carrier state so we don't report a state
-	 * that we're not actually going to honour immediately; if
-	 * the driver just did a carrier off->on transition, we can
-	 * only TX if link watch work has run, but without this we'd
-	 * already report carrier on, even if it doesn't work yet.
-	 */
-	linkwatch_sync_dev(dev);
+	if (need_rtnl)
+		rtnl_unlock();
 
-	err = rtnl_fill_ifinfo(nskb, dev, net,
-			       RTM_NEWLINK, NETLINK_CB(skb).portid,
-			       nlh->nlmsg_seq, 0, 0, ext_filter_mask,
-			       0, NULL, 0, netnsid, GFP_KERNEL);
 	if (err < 0) {
-		/* -EMSGSIZE implies BUG in if_nlmsg_size */
-		WARN_ON(err == -EMSGSIZE);
 		kfree_skb(nskb);
-	} else
+		if (err == -EMSGSIZE) {
+			if (!need_rtnl) {
+				/* Some altnames were added, retry with RTNL. */
+				need_rtnl = true;
+				goto retry;
+			}
+			/* -EMSGSIZE implies BUG in if_nlmsg_size */
+			WARN_ON_ONCE(1);
+		}
+	} else {
 		err = rtnl_unicast(nskb, net, NETLINK_CB(skb).portid);
+	}
 out:
+	netdev_put(dev, &dev_tracker);
 	if (netnsid >= 0)
 		put_net(tgt_net);
 
@@ -7117,7 +7156,8 @@ static const struct rtnl_msg_handler rtnetlink_rtnl_msg_handlers[] __initconst =
 	{.msgtype = RTM_DELLINK, .doit = rtnl_dellink,
 	 .flags = RTNL_FLAG_DOIT_PERNET_WIP},
 	{.msgtype = RTM_GETLINK, .doit = rtnl_getlink,
-	 .dumpit = rtnl_dump_ifinfo, .flags = RTNL_FLAG_DUMP_SPLIT_NLM_DONE},
+	 .dumpit = rtnl_dump_ifinfo,
+	 .flags = RTNL_FLAG_DUMP_SPLIT_NLM_DONE | RTNL_FLAG_DOIT_UNLOCKED},
 	{.msgtype = RTM_SETLINK, .doit = rtnl_setlink,
 	 .flags = RTNL_FLAG_DOIT_PERNET_WIP},
 	{.msgtype = RTM_GETADDR, .dumpit = rtnl_dump_all},
-- 
2.54.0.746.g67dd491aae-goog
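
The retry path added to rtnl_getlink() can be modelled in user space. This
is a sketch under assumptions, not kernel code: `struct dev_model`,
`attempt()` and `getlink_model()` are hypothetical, the "lock" is just a
counter, and a concurrent altname addition is simulated by `racing_adds`
landing between sizing and filling on an unlocked attempt.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a device whose altname count may change
 * while no lock is held.
 */
struct dev_model {
	int altnames;		/* current number of altnames */
	int racing_adds;	/* altnames added during an unlocked attempt */
	int lock_count;		/* times the modelled RTNL was taken */
};

/* One size-then-fill attempt. Returns 0 or -EMSGSIZE. */
static int attempt(struct dev_model *d, bool locked)
{
	int sized_for;

	if (locked)
		d->lock_count++;
	sized_for = d->altnames;	/* models the size computation */
	if (!locked) {
		/* A writer slips in between sizing and filling. */
		d->altnames += d->racing_adds;
		d->racing_adds = 0;
	}
	return d->altnames > sized_for ? -EMSGSIZE : 0;
}

/* Mirrors the control flow around the retry: label in the patch. */
static int getlink_model(struct dev_model *d, bool name_only)
{
	bool need_rtnl = !name_only;
	int err;

retry:
	err = attempt(d, need_rtnl);
	if (err == -EMSGSIZE && !need_rtnl) {
		/* Some altnames were added, retry with the lock held. */
		need_rtnl = true;
		goto retry;
	}
	return err;
}

static void getlink_retry_demo(void)
{
	struct dev_model quiet = { .altnames = 2 };
	struct dev_model raced = { .altnames = 2, .racing_adds = 1 };

	/* No race: the name-only request never takes the lock. */
	assert(getlink_model(&quiet, true) == 0);
	assert(quiet.lock_count == 0);

	/* Race: the first attempt overflows, the locked retry succeeds. */
	assert(getlink_model(&raced, true) == 0);
	assert(raced.lock_count == 1);
}
```

In this model a name-only request pays for the lock only when the unlocked
attempt loses a race, which is the trade-off the patch description aims for.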



* [PATCH v4 net-next 4/5] rtnetlink: do not assume RTNL is held in link_master_filtered()
  2026-05-22 17:29 [PATCH v4 net-next 0/5] rtnetlink: RTNL avoidance in rtnl_getlink() and rtnl_dump_ifinfo() Eric Dumazet
                   ` (2 preceding siblings ...)
  2026-05-22 17:30 ` [PATCH v4 net-next 3/5] rtnetlink: do not acquire RTNL in rtnl_getlink() with RTEXT_FILTER_NAME_ONLY Eric Dumazet
@ 2026-05-22 17:30 ` Eric Dumazet
  2026-05-22 17:30 ` [PATCH v4 net-next 5/5] rtnetlink: add RTEXT_FILTER_NAME_ONLY support to rtnl_dump_ifinfo() Eric Dumazet
  2026-05-22 21:29 ` [PATCH v4 net-next 0/5] rtnetlink: RTNL avoidance in rtnl_getlink() and rtnl_dump_ifinfo() Jakub Kicinski
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2026-05-22 17:30 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, netdev, eric.dumazet,
	Eric Dumazet

RTNL might no longer be held by the caller after the following patch.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/rtnetlink.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index c342b22528e4478a61f22e204a3934ba1a48cb3c..bad036ef7614ffae52a65c447344ac1314f5521b 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -2371,22 +2371,24 @@ static struct rtnl_link_ops *linkinfo_to_kind_ops(const struct nlattr *nla,
 static bool link_master_filtered(struct net_device *dev, int master_idx)
 {
 	struct net_device *master;
+	bool res = false;
 
 	if (!master_idx)
 		return false;
 
-	master = netdev_master_upper_dev_get(dev);
+	rcu_read_lock();
+	master = netdev_master_upper_dev_get_rcu(dev);
 
 	/* 0 is already used to denote IFLA_MASTER wasn't passed, therefore need
 	 * another invalid value for ifindex to denote "no master".
 	 */
 	if (master_idx == -1)
-		return !!master;
-
-	if (!master || master->ifindex != master_idx)
-		return true;
+		res = !!master;
+	else if (!master || master->ifindex != master_idx)
+		res = true;
+	rcu_read_unlock();
 
-	return false;
+	return res;
 }
 
 static bool link_kind_filtered(const struct net_device *dev,
-- 
2.54.0.746.g67dd491aae-goog
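
The filtering decision itself is unchanged by this patch; only the lookup
moves under rcu_read_lock(). As a reading aid, the decision can be written
as a pure function. This is a user-space sketch: `master_filtered_model()`
is a hypothetical name, and `master_ifindex == 0` is used here to model a
device with no master.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when the device should be skipped by the dump.
 * master_ifindex: ifindex of the device's master, 0 if it has none.
 * master_idx: 0 means no master filter was requested, -1 means
 * "only devices without a master", anything else is a wanted ifindex.
 */
static bool master_filtered_model(int master_ifindex, int master_idx)
{
	if (!master_idx)
		return false;
	if (master_idx == -1)
		return master_ifindex != 0;
	return master_ifindex == 0 || master_ifindex != master_idx;
}

static void master_filter_demo(void)
{
	/* No filter requested: nothing is skipped. */
	assert(!master_filtered_model(0, 0));
	assert(!master_filtered_model(7, 0));

	/* "No master" filter: devices with a master are skipped. */
	assert(!master_filtered_model(0, -1));
	assert(master_filtered_model(7, -1));

	/* Specific master: only an exact match is kept. */
	assert(!master_filtered_model(7, 7));
	assert(master_filtered_model(8, 7));
	assert(master_filtered_model(0, 7));
}
```

The single exit through `res` in the patch exists so that every branch of
this table runs between rcu_read_lock() and rcu_read_unlock().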



* [PATCH v4 net-next 5/5] rtnetlink: add RTEXT_FILTER_NAME_ONLY support to rtnl_dump_ifinfo()
  2026-05-22 17:29 [PATCH v4 net-next 0/5] rtnetlink: RTNL avoidance in rtnl_getlink() and rtnl_dump_ifinfo() Eric Dumazet
                   ` (3 preceding siblings ...)
  2026-05-22 17:30 ` [PATCH v4 net-next 4/5] rtnetlink: do not assume RTNL is held in link_master_filtered() Eric Dumazet
@ 2026-05-22 17:30 ` Eric Dumazet
  2026-05-22 21:29 ` [PATCH v4 net-next 0/5] rtnetlink: RTNL avoidance in rtnl_getlink() and rtnl_dump_ifinfo() Jakub Kicinski
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2026-05-22 17:30 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, netdev, eric.dumazet,
	Eric Dumazet

When the user requests the RTEXT_FILTER_NAME_ONLY flag, we limit each
dumped entry to:

 - struct nlmsghdr
 - IFLA_IFNAME
 - IFLA_PROP_LIST (alternate names)

- This saves space in the dump, pushing more devices per system call.
- This can be done without acquiring RTNL.

I still have a long-term goal of avoiding RTNL in rtnl_dump_ifinfo()
regardless of whether RTEXT_FILTER_NAME_ONLY is used.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/rtnetlink.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index bad036ef7614ffae52a65c447344ac1314f5521b..9045285ba2f8be8d7ff32e4f90ee546651f1a05f 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -2499,6 +2499,7 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
 	int ops_srcu_index;
 	int master_idx = 0;
 	int netnsid = -1;
+	bool need_rtnl;
 	int err, i;
 
 	err = rtnl_valid_dump_ifinfo_req(nlh, cb->strict_check, tb, extack);
@@ -2548,6 +2549,12 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
 
 walk_entries:
 	err = 0;
+	need_rtnl = !(ext_filter_mask & RTEXT_FILTER_NAME_ONLY);
+	if (need_rtnl)
+		rtnl_lock();
+	else
+		rcu_read_lock();
+
 	for_each_netdev_dump(tgt_net, dev, ctx->ifindex) {
 		if (link_dump_filtered(dev, master_idx, kind_ops))
 			continue;
@@ -2559,11 +2566,13 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
 		if (err < 0)
 			break;
 	}
-
-
-	cb->seq = tgt_net->dev_base_seq;
+	cb->seq = READ_ONCE(tgt_net->dev_base_seq);
 	nl_dump_check_consistent(cb, nlmsg_hdr(skb));
 
+	if (need_rtnl)
+		rtnl_unlock();
+	else
+		rcu_read_unlock();
 out:
 
 	if (kind_ops)
@@ -7159,7 +7168,9 @@ static const struct rtnl_msg_handler rtnetlink_rtnl_msg_handlers[] __initconst =
 	 .flags = RTNL_FLAG_DOIT_PERNET_WIP},
 	{.msgtype = RTM_GETLINK, .doit = rtnl_getlink,
 	 .dumpit = rtnl_dump_ifinfo,
-	 .flags = RTNL_FLAG_DUMP_SPLIT_NLM_DONE | RTNL_FLAG_DOIT_UNLOCKED},
+	 .flags = RTNL_FLAG_DUMP_SPLIT_NLM_DONE |
+		  RTNL_FLAG_DOIT_UNLOCKED |
+		  RTNL_FLAG_DUMP_UNLOCKED},
 	{.msgtype = RTM_SETLINK, .doit = rtnl_setlink,
 	 .flags = RTNL_FLAG_DOIT_PERNET_WIP},
 	{.msgtype = RTM_GETADDR, .dumpit = rtnl_dump_all},
-- 
2.54.0.746.g67dd491aae-goog



* Re: [PATCH v4 net-next 0/5] rtnetlink: RTNL avoidance in rtnl_getlink() and rtnl_dump_ifinfo()
  2026-05-22 17:29 [PATCH v4 net-next 0/5] rtnetlink: RTNL avoidance in rtnl_getlink() and rtnl_dump_ifinfo() Eric Dumazet
                   ` (4 preceding siblings ...)
  2026-05-22 17:30 ` [PATCH v4 net-next 5/5] rtnetlink: add RTEXT_FILTER_NAME_ONLY support to rtnl_dump_ifinfo() Eric Dumazet
@ 2026-05-22 21:29 ` Jakub Kicinski
  5 siblings, 0 replies; 7+ messages in thread
From: Jakub Kicinski @ 2026-05-22 21:29 UTC
  To: Eric Dumazet
  Cc: David S . Miller, Paolo Abeni, Simon Horman, Kuniyuki Iwashima,
	netdev, eric.dumazet

On Fri, 22 May 2026 17:29:57 +0000 Eric Dumazet wrote:
> Many shell scripts invoke iproute2 commands specifying a device by
> its name.
> 
> This series improves their performance by avoiding RTNL acquisition
> for their (repeated) name->index conversions.
> 
> v3: insert patch 2/3 in the series (Jakub reported a KASAN splat)
> v4: Addressed Sashiko's feedback.
>     added 2 patches for rtnl_dump_ifinfo().

The CI looks fried, various errors:

# 0.02 [+0.02] RTNETLINK answers: File exists
# 0.02 [+0.00] Failed to create netif

# CMD: ip -d -j link show dev eth8
#   EXIT: 1
#   STDOUT: []
#   STDERR: RTNETLINK answers: Message too long
#           Cannot send link get request: Message too long

At boot we hit:

[    0.661578] ------------[ cut here ]------------
[    0.661608] WARNING: net/core/rtnetlink.c:4296 at rtnl_getlink+0x457/0x5e0, CPU#3: ip/71
[    0.661656] Modules linked in:
[    0.661681] CPU: 3 UID: 0 PID: 71 Comm: ip Tainted: G        W           7.1.0-rc4-virtme #1 PREEMPT(lazy) 
[    0.661735] Tainted: [W]=WARN
[    0.661756] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[    0.661795] RIP: 0010:rtnl_getlink+0x457/0x5e0
[    0.661831] Code: ff ff 89 c3 48 83 c4 38 e8 76 44 fe ff 85 db 48 8b 7c 24 08 0f 84 2c 01 00 00 48 89 fe ba 02 00 00 00 31 ff e8 4a b7 fb ff 90 <0f> 0b 90 b8 a6 ff ff ff 49 8b 94 24 40 05 00 00 65 ff 0a 80 7c 24
[    0.661928] RSP: 0018:ff87a370c027b7a8 EFLAGS: 00010296
[    0.661956] RAX: 0000000000000010 RBX: 00000000ffffffa6 RCX: ff368f1541d3ef00
[    0.661998] RDX: ff368f157edadda0 RSI: 0000000000000011 RDI: ff368f15411ffa00
[    0.662041] RBP: ff368f1541e60b00 R08: ff368f1541845810 R09: ffffffff9ec3bbe6
[    0.662076] R10: ffd7c3a340079800 R11: ff368f15411ffa00 R12: ff368f1541e31000
[    0.662126] R13: ffffffff9fe4f700 R14: 0000000000000009 R15: ffffffff9fe4f700
[    0.662176] FS:  00007fa9d67e0600(0000) GS:ff368f15df00c000(0000) knlGS:0000000000000000
[    0.662219] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.662253] CR2: 00007fff3c059c78 CR3: 000000000190f005 CR4: 0000000000771ef0
[    0.662294] PKRU: 55555554
[    0.662307] Call Trace:
[    0.662319]  <TASK>
[    0.662338]  ? rtnl_fill_ifinfo.isra.0+0x1670/0x1670
[    0.662366]  rtnetlink_rcv_msg+0x39f/0x460
[    0.662390]  ? rtnl_calcit.isra.0+0x160/0x160
[    0.662420]  netlink_rcv_skb+0xca/0x140
[    0.662445]  netlink_unicast+0x26b/0x3a0
[    0.662467]  netlink_sendmsg+0x1e2/0x430
[    0.662489]  ____sys_sendmsg+0x14c/0x2b0
[    0.662511]  ___sys_sendmsg+0xe1/0x120
[    0.662539]  __sys_sendmsg+0xad/0x100
[    0.662560]  do_syscall_64+0x104/0xfc0
[    0.662585]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
[    0.662609] RIP: 0033:0x7fa9d6a1808e
[    0.662631] Code: 4d 89 d8 e8 94 bd 00 00 4c 8b 5d f8 41 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 11 c9 c3 0f 1f 80 00 00 00 00 48 8b 45 10 0f 05 <c9> c3 83 e2 39 83 fa 08 75 e7 e8 03 ff ff ff 0f 1f 00 f3 0f 1e fa
[    0.662733] RSP: 002b:00007fff3c059b50 EFLAGS: 00000202 ORIG_RAX: 000000000000002e
[    0.662775] RAX: ffffffffffffffda RBX: 00007fff3c05ce1d RCX: 00007fa9d6a1808e
[    0.662823] RDX: 0000000000000000 RSI: 00007fff3c059c00 RDI: 0000000000000004
[    0.662864] RBP: 00007fff3c059b60 R08: 0000000000000000 R09: 0000000000000000
[    0.662907] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
[    0.662947] R13: 000000006a10c531 R14: 00007fff3c059ca0 R15: 00007fff3c05ce1d
[    0.662982]  </TASK>
[    0.662998] ---[ end trace 0000000000000000 ]---
[    0.663079] ------------[ cut here ]------------
[    0.663110] WARNING: net/core/rtnetlink.c:4523 at rtmsg_ifinfo_build_skb+0xc8/0x110, CPU#3: ip/71
[    0.663162] Modules linked in:
[    0.663186] CPU: 3 UID: 0 PID: 71 Comm: ip Tainted: G        W           7.1.0-rc4-virtme #1 PREEMPT(lazy) 
[    0.663235] Tainted: [W]=WARN
[    0.663256] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[    0.663289] RIP: 0010:rtmsg_ifinfo_build_skb+0xc8/0x110
[    0.663319] Code: 80 00 00 00 e8 59 d9 ff ff 48 83 c4 38 85 c0 75 18 48 83 c4 08 4c 89 f8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 44 8b 48 08 eb b4 90 <0f> 0b 90 ba 02 00 00 00 4c 89 fe 31 ff e8 36 ab fb ff b9 a6 ff ff
[    0.663414] RSP: 0018:ff87a370c027b720 EFLAGS: 00010286
[    0.663439] RAX: 00000000ffffffa6 RBX: 0000000000000001 RCX: 0000000000000000
[    0.663485] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ff368f1541e60700
[    0.663527] RBP: 0000000000000000 R08: ff368f1541845810 R09: ff368f154298f02c
[    0.663569] R10: ff368f1541e31120 R11: fefefefefefefeff R12: 0000000000000000
[    0.663612] R13: 0000000000000010 R14: ff368f1541e31000 R15: ff368f1541e60700
[    0.663655] FS:  00007fa9d67e0600(0000) GS:ff368f15df00c000(0000) knlGS:0000000000000000
[    0.663699] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.663730] CR2: 000055ed56288988 CR3: 000000000190f005 CR4: 0000000000771ef0
[    0.663774] PKRU: 55555554
[    0.663787] Call Trace:
[    0.663803]  <TASK>
[    0.663821]  rtmsg_ifinfo+0x3c/0xa0
[    0.663845]  __dev_notify_flags+0xb1/0xf0
[    0.663867]  ? rtnl_getlink+0x456/0x5e0
[    0.663887]  netif_change_flags+0x54/0x70
[    0.663913]  do_setlink.isra.0+0x3a2/0x1500
[    0.663939]  ? __nla_validate_parse+0x76/0xf20
[    0.663970]  rtnl_newlink+0x9d3/0xd90
[    0.663993]  ? do_setlink.isra.0+0x1500/0x1500
[    0.664015]  rtnetlink_rcv_msg+0x39f/0x460
[    0.664035]  ? get_page_from_freelist+0x157a/0x16a0
[    0.664068]  ? rtnl_calcit.isra.0+0x160/0x160
[    0.664091]  netlink_rcv_skb+0xca/0x140
[    0.664118]  netlink_unicast+0x26b/0x3a0
[    0.664140]  netlink_sendmsg+0x1e2/0x430
[    0.664162]  ____sys_sendmsg+0x14c/0x2b0
[    0.664182]  ___sys_sendmsg+0xe1/0x120
[    0.664204]  __sys_sendmsg+0xad/0x100
[    0.664226]  do_syscall_64+0x104/0xfc0
[    0.664248]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
[    0.664275] RIP: 0033:0x7fa9d6a1808e
[    0.664295] Code: 4d 89 d8 e8 94 bd 00 00 4c 8b 5d f8 41 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 11 c9 c3 0f 1f 80 00 00 00 00 48 8b 45 10 0f 05 <c9> c3 83 e2 39 83 fa 08 75 e7 e8 03 ff ff ff 0f 1f 00 f3 0f 1e fa
[    0.664397] RSP: 002b:00007fff3c05b1b0 EFLAGS: 00000202 ORIG_RAX: 000000000000002e
[    0.664437] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fa9d6a1808e
[    0.664478] RDX: 0000000000000000 RSI: 00007fff3c05b260 RDI: 0000000000000003
[    0.664520] RBP: 00007fff3c05b1c0 R08: 0000000000000000 R09: 0000000000000000
[    0.664559] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000003
[    0.664604] R13: 000000006a10c531 R14: 000055ed1cfb1040 R15: 0000000000000000
[    0.664645]  </TASK>
[    0.664658] ---[ end trace 0000000000000000 ]---
