Netdev List
* [PATCH net-next 0/2] rtnetlink: RTNL avoidance in rtnl_getlink()
@ 2026-05-18 10:01 Eric Dumazet
  2026-05-18 10:01 ` [PATCH net-next 1/2] rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list() Eric Dumazet
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Eric Dumazet @ 2026-05-18 10:01 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Simon Horman, netdev, eric.dumazet,
	Eric Dumazet

Many shell scripts invoke iproute2 commands specifying a device by
its name.

This series improves their performance by avoiding RTNL acquisition
for their (repeated) name->index conversion.

Eric Dumazet (2):
  rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list()
  rtnetlink: do not acquire RTNL for RTM_GETLINK with
    RTEXT_FILTER_NAME_ONLY

 net/core/rtnetlink.c | 74 ++++++++++++++++++++++++++++++++------------
 1 file changed, 54 insertions(+), 20 deletions(-)

-- 
2.54.0.563.g4f69b47b94-goog



* [PATCH net-next 1/2] rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list()
  2026-05-18 10:01 [PATCH net-next 0/2] rtnetlink: RTNL avoidance in rtnl_getlink() Eric Dumazet
@ 2026-05-18 10:01 ` Eric Dumazet
  2026-05-18 10:01 ` [PATCH net-next 2/2] rtnetlink: do not acquire RTNL for RTM_GETLINK with RTEXT_FILTER_NAME_ONLY Eric Dumazet
  2026-05-18 16:38 ` [syzbot ci] Re: rtnetlink: RTNL avoidance in rtnl_getlink() syzbot ci
  2 siblings, 0 replies; 5+ messages in thread
From: Eric Dumazet @ 2026-05-18 10:01 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Simon Horman, netdev, eric.dumazet,
	Eric Dumazet

Avoid corrupting a netlink message and confusing user space in the
unlikely case that rtnl_fill_prop_list() produces a very big
nested element.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/rtnetlink.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 6a5e9ace55a0880d7b1e4303d12dc0a8b8b7c5ed..ae0254f19178735b2805a8189e81a960a49b2858 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1971,7 +1971,9 @@ static int rtnl_fill_prop_list(struct sk_buff *skb,
 	if (ret <= 0)
 		goto nest_cancel;
 
-	nla_nest_end(skb, prop_list);
+	if (nla_nest_end_safe(skb, prop_list) < 0)
+		goto nest_cancel;
+
 	return 0;
 
 nest_cancel:
-- 
2.54.0.563.g4f69b47b94-goog
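
For readers unfamiliar with the failure mode patch 1/2 guards against:
the nla_len field of struct nlattr is 16 bits wide, so a nest longer
than 65535 bytes cannot be represented and the stored length wraps.
Below is a minimal user-space sketch of that idea, not the kernel
helper itself. The patch only shows that nla_nest_end_safe() is tested
with "< 0"; the choice of -EMSGSIZE and every model_* name here are
assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model of a netlink attribute header.
 * Like the real struct nlattr, it stores its length in 16 bits.
 */
struct model_nlattr {
	uint16_t nla_len;
	uint16_t nla_type;
};

/* Unchecked variant: a too-large length is silently truncated,
 * which is what would corrupt the message seen by user space.
 */
static inline void model_nest_end(struct model_nlattr *nest, size_t total_len)
{
	nest->nla_len = (uint16_t)total_len;
}

/* Checked variant: refuse a nest that does not fit in 16 bits and
 * leave the header untouched so the caller can cancel the nest.
 * Returns 0 on success, -EMSGSIZE (an assumption) on overflow.
 */
static inline int model_nest_end_safe(struct model_nlattr *nest, size_t total_len)
{
	/* A nest always contains at least its own header. */
	assert(total_len >= sizeof(*nest));

	if (total_len > UINT16_MAX)
		return -EMSGSIZE;

	nest->nla_len = (uint16_t)total_len;
	return 0;
}
```

With the unchecked variant, a nest of 65536 + 8 bytes ends up recorded
as 8 bytes long; the checked variant reports the error instead, which
is what lets the caller take its nest_cancel path.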



* [PATCH net-next 2/2] rtnetlink: do not acquire RTNL for RTM_GETLINK with RTEXT_FILTER_NAME_ONLY
  2026-05-18 10:01 [PATCH net-next 0/2] rtnetlink: RTNL avoidance in rtnl_getlink() Eric Dumazet
  2026-05-18 10:01 ` [PATCH net-next 1/2] rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list() Eric Dumazet
@ 2026-05-18 10:01 ` Eric Dumazet
  2026-05-18 16:38 ` [syzbot ci] Re: rtnetlink: RTNL avoidance in rtnl_getlink() syzbot ci
  2 siblings, 0 replies; 5+ messages in thread
From: Eric Dumazet @ 2026-05-18 10:01 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Simon Horman, netdev, eric.dumazet,
	Eric Dumazet

When RTEXT_FILTER_NAME_ONLY is requested, rtnl_fill_ifinfo()
only dumps device attributes that do not need RTNL protection.

Many shell scripts invoke iproute2 commands specifying a device by
its name. After this patch, they will no longer add RTNL pressure.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/rtnetlink.c | 70 ++++++++++++++++++++++++++++++++------------
 1 file changed, 51 insertions(+), 19 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index ae0254f19178735b2805a8189e81a960a49b2858..8fe6260a0feaeea7ba35ddca3c1942f1420aecae 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3468,6 +3468,21 @@ static struct net_device *rtnl_dev_get(struct net *net,
 	return __dev_get_by_name(net, ifname);
 }
 
+static struct net_device *rtnl_dev_get_rcu(struct net *net,
+					   struct nlattr *tb[])
+{
+	char ifname[ALTIFNAMSIZ];
+
+	if (tb[IFLA_IFNAME])
+		nla_strscpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
+	else if (tb[IFLA_ALT_IFNAME])
+		nla_strscpy(ifname, tb[IFLA_ALT_IFNAME], ALTIFNAMSIZ);
+	else
+		return NULL;
+
+	return dev_get_by_name_rcu(net, ifname);
+}
+
 static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 			struct netlink_ext_ack *extack)
 {
@@ -4187,14 +4202,15 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 			struct netlink_ext_ack *extack)
 {
 	struct net *net = sock_net(skb->sk);
+	struct nlattr *tb[IFLA_MAX + 1];
+	netdevice_tracker dev_tracker;
+	struct net_device *dev = NULL;
 	struct net *tgt_net = net;
+	u32 ext_filter_mask = 0;
 	struct ifinfomsg *ifm;
-	struct nlattr *tb[IFLA_MAX+1];
-	struct net_device *dev = NULL;
 	struct sk_buff *nskb;
 	int netnsid = -1;
 	int err;
-	u32 ext_filter_mask = 0;
 
 	err = rtnl_valid_getlink_req(skb, nlh, tb, extack);
 	if (err < 0)
@@ -4214,14 +4230,19 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 	if (tb[IFLA_EXT_MASK])
 		ext_filter_mask = nla_get_u32(tb[IFLA_EXT_MASK]);
 
-	err = -EINVAL;
 	ifm = nlmsg_data(nlh);
-	if (ifm->ifi_index > 0)
-		dev = __dev_get_by_index(tgt_net, ifm->ifi_index);
-	else if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME])
-		dev = rtnl_dev_get(tgt_net, tb);
-	else
+	rcu_read_lock();
+	if (ifm->ifi_index > 0) {
+		dev = dev_get_by_index_rcu(tgt_net, ifm->ifi_index);
+	} else if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME]) {
+		dev = rtnl_dev_get_rcu(tgt_net, tb);
+	} else {
+		rcu_read_unlock();
+		err = -EINVAL;
 		goto out;
+	}
+	netdev_hold(dev, &dev_tracker, GFP_ATOMIC);
+	rcu_read_unlock();
 
 	err = -ENODEV;
 	if (dev == NULL)
@@ -4232,25 +4253,35 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 	if (nskb == NULL)
 		goto out;
 
-	/* Synchronize the carrier state so we don't report a state
-	 * that we're not actually going to honour immediately; if
-	 * the driver just did a carrier off->on transition, we can
-	 * only TX if link watch work has run, but without this we'd
-	 * already report carrier on, even if it doesn't work yet.
-	 */
-	linkwatch_sync_dev(dev);
+	if (!(ext_filter_mask & RTEXT_FILTER_NAME_ONLY)) {
+		rtnl_lock();
+		/* Synchronize the carrier state so we don't report a state
+		 * that we're not actually going to honour immediately; if
+		 * the driver just did a carrier off->on transition, we can
+		 * only TX if link watch work has run, but without this we'd
+		 * already report carrier on, even if it doesn't work yet.
+		 */
+		linkwatch_sync_dev(dev);
+	}
 
 	err = rtnl_fill_ifinfo(nskb, dev, net,
 			       RTM_NEWLINK, NETLINK_CB(skb).portid,
 			       nlh->nlmsg_seq, 0, 0, ext_filter_mask,
 			       0, NULL, 0, netnsid, GFP_KERNEL);
+
+	if (!(ext_filter_mask & RTEXT_FILTER_NAME_ONLY))
+		rtnl_unlock();
+
 	if (err < 0) {
 		/* -EMSGSIZE implies BUG in if_nlmsg_size */
-		WARN_ON(err == -EMSGSIZE);
+		WARN_ON_ONCE(err == -EMSGSIZE &&
+			     !(ext_filter_mask & RTEXT_FILTER_NAME_ONLY));
 		kfree_skb(nskb);
-	} else
+	} else {
 		err = rtnl_unicast(nskb, net, NETLINK_CB(skb).portid);
+	}
 out:
+	netdev_put(dev, &dev_tracker);
 	if (netnsid >= 0)
 		put_net(tgt_net);
 
@@ -7116,7 +7147,8 @@ static const struct rtnl_msg_handler rtnetlink_rtnl_msg_handlers[] __initconst =
 	{.msgtype = RTM_DELLINK, .doit = rtnl_dellink,
 	 .flags = RTNL_FLAG_DOIT_PERNET_WIP},
 	{.msgtype = RTM_GETLINK, .doit = rtnl_getlink,
-	 .dumpit = rtnl_dump_ifinfo, .flags = RTNL_FLAG_DUMP_SPLIT_NLM_DONE},
+	 .dumpit = rtnl_dump_ifinfo,
+	 .flags = RTNL_FLAG_DUMP_SPLIT_NLM_DONE | RTNL_FLAG_DOIT_UNLOCKED},
 	{.msgtype = RTM_SETLINK, .doit = rtnl_setlink,
 	 .flags = RTNL_FLAG_DOIT_PERNET_WIP},
 	{.msgtype = RTM_GETADDR, .dumpit = rtnl_dump_all},
-- 
2.54.0.563.g4f69b47b94-goog



* [syzbot ci] Re: rtnetlink: RTNL avoidance in rtnl_getlink()
  2026-05-18 10:01 [PATCH net-next 0/2] rtnetlink: RTNL avoidance in rtnl_getlink() Eric Dumazet
  2026-05-18 10:01 ` [PATCH net-next 1/2] rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list() Eric Dumazet
  2026-05-18 10:01 ` [PATCH net-next 2/2] rtnetlink: do not acquire RTNL for RTM_GETLINK with RTEXT_FILTER_NAME_ONLY Eric Dumazet
@ 2026-05-18 16:38 ` syzbot ci
  2026-05-18 16:53   ` Eric Dumazet
  2 siblings, 1 reply; 5+ messages in thread
From: syzbot ci @ 2026-05-18 16:38 UTC
  To: davem, edumazet, eric.dumazet, horms, kuba, kuniyu, netdev,
	pabeni
  Cc: syzbot, syzkaller-bugs

syzbot ci has tested the following series

[v1] rtnetlink: RTNL avoidance in rtnl_getlink()
https://lore.kernel.org/all/20260518100134.809042-1-edumazet@google.com
* [PATCH net-next 1/2] rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list()
* [PATCH net-next 2/2] rtnetlink: do not acquire RTNL for RTM_GETLINK with RTEXT_FILTER_NAME_ONLY

and found the following issue:
WARNING in rtnl_fill_ifinfo

Full report is available here:
https://ci.syzbot.org/series/c0efec5e-90ce-4ba2-8383-16d7bc172521

***

WARNING in rtnl_fill_ifinfo

tree:      net-next
URL:       https://kernel.googlesource.com/pub/scm/linux/kernel/git/netdev/net-next.git
base:      627ac78f2741e2ebd2225e2e953b6964a8a9182f
arch:      amd64
compiler:  Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
config:    https://ci.syzbot.org/builds/9507e45e-17a8-457b-a47d-5229f306e1f1/config
syz repro: https://ci.syzbot.org/findings/e56b032e-6444-4682-b577-34e22399b7cc/syz_repro

------------[ cut here ]------------
RTNL: assertion failed at net/core/rtnetlink.c (2071)
WARNING: net/core/rtnetlink.c:2071 at rtnl_fill_ifinfo+0xc73/0x2770 net/core/rtnetlink.c:2071, CPU#1: syz.1.18/5818
Modules linked in:
CPU: 1 UID: 0 PID: 5818 Comm: syz.1.18 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:rtnl_fill_ifinfo+0xc7f/0x2770 net/core/rtnetlink.c:2071
Code: 5a 41 f8 45 85 f6 74 2c e8 be 55 41 f8 e9 91 fc ff ff e8 b4 55 41 f8 48 8d 3d 9d af bb 06 48 c7 c6 40 1c de 8c ba 17 08 00 00 <67> 48 0f b9 3a e9 5d f4 ff ff 48 8d 8c 24 00 02 00 00 48 8b 44 24
RSP: 0018:ffffc90003bcee40 EFLAGS: 00010293
RAX: ffffffff89846eac RBX: 1ffff92000779dd4 RCX: ffff88816b500000
RDX: 0000000000000817 RSI: ffffffff8cde1c40 RDI: ffffffff90401e50
RBP: ffffc90003bcf118 R08: ffffffff8fdd1227 R09: 1ffffffff1fba244
R10: dffffc0000000000 R11: fffffbfff1fba245 R12: 0000000000000010
R13: 0000000000000003 R14: 1ffff92000779dd4 R15: ffff88816b898000
FS:  00007f293800b6c0(0000) GS:ffff8882a927a000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f2937072780 CR3: 000000016a7d4000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 rtnl_getlink+0xce4/0x10d0 net/core/rtnetlink.c:4267
 rtnetlink_rcv_msg+0x7d5/0xbe0 net/core/rtnetlink.c:7042
 netlink_rcv_skb+0x232/0x4b0 net/netlink/af_netlink.c:2551
 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
 netlink_unicast+0x75c/0x8e0 net/netlink/af_netlink.c:1345
 netlink_sendmsg+0x813/0xb40 net/netlink/af_netlink.c:1895
 sock_sendmsg_nosec net/socket.c:787 [inline]
 __sock_sendmsg net/socket.c:802 [inline]
 ____sys_sendmsg+0x972/0x9f0 net/socket.c:2698
 ___sys_sendmsg+0x2a5/0x360 net/socket.c:2752
 __sys_sendmsg net/socket.c:2784 [inline]
 __do_sys_sendmsg net/socket.c:2789 [inline]
 __se_sys_sendmsg net/socket.c:2787 [inline]
 __x64_sys_sendmsg+0x1bd/0x2a0 net/socket.c:2787
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x15f/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f293719ce59
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f293800b028 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f2937415fa0 RCX: 00007f293719ce59
RDX: 0000000000000004 RSI: 0000200000000ac0 RDI: 0000000000000003
RBP: 00007f2937232d6f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f2937416038 R14: 00007f2937415fa0 R15: 00007ffffb498538
 </TASK>
----------------
Code disassembly (best guess):
   0:	5a                   	pop    %rdx
   1:	41 f8                	rex.B clc
   3:	45 85 f6             	test   %r14d,%r14d
   6:	74 2c                	je     0x34
   8:	e8 be 55 41 f8       	call   0xf84155cb
   d:	e9 91 fc ff ff       	jmp    0xfffffca3
  12:	e8 b4 55 41 f8       	call   0xf84155cb
  17:	48 8d 3d 9d af bb 06 	lea    0x6bbaf9d(%rip),%rdi        # 0x6bbafbb
  1e:	48 c7 c6 40 1c de 8c 	mov    $0xffffffff8cde1c40,%rsi
  25:	ba 17 08 00 00       	mov    $0x817,%edx
* 2a:	67 48 0f b9 3a       	ud1    (%edx),%rdi <-- trapping instruction
  2f:	e9 5d f4 ff ff       	jmp    0xfffff491
  34:	48 8d 8c 24 00 02 00 	lea    0x200(%rsp),%rcx
  3b:	00
  3c:	48                   	rex.W
  3d:	8b                   	.byte 0x8b
  3e:	44                   	rex.R
  3f:	24                   	.byte 0x24


***

If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
  Tested-by: syzbot@syzkaller.appspotmail.com

---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.

To test a patch for this bug, please reply with `#syz test`
(should be on a separate line).

The patch should be attached to the email.
Note: arguments like custom git repos and branches are not supported.


* Re: [syzbot ci] Re: rtnetlink: RTNL avoidance in rtnl_getlink()
  2026-05-18 16:38 ` [syzbot ci] Re: rtnetlink: RTNL avoidance in rtnl_getlink() syzbot ci
@ 2026-05-18 16:53   ` Eric Dumazet
  0 siblings, 0 replies; 5+ messages in thread
From: Eric Dumazet @ 2026-05-18 16:53 UTC
  To: syzbot ci
  Cc: davem, eric.dumazet, horms, kuba, kuniyu, netdev, pabeni, syzbot,
	syzkaller-bugs

On Mon, May 18, 2026 at 9:38 AM syzbot ci
<syzbot+ci0e37b1eaa858ac77@syzkaller.appspotmail.com> wrote:
>
> syzbot ci has tested the following series
>
> [v1] rtnetlink: RTNL avoidance in rtnl_getlink()
> https://lore.kernel.org/all/20260518100134.809042-1-edumazet@google.com
> * [PATCH net-next 1/2] rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list()
> * [PATCH net-next 2/2] rtnetlink: do not acquire RTNL for RTM_GETLINK with RTEXT_FILTER_NAME_ONLY
>
> and found the following issue:
> WARNING in rtnl_fill_ifinfo
>
> Full report is available here:
> https://ci.syzbot.org/series/c0efec5e-90ce-4ba2-8383-16d7bc172521
>
> ***
>

Hmmm.  I thought I moved the ASSERT_RTNL(), but apparently this part was lost.
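
To illustrate the mismatch the report points at: the posted series
takes RTNL in rtnl_getlink() only when RTEXT_FILTER_NAME_ONLY is not
set, while the trace shows an RTNL assertion still firing inside
rtnl_fill_ifinfo(). The following user-space sketch models that
interaction. It is not the actual fix, which is not part of this
thread; the model_* names and the flag value are hypothetical, and
making the assertion conditional is just one possible shape.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in; NOT the real RTEXT_FILTER_NAME_ONLY value. */
#define MODEL_FILTER_NAME_ONLY (1u << 0)

static bool model_rtnl_held;       /* models "this task holds RTNL" */
static int model_assert_failures;  /* counts would-be RTNL assertion splats */

/* Stand-in for ASSERT_RTNL(): record a failure instead of warning. */
static void model_assert_rtnl(void)
{
	if (!model_rtnl_held)
		model_assert_failures++;
}

/* Fill routine. With unconditional_assert set it behaves like the
 * code syzbot tested; otherwise the assertion is skipped for
 * name-only requests, which touch no RTNL-protected state.
 */
static void model_fill_ifinfo(uint32_t ext_filter_mask, bool unconditional_assert)
{
	if (unconditional_assert || !(ext_filter_mask & MODEL_FILTER_NAME_ONLY))
		model_assert_rtnl();
}

/* Caller shaped like the patched rtnl_getlink(): lock only when the
 * request is not name-only.
 */
static void model_getlink(uint32_t ext_filter_mask, bool unconditional_assert)
{
	bool need_rtnl = !(ext_filter_mask & MODEL_FILTER_NAME_ONLY);

	if (need_rtnl)
		model_rtnl_held = true;   /* models rtnl_lock() */

	model_fill_ifinfo(ext_filter_mask, unconditional_assert);

	if (need_rtnl)
		model_rtnl_held = false;  /* models rtnl_unlock() */
}
```

In this model, a name-only request trips the unconditional assertion
once, while full requests and the conditional variant never do.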

