* [PATCH v2 net-next 0/2] rtnetlink: RTNL avoidance in rtnl_getlink()
@ 2026-05-19 11:43 Eric Dumazet
2026-05-19 11:43 ` [PATCH v2 net-next 1/2] rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list() Eric Dumazet
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Eric Dumazet @ 2026-05-19 11:43 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Kuniyuki Iwashima, netdev, eric.dumazet,
Eric Dumazet
Many shell scripts invoke iproute2 commands specifying a device by
its name.
This series improves their performance by avoiding RTNL acquisition
for their (repeated) name->index conversion.
Eric Dumazet (2):
rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list()
rtnetlink: do not acquire RTNL for RTM_GETLINK with
RTEXT_FILTER_NAME_ONLY
net/core/rtnetlink.c | 76 ++++++++++++++++++++++++++++++++------------
1 file changed, 55 insertions(+), 21 deletions(-)
--
2.54.0.563.g4f69b47b94-goog
* [PATCH v2 net-next 1/2] rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list()
2026-05-19 11:43 [PATCH v2 net-next 0/2] rtnetlink: RTNL avoidance in rtnl_getlink() Eric Dumazet
@ 2026-05-19 11:43 ` Eric Dumazet
2026-05-19 16:39 ` Jakub Kicinski
2026-05-19 11:43 ` [PATCH v2 net-next 2/2] rtnetlink: do not acquire RTNL for RTM_GETLINK with RTEXT_FILTER_NAME_ONLY Eric Dumazet
2026-05-19 16:37 ` [PATCH v2 net-next 0/2] rtnetlink: RTNL avoidance in rtnl_getlink() Jakub Kicinski
2 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2026-05-19 11:43 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Kuniyuki Iwashima, netdev, eric.dumazet,
Eric Dumazet
Avoid corrupting a netlink message and confusing user space in the
unlikely case that rtnl_fill_prop_list() produces a very big
nested element.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/core/rtnetlink.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 6a5e9ace55a0880d7b1e4303d12dc0a8b8b7c5ed..ae0254f19178735b2805a8189e81a960a49b2858 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1971,7 +1971,9 @@ static int rtnl_fill_prop_list(struct sk_buff *skb,
if (ret <= 0)
goto nest_cancel;
- nla_nest_end(skb, prop_list);
+ if (nla_nest_end_safe(skb, prop_list) < 0)
+ goto nest_cancel;
+
return 0;
nest_cancel:
--
2.54.0.563.g4f69b47b94-goog
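For readers following the thread, the kind of corruption this patch guards against can be sketched in userspace. The nla_len field of struct nlattr is 16 bits wide, so a nest whose size exceeds 65535 bytes is silently truncated when its length is written back. The struct and helpers below (fake_nlattr, nest_end_unchecked, nest_end_checked) are hypothetical stand-ins, and the check is only an assumption about what a "safe" nest close would do; this is not the kernel implementation of nla_nest_end_safe().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct nlattr: 16-bit length, 16-bit type. */
struct fake_nlattr {
	uint16_t nla_len;
	uint16_t nla_type;
};

/* Unchecked close: the real size is truncated to 16 bits on store. */
static void nest_end_unchecked(struct fake_nlattr *nest, size_t real_len)
{
	assert(nest != NULL);
	nest->nla_len = (uint16_t)real_len;
}

/* Checked close: reject sizes that do not fit so the caller can cancel. */
static int nest_end_checked(struct fake_nlattr *nest, size_t real_len)
{
	assert(nest != NULL);
	if (real_len > UINT16_MAX)
		return -EMSGSIZE;
	nest->nla_len = (uint16_t)real_len;
	return 0;
}
```

With a 70000-byte nest, the unchecked variant records a length of 4464 (70000 modulo 65536), so a parser would stop in the middle of the nest and misread the remaining bytes as top-level attributes. The checked variant reports the failure instead, which corresponds to the `goto nest_cancel` path in the patch.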
* [PATCH v2 net-next 2/2] rtnetlink: do not acquire RTNL for RTM_GETLINK with RTEXT_FILTER_NAME_ONLY
2026-05-19 11:43 [PATCH v2 net-next 0/2] rtnetlink: RTNL avoidance in rtnl_getlink() Eric Dumazet
2026-05-19 11:43 ` [PATCH v2 net-next 1/2] rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list() Eric Dumazet
@ 2026-05-19 11:43 ` Eric Dumazet
2026-05-19 16:37 ` [PATCH v2 net-next 0/2] rtnetlink: RTNL avoidance in rtnl_getlink() Jakub Kicinski
2 siblings, 0 replies; 8+ messages in thread
From: Eric Dumazet @ 2026-05-19 11:43 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Kuniyuki Iwashima, netdev, eric.dumazet,
Eric Dumazet
When RTEXT_FILTER_NAME_ONLY is requested, rtnl_fill_ifinfo()
dumps only device attributes which do not need RTNL protection.
Many shell scripts invoke iproute2 commands specifying a device by
its name. After this patch, they will no longer add RTNL pressure.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
v2: move the ASSERT_RTNL() in rtnl_fill_ifinfo()
net/core/rtnetlink.c | 72 ++++++++++++++++++++++++++++++++------------
1 file changed, 52 insertions(+), 20 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index ae0254f19178735b2805a8189e81a960a49b2858..68cd2238ee170f44841caf47c86ef48303a3d15e 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -2068,7 +2068,6 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb,
struct nlmsghdr *nlh;
struct Qdisc *qdisc;
- ASSERT_RTNL();
nlh = nlmsg_put(skb, pid, seq, type, sizeof(*ifm), flags);
if (nlh == NULL)
return -EMSGSIZE;
@@ -2091,6 +2090,7 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb,
if (ext_filter_mask & RTEXT_FILTER_NAME_ONLY)
goto end;
+ ASSERT_RTNL();
if (tgt_netnsid >= 0 &&
nla_put_s32(skb, IFLA_TARGET_NETNSID, tgt_netnsid))
goto nla_put_failure;
@@ -3468,6 +3468,21 @@ static struct net_device *rtnl_dev_get(struct net *net,
return __dev_get_by_name(net, ifname);
}
+static struct net_device *rtnl_dev_get_rcu(struct net *net,
+ struct nlattr *tb[])
+{
+ char ifname[ALTIFNAMSIZ];
+
+ if (tb[IFLA_IFNAME])
+ nla_strscpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
+ else if (tb[IFLA_ALT_IFNAME])
+ nla_strscpy(ifname, tb[IFLA_ALT_IFNAME], ALTIFNAMSIZ);
+ else
+ return NULL;
+
+ return dev_get_by_name_rcu(net, ifname);
+}
+
static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
@@ -4187,14 +4202,15 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
struct net *net = sock_net(skb->sk);
+ struct nlattr *tb[IFLA_MAX + 1];
+ netdevice_tracker dev_tracker;
+ struct net_device *dev = NULL;
struct net *tgt_net = net;
+ u32 ext_filter_mask = 0;
struct ifinfomsg *ifm;
- struct nlattr *tb[IFLA_MAX+1];
- struct net_device *dev = NULL;
struct sk_buff *nskb;
int netnsid = -1;
int err;
- u32 ext_filter_mask = 0;
err = rtnl_valid_getlink_req(skb, nlh, tb, extack);
if (err < 0)
@@ -4214,14 +4230,19 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr *nlh,
if (tb[IFLA_EXT_MASK])
ext_filter_mask = nla_get_u32(tb[IFLA_EXT_MASK]);
- err = -EINVAL;
ifm = nlmsg_data(nlh);
- if (ifm->ifi_index > 0)
- dev = __dev_get_by_index(tgt_net, ifm->ifi_index);
- else if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME])
- dev = rtnl_dev_get(tgt_net, tb);
- else
+ rcu_read_lock();
+ if (ifm->ifi_index > 0) {
+ dev = dev_get_by_index_rcu(tgt_net, ifm->ifi_index);
+ } else if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME]) {
+ dev = rtnl_dev_get_rcu(tgt_net, tb);
+ } else {
+ rcu_read_unlock();
+ err = -EINVAL;
goto out;
+ }
+ netdev_hold(dev, &dev_tracker, GFP_ATOMIC);
+ rcu_read_unlock();
err = -ENODEV;
if (dev == NULL)
@@ -4232,25 +4253,35 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr *nlh,
if (nskb == NULL)
goto out;
- /* Synchronize the carrier state so we don't report a state
- * that we're not actually going to honour immediately; if
- * the driver just did a carrier off->on transition, we can
- * only TX if link watch work has run, but without this we'd
- * already report carrier on, even if it doesn't work yet.
- */
- linkwatch_sync_dev(dev);
+ if (!(ext_filter_mask & RTEXT_FILTER_NAME_ONLY)) {
+ rtnl_lock();
+ /* Synchronize the carrier state so we don't report a state
+ * that we're not actually going to honour immediately; if
+ * the driver just did a carrier off->on transition, we can
+ * only TX if link watch work has run, but without this we'd
+ * already report carrier on, even if it doesn't work yet.
+ */
+ linkwatch_sync_dev(dev);
+ }
err = rtnl_fill_ifinfo(nskb, dev, net,
RTM_NEWLINK, NETLINK_CB(skb).portid,
nlh->nlmsg_seq, 0, 0, ext_filter_mask,
0, NULL, 0, netnsid, GFP_KERNEL);
+
+ if (!(ext_filter_mask & RTEXT_FILTER_NAME_ONLY))
+ rtnl_unlock();
+
if (err < 0) {
/* -EMSGSIZE implies BUG in if_nlmsg_size */
- WARN_ON(err == -EMSGSIZE);
+ WARN_ON_ONCE(err == -EMSGSIZE &&
+ !(ext_filter_mask & RTEXT_FILTER_NAME_ONLY));
kfree_skb(nskb);
- } else
+ } else {
err = rtnl_unicast(nskb, net, NETLINK_CB(skb).portid);
+ }
out:
+ netdev_put(dev, &dev_tracker);
if (netnsid >= 0)
put_net(tgt_net);
@@ -7116,7 +7147,8 @@ static const struct rtnl_msg_handler rtnetlink_rtnl_msg_handlers[] __initconst =
{.msgtype = RTM_DELLINK, .doit = rtnl_dellink,
.flags = RTNL_FLAG_DOIT_PERNET_WIP},
{.msgtype = RTM_GETLINK, .doit = rtnl_getlink,
- .dumpit = rtnl_dump_ifinfo, .flags = RTNL_FLAG_DUMP_SPLIT_NLM_DONE},
+ .dumpit = rtnl_dump_ifinfo,
+ .flags = RTNL_FLAG_DUMP_SPLIT_NLM_DONE | RTNL_FLAG_DOIT_UNLOCKED},
{.msgtype = RTM_SETLINK, .doit = rtnl_setlink,
.flags = RTNL_FLAG_DOIT_PERNET_WIP},
{.msgtype = RTM_GETADDR, .dumpit = rtnl_dump_all},
--
2.54.0.563.g4f69b47b94-goog
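The control flow of the patched rtnl_getlink() can be summarized with a small userspace model: look the device up, pin it with a reference, and take the big lock only when the request asks for more than the name. Everything below is a hypothetical sketch (fake_dev, FILTER_NAME_ONLY, big_lock_taken, get_link are invented for illustration); a counter stands in for rtnl_lock() and a plain integer for the netdev reference, so it shows the shape of the logic rather than real locking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag standing in for RTEXT_FILTER_NAME_ONLY. */
#define FILTER_NAME_ONLY 0x1u

struct fake_dev {
	const char *name;
	int ifindex;
	int carrier;
	int refcnt;
};

static struct fake_dev devices[] = {
	{ "lo", 1, 1, 1 },
	{ "eth0", 2, 0, 1 },
};

/* Counts how often the modeled big lock was needed. */
static int big_lock_taken;

/* Models the lockless lookup done under rcu_read_lock() in the patch. */
static struct fake_dev *lookup(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(devices) / sizeof(devices[0]); i++)
		if (strcmp(devices[i].name, name) == 0)
			return &devices[i];
	return NULL;
}

static int get_link(const char *name, unsigned int mask, char *out, size_t len)
{
	struct fake_dev *dev;
	int n;

	assert(name && out && len > 0);

	dev = lookup(name);
	if (!dev)
		return -ENODEV;
	dev->refcnt++;		/* models netdev_hold() */

	if (mask & FILTER_NAME_ONLY) {
		/* Fast path: only fields that need no big lock. */
		n = snprintf(out, len, "%d %s", dev->ifindex, dev->name);
	} else {
		big_lock_taken++;	/* models rtnl_lock()/rtnl_unlock() */
		n = snprintf(out, len, "%d %s carrier=%d",
			     dev->ifindex, dev->name, dev->carrier);
	}

	dev->refcnt--;		/* models netdev_put() */
	return (n < 0 || (size_t)n >= len) ? -EMSGSIZE : 0;
}
```

Calling get_link("eth0", FILTER_NAME_ONLY, buf, sizeof(buf)) fills buf without touching the lock counter, while the same call with a mask of 0 increments it once. The reference taken before the lookup protection ends is what keeps the device valid across the later, possibly sleeping, steps.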
* Re: [PATCH v2 net-next 0/2] rtnetlink: RTNL avoidance in rtnl_getlink()
2026-05-19 11:43 [PATCH v2 net-next 0/2] rtnetlink: RTNL avoidance in rtnl_getlink() Eric Dumazet
2026-05-19 11:43 ` [PATCH v2 net-next 1/2] rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list() Eric Dumazet
2026-05-19 11:43 ` [PATCH v2 net-next 2/2] rtnetlink: do not acquire RTNL for RTM_GETLINK with RTEXT_FILTER_NAME_ONLY Eric Dumazet
@ 2026-05-19 16:37 ` Jakub Kicinski
2026-05-19 17:17 ` Eric Dumazet
2 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2026-05-19 16:37 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, Paolo Abeni, Simon Horman, Kuniyuki Iwashima,
netdev, eric.dumazet
On Tue, 19 May 2026 11:43:53 +0000 Eric Dumazet wrote:
> Many shell scripts invoke iproute2 commands specifying a device by
> its name.
>
> This series improves their performance avoiding RTNL acquisition
> for their (repeated) name->index conversion.
Hm.
[ 1414.868166][T10284] BUG: KASAN: slab-use-after-free in rtnl_fill_prop_list+0x5c0/0x620
[ 1414.868291][T10284] Read of size 8 at addr ff11000001d2c150 by task (udev-worker)/10284
[ 1414.868404][T10284]
[ 1414.868445][T10284] CPU: 2 UID: 0 PID: 10284 Comm: (udev-worker) Not tainted 7.1.0-rc3-virtme #1 PREEMPT(full)
[ 1414.868448][T10284] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 1414.868450][T10284] Call Trace:
[ 1414.868452][T10284] <TASK>
[ 1414.868453][T10284] dump_stack_lvl+0x6f/0xa0
[ 1414.868459][T10284] print_address_description.constprop.0+0x56/0x2d0
[ 1414.868464][T10284] print_report+0xfc/0x1fa
[ 1414.868466][T10284] ? __virt_addr_valid+0x102/0x440
[ 1414.868470][T10284] ? __virt_addr_valid+0x1da/0x440
[ 1414.868472][T10284] kasan_report+0x108/0x130
[ 1414.868475][T10284] ? rtnl_fill_prop_list+0x5c0/0x620
[ 1414.868477][T10284] ? rtnl_fill_prop_list+0x5c0/0x620
[ 1414.868479][T10284] rtnl_fill_prop_list+0x5c0/0x620
[ 1414.868480][T10284] ? __asan_memcpy+0x3c/0x60
[ 1414.868482][T10284] rtnl_fill_ifinfo.isra.0+0x3d6/0x2c90
[ 1414.868484][T10284] ? rcu_read_lock_any_held+0x3c/0x90
[ 1414.868487][T10284] ? validate_chain+0x38b/0xc20
[ 1414.868490][T10284] ? rtnl_fill_vf+0x460/0x460
[ 1414.868491][T10284] ? lockdep_hardirqs_on_prepare.part.0+0x9a/0x160
[ 1414.868493][T10284] ? lockdep_hardirqs_on+0x8c/0x130
[ 1414.868496][T10284] ? __lock_acquire+0x508/0xc10
[ 1414.868498][T10284] ? lock_acquire.part.0+0xbc/0x260
[ 1414.868499][T10284] ? find_held_lock+0x2b/0x80
[ 1414.868502][T10284] ? __lock_release.isra.0+0x6b/0x1a0
[ 1414.868504][T10284] ? mark_held_locks+0x40/0x70
[ 1414.868505][T10284] ? lockdep_hardirqs_on_prepare.part.0+0x9a/0x160
[ 1414.868507][T10284] ? lockdep_hardirqs_on+0x8c/0x130
[ 1414.868508][T10284] ? _raw_spin_unlock_irqrestore+0x53/0x80
[ 1414.868510][T10284] rtnl_getlink+0xa48/0xe50
[ 1414.868513][T10284] ? find_held_lock+0x2b/0x80
[ 1414.868515][T10284] ? rtnl_dump_ifinfo+0xfb0/0xfb0
[ 1414.868516][T10284] ? mark_usage+0x61/0x170
[ 1414.868517][T10284] ? __lock_release.isra.0+0x6b/0x1a0
[ 1414.868518][T10284] ? __lock_acquire+0x508/0xc10
[ 1414.868525][T10284] ? lock_acquire.part.0+0xbc/0x260
[ 1414.868526][T10284] ? find_held_lock+0x2b/0x80
[ 1414.868529][T10284] ? mark_usage+0x61/0x170
[ 1414.868530][T10284] ? __lock_release.isra.0+0x6b/0x1a0
[ 1414.868531][T10284] ? __lock_acquire+0x508/0xc10
[ 1414.868532][T10284] ? bpf_address_lookup+0x232/0x290
[ 1414.868536][T10284] ? lock_acquire.part.0+0xbc/0x260
[ 1414.868537][T10284] ? find_held_lock+0x2b/0x80
[ 1414.868539][T10284] ? rtnl_dump_ifinfo+0xfb0/0xfb0
[ 1414.868540][T10284] ? __lock_release.isra.0+0x6b/0x1a0
[ 1414.868542][T10284] ? rtnl_dump_ifinfo+0xfb0/0xfb0
[ 1414.868543][T10284] rtnetlink_rcv_msg+0x6fd/0xbd0
[ 1414.868545][T10284] ? validate_chain+0x38b/0xc20
[ 1414.868546][T10284] ? rtnl_link_fill+0x920/0x920
[ 1414.868547][T10284] ? __lock_acquire+0x508/0xc10
[ 1414.868549][T10284] ? lock_acquire.part.0+0xbc/0x260
[ 1414.868551][T10284] ? find_held_lock+0x2b/0x80
[ 1414.868553][T10284] netlink_rcv_skb+0x14e/0x3a0
[ 1414.868556][T10284] ? rtnl_link_fill+0x920/0x920
[ 1414.868558][T10284] ? netlink_ack+0xce0/0xce0
[ 1414.868560][T10284] ? netlink_deliver_tap+0xc5/0x330
[ 1414.868562][T10284] ? netlink_deliver_tap+0x13c/0x330
[ 1414.868564][T10284] netlink_unicast+0x47c/0x740
[ 1414.868566][T10284] ? netlink_attachskb+0x800/0x800
[ 1414.868568][T10284] ? __lock_acquire+0x508/0xc10
[ 1414.868570][T10284] netlink_sendmsg+0x735/0xc60
[ 1414.868572][T10284] ? netlink_unicast+0x740/0x740
[ 1414.868574][T10284] ? __might_fault+0x97/0x140
[ 1414.868577][T10284] ? __might_fault+0x97/0x140
[ 1414.868579][T10284] __sys_sendto+0x2c9/0x400
[ 1414.868582][T10284] ? __ia32_sys_getpeername+0xd0/0xd0
[ 1414.868586][T10284] ? fput_close_sync+0xde/0x1b0
[ 1414.868589][T10284] ? alloc_file_clone+0xe0/0xe0
[ 1414.868591][T10284] __x64_sys_sendto+0xe4/0x1f0
[ 1414.868593][T10284] ? trace_irq_enable.constprop.0+0x9b/0x180
[ 1414.868596][T10284] ? lockdep_hardirqs_on+0x8c/0x130
[ 1414.868597][T10284] ? do_syscall_64+0x82/0xfc0
[ 1414.868599][T10284] do_syscall_64+0x117/0xfc0
[ 1414.868600][T10284] ? trace_hardirqs_off+0xd/0x30
[ 1414.868602][T10284] ? exc_page_fault+0xee/0x100
[ 1414.868604][T10284] entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 1414.868606][T10284] RIP: 0033:0x7fcd4191e08e
[ 1414.868609][T10284] Code: 4d 89 d8 e8 94 bd 00 00 4c 8b 5d f8 41 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 11 c9 c3 0f 1f 80 00 00 00 00 48 8b 45 10 0f 05 <c9> c3 83 e2 39 83 fa 08 75 e7 e8 03 ff ff ff 0f 1f 00 f3 0f 1e fa
[ 1414.868611][T10284] RSP: 002b:00007ffc4c8b41c0 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
[ 1414.868614][T10284] RAX: ffffffffffffffda RBX: 0000557d5eaaee20 RCX: 00007fcd4191e08e
[ 1414.868615][T10284] RDX: 0000000000000020 RSI: 0000557d5eaa98e0 RDI: 0000000000000012
[ 1414.868616][T10284] RBP: 00007ffc4c8b41d0 R08: 00007ffc4c8b4220 R09: 0000000000000080
[ 1414.868617][T10284] R10: 0000000000000000 R11: 0000000000000202 R12: 0000557d5ec08060
[ 1414.868618][T10284] R13: 00007ffc4c8b4304 R14: 0000000000000000 R15: 00007ffc4c8b43a0
[ 1414.868621][T10284] </TASK>
[ 1414.868621][T10284]
[ 1414.876011][T10284] Allocated by task 10304:
[ 1414.876091][T10284] kasan_save_stack+0x2f/0x50
[ 1414.876205][T10284] kasan_save_track+0x14/0x30
[ 1414.876286][T10284] __kasan_kmalloc+0x7b/0x90
[ 1414.876399][T10284] register_netdevice+0x48b/0x1bc0
[ 1414.876477][T10284] geneve_configure+0x6c3/0xcf0 [geneve]
[ 1414.876591][T10284] geneve_newlink+0x189/0x220 [geneve]
[ 1414.876669][T10284] rtnl_newlink_create+0x2da/0x8c0
[ 1414.876747][T10284] __rtnl_newlink+0x22b/0xa50
[ 1414.876858][T10284] rtnl_newlink+0x8d1/0xef0
[ 1414.876973][T10284] rtnetlink_rcv_msg+0x6fd/0xbd0
[ 1414.877049][T10284] netlink_rcv_skb+0x14e/0x3a0
[ 1414.877128][T10284] netlink_unicast+0x47c/0x740
[ 1414.877204][T10284] netlink_sendmsg+0x735/0xc60
[ 1414.877280][T10284] __sys_sendto+0x2c9/0x400
[ 1414.877392][T10284] __x64_sys_sendto+0xe4/0x1f0
[ 1414.877470][T10284] do_syscall_64+0x117/0xfc0
[ 1414.877547][T10284] entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 1414.877680][T10284]
[ 1414.877719][T10284] Freed by task 10304:
[ 1414.877818][T10284] kasan_save_stack+0x2f/0x50
[ 1414.877894][T10284] kasan_save_track+0x14/0x30
[ 1414.877968][T10284] kasan_save_free_info+0x3b/0x60
[ 1414.878084][T10284] __kasan_slab_free+0x43/0x70
[ 1414.878161][T10284] kfree+0x123/0x5a0
[ 1414.878218][T10284] unregister_netdevice_many_notify+0xf0d/0x1f20
[ 1414.878313][T10284] rtnl_dellink+0x4a0/0xae0
[ 1414.878425][T10284] rtnetlink_rcv_msg+0x6fd/0xbd0
[ 1414.878499][T10284] netlink_rcv_skb+0x14e/0x3a0
[ 1414.878577][T10284] netlink_unicast+0x47c/0x740
[ 1414.878657][T10284] netlink_sendmsg+0x735/0xc60
[ 1414.878778][T10284] __sys_sendto+0x2c9/0x400
[ 1414.878853][T10284] __x64_sys_sendto+0xe4/0x1f0
[ 1414.878928][T10284] do_syscall_64+0x117/0xfc0
[ 1414.879004][T10284] entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 1414.879098][T10284]
[ 1414.879148][T10284] The buggy address belongs to the object at ff11000001d2c140
[ 1414.879148][T10284] which belongs to the cache kmalloc-64 of size 64
[ 1414.879339][T10284] The buggy address is located 16 bytes inside of
[ 1414.879339][T10284] freed 64-byte region [ff11000001d2c140, ff11000001d2c180)
[ 1414.879521][T10284]
[ 1414.879559][T10284] The buggy address belongs to the physical page:
[ 1414.879689][T10284] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1d2c
[ 1414.879910][T10284] flags: 0x80000000000000(node=0|zone=1)
[ 1414.879992][T10284] page_type: f5(slab)
[ 1414.880053][T10284] raw: 0080000000000000 ff1100000103cac0 ffd400000023ac90 ffd4000000074c90
[ 1414.880232][T10284] raw: 0000000000000000 0000000000100010 00000000f5000000 0000000000000000
[ 1414.880365][T10284] page dumped because: kasan: bad access detected
[ 1414.880495][T10284]
[ 1414.880534][T10284] Memory state around the buggy address:
[ 1414.880609][T10284] ff11000001d2c000: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
[ 1414.880726][T10284] ff11000001d2c080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 1414.880837][T10284] >ff11000001d2c100: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
[ 1414.880948][T10284] ^
[ 1414.881042][T10284] ff11000001d2c180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 1414.881190][T10284] ff11000001d2c200: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
decoded: https://netdev-ctrl.bots.linux.dev/logs/vmksft/net-dbg/results/653382/vm-crash-thr0-0
* Re: [PATCH v2 net-next 1/2] rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list()
2026-05-19 11:43 ` [PATCH v2 net-next 1/2] rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list() Eric Dumazet
@ 2026-05-19 16:39 ` Jakub Kicinski
2026-05-19 16:53 ` Eric Dumazet
0 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2026-05-19 16:39 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, Paolo Abeni, Simon Horman, Kuniyuki Iwashima,
netdev, eric.dumazet
On Tue, 19 May 2026 11:43:54 +0000 Eric Dumazet wrote:
> Avoid corrupting a netlink message and confuse user space in the
> unlikely case rtnl_fill_prop_list was able to produce a very big
> nested element.
Should we not prevent it from happening in the first place?
IIUC, otherwise if a user adds a lot of altnames, ip link will no longer
work?
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index 6a5e9ace55a0880d7b1e4303d12dc0a8b8b7c5ed..ae0254f19178735b2805a8189e81a960a49b2858 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -1971,7 +1971,9 @@ static int rtnl_fill_prop_list(struct sk_buff *skb,
> if (ret <= 0)
> goto nest_cancel;
>
> - nla_nest_end(skb, prop_list);
> + if (nla_nest_end_safe(skb, prop_list) < 0)
> + goto nest_cancel;
> +
> return 0;
>
> nest_cancel:
* Re: [PATCH v2 net-next 1/2] rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list()
2026-05-19 16:39 ` Jakub Kicinski
@ 2026-05-19 16:53 ` Eric Dumazet
2026-05-19 22:17 ` Jakub Kicinski
0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2026-05-19 16:53 UTC (permalink / raw)
To: Jakub Kicinski
Cc: David S . Miller, Paolo Abeni, Simon Horman, Kuniyuki Iwashima,
netdev, eric.dumazet
On Tue, May 19, 2026 at 9:39 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 19 May 2026 11:43:54 +0000 Eric Dumazet wrote:
> > Avoid corrupting a netlink message and confuse user space in the
> > unlikely case rtnl_fill_prop_list was able to produce a very big
> > nested element.
>
> Should we not prevent it from happening in the first place?
> IIUC otherwise if user adds a lot of altnames ip link will no longer
> work?
We cannot prevent this unless we add a mutual exclusion.
If a reader iterates an RCU list, other threads can delete items
(before the reader's cursor) and add new ones to the end of the list.
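The effect described here can be reproduced deterministically in a single-threaded model. In the sketch below, a reader walks a singly linked list; after each step a simulated writer unlinks the node just behind the reader and appends a new node at the tail. Unlinked nodes keep their next pointer, as with RCU-style deletion. The names (node, pool, walk_with_churn) are hypothetical, and the interleaving is scripted rather than concurrent, so this only illustrates the argument and does not use any kernel API.

```c
#include <assert.h>
#include <stddef.h>

struct node {
	int id;
	struct node *next;
};

#define POOL 16
static struct node pool[POOL];

/*
 * Walk a list of 'initial' nodes. After each reader step, and while
 * 'churn' is not exhausted, the writer removes the head (which is
 * behind the reader) and appends one fresh node at the tail, so the
 * list never holds more than 'initial' nodes at any instant.
 * Returns the number of nodes the reader visited.
 */
static int walk_with_churn(int initial, int churn)
{
	struct node *head, *tail, *cur, *n;
	int used = 0, visited = 0, i;

	assert(initial >= 1 && churn >= 0 && initial + churn <= POOL);

	head = tail = &pool[used++];
	head->id = used;
	head->next = NULL;
	for (i = 1; i < initial; i++) {
		n = &pool[used++];
		n->id = used;
		n->next = NULL;
		tail->next = n;
		tail = n;
	}

	cur = head;
	while (cur) {
		visited++;
		cur = cur->next;
		if (churn > 0 && cur) {
			/* Writer: unlink the old head; its next stays intact. */
			head = head->next;
			/* Writer: append a new node at the tail. */
			n = &pool[used++];
			n->id = used;
			n->next = NULL;
			tail->next = n;
			tail = n;
			churn--;
		}
	}
	return visited;
}
```

With three initial nodes and five rounds of churn, the reader visits eight nodes even though the list never contained more than three at once. A buffer sized from a count taken before the walk would therefore be too small, which is the situation the length check in the patch is meant to catch.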
* Re: [PATCH v2 net-next 0/2] rtnetlink: RTNL avoidance in rtnl_getlink()
2026-05-19 16:37 ` [PATCH v2 net-next 0/2] rtnetlink: RTNL avoidance in rtnl_getlink() Jakub Kicinski
@ 2026-05-19 17:17 ` Eric Dumazet
0 siblings, 0 replies; 8+ messages in thread
From: Eric Dumazet @ 2026-05-19 17:17 UTC (permalink / raw)
To: Jakub Kicinski
Cc: David S . Miller, Paolo Abeni, Simon Horman, Kuniyuki Iwashima,
netdev, eric.dumazet
On Tue, May 19, 2026 at 9:37 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 19 May 2026 11:43:53 +0000 Eric Dumazet wrote:
> > Many shell scripts invoke iproute2 commands specifying a device by
> > its name.
> >
> > This series improves their performance avoiding RTNL acquisition
> > for their (repeated) name->index conversion.
>
> Hm.
>
We are probably missing this:
diff --git a/net/core/dev.c b/net/core/dev.c
index 26ac8eb9b259d489159c7ab5a2b206d425110b3b..92f17c270da988ca46f4cfbb4ca67ebecd4e7e8e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -398,7 +398,7 @@ static void netdev_name_node_alt_flush(struct net_device *dev)
struct netdev_name_node *name_node, *tmp;
list_for_each_entry_safe(name_node, tmp, &dev->name_node->list, list) {
- list_del(&name_node->list);
+ list_del_rcu(&name_node->list);
netdev_name_node_alt_free(&name_node->rcu);
}
}
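As background for the one-line change above: list_del() overwrites both link pointers of the removed entry with poison values, whereas list_del_rcu() poisons only the prev pointer and leaves next intact, so a lockless reader that is currently standing on the removed entry can still advance. The userspace model below mirrors that difference. The model_* helpers and the poison constants are placeholders chosen for illustration and are not the kernel definitions; whether this change alone resolves the KASAN report is not established in the thread.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Placeholder poison values for this model only. */
#define MODEL_POISON1 ((struct list_head *)0x100)
#define MODEL_POISON2 ((struct list_head *)0x122)

static void model_list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void model_list_add_tail(struct list_head *e, struct list_head *h)
{
	e->prev = h->prev;
	e->next = h;
	h->prev->next = e;
	h->prev = e;
}

/* Common part: make the neighbours skip over the entry. */
static void model_unlink(struct list_head *e)
{
	assert(e && e->next && e->prev);
	e->next->prev = e->prev;
	e->prev->next = e->next;
}

/* Models list_del(): both pointers of the removed entry are poisoned. */
static void model_list_del(struct list_head *e)
{
	model_unlink(e);
	e->next = MODEL_POISON1;
	e->prev = MODEL_POISON2;
}

/* Models list_del_rcu(): next is kept so a reader on 'e' can move on. */
static void model_list_del_rcu(struct list_head *e)
{
	model_unlink(e);
	e->prev = MODEL_POISON2;
}
```

After model_list_del_rcu(&a) on a list holding a and b, a.next still points at b, so a traversal that had already reached a continues normally. After model_list_del(&a), a.next holds the poison value and the same traversal would dereference an invalid pointer.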
* Re: [PATCH v2 net-next 1/2] rtnetlink: use nla_nest_end_safe() in rtnl_fill_prop_list()
2026-05-19 16:53 ` Eric Dumazet
@ 2026-05-19 22:17 ` Jakub Kicinski
0 siblings, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2026-05-19 22:17 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, Paolo Abeni, Simon Horman, Kuniyuki Iwashima,
netdev, eric.dumazet
On Tue, 19 May 2026 09:53:08 -0700 Eric Dumazet wrote:
> On Tue, May 19, 2026 at 9:39 AM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Tue, 19 May 2026 11:43:54 +0000 Eric Dumazet wrote:
> > > Avoid corrupting a netlink message and confuse user space in the
> > > unlikely case rtnl_fill_prop_list was able to produce a very big
> > > nested element.
> >
> > Should we not prevent it from happening in the first place?
> > IIUC otherwise if user adds a lot of altnames ip link will no longer
> > work?
>
> We cannot prevent this unless we add a mutual exclusion.
Today it's under rtnl_lock, AFAICT, so nop. If we want to lift
rtnl_lock from RTM_NEWLINKPROP, we can probably slide the props
under dev->lock or add a new lock?