* [syzbot] [hams?] KASAN: slab-use-after-free Read in nr_add_node
@ 2025-09-25 16:35 syzbot
2025-10-18 20:37 ` syzbot
0 siblings, 1 reply; 27+ messages in thread
From: syzbot @ 2025-09-25 16:35 UTC (permalink / raw)
To: davem, edumazet, horms, kuba, linux-hams, linux-kernel, netdev,
pabeni, syzkaller-bugs
Hello,
syzbot found the following issue on:
HEAD commit: 846bd2225ec3 Add linux-next specific files for 20250919
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=130c08e2580000
kernel config: https://syzkaller.appspot.com/x/.config?x=135377594f35b576
dashboard link: https://syzkaller.appspot.com/bug?extid=2860e75836a08b172755
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13f3fe42580000
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/c53d48022f8a/disk-846bd222.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/483534e784c8/vmlinux-846bd222.xz
kernel image: https://storage.googleapis.com/syzbot-assets/721b36eec9b3/bzImage-846bd222.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+2860e75836a08b172755@syzkaller.appspotmail.com
==================================================================
BUG: KASAN: slab-use-after-free in nr_add_node+0x251b/0x2580 net/netrom/nr_route.c:248
Read of size 2 at addr ffff888078ae2632 by task syz.0.4719/15936
CPU: 0 UID: 0 PID: 15936 Comm: syz.0.4719 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/18/2025
Call Trace:
<TASK>
dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xca/0x240 mm/kasan/report.c:482
kasan_report+0x118/0x150 mm/kasan/report.c:595
nr_add_node+0x251b/0x2580 net/netrom/nr_route.c:248
nr_rt_ioctl+0x9f1/0xb00 net/netrom/nr_route.c:651
sock_do_ioctl+0xdc/0x300 net/socket.c:1241
sock_ioctl+0x576/0x790 net/socket.c:1362
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl+0xfc/0x170 fs/ioctl.c:583
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f29de38ec29
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f29df216038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f29de5d5fa0 RCX: 00007f29de38ec29
RDX: 0000200000000000 RSI: 000000000000890b RDI: 0000000000000004
RBP: 00007f29de411e41 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f29de5d6038 R14: 00007f29de5d5fa0 R15: 00007ffedaf2ceb8
</TASK>
Allocated by task 15939:
kasan_save_stack mm/kasan/common.c:56 [inline]
kasan_save_track+0x3e/0x80 mm/kasan/common.c:77
poison_kmalloc_redzone mm/kasan/common.c:400 [inline]
__kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:417
kasan_kmalloc include/linux/kasan.h:262 [inline]
__kmalloc_cache_noprof+0x3d5/0x6f0 mm/slub.c:5723
kmalloc_noprof include/linux/slab.h:957 [inline]
nr_add_node+0x804/0x2580 net/netrom/nr_route.c:146
nr_rt_ioctl+0x9f1/0xb00 net/netrom/nr_route.c:651
sock_do_ioctl+0xdc/0x300 net/socket.c:1241
sock_ioctl+0x576/0x790 net/socket.c:1362
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl+0xfc/0x170 fs/ioctl.c:583
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Freed by task 15936:
kasan_save_stack mm/kasan/common.c:56 [inline]
kasan_save_track+0x3e/0x80 mm/kasan/common.c:77
__kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:587
kasan_save_free_info mm/kasan/kasan.h:406 [inline]
poison_slab_object mm/kasan/common.c:252 [inline]
__kasan_slab_free+0x5c/0x80 mm/kasan/common.c:284
kasan_slab_free include/linux/kasan.h:234 [inline]
slab_free_hook mm/slub.c:2507 [inline]
slab_free mm/slub.c:6557 [inline]
kfree+0x19a/0x6d0 mm/slub.c:6765
nr_add_node+0x1cbe/0x2580 net/netrom/nr_route.c:246
nr_rt_ioctl+0x9f1/0xb00 net/netrom/nr_route.c:651
sock_do_ioctl+0xdc/0x300 net/socket.c:1241
sock_ioctl+0x576/0x790 net/socket.c:1362
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl+0xfc/0x170 fs/ioctl.c:583
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The buggy address belongs to the object at ffff888078ae2600
which belongs to the cache kmalloc-64 of size 64
The buggy address is located 50 bytes inside of
freed 64-byte region [ffff888078ae2600, ffff888078ae2640)
The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x78ae2
flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
page_type: f5(slab)
raw: 00fff00000000000 ffff88801b0418c0 ffffea0000c51900 dead000000000004
raw: 0000000000000000 0000000000200020 00000000f5000000 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x52c40(GFP_NOFS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 5233, tgid 5233 (udevd), ts 92365645188, free_ts 92323224662
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x240/0x2a0 mm/page_alloc.c:1850
prep_new_page mm/page_alloc.c:1858 [inline]
get_page_from_freelist+0x21e4/0x22c0 mm/page_alloc.c:3869
__alloc_frozen_pages_noprof+0x181/0x370 mm/page_alloc.c:5159
alloc_pages_mpol+0x232/0x4a0 mm/mempolicy.c:2416
alloc_slab_page mm/slub.c:3023 [inline]
allocate_slab+0x96/0x3a0 mm/slub.c:3196
new_slab mm/slub.c:3250 [inline]
___slab_alloc+0xe94/0x1920 mm/slub.c:4626
__slab_alloc+0x65/0x100 mm/slub.c:4745
__slab_alloc_node mm/slub.c:4821 [inline]
slab_alloc_node mm/slub.c:5232 [inline]
__do_kmalloc_node mm/slub.c:5601 [inline]
__kmalloc_noprof+0x471/0x7f0 mm/slub.c:5614
kmalloc_noprof include/linux/slab.h:961 [inline]
kzalloc_noprof include/linux/slab.h:1094 [inline]
tomoyo_encode2 security/tomoyo/realpath.c:45 [inline]
tomoyo_encode+0x28b/0x550 security/tomoyo/realpath.c:80
tomoyo_realpath_from_path+0x58d/0x5d0 security/tomoyo/realpath.c:283
tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
tomoyo_path_perm+0x213/0x4b0 security/tomoyo/file.c:822
security_inode_getattr+0x12f/0x330 security/security.c:2416
vfs_getattr fs/stat.c:259 [inline]
vfs_fstat fs/stat.c:281 [inline]
__do_sys_newfstat fs/stat.c:555 [inline]
__se_sys_newfstat fs/stat.c:550 [inline]
__x64_sys_newfstat+0xfc/0x200 fs/stat.c:550
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
page last free pid 5528 tgid 5528 stack trace:
reset_page_owner include/linux/page_owner.h:25 [inline]
free_pages_prepare mm/page_alloc.c:1394 [inline]
__free_frozen_pages+0xbc4/0xd30 mm/page_alloc.c:2906
rcu_do_batch kernel/rcu/tree.c:2605 [inline]
rcu_core+0xcab/0x1770 kernel/rcu/tree.c:2861
handle_softirqs+0x286/0x870 kernel/softirq.c:622
__do_softirq kernel/softirq.c:656 [inline]
invoke_softirq kernel/softirq.c:496 [inline]
__irq_exit_rcu+0xca/0x1f0 kernel/softirq.c:723
irq_exit_rcu+0x9/0x30 kernel/softirq.c:739
instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1052 [inline]
sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1052
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:697
Memory state around the buggy address:
ffff888078ae2500: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
ffff888078ae2580: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>ffff888078ae2600: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
^
ffff888078ae2680: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
ffff888078ae2700: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
==================================================================
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [syzbot] [hams?] KASAN: slab-use-after-free Read in nr_add_node
2025-09-25 16:35 [syzbot] [hams?] KASAN: slab-use-after-free Read in nr_add_node syzbot
@ 2025-10-18 20:37 ` syzbot
2025-10-19 5:02 ` Brahmajit Das
` (2 more replies)
0 siblings, 3 replies; 27+ messages in thread
From: syzbot @ 2025-10-18 20:37 UTC (permalink / raw)
To: davem, edumazet, horms, kuba, linux-hams, linux-kernel, netdev,
pabeni, syzkaller-bugs
syzbot has found a reproducer for the following issue on:
HEAD commit: f406055cb18c Merge tag 'arm64-fixes' of git://git.kernel.o..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11cf767c580000
kernel config: https://syzkaller.appspot.com/x/.config?x=f3e7b5a3627a90dd
dashboard link: https://syzkaller.appspot.com/bug?extid=2860e75836a08b172755
compiler: gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=155f1de2580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12b6b52f980000
Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-f406055c.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/a4db2a99bfb1/vmlinux-f406055c.xz
kernel image: https://storage.googleapis.com/syzbot-assets/91d1ca420bac/bzImage-f406055c.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+2860e75836a08b172755@syzkaller.appspotmail.com
==================================================================
BUG: KASAN: slab-use-after-free in nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
Read of size 4 at addr ffff888054b8cc30 by task syz.3.7839/22393
CPU: 3 UID: 0 PID: 22393 Comm: syz.3.7839 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xcd/0x630 mm/kasan/report.c:482
kasan_report+0xe0/0x110 mm/kasan/report.c:595
nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
nr_rt_ioctl+0x11b7/0x29b0 net/netrom/nr_route.c:651
nr_ioctl+0x19a/0x2d0 net/netrom/af_netrom.c:1254
sock_do_ioctl+0x118/0x280 net/socket.c:1254
sock_ioctl+0x227/0x6b0 net/socket.c:1375
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl fs/ioctl.c:583 [inline]
__x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcd/0xfa0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f7f4fb8efc9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f7f509c5038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f7f4fde5fa0 RCX: 00007f7f4fb8efc9
RDX: 0000200000000280 RSI: 000000000000890b RDI: 0000000000000004
RBP: 00007f7f4fc11f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f7f4fde6038 R14: 00007f7f4fde5fa0 R15: 00007ffd1bd56f18
</TASK>
Allocated by task 22380:
kasan_save_stack+0x33/0x60 mm/kasan/common.c:56
kasan_save_track+0x14/0x30 mm/kasan/common.c:77
poison_kmalloc_redzone mm/kasan/common.c:400 [inline]
__kasan_kmalloc+0xaa/0xb0 mm/kasan/common.c:417
kmalloc_noprof include/linux/slab.h:957 [inline]
nr_add_node+0xe4e/0x2c00 net/netrom/nr_route.c:146
nr_rt_ioctl+0x11b7/0x29b0 net/netrom/nr_route.c:651
nr_ioctl+0x19a/0x2d0 net/netrom/af_netrom.c:1254
sock_do_ioctl+0x118/0x280 net/socket.c:1254
sock_ioctl+0x227/0x6b0 net/socket.c:1375
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl fs/ioctl.c:583 [inline]
__x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcd/0xfa0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Freed by task 22393:
kasan_save_stack+0x33/0x60 mm/kasan/common.c:56
kasan_save_track+0x14/0x30 mm/kasan/common.c:77
__kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:587
kasan_save_free_info mm/kasan/kasan.h:406 [inline]
poison_slab_object mm/kasan/common.c:252 [inline]
__kasan_slab_free+0x5f/0x80 mm/kasan/common.c:284
kasan_slab_free include/linux/kasan.h:234 [inline]
slab_free_hook mm/slub.c:2523 [inline]
slab_free mm/slub.c:6611 [inline]
kfree+0x2b8/0x6d0 mm/slub.c:6818
nr_neigh_put include/net/netrom.h:143 [inline]
nr_neigh_put include/net/netrom.h:137 [inline]
nr_add_node+0x23b9/0x2c00 net/netrom/nr_route.c:246
nr_rt_ioctl+0x11b7/0x29b0 net/netrom/nr_route.c:651
nr_ioctl+0x19a/0x2d0 net/netrom/af_netrom.c:1254
sock_do_ioctl+0x118/0x280 net/socket.c:1254
sock_ioctl+0x227/0x6b0 net/socket.c:1375
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl fs/ioctl.c:583 [inline]
__x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcd/0xfa0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The buggy address belongs to the object at ffff888054b8cc00
which belongs to the cache kmalloc-64 of size 64
The buggy address is located 48 bytes inside of
freed 64-byte region [ffff888054b8cc00, ffff888054b8cc40)
The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x54b8c
flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
page_type: f5(slab)
raw: 00fff00000000000 ffff88801b4428c0 dead000000000100 dead000000000122
raw: 0000000000000000 0000000000200020 00000000f5000000 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x52cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 6074, tgid 6074 (syz-executor), ts 126099390276, free_ts 126098882568
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x1c0/0x230 mm/page_alloc.c:1850
prep_new_page mm/page_alloc.c:1858 [inline]
get_page_from_freelist+0x10a3/0x3a30 mm/page_alloc.c:3884
__alloc_frozen_pages_noprof+0x25f/0x2470 mm/page_alloc.c:5183
alloc_pages_mpol+0x1fb/0x550 mm/mempolicy.c:2416
alloc_slab_page mm/slub.c:3039 [inline]
allocate_slab mm/slub.c:3212 [inline]
new_slab+0x24a/0x360 mm/slub.c:3266
___slab_alloc+0xdc4/0x1ae0 mm/slub.c:4636
__slab_alloc.constprop.0+0x63/0x110 mm/slub.c:4755
__slab_alloc_node mm/slub.c:4831 [inline]
slab_alloc_node mm/slub.c:5253 [inline]
__kmalloc_cache_noprof+0x477/0x780 mm/slub.c:5743
kmalloc_noprof include/linux/slab.h:957 [inline]
__team_option_inst_add+0xbf/0x330 drivers/net/team/team_core.c:160
__team_option_inst_add_option drivers/net/team/team_core.c:182 [inline]
__team_options_register+0x355/0x720 drivers/net/team/team_core.c:276
team_options_register drivers/net/team/team_core.c:341 [inline]
team_init+0x5c8/0xce0 drivers/net/team/team_core.c:1658
register_netdevice+0x653/0x2270 net/core/dev.c:11221
team_newlink+0xb4/0x190 drivers/net/team/team_core.c:2213
rtnl_newlink_create net/core/rtnetlink.c:3833 [inline]
__rtnl_newlink net/core/rtnetlink.c:3950 [inline]
rtnl_newlink+0xc45/0x2000 net/core/rtnetlink.c:4065
rtnetlink_rcv_msg+0x95e/0xe90 net/core/rtnetlink.c:6954
netlink_rcv_skb+0x158/0x420 net/netlink/af_netlink.c:2552
page last free pid 6074 tgid 6074 stack trace:
reset_page_owner include/linux/page_owner.h:25 [inline]
free_pages_prepare mm/page_alloc.c:1394 [inline]
__free_frozen_pages+0x7df/0x1160 mm/page_alloc.c:2906
selinux_genfs_get_sid security/selinux/hooks.c:1357 [inline]
inode_doinit_with_dentry+0xacb/0x12e0 security/selinux/hooks.c:1555
selinux_d_instantiate+0x26/0x30 security/selinux/hooks.c:6523
security_d_instantiate+0x142/0x1a0 security/security.c:4148
d_instantiate+0x5c/0x90 fs/dcache.c:1961
__debugfs_create_file+0x286/0x6b0 fs/debugfs/inode.c:459
debugfs_create_file_full+0x41/0x60 fs/debugfs/inode.c:469
ref_tracker_dir_debugfs+0x19d/0x290 lib/ref_tracker.c:441
ref_tracker_dir_init include/linux/ref_tracker.h:70 [inline]
alloc_netdev_mqs+0x314/0x1550 net/core/dev.c:11907
rtnl_create_link+0xc08/0xf90 net/core/rtnetlink.c:3641
rtnl_newlink_create net/core/rtnetlink.c:3823 [inline]
__rtnl_newlink net/core/rtnetlink.c:3950 [inline]
rtnl_newlink+0xb69/0x2000 net/core/rtnetlink.c:4065
rtnetlink_rcv_msg+0x95e/0xe90 net/core/rtnetlink.c:6954
netlink_rcv_skb+0x158/0x420 net/netlink/af_netlink.c:2552
netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline]
netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1346
netlink_sendmsg+0x8c8/0xdd0 net/netlink/af_netlink.c:1896
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg net/socket.c:742 [inline]
__sys_sendto+0x4a3/0x520 net/socket.c:2244
Memory state around the buggy address:
ffff888054b8cb00: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
ffff888054b8cb80: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>ffff888054b8cc00: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
^
ffff888054b8cc80: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
ffff888054b8cd00: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
==================================================================
---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [syzbot] [hams?] KASAN: slab-use-after-free Read in nr_add_node
2025-10-18 20:37 ` syzbot
@ 2025-10-19 5:02 ` Brahmajit Das
2025-10-19 5:21 ` syzbot
2025-10-19 5:10 ` Brahmajit Das
2025-10-20 8:13 ` [PATCH] netrom: Prevent race conditions between multiple add route Lizhi Xu
2 siblings, 1 reply; 27+ messages in thread
From: Brahmajit Das @ 2025-10-19 5:02 UTC (permalink / raw)
To: syzbot
Cc: davem, edumazet, horms, kuba, linux-hams, linux-kernel, netdev,
pabeni, syzkaller-bugs
On 18.10.2025 13:37, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: f406055cb18c Merge tag 'arm64-fixes' of git://git.kernel.o..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=11cf767c580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=f3e7b5a3627a90dd
> dashboard link: https://syzkaller.appspot.com/bug?extid=2860e75836a08b172755
> compiler: gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=155f1de2580000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12b6b52f980000
>
> Downloadable assets:
> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-f406055c.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/a4db2a99bfb1/vmlinux-f406055c.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/91d1ca420bac/bzImage-f406055c.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+2860e75836a08b172755@syzkaller.appspotmail.com
>
> ==================================================================
> BUG: KASAN: slab-use-after-free in nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
> Read of size 4 at addr ffff888054b8cc30 by task syz.3.7839/22393
>
#syz test
diff --git a/net/netrom/nr_route.c b/net/netrom/nr_route.c
index b94cb2ffbaf8..5fa7d9febbbb 100644
--- a/net/netrom/nr_route.c
+++ b/net/netrom/nr_route.c
@@ -245,7 +245,9 @@ static int __must_check nr_add_node(ax25_address *nr, const char *mnemonic,
nr_node->routes[2].neighbour->count--;
nr_neigh_put(nr_node->routes[2].neighbour);
- if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked)
+ if (nr_node->routes[2].neighbour &&
+ nr_node->routes[2].neighbour->count == 0 &&
+ !nr_node->routes[2].neighbour->locked)
nr_remove_neigh(nr_node->routes[2].neighbour);
nr_node->routes[2].quality = quality;
--
Regards,
listout
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [syzbot] [hams?] KASAN: slab-use-after-free Read in nr_add_node
2025-10-18 20:37 ` syzbot
2025-10-19 5:02 ` Brahmajit Das
@ 2025-10-19 5:10 ` Brahmajit Das
2025-10-19 5:10 ` syzbot
2025-10-20 8:13 ` [PATCH] netrom: Prevent race conditions between multiple add route Lizhi Xu
2 siblings, 1 reply; 27+ messages in thread
From: Brahmajit Das @ 2025-10-19 5:10 UTC (permalink / raw)
To: syzbot
Cc: davem, edumazet, horms, kuba, linux-hams, linux-kernel, netdev,
pabeni, syzkaller-bugs
On 18.10.2025 13:37, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: f406055cb18c Merge tag 'arm64-fixes' of git://git.kernel.o..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=11cf767c580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=f3e7b5a3627a90dd
> dashboard link: https://syzkaller.appspot.com/bug?extid=2860e75836a08b172755
> compiler: gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=155f1de2580000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12b6b52f980000
>
> Downloadable assets:
> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-f406055c.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/a4db2a99bfb1/vmlinux-f406055c.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/91d1ca420bac/bzImage-f406055c.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+2860e75836a08b172755@syzkaller.appspotmail.com
>
> ==================================================================
> BUG: KASAN: slab-use-after-free in nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
> Read of size 4 at addr ffff888054b8cc30 by task syz.3.7839/22393
#syz test linux-next
diff --git a/net/netrom/nr_route.c b/net/netrom/nr_route.c
index b94cb2ffbaf8..5fa7d9febbbb 100644
--- a/net/netrom/nr_route.c
+++ b/net/netrom/nr_route.c
@@ -245,7 +245,9 @@ static int __must_check nr_add_node(ax25_address *nr, const char *mnemonic,
nr_node->routes[2].neighbour->count--;
nr_neigh_put(nr_node->routes[2].neighbour);
- if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked)
+ if (nr_node->routes[2].neighbour &&
+ nr_node->routes[2].neighbour->count == 0 &&
+ !nr_node->routes[2].neighbour->locked)
nr_remove_neigh(nr_node->routes[2].neighbour);
nr_node->routes[2].quality = quality;
--
Regards,
listout
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [syzbot] [hams?] KASAN: slab-use-after-free Read in nr_add_node
2025-10-19 5:10 ` Brahmajit Das
@ 2025-10-19 5:10 ` syzbot
0 siblings, 0 replies; 27+ messages in thread
From: syzbot @ 2025-10-19 5:10 UTC (permalink / raw)
To: listout
Cc: davem, edumazet, horms, kuba, linux-hams, linux-kernel, listout,
netdev, pabeni, syzkaller-bugs
> On 18.10.2025 13:37, syzbot wrote:
>> syzbot has found a reproducer for the following issue on:
>>
>> HEAD commit: f406055cb18c Merge tag 'arm64-fixes' of git://git.kernel.o..
>> git tree: upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=11cf767c580000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=f3e7b5a3627a90dd
>> dashboard link: https://syzkaller.appspot.com/bug?extid=2860e75836a08b172755
>> compiler: gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=155f1de2580000
>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12b6b52f980000
>>
>> Downloadable assets:
>> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-f406055c.raw.xz
>> vmlinux: https://storage.googleapis.com/syzbot-assets/a4db2a99bfb1/vmlinux-f406055c.xz
>> kernel image: https://storage.googleapis.com/syzbot-assets/91d1ca420bac/bzImage-f406055c.xz
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+2860e75836a08b172755@syzkaller.appspotmail.com
>>
>> ==================================================================
>> BUG: KASAN: slab-use-after-free in nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
>> Read of size 4 at addr ffff888054b8cc30 by task syz.3.7839/22393
>
> #syz test linux-next
want either no args or 2 args (repo, branch), got 1
>
> diff --git a/net/netrom/nr_route.c b/net/netrom/nr_route.c
> index b94cb2ffbaf8..5fa7d9febbbb 100644
> --- a/net/netrom/nr_route.c
> +++ b/net/netrom/nr_route.c
> @@ -245,7 +245,9 @@ static int __must_check nr_add_node(ax25_address *nr, const char *mnemonic,
> nr_node->routes[2].neighbour->count--;
> nr_neigh_put(nr_node->routes[2].neighbour);
>
> - if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked)
> + if (nr_node->routes[2].neighbour &&
> + nr_node->routes[2].neighbour->count == 0 &&
> + !nr_node->routes[2].neighbour->locked)
> nr_remove_neigh(nr_node->routes[2].neighbour);
>
> nr_node->routes[2].quality = quality;
>
> --
> Regards,
> listout
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [syzbot] [hams?] KASAN: slab-use-after-free Read in nr_add_node
2025-10-19 5:02 ` Brahmajit Das
@ 2025-10-19 5:21 ` syzbot
0 siblings, 0 replies; 27+ messages in thread
From: syzbot @ 2025-10-19 5:21 UTC (permalink / raw)
To: davem, edumazet, horms, kuba, linux-hams, linux-kernel, listout,
netdev, pabeni, syzkaller-bugs
Hello,
syzbot has tested the proposed patch but the reproducer is still triggering an issue:
KASAN: slab-use-after-free Read in nr_add_node
==================================================================
BUG: KASAN: slab-use-after-free in nr_add_node+0x25e5/0x2c10 net/netrom/nr_route.c:249
Read of size 4 at addr ffff88805466ddb0 by task syz.1.4225/15237
CPU: 2 UID: 0 PID: 15237 Comm: syz.1.4225 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xcd/0x630 mm/kasan/report.c:482
kasan_report+0xe0/0x110 mm/kasan/report.c:595
nr_add_node+0x25e5/0x2c10 net/netrom/nr_route.c:249
nr_rt_ioctl+0x11b7/0x29b0 net/netrom/nr_route.c:653
nr_ioctl+0x19a/0x2d0 net/netrom/af_netrom.c:1254
sock_do_ioctl+0x118/0x280 net/socket.c:1254
sock_ioctl+0x227/0x6b0 net/socket.c:1375
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl fs/ioctl.c:583 [inline]
__x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcd/0xfa0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f4229d8efc9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f422ac41038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f4229fe5fa0 RCX: 00007f4229d8efc9
RDX: 0000200000000280 RSI: 000000000000890b RDI: 0000000000000004
RBP: 00007f4229e11f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f4229fe6038 R14: 00007f4229fe5fa0 R15: 00007ffe9d4efb98
</TASK>
Allocated by task 15224:
kasan_save_stack+0x33/0x60 mm/kasan/common.c:56
kasan_save_track+0x14/0x30 mm/kasan/common.c:77
poison_kmalloc_redzone mm/kasan/common.c:400 [inline]
__kasan_kmalloc+0xaa/0xb0 mm/kasan/common.c:417
kmalloc_noprof include/linux/slab.h:957 [inline]
nr_add_node+0xe4e/0x2c10 net/netrom/nr_route.c:146
nr_rt_ioctl+0x11b7/0x29b0 net/netrom/nr_route.c:653
nr_ioctl+0x19a/0x2d0 net/netrom/af_netrom.c:1254
sock_do_ioctl+0x118/0x280 net/socket.c:1254
sock_ioctl+0x227/0x6b0 net/socket.c:1375
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl fs/ioctl.c:583 [inline]
__x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcd/0xfa0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Freed by task 15237:
kasan_save_stack+0x33/0x60 mm/kasan/common.c:56
kasan_save_track+0x14/0x30 mm/kasan/common.c:77
__kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:587
kasan_save_free_info mm/kasan/kasan.h:406 [inline]
poison_slab_object mm/kasan/common.c:252 [inline]
__kasan_slab_free+0x5f/0x80 mm/kasan/common.c:284
kasan_slab_free include/linux/kasan.h:234 [inline]
slab_free_hook mm/slub.c:2530 [inline]
slab_free mm/slub.c:6619 [inline]
kfree+0x2b8/0x6d0 mm/slub.c:6826
nr_neigh_put include/net/netrom.h:143 [inline]
nr_neigh_put include/net/netrom.h:137 [inline]
nr_add_node+0x23c3/0x2c10 net/netrom/nr_route.c:246
nr_rt_ioctl+0x11b7/0x29b0 net/netrom/nr_route.c:653
nr_ioctl+0x19a/0x2d0 net/netrom/af_netrom.c:1254
sock_do_ioctl+0x118/0x280 net/socket.c:1254
sock_ioctl+0x227/0x6b0 net/socket.c:1375
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl fs/ioctl.c:583 [inline]
__x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcd/0xfa0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The buggy address belongs to the object at ffff88805466dd80
which belongs to the cache kmalloc-64 of size 64
The buggy address is located 48 bytes inside of
freed 64-byte region [ffff88805466dd80, ffff88805466ddc0)
The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x5466d
flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
page_type: f5(slab)
raw: 00fff00000000000 ffff88801b4428c0 ffffea0000867b00 dead000000000004
raw: 0000000000000000 0000000000200020 00000000f5000000 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x52c40(GFP_NOFS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 5345, tgid 5345 (udevd), ts 58557594470, free_ts 0
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x1c0/0x230 mm/page_alloc.c:1850
prep_new_page mm/page_alloc.c:1858 [inline]
get_page_from_freelist+0x10a3/0x3a30 mm/page_alloc.c:3884
__alloc_frozen_pages_noprof+0x25f/0x2470 mm/page_alloc.c:5183
alloc_pages_mpol+0x1fb/0x550 mm/mempolicy.c:2416
alloc_slab_page mm/slub.c:3046 [inline]
allocate_slab mm/slub.c:3219 [inline]
new_slab+0x24a/0x360 mm/slub.c:3273
___slab_alloc+0xdc4/0x1ae0 mm/slub.c:4643
__slab_alloc.constprop.0+0x63/0x110 mm/slub.c:4762
__slab_alloc_node mm/slub.c:4838 [inline]
slab_alloc_node mm/slub.c:5260 [inline]
__do_kmalloc_node mm/slub.c:5633 [inline]
__kmalloc_noprof+0x501/0x880 mm/slub.c:5646
kmalloc_noprof include/linux/slab.h:961 [inline]
kzalloc_noprof include/linux/slab.h:1094 [inline]
tomoyo_encode2+0x100/0x3e0 security/tomoyo/realpath.c:45
tomoyo_encode+0x29/0x50 security/tomoyo/realpath.c:80
tomoyo_realpath_from_path+0x18f/0x6e0 security/tomoyo/realpath.c:283
tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
tomoyo_check_open_permission+0x2ab/0x3c0 security/tomoyo/file.c:771
tomoyo_file_open+0x6b/0x90 security/tomoyo/tomoyo.c:334
security_file_open+0x84/0x1e0 security/security.c:3183
do_dentry_open+0x596/0x1530 fs/open.c:942
vfs_open+0x82/0x3f0 fs/open.c:1097
page_owner free stack trace missing
Memory state around the buggy address:
ffff88805466dc80: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
ffff88805466dd00: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
>ffff88805466dd80: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
^
ffff88805466de00: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
ffff88805466de80: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
==================================================================
Tested on:
commit: 1c64efcb Merge tag 'rust-rustfmt' of git://git.kernel...
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=12a8bde2580000
kernel config: https://syzkaller.appspot.com/x/.config?x=f3e7b5a3627a90dd
dashboard link: https://syzkaller.appspot.com/bug?extid=2860e75836a08b172755
compiler: gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=13e20de2580000
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH] netrom: Prevent race conditions between multiple add route
2025-10-18 20:37 ` syzbot
2025-10-19 5:02 ` Brahmajit Das
2025-10-19 5:10 ` Brahmajit Das
@ 2025-10-20 8:13 ` Lizhi Xu
2025-10-20 10:10 ` Dan Carpenter
2 siblings, 1 reply; 27+ messages in thread
From: Lizhi Xu @ 2025-10-20 8:13 UTC (permalink / raw)
To: syzbot+2860e75836a08b172755
Cc: davem, edumazet, horms, kuba, linux-hams, linux-kernel, netdev,
pabeni, syzkaller-bugs
The root cause of the problem is that multiple different tasks initiate
NETROM_NODE commands to add new routes, there is no lock between them to
protect the same nr_neigh.
Task0 may add the nr_neigh.refcount value of 1 on Task1 to routes[2].
When Task3 executes nr_neigh_put(nr_node->routes[2].neighbour), it will
release the neighbour because its refcount value is 1.
In this case, the following situation causes a UAF:
Task0 Task1
===== =====
nr_add_node()
nr_neigh_get_dev() nr_add_node()
nr_node->routes[2].neighbour->count--
nr_neigh_put(nr_node->routes[2].neighbour);
nr_remove_neigh(nr_node->routes[2].neighbour)
nr_node->routes[2].neighbour = nr_neigh
nr_neigh_hold(nr_neigh);
The solution to the problem is to use a lock to synchronize each add a route
to node.
syzbot reported:
BUG: KASAN: slab-use-after-free in nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
Read of size 4 at addr ffff888051e6e9b0 by task syz.1.2539/8741
Call Trace:
<TASK>
nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
Reported-by: syzbot+2860e75836a08b172755@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=2860e75836a08b172755
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
---
net/netrom/nr_route.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/netrom/nr_route.c b/net/netrom/nr_route.c
index b94cb2ffbaf8..ae1e5ee1f52f 100644
--- a/net/netrom/nr_route.c
+++ b/net/netrom/nr_route.c
@@ -102,7 +102,9 @@ static int __must_check nr_add_node(ax25_address *nr, const char *mnemonic,
struct nr_neigh *nr_neigh;
int i, found;
struct net_device *odev;
+ static DEFINE_MUTEX(add_node_lock);
+ guard(mutex)(&add_node_lock);
if ((odev=nr_dev_get(nr)) != NULL) { /* Can't add routes to ourself */
dev_put(odev);
return -EINVAL;
--
2.43.0
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH] netrom: Prevent race conditions between multiple add route
2025-10-20 8:13 ` [PATCH] netrom: Prevent race conditions between multiple add route Lizhi Xu
@ 2025-10-20 10:10 ` Dan Carpenter
2025-10-20 11:02 ` [PATCH V2] " Lizhi Xu
0 siblings, 1 reply; 27+ messages in thread
From: Dan Carpenter @ 2025-10-20 10:10 UTC (permalink / raw)
To: Lizhi Xu
Cc: syzbot+2860e75836a08b172755, davem, edumazet, horms, kuba,
linux-hams, linux-kernel, netdev, pabeni, syzkaller-bugs
On Mon, Oct 20, 2025 at 04:13:59PM +0800, Lizhi Xu wrote:
> The root cause of the problem is that multiple different tasks initiate
> NETROM_NODE commands to add new routes, there is no lock between them to
> protect the same nr_neigh.
> Task0 may add the nr_neigh.refcount value of 1 on Task1 to routes[2].
> When Task3 executes nr_neigh_put(nr_node->routes[2].neighbour), it will
s/Task3/Task1/
> release the neighbour because its refcount value is 1.
>
The refcount would be 2 and then drop to zero. Both nr_neigh_put() and
nr_remove_neigh() drop the refcount.
> In this case, the following situation causes a UAF:
>
> Task0 Task1
> ===== =====
> nr_add_node()
> nr_neigh_get_dev() nr_add_node()
> nr_node->routes[2].neighbour->count--
Does this line really matter in terms of the use after free?
> nr_neigh_put(nr_node->routes[2].neighbour);
> nr_remove_neigh(nr_node->routes[2].neighbour)
> nr_node->routes[2].neighbour = nr_neigh
> nr_neigh_hold(nr_neigh);
This chart is confusing. It says that that the nr_neigh_hold() is the use
after free. But we called nr_remove_neigh(nr_node->routes[2].neighbour)
before we assigned nr_node->routes[2].neighbour = nr_neigh...
The sysbot report says that the free happens on:
r_neigh_put(nr_node->routes[2].neighbour);
and the use after free happens on the next line:
if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked)
Which does suggest that somewhere the refcount is 1 when it should be
at least 2... It could be that two threads call nr_neigh_put() at
basically the same time, but that doesn't make sense either because
we're holding the nr_node_lock(nr_node)...
regards,
dan carpenter
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH V2] netrom: Prevent race conditions between multiple add route
2025-10-20 10:10 ` Dan Carpenter
@ 2025-10-20 11:02 ` Lizhi Xu
2025-10-20 12:25 ` Dan Carpenter
0 siblings, 1 reply; 27+ messages in thread
From: Lizhi Xu @ 2025-10-20 11:02 UTC (permalink / raw)
To: dan.carpenter
Cc: davem, edumazet, horms, kuba, linux-hams, linux-kernel, lizhi.xu,
netdev, pabeni, syzbot+2860e75836a08b172755, syzkaller-bugs
The root cause of the problem is that multiple different tasks initiate
NETROM_NODE commands to add new routes, there is no lock between them to
protect the same nr_neigh.
Task0 may add the nr_neigh.refcount value of 1 on Task1 to routes[2].
When Task2 executes nr_neigh_put(nr_node->routes[2].neighbour), it will
release the neighbour because its refcount value is 1.
In this case, the following situation causes a UAF:
Task0 Task1 Task2
===== ===== =====
nr_add_node()
nr_neigh_get_dev() nr_add_node()
nr_node_lock()
nr_node->routes[2].neighbour->count--
nr_neigh_put(nr_node->routes[2].neighbour);
nr_remove_neigh(nr_node->routes[2].neighbour)
nr_node_unlock()
nr_node_lock()
nr_node->routes[2].neighbour = nr_neigh
nr_neigh_hold(nr_neigh); nr_add_node()
nr_neigh_put()
The solution to the problem is to use a lock to synchronize each add a route
to node.
syzbot reported:
BUG: KASAN: slab-use-after-free in nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
Read of size 4 at addr ffff888051e6e9b0 by task syz.1.2539/8741
Call Trace:
<TASK>
nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
Reported-by: syzbot+2860e75836a08b172755@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=2860e75836a08b172755
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
---
V1 -> V2: update comments for cause uaf
net/netrom/nr_route.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/netrom/nr_route.c b/net/netrom/nr_route.c
index b94cb2ffbaf8..ae1e5ee1f52f 100644
--- a/net/netrom/nr_route.c
+++ b/net/netrom/nr_route.c
@@ -102,7 +102,9 @@ static int __must_check nr_add_node(ax25_address *nr, const char *mnemonic,
struct nr_neigh *nr_neigh;
int i, found;
struct net_device *odev;
+ static DEFINE_MUTEX(add_node_lock);
+ guard(mutex)(&add_node_lock);
if ((odev=nr_dev_get(nr)) != NULL) { /* Can't add routes to ourself */
dev_put(odev);
return -EINVAL;
--
2.43.0
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH V2] netrom: Prevent race conditions between multiple add route
2025-10-20 11:02 ` [PATCH V2] " Lizhi Xu
@ 2025-10-20 12:25 ` Dan Carpenter
2025-10-20 12:33 ` Dan Carpenter
` (2 more replies)
0 siblings, 3 replies; 27+ messages in thread
From: Dan Carpenter @ 2025-10-20 12:25 UTC (permalink / raw)
To: Lizhi Xu
Cc: davem, edumazet, horms, kuba, linux-hams, linux-kernel, netdev,
pabeni, syzbot+2860e75836a08b172755, syzkaller-bugs
On Mon, Oct 20, 2025 at 07:02:44PM +0800, Lizhi Xu wrote:
> The root cause of the problem is that multiple different tasks initiate
> NETROM_NODE commands to add new routes, there is no lock between them to
> protect the same nr_neigh.
> Task0 may add the nr_neigh.refcount value of 1 on Task1 to routes[2].
>
> When Task2 executes nr_neigh_put(nr_node->routes[2].neighbour), it will
> release the neighbour because its refcount value is 1.
>
> In this case, the following situation causes a UAF:
>
> Task0 Task1 Task2
> ===== ===== =====
> nr_add_node()
> nr_neigh_get_dev() nr_add_node()
> nr_node_lock()
> nr_node->routes[2].neighbour->count--
> nr_neigh_put(nr_node->routes[2].neighbour);
> nr_remove_neigh(nr_node->routes[2].neighbour)
> nr_node_unlock()
> nr_node_lock()
> nr_node->routes[2].neighbour = nr_neigh
> nr_neigh_hold(nr_neigh); nr_add_node()
> nr_neigh_put()
>
> The solution to the problem is to use a lock to synchronize each add a route
> to node.
This chart is still not right. Let me add line numbers to your chart:
netrom/nr_route.c
214 nr_node_lock(nr_node);
215
216 if (quality != 0)
217 strscpy(nr_node->mnemonic, mnemonic);
218
219 for (found = 0, i = 0; i < nr_node->count; i++) {
220 if (nr_node->routes[i].neighbour == nr_neigh) {
221 nr_node->routes[i].quality = quality;
222 nr_node->routes[i].obs_count = obs_count;
223 found = 1;
224 break;
225 }
226 }
227
228 if (!found) {
229 /* We have space at the bottom, slot it in */
230 if (nr_node->count < 3) {
231 nr_node->routes[2] = nr_node->routes[1];
232 nr_node->routes[1] = nr_node->routes[0];
233
234 nr_node->routes[0].quality = quality;
235 nr_node->routes[0].obs_count = obs_count;
236 nr_node->routes[0].neighbour = nr_neigh;
237
238 nr_node->which++;
239 nr_node->count++;
240 nr_neigh_hold(nr_neigh);
241 nr_neigh->count++;
242 } else {
243 /* It must be better than the worst */
244 if (quality > nr_node->routes[2].quality) {
245 nr_node->routes[2].neighbour->count--;
246 nr_neigh_put(nr_node->routes[2].neighbour);
247
248 if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked)
249 nr_remove_neigh(nr_node->routes[2].neighbour);
250
251 nr_node->routes[2].quality = quality;
252 nr_node->routes[2].obs_count = obs_count;
253 nr_node->routes[2].neighbour = nr_neigh;
254
255 nr_neigh_hold(nr_neigh);
256 nr_neigh->count++;
257 }
258 }
259 }
Task0 Task1 Task2
===== ===== =====
[97] nr_add_node()
[113] nr_neigh_get_dev() [97] nr_add_node()
[214] nr_node_lock()
[245] nr_node->routes[2].neighbour->count--
[246] nr_neigh_put(nr_node->routes[2].neighbour);
[248] nr_remove_neigh(nr_node->routes[2].neighbour)
[283] nr_node_unlock()
[214] nr_node_lock()
[253] nr_node->routes[2].neighbour = nr_neigh
[254] nr_neigh_hold(nr_neigh); [97] nr_add_node()
[XXX] nr_neigh_put()
^^^^^^^^^^^^^^^^^^^^
These charts are supposed to be chronological so [XXX] is wrong because the
use after free happens on line [248]. Do we really need three threads to
make this race work?
>
> syzbot reported:
> BUG: KASAN: slab-use-after-free in nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
^^^
> Read of size 4 at addr ffff888051e6e9b0 by task syz.1.2539/8741
I'm sure you tested your patch and that it fixes the bug, but I just
wonder if it's the best possible fix?
regards,
dan carpenter
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V2] netrom: Prevent race conditions between multiple add route
2025-10-20 12:25 ` Dan Carpenter
@ 2025-10-20 12:33 ` Dan Carpenter
2025-10-20 12:57 ` Dan Carpenter
2025-10-20 13:34 ` Lizhi Xu
2 siblings, 0 replies; 27+ messages in thread
From: Dan Carpenter @ 2025-10-20 12:33 UTC (permalink / raw)
To: Lizhi Xu
Cc: davem, edumazet, horms, kuba, linux-hams, linux-kernel, netdev,
pabeni, syzbot+2860e75836a08b172755, syzkaller-bugs
On Mon, Oct 20, 2025 at 03:25:40PM +0300, Dan Carpenter wrote:
> On Mon, Oct 20, 2025 at 07:02:44PM +0800, Lizhi Xu wrote:
> > The root cause of the problem is that multiple different tasks initiate
> > NETROM_NODE commands to add new routes, there is no lock between them to
> > protect the same nr_neigh.
> > Task0 may add the nr_neigh.refcount value of 1 on Task1 to routes[2].
> >
> > When Task2 executes nr_neigh_put(nr_node->routes[2].neighbour), it will
> > release the neighbour because its refcount value is 1.
> >
> > In this case, the following situation causes a UAF:
> >
> > Task0 Task1 Task2
> > ===== ===== =====
> > nr_add_node()
> > nr_neigh_get_dev() nr_add_node()
> > nr_node_lock()
> > nr_node->routes[2].neighbour->count--
> > nr_neigh_put(nr_node->routes[2].neighbour);
> > nr_remove_neigh(nr_node->routes[2].neighbour)
> > nr_node_unlock()
> > nr_node_lock()
> > nr_node->routes[2].neighbour = nr_neigh
> > nr_neigh_hold(nr_neigh); nr_add_node()
> > nr_neigh_put()
> >
> > The solution to the problem is to use a lock to synchronize each add a route
> > to node.
>
> This chart is still not right. Let me add line numbers to your chart:
>
> netrom/nr_route.c
> 214 nr_node_lock(nr_node);
> 215
> 216 if (quality != 0)
> 217 strscpy(nr_node->mnemonic, mnemonic);
> 218
> 219 for (found = 0, i = 0; i < nr_node->count; i++) {
> 220 if (nr_node->routes[i].neighbour == nr_neigh) {
> 221 nr_node->routes[i].quality = quality;
> 222 nr_node->routes[i].obs_count = obs_count;
Should we call nr_neigh->count++ if we found it? I guess I don't
really understand what nr_neigh->count is counting... It would be
really nice if someone added some comments explaining how the ref
counting worked.
> 223 found = 1;
> 224 break;
> 225 }
> 226 }
regards,
dan carpenter
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V2] netrom: Prevent race conditions between multiple add route
2025-10-20 12:25 ` Dan Carpenter
2025-10-20 12:33 ` Dan Carpenter
@ 2025-10-20 12:57 ` Dan Carpenter
2025-10-20 13:34 ` Lizhi Xu
2 siblings, 0 replies; 27+ messages in thread
From: Dan Carpenter @ 2025-10-20 12:57 UTC (permalink / raw)
To: Lizhi Xu
Cc: davem, edumazet, horms, kuba, linux-hams, linux-kernel, netdev,
pabeni, syzbot+2860e75836a08b172755, syzkaller-bugs
On Mon, Oct 20, 2025 at 03:25:40PM +0300, Dan Carpenter wrote:
> netrom/nr_route.c
> 214 nr_node_lock(nr_node);
I guess nr_node is different for each thread?
> 215
> 216 if (quality != 0)
> 217 strscpy(nr_node->mnemonic, mnemonic);
> 218
> 219 for (found = 0, i = 0; i < nr_node->count; i++) {
> 220 if (nr_node->routes[i].neighbour == nr_neigh) {
> 221 nr_node->routes[i].quality = quality;
> 222 nr_node->routes[i].obs_count = obs_count;
> 223 found = 1;
> 224 break;
> 225 }
> 226 }
> 227
> 228 if (!found) {
> 229 /* We have space at the bottom, slot it in */
> 230 if (nr_node->count < 3) {
> 231 nr_node->routes[2] = nr_node->routes[1];
> 232 nr_node->routes[1] = nr_node->routes[0];
> 233
> 234 nr_node->routes[0].quality = quality;
> 235 nr_node->routes[0].obs_count = obs_count;
> 236 nr_node->routes[0].neighbour = nr_neigh;
> 237
> 238 nr_node->which++;
> 239 nr_node->count++;
> 240 nr_neigh_hold(nr_neigh);
> 241 nr_neigh->count++;
> 242 } else {
> 243 /* It must be better than the worst */
> 244 if (quality > nr_node->routes[2].quality) {
> 245 nr_node->routes[2].neighbour->count--;
> 246 nr_neigh_put(nr_node->routes[2].neighbour);
> 247
> 248 if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked)
> 249 nr_remove_neigh(nr_node->routes[2].neighbour);
Does neighbour->count means how many nr_node pointers have it in ->routes[]?
I wish this code had comments.
KTDOO: add comments explaining all the counters in netrom/nr_route.c
> 250
> 251 nr_node->routes[2].quality = quality;
> 252 nr_node->routes[2].obs_count = obs_count;
> 253 nr_node->routes[2].neighbour = nr_neigh;
> 254
> 255 nr_neigh_hold(nr_neigh);
> 256 nr_neigh->count++;
Could we just add some locking to this if statement only? I had thought
that nr_node_lock() serialized it but now I think maybe not? Or maybe we
could increment the counters before assigning it?
nr_neigh->count++;
nr_neigh_hold(nr_neigh);
nr_node->routes[2].neighbour = nr_neigh;
I'm not an expert in concerency. Does calling nr_neigh_hold() mean th
that the nr_neigh->count++ will happen before the assignment?
> 257 }
> 258 }
> 259 }
regards,
dan carpenter
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH V2] netrom: Prevent race conditions between multiple add route
2025-10-20 12:25 ` Dan Carpenter
2025-10-20 12:33 ` Dan Carpenter
2025-10-20 12:57 ` Dan Carpenter
@ 2025-10-20 13:34 ` Lizhi Xu
2025-10-20 13:49 ` Lizhi Xu
2 siblings, 1 reply; 27+ messages in thread
From: Lizhi Xu @ 2025-10-20 13:34 UTC (permalink / raw)
To: dan.carpenter
Cc: davem, edumazet, horms, kuba, linux-hams, linux-kernel, lizhi.xu,
netdev, pabeni, syzbot+2860e75836a08b172755, syzkaller-bugs
On Mon, 20 Oct 2025 15:25:40 +0300, Dan Carpenter wrote:
> Task0 Task1 Task2
> ===== ===== =====
> [97] nr_add_node()
> [113] nr_neigh_get_dev() [97] nr_add_node()
> [214] nr_node_lock()
> [245] nr_node->routes[2].neighbour->count--
> [246] nr_neigh_put(nr_node->routes[2].neighbour);
> [248] nr_remove_neigh(nr_node->routes[2].neighbour)
> [283] nr_node_unlock()
> [214] nr_node_lock()
> [253] nr_node->routes[2].neighbour = nr_neigh
> [254] nr_neigh_hold(nr_neigh); [97] nr_add_node()
> [XXX] nr_neigh_put()
> ^^^^^^^^^^^^^^^^^^^^
>
> These charts are supposed to be chronological so [XXX] is wrong because the
> use after free happens on line [248]. Do we really need three threads to
> make this race work?
The UAF problem occurs in Task2. Task1 sets the refcount of nr_neigh to 1,
then Task0 adds it to routes[2]. Task2 releases routes[2].neighbour after
executing [XXX]nr_neigh_put().
BR,
Lizhi
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH V2] netrom: Prevent race conditions between multiple add route
2025-10-20 13:34 ` Lizhi Xu
@ 2025-10-20 13:49 ` Lizhi Xu
2025-10-20 17:59 ` Dan Carpenter
0 siblings, 1 reply; 27+ messages in thread
From: Lizhi Xu @ 2025-10-20 13:49 UTC (permalink / raw)
To: lizhi.xu
Cc: dan.carpenter, davem, edumazet, horms, kuba, linux-hams,
linux-kernel, netdev, pabeni, syzbot+2860e75836a08b172755,
syzkaller-bugs
On Mon, 20 Oct 2025 21:34:56 +0800, Lizhi Xu wrote:
> > Task0 Task1 Task2
> > ===== ===== =====
> > [97] nr_add_node()
> > [113] nr_neigh_get_dev() [97] nr_add_node()
> > [214] nr_node_lock()
> > [245] nr_node->routes[2].neighbour->count--
> > [246] nr_neigh_put(nr_node->routes[2].neighbour);
> > [248] nr_remove_neigh(nr_node->routes[2].neighbour)
> > [283] nr_node_unlock()
> > [214] nr_node_lock()
> > [253] nr_node->routes[2].neighbour = nr_neigh
> > [254] nr_neigh_hold(nr_neigh); [97] nr_add_node()
> > [XXX] nr_neigh_put()
> > ^^^^^^^^^^^^^^^^^^^^
> >
> > These charts are supposed to be chronological so [XXX] is wrong because the
> > use after free happens on line [248]. Do we really need three threads to
> > make this race work?
> The UAF problem occurs in Task2. Task1 sets the refcount of nr_neigh to 1,
> then Task0 adds it to routes[2]. Task2 releases routes[2].neighbour after
> executing [XXX]nr_neigh_put().
Execution Order:
1 -> Task0
[113] nr_neigh_get_dev() // After execution, the refcount value is 3
2 -> Task1
[246] nr_neigh_put(nr_node->routes[2].neighbour); // After execution, the refcount value is 2
[248] nr_remove_neigh(nr_node->routes[2].neighbour) // After execution, the refcount value is 1
3 -> Task0
[253] nr_node->routes[2].neighbour = nr_neigh // nr_neigh's refcount value is 1 and add it to routes[2]
4 -> Task2
[XXX] nr_neigh_put(nr_node->routes[2].neighbour) // After execution, neighhour is freed
if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked) // Uaf occurs this line when accessing neighbour->count
BR,
Lizhi
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V2] netrom: Prevent race conditions between multiple add route
2025-10-20 13:49 ` Lizhi Xu
@ 2025-10-20 17:59 ` Dan Carpenter
2025-10-21 2:05 ` Lizhi Xu
0 siblings, 1 reply; 27+ messages in thread
From: Dan Carpenter @ 2025-10-20 17:59 UTC (permalink / raw)
To: Lizhi Xu
Cc: davem, edumazet, horms, kuba, linux-hams, linux-kernel, netdev,
pabeni, syzbot+2860e75836a08b172755, syzkaller-bugs
On Mon, Oct 20, 2025 at 09:49:12PM +0800, Lizhi Xu wrote:
> On Mon, 20 Oct 2025 21:34:56 +0800, Lizhi Xu wrote:
> > > Task0 Task1 Task2
> > > ===== ===== =====
> > > [97] nr_add_node()
> > > [113] nr_neigh_get_dev() [97] nr_add_node()
> > > [214] nr_node_lock()
> > > [245] nr_node->routes[2].neighbour->count--
> > > [246] nr_neigh_put(nr_node->routes[2].neighbour);
> > > [248] nr_remove_neigh(nr_node->routes[2].neighbour)
> > > [283] nr_node_unlock()
> > > [214] nr_node_lock()
> > > [253] nr_node->routes[2].neighbour = nr_neigh
> > > [254] nr_neigh_hold(nr_neigh); [97] nr_add_node()
> > > [XXX] nr_neigh_put()
> > > ^^^^^^^^^^^^^^^^^^^^
> > >
> > > These charts are supposed to be chronological so [XXX] is wrong because the
> > > use after free happens on line [248]. Do we really need three threads to
> > > make this race work?
> > The UAF problem occurs in Task2. Task1 sets the refcount of nr_neigh to 1,
> > then Task0 adds it to routes[2]. Task2 releases routes[2].neighbour after
> > executing [XXX]nr_neigh_put().
> Execution Order:
> 1 -> Task0
> [113] nr_neigh_get_dev() // After execution, the refcount value is 3
>
> 2 -> Task1
> [246] nr_neigh_put(nr_node->routes[2].neighbour); // After execution, the refcount value is 2
> [248] nr_remove_neigh(nr_node->routes[2].neighbour) // After execution, the refcount value is 1
>
> 3 -> Task0
> [253] nr_node->routes[2].neighbour = nr_neigh // nr_neigh's refcount value is 1 and add it to routes[2]
>
> 4 -> Task2
> [XXX] nr_neigh_put(nr_node->routes[2].neighbour) // After execution, neighhour is freed
> if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked) // Uaf occurs this line when accessing neighbour->count
Let's step back a bit and look at the bigger picture design. (Which is
completely undocumented so we're just guessing).
When we put nr_neigh into nr_node->routes[] we bump the nr_neigh_hold()
reference count and nr_neigh->count++, then when we remove it from
->routes[] we drop the reference and do nr_neigh->count--.
If it's the last reference (and we are not holding ->locked) then we
remove it from the &nr_neigh_list and drop the reference count again and
free it. So we drop the reference count twice. This is a complicated
design with three variables: nr_neigh_hold(), nr_neigh->count and
->locked. Why can it not just be one counter nr_neigh_hold(). So
instead of setting locked = true we would just take an extra reference?
The nr_neigh->count++ would be replaced with nr_neigh_hold() as well.
Because that's fundamentally the problem, right? We call
nr_neigh_get_dev() so we think we're holding a reference and we're
safe, but we don't realize that calling neighbour->count-- can
result in dropping two references.
regards,
dan carpenter
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V2] netrom: Prevent race conditions between multiple add route
2025-10-20 17:59 ` Dan Carpenter
@ 2025-10-21 2:05 ` Lizhi Xu
2025-10-21 6:36 ` Dan Carpenter
0 siblings, 1 reply; 27+ messages in thread
From: Lizhi Xu @ 2025-10-21 2:05 UTC (permalink / raw)
To: dan.carpenter
Cc: lizhi.xu, davem, edumazet, horms, kuba, linux-hams, linux-kernel,
netdev, pabeni, syzbot+2860e75836a08b172755, syzkaller-bugs
On Mon, 20 Oct 2025 20:59:24 +0300, Dan Carpenter wrote:
> On Mon, Oct 20, 2025 at 09:49:12PM +0800, Lizhi Xu wrote:
> > On Mon, 20 Oct 2025 21:34:56 +0800, Lizhi Xu wrote:
> > > > Task0 Task1 Task2
> > > > ===== ===== =====
> > > > [97] nr_add_node()
> > > > [113] nr_neigh_get_dev() [97] nr_add_node()
> > > > [214] nr_node_lock()
> > > > [245] nr_node->routes[2].neighbour->count--
> > > > [246] nr_neigh_put(nr_node->routes[2].neighbour);
> > > > [248] nr_remove_neigh(nr_node->routes[2].neighbour)
> > > > [283] nr_node_unlock()
> > > > [214] nr_node_lock()
> > > > [253] nr_node->routes[2].neighbour = nr_neigh
> > > > [254] nr_neigh_hold(nr_neigh); [97] nr_add_node()
> > > > [XXX] nr_neigh_put()
> > > > ^^^^^^^^^^^^^^^^^^^^
> > > >
> > > > These charts are supposed to be chronological so [XXX] is wrong because the
> > > > use after free happens on line [248]. Do we really need three threads to
> > > > make this race work?
> > > The UAF problem occurs in Task2. Task1 sets the refcount of nr_neigh to 1,
> > > then Task0 adds it to routes[2]. Task2 releases routes[2].neighbour after
> > > executing [XXX]nr_neigh_put().
> > Execution Order:
> > 1 -> Task0
> > [113] nr_neigh_get_dev() // After execution, the refcount value is 3
> >
> > 2 -> Task1
> > [246] nr_neigh_put(nr_node->routes[2].neighbour); // After execution, the refcount value is 2
> > [248] nr_remove_neigh(nr_node->routes[2].neighbour) // After execution, the refcount value is 1
> >
> > 3 -> Task0
> > [253] nr_node->routes[2].neighbour = nr_neigh // nr_neigh's refcount value is 1 and add it to routes[2]
> >
> > 4 -> Task2
> > [XXX] nr_neigh_put(nr_node->routes[2].neighbour) // After execution, neighhour is freed
> > if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked) // Uaf occurs this line when accessing neighbour->count
>
> Let's step back a bit and look at the bigger picture design. (Which is
> completely undocumented so we're just guessing).
>
> When we put nr_neigh into nr_node->routes[] we bump the nr_neigh_hold()
> reference count and nr_neigh->count++, then when we remove it from
> ->routes[] we drop the reference and do nr_neigh->count--.
>
> If it's the last reference (and we are not holding ->locked) then we
> remove it from the &nr_neigh_list and drop the reference count again and
> free it. So we drop the reference count twice. This is a complicated
> design with three variables: nr_neigh_hold(), nr_neigh->count and
> ->locked. Why can it not just be one counter nr_neigh_hold(). So
> instead of setting locked = true we would just take an extra reference?
> The nr_neigh->count++ would be replaced with nr_neigh_hold() as well.
locked controls whether the neighbor quality can be automatically updated;
count controls the number of different routes a neighbor is linked to;
refcount is simply used to manage the neighbor lifecycle.
>
> Because that's fundamentally the problem, right? We call
> nr_neigh_get_dev() so we think we're holding a reference and we're
> safe, but we don't realize that calling neighbour->count-- can
> result in dropping two references.
After nr_neigh_get_dev() retrieves a neighbor, there shouldn't be an
unfinished nr_add_node() call operating on the neighbor in the route.
Therefore, we need to use a lock before the nr_neigh_get_dev() operation
begins to ensure that the neighbor is added atomically to the routing table.
BR,
Lizhi
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V2] netrom: Prevent race conditions between multiple add route
2025-10-21 2:05 ` Lizhi Xu
@ 2025-10-21 6:36 ` Dan Carpenter
2025-10-21 8:34 ` Lizhi Xu
2025-10-21 8:35 ` [PATCH V3] netrom: Prevent race conditions between neighbor operations Lizhi Xu
0 siblings, 2 replies; 27+ messages in thread
From: Dan Carpenter @ 2025-10-21 6:36 UTC (permalink / raw)
To: Lizhi Xu
Cc: davem, edumazet, horms, kuba, linux-hams, linux-kernel, netdev,
pabeni, syzbot+2860e75836a08b172755, syzkaller-bugs
On Tue, Oct 21, 2025 at 10:05:33AM +0800, Lizhi Xu wrote:
> On Mon, 20 Oct 2025 20:59:24 +0300, Dan Carpenter wrote:
> > On Mon, Oct 20, 2025 at 09:49:12PM +0800, Lizhi Xu wrote:
> > > On Mon, 20 Oct 2025 21:34:56 +0800, Lizhi Xu wrote:
> > > > > Task0 Task1 Task2
> > > > > ===== ===== =====
> > > > > [97] nr_add_node()
> > > > > [113] nr_neigh_get_dev() [97] nr_add_node()
> > > > > [214] nr_node_lock()
> > > > > [245] nr_node->routes[2].neighbour->count--
> > > > > [246] nr_neigh_put(nr_node->routes[2].neighbour);
> > > > > [248] nr_remove_neigh(nr_node->routes[2].neighbour)
> > > > > [283] nr_node_unlock()
> > > > > [214] nr_node_lock()
> > > > > [253] nr_node->routes[2].neighbour = nr_neigh
> > > > > [254] nr_neigh_hold(nr_neigh); [97] nr_add_node()
> > > > > [XXX] nr_neigh_put()
> > > > > ^^^^^^^^^^^^^^^^^^^^
> > > > >
> > > > > These charts are supposed to be chronological so [XXX] is wrong because the
> > > > > use after free happens on line [248]. Do we really need three threads to
> > > > > make this race work?
> > > > The UAF problem occurs in Task2. Task1 sets the refcount of nr_neigh to 1,
> > > > then Task0 adds it to routes[2]. Task2 releases routes[2].neighbour after
> > > > executing [XXX]nr_neigh_put().
> > > Execution Order:
> > > 1 -> Task0
> > > [113] nr_neigh_get_dev() // After execution, the refcount value is 3
> > >
> > > 2 -> Task1
> > > [246] nr_neigh_put(nr_node->routes[2].neighbour); // After execution, the refcount value is 2
> > > [248] nr_remove_neigh(nr_node->routes[2].neighbour) // After execution, the refcount value is 1
> > >
> > > 3 -> Task0
> > > [253] nr_node->routes[2].neighbour = nr_neigh // nr_neigh's refcount value is 1 and add it to routes[2]
> > >
> > > 4 -> Task2
> > > [XXX] nr_neigh_put(nr_node->routes[2].neighbour) // After execution, neighhour is freed
> > > if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked) // Uaf occurs this line when accessing neighbour->count
> >
> > Let's step back a bit and look at the bigger picture design. (Which is
> > completely undocumented so we're just guessing).
> >
> > When we put nr_neigh into nr_node->routes[] we bump the nr_neigh_hold()
> > reference count and nr_neigh->count++, then when we remove it from
> > ->routes[] we drop the reference and do nr_neigh->count--.
> >
> > If it's the last reference (and we are not holding ->locked) then we
> > remove it from the &nr_neigh_list and drop the reference count again and
> > free it. So we drop the reference count twice. This is a complicated
> > design with three variables: nr_neigh_hold(), nr_neigh->count and
> > ->locked. Why can it not just be one counter nr_neigh_hold(). So
> > instead of setting locked = true we would just take an extra reference?
> > The nr_neigh->count++ would be replaced with nr_neigh_hold() as well.
> locked controls whether the neighbor quality can be automatically updated;
I'm not sure your patch fixes the bug because we could still race against
nr_del_node().
I'm not saying get rid of locked completely, I'm saying get rid of code like
this:
if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked)
nr_remove_neigh(nr_node->routes[2].neighbour);
Right now, locked serves as a special kind of reference count, because we
don't drop the reference if it's true.
> count controls the number of different routes a neighbor is linked to;
Sure, that is interesting information for the user, so keep it around to
print in the proc file, but don't use it as a reference count.
> refcount is simply used to manage the neighbor lifecycle.
The bug is caused because our reference counting is bad.
So right now what happens is we allocate nr_neigh and we put it on the
&nr_neigh_list. Then we lock it or we add it to ->routes[] and each of
those has a different reference count. Then when we drop those references
we do:
if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked)
nr_remove_neigh(nr_node->routes[2].neighbour);
This removes it from the list, and hopefully this is the last reference
and it frees it.
It would be much simpler to say, we only use nr_neigh_hold()/put() for
reference counting. When we set locked we do:
nr_neigh_hold(nr_neigh);
nr_neigh->locked = true;
Incrementing the refcount means it can't be freed.
Then when we remove nr_neigh from ->routes[] we wouldn't "remove it from
the list", instead we would just drop a reference. When we dropped the
last reference, nr_neigh_put() would remove it from the list.
My proposal would be a behavior change because right now what happens is:
1: allocate nr_neigh
2: add it to ->routes[]
3: remove it from ->routes[]
(freed automatically because we drop two references)
Now it would be:
1: allocate nr_neigh
2: add it to ->routes[]
3: remove it from ->routes[]
4: needs to be freed manually with nr_del_neigh().
regards,
dan carpenter
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V2] netrom: Prevent race conditions between multiple add route
2025-10-21 6:36 ` Dan Carpenter
@ 2025-10-21 8:34 ` Lizhi Xu
2025-10-21 8:35 ` [PATCH V3] netrom: Prevent race conditions between neighbor operations Lizhi Xu
1 sibling, 0 replies; 27+ messages in thread
From: Lizhi Xu @ 2025-10-21 8:34 UTC (permalink / raw)
To: dan.carpenter
Cc: lizhi.xu, davem, edumazet, horms, kuba, linux-hams, linux-kernel,
netdev, pabeni, syzbot+2860e75836a08b172755, syzkaller-bugs
On Tue, 21 Oct 2025 09:36:47 +0300, Dan Carpenter wrote:
> On Tue, Oct 21, 2025 at 10:05:33AM +0800, Lizhi Xu wrote:
> > On Mon, 20 Oct 2025 20:59:24 +0300, Dan Carpenter wrote:
> > > On Mon, Oct 20, 2025 at 09:49:12PM +0800, Lizhi Xu wrote:
> > > > On Mon, 20 Oct 2025 21:34:56 +0800, Lizhi Xu wrote:
> > > > > > Task0 Task1 Task2
> > > > > > ===== ===== =====
> > > > > > [97] nr_add_node()
> > > > > > [113] nr_neigh_get_dev() [97] nr_add_node()
> > > > > > [214] nr_node_lock()
> > > > > > [245] nr_node->routes[2].neighbour->count--
> > > > > > [246] nr_neigh_put(nr_node->routes[2].neighbour);
> > > > > > [248] nr_remove_neigh(nr_node->routes[2].neighbour)
> > > > > > [283] nr_node_unlock()
> > > > > > [214] nr_node_lock()
> > > > > > [253] nr_node->routes[2].neighbour = nr_neigh
> > > > > > [254] nr_neigh_hold(nr_neigh); [97] nr_add_node()
> > > > > > [XXX] nr_neigh_put()
> > > > > > ^^^^^^^^^^^^^^^^^^^^
> > > > > >
> > > > > > These charts are supposed to be chronological so [XXX] is wrong because the
> > > > > > use after free happens on line [248]. Do we really need three threads to
> > > > > > make this race work?
> > > > > The UAF problem occurs in Task2. Task1 sets the refcount of nr_neigh to 1,
> > > > > then Task0 adds it to routes[2]. Task2 releases routes[2].neighbour after
> > > > > executing [XXX]nr_neigh_put().
> > > > Execution Order:
> > > > 1 -> Task0
> > > > [113] nr_neigh_get_dev() // After execution, the refcount value is 3
> > > >
> > > > 2 -> Task1
> > > > [246] nr_neigh_put(nr_node->routes[2].neighbour); // After execution, the refcount value is 2
> > > > [248] nr_remove_neigh(nr_node->routes[2].neighbour) // After execution, the refcount value is 1
> > > >
> > > > 3 -> Task0
> > > > [253] nr_node->routes[2].neighbour = nr_neigh // nr_neigh's refcount value is 1 and add it to routes[2]
> > > >
> > > > 4 -> Task2
> > > > [XXX] nr_neigh_put(nr_node->routes[2].neighbour) // After execution, neighhour is freed
> > > > if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked) // Uaf occurs this line when accessing neighbour->count
> > >
> > > Let's step back a bit and look at the bigger picture design. (Which is
> > > completely undocumented so we're just guessing).
> > >
> > > When we put nr_neigh into nr_node->routes[] we bump the nr_neigh_hold()
> > > reference count and nr_neigh->count++, then when we remove it from
> > > ->routes[] we drop the reference and do nr_neigh->count--.
> > >
> > > If it's the last reference (and we are not holding ->locked) then we
> > > remove it from the &nr_neigh_list and drop the reference count again and
> > > free it. So we drop the reference count twice. This is a complicated
> > > design with three variables: nr_neigh_hold(), nr_neigh->count and
> > > ->locked. Why can it not just be one counter nr_neigh_hold(). So
> > > instead of setting locked = true we would just take an extra reference?
> > > The nr_neigh->count++ would be replaced with nr_neigh_hold() as well.
> > locked controls whether the neighbor quality can be automatically updated;
>
> I'm not sure your patch fixes the bug because we could still race against
> nr_del_node().
This is fine in this issue, but for rigor, I'll add locks to all related
ioctl and route frame operations to maintain synchronization.
I will send V3 patch to improve it.
>
> I'm not saying get rid of locked completely, I'm saying get rid of code like
> this:
> if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked)
> nr_remove_neigh(nr_node->routes[2].neighbour);
>
> Right now, locked serves as a special kind of reference count, because we
> don't drop the reference if it's true.
I don't think this is correct.
>
> > count controls the number of different routes a neighbor is linked to;
>
> Sure, that is interesting information for the user, so keep it around to
> print in the proc file, but don't use it as a reference count.
>
> > refcount is simply used to manage the neighbor lifecycle.
>
> The bug is caused because our reference counting is bad.
>
> So right now what happens is we allocate nr_neigh and we put it on the
> &nr_neigh_list. Then we lock it or we add it to ->routes[] and each of
> those has a different reference count. Then when we drop those references
> we do:
>
> if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked)
> nr_remove_neigh(nr_node->routes[2].neighbour);
>
> This removes it from the list, and hopefully this is the last reference
> and it frees it.
>
> It would be much simpler to say, we only use nr_neigh_hold()/put() for
> reference counting. When we set locked we do:
>
> nr_neigh_hold(nr_neigh);
> nr_neigh->locked = true;
>
> Incrementing the refcount means it can't be freed.
No, setting locked = 1 is only done in nr_add_neigh(), and nr_neigh_hold()
is not executed, and the refcount value is 1.
>
> Then when we remove nr_neigh from ->routes[] we wouldn't "remove it from
> the list", instead we would just drop a reference. When we dropped the
> last reference, nr_neigh_put() would remove it from the list.
>
> My proposal would be a behavior change because right now what happens is:
>
> 1: allocate nr_neigh
> 2: add it to ->routes[]
> 3: remove it from ->routes[]
> (freed automatically because we drop two references)
No, No, I know where your analysis went wrong, it is here.
The problem is not when allocating neigh and adding it to routes[2],
but when nr_add_node is executed twice later, one is Task0 as I mentioned
above, and the other is Task1.
After Task1 moves the neighbor out of routes, Task0 uses nr_neigh_get_dev()
to get the neighbor that Task1 moved out, and then adds it to its routes.
This is wrong. Task0 should not use nr_neigh_get_dev() to obtain the neighbor
before other tasks move it out. This will interfere with the reference count
of the neighbor, which is the root cause of the problem.
BR,
Lizhi
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH V3] netrom: Prevent race conditions between neighbor operations
2025-10-21 6:36 ` Dan Carpenter
2025-10-21 8:34 ` Lizhi Xu
@ 2025-10-21 8:35 ` Lizhi Xu
2025-10-23 11:44 ` Paolo Abeni
1 sibling, 1 reply; 27+ messages in thread
From: Lizhi Xu @ 2025-10-21 8:35 UTC (permalink / raw)
To: dan.carpenter
Cc: davem, edumazet, horms, kuba, linux-hams, linux-kernel, lizhi.xu,
netdev, pabeni, syzbot+2860e75836a08b172755, syzkaller-bugs
The root cause of the problem is that multiple different tasks initiate
SIOCADDRT & NETROM_NODE commands to add new routes, there is no lock
between them to protect the same nr_neigh.
Task0 can add the nr_neigh.refcount value of 1 on Task1 to routes[2].
When Task2 executes nr_neigh_put(nr_node->routes[2].neighbour), it will
release the neighbour because its refcount value is 1.
In this case, the following situation causes a UAF on Task2:
Task0 Task1 Task2
===== ===== =====
nr_add_node()
nr_neigh_get_dev() nr_add_node()
nr_node_lock()
nr_node->routes[2].neighbour->count--
nr_neigh_put(nr_node->routes[2].neighbour);
nr_remove_neigh(nr_node->routes[2].neighbour)
nr_node_unlock()
nr_node_lock()
nr_node->routes[2].neighbour = nr_neigh
nr_neigh_hold(nr_neigh); nr_add_node()
nr_neigh_put()
if (nr_node->routes[2].neighbour->count
Description of the UAF triggering process:
First, Task 0 executes nr_neigh_get_dev() to set neighbor refcount to 3.
Then, Task 1 puts the same neighbor from its routes[2] and executes
nr_remove_neigh() because the count is 0. After these two operations,
the neighbor's refcount becomes 1. Then, Task 0 acquires the nr node
lock and writes it to its routes[2].neighbour.
Finally, Task 2 executes nr_neigh_put(nr_node->routes[2].neighbour) to
release the neighbor. The subsequent execution of the neighbor->count
check triggers a UAF.
The solution to the problem is to use a lock to synchronize each add a
route to node, but for rigor, I'll add locks to related ioctl and route
frame operations to maintain synchronization.
syzbot reported:
BUG: KASAN: slab-use-after-free in nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
Read of size 4 at addr ffff888051e6e9b0 by task syz.1.2539/8741
Call Trace:
<TASK>
nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
Reported-by: syzbot+2860e75836a08b172755@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=2860e75836a08b172755
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
---
V1 -> V2: update comments for cause uaf
V2 -> V3: sync neighbor operations in ioctl and route frame, update comments
net/netrom/nr_route.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/netrom/nr_route.c b/net/netrom/nr_route.c
index b94cb2ffbaf8..debe3e925338 100644
--- a/net/netrom/nr_route.c
+++ b/net/netrom/nr_route.c
@@ -40,6 +40,7 @@ static HLIST_HEAD(nr_node_list);
static DEFINE_SPINLOCK(nr_node_list_lock);
static HLIST_HEAD(nr_neigh_list);
static DEFINE_SPINLOCK(nr_neigh_list_lock);
+static DEFINE_MUTEX(neighbor_lock);
static struct nr_node *nr_node_get(ax25_address *callsign)
{
@@ -633,6 +634,8 @@ int nr_rt_ioctl(unsigned int cmd, void __user *arg)
ax25_digi digi;
int ret;
+ guard(mutex)(&neighbor_lock);
+
switch (cmd) {
case SIOCADDRT:
if (copy_from_user(&nr_route, arg, sizeof(struct nr_route_struct)))
@@ -765,6 +768,7 @@ int nr_route_frame(struct sk_buff *skb, ax25_cb *ax25)
nr_dest = (ax25_address *)(skb->data + 7);
if (ax25 != NULL) {
+ guard(mutex)(&neighbor_lock);
ret = nr_add_node(nr_src, "", &ax25->dest_addr, ax25->digipeat,
ax25->ax25_dev->dev, 0,
READ_ONCE(sysctl_netrom_obsolescence_count_initialiser));
--
2.43.0
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH V3] netrom: Prevent race conditions between neighbor operations
2025-10-21 8:35 ` [PATCH V3] netrom: Prevent race conditions between neighbor operations Lizhi Xu
@ 2025-10-23 11:44 ` Paolo Abeni
2025-10-23 11:54 ` Eric Dumazet
` (2 more replies)
0 siblings, 3 replies; 27+ messages in thread
From: Paolo Abeni @ 2025-10-23 11:44 UTC (permalink / raw)
To: Lizhi Xu, dan.carpenter
Cc: davem, edumazet, horms, kuba, linux-hams, linux-kernel, netdev,
syzbot+2860e75836a08b172755, syzkaller-bugs
On 10/21/25 10:35 AM, Lizhi Xu wrote:
> The root cause of the problem is that multiple different tasks initiate
> SIOCADDRT & NETROM_NODE commands to add new routes, there is no lock
> between them to protect the same nr_neigh.
>
> Task0 can add the nr_neigh.refcount value of 1 on Task1 to routes[2].
> When Task2 executes nr_neigh_put(nr_node->routes[2].neighbour), it will
> release the neighbour because its refcount value is 1.
>
> In this case, the following situation causes a UAF on Task2:
>
> Task0 Task1 Task2
> ===== ===== =====
> nr_add_node()
> nr_neigh_get_dev() nr_add_node()
> nr_node_lock()
> nr_node->routes[2].neighbour->count--
> nr_neigh_put(nr_node->routes[2].neighbour);
> nr_remove_neigh(nr_node->routes[2].neighbour)
> nr_node_unlock()
> nr_node_lock()
> nr_node->routes[2].neighbour = nr_neigh
> nr_neigh_hold(nr_neigh); nr_add_node()
> nr_neigh_put()
> if (nr_node->routes[2].neighbour->count
> Description of the UAF triggering process:
> First, Task 0 executes nr_neigh_get_dev() to set neighbor refcount to 3.
> Then, Task 1 puts the same neighbor from its routes[2] and executes
> nr_remove_neigh() because the count is 0. After these two operations,
> the neighbor's refcount becomes 1. Then, Task 0 acquires the nr node
> lock and writes it to its routes[2].neighbour.
> Finally, Task 2 executes nr_neigh_put(nr_node->routes[2].neighbour) to
> release the neighbor. The subsequent execution of the neighbor->count
> check triggers a UAF.
>
> The solution to the problem is to use a lock to synchronize each add a
> route to node, but for rigor, I'll add locks to related ioctl and route
> frame operations to maintain synchronization.
I think that adding another locking mechanism on top of an already
complex and not well understood locking and reference infra is not the
right direction.
Why reordering the statements as:
if (nr_node->routes[2].neighbour->count == 0 &&
!nr_node->routes[2].neighbour->locked)
nr_remove_neigh(nr_node->routes[2].neighbour);
nr_neigh_put(nr_node->routes[2].neighbour);
is not enough?
> syzbot reported:
> BUG: KASAN: slab-use-after-free in nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
> Read of size 4 at addr ffff888051e6e9b0 by task syz.1.2539/8741
>
> Call Trace:
> <TASK>
> nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
>
> Reported-by: syzbot+2860e75836a08b172755@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=2860e75836a08b172755
> Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
> ---
> V1 -> V2: update comments for cause uaf
> V2 -> V3: sync neighbor operations in ioctl and route frame, update comments
>
> net/netrom/nr_route.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/net/netrom/nr_route.c b/net/netrom/nr_route.c
> index b94cb2ffbaf8..debe3e925338 100644
> --- a/net/netrom/nr_route.c
> +++ b/net/netrom/nr_route.c
> @@ -40,6 +40,7 @@ static HLIST_HEAD(nr_node_list);
> static DEFINE_SPINLOCK(nr_node_list_lock);
> static HLIST_HEAD(nr_neigh_list);
> static DEFINE_SPINLOCK(nr_neigh_list_lock);
> +static DEFINE_MUTEX(neighbor_lock);
>
> static struct nr_node *nr_node_get(ax25_address *callsign)
> {
> @@ -633,6 +634,8 @@ int nr_rt_ioctl(unsigned int cmd, void __user *arg)
> ax25_digi digi;
> int ret;
>
> + guard(mutex)(&neighbor_lock);
See:
https://elixir.bootlin.com/linux/v6.18-rc1/source/Documentation/process/maintainer-netdev.rst#L395
/P
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V3] netrom: Prevent race conditions between neighbor operations
2025-10-23 11:44 ` Paolo Abeni
@ 2025-10-23 11:54 ` Eric Dumazet
2025-10-23 12:41 ` Lizhi Xu
2025-10-24 10:45 ` [PATCH V3] netrom: Prevent race conditions between neighbor operations Dan Carpenter
2 siblings, 0 replies; 27+ messages in thread
From: Eric Dumazet @ 2025-10-23 11:54 UTC (permalink / raw)
To: Paolo Abeni
Cc: Lizhi Xu, dan.carpenter, davem, horms, kuba, linux-hams,
linux-kernel, netdev, syzbot+2860e75836a08b172755, syzkaller-bugs
On Thu, Oct 23, 2025 at 4:44 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 10/21/25 10:35 AM, Lizhi Xu wrote:
> > The root cause of the problem is that multiple different tasks initiate
> > SIOCADDRT & NETROM_NODE commands to add new routes, there is no lock
> > between them to protect the same nr_neigh.
> >
> > Task0 can add the nr_neigh.refcount value of 1 on Task1 to routes[2].
> > When Task2 executes nr_neigh_put(nr_node->routes[2].neighbour), it will
> > release the neighbour because its refcount value is 1.
> >
> > In this case, the following situation causes a UAF on Task2:
> >
> > Task0 Task1 Task2
> > ===== ===== =====
> > nr_add_node()
> > nr_neigh_get_dev() nr_add_node()
> > nr_node_lock()
> > nr_node->routes[2].neighbour->count--
> > nr_neigh_put(nr_node->routes[2].neighbour);
> > nr_remove_neigh(nr_node->routes[2].neighbour)
> > nr_node_unlock()
> > nr_node_lock()
> > nr_node->routes[2].neighbour = nr_neigh
> > nr_neigh_hold(nr_neigh); nr_add_node()
> > nr_neigh_put()
> > if (nr_node->routes[2].neighbour->count
> > Description of the UAF triggering process:
> > First, Task 0 executes nr_neigh_get_dev() to set neighbor refcount to 3.
> > Then, Task 1 puts the same neighbor from its routes[2] and executes
> > nr_remove_neigh() because the count is 0. After these two operations,
> > the neighbor's refcount becomes 1. Then, Task 0 acquires the nr node
> > lock and writes it to its routes[2].neighbour.
> > Finally, Task 2 executes nr_neigh_put(nr_node->routes[2].neighbour) to
> > release the neighbor. The subsequent execution of the neighbor->count
> > check triggers a UAF.
> >
> > The solution to the problem is to use a lock to synchronize each add a
> > route to node, but for rigor, I'll add locks to related ioctl and route
> > frame operations to maintain synchronization.
>
> I think that adding another locking mechanism on top of an already
> complex and not well understood locking and reference infra is not the
> right direction.
>
> Why reordering the statements as:
>
> if (nr_node->routes[2].neighbour->count == 0 &&
> !nr_node->routes[2].neighbour->locked)
> nr_remove_neigh(nr_node->routes[2].neighbour);
> nr_neigh_put(nr_node->routes[2].neighbour);
>
> is not enough?
>
> > syzbot reported:
> > BUG: KASAN: slab-use-after-free in nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
> > Read of size 4 at addr ffff888051e6e9b0 by task syz.1.2539/8741
> >
> > Call Trace:
> > <TASK>
> > nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
> >
> > Reported-by: syzbot+2860e75836a08b172755@syzkaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=2860e75836a08b172755
> > Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
>
>
>
> > ---
> > V1 -> V2: update comments for cause uaf
> > V2 -> V3: sync neighbor operations in ioctl and route frame, update comments
> >
> > net/netrom/nr_route.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/net/netrom/nr_route.c b/net/netrom/nr_route.c
> > index b94cb2ffbaf8..debe3e925338 100644
> > --- a/net/netrom/nr_route.c
> > +++ b/net/netrom/nr_route.c
> > @@ -40,6 +40,7 @@ static HLIST_HEAD(nr_node_list);
> > static DEFINE_SPINLOCK(nr_node_list_lock);
> > static HLIST_HEAD(nr_neigh_list);
> > static DEFINE_SPINLOCK(nr_neigh_list_lock);
> > +static DEFINE_MUTEX(neighbor_lock);
> >
> > static struct nr_node *nr_node_get(ax25_address *callsign)
> > {
> > @@ -633,6 +634,8 @@ int nr_rt_ioctl(unsigned int cmd, void __user *arg)
> > ax25_digi digi;
> > int ret;
> >
> > + guard(mutex)(&neighbor_lock);
>
> See:
>
> https://elixir.bootlin.com/linux/v6.18-rc1/source/Documentation/process/maintainer-netdev.rst#L395
>
I would also try to use a single spinlock : ie fuse together
nr_node_list_lock and nr_neigh_list_lock
Having two locks for something that is primarily used by fuzzers
nowadays is wasting our time.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V3] netrom: Prevent race conditions between neighbor operations
2025-10-23 11:44 ` Paolo Abeni
2025-10-23 11:54 ` Eric Dumazet
@ 2025-10-23 12:41 ` Lizhi Xu
2025-10-23 13:50 ` [PATCH V4] netrom: Preventing the use of abnormal neighbor Lizhi Xu
2025-10-24 10:45 ` [PATCH V3] netrom: Prevent race conditions between neighbor operations Dan Carpenter
2 siblings, 1 reply; 27+ messages in thread
From: Lizhi Xu @ 2025-10-23 12:41 UTC (permalink / raw)
To: pabeni
Cc: dan.carpenter, davem, edumazet, horms, kuba, linux-hams,
linux-kernel, lizhi.xu, netdev, syzbot+2860e75836a08b172755,
syzkaller-bugs
On Thu, 23 Oct 2025 13:44:18 +0200, Paolo Abeni wrote:
> > The root cause of the problem is that multiple different tasks initiate
> > SIOCADDRT & NETROM_NODE commands to add new routes, there is no lock
> > between them to protect the same nr_neigh.
> >
> > Task0 can add the nr_neigh.refcount value of 1 on Task1 to routes[2].
> > When Task2 executes nr_neigh_put(nr_node->routes[2].neighbour), it will
> > release the neighbour because its refcount value is 1.
> >
> > In this case, the following situation causes a UAF on Task2:
> >
> > Task0 Task1 Task2
> > ===== ===== =====
> > nr_add_node()
> > nr_neigh_get_dev() nr_add_node()
> > nr_node_lock()
> > nr_node->routes[2].neighbour->count--
> > nr_neigh_put(nr_node->routes[2].neighbour);
> > nr_remove_neigh(nr_node->routes[2].neighbour)
> > nr_node_unlock()
> > nr_node_lock()
> > nr_node->routes[2].neighbour = nr_neigh
> > nr_neigh_hold(nr_neigh); nr_add_node()
> > nr_neigh_put()
> > if (nr_node->routes[2].neighbour->count
> > Description of the UAF triggering process:
> > First, Task 0 executes nr_neigh_get_dev() to set neighbor refcount to 3.
> > Then, Task 1 puts the same neighbor from its routes[2] and executes
> > nr_remove_neigh() because the count is 0. After these two operations,
> > the neighbor's refcount becomes 1. Then, Task 0 acquires the nr node
> > lock and writes it to its routes[2].neighbour.
> > Finally, Task 2 executes nr_neigh_put(nr_node->routes[2].neighbour) to
> > release the neighbor. The subsequent execution of the neighbor->count
> > check triggers a UAF.
> >
> > The solution to the problem is to use a lock to synchronize each add a
> > route to node, but for rigor, I'll add locks to related ioctl and route
> > frame operations to maintain synchronization.
>
> I think that adding another locking mechanism on top of an already
> complex and not well understood locking and reference infra is not the
> right direction.
>
> Why reordering the statements as:
>
> if (nr_node->routes[2].neighbour->count == 0 &&
> !nr_node->routes[2].neighbour->locked)
> nr_remove_neigh(nr_node->routes[2].neighbour);
> nr_neigh_put(nr_node->routes[2].neighbour);
>
> is not enough?
This is not enough, the same uaf will appear, nr_remove_neigh() will also
execute nr_neigh_put(), and then executing nr_neigh_put() again will
trigger the uaf.
>
> > syzbot reported:
> > BUG: KASAN: slab-use-after-free in nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
> > Read of size 4 at addr ffff888051e6e9b0 by task syz.1.2539/8741
> >
> > Call Trace:
> > <TASK>
> > nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
> >
> > Reported-by: syzbot+2860e75836a08b172755@syzkaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=2860e75836a08b172755
> > Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
>
>
>
> > ---
> > V1 -> V2: update comments for cause uaf
> > V2 -> V3: sync neighbor operations in ioctl and route frame, update comments
> >
> > net/netrom/nr_route.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/net/netrom/nr_route.c b/net/netrom/nr_route.c
> > index b94cb2ffbaf8..debe3e925338 100644
> > --- a/net/netrom/nr_route.c
> > +++ b/net/netrom/nr_route.c
> > @@ -40,6 +40,7 @@ static HLIST_HEAD(nr_node_list);
> > static DEFINE_SPINLOCK(nr_node_list_lock);
> > static HLIST_HEAD(nr_neigh_list);
> > static DEFINE_SPINLOCK(nr_neigh_list_lock);
> > +static DEFINE_MUTEX(neighbor_lock);
> >
> > static struct nr_node *nr_node_get(ax25_address *callsign)
> > {
> > @@ -633,6 +634,8 @@ int nr_rt_ioctl(unsigned int cmd, void __user *arg)
> > ax25_digi digi;
> > int ret;
> >
> > + guard(mutex)(&neighbor_lock);
>
> See:
>
> https://elixir.bootlin.com/linux/v6.18-rc1/source/Documentation/process/maintainer-netdev.rst#L395
Using guard is not recommended. I'll reconsider.
BR,
Lizhi
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH V4] netrom: Preventing the use of abnormal neighbor
2025-10-23 12:41 ` Lizhi Xu
@ 2025-10-23 13:50 ` Lizhi Xu
2025-10-28 14:13 ` Paolo Abeni
0 siblings, 1 reply; 27+ messages in thread
From: Lizhi Xu @ 2025-10-23 13:50 UTC (permalink / raw)
To: lizhi.xu
Cc: dan.carpenter, davem, edumazet, horms, kuba, linux-hams,
linux-kernel, netdev, pabeni, syzbot+2860e75836a08b172755,
syzkaller-bugs
The root cause of the problem is that multiple different tasks initiate
SIOCADDRT & NETROM_NODE commands to add new routes, there is no lock
between them to protect the same nr_neigh.
Task0 can add the nr_neigh.refcount value of 1 on Task1 to routes[2].
When Task2 executes nr_neigh_put(nr_node->routes[2].neighbour), it will
release the neighbour because its refcount value is 1.
In this case, the following situation causes a UAF on Task2:
Task0 Task1 Task2
===== ===== =====
nr_add_node()
nr_neigh_get_dev() nr_add_node()
nr_node_lock()
nr_node->routes[2].neighbour->count--
nr_neigh_put(nr_node->routes[2].neighbour);
nr_remove_neigh(nr_node->routes[2].neighbour)
nr_node_unlock()
nr_node_lock()
nr_node->routes[2].neighbour = nr_neigh
nr_neigh_hold(nr_neigh); nr_add_node()
nr_neigh_put()
if (nr_node->routes[2].neighbour->count
Description of the UAF triggering process:
First, Task 0 executes nr_neigh_get_dev() to set neighbor refcount to 3.
Then, Task 1 puts the same neighbor from its routes[2] and executes
nr_remove_neigh() because the count is 0. After these two operations,
the neighbor's refcount becomes 1. Then, Task 0 acquires the nr node
lock and writes it to its routes[2].neighbour.
Finally, Task 2 executes nr_neigh_put(nr_node->routes[2].neighbour) to
release the neighbor. The subsequent execution of the neighbor->count
check triggers a UAF.
Filter out neighbors with a refcount of 1 to avoid unsafe conditions.
syzbot reported:
BUG: KASAN: slab-use-after-free in nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
Read of size 4 at addr ffff888051e6e9b0 by task syz.1.2539/8741
Call Trace:
<TASK>
nr_add_node+0x25db/0x2c00 net/netrom/nr_route.c:248
Reported-by: syzbot+2860e75836a08b172755@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=2860e75836a08b172755
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
---
V1 -> V2: update comments for cause uaf
V2 -> V3: sync neighbor operations in ioctl and route frame, update comments
V3 -> V4: Preventing the use of neighbors with a reference count of 1
net/netrom/nr_route.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/net/netrom/nr_route.c b/net/netrom/nr_route.c
index b94cb2ffbaf8..1ef2743a5ec0 100644
--- a/net/netrom/nr_route.c
+++ b/net/netrom/nr_route.c
@@ -100,7 +100,7 @@ static int __must_check nr_add_node(ax25_address *nr, const char *mnemonic,
{
struct nr_node *nr_node;
struct nr_neigh *nr_neigh;
- int i, found;
+ int i, found, ret = 0;
struct net_device *odev;
if ((odev=nr_dev_get(nr)) != NULL) { /* Can't add routes to ourself */
@@ -212,6 +212,10 @@ static int __must_check nr_add_node(ax25_address *nr, const char *mnemonic,
return 0;
}
nr_node_lock(nr_node);
+ if (refcount_read(&nr_neigh->refcount) == 1) {
+ ret = -EINVAL;
+ goto out;
+ }
if (quality != 0)
strscpy(nr_node->mnemonic, mnemonic);
@@ -279,10 +283,11 @@ static int __must_check nr_add_node(ax25_address *nr, const char *mnemonic,
}
}
+out:
nr_neigh_put(nr_neigh);
nr_node_unlock(nr_node);
nr_node_put(nr_node);
- return 0;
+ return ret;
}
static void nr_remove_node_locked(struct nr_node *nr_node)
--
2.43.0
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH V3] netrom: Prevent race conditions between neighbor operations
2025-10-23 11:44 ` Paolo Abeni
2025-10-23 11:54 ` Eric Dumazet
2025-10-23 12:41 ` Lizhi Xu
@ 2025-10-24 10:45 ` Dan Carpenter
2 siblings, 0 replies; 27+ messages in thread
From: Dan Carpenter @ 2025-10-24 10:45 UTC (permalink / raw)
To: Paolo Abeni
Cc: Lizhi Xu, davem, edumazet, horms, kuba, linux-hams, linux-kernel,
netdev, syzbot+2860e75836a08b172755, syzkaller-bugs
On Thu, Oct 23, 2025 at 01:44:18PM +0200, Paolo Abeni wrote:
> Why reordering the statements as:
>
> if (nr_node->routes[2].neighbour->count == 0 &&
> !nr_node->routes[2].neighbour->locked)
> nr_remove_neigh(nr_node->routes[2].neighbour);
> nr_neigh_put(nr_node->routes[2].neighbour);
>
> is not enough?
There are so many unfortunate things like this:
net/netrom/nr_route.c
243 /* It must be better than the worst */
244 if (quality > nr_node->routes[2].quality) {
245 nr_node->routes[2].neighbour->count--;
++/-- are not atomic.
246 nr_neigh_put(nr_node->routes[2].neighbour);
247
248 if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked)
249 nr_remove_neigh(nr_node->routes[2].neighbour);
250
251 nr_node->routes[2].quality = quality;
252 nr_node->routes[2].obs_count = obs_count;
253 nr_node->routes[2].neighbour = nr_neigh;
This line should come after the next two lines.
254
255 nr_neigh_hold(nr_neigh);
256 nr_neigh->count++;
257 }
regards,
dan carpenter
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V4] netrom: Preventing the use of abnormal neighbor
2025-10-23 13:50 ` [PATCH V4] netrom: Preventing the use of abnormal neighbor Lizhi Xu
@ 2025-10-28 14:13 ` Paolo Abeni
2025-10-29 2:59 ` Lizhi Xu
0 siblings, 1 reply; 27+ messages in thread
From: Paolo Abeni @ 2025-10-28 14:13 UTC (permalink / raw)
To: Lizhi Xu
Cc: dan.carpenter, davem, edumazet, horms, kuba, linux-hams,
linux-kernel, netdev, syzbot+2860e75836a08b172755, syzkaller-bugs
On 10/23/25 3:50 PM, Lizhi Xu wrote:
> The root cause of the problem is that multiple different tasks initiate
> SIOCADDRT & NETROM_NODE commands to add new routes, there is no lock
> between them to protect the same nr_neigh.
>
> Task0 can add the nr_neigh.refcount value of 1 on Task1 to routes[2].
> When Task2 executes nr_neigh_put(nr_node->routes[2].neighbour), it will
> release the neighbour because its refcount value is 1.
>
> In this case, the following situation causes a UAF on Task2:
>
> Task0 Task1 Task2
> ===== ===== =====
> nr_add_node()
> nr_neigh_get_dev() nr_add_node()
> nr_node_lock()
> nr_node->routes[2].neighbour->count--
> nr_neigh_put(nr_node->routes[2].neighbour);
> nr_remove_neigh(nr_node->routes[2].neighbour)
> nr_node_unlock()
> nr_node_lock()
> nr_node->routes[2].neighbour = nr_neigh
> nr_neigh_hold(nr_neigh); nr_add_node()
> nr_neigh_put()
> if (nr_node->routes[2].neighbour->count
> Description of the UAF triggering process:
> First, Task 0 executes nr_neigh_get_dev() to set neighbor refcount to 3.
> Then, Task 1 puts the same neighbor from its routes[2] and executes
> nr_remove_neigh() because the count is 0. After these two operations,
> the neighbor's refcount becomes 1. Then, Task 0 acquires the nr node
> lock and writes it to its routes[2].neighbour.
> Finally, Task 2 executes nr_neigh_put(nr_node->routes[2].neighbour) to
> release the neighbor. The subsequent execution of the neighbor->count
> check triggers a UAF.
I looked at the code quite a bit and I think this could possibly avoid
the above mentioned race, but this whole area looks quite confusing to me.
I think it would be helpful if you could better describe the relevant
scenario starting from the initial setup (no nodes, no neighs).
Thanks,
Paolo
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V4] netrom: Preventing the use of abnormal neighbor
2025-10-28 14:13 ` Paolo Abeni
@ 2025-10-29 2:59 ` Lizhi Xu
2025-11-13 6:33 ` Lizhi Xu
0 siblings, 1 reply; 27+ messages in thread
From: Lizhi Xu @ 2025-10-29 2:59 UTC (permalink / raw)
To: pabeni
Cc: dan.carpenter, davem, edumazet, horms, kuba, linux-hams,
linux-kernel, lizhi.xu, netdev, syzbot+2860e75836a08b172755,
syzkaller-bugs
On Tue, 28 Oct 2025 15:13:37 +0100, Paolo Abeni wrote:
> > The root cause of the problem is that multiple different tasks initiate
> > SIOCADDRT & NETROM_NODE commands to add new routes, there is no lock
> > between them to protect the same nr_neigh.
> >
> > Task0 can add the nr_neigh.refcount value of 1 on Task1 to routes[2].
> > When Task2 executes nr_neigh_put(nr_node->routes[2].neighbour), it will
> > release the neighbour because its refcount value is 1.
> >
> > In this case, the following situation causes a UAF on Task2:
> >
> > Task0 Task1 Task2
> > ===== ===== =====
> > nr_add_node()
> > nr_neigh_get_dev() nr_add_node()
> > nr_node_lock()
> > nr_node->routes[2].neighbour->count--
> > nr_neigh_put(nr_node->routes[2].neighbour);
> > nr_remove_neigh(nr_node->routes[2].neighbour)
> > nr_node_unlock()
> > nr_node_lock()
> > nr_node->routes[2].neighbour = nr_neigh
> > nr_neigh_hold(nr_neigh); nr_add_node()
> > nr_neigh_put()
> > if (nr_node->routes[2].neighbour->count
> > Description of the UAF triggering process:
> > First, Task 0 executes nr_neigh_get_dev() to set neighbor refcount to 3.
> > Then, Task 1 puts the same neighbor from its routes[2] and executes
> > nr_remove_neigh() because the count is 0. After these two operations,
> > the neighbor's refcount becomes 1. Then, Task 0 acquires the nr node
> > lock and writes it to its routes[2].neighbour.
> > Finally, Task 2 executes nr_neigh_put(nr_node->routes[2].neighbour) to
> > release the neighbor. The subsequent execution of the neighbor->count
> > check triggers a UAF.
>
> I looked at the code quite a bit and I think this could possibly avoid
> the above mentioned race, but this whole area looks quite confusing to me.
>
> I think it would be helpful if you could better describe the relevant
> scenario starting from the initial setup (no nodes, no neighs).
OK. Let me fill in the origin of neigh.
Task3
=====
nr_add_node()
[146]if ((nr_neigh = kmalloc(sizeof(*nr_neigh), GFP_ATOMIC)) == NULL)
[253]nr_node->routes[2].neighbour = nr_neigh;
[255]nr_neigh_hold(nr_neigh);
[256]nr_neigh->count++;
neigh is created on line 146 in nr_add_node(), and added to node on
lines 253-256. It occurs before all Task0, Task1, and Task2.
Note:
1. [x], x is line number.
2. During my debugging process, I didn't pay attention to where the node
was created, and I apologize that I cannot provide the relevant creation
process.
BR,
Lizhi
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V4] netrom: Preventing the use of abnormal neighbor
2025-10-29 2:59 ` Lizhi Xu
@ 2025-11-13 6:33 ` Lizhi Xu
0 siblings, 0 replies; 27+ messages in thread
From: Lizhi Xu @ 2025-11-13 6:33 UTC (permalink / raw)
To: lizhi.xu
Cc: dan.carpenter, davem, edumazet, horms, kuba, linux-hams,
linux-kernel, netdev, syzbot+2860e75836a08b172755, syzkaller-bugs
On Wed, 29 Oct 2025 10:59:04 +0800, Lizhi Xu wrote:
> > > The root cause of the problem is that multiple different tasks initiate
> > > SIOCADDRT & NETROM_NODE commands to add new routes, there is no lock
> > > between them to protect the same nr_neigh.
> > >
> > > Task0 can add the nr_neigh.refcount value of 1 on Task1 to routes[2].
> > > When Task2 executes nr_neigh_put(nr_node->routes[2].neighbour), it will
> > > release the neighbour because its refcount value is 1.
> > >
> > > In this case, the following situation causes a UAF on Task2:
> > >
> > > Task0 Task1 Task2
> > > ===== ===== =====
> > > nr_add_node()
> > > nr_neigh_get_dev() nr_add_node()
> > > nr_node_lock()
> > > nr_node->routes[2].neighbour->count--
> > > nr_neigh_put(nr_node->routes[2].neighbour);
> > > nr_remove_neigh(nr_node->routes[2].neighbour)
> > > nr_node_unlock()
> > > nr_node_lock()
> > > nr_node->routes[2].neighbour = nr_neigh
> > > nr_neigh_hold(nr_neigh); nr_add_node()
> > > nr_neigh_put()
> > > if (nr_node->routes[2].neighbour->count
> > > Description of the UAF triggering process:
> > > First, Task 0 executes nr_neigh_get_dev() to set neighbor refcount to 3.
> > > Then, Task 1 puts the same neighbor from its routes[2] and executes
> > > nr_remove_neigh() because the count is 0. After these two operations,
> > > the neighbor's refcount becomes 1. Then, Task 0 acquires the nr node
> > > lock and writes it to its routes[2].neighbour.
> > > Finally, Task 2 executes nr_neigh_put(nr_node->routes[2].neighbour) to
> > > release the neighbor. The subsequent execution of the neighbor->count
> > > check triggers a UAF.
> >
> > I looked at the code quite a bit and I think this could possibly avoid
> > the above mentioned race, but this whole area looks quite confusing to me.
> >
> > I think it would be helpful if you could better describe the relevant
> > scenario starting from the initial setup (no nodes, no neighs).
> OK. Let me fill in the origin of neigh.
>
> Task3
> =====
> nr_add_node()
> [146]if ((nr_neigh = kmalloc(sizeof(*nr_neigh), GFP_ATOMIC)) == NULL)
> [253]nr_node->routes[2].neighbour = nr_neigh;
> [255]nr_neigh_hold(nr_neigh);
> [256]nr_neigh->count++;
>
> neigh is created on line 146 in nr_add_node(), and added to node on
> lines 253-256. It occurs before all Task0, Task1, and Task2.
>
> Note:
> 1. [x], x is line number.
> 2. During my debugging process, I didn't pay attention to where the node
> was created, and I apologize that I cannot provide the relevant creation
> process.
Hi everyone,
Today is my last day at WindRiver. Starting tomorrow, my email address
lizhi.xu@windriver.com will no longer be used;
I will use eadavis@qq.com thereafter.
BR,
Lizhi
^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2025-11-13 6:33 UTC | newest]
Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-25 16:35 [syzbot] [hams?] KASAN: slab-use-after-free Read in nr_add_node syzbot
2025-10-18 20:37 ` syzbot
2025-10-19 5:02 ` Brahmajit Das
2025-10-19 5:21 ` syzbot
2025-10-19 5:10 ` Brahmajit Das
2025-10-19 5:10 ` syzbot
2025-10-20 8:13 ` [PATCH] netrom: Prevent race conditions between multiple add route Lizhi Xu
2025-10-20 10:10 ` Dan Carpenter
2025-10-20 11:02 ` [PATCH V2] " Lizhi Xu
2025-10-20 12:25 ` Dan Carpenter
2025-10-20 12:33 ` Dan Carpenter
2025-10-20 12:57 ` Dan Carpenter
2025-10-20 13:34 ` Lizhi Xu
2025-10-20 13:49 ` Lizhi Xu
2025-10-20 17:59 ` Dan Carpenter
2025-10-21 2:05 ` Lizhi Xu
2025-10-21 6:36 ` Dan Carpenter
2025-10-21 8:34 ` Lizhi Xu
2025-10-21 8:35 ` [PATCH V3] netrom: Prevent race conditions between neighbor operations Lizhi Xu
2025-10-23 11:44 ` Paolo Abeni
2025-10-23 11:54 ` Eric Dumazet
2025-10-23 12:41 ` Lizhi Xu
2025-10-23 13:50 ` [PATCH V4] netrom: Preventing the use of abnormal neighbor Lizhi Xu
2025-10-28 14:13 ` Paolo Abeni
2025-10-29 2:59 ` Lizhi Xu
2025-11-13 6:33 ` Lizhi Xu
2025-10-24 10:45 ` [PATCH V3] netrom: Prevent race conditions between neighbor operations Dan Carpenter
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).