netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete
@ 2024-06-25 13:25 syzbot
  2024-06-25 14:53 ` James Chapman
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: syzbot @ 2024-06-25 13:25 UTC (permalink / raw)
  To: davem, edumazet, kuba, linux-kernel, netdev, pabeni,
	syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    185d72112b95 net: xilinx: axienet: Enable multicast by def..
git tree:       net-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=129911fa980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=e78fc116033e0ab7
dashboard link: https://syzkaller.appspot.com/bug?extid=c041b4ce3a6dfd1e63e2
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12521b9c980000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1062bd46980000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/e84f50e44254/disk-185d7211.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/df64b575cc01/vmlinux-185d7211.xz
kernel image: https://storage.googleapis.com/syzbot-assets/16ad5d1d433b/bzImage-185d7211.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+c041b4ce3a6dfd1e63e2@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: slab-use-after-free in instrument_atomic_read_write include/linux/instrumented.h:96 [inline]
BUG: KASAN: slab-use-after-free in test_and_set_bit include/asm-generic/bitops/instrumented-atomic.h:71 [inline]
BUG: KASAN: slab-use-after-free in l2tp_session_delete+0x28/0x9e0 net/l2tp/l2tp_core.c:1639
Write of size 8 at addr ffff88802aaf5808 by task kworker/u8:11/2523

CPU: 0 PID: 2523 Comm: kworker/u8:11 Not tainted 6.10.0-rc4-syzkaller-00869-g185d72112b95 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
Workqueue: l2tp l2tp_tunnel_del_work
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
 print_address_description mm/kasan/report.c:377 [inline]
 print_report+0x169/0x550 mm/kasan/report.c:488
 kasan_report+0x143/0x180 mm/kasan/report.c:601
 kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
 instrument_atomic_read_write include/linux/instrumented.h:96 [inline]
 test_and_set_bit include/asm-generic/bitops/instrumented-atomic.h:71 [inline]
 l2tp_session_delete+0x28/0x9e0 net/l2tp/l2tp_core.c:1639
 l2tp_tunnel_closeall net/l2tp/l2tp_core.c:1302 [inline]
 l2tp_tunnel_del_work+0x1cb/0x330 net/l2tp/l2tp_core.c:1334
 process_one_work kernel/workqueue.c:3231 [inline]
 process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
 worker_thread+0x86d/0xd70 kernel/workqueue.c:3393
 kthread+0x2f0/0x390 kernel/kthread.c:389
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>

Allocated by task 5107:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
 poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
 __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387
 kasan_kmalloc include/linux/kasan.h:211 [inline]
 __do_kmalloc_node mm/slub.c:4122 [inline]
 __kmalloc_noprof+0x1f9/0x400 mm/slub.c:4135
 kmalloc_noprof include/linux/slab.h:664 [inline]
 kzalloc_noprof include/linux/slab.h:778 [inline]
 l2tp_session_create+0x3b/0xc20 net/l2tp/l2tp_core.c:1675
 pppol2tp_connect+0xca3/0x17a0 net/l2tp/l2tp_ppp.c:782
 __sys_connect_file net/socket.c:2049 [inline]
 __sys_connect+0x2df/0x310 net/socket.c:2066
 __do_sys_connect net/socket.c:2076 [inline]
 __se_sys_connect net/socket.c:2073 [inline]
 __x64_sys_connect+0x7a/0x90 net/socket.c:2073
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Freed by task 16:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
 kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
 poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
 __kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
 kasan_slab_free include/linux/kasan.h:184 [inline]
 slab_free_hook mm/slub.c:2196 [inline]
 slab_free mm/slub.c:4437 [inline]
 kfree+0x149/0x360 mm/slub.c:4558
 __sk_destruct+0x58/0x5f0 net/core/sock.c:2191
 rcu_do_batch kernel/rcu/tree.c:2535 [inline]
 rcu_core+0xafd/0x1830 kernel/rcu/tree.c:2809
 handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
 run_ksoftirqd+0xca/0x130 kernel/softirq.c:928
 smpboot_thread_fn+0x544/0xa30 kernel/smpboot.c:164
 kthread+0x2f0/0x390 kernel/kthread.c:389
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

Last potentially related work creation:
 kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47
 __kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:541
 __call_rcu_common kernel/rcu/tree.c:3072 [inline]
 call_rcu+0x167/0xa70 kernel/rcu/tree.c:3176
 pppol2tp_release+0x24b/0x350 net/l2tp/l2tp_ppp.c:457
 __sock_release net/socket.c:659 [inline]
 sock_close+0xbc/0x240 net/socket.c:1421
 __fput+0x406/0x8b0 fs/file_table.c:422
 task_work_run+0x24f/0x310 kernel/task_work.c:180
 exit_task_work include/linux/task_work.h:38 [inline]
 do_exit+0xa27/0x27e0 kernel/exit.c:874
 do_group_exit+0x207/0x2c0 kernel/exit.c:1023
 get_signal+0x16a1/0x1740 kernel/signal.c:2909
 arch_do_signal_or_restart+0x96/0x860 arch/x86/kernel/signal.c:310
 exit_to_user_mode_loop kernel/entry/common.c:111 [inline]
 exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
 __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
 syscall_exit_to_user_mode+0xc9/0x370 kernel/entry/common.c:218
 do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

The buggy address belongs to the object at ffff88802aaf5800
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 8 bytes inside of
 freed 1024-byte region [ffff88802aaf5800, ffff88802aaf5c00)

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2aaf0
head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
anon flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xffffefff(slab)
raw: 00fff00000000040 ffff888015041dc0 0000000000000000 dead000000000001
raw: 0000000000000000 0000000000100010 00000001ffffefff 0000000000000000
head: 00fff00000000040 ffff888015041dc0 0000000000000000 dead000000000001
head: 0000000000000000 0000000000100010 00000001ffffefff 0000000000000000
head: 00fff00000000003 ffffea0000aabc01 ffffffffffffffff 0000000000000000
head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 1, tgid 1 (swapper/0), ts 11074988535, free_ts 0
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1468
 prep_new_page mm/page_alloc.c:1476 [inline]
 get_page_from_freelist+0x2e43/0x2f00 mm/page_alloc.c:3420
 __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4678
 __alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
 alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
 alloc_slab_page+0x5f/0x120 mm/slub.c:2265
 allocate_slab+0x5a/0x2f0 mm/slub.c:2428
 new_slab mm/slub.c:2481 [inline]
 ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3667
 __slab_alloc+0x58/0xa0 mm/slub.c:3757
 __slab_alloc_node mm/slub.c:3810 [inline]
 slab_alloc_node mm/slub.c:3989 [inline]
 __do_kmalloc_node mm/slub.c:4121 [inline]
 __kmalloc_noprof+0x257/0x400 mm/slub.c:4135
 kmalloc_noprof include/linux/slab.h:664 [inline]
 kzalloc_noprof include/linux/slab.h:778 [inline]
 alloc_workqueue+0x1b0/0x2060 kernel/workqueue.c:5667
 tipc_topsrv_work_start net/tipc/topsrv.c:635 [inline]
 tipc_topsrv_start net/tipc/topsrv.c:679 [inline]
 tipc_topsrv_init_net+0x303/0x9d0 net/tipc/topsrv.c:725
 ops_init+0x359/0x610 net/core/net_namespace.c:139
 __register_pernet_operations net/core/net_namespace.c:1252 [inline]
 register_pernet_operations+0x2cb/0x660 net/core/net_namespace.c:1325
 register_pernet_device+0x33/0x80 net/core/net_namespace.c:1412
 tipc_init+0x8d/0x190 net/tipc/core.c:168
 do_one_initcall+0x248/0x880 init/main.c:1267
 do_initcall_level+0x157/0x210 init/main.c:1329
page_owner free stack trace missing

Memory state around the buggy address:
 ffff88802aaf5700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88802aaf5780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88802aaf5800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                      ^
 ffff88802aaf5880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88802aaf5900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete
  2024-06-25 13:25 [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete syzbot
@ 2024-06-25 14:53 ` James Chapman
  2024-07-02  9:52 ` syzbot
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: James Chapman @ 2024-06-25 14:53 UTC (permalink / raw)
  To: syzbot, davem, edumazet, kuba, linux-kernel, netdev, pabeni,
	syzkaller-bugs

On 25/06/2024 14:25, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    185d72112b95 net: xilinx: axienet: Enable multicast by def..
> git tree:       net-next
> console+strace: https://syzkaller.appspot.com/x/log.txt?x=129911fa980000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=e78fc116033e0ab7
> dashboard link: https://syzkaller.appspot.com/bug?extid=c041b4ce3a6dfd1e63e2
> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12521b9c980000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1062bd46980000
> 
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/e84f50e44254/disk-185d7211.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/df64b575cc01/vmlinux-185d7211.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/16ad5d1d433b/bzImage-185d7211.xz
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+c041b4ce3a6dfd1e63e2@syzkaller.appspotmail.com
> 
> ==================================================================
> BUG: KASAN: slab-use-after-free in instrument_atomic_read_write include/linux/instrumented.h:96 [inline]
> BUG: KASAN: slab-use-after-free in test_and_set_bit include/asm-generic/bitops/instrumented-atomic.h:71 [inline]
> BUG: KASAN: slab-use-after-free in l2tp_session_delete+0x28/0x9e0 net/l2tp/l2tp_core.c:1639
> Write of size 8 at addr ffff88802aaf5808 by task kworker/u8:11/2523

I'll look at this.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete
  2024-06-25 13:25 [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete syzbot
  2024-06-25 14:53 ` James Chapman
@ 2024-07-02  9:52 ` syzbot
  2024-07-03  9:46 ` Tom Parkin
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: syzbot @ 2024-07-02  9:52 UTC (permalink / raw)
  To: davem, eadavis, edumazet, hdanton, jchapman, kuba, linux-kernel,
	netdev, pabeni, samuel.thibault, syzkaller-bugs, tparkin

syzbot has bisected this issue to:

commit d18d3f0a24fc4a3513495892ab1a3753628b341b
Author: James Chapman <jchapman@katalix.com>
Date:   Thu Jun 20 11:22:44 2024 +0000

    l2tp: replace hlist with simple list for per-tunnel session list

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=15f2a512980000
start commit:   185d72112b95 net: xilinx: axienet: Enable multicast by def..
git tree:       net-next
final oops:     https://syzkaller.appspot.com/x/report.txt?x=17f2a512980000
console output: https://syzkaller.appspot.com/x/log.txt?x=13f2a512980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=e78fc116033e0ab7
dashboard link: https://syzkaller.appspot.com/bug?extid=c041b4ce3a6dfd1e63e2
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12521b9c980000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1062bd46980000

Reported-by: syzbot+c041b4ce3a6dfd1e63e2@syzkaller.appspotmail.com
Fixes: d18d3f0a24fc ("l2tp: replace hlist with simple list for per-tunnel session list")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete
  2024-06-25 13:25 [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete syzbot
  2024-06-25 14:53 ` James Chapman
  2024-07-02  9:52 ` syzbot
@ 2024-07-03  9:46 ` Tom Parkin
  2024-07-03 10:05   ` syzbot
  2024-07-03 11:24 ` Tom Parkin
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Tom Parkin @ 2024-07-03  9:46 UTC (permalink / raw)
  To: syzbot; +Cc: linux-kernel, netdev, syzkaller-bugs

[-- Attachment #1: Type: text/plain, Size: 1186 bytes --]

On  Tue, Jun 25, 2024 at 06:25:23 -0700, syzbot wrote:
> syzbot found the following issue on:
> 
> HEAD commit:    185d72112b95 net: xilinx: axienet: Enable multicast by def..
> git tree:       net-next
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1062bd46980000

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git  185d72112b95

--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1290,13 +1290,14 @@ static void l2tp_session_unhash(struct l2tp_session *session)
 static void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel)
 {
    struct l2tp_session *session;
-   struct list_head *pos;
-   struct list_head *tmp;

    spin_lock_bh(&tunnel->list_lock);
    tunnel->acpt_newsess = false;
-   list_for_each_safe(pos, tmp, &tunnel->session_list) {
-       session = list_entry(pos, struct l2tp_session, list);
+   for (;;) {
+       session = list_first_entry_or_null(&tunnel->session_list,
+                          struct l2tp_session, list);
+       if (!session)
+           break;
        list_del_init(&session->list);
        spin_unlock_bh(&tunnel->list_lock);
        l2tp_session_delete(session);

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete
  2024-07-03  9:46 ` Tom Parkin
@ 2024-07-03 10:05   ` syzbot
  0 siblings, 0 replies; 15+ messages in thread
From: syzbot @ 2024-07-03 10:05 UTC (permalink / raw)
  To: linux-kernel, netdev, syzkaller-bugs, tparkin

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

failed to apply patch:
checking file net/l2tp/l2tp_core.c
patch: **** unexpected end of file in patch



Tested on:

commit:         185d7211 net: xilinx: axienet: Enable multicast by def..
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
kernel config:  https://syzkaller.appspot.com/x/.config?x=e78fc116033e0ab7
dashboard link: https://syzkaller.appspot.com/bug?extid=c041b4ce3a6dfd1e63e2
compiler:       
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1788ff1e980000


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete
  2024-06-25 13:25 [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete syzbot
                   ` (2 preceding siblings ...)
  2024-07-03  9:46 ` Tom Parkin
@ 2024-07-03 11:24 ` Tom Parkin
  2024-07-03 11:30   ` syzbot
  2024-07-03 11:51   ` Hillf Danton
  2024-07-03 12:07 ` Tom Parkin
  2024-07-03 16:37 ` Tom Parkin
  5 siblings, 2 replies; 15+ messages in thread
From: Tom Parkin @ 2024-07-03 11:24 UTC (permalink / raw)
  To: syzbot; +Cc: linux-kernel, netdev, syzkaller-bugs


[-- Attachment #1.1: Type: text/plain, Size: 379 bytes --]

On  Tue, Jun 25, 2024 at 06:25:23 -0700, syzbot wrote:
> syzbot found the following issue on:
> 
> HEAD commit:    185d72112b95 net: xilinx: axienet: Enable multicast by def..
> git tree:       net-next
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1062bd46980000

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git  185d72112b95

[-- Attachment #1.2: 0001-l2tp-fix-possible-UAF-when-cleaning-up-tunnels.patch --]
[-- Type: text/x-diff, Size: 3275 bytes --]

From 31321b7742266c4e58355076c19d8d490fa005d2 Mon Sep 17 00:00:00 2001
From: James Chapman <jchapman@katalix.com>
Date: Tue, 2 Jul 2024 12:49:07 +0100
Subject: [PATCH] l2tp: fix possible UAF when cleaning up tunnels

syzbot reported a UAF caused by a race when the L2TP work queue closes a
tunnel at the same time as a userspace thread closes a session in that
tunnel.

Tunnel cleanup is handled by a work queue which iterates through the
sessions contained within a tunnel, and closes them in turn.

Meanwhile, a userspace thread may arbitrarily close a session via
either netlink command or by closing the pppox socket in the case of
l2tp_ppp.

The race condition may occur when l2tp_tunnel_closeall walks the list
of sessions in the tunnel and deletes each one.  Currently this is
implemented using list_for_each_safe, but because the list spinlock is
dropped in the loop body it's possible for other threads to manipulate
the list during list_for_each_safe's list walk.  This can lead to the
list iterator being corrupted, leading to list_for_each_safe spinning.
One sequence of events which may lead to this is as follows:

 * A tunnel is created, containing two sessions A and B.
 * A thread closes the tunnel, triggering tunnel cleanup via the work
   queue.
 * l2tp_tunnel_closeall runs in the context of the work queue.  It
   removes session A from the tunnel session list, then drops the list
   lock.  At this point the list_for_each_safe temporary variable is
   pointing to the other session on the list, which is session B, and
   the list can be manipulated by other threads since the list lock has
   been released.
 * Userspace closes session B, which removes the session from its parent
   tunnel via l2tp_session_delete.  Since l2tp_tunnel_closeall has
   released the tunnel list lock, l2tp_session_delete is able to call
   list_del_init on the session B list node.
 * Back on the work queue, l2tp_tunnel_closeall resumes execution and
   will now spin forever on the same list entry until the underlying
   session structure is freed, at which point UAF occurs.

The solution is to iterate over the tunnel's session list using
list_first_entry_not_null to avoid the possibility of the list
iterator pointing at a list item which may be removed during the walk.

---
 net/l2tp/l2tp_core.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 64f446f0930b..afa180b7b428 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1290,13 +1290,14 @@ static void l2tp_session_unhash(struct l2tp_session *session)
 static void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel)
 {
 	struct l2tp_session *session;
-	struct list_head *pos;
-	struct list_head *tmp;
 
 	spin_lock_bh(&tunnel->list_lock);
 	tunnel->acpt_newsess = false;
-	list_for_each_safe(pos, tmp, &tunnel->session_list) {
-		session = list_entry(pos, struct l2tp_session, list);
+	for (;;) {
+		session = list_first_entry_or_null(&tunnel->session_list,
+						   struct l2tp_session, list);
+		if (!session)
+			break;
 		list_del_init(&session->list);
 		spin_unlock_bh(&tunnel->list_lock);
 		l2tp_session_delete(session);
-- 
2.34.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete
  2024-07-03 11:24 ` Tom Parkin
@ 2024-07-03 11:30   ` syzbot
  2024-07-03 11:51   ` Hillf Danton
  1 sibling, 0 replies; 15+ messages in thread
From: syzbot @ 2024-07-03 11:30 UTC (permalink / raw)
  To: linux-kernel, netdev, syzkaller-bugs, tparkin

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

failed to apply patch:
checking file net/l2tp/l2tp_core.c
Hunk #1 FAILED at 1290.
1 out of 1 hunk FAILED



Tested on:

commit:         185d7211 net: xilinx: axienet: Enable multicast by def..
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
kernel config:  https://syzkaller.appspot.com/x/.config?x=e78fc116033e0ab7
dashboard link: https://syzkaller.appspot.com/bug?extid=c041b4ce3a6dfd1e63e2
compiler:       
patch:          https://syzkaller.appspot.com/x/patch.diff?x=177064e1980000


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete
  2024-07-03 11:24 ` Tom Parkin
  2024-07-03 11:30   ` syzbot
@ 2024-07-03 11:51   ` Hillf Danton
  2024-07-03 12:39     ` Tom Parkin
  1 sibling, 1 reply; 15+ messages in thread
From: Hillf Danton @ 2024-07-03 11:51 UTC (permalink / raw)
  To: Tom Parkin; +Cc: syzbot, linux-kernel, netdev, James Chapman, syzkaller-bugs

On Wed, 3 Jul 2024 12:24:49 +0100 Tom Parkin <tparkin@katalix.com>
> 
> [-- Attachment #1.1: Type: text/plain, Size: 379 bytes --]
> 
> On  Tue, Jun 25, 2024 at 06:25:23 -0700, syzbot wrote:
> > syzbot found the following issue on:
> > 
> > HEAD commit:    185d72112b95 net: xilinx: axienet: Enable multicast by def..
> > git tree:       net-next
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1062bd46980000
> 
> #syz test https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git  185d72112b95
> 
> [-- Attachment #1.2: 0001-l2tp-fix-possible-UAF-when-cleaning-up-tunnels.patch --]
> [-- Type: text/x-diff, Size: 3275 bytes --]
> 
> From 31321b7742266c4e58355076c19d8d490fa005d2 Mon Sep 17 00:00:00 2001
> From: James Chapman <jchapman@katalix.com>
> Date: Tue, 2 Jul 2024 12:49:07 +0100
> Subject: [PATCH] l2tp: fix possible UAF when cleaning up tunnels
> 
> syzbot reported a UAF caused by a race when the L2TP work queue closes a
> tunnel at the same time as a userspace thread closes a session in that
> tunnel.
> 
> Tunnel cleanup is handled by a work queue which iterates through the
> sessions contained within a tunnel, and closes them in turn.
> 
> Meanwhile, a userspace thread may arbitrarily close a session via
> either netlink command or by closing the pppox socket in the case of
> l2tp_ppp.
> 
> The race condition may occur when l2tp_tunnel_closeall walks the list
> of sessions in the tunnel and deletes each one.  Currently this is
> implemented using list_for_each_safe, but because the list spinlock is
> dropped in the loop body it's possible for other threads to manipulate
> the list during list_for_each_safe's list walk.  This can lead to the
> list iterator being corrupted, leading to list_for_each_safe spinning.
> One sequence of events which may lead to this is as follows:
> 
>  * A tunnel is created, containing two sessions A and B.
>  * A thread closes the tunnel, triggering tunnel cleanup via the work
>    queue.
>  * l2tp_tunnel_closeall runs in the context of the work queue.  It
>    removes session A from the tunnel session list, then drops the list
>    lock.  At this point the list_for_each_safe temporary variable is
>    pointing to the other session on the list, which is session B, and
>    the list can be manipulated by other threads since the list lock has
>    been released.
>  * Userspace closes session B, which removes the session from its parent
>    tunnel via l2tp_session_delete.  Since l2tp_tunnel_closeall has
>    released the tunnel list lock, l2tp_session_delete is able to call
>    list_del_init on the session B list node.
>  * Back on the work queue, l2tp_tunnel_closeall resumes execution and
>    will now spin forever on the same list entry until the underlying
>    session structure is freed, at which point UAF occurs.
> 
> The solution is to iterate over the tunnel's session list using
> list_first_entry_not_null to avoid the possibility of the list
> iterator pointing at a list item which may be removed during the walk.
> 
> ---
>  net/l2tp/l2tp_core.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 64f446f0930b..afa180b7b428 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1290,13 +1290,14 @@ static void l2tp_session_unhash(struct l2tp_session *session)
>  static void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel)
>  {
>  	struct l2tp_session *session;
> -	struct list_head *pos;
> -	struct list_head *tmp;
>  
>  	spin_lock_bh(&tunnel->list_lock);
>  	tunnel->acpt_newsess = false;
> -	list_for_each_safe(pos, tmp, &tunnel->session_list) {
> -		session = list_entry(pos, struct l2tp_session, list);
> +	for (;;) {
> +		session = list_first_entry_or_null(&tunnel->session_list,
> +						   struct l2tp_session, list);
> +		if (!session)
> +			break;

WTF difference could this patch make wrt closing the race above?

>  		list_del_init(&session->list);
>  		spin_unlock_bh(&tunnel->list_lock);
>  		l2tp_session_delete(session);

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete
  2024-06-25 13:25 [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete syzbot
                   ` (3 preceding siblings ...)
  2024-07-03 11:24 ` Tom Parkin
@ 2024-07-03 12:07 ` Tom Parkin
  2024-07-03 13:56   ` syzbot
  2024-07-03 16:37 ` Tom Parkin
  5 siblings, 1 reply; 15+ messages in thread
From: Tom Parkin @ 2024-07-03 12:07 UTC (permalink / raw)
  To: syzbot; +Cc: linux-kernel, netdev, syzkaller-bugs


[-- Attachment #1.1: Type: text/plain, Size: 382 bytes --]

On  Tue, Jun 25, 2024 at 06:25:23 -0700, syzbot wrote:
> syzbot found the following issue on:
> 
> HEAD commit:    185d72112b95 net: xilinx: axienet: Enable multicast by def..
> git tree:       net-next
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1062bd46980000

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git  185d72112b95
 

[-- Attachment #1.2: 0001-l2tp-fix-possible-UAF-when-cleaning-up-tunnels.patch --]
[-- Type: text/x-diff, Size: 3240 bytes --]

From 15d6d4c290810d5b8f357b9e494c5ad420c8a2fb Mon Sep 17 00:00:00 2001
From: Tom Parkin <tparkin@katalix.com>
Date: Wed, 3 Jul 2024 13:02:51 +0100
Subject: [PATCH] l2tp: fix possible UAF when cleaning up tunnels

syzbot reported a UAF caused by a race when the L2TP work queue closes a
tunnel at the same time as a userspace thread closes a session in that
tunnel.

Tunnel cleanup is handled by a work queue which iterates through the
sessions contained within a tunnel, and closes them in turn.

Meanwhile, a userspace thread may arbitrarily close a session via
either netlink command or by closing the pppox socket in the case of
l2tp_ppp.

The race condition may occur when l2tp_tunnel_closeall walks the list
of sessions in the tunnel and deletes each one.  Currently this is
implemented using list_for_each_safe, but because the list spinlock is
dropped in the loop body it's possible for other threads to manipulate
the list during list_for_each_safe's list walk.  This can lead to the
list iterator being corrupted, leading to list_for_each_safe spinning.
One sequence of events which may lead to this is as follows:

 * A tunnel is created, containing two sessions A and B.
 * A thread closes the tunnel, triggering tunnel cleanup via the work
   queue.
 * l2tp_tunnel_closeall runs in the context of the work queue.  It
   removes session A from the tunnel session list, then drops the list
   lock.  At this point the list_for_each_safe temporary variable is
   pointing to the other session on the list, which is session B, and
   the list can be manipulated by other threads since the list lock has
   been released.
 * Userspace closes session B, which removes the session from its parent
   tunnel via l2tp_session_delete.  Since l2tp_tunnel_closeall has
   released the tunnel list lock, l2tp_session_delete is able to call
   list_del_init on the session B list node.
 * Back on the work queue, l2tp_tunnel_closeall resumes execution and
   will now spin forever on the same list entry until the underlying
   session structure is freed, at which point UAF occurs.

The solution is to iterate over the tunnel's session list using
list_first_entry_not_null to avoid the possibility of the list
iterator pointing at a list item which may be removed during the walk.
---
 net/l2tp/l2tp_core.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index be4bcbf291a1..fa75a8eb8782 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1290,12 +1290,14 @@ static void l2tp_session_unhash(struct l2tp_session *session)
 static void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel)
 {
 	struct l2tp_session *session;
-	struct list_head __rcu *pos;
-	struct list_head *tmp;
 
 	spin_lock_bh(&tunnel->list_lock);
 	tunnel->acpt_newsess = false;
-	list_for_each_safe(pos, tmp, &tunnel->session_list) {
+	for (;;) {
+		session = list_first_entry_or_null(&tunnel->session_list,
+						   struct l2tp_session, list);
+		if (!session)
+			break;
 		session = list_entry(pos, struct l2tp_session, list);
 		list_del_init(&session->list);
 		spin_unlock_bh(&tunnel->list_lock);
-- 
2.34.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete
  2024-07-03 11:51   ` Hillf Danton
@ 2024-07-03 12:39     ` Tom Parkin
  2024-07-04 11:23       ` Hillf Danton
  0 siblings, 1 reply; 15+ messages in thread
From: Tom Parkin @ 2024-07-03 12:39 UTC (permalink / raw)
  To: Hillf Danton; +Cc: syzbot, linux-kernel, netdev, James Chapman, syzkaller-bugs

[-- Attachment #1: Type: text/plain, Size: 5728 bytes --]

Hi Hillf,

On  Wed, Jul 03, 2024 at 19:51:13 +0800, Hillf Danton wrote:
> On Wed, 3 Jul 2024 12:24:49 +0100 Tom Parkin <tparkin@katalix.com>
> > 
> > [-- Attachment #1.1: Type: text/plain, Size: 379 bytes --]
> > 
> > On  Tue, Jun 25, 2024 at 06:25:23 -0700, syzbot wrote:
> > > syzbot found the following issue on:
> > > 
> > > HEAD commit:    185d72112b95 net: xilinx: axienet: Enable multicast by def..
> > > git tree:       net-next
> > > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1062bd46980000
> > 
> > #syz test https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git  185d72112b95
> > 
> > [-- Attachment #1.2: 0001-l2tp-fix-possible-UAF-when-cleaning-up-tunnels.patch --]
> > [-- Type: text/x-diff, Size: 3275 bytes --]
> > 
> > From 31321b7742266c4e58355076c19d8d490fa005d2 Mon Sep 17 00:00:00 2001
> > From: James Chapman <jchapman@katalix.com>
> > Date: Tue, 2 Jul 2024 12:49:07 +0100
> > Subject: [PATCH] l2tp: fix possible UAF when cleaning up tunnels
> > 
> > syzbot reported a UAF caused by a race when the L2TP work queue closes a
> > tunnel at the same time as a userspace thread closes a session in that
> > tunnel.
> > 
> > Tunnel cleanup is handled by a work queue which iterates through the
> > sessions contained within a tunnel, and closes them in turn.
> > 
> > Meanwhile, a userspace thread may arbitrarily close a session via
> > either netlink command or by closing the pppox socket in the case of
> > l2tp_ppp.
> > 
> > The race condition may occur when l2tp_tunnel_closeall walks the list
> > of sessions in the tunnel and deletes each one.  Currently this is
> > implemented using list_for_each_safe, but because the list spinlock is
> > dropped in the loop body it's possible for other threads to manipulate
> > the list during list_for_each_safe's list walk.  This can lead to the
> > list iterator being corrupted, leading to list_for_each_safe spinning.
> > One sequence of events which may lead to this is as follows:
> > 
> >  * A tunnel is created, containing two sessions A and B.
> >  * A thread closes the tunnel, triggering tunnel cleanup via the work
> >    queue.
> >  * l2tp_tunnel_closeall runs in the context of the work queue.  It
> >    removes session A from the tunnel session list, then drops the list
> >    lock.  At this point the list_for_each_safe temporary variable is
> >    pointing to the other session on the list, which is session B, and
> >    the list can be manipulated by other threads since the list lock has
> >    been released.
> >  * Userspace closes session B, which removes the session from its parent
> >    tunnel via l2tp_session_delete.  Since l2tp_tunnel_closeall has
> >    released the tunnel list lock, l2tp_session_delete is able to call
> >    list_del_init on the session B list node.
> >  * Back on the work queue, l2tp_tunnel_closeall resumes execution and
> >    will now spin forever on the same list entry until the underlying
> >    session structure is freed, at which point UAF occurs.
> > 
> > The solution is to iterate over the tunnel's session list using
> > list_first_entry_not_null to avoid the possibility of the list
> > iterator pointing at a list item which may be removed during the walk.
> > 
> > ---
> >  net/l2tp/l2tp_core.c | 9 +++++----
> >  1 file changed, 5 insertions(+), 4 deletions(-)
> > 
> > diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> > index 64f446f0930b..afa180b7b428 100644
> > --- a/net/l2tp/l2tp_core.c
> > +++ b/net/l2tp/l2tp_core.c
> > @@ -1290,13 +1290,14 @@ static void l2tp_session_unhash(struct l2tp_session *session)
> >  static void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel)
> >  {
> >  	struct l2tp_session *session;
> > -	struct list_head *pos;
> > -	struct list_head *tmp;
> >  
> >  	spin_lock_bh(&tunnel->list_lock);
> >  	tunnel->acpt_newsess = false;
> > -	list_for_each_safe(pos, tmp, &tunnel->session_list) {
> > -		session = list_entry(pos, struct l2tp_session, list);
> > +	for (;;) {
> > +		session = list_first_entry_or_null(&tunnel->session_list,
> > +						   struct l2tp_session, list);
> > +		if (!session)
> > +			break;
> 
> WTF difference could this patch make wrt closing the race above?
>

Sorry if the commit message isn't clear :-(

The specific UAF that syzbot hits is due to the list node that the
list_for_each_safe temporary variable points to being modified while
the list_for_each_safe walk is in process.

This is possible due to l2tp_tunnel_closeall dropping the spin lock
that protects the list mid way through the list_for_each_safe loop.
This opens the door for another thread to call list_del_init on the
node that list_for_each_safe is planning to process next, causing
l2tp_tunnel_closeall to repeatedly iterate on that node forever.

In the context of l2tp_ppp, this eventually leads to UAF because the
session structure itself is freed when the pppol2tp socket is
destroyed and the pppol2tp sk_destruct handler unrefs the session
structure to zero.

So to avoid the UAF, the list can safely be processed using a loop
which accesses the first entry in the tunnel session list under
spin lock protection, removing that entry, then dropping the lock
to call l2tp_session_delete.

FWIW, I note that syzbot has bisected this UAF to d18d3f0a24fc ("l2tp:
replace hlist with simple list for per-tunnel session list") which
changes l2tp_tunnel_closeall from a similar pattern of repeatedly
grabbing the list head under spin lock projection, to the current code
in net-next.

> >  		list_del_init(&session->list);
> >  		spin_unlock_bh(&tunnel->list_lock);
> >  		l2tp_session_delete(session);

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete
  2024-07-03 12:07 ` Tom Parkin
@ 2024-07-03 13:56   ` syzbot
  0 siblings, 0 replies; 15+ messages in thread
From: syzbot @ 2024-07-03 13:56 UTC (permalink / raw)
  To: linux-kernel, netdev, syzkaller-bugs, tparkin

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

net/l2tp/l2tp_core.c:1301:24: error: use of undeclared identifier 'pos'


Tested on:

commit:         185d7211 net: xilinx: axienet: Enable multicast by def..
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
kernel config:  https://syzkaller.appspot.com/x/.config?x=e78fc116033e0ab7
dashboard link: https://syzkaller.appspot.com/bug?extid=c041b4ce3a6dfd1e63e2
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=16c5a181980000


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete
  2024-06-25 13:25 [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete syzbot
                   ` (4 preceding siblings ...)
  2024-07-03 12:07 ` Tom Parkin
@ 2024-07-03 16:37 ` Tom Parkin
  2024-07-03 17:23   ` syzbot
  5 siblings, 1 reply; 15+ messages in thread
From: Tom Parkin @ 2024-07-03 16:37 UTC (permalink / raw)
  To: syzbot; +Cc: linux-kernel, netdev, syzkaller-bugs


[-- Attachment #1.1: Type: text/plain, Size: 379 bytes --]

On  Tue, Jun 25, 2024 at 06:25:23 -0700, syzbot wrote:
> syzbot found the following issue on:
> 
> HEAD commit:    185d72112b95 net: xilinx: axienet: Enable multicast by def..
> git tree:       net-next
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1062bd46980000

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git  185d72112b95

[-- Attachment #1.2: 0001-l2tp-fix-possible-UAF-when-cleaning-up-tunnels.patch --]
[-- Type: text/x-diff, Size: 3275 bytes --]

From 6bc347af14eba3b37f9e8f0831a53ef1aae4c203 Mon Sep 17 00:00:00 2001
From: Tom Parkin <tparkin@katalix.com>
Date: Wed, 3 Jul 2024 17:32:00 +0100
Subject: [PATCH] l2tp: fix possible UAF when cleaning up tunnels

syzbot reported a UAF caused by a race when the L2TP work queue closes a
tunnel at the same time as a userspace thread closes a session in that
tunnel.

Tunnel cleanup is handled by a work queue which iterates through the
sessions contained within a tunnel, and closes them in turn.

Meanwhile, a userspace thread may arbitrarily close a session via
either netlink command or by closing the pppox socket in the case of
l2tp_ppp.

The race condition may occur when l2tp_tunnel_closeall walks the list
of sessions in the tunnel and deletes each one.  Currently this is
implemented using list_for_each_safe, but because the list spinlock is
dropped in the loop body it's possible for other threads to manipulate
the list during list_for_each_safe's list walk.  This can lead to the
list iterator being corrupted, leading to list_for_each_safe spinning.
One sequence of events which may lead to this is as follows:

 * A tunnel is created, containing two sessions A and B.
 * A thread closes the tunnel, triggering tunnel cleanup via the work
   queue.
 * l2tp_tunnel_closeall runs in the context of the work queue.  It
   removes session A from the tunnel session list, then drops the list
   lock.  At this point the list_for_each_safe temporary variable is
   pointing to the other session on the list, which is session B, and
   the list can be manipulated by other threads since the list lock has
   been released.
 * Userspace closes session B, which removes the session from its parent
   tunnel via l2tp_session_delete.  Since l2tp_tunnel_closeall has
   released the tunnel list lock, l2tp_session_delete is able to call
   list_del_init on the session B list node.
 * Back on the work queue, l2tp_tunnel_closeall resumes execution and
   will now spin forever on the same list entry until the underlying
   session structure is freed, at which point UAF occurs.

The solution is to iterate over the tunnel's session list using
list_first_entry_not_null to avoid the possibility of the list
iterator pointing at a list item which may be removed during the walk.
---
 net/l2tp/l2tp_core.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index be4bcbf291a1..afa180b7b428 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1290,13 +1290,14 @@ static void l2tp_session_unhash(struct l2tp_session *session)
 static void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel)
 {
 	struct l2tp_session *session;
-	struct list_head __rcu *pos;
-	struct list_head *tmp;
 
 	spin_lock_bh(&tunnel->list_lock);
 	tunnel->acpt_newsess = false;
-	list_for_each_safe(pos, tmp, &tunnel->session_list) {
-		session = list_entry(pos, struct l2tp_session, list);
+	for (;;) {
+		session = list_first_entry_or_null(&tunnel->session_list,
+						   struct l2tp_session, list);
+		if (!session)
+			break;
 		list_del_init(&session->list);
 		spin_unlock_bh(&tunnel->list_lock);
 		l2tp_session_delete(session);
-- 
2.34.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete
  2024-07-03 16:37 ` Tom Parkin
@ 2024-07-03 17:23   ` syzbot
  0 siblings, 0 replies; 15+ messages in thread
From: syzbot @ 2024-07-03 17:23 UTC (permalink / raw)
  To: linux-kernel, netdev, syzkaller-bugs, tparkin

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+c041b4ce3a6dfd1e63e2@syzkaller.appspotmail.com

Tested on:

commit:         185d7211 net: xilinx: axienet: Enable multicast by def..
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=10eb109e980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=e78fc116033e0ab7
dashboard link: https://syzkaller.appspot.com/bug?extid=c041b4ce3a6dfd1e63e2
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=101379c1980000

Note: testing is done by a robot and is best-effort only.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete
  2024-07-03 12:39     ` Tom Parkin
@ 2024-07-04 11:23       ` Hillf Danton
  2024-07-04 13:59         ` Tom Parkin
  0 siblings, 1 reply; 15+ messages in thread
From: Hillf Danton @ 2024-07-04 11:23 UTC (permalink / raw)
  To: Tom Parkin; +Cc: syzbot, linux-kernel, netdev, James Chapman, syzkaller-bugs

On Wed, 3 Jul 2024 13:39:25 +0100 Tom Parkin <tparkin@katalix.com>
> 
> The specific UAF that syzbot hits is due to the list node that the
> list_for_each_safe temporary variable points to being modified while
> the list_for_each_safe walk is in process.
> 
> This is possible due to l2tp_tunnel_closeall dropping the spin lock
> that protects the list mid way through the list_for_each_safe loop.
> This opens the door for another thread to call list_del_init on the
> node that list_for_each_safe is planning to process next, causing
> l2tp_tunnel_closeall to repeatedly iterate on that node forever.
> 
Yeah the next node could race with other thread because of the door.

> In the context of l2tp_ppp, this eventually leads to UAF because the
> session structure itself is freed when the pppol2tp socket is
> destroyed and the pppol2tp sk_destruct handler unrefs the session
> structure to zero.
> 
> So to avoid the UAF, the list can safely be processed using a loop
> which accesses the first entry in the tunnel session list under
> spin lock protection, removing that entry, then dropping the lock
> to call l2tp_session_delete.

Race exists after your patch.

	cpu1				cpu2
	---				---
					pppol2tp_release()

	spin_lock_bh(&tunnel->list_lock);
	while (!list_empty(&tunnel->session_list)) {
		session = list_first_entry(&tunnel->session_list,
					struct l2tp_session, list);
 		list_del_init(&session->list);
 		spin_unlock_bh(&tunnel->list_lock);

 					l2tp_session_delete(session);

 		l2tp_session_delete(session);
 		spin_lock_bh(&tunnel->list_lock);
 	}
 	spin_unlock_bh(&tunnel->list_lock);

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete
  2024-07-04 11:23       ` Hillf Danton
@ 2024-07-04 13:59         ` Tom Parkin
  0 siblings, 0 replies; 15+ messages in thread
From: Tom Parkin @ 2024-07-04 13:59 UTC (permalink / raw)
  To: Hillf Danton; +Cc: syzbot, linux-kernel, netdev, James Chapman, syzkaller-bugs

[-- Attachment #1: Type: text/plain, Size: 2379 bytes --]

On  Thu, Jul 04, 2024 at 19:23:03 +0800, Hillf Danton wrote:
> On Wed, 3 Jul 2024 13:39:25 +0100 Tom Parkin <tparkin@katalix.com>
> > 
> > The specific UAF that syzbot hits is due to the list node that the
> > list_for_each_safe temporary variable points to being modified while
> > the list_for_each_safe walk is in process.
> > 
> > This is possible due to l2tp_tunnel_closeall dropping the spin lock
> > that protects the list mid way through the list_for_each_safe loop.
> > This opens the door for another thread to call list_del_init on the
> > node that list_for_each_safe is planning to process next, causing
> > l2tp_tunnel_closeall to repeatedly iterate on that node forever.
> > 
> Yeah the next node could race with other thread because of the door.

Exactly, yes.

> > In the context of l2tp_ppp, this eventually leads to UAF because the
> > session structure itself is freed when the pppol2tp socket is
> > destroyed and the pppol2tp sk_destruct handler unrefs the session
> > structure to zero.
> > 
> > So to avoid the UAF, the list can safely be processed using a loop
> > which accesses the first entry in the tunnel session list under
> > spin lock protection, removing that entry, then dropping the lock
> > to call l2tp_session_delete.
> 
> Race exists after your patch.
> 
> 	cpu1				cpu2
> 	---				---
> 					pppol2tp_release()
> 
> 	spin_lock_bh(&tunnel->list_lock);
> 	while (!list_empty(&tunnel->session_list)) {
> 		session = list_first_entry(&tunnel->session_list,
> 					struct l2tp_session, list);
>  		list_del_init(&session->list);
>  		spin_unlock_bh(&tunnel->list_lock);
> 
>  					l2tp_session_delete(session);
> 
>  		l2tp_session_delete(session);
>  		spin_lock_bh(&tunnel->list_lock);
>  	}
>  	spin_unlock_bh(&tunnel->list_lock);

I take your point.  Calling l2tp_session_delete() on the same session
twice isn't a problem per-se, but if cpu2 manages to destruct the
socket and unref the session to zero before cpu1 progresses then we
have the same sort of problem.

To be honest, cleanup generally could use some TLC (the dancing around
the list lock in _closeall is not ideal), but a minimal patch to
address the UAF makes sense in the short term IMO -- so to that end
holding a session reference around l2tp_session_delete in _closeall is
probably the least invasive thing to do.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2024-07-04 14:00 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-25 13:25 [syzbot] [net?] KASAN: slab-use-after-free Write in l2tp_session_delete syzbot
2024-06-25 14:53 ` James Chapman
2024-07-02  9:52 ` syzbot
2024-07-03  9:46 ` Tom Parkin
2024-07-03 10:05   ` syzbot
2024-07-03 11:24 ` Tom Parkin
2024-07-03 11:30   ` syzbot
2024-07-03 11:51   ` Hillf Danton
2024-07-03 12:39     ` Tom Parkin
2024-07-04 11:23       ` Hillf Danton
2024-07-04 13:59         ` Tom Parkin
2024-07-03 12:07 ` Tom Parkin
2024-07-03 13:56   ` syzbot
2024-07-03 16:37 ` Tom Parkin
2024-07-03 17:23   ` syzbot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).