Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback

From: Frederic Weisbecker @ 2026-02-23 13:40 UTC
To: syzbot, Mickaël Salaün, Günther Noack, Paul Moore, James Morris,
  Serge E. Hallyn, linux-security-module
Cc: anna-maria, linux-kernel, syzkaller-bugs, tglx

On Fri, Feb 20, 2026 at 03:11:21AM -0800, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 635c467cc14e Add linux-next specific files for 20260213
> git tree: linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1452f6e6580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=61690c38d1398936
> dashboard link: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
> compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16e41c02580000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15813652580000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/78b3d15ca8e6/disk-635c467c.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/a95f3d108ef4/vmlinux-635c467c.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/e58086838b24/bzImage-635c467c.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+7ea2f5e9dfd468201817@syzkaller.appspotmail.com
>
> INFO: task syz.0.2812:14643 blocked for more than 143 seconds.
> Not tainted syzkaller #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:syz.0.2812 state:D stack:25600 pid:14643 tgid:14643 ppid:13375 task_flags:0x400040 flags:0x00080002
> Call Trace:
> <TASK>
> context_switch kernel/sched/core.c:5295 [inline]
> __schedule+0x1585/0x5340 kernel/sched/core.c:6907
> __schedule_loop kernel/sched/core.c:6989 [inline]
> schedule+0x164/0x360 kernel/sched/core.c:7004
> schedule_timeout+0xc3/0x2c0 kernel/time/sleep_timeout.c:75
> do_wait_for_common kernel/sched/completion.c:100 [inline]
> __wait_for_common kernel/sched/completion.c:121 [inline]
> wait_for_common kernel/sched/completion.c:132 [inline]
> wait_for_completion+0x2cc/0x5e0 kernel/sched/completion.c:153
> restrict_one_thread security/landlock/tsync.c:128 [inline]
> restrict_one_thread_callback+0x320/0x570 security/landlock/tsync.c:162

Seems to be related to landlock security module.
Cc'ing maintainers for awareness.

Thanks.

> task_work_run+0x1d9/0x270 kernel/task_work.c:233
> get_signal+0x11eb/0x1330 kernel/signal.c:2807
> arch_do_signal_or_restart+0xbc/0x830 arch/x86/kernel/signal.c:337
> __exit_to_user_mode_loop kernel/entry/common.c:64 [inline]
> exit_to_user_mode_loop+0x86/0x480 kernel/entry/common.c:98
> __exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
> syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
> syscall_exit_to_user_mode include/linux/entry-common.h:325 [inline]
> do_syscall_64+0x32d/0xf80 arch/x86/entry/syscall_64.c:100
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f8d7f19bf79
> RSP: 002b:00007ffe0b192a38 EFLAGS: 00000246 ORIG_RAX: 00000000000000db
> RAX: fffffffffffffdfc RBX: 00000000000389f1 RCX: 00007f8d7f19bf79
> RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f8d7f41618c
> RBP: 0000000000000032 R08: 3fffffffffffffff R09: 0000000000000000
> R10: 00007ffe0b192b40 R11: 0000000000000246 R12: 00007ffe0b192b60
> R13: 00007f8d7f41618c R14: 0000000000038a23 R15: 00007ffe0b192b40
> </TASK>
> INFO: task syz.0.2812:14644 blocked for more than 143 seconds.
> Not tainted syzkaller #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:syz.0.2812 state:D stack:28216 pid:14644 tgid:14643 ppid:13375 task_flags:0x400040 flags:0x00080002
> Call Trace:
> <TASK>
> context_switch kernel/sched/core.c:5295 [inline]
> __schedule+0x1585/0x5340 kernel/sched/core.c:6907
> __schedule_loop kernel/sched/core.c:6989 [inline]
> schedule+0x164/0x360 kernel/sched/core.c:7004
> schedule_timeout+0xc3/0x2c0 kernel/time/sleep_timeout.c:75
> do_wait_for_common kernel/sched/completion.c:100 [inline]
> __wait_for_common kernel/sched/completion.c:121 [inline]
> wait_for_common kernel/sched/completion.c:132 [inline]
> wait_for_completion+0x2cc/0x5e0 kernel/sched/completion.c:153
> restrict_one_thread security/landlock/tsync.c:128 [inline]
> restrict_one_thread_callback+0x320/0x570 security/landlock/tsync.c:162
> task_work_run+0x1d9/0x270 kernel/task_work.c:233
> get_signal+0x11eb/0x1330 kernel/signal.c:2807
> arch_do_signal_or_restart+0xbc/0x830 arch/x86/kernel/signal.c:337
> __exit_to_user_mode_loop kernel/entry/common.c:64 [inline]
> exit_to_user_mode_loop+0x86/0x480 kernel/entry/common.c:98
> __exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
> syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
> syscall_exit_to_user_mode include/linux/entry-common.h:325 [inline]
> do_syscall_64+0x32d/0xf80 arch/x86/entry/syscall_64.c:100
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f8d7f19bf79
> RSP: 002b:00007f8d8007c0e8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
> RAX: fffffffffffffe00 RBX: 00007f8d7f415fa8 RCX: 00007f8d7f19bf79
> RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f8d7f415fa8
> RBP: 00007f8d7f415fa0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 00007f8d7f416038 R14: 00007ffe0b1927f0 R15: 00007ffe0b1928d8
> </TASK>
> INFO: task syz.0.2812:14645 blocked for more than 143 seconds.
> Not tainted syzkaller #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:syz.0.2812 state:D stack:28648 pid:14645 tgid:14643 ppid:13375 task_flags:0x400140 flags:0x00080006
> Call Trace:
> <TASK>
> context_switch kernel/sched/core.c:5295 [inline]
> __schedule+0x1585/0x5340 kernel/sched/core.c:6907
> __schedule_loop kernel/sched/core.c:6989 [inline]
> schedule+0x164/0x360 kernel/sched/core.c:7004
> schedule_timeout+0xc3/0x2c0 kernel/time/sleep_timeout.c:75
> do_wait_for_common kernel/sched/completion.c:100 [inline]
> __wait_for_common kernel/sched/completion.c:121 [inline]
> wait_for_common kernel/sched/completion.c:132 [inline]
> wait_for_completion+0x2cc/0x5e0 kernel/sched/completion.c:153
> landlock_restrict_sibling_threads+0xe9c/0x11f0 security/landlock/tsync.c:539
> __do_sys_landlock_restrict_self security/landlock/syscalls.c:574 [inline]
> __se_sys_landlock_restrict_self+0x540/0x810 security/landlock/syscalls.c:482
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f8d7f19bf79
> RSP: 002b:00007f8d8005b028 EFLAGS: 00000246 ORIG_RAX: 00000000000001be
> RAX: ffffffffffffffda RBX: 00007f8d7f416090 RCX: 00007f8d7f19bf79
> RDX: 0000000000000000 RSI: 000000000000000e RDI: 0000000000000003
> RBP: 00007f8d7f2327e0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 00007f8d7f416128 R14: 00007f8d7f416090 R15: 00007ffe0b1928d8
> </TASK>
> INFO: task syz.0.2812:14646 blocked for more than 144 seconds.
> Not tainted syzkaller #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:syz.0.2812 state:D stack:28832 pid:14646 tgid:14643 ppid:13375 task_flags:0x400140 flags:0x00080006
> Call Trace:
> <TASK>
> context_switch kernel/sched/core.c:5295 [inline]
> __schedule+0x1585/0x5340 kernel/sched/core.c:6907
> __schedule_loop kernel/sched/core.c:6989 [inline]
> schedule+0x164/0x360 kernel/sched/core.c:7004
> schedule_timeout+0xc3/0x2c0 kernel/time/sleep_timeout.c:75
> do_wait_for_common kernel/sched/completion.c:100 [inline]
> __wait_for_common kernel/sched/completion.c:121 [inline]
> wait_for_common kernel/sched/completion.c:132 [inline]
> wait_for_completion+0x2cc/0x5e0 kernel/sched/completion.c:153
> landlock_restrict_sibling_threads+0xe9c/0x11f0 security/landlock/tsync.c:539
> __do_sys_landlock_restrict_self security/landlock/syscalls.c:574 [inline]
> __se_sys_landlock_restrict_self+0x540/0x810 security/landlock/syscalls.c:482
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f8d7f19bf79
> RSP: 002b:00007f8d8003a028 EFLAGS: 00000246 ORIG_RAX: 00000000000001be
> RAX: ffffffffffffffda RBX: 00007f8d7f416180 RCX: 00007f8d7f19bf79
> RDX: 0000000000000000 RSI: 000000000000000e RDI: 0000000000000003
> RBP: 00007f8d7f2327e0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 00007f8d7f416218 R14: 00007f8d7f416180 R15: 00007ffe0b1928d8
> </TASK>
>
> Showing all locks held in the system:
> 1 lock held by khungtaskd/31:
> #0: ffffffff8e9602e0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:312 [inline]
> #0: ffffffff8e9602e0 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:850 [inline]
> #0: ffffffff8e9602e0 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x2e/0x180 kernel/locking/lockdep.c:6775
> 2 locks held by getty/5581:
> #0: ffff8880328890a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:243
> #1: ffffc9000332b2f0 (&ldata->atomic_read_lock){+.+.}-{4:4}, at: n_tty_read+0x45c/0x13c0 drivers/tty/n_tty.c:2211
>
> =============================================
>
> NMI backtrace for cpu 0
> CPU: 0 UID: 0 PID: 31 Comm: khungtaskd Not tainted syzkaller #0 PREEMPT(full)
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026
> Call Trace:
> <TASK>
> dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
> nmi_cpu_backtrace+0x274/0x2d0 lib/nmi_backtrace.c:113
> nmi_trigger_cpumask_backtrace+0x17a/0x300 lib/nmi_backtrace.c:62
> trigger_all_cpu_backtrace include/linux/nmi.h:161 [inline]
> __sys_info lib/sys_info.c:157 [inline]
> sys_info+0x135/0x170 lib/sys_info.c:165
> check_hung_uninterruptible_tasks kernel/hung_task.c:346 [inline]
> watchdog+0xfd9/0x1030 kernel/hung_task.c:515
> kthread+0x388/0x470 kernel/kthread.c:467
> ret_from_fork+0x51e/0xb90 arch/x86/kernel/process.c:158
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> </TASK>
> Sending NMI from CPU 0 to CPUs 1:
> NMI backtrace for cpu 1
> CPU: 1 UID: 0 PID: 86 Comm: kworker/u8:5 Not tainted syzkaller #0 PREEMPT(full)
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026
> Workqueue: events_unbound nsim_dev_trap_report_work
> RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:26 [inline]
> RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:109 [inline]
> RIP: 0010:arch_local_irq_save arch/x86/include/asm/irqflags.h:127 [inline]
> RIP: 0010:lock_acquire+0xab/0x2e0 kernel/locking/lockdep.c:5864
> Code: 84 c1 00 00 00 65 8b 05 73 b8 9f 11 85 c0 0f 85 b2 00 00 00 65 48 8b 05 bb 72 9f 11 83 b8 14 0b 00 00 00 0f 85 9d 00 00 00 9c <5b> fa 48 c7 c7 8f a1 02 8e e8 57 40 17 0a 65 ff 05 40 b8 9f 11 45
> RSP: 0018:ffffc9000260f498 EFLAGS: 00000246
> RAX: ffff88801df81e40 RBX: ffffffff818f9166 RCX: 0000000080000002
> RDX: 0000000000000000 RSI: ffffffff8176da62 RDI: 1ffffffff1d2c05c
> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> R10: ffffc9000260f638 R11: ffffffff81b11580 R12: 0000000000000002
> R13: ffffffff8e9602e0 R14: 0000000000000000 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffff88812510b000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fe09b2c1ff8 CR3: 000000000e74c000 CR4: 00000000003526f0
> Call Trace:
> <TASK>
> rcu_lock_acquire include/linux/rcupdate.h:312 [inline]
> rcu_read_lock include/linux/rcupdate.h:850 [inline]
> class_rcu_constructor include/linux/rcupdate.h:1193 [inline]
> unwind_next_frame+0xc2/0x23c0 arch/x86/kernel/unwind_orc.c:495
> arch_stack_walk+0x11b/0x150 arch/x86/kernel/stacktrace.c:25
> stack_trace_save+0xa9/0x100 kernel/stacktrace.c:122
> kasan_save_stack mm/kasan/common.c:57 [inline]
> kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
> unpoison_slab_object mm/kasan/common.c:340 [inline]
> __kasan_slab_alloc+0x6c/0x80 mm/kasan/common.c:366
> kasan_slab_alloc include/linux/kasan.h:253 [inline]
> slab_post_alloc_hook mm/slub.c:4501 [inline]
> slab_alloc_node mm/slub.c:4830 [inline]
> kmem_cache_alloc_node_noprof+0x384/0x690 mm/slub.c:4882
> __alloc_skb+0x1d0/0x7d0 net/core/skbuff.c:702
> alloc_skb include/linux/skbuff.h:1383 [inline]
> nsim_dev_trap_skb_build drivers/net/netdevsim/dev.c:819 [inline]
> nsim_dev_trap_report drivers/net/netdevsim/dev.c:876 [inline]
> nsim_dev_trap_report_work+0x29a/0xb80 drivers/net/netdevsim/dev.c:922
> process_one_work+0x949/0x1650 kernel/workqueue.c:3279
> process_scheduled_works kernel/workqueue.c:3362 [inline]
> worker_thread+0xb46/0x1140 kernel/workqueue.c:3443
> kthread+0x388/0x470 kernel/kthread.c:467
> ret_from_fork+0x51e/0xb90 arch/x86/kernel/process.c:158
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> </TASK>
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
>
> If you want syzbot to run the reproducer, reply with:
> #syz test: git://repo/address.git branch-or-commit-hash
> If you attach or paste a git patch, syzbot will apply it before testing.
>
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
>
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
>
> If you want to undo deduplication, reply with:
> #syz undup

-- 
Frederic Weisbecker
SUSE Labs
Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback

From: Günther Noack @ 2026-02-23 15:15 UTC
To: Frederic Weisbecker
Cc: syzbot, Mickaël Salaün, Paul Moore, James Morris, Serge E. Hallyn,
  linux-security-module, anna-maria, linux-kernel, syzkaller-bugs, tglx

On Mon, Feb 23, 2026 at 02:40:15PM +0100, Frederic Weisbecker wrote:
> On Fri, Feb 20, 2026 at 03:11:21AM -0800, syzbot wrote:
> > Call Trace:
> > <TASK>
> > context_switch kernel/sched/core.c:5295 [inline]
> > __schedule+0x1585/0x5340 kernel/sched/core.c:6907
> > __schedule_loop kernel/sched/core.c:6989 [inline]
> > schedule+0x164/0x360 kernel/sched/core.c:7004
> > schedule_timeout+0xc3/0x2c0 kernel/time/sleep_timeout.c:75
> > do_wait_for_common kernel/sched/completion.c:100 [inline]
> > __wait_for_common kernel/sched/completion.c:121 [inline]
> > wait_for_common kernel/sched/completion.c:132 [inline]
> > wait_for_completion+0x2cc/0x5e0 kernel/sched/completion.c:153
> > restrict_one_thread security/landlock/tsync.c:128 [inline]
> > restrict_one_thread_callback+0x320/0x570 security/landlock/tsync.c:162
>
> Seems to be related to landlock security module.
> Cc'ing maintainers for awareness.

Thank you! That is correct. We are already discussing it in
https://lore.kernel.org/all/00A9E53EDC82309F+7b1dfc69-95f8-4ffc-a67c-967de0e2dfee@uniontech.com/

—Günther
Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback

From: Ding Yihan @ 2026-02-21 7:28 UTC
To: syzbot, Mickaël Salaün
Cc: linux-security-module, Günther Noack

Hi all,

Thanks to syzbot for the testing and confirmation.

Since I am relatively new to the inner workings of this specific
subsystem, I would like to take a few days to thoroughly study the root
cause (the task_work and mutex interaction) and prepare a detailed and
proper commit message.

I will send out the formal patch (v1) to the mailing list later.

Best regards,
Yihan Ding

On 2026/2/21 15:11, syzbot wrote:
> Hello,
>
> syzbot has tested the proposed patch and the reproducer did not trigger any issue:
>
> Reported-by: syzbot+7ea2f5e9dfd468201817@syzkaller.appspotmail.com
> Tested-by: syzbot+7ea2f5e9dfd468201817@syzkaller.appspotmail.com
>
> Tested on:
>
> commit: d4906ae1 Add linux-next specific files for 20260220
> git tree: linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=13ea89e6580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=51f859f3211496bc
> dashboard link: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
> compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
> patch: https://syzkaller.appspot.com/x/patch.diff?x=15f0595a580000
>
> Note: testing is done by a robot and is best-effort only.
Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback

From: Günther Noack @ 2026-02-21 12:00 UTC
To: Ding Yihan
Cc: syzbot, Mickaël Salaün, linux-security-module, Jann Horn

Hello Ding!

On Sat, Feb 21, 2026 at 03:28:47PM +0800, Ding Yihan wrote:
> Since I am relatively new to the inner workings of this specific subsystem,
> I would like to take a few days to thoroughly study the root cause
> (the task_work and mutex interaction) and prepare a detailed and proper commit message.
>
> I will send out the formal patch (v1) to the mailing list later.

Thank you very much for preparing a patch, and especially also for
forwarding this to us. (The original syzkaller report was somehow not
addressed to Landlock or the LSM list. We should fix that.)

Timing-wise, the feature was picked up for the 7.0 release, so we still
have some time to fix it before this becomes stable.

As an early review for the patch:

Background: We had previously convinced ourselves that grabbing the
cred_guard_mutex was not necessary. To quote the comment in
landlock_restrict_sibling_threads():

  Unlike seccomp, which modifies sibling tasks directly, we do not need
  to acquire the cred_guard_mutex and sighand->siglock:

  - As in our case, all threads are themselves exchanging their own
    struct cred through the credentials API, no locks are needed for
    that.
  - Our for_each_thread() loops are protected by RCU.
  - We do not acquire a lock to keep the list of sibling threads stable
    between our for_each_thread loops. If the list of available sibling
    threads changes between these for_each_thread loops, we make up for
    that by continuing to look for threads until they are all
    discovered and have entered their task_work, where they are unable
    to spawn new threads.
The question of locking cred_guard_mutex came up multiple times in the
patch discussion as well; the most recent thread was:
https://lore.kernel.org/all/20251020.fohbo6Iecahz@digikod.net/

If it helps, I keep some of my own notes for this particular feature on
https://wiki.gnoack.org/LandlockMultithreadedEnforcement.

(Very) tentative investigation:

In the Syzkaller report [2], it seems that the reproducer [2.1] is
creating two rulesets and then enforcing them in parallel, a scenario
which we already exercise in TEST(competing_enablement) in
tools/testing/selftests/landlock/tsync_test.c, but which has not failed
in my own selftest runs.

In the crash report, there are four threads in total:

* Two are stuck in the line
    wait_for_completion(&ctx->ready_to_commit);
  in the per-thread task work (line 128 [4.1]).
* Two are stuck in the line
    wait_for_completion(&shared_ctx.all_prepared)
  in the calling thread's coordination logic (line 539 [4.2]).

In line 539, we are already on the code path where we detected that we
are getting interrupted by another thread and where we are attempting
to deal with the scenario where two landlock_restrict_self() calls
compete. This is detected on line 523 when
wait_for_completion_interruptible() returns nonzero. The approach to
handle this is to set the overall -ERESTARTNOINTR error and cancel the
work that has been ongoing so far, by canceling the task works that did
not start running yet and waiting for the ones that did start running
(that is the step where we are blocked!). The reasoning there was that
these task works would all hit the "all_prepared" stage now, but as we
can see in the stack trace, the task works that are actively running
are already on line 128 and have passed the "all_prepared" stage.

Differences I can see between syzkaller and our own test:

* The reproducer also calls openat() and then twice socketpair().
  These syscalls should be unrelated, but it is possible that the
  "async" invocation of socketpair() contributes to adding more
  threads. (Assuming that "async" means "in a new thread" in
  syzkaller.)
* Syzkaller gives it more attempts. ([2.2])

I do not understand yet what went wrong in our scheme and need to look
deeper. Ding, do you have more insights into it from your debugging?

Thanks,
–Günther

For reference:
[1] Report mail: https://lore.kernel.org/all/69984159.050a0220.21cd75.01bb.GAE@google.com/
[2] Report: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
[2.1] Reproducer: https://syzkaller.appspot.com/text?tag=ReproSyz&x=16e41c02580000
[2.2] Reproducer (C): https://syzkaller.appspot.com/text?tag=ReproC&x=15813652580000
[3] Patch: https://lore.kernel.org/all/6999504d.a70a0220.2c38d7.0154.GAE@google.com/
[4.1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/security/landlock/tsync.c?id=635c467cc14ebdffab3f77610217c1dacaf88e8c#n128
[4.2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/security/landlock/tsync.c?id=635c467cc14ebdffab3f77610217c1dacaf88e8c#n539
Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback

From: Günther Noack @ 2026-02-21 13:19 UTC
To: Ding Yihan
Cc: syzbot, Mickaël Salaün, linux-security-module, Jann Horn

On Sat, Feb 21, 2026 at 01:00:03PM +0100, Günther Noack wrote:
> (Very) tentative investigation:
>
> In the Syzkaller report [2], it seems that the reproducer [2.1] is
> creating two rulesets and then enforcing them in parallel, a scenario
> which we are exercising in the TEST(competing_enablement) in
> tools/testing/selftests/landlock/tsync_test.c already, but which has
> not failed in my own selftest runs.
>
> In the crash report, there are four threads in total:
>
> * Two are stuck in the line
>     wait_for_completion(&ctx->ready_to_commit);
>   in the per-thread task work (line 128 [4.1])
> * Two are stuck in the line
>     wait_for_completion(&shared_ctx.all_prepared)
>   in the calling thread's coordination logic (line 539 [4.2])
>
> In line 539, we are already on the code path where we detected that we
> are getting interrupted by another thread and where we are attempting
> to deal with the scenario where two landlock_restrict_self() calls
> compete. This is detected on line 523 when
> wait_for_completion_interruptible() returns nonzero. The approach to
> handle this is to set the overall -ERESTARTNOINTR error and cancel the
> work that has been ongoing so far, by canceling the task works that
> did not start running yet and waiting for the ones that did start
> running (that is the step where we are blocked!). The reasoning there
> was that these task works will all hit the "all_prepared" stage now,
> but as we can see in the stack trace, the task works that are actively
> running are already on line 128 and have passed the "all_prepared"
> stage.
> Differences I can see between syzkaller and our own test:
>
> * The reproducer also calls openat() and then twice socketpair().
>   These syscalls should be unrelated, but it's possible that the
>   "async" invocation of socketpair() contributes to adding more
>   threads. (Assuming that "async" means "in new thread" in syzkaller)
> * Syzkaller gives it more attempts. ([2.2])
>
> I do not understand yet what went wrong in our scheme and need to look
> deeper.

OK, I think I understand now. Our existing recovery code for this
conflict is this:

	/*
	 * Decrement num_preparing for current, to undo that we initialized it
	 * to 1 a few lines above.
	 */
	if (atomic_dec_return(&shared_ctx.num_preparing) > 0) {
		if (wait_for_completion_interruptible(
			    &shared_ctx.all_prepared)) {
			/* In case of interruption, we need to retry the system call. */
			atomic_set(&shared_ctx.preparation_error,
				   -ERESTARTNOINTR);

			/*
			 * Cancel task works for tasks that did not start running yet,
			 * and decrement all_prepared and num_unfinished accordingly.
			 */
			cancel_tsync_works(&works, &shared_ctx);

			/*
			 * The remaining task works have started running, so waiting for
			 * their completion will finish.
			 */
			wait_for_completion(&shared_ctx.all_prepared);
		}
	}

When I wrote this, I assumed, as the last comment states, that the task
works which we could not cancel are already running.

I was wrong there, because I had misunderstood task_work_run(). When
the task works get run there, it first *atomically dequeues the entire
queue of scheduled task works*, and then runs them sequentially.

That is why, if we have one task work that belongs to the first
landlock_restrict_self() call and one which belongs to the other, the
task work which is scheduled later can (a) no longer be dequeued with
cancel_tsync_works(), and (b) also has not started running yet.
Now the only thing that is necessary to produce the deadlock is a pair
of threads where the task works for the restriction calls have been
scheduled in a different order. When the two landlock_restrict_self()
calls end up in the recovery path quoted above, each of them waits for
one of its task works to run, which is blocked from running by another
task work that was scheduled before it and does not finish either.

(Just pasting a brain dump here to save you some time hunting for the
root cause. I don't know the best solution yet either.)

–Günther
Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback

From: Günther Noack @ 2026-02-23 9:42 UTC
To: Ding Yihan
Cc: syzbot, Mickaël Salaün, linux-security-module, Jann Horn, Paul Moore

On Sat, Feb 21, 2026 at 02:19:53PM +0100, Günther Noack wrote:
> OK, I think I understand now. Our existing recovery code for this
> conflict is this:
>
> 	/*
> 	 * Decrement num_preparing for current, to undo that we initialized it
> 	 * to 1 a few lines above.
> 	 */
> 	if (atomic_dec_return(&shared_ctx.num_preparing) > 0) {
> 		if (wait_for_completion_interruptible(
> 			    &shared_ctx.all_prepared)) {
> 			/* In case of interruption, we need to retry the system call. */
> 			atomic_set(&shared_ctx.preparation_error,
> 				   -ERESTARTNOINTR);
>
> 			/*
> 			 * Cancel task works for tasks that did not start running yet,
> 			 * and decrement all_prepared and num_unfinished accordingly.
> 			 */
> 			cancel_tsync_works(&works, &shared_ctx);
>
> 			/*
> 			 * The remaining task works have started running, so waiting for
> 			 * their completion will finish.
> 			 */
> 			wait_for_completion(&shared_ctx.all_prepared);
> 		}
> 	}
>
> When I wrote this, I assumed, as the last comment states, that the
> task works which we could not cancel are already running.
>
> I was wrong there, because I had misunderstood task_work_run(). When
> the task works get run there, it first *atomically dequeues the entire
> queue of scheduled task works*, and then runs them sequentially.
>
> That is why, if we have one task work that belongs to the first
> landlock_restrict_self() call and one which belongs to the other, the
> task work which is scheduled later can (a) not be dequeued with
> cancel_tsync_works() any more, and (b) also has not started running
> yet.
> Now the only thing that is necessary to produce the deadlock is that
> we have a pair of threads where the task works for the restriction
> calls have been scheduled in different order. When the two
> landlock_restrict_self() calls end up in the recovery path quoted
> above, they will wait for one of their task works to run which is
> blocked from running by another task work that is scheduled before and
> does not finish either.
>
> (Just pasting a brain dump here to save you some time hunting for the
> root cause. I don't know the best solution yet either.)

Let me propose the following fixes:

1. Immediate fix for that specific issue
----------------------------------------

Proposal:

* Remove the wait_for_completion(&shared_ctx.all_prepared) call in the
  code snippet above.
* Rewrite the surrounding comments: be clear about the fact that
  cancel_tsync_works() is an opportunistic improvement, but we have no
  guarantee at all that it cancels any of the enqueued task works
  (because task_work_run() might already have popped them off).

This removes the hold-and-wait dependency cycle between the threads,
which produces the observed deadlock. The way that we shut down now is
that we exit the main loop (this happens already without the change,
but we might also "break" to be explicit).

I think that this fix or an equivalent one is needed here, because
either way, our assumptions in the quoted code above were wrong.

2. Can we reason constructively about correctness?
--------------------------------------------------

The remaining question: if, on the shutdown path, we cannot actually
remove all the enqueued task works, under what circumstances are we
even able to interrupt and return from the landlock_restrict_self()
system call?
2.1 For n competing restrict_self calls, n-1 of them need to get interrupted
----------------------------------------------------------------------------

To answer this, consider a multithreaded process with threads named
"red", "green" and "blue" and many additional threads. When "red",
"green" and "blue" enforce landlock_restrict_self() concurrently, due
to differing iteration order, we might end up enqueueing the task works
on the other threads in all of the following combinations:

  t0: R G B   <- front of queue
  t1: R B G
  t2: G R B
  t3: G B R
  t4: B R G
  t5: B G R

In this configuration, for any of the landlock_restrict_self() system
calls to even return (successfully or unsuccessfully), at least two
threads must receive an interrupt and therefore remove their enqueued
task works from the front of the queue. Assuming those are green and
blue, we get:

  t0: R       <- front of queue
  t1: R
  t2: G R
  t3: G B R
  t4: B R
  t5: B G R

(This works because, after the patch above, all of the enqueued G and B
works finish even if there are remaining G and B works that are still
blocked by an "R" entry.)

Now, "R" is at the front of the queue, and the landlock_restrict_self()
call for the red thread can finish normally, even without being
interrupted. Once the "R" task works are done as well, the remaining G
and B works can run and finish too.

This scheme generalizes: if we have n competing
landlock_restrict_self() calls, then in the worst case, at least n-1 of
these system calls need to be interrupted so that they can all
terminate.

2.2 Can we guarantee that two system calls get interrupted?
-----------------------------------------------------------

In case of competing landlock_restrict_self() calls, I think it is
possible that not all relevant system calls get interrupted. The
scenario is one where we have a "red" and a "green" thread calling
landlock_restrict_self().
(a set of additional threads)
t0: task_works: R G
t1: task_works: G R
tR: red thread
tG: green thread

In the red thread, the following happens:
* Under RCU, count the number of total threads => get a low number
* Allocate space for that number of task_works
* Under RCU
  * Enqueue "R" into t0 and t1
  * Enqueue "R" for some of the "additional threads"
  * But we do not have enough pre-allocated space to enqueue "R" for
    the green thread tG.

The same thing happens in the green thread as well.

The result is that we still have a deadlock between t0 and t1, but
neither the red nor the green thread gets interrupted so that they can
resolve it.

(FWIW, you could resolve it from the outside by sending a signal to
the red or green thread manually, but it is not guaranteed to happen
on its own.)

Caveat: I am making pessimistic assumptions about the iteration order
of the task list here, and I am assuming that the number of
"additional threads" swings up and down during the competing
enforcement, so that the enforcing threads mis-approximate the
required space for memory pre-allocation.

2.3 Possible resolutions
------------------------

* We could try to interrupt all sibling threads during the teardown,
  to fix the issue discussed in 2.2. (Downside: Complicated, more
  expensive)
* The reason why landlock_restrict_self() can't return is that it
  needs to wait until all task works are done before it can free the
  memory. Alternatively, we could make the task works take ownership
  of these memory structures (refcounting the shared_ctx). (Downside:
  The used memory is no longer linear in the number of threads.)

Side remark: In testing, I had the impression that the
landlock_restrict_self() calls can go into a retry loop for a while
where all competing threads get interrupted all the time; in a debug
build, when the Syzkaller test prints out a line for each attempt, it
sometimes hung for seconds and *then* resolved itself again.
3 Conclusion
---------------

I would prefer it if the final solution did not require deadlock
reasoning at that level and we could do it in a simpler way. I
therefore propose to do what Ding Yihan suggested, and what we had
also discussed previously in the code review:

* Let's serialize the landlock_restrict_self()-with-TSYNC operations
  through the cred_guard_mutex.

This will resolve the issue where competing landlock_restrict_self()
calls with TSYNC can deadlock. It will also remove the jittery
behavior for that worst case where the conflict is resolved through
retry.


So in my mind, we need both patches:

* The fix to the cleanup path from 1. above, to make interruption
  work more reliably and to correct the misunderstandings in the
  comments.
* cred_guard_mutex to serialize the TSYNC invocations.

Please let me know what you think.

Thanks,
–Günther

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-23  9:42     ` Günther Noack
@ 2026-02-23 11:29       ` Ding Yihan
  2026-02-23 15:16         ` Günther Noack
  0 siblings, 1 reply; 11+ messages in thread
From: Ding Yihan @ 2026-02-23 11:29 UTC (permalink / raw)
To: Günther Noack
Cc: syzbot, Mickaël Salaün, linux-security-module, Jann Horn, Paul Moore

Hi Günther,

Thank you for the detailed analysis and the clear breakdown.
Apologies for the delayed response. I spent the last couple of days
thoroughly reading through the previous mailing list discussions. I
was trying hard to see if there was any viable pure lockless design
that could solve this concurrency issue while preserving the original
architecture.

However, after looking at the complexities you outlined, I completely
agree with your conclusion: serializing the TSYNC operations is indeed
the most robust and reasonable path forward to prevent the deadlock.

Regarding the lock choice, since 'cred_guard_mutex' is explicitly
marked as deprecated for new code in the kernel, maybe we can use its
modern replacement: 'exec_update_lock' (using down_write_trylock /
up_write on current->signal). This aligns with the current subsystem
standards and was also briefly touched upon by Jann in the older
discussions.

I fully understand the requirement for the two-part patch series:
1. Cleaning up the cancellation logic and comments.
2. Introducing the serialization lock for TSYNC.

I will take some time to draft and test this patch series properly.
I also plan to discuss this with my kernel colleagues here at
UnionTech to see if they have any additional suggestions on the
implementation details before I submit it.

I will send out the v1 patch series to the list as soon as it is
ready. Thanks again for your guidance and the great discussion!

Best regards,
Yihan Ding

On 2026/2/23 17:42, Günther Noack wrote:
> On Sat, Feb 21, 2026 at 02:19:53PM +0100, Günther Noack wrote:
>> OK, I think I understand now.
Our existing recovery code for this
>> conflict is this:
>>
>>     /*
>>      * Decrement num_preparing for current, to undo that we initialized it
>>      * to 1 a few lines above.
>>      */
>>     if (atomic_dec_return(&shared_ctx.num_preparing) > 0) {
>>             if (wait_for_completion_interruptible(
>>                         &shared_ctx.all_prepared)) {
>>                     /* In case of interruption, we need to retry the system call. */
>>                     atomic_set(&shared_ctx.preparation_error,
>>                                -ERESTARTNOINTR);
>>
>>                     /*
>>                      * Cancel task works for tasks that did not start running yet,
>>                      * and decrement all_prepared and num_unfinished accordingly.
>>                      */
>>                     cancel_tsync_works(&works, &shared_ctx);
>>
>>                     /*
>>                      * The remaining task works have started running, so waiting for
>>                      * their completion will finish.
>>                      */
>>                     wait_for_completion(&shared_ctx.all_prepared);
>>             }
>>     }
>>
>> When I wrote this, I assumed, as the last comment states, that the
>> task works which we could not cancel are already running.
>>
>> I was wrong there, because I had misunderstood task_work_run(). When
>> the task works get run there, it first *atomically dequeues the entire
>> queue of scheduled task works*, and then runs them sequentially.
>>
>> That is why, if we have one task work that belongs to the first
>> landlock_restrict_self() call and one which belongs to the other, the
>> task work which is scheduled later can (a) no longer be dequeued with
>> cancel_tsync_works(), and (b) also has not started running yet.
>>
>> Now the only thing that is necessary to produce the deadlock is that
>> we have a pair of threads where the task works for the restriction
>> calls have been scheduled in different order. When the two
>> landlock_restrict_self() calls end up in the recovery path quoted
>> above, they will wait for one of their task works to run which is
>> blocked from running by another task work that is scheduled before and
>> does not finish either.
>>
>> (Just pasting a brain dump here to save you some time hunting for the
>> root cause.
I don't know the best solution yet either.) > > Let me propose the following fixes: > > 1. Immediate fix for that specific issue > ---------------------------------------- > > Proposal: > * Remove the wait_for_completion(&shared_ctx.all_prepared) > call in the code snippet above. > * Rewrite surrounding comments: Be clear about the fact that > cancel_tsync_works() is an opportunistic improvement, but we don't > have a guarantee at all that it cancels any of the enqueued task > works (because task_work_run might already have popped them off). > > This removes the hold-and-wait dependency circle between the threads, > which produces the observed deadlock. The way that we shut down now > is that we exit the main loop (happens already without it, but we > might also "break" to be explicit). > > I think that this fix or an equivalent one is needed here, because in > either way, our assumptions in the quoted code above were wrong. > > > 2. Can we reason constructively about correctness? > -------------------------------------------------- > > The remaining question: If on the shutdown path, we can not actually > remove all the enqueued task works, under what circumstances are we > even able to interrupt and return from the landlock_restrict_self() > system call? 
> > 2.1 For n competing restrict_self calls, n-1 of them need to get interrupted > ---------------------------------------------------------------------------- > > To answer this, consider a multithreaded process with threads named > "red", "green" and "blue" and many additional threads: When "red", > "green" and "blue" enforce landlock_restrict_self() concurrently, due > to differing iteration order, we might end up enqueueing the task > works on other threads in all of the following combinations: > > t0: R G B <- front of queue > t1: R B G > t2: G R B > t3: G B R > t4: B R G > t5: B G R > > In this configuration, for any of the landlock_restrict_self() system > calls to even return (successfully or unsuccessfully), at least two > threads must receive an interrupt and therefore remove their enqueued > task works from the front of the queue. Assuming those are green and > blue, we get: > > t0: R <- front of queue > t1: R > t2: G R > t3: G B R > t4: B R > t5: B G R > > (This works because after the patch above, all of the enqueued G and B > works finish even if there are remaining G and B works that are still > blocked by an "R" entry.) > > Now, "R" is in the front of the queue, and the > landlock_restrict_self() call for the red thread can finish normally, > even without it being interrupted. > > Once the "R" task works are done as well, the remaining G and B works > can run and finish as well. > > This scheme generalizes: If we have n competing > landlock_restrict_self() calls, then in worst case, at least n-1 of > these system calls need to be interrupted so that they can all > terminate. > > 2.2 Can we guarantee that two system calls get interrupted? > ----------------------------------------------------------- > > In case of competing landlock_restrict_self() calls, I think it is > possible that not all relevant system calls get seen. The scenario is > one where we have a "red" and "green" thread calling > landlock_restrict_self(). 
> > (a set of additional threads) > t0: task_works: R G > t1: task_works: G R > tR: red thread > tG: green thread > > In the red thread, the following happens: > * Under RCU, count the number of total threads => get a low number > * Allocate space for that number of task_works > * Under RCU > * Enqueue "R" into t0 and t1 > * Enqueue "R" for some of the "additional threads" > * But we do not have enough pre-allocated space to enqueue "R" for > the green thread tG. > > The same thing happens in the green thread as well. > > The result is that we still have a deadlock between t0 and t1, but > neither the red nor the green thread get interrupted so that they can > resolve it. > > (FWIW, you could resolve it from the outside by sending a signal to > the red or green thread manually, but it is not guaranteed to happen > on its own.) > > Caveat: I am making pessimistic assumptions about the iteration order > of the task list here, and I am assuming that the number of > "additional threads" is swinging up and down during the competing > enforcement, so that the enforcing threads are mis-approximating the > required space for memory pre-allocation. > > 2.3 Possible resolutions > ------------------------ > > * We could try to interrupt all sibling threads during the teardown, > to fix the issue discussed in 2.2. (Downside: Complicated, more > expensive) > * The reason why landlock_restrict_self() can't return is because it > needs to wait until all task works are done before it can free the > memory. Alternatively, we could make the task works take ownership > of these memory structures (refcounting the shared_ctx). (Downside: > The used memory is not linear to the number of threads any more.) 
> > Side remark: In testing, I had the impression that the > landlock_restrict_self() calls can go into a retry loop for a while > where all competing threads get interrupted all the time; in a debug > build, when the Syzkaller test prints out a line for each attempt, > sometimes it was hanging for seconds and *then* resolving itself > again. > > 3 Conclusion > --------------- > > I would prefer if the final solution would not require deadlock > reasoning at that level and we could do it in simpler way. I > therefore propose to do what Ding Yihan suggested, and what we had > also discussed previously in the code review: > > * Let's serialize the landlock_restrict_self()-with-TSYNC operations > through the cred_guard_mutex. > > This will resolve the issue where competing landlock_restrict_self() > calls with TSYNC can deadlock. It will also remove the jittery > behavior for that worst case where the conflict is resolved through > retry. > > > So in my mind, we need both patches: > > * The fix to the cleanup path from 1. above, to make interruption > work more reliably and to correct the misunderstandings in the > comments. > * cred_guard_mutex to serialize the TSYNC invocations. > > Please let me know what you think. > > Thanks, > –Günther > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback 2026-02-23 11:29 ` Ding Yihan @ 2026-02-23 15:16 ` Günther Noack 2026-02-24 3:02 ` Ding Yihan 0 siblings, 1 reply; 11+ messages in thread From: Günther Noack @ 2026-02-23 15:16 UTC (permalink / raw) To: Ding Yihan Cc: Günther Noack, syzbot, Mickaël Salaün, linux-security-module, Jann Horn, Paul Moore Hello! On Mon, Feb 23, 2026 at 07:29:56PM +0800, Ding Yihan wrote: > Thank you for the detailed analysis and the clear breakdown. > Apologies for the delayed response. I spent the last couple of days > thoroughly reading through the previous mailing list discussions. I > was trying hard to see if there was any viable pure lockless design > that could solve this concurrency issue while preserving the original > architecture. > > However, after looking at the complexities you outlined, I completely > agree with your conclusion: serializing the TSYNC operations is indeed > the most robust and reasonable path forward to prevent the deadlock. > > Regarding the lock choice, since 'cred_guard_mutex' is explicitly > marked as deprecated for new code in the kernel,maybe we can use its > modern replacement: 'exec_update_lock' (using down_write_trylock / > up_write on current->signal). This aligns with the current subsystem > standards and was also briefly touched upon by Jann in the older > discussions. > > I fully understand the requirement for the two-part patch series: > 1. Cleaning up the cancellation logic and comments. > 2. Introducing the serialization lock for TSYNC. > > I will take some time to draft and test this patch series properly. > I also plan to discuss this with my kernel colleagues here at > UnionTech to see if they have any additional suggestions on the > implementation details before I submit it. > > I will send out the v1 patch series to the list as soon as it is > ready. Thanks again for your guidance and the great discussion! Thank you, Ding, this is much appreciated! 
I agree, the `exec_update_lock` might be the better solution; I also need to familiarize myself more with it to double-check. —Günther ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback 2026-02-23 15:16 ` Günther Noack @ 2026-02-24 3:02 ` Ding Yihan 2026-02-24 3:03 ` syzbot 0 siblings, 1 reply; 11+ messages in thread From: Ding Yihan @ 2026-02-24 3:02 UTC (permalink / raw) To: Günther Noack Cc: Günther Noack, syzbot, Mickaël Salaün, linux-security-module, Jann Horn, Paul Moore Hi Günther, Thank you for the detailed analysis! I completely agree that serializing the TSYNC operations is the right way to prevent this deadlock. I have drafted a patch using `exec_update_lock` (similar to how seccomp uses `cred_guard_mutex`). Regarding your proposal to split this into two patches (one for the cleanup path and one for the lock): Maybe combining them into a single patch is a better choice. Here is why: We actually *cannot* remove `wait_for_completion(&shared_ctx.all_prepared)` in the interrupt recovery path. Since `shared_ctx` is allocated on the local stack of the caller, removing the wait would cause a severe Use-After-Free (UAF) if the thread returns to userspace while sibling task_works are still executing and dereferencing `ctx`. By adding the lock, we inherently resolve the deadlock, meaning the sibling task_works will never get stuck. Thus, `wait_for_completion` becomes perfectly safe to keep, and it remains strictly necessary to protect the stack memory. Therefore, the "fix" for the cleanup path is simply updating the comments to reflect this reality, which is tightly coupled with the locking fix. It felt more cohesive as a single patch. 
I have tested the patch on my laptop, and it does not trigger the
issue. Let's have syzbot test this combined logic:

#syz test:

--- a/security/landlock/tsync.c
+++ b/security/landlock/tsync.c
@@ -447,6 +447,12 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
 	shared_ctx.new_cred = new_cred;
 	shared_ctx.set_no_new_privs = task_no_new_privs(current);
 
+	/*
+	 * Serialize concurrent TSYNC operations to prevent deadlocks
+	 * when multiple threads call landlock_restrict_self() simultaneously.
+	 */
+	down_write(&current->signal->exec_update_lock);
+
 	/*
 	 * We schedule a pseudo-signal task_work for each of the calling task's
 	 * sibling threads. In the task work, each thread:
@@ -527,14 +533,17 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
 					   -ERESTARTNOINTR);
 
 			/*
-			 * Cancel task works for tasks that did not start running yet,
-			 * and decrement all_prepared and num_unfinished accordingly.
+			 * Opportunistic improvement: try to cancel task works
+			 * for tasks that did not start running yet. We do not
+			 * have a guarantee that it cancels any of the enqueued
+			 * task works (because task_work_run() might already have
+			 * dequeued them).
 			 */
 			cancel_tsync_works(&works, &shared_ctx);
 
 			/*
-			 * The remaining task works have started running, so waiting for
-			 * their completion will finish.
+			 * We must wait for the remaining task works to finish to
+			 * prevent a use-after-free of the local shared_ctx.
 			 */
 			wait_for_completion(&shared_ctx.all_prepared);
 		}
@@ -557,5 +566,7 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
 
 	tsync_works_release(&works);
 
+	up_write(&current->signal->exec_update_lock);
+
 	return atomic_read(&shared_ctx.preparation_error);
 }
--

On 2026/2/23 23:16, Günther Noack wrote:
> Hello!
>
> On Mon, Feb 23, 2026 at 07:29:56PM +0800, Ding Yihan wrote:
>> Thank you for the detailed analysis and the clear breakdown.
>> Apologies for the delayed response.
I spent the last couple of days >> thoroughly reading through the previous mailing list discussions. I >> was trying hard to see if there was any viable pure lockless design >> that could solve this concurrency issue while preserving the original >> architecture. >> >> However, after looking at the complexities you outlined, I completely >> agree with your conclusion: serializing the TSYNC operations is indeed >> the most robust and reasonable path forward to prevent the deadlock. >> >> Regarding the lock choice, since 'cred_guard_mutex' is explicitly >> marked as deprecated for new code in the kernel,maybe we can use its >> modern replacement: 'exec_update_lock' (using down_write_trylock / >> up_write on current->signal). This aligns with the current subsystem >> standards and was also briefly touched upon by Jann in the older >> discussions. >> >> I fully understand the requirement for the two-part patch series: >> 1. Cleaning up the cancellation logic and comments. >> 2. Introducing the serialization lock for TSYNC. >> >> I will take some time to draft and test this patch series properly. >> I also plan to discuss this with my kernel colleagues here at >> UnionTech to see if they have any additional suggestions on the >> implementation details before I submit it. >> >> I will send out the v1 patch series to the list as soon as it is >> ready. Thanks again for your guidance and the great discussion! > > Thank you, Ding, this is much appreciated! > > I agree, the `exec_update_lock` might be the better solution; > I also need to familiarize myself more with it to double-check. > > —Günther > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback 2026-02-24 3:02 ` Ding Yihan @ 2026-02-24 3:03 ` syzbot 0 siblings, 0 replies; 11+ messages in thread From: syzbot @ 2026-02-24 3:03 UTC (permalink / raw) To: dingyihan Cc: dingyihan, gnoack3000, gnoack, jannh, linux-security-module, mic, paul, linux-kernel, syzkaller-bugs > Hi Günther, > > Thank you for the detailed analysis! I completely agree that serializing the TSYNC > operations is the right way to prevent this deadlock. I have drafted a patch using > `exec_update_lock` (similar to how seccomp uses `cred_guard_mutex`). > > Regarding your proposal to split this into two patches (one for the cleanup > path and one for the lock): Maybe combining them into a single patch is a better choice. Here is why: > > We actually *cannot* remove `wait_for_completion(&shared_ctx.all_prepared)` > in the interrupt recovery path. Since `shared_ctx` is allocated on the local > stack of the caller, removing the wait would cause a severe Use-After-Free (UAF) if the > thread returns to userspace while sibling task_works are still executing and dereferencing `ctx`. > > By adding the lock, we inherently resolve the deadlock, meaning the sibling task_works > will never get stuck. Thus, `wait_for_completion` becomes perfectly safe to keep, > and it remains strictly necessary to protect the stack memory. Therefore, the "fix" for the > cleanup path is simply updating the comments to reflect this reality, which is tightly coupled with the locking fix. > It felt more cohesive as a single patch. > > I have test the patch on my laptop,and it will not trigger the issue.Let's have syzbot test this combined logic: > > #syz test: "---" does not look like a valid git repo address. 
> > --- a/security/landlock/tsync.c
> > +++ b/security/landlock/tsync.c
> > @@ -447,6 +447,12 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
> >  	shared_ctx.new_cred = new_cred;
> >  	shared_ctx.set_no_new_privs = task_no_new_privs(current);
> > 
> > +	/*
> > +	 * Serialize concurrent TSYNC operations to prevent deadlocks
> > +	 * when multiple threads call landlock_restrict_self() simultaneously.
> > +	 */
> > +	down_write(&current->signal->exec_update_lock);
> > +
> >  	/*
> >  	 * We schedule a pseudo-signal task_work for each of the calling task's
> >  	 * sibling threads. In the task work, each thread:
> > @@ -527,14 +533,17 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
> >  					   -ERESTARTNOINTR);
> > 
> >  			/*
> > -			 * Cancel task works for tasks that did not start running yet,
> > -			 * and decrement all_prepared and num_unfinished accordingly.
> > +			 * Opportunistic improvement: try to cancel task works
> > +			 * for tasks that did not start running yet. We do not
> > +			 * have a guarantee that it cancels any of the enqueued
> > +			 * task works (because task_work_run() might already have
> > +			 * dequeued them).
> >  			 */
> >  			cancel_tsync_works(&works, &shared_ctx);
> > 
> >  			/*
> > -			 * The remaining task works have started running, so waiting for
> > -			 * their completion will finish.
> > +			 * We must wait for the remaining task works to finish to
> > +			 * prevent a use-after-free of the local shared_ctx.
> >  			 */
> >  			wait_for_completion(&shared_ctx.all_prepared);
> >  		}
> > @@ -557,5 +566,7 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
> > 
> >  	tsync_works_release(&works);
> > 
> > +	up_write(&current->signal->exec_update_lock);
> > +
> >  	return atomic_read(&shared_ctx.preparation_error);
> >  }
> > --
>
> On 2026/2/23 23:16, Günther Noack wrote:
>> Hello!
>>
>> On Mon, Feb 23, 2026 at 07:29:56PM +0800, Ding Yihan wrote:
>>> Thank you for the detailed analysis and the clear breakdown.
>>> Apologies for the delayed response.
I spent the last couple of days >>> thoroughly reading through the previous mailing list discussions. I >>> was trying hard to see if there was any viable pure lockless design >>> that could solve this concurrency issue while preserving the original >>> architecture. >>> >>> However, after looking at the complexities you outlined, I completely >>> agree with your conclusion: serializing the TSYNC operations is indeed >>> the most robust and reasonable path forward to prevent the deadlock. >>> >>> Regarding the lock choice, since 'cred_guard_mutex' is explicitly >>> marked as deprecated for new code in the kernel,maybe we can use its >>> modern replacement: 'exec_update_lock' (using down_write_trylock / >>> up_write on current->signal). This aligns with the current subsystem >>> standards and was also briefly touched upon by Jann in the older >>> discussions. >>> >>> I fully understand the requirement for the two-part patch series: >>> 1. Cleaning up the cancellation logic and comments. >>> 2. Introducing the serialization lock for TSYNC. >>> >>> I will take some time to draft and test this patch series properly. >>> I also plan to discuss this with my kernel colleagues here at >>> UnionTech to see if they have any additional suggestions on the >>> implementation details before I submit it. >>> >>> I will send out the v1 patch series to the list as soon as it is >>> ready. Thanks again for your guidance and the great discussion! >> >> Thank you, Ding, this is much appreciated! >> >> I agree, the `exec_update_lock` might be the better solution; >> I also need to familiarize myself more with it to double-check. >> >> —Günther >> > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback 2026-02-21 7:28 ` Ding Yihan 2026-02-21 12:00 ` Günther Noack @ 2026-02-24 14:43 ` Günther Noack 1 sibling, 0 replies; 11+ messages in thread From: Günther Noack @ 2026-02-24 14:43 UTC (permalink / raw) To: syzbot; +Cc: linux-security-module #syz set subsystems: lsm, kernel ^ permalink raw reply [flat|nested] 11+ messages in thread
Thread overview: 11+ messages
[not found] <69984159.050a0220.21cd75.01bb.GAE@google.com>
2026-02-23 13:40 ` [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback Frederic Weisbecker
2026-02-23 15:15 ` Günther Noack
[not found] <69995a88.050a0220.340abe.0d25.GAE@google.com>
2026-02-21 7:28 ` Ding Yihan
2026-02-21 12:00 ` Günther Noack
2026-02-21 13:19 ` Günther Noack
2026-02-23 9:42 ` Günther Noack
2026-02-23 11:29 ` Ding Yihan
2026-02-23 15:16 ` Günther Noack
2026-02-24 3:02 ` Ding Yihan
2026-02-24 3:03 ` syzbot
2026-02-24 14:43 ` Günther Noack