public inbox for linux-kernel@vger.kernel.org
* [syzbot] [cgroups?] possible deadlock in task_rq_lock
@ 2024-08-16  5:50 syzbot
  2024-08-17  6:52 ` syzbot
  2024-08-18  7:05 ` syzbot
  0 siblings, 2 replies; 8+ messages in thread
From: syzbot @ 2024-08-16  5:50 UTC (permalink / raw)
  To: cgroups, hannes, linux-kernel, lizefan.x, mkoutny, syzkaller-bugs,
	tj

Hello,

syzbot found the following issue on:

HEAD commit:    edd1ec2e3a9f Add linux-next specific files for 20240815
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=10ba4ed5980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=61ba6f3b22ee5467
dashboard link: https://syzkaller.appspot.com/bug?extid=ca14b36a46a8c541b509
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/d063ca2a0d8c/disk-edd1ec2e.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/08634f60bc99/vmlinux-edd1ec2e.xz
kernel image: https://storage.googleapis.com/syzbot-assets/e6f08ac13836/bzImage-edd1ec2e.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+ca14b36a46a8c541b509@syzkaller.appspotmail.com

------------[ cut here ]------------
======================================================
WARNING: possible circular locking dependency detected
6.11.0-rc3-next-20240815-syzkaller #0 Not tainted
------------------------------------------------------
dhcpcd-run-hook/12621 is trying to acquire lock:
ffffffff8e815038 ((console_sem).lock){-.-.}-{2:2}, at: down_trylock+0x20/0xa0 kernel/locking/semaphore.c:139

but task is already holding lock:
ffff8880b913ea58 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x2a/0x140 kernel/sched/core.c:587

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&rq->__lock){-.-.}-{2:2}:
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5762
       _raw_spin_lock_nested+0x31/0x40 kernel/locking/spinlock.c:378
       raw_spin_rq_lock_nested+0x2a/0x140 kernel/sched/core.c:587
       raw_spin_rq_lock kernel/sched/sched.h:1485 [inline]
       task_rq_lock+0xc6/0x360 kernel/sched/core.c:689
       cgroup_move_task+0x92/0x2d0 kernel/sched/psi.c:1161
       css_set_move_task+0x72e/0x950 kernel/cgroup/cgroup.c:898
       cgroup_post_fork+0x256/0x880 kernel/cgroup/cgroup.c:6690
       copy_process+0x3ab1/0x3e30 kernel/fork.c:2620
       kernel_clone+0x226/0x8f0 kernel/fork.c:2806
       user_mode_thread+0x132/0x1a0 kernel/fork.c:2884
       rest_init+0x23/0x300 init/main.c:712
       start_kernel+0x47a/0x500 init/main.c:1103
       x86_64_start_reservations+0x2a/0x30 arch/x86/kernel/head64.c:507
       x86_64_start_kernel+0x9f/0xa0 arch/x86/kernel/head64.c:488
       common_startup_64+0x13e/0x147

-> #1 (&p->pi_lock){-.-.}-{2:2}:
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5762
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
       class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline]
       try_to_wake_up+0xb0/0x1470 kernel/sched/core.c:4113
       up+0x72/0x90 kernel/locking/semaphore.c:191
       __up_console_sem kernel/printk/printk.c:340 [inline]
       __console_unlock kernel/printk/printk.c:2801 [inline]
       console_unlock+0x22f/0x4d0 kernel/printk/printk.c:3120
       vprintk_emit+0x5dc/0x7c0 kernel/printk/printk.c:2348
       _printk+0xd5/0x120 kernel/printk/printk.c:2373
       set_capacity_and_notify+0x1ae/0x240 block/genhd.c:86
       loop_set_size+0x44/0xb0 drivers/block/loop.c:232
       loop_configure+0x9fb/0xec0 drivers/block/loop.c:1102
       lo_ioctl+0x849/0x1f60
       blkdev_ioctl+0x580/0x6b0 block/ioctl.c:676
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:907 [inline]
       __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #0 ((console_sem).lock){-.-.}-{2:2}:
       check_prev_add kernel/locking/lockdep.c:3136 [inline]
       check_prevs_add kernel/locking/lockdep.c:3255 [inline]
       validate_chain+0x18e0/0x5900 kernel/locking/lockdep.c:3871
       __lock_acquire+0x137a/0x2040 kernel/locking/lockdep.c:5145
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5762
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
       down_trylock+0x20/0xa0 kernel/locking/semaphore.c:139
       __down_trylock_console_sem+0x109/0x250 kernel/printk/printk.c:323
       console_trylock kernel/printk/printk.c:2754 [inline]
       console_trylock_spinning kernel/printk/printk.c:1958 [inline]
       vprintk_emit+0x2aa/0x7c0 kernel/printk/printk.c:2347
       _printk+0xd5/0x120 kernel/printk/printk.c:2373
       __report_bug lib/bug.c:195 [inline]
       report_bug+0x346/0x500 lib/bug.c:219
       handle_bug+0x60/0x90 arch/x86/kernel/traps.c:285
       exc_invalid_op+0x1a/0x50 arch/x86/kernel/traps.c:309
       asm_exc_invalid_op+0x1a/0x20 arch/x86/include/asm/idtentry.h:621
       lockdep_assert_rq_held kernel/sched/sched.h:1476 [inline]
       rq_clock kernel/sched/sched.h:1624 [inline]
       replenish_dl_new_period kernel/sched/deadline.c:777 [inline]
       update_curr_dl_se+0x66f/0x920 kernel/sched/deadline.c:1511
       update_curr+0x575/0xb20 kernel/sched/fair.c:1176
       put_prev_entity+0x3d/0x210 kernel/sched/fair.c:5505
       put_prev_task_fair+0x4d/0x80 kernel/sched/fair.c:8686
       put_prev_task kernel/sched/sched.h:2423 [inline]
       put_prev_task_balance+0x11d/0x190 kernel/sched/core.c:5886
       __pick_next_task+0xc6/0x2f0 kernel/sched/core.c:5946
       pick_next_task kernel/sched/core.c:6012 [inline]
       __schedule+0x725/0x4ad0 kernel/sched/core.c:6594
       preempt_schedule_irq+0xfb/0x1c0 kernel/sched/core.c:6961
       irqentry_exit+0x5e/0x90 kernel/entry/common.c:354
       asm_sysvec_reschedule_ipi+0x1a/0x20 arch/x86/include/asm/idtentry.h:707
       lock_acquire+0x264/0x550 kernel/locking/lockdep.c:5766
       down_read_trylock+0x24f/0x3c0 kernel/locking/rwsem.c:1568
       mmap_read_trylock include/linux/mmap_lock.h:163 [inline]
       get_mmap_lock_carefully mm/memory.c:6033 [inline]
       lock_mm_and_find_vma+0x32/0x2f0 mm/memory.c:6093
       do_user_addr_fault arch/x86/mm/fault.c:1361 [inline]
       handle_page_fault arch/x86/mm/fault.c:1481 [inline]
       exc_page_fault+0x1bf/0x8c0 arch/x86/mm/fault.c:1539
       asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
       __put_user_4+0x11/0x20 arch/x86/lib/putuser.S:86
       schedule_tail+0x96/0xb0 kernel/sched/core.c:5205
       ret_from_fork+0x24/0x80 arch/x86/kernel/process.c:143
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

other info that might help us debug this:

Chain exists of:
  (console_sem).lock --> &p->pi_lock --> &rq->__lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&rq->__lock);
                               lock(&p->pi_lock);
                               lock(&rq->__lock);
  lock((console_sem).lock);

 *** DEADLOCK ***

2 locks held by dhcpcd-run-hook/12621:
 #0: ffff888011d14d98 (&mm->mmap_lock){++++}-{3:3}, at: mmap_read_trylock include/linux/mmap_lock.h:163 [inline]
 #0: ffff888011d14d98 (&mm->mmap_lock){++++}-{3:3}, at: get_mmap_lock_carefully mm/memory.c:6033 [inline]
 #0: ffff888011d14d98 (&mm->mmap_lock){++++}-{3:3}, at: lock_mm_and_find_vma+0x32/0x2f0 mm/memory.c:6093
 #1: ffff8880b913ea58 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x2a/0x140 kernel/sched/core.c:587

stack backtrace:
CPU: 1 UID: 0 PID: 12621 Comm: dhcpcd-run-hook Not tainted 6.11.0-rc3-next-20240815-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/27/2024
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
 check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2189
 check_prev_add kernel/locking/lockdep.c:3136 [inline]
 check_prevs_add kernel/locking/lockdep.c:3255 [inline]
 validate_chain+0x18e0/0x5900 kernel/locking/lockdep.c:3871
 __lock_acquire+0x137a/0x2040 kernel/locking/lockdep.c:5145
 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5762
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
 _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
 down_trylock+0x20/0xa0 kernel/locking/semaphore.c:139
 __down_trylock_console_sem+0x109/0x250 kernel/printk/printk.c:323
 console_trylock kernel/printk/printk.c:2754 [inline]
 console_trylock_spinning kernel/printk/printk.c:1958 [inline]
 vprintk_emit+0x2aa/0x7c0 kernel/printk/printk.c:2347
 _printk+0xd5/0x120 kernel/printk/printk.c:2373
 __report_bug lib/bug.c:195 [inline]
 report_bug+0x346/0x500 lib/bug.c:219
 handle_bug+0x60/0x90 arch/x86/kernel/traps.c:285
 exc_invalid_op+0x1a/0x50 arch/x86/kernel/traps.c:309
 asm_exc_invalid_op+0x1a/0x20 arch/x86/include/asm/idtentry.h:621
RIP: 0010:lockdep_assert_rq_held kernel/sched/sched.h:1476 [inline]
RIP: 0010:rq_clock kernel/sched/sched.h:1624 [inline]
RIP: 0010:replenish_dl_new_period kernel/sched/deadline.c:777 [inline]
RIP: 0010:update_curr_dl_se+0x66f/0x920 kernel/sched/deadline.c:1511
Code: b5 50 fe ff ff 4c 89 ff ba 20 00 00 00 e8 e9 4f 00 00 e9 58 fe ff ff 4c 89 ef be 20 00 00 00 e8 b7 13 00 00 e9 46 fe ff ff 90 <0f> 0b 90 e9 be fb ff ff 89 f1 80 e1 07 38 c1 0f 8c b5 f9 ff ff 48
RSP: 0018:ffffc90004faf668 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff8880b903ea40 RCX: 0000000000000003
RDX: dffffc0000000000 RSI: ffffffff8c0adfc0 RDI: ffffffff8c609dc0
RBP: 0000000000000031 R08: ffff8880b902c883 R09: 1ffff11017205910
R10: dffffc0000000000 R11: ffffed1017205911 R12: ffff8880b903f468
R13: ffff8880b903f428 R14: 1ffff11017207e8f R15: ffff8880b903f858
 update_curr+0x575/0xb20 kernel/sched/fair.c:1176
 put_prev_entity+0x3d/0x210 kernel/sched/fair.c:5505
 put_prev_task_fair+0x4d/0x80 kernel/sched/fair.c:8686
 put_prev_task kernel/sched/sched.h:2423 [inline]
 put_prev_task_balance+0x11d/0x190 kernel/sched/core.c:5886
 __pick_next_task+0xc6/0x2f0 kernel/sched/core.c:5946
 pick_next_task kernel/sched/core.c:6012 [inline]
 __schedule+0x725/0x4ad0 kernel/sched/core.c:6594
 preempt_schedule_irq+0xfb/0x1c0 kernel/sched/core.c:6961
 irqentry_exit+0x5e/0x90 kernel/entry/common.c:354
 asm_sysvec_reschedule_ipi+0x1a/0x20 arch/x86/include/asm/idtentry.h:707
RIP: 0010:lock_acquire+0x264/0x550 kernel/locking/lockdep.c:5766
Code: 2b 00 74 08 4c 89 f7 e8 2a 39 8c 00 f6 44 24 61 02 0f 85 85 01 00 00 41 f7 c7 00 02 00 00 74 01 fb 48 c7 44 24 40 0e 36 e0 45 <4b> c7 44 25 00 00 00 00 00 43 c7 44 25 09 00 00 00 00 43 c7 44 25
RSP: 0018:ffffc90004fafba0 EFLAGS: 00000206
RAX: 0000000000000001 RBX: 1ffff920009f5f80 RCX: 5f14a1617595db00
RDX: dffffc0000000000 RSI: ffffffff8c0adfc0 RDI: ffffffff8c609dc0
RBP: ffffc90004fafce8 R08: ffffffff9375c837 R09: 1ffffffff26eb906
R10: dffffc0000000000 R11: fffffbfff26eb907 R12: 1ffff920009f5f7c
R13: dffffc0000000000 R14: ffffc90004fafc00 R15: 0000000000000246
 down_read_trylock+0x24f/0x3c0 kernel/locking/rwsem.c:1568
 mmap_read_trylock include/linux/mmap_lock.h:163 [inline]
 get_mmap_lock_carefully mm/memory.c:6033 [inline]
 lock_mm_and_find_vma+0x32/0x2f0 mm/memory.c:6093
 do_user_addr_fault arch/x86/mm/fault.c:1361 [inline]
 handle_page_fault arch/x86/mm/fault.c:1481 [inline]
 exc_page_fault+0x1bf/0x8c0 arch/x86/mm/fault.c:1539
 asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
RIP: 0010:__put_user_4+0x11/0x20 arch/x86/lib/putuser.S:88
Code: 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 48 89 cb 48 c1 fb 3f 48 09 d9 0f 01 cb <89> 01 31 c9 0f 01 ca c3 cc cc cc cc 0f 1f 00 90 90 90 90 90 90 90
RSP: 0018:ffffc90004faff00 EFLAGS: 00050206
RAX: 000000000000314d RBX: 0000000000000000 RCX: 00007f0b3e35e650
RDX: 0000000000000000 RSI: ffffffff8c0adfc0 RDI: ffffffff8c609dc0
RBP: ffff8880217fa490 R08: ffffffff9018b8af R09: 1ffffffff2031715
R10: dffffc0000000000 R11: fffffbfff2031716 R12: 0000000000000000
R13: 0000000000000000 R14: 000000000000314d R15: dffffc0000000000
 schedule_tail+0x96/0xb0 kernel/sched/core.c:5205
 ret_from_fork+0x24/0x80 arch/x86/kernel/process.c:143
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>
WARNING: CPU: 1 PID: 12621 at kernel/sched/sched.h:1476 lockdep_assert_rq_held kernel/sched/sched.h:1476 [inline]
WARNING: CPU: 1 PID: 12621 at kernel/sched/sched.h:1476 rq_clock kernel/sched/sched.h:1624 [inline]
WARNING: CPU: 1 PID: 12621 at kernel/sched/sched.h:1476 replenish_dl_new_period kernel/sched/deadline.c:777 [inline]
WARNING: CPU: 1 PID: 12621 at kernel/sched/sched.h:1476 update_curr_dl_se+0x66f/0x920 kernel/sched/deadline.c:1511
Modules linked in:
CPU: 1 UID: 0 PID: 12621 Comm: dhcpcd-run-hook Not tainted 6.11.0-rc3-next-20240815-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/27/2024
RIP: 0010:lockdep_assert_rq_held kernel/sched/sched.h:1476 [inline]
RIP: 0010:rq_clock kernel/sched/sched.h:1624 [inline]
RIP: 0010:replenish_dl_new_period kernel/sched/deadline.c:777 [inline]
RIP: 0010:update_curr_dl_se+0x66f/0x920 kernel/sched/deadline.c:1511
Code: b5 50 fe ff ff 4c 89 ff ba 20 00 00 00 e8 e9 4f 00 00 e9 58 fe ff ff 4c 89 ef be 20 00 00 00 e8 b7 13 00 00 e9 46 fe ff ff 90 <0f> 0b 90 e9 be fb ff ff 89 f1 80 e1 07 38 c1 0f 8c b5 f9 ff ff 48
RSP: 0018:ffffc90004faf668 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff8880b903ea40 RCX: 0000000000000003
RDX: dffffc0000000000 RSI: ffffffff8c0adfc0 RDI: ffffffff8c609dc0
RBP: 0000000000000031 R08: ffff8880b902c883 R09: 1ffff11017205910
R10: dffffc0000000000 R11: ffffed1017205911 R12: ffff8880b903f468
R13: ffff8880b903f428 R14: 1ffff11017207e8f R15: ffff8880b903f858
FS:  00007f0b3e35e380(0000) GS:ffff8880b9100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0b3e35e650 CR3: 000000006f33e000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 update_curr+0x575/0xb20 kernel/sched/fair.c:1176
 put_prev_entity+0x3d/0x210 kernel/sched/fair.c:5505
 put_prev_task_fair+0x4d/0x80 kernel/sched/fair.c:8686
 put_prev_task kernel/sched/sched.h:2423 [inline]
 put_prev_task_balance+0x11d/0x190 kernel/sched/core.c:5886
 __pick_next_task+0xc6/0x2f0 kernel/sched/core.c:5946
 pick_next_task kernel/sched/core.c:6012 [inline]
 __schedule+0x725/0x4ad0 kernel/sched/core.c:6594
 preempt_schedule_irq+0xfb/0x1c0 kernel/sched/core.c:6961
 irqentry_exit+0x5e/0x90 kernel/entry/common.c:354
 asm_sysvec_reschedule_ipi+0x1a/0x20 arch/x86/include/asm/idtentry.h:707
RIP: 0010:lock_acquire+0x264/0x550 kernel/locking/lockdep.c:5766
Code: 2b 00 74 08 4c 89 f7 e8 2a 39 8c 00 f6 44 24 61 02 0f 85 85 01 00 00 41 f7 c7 00 02 00 00 74 01 fb 48 c7 44 24 40 0e 36 e0 45 <4b> c7 44 25 00 00 00 00 00 43 c7 44 25 09 00 00 00 00 43 c7 44 25
RSP: 0018:ffffc90004fafba0 EFLAGS: 00000206
RAX: 0000000000000001 RBX: 1ffff920009f5f80 RCX: 5f14a1617595db00
RDX: dffffc0000000000 RSI: ffffffff8c0adfc0 RDI: ffffffff8c609dc0
RBP: ffffc90004fafce8 R08: ffffffff9375c837 R09: 1ffffffff26eb906
R10: dffffc0000000000 R11: fffffbfff26eb907 R12: 1ffff920009f5f7c
R13: dffffc0000000000 R14: ffffc90004fafc00 R15: 0000000000000246
 down_read_trylock+0x24f/0x3c0 kernel/locking/rwsem.c:1568
 mmap_read_trylock include/linux/mmap_lock.h:163 [inline]
 get_mmap_lock_carefully mm/memory.c:6033 [inline]
 lock_mm_and_find_vma+0x32/0x2f0 mm/memory.c:6093
 do_user_addr_fault arch/x86/mm/fault.c:1361 [inline]
 handle_page_fault arch/x86/mm/fault.c:1481 [inline]
 exc_page_fault+0x1bf/0x8c0 arch/x86/mm/fault.c:1539
 asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
RIP: 0010:__put_user_4+0x11/0x20 arch/x86/lib/putuser.S:88
Code: 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 48 89 cb 48 c1 fb 3f 48 09 d9 0f 01 cb <89> 01 31 c9 0f 01 ca c3 cc cc cc cc 0f 1f 00 90 90 90 90 90 90 90
RSP: 0018:ffffc90004faff00 EFLAGS: 00050206
RAX: 000000000000314d RBX: 0000000000000000 RCX: 00007f0b3e35e650
RDX: 0000000000000000 RSI: ffffffff8c0adfc0 RDI: ffffffff8c609dc0
RBP: ffff8880217fa490 R08: ffffffff9018b8af R09: 1ffffffff2031715
R10: dffffc0000000000 R11: fffffbfff2031716 R12: 0000000000000000
R13: 0000000000000000 R14: 000000000000314d R15: dffffc0000000000
 schedule_tail+0x96/0xb0 kernel/sched/core.c:5205
 ret_from_fork+0x24/0x80 arch/x86/kernel/process.c:143
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>
----------------
Code disassembly (best guess):
   0:	2b 00                	sub    (%rax),%eax
   2:	74 08                	je     0xc
   4:	4c 89 f7             	mov    %r14,%rdi
   7:	e8 2a 39 8c 00       	call   0x8c3936
   c:	f6 44 24 61 02       	testb  $0x2,0x61(%rsp)
  11:	0f 85 85 01 00 00    	jne    0x19c
  17:	41 f7 c7 00 02 00 00 	test   $0x200,%r15d
  1e:	74 01                	je     0x21
  20:	fb                   	sti
  21:	48 c7 44 24 40 0e 36 	movq   $0x45e0360e,0x40(%rsp)
  28:	e0 45
* 2a:	4b c7 44 25 00 00 00 	movq   $0x0,0x0(%r13,%r12,1) <-- trapping instruction
  31:	00 00
  33:	43 c7 44 25 09 00 00 	movl   $0x0,0x9(%r13,%r12,1)
  3a:	00 00
  3c:	43                   	rex.XB
  3d:	c7                   	.byte 0xc7
  3e:	44                   	rex.R
  3f:	25                   	.byte 0x25


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* Re: [syzbot] [cgroups?] possible deadlock in task_rq_lock
  2024-08-16  5:50 [syzbot] [cgroups?] possible deadlock in task_rq_lock syzbot
@ 2024-08-17  6:52 ` syzbot
  2024-08-17  8:31   ` Hillf Danton
                     ` (2 more replies)
  2024-08-18  7:05 ` syzbot
  1 sibling, 3 replies; 8+ messages in thread
From: syzbot @ 2024-08-17  6:52 UTC (permalink / raw)
  To: cgroups, hannes, linux-kernel, lizefan.x, mkoutny, syzkaller-bugs,
	tj

syzbot has found a reproducer for the following issue on:

HEAD commit:    367b5c3d53e5 Add linux-next specific files for 20240816
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=147f345b980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=61ba6f3b22ee5467
dashboard link: https://syzkaller.appspot.com/bug?extid=ca14b36a46a8c541b509
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13d6dbf3980000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=142413c5980000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/0b1b4e3cad3c/disk-367b5c3d.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/5bb090f7813c/vmlinux-367b5c3d.xz
kernel image: https://storage.googleapis.com/syzbot-assets/6674cb0709b1/bzImage-367b5c3d.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+ca14b36a46a8c541b509@syzkaller.appspotmail.com

------------[ cut here ]------------
======================================================
WARNING: possible circular locking dependency detected
6.11.0-rc3-next-20240816-syzkaller #0 Not tainted
------------------------------------------------------
kworker/u8:7/5301 is trying to acquire lock:
ffffffff8e815038 ((console_sem).lock){-...}-{2:2}, at: down_trylock+0x20/0xa0 kernel/locking/semaphore.c:139

but task is already holding lock:
ffff8880b913ea58 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x2a/0x140 kernel/sched/core.c:587

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&rq->__lock){-.-.}-{2:2}:
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5762
       _raw_spin_lock_nested+0x31/0x40 kernel/locking/spinlock.c:378
       raw_spin_rq_lock_nested+0x2a/0x140 kernel/sched/core.c:587
       raw_spin_rq_lock kernel/sched/sched.h:1485 [inline]
       task_rq_lock+0xc6/0x360 kernel/sched/core.c:689
       cgroup_move_task+0x92/0x2d0 kernel/sched/psi.c:1161
       css_set_move_task+0x72e/0x950 kernel/cgroup/cgroup.c:898
       cgroup_post_fork+0x256/0x880 kernel/cgroup/cgroup.c:6690
       copy_process+0x3ab1/0x3e30 kernel/fork.c:2620
       kernel_clone+0x226/0x8f0 kernel/fork.c:2806
       user_mode_thread+0x132/0x1a0 kernel/fork.c:2884
       rest_init+0x23/0x300 init/main.c:712
       start_kernel+0x47a/0x500 init/main.c:1103
       x86_64_start_reservations+0x2a/0x30 arch/x86/kernel/head64.c:507
       x86_64_start_kernel+0x9f/0xa0 arch/x86/kernel/head64.c:488
       common_startup_64+0x13e/0x147

-> #1 (&p->pi_lock){-.-.}-{2:2}:
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5762
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
       class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline]
       try_to_wake_up+0xb0/0x1470 kernel/sched/core.c:4113
       up+0x72/0x90 kernel/locking/semaphore.c:191
       __up_console_sem kernel/printk/printk.c:340 [inline]
       __console_unlock kernel/printk/printk.c:2801 [inline]
       console_unlock+0x22f/0x4d0 kernel/printk/printk.c:3120
       vprintk_emit+0x5dc/0x7c0 kernel/printk/printk.c:2348
       dev_vprintk_emit+0x2ae/0x330 drivers/base/core.c:4921
       dev_printk_emit+0xdd/0x120 drivers/base/core.c:4932
       _dev_warn+0x122/0x170 drivers/base/core.c:4988
       _request_firmware+0xd2c/0x12b0 drivers/base/firmware_loader/main.c:910
       request_firmware_work_func+0x12a/0x280 drivers/base/firmware_loader/main.c:1165
       process_one_work kernel/workqueue.c:3232 [inline]
       process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3313
       worker_thread+0x86d/0xd10 kernel/workqueue.c:3390
       kthread+0x2f0/0x390 kernel/kthread.c:389
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

-> #0 ((console_sem).lock){-...}-{2:2}:
       check_prev_add kernel/locking/lockdep.c:3136 [inline]
       check_prevs_add kernel/locking/lockdep.c:3255 [inline]
       validate_chain+0x18e0/0x5900 kernel/locking/lockdep.c:3871
       __lock_acquire+0x137a/0x2040 kernel/locking/lockdep.c:5145
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5762
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
       down_trylock+0x20/0xa0 kernel/locking/semaphore.c:139
       __down_trylock_console_sem+0x109/0x250 kernel/printk/printk.c:323
       console_trylock kernel/printk/printk.c:2754 [inline]
       console_trylock_spinning kernel/printk/printk.c:1958 [inline]
       vprintk_emit+0x2aa/0x7c0 kernel/printk/printk.c:2347
       _printk+0xd5/0x120 kernel/printk/printk.c:2373
       __report_bug lib/bug.c:195 [inline]
       report_bug+0x346/0x500 lib/bug.c:219
       handle_bug+0x60/0x90 arch/x86/kernel/traps.c:285
       exc_invalid_op+0x1a/0x50 arch/x86/kernel/traps.c:309
       asm_exc_invalid_op+0x1a/0x20 arch/x86/include/asm/idtentry.h:621
       lockdep_assert_rq_held kernel/sched/sched.h:1476 [inline]
       rq_clock kernel/sched/sched.h:1624 [inline]
       replenish_dl_new_period kernel/sched/deadline.c:777 [inline]
       update_curr_dl_se+0x66f/0x920 kernel/sched/deadline.c:1511
       update_curr+0x575/0xb20 kernel/sched/fair.c:1176
       put_prev_entity+0x3d/0x210 kernel/sched/fair.c:5505
       put_prev_task_fair+0x4d/0x80 kernel/sched/fair.c:8686
       put_prev_task kernel/sched/sched.h:2423 [inline]
       put_prev_task_balance+0x11d/0x190 kernel/sched/core.c:5886
       __pick_next_task+0xc6/0x2f0 kernel/sched/core.c:5946
       pick_next_task kernel/sched/core.c:6012 [inline]
       __schedule+0x725/0x4ad0 kernel/sched/core.c:6594
       preempt_schedule_common+0x84/0xd0 kernel/sched/core.c:6818
       preempt_schedule+0xe1/0xf0 kernel/sched/core.c:6842
       preempt_schedule_thunk+0x1a/0x30 arch/x86/entry/thunk.S:12
       __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
       _raw_spin_unlock_irqrestore+0x130/0x140 kernel/locking/spinlock.c:194
       task_rq_unlock kernel/sched/sched.h:1759 [inline]
       __sched_setscheduler+0xf35/0x1ba0 kernel/sched/syscalls.c:858
       _sched_setscheduler kernel/sched/syscalls.c:880 [inline]
       sched_setscheduler_nocheck+0x190/0x2e0 kernel/sched/syscalls.c:927
       kthread+0x1aa/0x390 kernel/kthread.c:370
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

other info that might help us debug this:

Chain exists of:
  (console_sem).lock --> &p->pi_lock --> &rq->__lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&rq->__lock);
                               lock(&p->pi_lock);
                               lock(&rq->__lock);
  lock((console_sem).lock);

 *** DEADLOCK ***

1 lock held by kworker/u8:7/5301:
 #0: ffff8880b913ea58 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x2a/0x140 kernel/sched/core.c:587

stack backtrace:
CPU: 1 UID: 0 PID: 5301 Comm: kworker/u8:7 Not tainted 6.11.0-rc3-next-20240816-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
 check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2189
 check_prev_add kernel/locking/lockdep.c:3136 [inline]
 check_prevs_add kernel/locking/lockdep.c:3255 [inline]
 validate_chain+0x18e0/0x5900 kernel/locking/lockdep.c:3871
 __lock_acquire+0x137a/0x2040 kernel/locking/lockdep.c:5145
 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5762
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
 _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
 down_trylock+0x20/0xa0 kernel/locking/semaphore.c:139
 __down_trylock_console_sem+0x109/0x250 kernel/printk/printk.c:323
 console_trylock kernel/printk/printk.c:2754 [inline]
 console_trylock_spinning kernel/printk/printk.c:1958 [inline]
 vprintk_emit+0x2aa/0x7c0 kernel/printk/printk.c:2347
 _printk+0xd5/0x120 kernel/printk/printk.c:2373
 __report_bug lib/bug.c:195 [inline]
 report_bug+0x346/0x500 lib/bug.c:219
 handle_bug+0x60/0x90 arch/x86/kernel/traps.c:285
 exc_invalid_op+0x1a/0x50 arch/x86/kernel/traps.c:309
 asm_exc_invalid_op+0x1a/0x20 arch/x86/include/asm/idtentry.h:621
RIP: 0010:lockdep_assert_rq_held kernel/sched/sched.h:1476 [inline]
RIP: 0010:rq_clock kernel/sched/sched.h:1624 [inline]
RIP: 0010:replenish_dl_new_period kernel/sched/deadline.c:777 [inline]
RIP: 0010:update_curr_dl_se+0x66f/0x920 kernel/sched/deadline.c:1511
Code: b5 50 fe ff ff 4c 89 ff ba 20 00 00 00 e8 e9 4f 00 00 e9 58 fe ff ff 4c 89 ef be 20 00 00 00 e8 b7 13 00 00 e9 46 fe ff ff 90 <0f> 0b 90 e9 be fb ff ff 89 f1 80 e1 07 38 c1 0f 8c b5 f9 ff ff 48
RSP: 0018:ffffc9000417f6c8 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff8880b903ea40 RCX: 0000000000000003
RDX: dffffc0000000000 RSI: ffffffff8c0adfc0 RDI: ffffffff8c60a8c0
RBP: 0000000000000031 R08: ffff8880b902c883 R09: 1ffff11017205910
R10: dffffc0000000000 R11: ffffed1017205911 R12: ffff8880b903f468
R13: ffff8880b903f428 R14: 1ffff11017207e8f R15: ffff8880b903f858
 update_curr+0x575/0xb20 kernel/sched/fair.c:1176
 put_prev_entity+0x3d/0x210 kernel/sched/fair.c:5505
 put_prev_task_fair+0x4d/0x80 kernel/sched/fair.c:8686
 put_prev_task kernel/sched/sched.h:2423 [inline]
 put_prev_task_balance+0x11d/0x190 kernel/sched/core.c:5886
 __pick_next_task+0xc6/0x2f0 kernel/sched/core.c:5946
 pick_next_task kernel/sched/core.c:6012 [inline]
 __schedule+0x725/0x4ad0 kernel/sched/core.c:6594
 preempt_schedule_common+0x84/0xd0 kernel/sched/core.c:6818
 preempt_schedule+0xe1/0xf0 kernel/sched/core.c:6842
 preempt_schedule_thunk+0x1a/0x30 arch/x86/entry/thunk.S:12
 __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
 _raw_spin_unlock_irqrestore+0x130/0x140 kernel/locking/spinlock.c:194
 task_rq_unlock kernel/sched/sched.h:1759 [inline]
 __sched_setscheduler+0xf35/0x1ba0 kernel/sched/syscalls.c:858
 _sched_setscheduler kernel/sched/syscalls.c:880 [inline]
 sched_setscheduler_nocheck+0x190/0x2e0 kernel/sched/syscalls.c:927
 kthread+0x1aa/0x390 kernel/kthread.c:370
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>
WARNING: CPU: 1 PID: 5301 at kernel/sched/sched.h:1476 lockdep_assert_rq_held kernel/sched/sched.h:1476 [inline]
WARNING: CPU: 1 PID: 5301 at kernel/sched/sched.h:1476 rq_clock kernel/sched/sched.h:1624 [inline]
WARNING: CPU: 1 PID: 5301 at kernel/sched/sched.h:1476 replenish_dl_new_period kernel/sched/deadline.c:777 [inline]
WARNING: CPU: 1 PID: 5301 at kernel/sched/sched.h:1476 update_curr_dl_se+0x66f/0x920 kernel/sched/deadline.c:1511
Modules linked in:
CPU: 1 UID: 0 PID: 5301 Comm: kworker/u8:7 Not tainted 6.11.0-rc3-next-20240816-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
RIP: 0010:lockdep_assert_rq_held kernel/sched/sched.h:1476 [inline]
RIP: 0010:rq_clock kernel/sched/sched.h:1624 [inline]
RIP: 0010:replenish_dl_new_period kernel/sched/deadline.c:777 [inline]
RIP: 0010:update_curr_dl_se+0x66f/0x920 kernel/sched/deadline.c:1511
Code: b5 50 fe ff ff 4c 89 ff ba 20 00 00 00 e8 e9 4f 00 00 e9 58 fe ff ff 4c 89 ef be 20 00 00 00 e8 b7 13 00 00 e9 46 fe ff ff 90 <0f> 0b 90 e9 be fb ff ff 89 f1 80 e1 07 38 c1 0f 8c b5 f9 ff ff 48
RSP: 0018:ffffc9000417f6c8 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff8880b903ea40 RCX: 0000000000000003
RDX: dffffc0000000000 RSI: ffffffff8c0adfc0 RDI: ffffffff8c60a8c0
RBP: 0000000000000031 R08: ffff8880b902c883 R09: 1ffff11017205910
R10: dffffc0000000000 R11: ffffed1017205911 R12: ffff8880b903f468
R13: ffff8880b903f428 R14: 1ffff11017207e8f R15: ffff8880b903f858
FS:  0000000000000000(0000) GS:ffff8880b9100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f69bb64cd58 CR3: 0000000078782000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 update_curr+0x575/0xb20 kernel/sched/fair.c:1176
 put_prev_entity+0x3d/0x210 kernel/sched/fair.c:5505
 put_prev_task_fair+0x4d/0x80 kernel/sched/fair.c:8686
 put_prev_task kernel/sched/sched.h:2423 [inline]
 put_prev_task_balance+0x11d/0x190 kernel/sched/core.c:5886
 __pick_next_task+0xc6/0x2f0 kernel/sched/core.c:5946
 pick_next_task kernel/sched/core.c:6012 [inline]
 __schedule+0x725/0x4ad0 kernel/sched/core.c:6594
 preempt_schedule_common+0x84/0xd0 kernel/sched/core.c:6818
 preempt_schedule+0xe1/0xf0 kernel/sched/core.c:6842
 preempt_schedule_thunk+0x1a/0x30 arch/x86/entry/thunk.S:12
 __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
 _raw_spin_unlock_irqrestore+0x130/0x140 kernel/locking/spinlock.c:194
 task_rq_unlock kernel/sched/sched.h:1759 [inline]
 __sched_setscheduler+0xf35/0x1ba0 kernel/sched/syscalls.c:858
 _sched_setscheduler kernel/sched/syscalls.c:880 [inline]
 sched_setscheduler_nocheck+0x190/0x2e0 kernel/sched/syscalls.c:927
 kthread+0x1aa/0x390 kernel/kthread.c:370
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [syzbot] [cgroups?] possible deadlock in task_rq_lock
  2024-08-17  6:52 ` syzbot
@ 2024-08-17  8:31   ` Hillf Danton
  2024-08-17 10:09     ` syzbot
  2024-08-17 11:57   ` Hillf Danton
  2024-08-17 22:53   ` Hillf Danton
  2 siblings, 1 reply; 8+ messages in thread
From: Hillf Danton @ 2024-08-17  8:31 UTC (permalink / raw)
  To: syzbot; +Cc: linux-kernel, syzkaller-bugs

On Fri, 16 Aug 2024 23:52:22 -0700
> syzbot has found a reproducer for the following issue on:
> 
> HEAD commit:    367b5c3d53e5 Add linux-next specific files for 20240816
> git tree:       linux-next
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=142413c5980000

#syz test linux-next  367b5c3d53e5

--- x/kernel/sched/deadline.c
+++ y/kernel/sched/deadline.c
@@ -1498,6 +1498,9 @@ static void update_curr_dl_se(struct rq
 	 * starting a new period, pushing the activation.
 	 */
 	if (dl_se->dl_defer && dl_se->dl_throttled && dl_runtime_exceeded(dl_se)) {
+		bool lock = rq != dl_se->rq;
+		struct rq_flags rf;
+		struct rq *__rq = dl_se->rq;
 		/*
 		 * If the server was previously activated - the starving condition
 		 * took place, it this point it went away because the fair scheduler
@@ -1508,7 +1511,11 @@ static void update_curr_dl_se(struct rq
 
 		hrtimer_try_to_cancel(&dl_se->dl_timer);
 
+		if (lock)
+			rq_lock(__rq, &rf);
 		replenish_dl_new_period(dl_se, dl_se->rq);
+		if (lock)
+			rq_unlock(__rq, &rf);
 
 		/*
 		 * Not being able to start the timer seems problematic. If it could not
--


* Re: [syzbot] [cgroups?] possible deadlock in task_rq_lock
  2024-08-17  8:31   ` Hillf Danton
@ 2024-08-17 10:09     ` syzbot
  0 siblings, 0 replies; 8+ messages in thread
From: syzbot @ 2024-08-17 10:09 UTC (permalink / raw)
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
general protection fault in __alloc_workqueue

workqueue: Failed to create a rescuer kthread for wq "wg-crypt-\x18": -EINTR
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 UID: 0 PID: 6468 Comm: syz-executor Not tainted 6.11.0-rc3-next-20240816-syzkaller-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
RIP: 0010:__lock_acquire+0x69/0x2040 kernel/locking/lockdep.c:5010
Code: b6 04 30 84 c0 0f 85 87 16 00 00 45 31 f6 83 3d 48 05 a9 0e 00 0f 84 ac 13 00 00 89 54 24 54 89 5c 24 68 4c 89 f8 48 c1 e8 03 <80> 3c 30 00 74 12 4c 89 ff e8 49 5c 8c 00 48 be 00 00 00 00 00 fc
RSP: 0018:ffffc9000438ec30 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: dffffc0000000000 R11: fffffbfff20318b6 R12: ffff88802b189e00
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
FS:  000055556e398500(0000) GS:ffff8880b9000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005555911bd808 CR3: 00000000702f8000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5762
 touch_wq_lockdep_map kernel/workqueue.c:3876 [inline]
 __flush_workqueue+0x1e3/0x1770 kernel/workqueue.c:3918
 drain_workqueue+0xc9/0x3a0 kernel/workqueue.c:4082
 destroy_workqueue+0xba/0xc40 kernel/workqueue.c:5830
 __alloc_workqueue+0x1c30/0x1fb0 kernel/workqueue.c:5745
 alloc_workqueue+0xd6/0x210 kernel/workqueue.c:5758
 wg_newlink+0x260/0x640 drivers/net/wireguard/device.c:343
 rtnl_newlink_create net/core/rtnetlink.c:3510 [inline]
 __rtnl_newlink net/core/rtnetlink.c:3730 [inline]
 rtnl_newlink+0x1591/0x20a0 net/core/rtnetlink.c:3743
 rtnetlink_rcv_msg+0x73f/0xcf0 net/core/rtnetlink.c:6647
 netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2550
 netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
 netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1357
 netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1901
 sock_sendmsg_nosec net/socket.c:730 [inline]
 __sock_sendmsg+0x221/0x270 net/socket.c:745
 __sys_sendto+0x3a8/0x500 net/socket.c:2204
 __do_sys_sendto net/socket.c:2216 [inline]
 __se_sys_sendto net/socket.c:2212 [inline]
 __x64_sys_sendto+0xde/0x100 net/socket.c:2212
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f52c6b7bd0c
Code: 2a 5a 02 00 44 8b 4c 24 2c 4c 8b 44 24 20 89 c5 44 8b 54 24 28 48 8b 54 24 18 b8 2c 00 00 00 48 8b 74 24 10 8b 7c 24 08 0f 05 <48> 3d 00 f0 ff ff 77 34 89 ef 48 89 44 24 08 e8 70 5a 02 00 48 8b
RSP: 002b:00007ffdb7bbca90 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007f52c7844620 RCX: 00007f52c6b7bd0c
RDX: 000000000000003c RSI: 00007f52c7844670 RDI: 0000000000000003
RBP: 0000000000000000 R08: 00007ffdb7bbcae4 R09: 000000000000000c
R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000003
R13: 0000000000000000 R14: 00007f52c7844670 R15: 0000000000000000
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__lock_acquire+0x69/0x2040 kernel/locking/lockdep.c:5010
Code: b6 04 30 84 c0 0f 85 87 16 00 00 45 31 f6 83 3d 48 05 a9 0e 00 0f 84 ac 13 00 00 89 54 24 54 89 5c 24 68 4c 89 f8 48 c1 e8 03 <80> 3c 30 00 74 12 4c 89 ff e8 49 5c 8c 00 48 be 00 00 00 00 00 fc
RSP: 0018:ffffc9000438ec30 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: dffffc0000000000 R11: fffffbfff20318b6 R12: ffff88802b189e00
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
FS:  000055556e398500(0000) GS:ffff8880b9000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005555911bd808 CR3: 00000000702f8000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
   0:	b6 04                	mov    $0x4,%dh
   2:	30 84 c0 0f 85 87 16 	xor    %al,0x1687850f(%rax,%rax,8)
   9:	00 00                	add    %al,(%rax)
   b:	45 31 f6             	xor    %r14d,%r14d
   e:	83 3d 48 05 a9 0e 00 	cmpl   $0x0,0xea90548(%rip)        # 0xea9055d
  15:	0f 84 ac 13 00 00    	je     0x13c7
  1b:	89 54 24 54          	mov    %edx,0x54(%rsp)
  1f:	89 5c 24 68          	mov    %ebx,0x68(%rsp)
  23:	4c 89 f8             	mov    %r15,%rax
  26:	48 c1 e8 03          	shr    $0x3,%rax
* 2a:	80 3c 30 00          	cmpb   $0x0,(%rax,%rsi,1) <-- trapping instruction
  2e:	74 12                	je     0x42
  30:	4c 89 ff             	mov    %r15,%rdi
  33:	e8 49 5c 8c 00       	call   0x8c5c81
  38:	48                   	rex.W
  39:	be 00 00 00 00       	mov    $0x0,%esi
  3e:	00 fc                	add    %bh,%ah


Tested on:

commit:         367b5c3d Add linux-next specific files for 20240816
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=1191c45b980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=61ba6f3b22ee5467
dashboard link: https://syzkaller.appspot.com/bug?extid=ca14b36a46a8c541b509
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=16a7edf5980000



* Re: [syzbot] [cgroups?] possible deadlock in task_rq_lock
  2024-08-17  6:52 ` syzbot
  2024-08-17  8:31   ` Hillf Danton
@ 2024-08-17 11:57   ` Hillf Danton
  2024-08-17 12:23     ` syzbot
  2024-08-17 22:53   ` Hillf Danton
  2 siblings, 1 reply; 8+ messages in thread
From: Hillf Danton @ 2024-08-17 11:57 UTC (permalink / raw)
  To: syzbot; +Cc: linux-kernel, syzkaller-bugs

On Fri, 16 Aug 2024 23:52:22 -0700
> syzbot has found a reproducer for the following issue on:
> 
> HEAD commit:    367b5c3d53e5 Add linux-next specific files for 20240816
> git tree:       linux-next
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=142413c5980000

#syz test linux-next  367b5c3d53e5

--- x/kernel/sched/deadline.c
+++ y/kernel/sched/deadline.c
@@ -1498,6 +1498,9 @@ static void update_curr_dl_se(struct rq
 	 * starting a new period, pushing the activation.
 	 */
 	if (dl_se->dl_defer && dl_se->dl_throttled && dl_runtime_exceeded(dl_se)) {
+		bool lock = rq != dl_se->rq;
+		struct rq_flags rf;
+		struct rq *__rq = dl_se->rq;
 		/*
 		 * If the server was previously activated - the starving condition
 		 * took place, it this point it went away because the fair scheduler
@@ -1508,7 +1511,11 @@ static void update_curr_dl_se(struct rq
 
 		hrtimer_try_to_cancel(&dl_se->dl_timer);
 
+		if (lock)
+			rq_lock(__rq, &rf);
 		replenish_dl_new_period(dl_se, dl_se->rq);
+		if (lock)
+			rq_unlock(__rq, &rf);
 
 		/*
 		 * Not being able to start the timer seems problematic. If it could not
--- x/kernel/workqueue.c
+++ y/kernel/workqueue.c
@@ -5653,6 +5653,7 @@ static struct workqueue_struct *__alloc_
 	wq = kzalloc(wq_size, GFP_KERNEL);
 	if (!wq)
 		return NULL;
+	wq_init_lockdep(wq);
 
 	if (flags & WQ_UNBOUND) {
 		wq->unbound_attrs = alloc_workqueue_attrs();
@@ -5757,10 +5758,6 @@ struct workqueue_struct *alloc_workqueue
 	va_start(args, max_active);
 	wq = __alloc_workqueue(fmt, flags, max_active, args);
 	va_end(args);
-	if (!wq)
-		return NULL;
-
-	wq_init_lockdep(wq);
 
 	return wq;
 }
--


* Re: [syzbot] [cgroups?] possible deadlock in task_rq_lock
  2024-08-17 11:57   ` Hillf Danton
@ 2024-08-17 12:23     ` syzbot
  0 siblings, 0 replies; 8+ messages in thread
From: syzbot @ 2024-08-17 12:23 UTC (permalink / raw)
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+ca14b36a46a8c541b509@syzkaller.appspotmail.com
Tested-by: syzbot+ca14b36a46a8c541b509@syzkaller.appspotmail.com

Tested on:

commit:         367b5c3d Add linux-next specific files for 20240816
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=16b7edf5980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=61ba6f3b22ee5467
dashboard link: https://syzkaller.appspot.com/bug?extid=ca14b36a46a8c541b509
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=16eaba05980000

Note: testing is done by a robot and is best-effort only.


* Re: [syzbot] [cgroups?] possible deadlock in task_rq_lock
  2024-08-17  6:52 ` syzbot
  2024-08-17  8:31   ` Hillf Danton
  2024-08-17 11:57   ` Hillf Danton
@ 2024-08-17 22:53   ` Hillf Danton
  2 siblings, 0 replies; 8+ messages in thread
From: Hillf Danton @ 2024-08-17 22:53 UTC (permalink / raw)
  To: syzbot
  Cc: linux-kernel, syzkaller-bugs, Daniel Bristot de Oliveira,
	Peter Zijlstra, vincent.guittot

On Fri, 16 Aug 2024 23:52:22 -0700
>
>-> #0 ((console_sem).lock){-.-.}-{2:2}:
>       check_prev_add kernel/locking/lockdep.c:3136 [inline]
>       check_prevs_add kernel/locking/lockdep.c:3255 [inline]
>       validate_chain+0x18e0/0x5900 kernel/locking/lockdep.c:3871
>       __lock_acquire+0x137a/0x2040 kernel/locking/lockdep.c:5145
>       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5762
>       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>       _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
>       down_trylock+0x20/0xa0 kernel/locking/semaphore.c:139
>       __down_trylock_console_sem+0x109/0x250 kernel/printk/printk.c:323
>       console_trylock kernel/printk/printk.c:2754 [inline]
>       console_trylock_spinning kernel/printk/printk.c:1958 [inline]
>       vprintk_emit+0x2aa/0x7c0 kernel/printk/printk.c:2347
>       _printk+0xd5/0x120 kernel/printk/printk.c:2373
>       __report_bug lib/bug.c:195 [inline]
>       report_bug+0x346/0x500 lib/bug.c:219
>       handle_bug+0x60/0x90 arch/x86/kernel/traps.c:285
>       exc_invalid_op+0x1a/0x50 arch/x86/kernel/traps.c:309
>       asm_exc_invalid_op+0x1a/0x20 arch/x86/include/asm/idtentry.h:621
>       lockdep_assert_rq_held kernel/sched/sched.h:1476 [inline]

I see no way to fix the lockdep assert here without breaking the safe lock
ordering that double_rq_lock() relies on.

>       rq_clock kernel/sched/sched.h:1624 [inline]
>       replenish_dl_new_period kernel/sched/deadline.c:777 [inline]
>       update_curr_dl_se+0x66f/0x920 kernel/sched/deadline.c:1511
>       update_curr+0x575/0xb20 kernel/sched/fair.c:1176
>       put_prev_entity+0x3d/0x210 kernel/sched/fair.c:5505
>       put_prev_task_fair+0x4d/0x80 kernel/sched/fair.c:8686
>       put_prev_task kernel/sched/sched.h:2423 [inline]
>       put_prev_task_balance+0x11d/0x190 kernel/sched/core.c:5886
>       __pick_next_task+0xc6/0x2f0 kernel/sched/core.c:5946
>       pick_next_task kernel/sched/core.c:6012 [inline]
>       __schedule+0x725/0x4ad0 kernel/sched/core.c:6594
>       preempt_schedule_irq+0xfb/0x1c0 kernel/sched/core.c:6961
>       irqentry_exit+0x5e/0x90 kernel/entry/common.c:354
>       asm_sysvec_reschedule_ipi+0x1a/0x20 arch/x86/include/asm/idtentry.h:707
>       lock_acquire+0x264/0x550 kernel/locking/lockdep.c:5766
>       down_read_trylock+0x24f/0x3c0 kernel/locking/rwsem.c:1568
>       mmap_read_trylock include/linux/mmap_lock.h:163 [inline]
>       get_mmap_lock_carefully mm/memory.c:6033 [inline]
>       lock_mm_and_find_vma+0x32/0x2f0 mm/memory.c:6093
>       do_user_addr_fault arch/x86/mm/fault.c:1361 [inline]
>       handle_page_fault arch/x86/mm/fault.c:1481 [inline]
>       exc_page_fault+0x1bf/0x8c0 arch/x86/mm/fault.c:1539
>       asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
>       __put_user_4+0x11/0x20 arch/x86/lib/putuser.S:86
>       schedule_tail+0x96/0xb0 kernel/sched/core.c:5205
>       ret_from_fork+0x24/0x80 arch/x86/kernel/process.c:143
>       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244


* Re: [syzbot] [cgroups?] possible deadlock in task_rq_lock
  2024-08-16  5:50 [syzbot] [cgroups?] possible deadlock in task_rq_lock syzbot
  2024-08-17  6:52 ` syzbot
@ 2024-08-18  7:05 ` syzbot
  1 sibling, 0 replies; 8+ messages in thread
From: syzbot @ 2024-08-18  7:05 UTC (permalink / raw)
  To: bristot, cgroups, hannes, hdanton, juri.lelli, linux-kernel,
	lizefan.x, mkoutny, peterz, syzkaller-bugs, tglx, tj,
	vincent.guittot, vineeth

syzbot has bisected this issue to:

commit 5f6bd380c7bdbe10f7b4e8ddcceed60ce0714c6d
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Mon May 27 12:06:55 2024 +0000

    sched/rt: Remove default bandwidth control

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=11a668dd980000
start commit:   367b5c3d53e5 Add linux-next specific files for 20240816
git tree:       linux-next
final oops:     https://syzkaller.appspot.com/x/report.txt?x=13a668dd980000
console output: https://syzkaller.appspot.com/x/log.txt?x=15a668dd980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=61ba6f3b22ee5467
dashboard link: https://syzkaller.appspot.com/bug?extid=ca14b36a46a8c541b509
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13d6dbf3980000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=142413c5980000

Reported-by: syzbot+ca14b36a46a8c541b509@syzkaller.appspotmail.com
Fixes: 5f6bd380c7bd ("sched/rt: Remove default bandwidth control")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection


end of thread, other threads:[~2024-08-18  7:05 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-08-16  5:50 [syzbot] [cgroups?] possible deadlock in task_rq_lock syzbot
2024-08-17  6:52 ` syzbot
2024-08-17  8:31   ` Hillf Danton
2024-08-17 10:09     ` syzbot
2024-08-17 11:57   ` Hillf Danton
2024-08-17 12:23     ` syzbot
2024-08-17 22:53   ` Hillf Danton
2024-08-18  7:05 ` syzbot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox