* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
[not found] ` <30819e01-43ce-638f-0cc6-067d6a8d03c7@candelatech.com>
@ 2024-04-03 19:35 ` Ben Greear
2024-04-08 16:35 ` Johannes Berg
` (2 more replies)
0 siblings, 3 replies; 24+ messages in thread
From: Ben Greear @ 2024-04-03 19:35 UTC (permalink / raw)
To: LKML, linux-leds; +Cc: Johannes Berg
On 4/2/24 10:38, Ben Greear wrote:
> On 4/2/24 09:37, Ben Greear wrote:
>> Hello,
>>
>> Sometime between rc1 and today's rc2, my system quit booting.
>> I'm not seeing any splats, it just stops. Evidently before
>> sysrq is enabled.
>>
>> [ OK ] Started Flush Journal to Persistent Storage.
>> [ OK ] Started udev Coldplug all Devices.
>> Starting udev Wait for Complete Device Initialization...
>> [ OK ] Listening on Load/Save RF …itch Status /dev/rfkill Watch.
>> [ OK ] Created slice system-lvm2\x2dpvscan.slice.
>> Starting LVM2 PV scan on device 8:19...
>> Starting LVM2 PV scan on device 8:3...
>> [ OK ] Started Device-mapper event daemon.
>> iwlwifi 0000:04:00.0: WRT: Invalid buffer destination: 0
>> sysrq: This sysrq operation is disabled.
>>
>> I can start a bisect, but in case anyone knows the answer already, please let me know.
>>
>> Thanks,
>> Ben
>>
>
> So, deadlock I guess....
>
> INFO: task kworker/5:13:648 blocked for more than 180 seconds.
> Not tainted 6.9.0-rc2+ #23
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:kworker/5:13 state:D stack:0 pid:648 tgid:648 ppid:2 flags:0x00004000
> Workqueue: events deferred_probe_timeout_work_func
> Call Trace:
> <TASK>
> __schedule+0x43d/0xe20
> schedule+0x31/0x130
> schedule_timeout+0x1b9/0x1d0
> ? mark_held_locks+0x49/0x70
> ? lockdep_hardirqs_on_prepare+0xd6/0x170
> __wait_for_common+0xb9/0x1d0
> ? usleep_range_state+0xb0/0xb0
> ? __flush_work+0x1ff/0x460
> __flush_work+0x287/0x460
> ? flush_workqueue_prep_pwqs+0x120/0x120
> deferred_probe_timeout_work_func+0x2b/0xa0
> process_one_work+0x212/0x710
> ? lock_is_held_type+0xa5/0x110
> worker_thread+0x188/0x340
> ? rescuer_thread+0x380/0x380
> kthread+0xd7/0x110
> ? kthread_complete_and_exit+0x20/0x20
> ret_from_fork+0x28/0x40
> ? kthread_complete_and_exit+0x20/0x20
> ret_from_fork_asm+0x11/0x20
> </TASK>
> INFO: task udevadm:763 blocked for more than 180 seconds.
> Not tainted 6.9.0-rc2+ #23
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:udevadm state:D stack:0 pid:763 tgid:763 ppid:1 flags:0x00000000
> Call Trace:
> <TASK>
> __schedule+0x43d/0xe20
> schedule+0x31/0x130
> schedule_timeout+0x1b9/0x1d0
> ? __wait_for_common+0xb0/0x1d0
> ? lock_release+0xc6/0x290
> ? lockdep_hardirqs_on_prepare+0xd6/0x170
> __wait_for_common+0xb9/0x1d0
> ? usleep_range_state+0xb0/0xb0
> ? __flush_work+0x1ff/0x460
> __flush_work+0x287/0x460
> ? flush_workqueue_prep_pwqs+0x120/0x120
> fsnotify_destroy_group+0x66/0xf0
> inotify_release+0x12/0x40
> __fput+0xa6/0x2d0
> __x64_sys_close+0x33/0x70
> do_syscall_64+0x6c/0x170
> entry_SYSCALL_64_after_hwframe+0x46/0x4e
> RIP: 0033:0x7f744d5bc878
> RSP: 002b:00007ffcef12f8d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
> RAX: ffffffffffffffda RBX: 00007f744cd048c0 RCX: 00007f744d5bc878
> RDX: ffffffffffffff80 RSI: 0000000000000000 RDI: 0000000000000003
> RBP: 0000000000000003 R08: 000055f9ce349fb0 R09: 0000000000000000
> R10: 00007ffcef12f8f0 R11: 0000000000000246 R12: 0000000000000002
> R13: 0000000007270e00 R14: 000055f99670c9b8 R15: 0000000000000002
> </TASK>
> INFO: task modprobe:968 blocked for more than 180 seconds.
> Not tainted 6.9.0-rc2+ #23
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:modprobe state:D stack:0 pid:968 tgid:968 ppid:65 flags:0x00000000
> Call Trace:
> <TASK>
> __schedule+0x43d/0xe20
> schedule+0x31/0x130
> schedule_timeout+0x1b9/0x1d0
> ? __wait_for_common+0xb0/0x1d0
> ? lock_release+0xc6/0x290
> ? lockdep_hardirqs_on_prepare+0xd6/0x170
> __wait_for_common+0xb9/0x1d0
> ? usleep_range_state+0xb0/0xb0
> idempotent_init_module+0x1ae/0x290
> __x64_sys_finit_module+0x55/0xb0
> do_syscall_64+0x6c/0x170
> entry_SYSCALL_64_after_hwframe+0x46/0x4e
> RIP: 0033:0x7fde25530ddd
> RSP: 002b:00007fffac078518 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> RAX: ffffffffffffffda RBX: 0000558758e28ef0 RCX: 00007fde25530ddd
> RDX: 0000000000000000 RSI: 000055873cebf358 RDI: 0000000000000001
> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000246 R12: 000055873cebf358
> R13: 0000000000000000 R14: 0000558758e29020 R15: 0000558758e28ef0
> </TASK>
> INFO: task modprobe:969 blocked for more than 180 seconds.
> Not tainted 6.9.0-rc2+ #23
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:modprobe state:D stack:0 pid:969 tgid:969 ppid:93 flags:0x00000000
> Call Trace:
> <TASK>
> __schedule+0x43d/0xe20
> schedule+0x31/0x130
> schedule_timeout+0x1b9/0x1d0
> ? __wait_for_common+0xb0/0x1d0
> ? lock_release+0xc6/0x290
> ? lockdep_hardirqs_on_prepare+0xd6/0x170
> __wait_for_common+0xb9/0x1d0
> ? usleep_range_state+0xb0/0xb0
> idempotent_init_module+0x1ae/0x290
> __x64_sys_finit_module+0x55/0xb0
> do_syscall_64+0x6c/0x170
> entry_SYSCALL_64_after_hwframe+0x46/0x4e
> RIP: 0033:0x7f338d516ddd
> RSP: 002b:00007ffd155cd1e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> RAX: ffffffffffffffda RBX: 000056092cb0def0 RCX: 00007f338d516ddd
> RDX: 0000000000000000 RSI: 00005608ecb4a358 RDI: 0000000000000001
> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000246 R12: 00005608ecb4a358
> R13: 0000000000000000 R14: 000056092cb0e020 R15: 000056092cb0def0
> </TASK>
> INFO: task modprobe:1044 blocked for more than 180 seconds.
> Not tainted 6.9.0-rc2+ #23
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:modprobe state:D stack:0 pid:1044 tgid:1044 ppid:10 flags:0x00000000
> Call Trace:
> <TASK>
> __schedule+0x43d/0xe20
> schedule+0x31/0x130
> schedule_timeout+0x1b9/0x1d0
> ? __wait_for_common+0xb0/0x1d0
> ? lock_release+0xc6/0x290
> ? lockdep_hardirqs_on_prepare+0xd6/0x170
> __wait_for_common+0xb9/0x1d0
> ? usleep_range_state+0xb0/0xb0
> idempotent_init_module+0x1ae/0x290
> __x64_sys_finit_module+0x55/0xb0
> do_syscall_64+0x6c/0x170
> entry_SYSCALL_64_after_hwframe+0x46/0x4e
> RIP: 0033:0x7f7637b30ddd
> RSP: 002b:00007ffe6251da78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> RAX: ffffffffffffffda RBX: 000055b889cb3ef0 RCX: 00007f7637b30ddd
> RDX: 0000000000000000 RSI: 000055b854eea358 RDI: 0000000000000001
> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000246 R12: 000055b854eea358
> R13: 0000000000000000 R14: 000055b889cb4020 R15: 000055b889cb3ef0
> </TASK>
> INFO: task modprobe:1047 blocked for more than 180 seconds.
> Not tainted 6.9.0-rc2+ #23
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:modprobe state:D stack:0 pid:1047 tgid:1047 ppid:113 flags:0x00000000
> Call Trace:
> <TASK>
> __schedule+0x43d/0xe20
> schedule+0x31/0x130
> schedule_timeout+0x1b9/0x1d0
> ? __wait_for_common+0xb0/0x1d0
> ? lock_release+0xc6/0x290
> ? lockdep_hardirqs_on_prepare+0xd6/0x170
> __wait_for_common+0xb9/0x1d0
> ? usleep_range_state+0xb0/0xb0
> idempotent_init_module+0x1ae/0x290
> __x64_sys_finit_module+0x55/0xb0
> do_syscall_64+0x6c/0x170
> entry_SYSCALL_64_after_hwframe+0x46/0x4e
> RIP: 0033:0x7f3907130ddd
> RSP: 002b:00007ffc36e4eb08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> RAX: ffffffffffffffda RBX: 000056100a856ef0 RCX: 00007f3907130ddd
> RDX: 0000000000000000 RSI: 0000560fff0ec358 RDI: 0000000000000001
> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000246 R12: 0000560fff0ec358
> R13: 0000000000000000 R14: 000056100a857020 R15: 000056100a856ef0
> </TASK>
> INFO: task modprobe:1056 blocked for more than 180 seconds.
> Not tainted 6.9.0-rc2+ #23
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:modprobe state:D stack:0 pid:1056 tgid:1056 ppid:1045 flags:0x00000000
> Call Trace:
> <TASK>
> __schedule+0x43d/0xe20
> schedule+0x31/0x130
> schedule_timeout+0x1b9/0x1d0
> ? __wait_for_common+0xb0/0x1d0
> ? lock_release+0xc6/0x290
> ? lockdep_hardirqs_on_prepare+0xd6/0x170
> __wait_for_common+0xb9/0x1d0
> ? usleep_range_state+0xb0/0xb0
> idempotent_init_module+0x1ae/0x290
> __x64_sys_finit_module+0x55/0xb0
> do_syscall_64+0x6c/0x170
> entry_SYSCALL_64_after_hwframe+0x46/0x4e
> RIP: 0033:0x7fcb1e730ddd
> RSP: 002b:00007ffc692d0ad8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> RAX: ffffffffffffffda RBX: 000055f8d8828ef0 RCX: 00007fcb1e730ddd
> RDX: 0000000000000000 RSI: 000055f8bff36358 RDI: 0000000000000001
> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000246 R12: 000055f8bff36358
> R13: 0000000000000000 R14: 000055f8d8829020 R15: 000055f8d8828ef0
> </TASK>
> INFO: task modprobe:1058 blocked for more than 180 seconds.
> Not tainted 6.9.0-rc2+ #23
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:modprobe state:D stack:0 pid:1058 tgid:1058 ppid:1051 flags:0x00000000
> Call Trace:
> <TASK>
> __schedule+0x43d/0xe20
> schedule+0x31/0x130
> schedule_timeout+0x1b9/0x1d0
> ? __wait_for_common+0xb0/0x1d0
> ? lock_release+0xc6/0x290
> ? lockdep_hardirqs_on_prepare+0xd6/0x170
> __wait_for_common+0xb9/0x1d0
> ? usleep_range_state+0xb0/0xb0
> idempotent_init_module+0x1ae/0x290
> __x64_sys_finit_module+0x55/0xb0
> do_syscall_64+0x6c/0x170
> entry_SYSCALL_64_after_hwframe+0x46/0x4e
> RIP: 0033:0x7f0a17b30ddd
> RSP: 002b:00007fff56d619e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> RAX: ffffffffffffffda RBX: 000055abd6741ef0 RCX: 00007f0a17b30ddd
> RDX: 0000000000000000 RSI: 000055abc6586358 RDI: 0000000000000001
> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000246 R12: 000055abc6586358
> R13: 0000000000000000 R14: 000055abd6742020 R15: 000055abd6741ef0
> </TASK>
> INFO: task modprobe:1060 blocked for more than 181 seconds.
> Not tainted 6.9.0-rc2+ #23
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:modprobe state:D stack:0 pid:1060 tgid:1060 ppid:1057 flags:0x00000000
> Call Trace:
> <TASK>
> __schedule+0x43d/0xe20
> schedule+0x31/0x130
> schedule_timeout+0x1b9/0x1d0
> ? __wait_for_common+0xb0/0x1d0
> ? lock_release+0xc6/0x290
> ? lockdep_hardirqs_on_prepare+0xd6/0x170
> __wait_for_common+0xb9/0x1d0
> ? usleep_range_state+0xb0/0xb0
> idempotent_init_module+0x1ae/0x290
> __x64_sys_finit_module+0x55/0xb0
> do_syscall_64+0x6c/0x170
> entry_SYSCALL_64_after_hwframe+0x46/0x4e
> RIP: 0033:0x7f12c0130ddd
> RSP: 002b:00007ffccdef0488 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> RAX: ffffffffffffffda RBX: 000056249db40ef0 RCX: 00007f12c0130ddd
> RDX: 0000000000000000 RSI: 0000562471e4d358 RDI: 0000000000000001
> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000246 R12: 0000562471e4d358
> R13: 0000000000000000 R14: 000056249db41020 R15: 000056249db40ef0
> </TASK>
>
> Showing all locks held in the system:
> 2 locks held by systemd/1:
> #0: ffff88812a7a10a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x1f/0x50
> #1: ffff88812a7a1130 (&tty->atomic_write_lock){+.+.}-{4:4}, at: file_tty_write.constprop.0+0xab/0x330
> 2 locks held by kworker/0:1/9:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc900000afe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/u32:0/10:
> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc900000b7e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/3:0/37:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc900001cbe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/7:0/61:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc9000029be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/u32:1/65:
> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc900002bfe50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 1 lock held by khungtaskd/66:
> #0: ffffffff8296e760 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x32/0x1c0
> 2 locks held by kworker/1:1/79:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc9000032fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/u32:2/93:
> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc900003d3e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/6:1/94:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc900003dbe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/3:1/96:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc900003ebe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/1:2/102:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc90000eabe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/u32:3/107:
> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc90000ed3e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/u32:4/113:
> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc90000f03e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/6:2/189:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc90000e0fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/6:5/196:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc90000f13e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/6:6/197:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc90000f23e50 ((work_completion)(&(&hda->probe_work)->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/6:8/199:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc90000f53e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/7:2/296:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc9000105be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/7:3/297:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc90001043e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/7:4/298:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc90001063e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/7:5/320:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc90001003e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/2:2/371:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc9000104be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/5:13/648:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc9000198fe50 ((deferred_probe_timeout_work).work){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/5:14/649:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc90001997e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/5:15/650:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc9000199fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/5:16/651:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc900019a7e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/4:3/722:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc90001a27e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/1:4/768:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc900010d7e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/1:5/769:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc900010dfe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/0:2/849:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc90001353e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by lvm/860:
> #0: ffff8881323c19a8 (&md->type_lock){+.+.}-{4:4}, at: table_load+0xc9/0x400
> #1: ffff88813200c3b8 (&mddev->reconfig_mutex){+.+.}-{4:4}, at: raid_ctr+0x13b3/0x2860 [dm_raid]
> 2 locks held by modprobe/1019:
> #0: ffffffffa0ca7b68 (iwlwifi_opmode_table_mtx){+.+.}-{4:4}, at: iwl_opmode_register+0x27/0xd0 [iwlwifi]
> #1: ffff888139f88270 (&led_cdev->led_access){+.+.}-{4:4}, at: led_classdev_register_ext+0x195/0x450
> 2 locks held by kworker/u32:5/1045:
> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc90004367e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/u32:6/1051:
> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc90004703e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/u32:7/1057:
> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc90004a97e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/3:3/1111:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc90005bafe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
> 2 locks held by kworker/3:4/1132:
> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
> #1: ffffc90005e13e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>
> =============================================
>
>
I ran a bisect on this. The tagged bad commit is a LED related merge, but commit
shows no code changes when I look at it in git. I double checked that the
merge is bad by manually going to it again at the end of the bisect and
indeed it fails.
From looking at lockdep, this below may be interesting. I do have 24 intel be200 radios
in this system, so maybe lots of iwlwifi radios help trigger the problem?
> 2 locks held by modprobe/1019:
> #0: ffffffffa0ca7b68 (iwlwifi_opmode_table_mtx){+.+.}-{4:4}, at: iwl_opmode_register+0x27/0xd0 [iwlwifi]
> #1: ffff888139f88270 (&led_cdev->led_access){+.+.}-{4:4}, at: led_classdev_register_ext+0x195/0x450
Please let me know if you have any suggestions for how to debug this further.
[greearb@ben-dt5 linux-2.6]$ git bisect log
git bisect start
# status: waiting for both good and bad commits
# good: [e8f897f4afef0031fe618a8e94127a0934896aba] Linux 6.8
git bisect good e8f897f4afef0031fe618a8e94127a0934896aba
# status: waiting for bad commit, 1 good commit known
# bad: [4cece764965020c22cff7665b18a012006359095] Linux 6.9-rc1
git bisect bad 4cece764965020c22cff7665b18a012006359095
# good: [e5e038b7ae9da96b93974bf072ca1876899a01a3] Merge tag 'fs_for_v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
git bisect good e5e038b7ae9da96b93974bf072ca1876899a01a3
# bad: [32a50540c3d26341698505998dfca5b0e8fb4fd4] Merge tag 'bcachefs-2024-03-13' of https://evilpiepirate.org/git/bcachefs
git bisect bad 32a50540c3d26341698505998dfca5b0e8fb4fd4
# good: [a3df5d5422b4edfcfe658d5057e7e059571e32ce] Merge tag 'pinctrl-v6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
git bisect good a3df5d5422b4edfcfe658d5057e7e059571e32ce
# bad: [c0a614e82ece41d15b7a66f43ee79f4dbdbc925a] Merge tag 'lsm-pr-20240314' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
git bisect bad c0a614e82ece41d15b7a66f43ee79f4dbdbc925a
# bad: [705c1da8fa4816fb0159b5602fef1df5946a3ee2] Merge tag 'pci-v6.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
git bisect bad 705c1da8fa4816fb0159b5602fef1df5946a3ee2
# bad: [f5c31bcf604db54470868f3118a60dc4a9ba8813] Merge tag 'leds-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
git bisect bad f5c31bcf604db54470868f3118a60dc4a9ba8813
# good: [8403ce70be339d462892a2b935ae30ee52416f92] Merge tag 'mfd-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
git bisect good 8403ce70be339d462892a2b935ae30ee52416f92
# good: [2cd0d1db31e78a63553876f8e6a4c9dcc1f9c061] leds: expresswire: Don't depend on NEW_LEDS
git bisect good 2cd0d1db31e78a63553876f8e6a4c9dcc1f9c061
# good: [23749cf3dfff5dcd706183ade1d27198a37b3881] backlight: gpio: Simplify with dev_err_probe()
git bisect good 23749cf3dfff5dcd706183ade1d27198a37b3881
# good: [2c7c70f54f791ece44541a9254c1a73790fd4595] dt-bindings: leds: Add NCP5623 multi-LED Controller
git bisect good 2c7c70f54f791ece44541a9254c1a73790fd4595
# good: [c9128ed7b9edeb2b6f1faec06d96b2fd5bc72cb8] backlight: lm3630a_bl: Simplify probe return on gpio request error
git bisect good c9128ed7b9edeb2b6f1faec06d96b2fd5bc72cb8
# good: [45066c4bbe8ca25f9f282245b84568116c783f1d] leds: ncp5623: Add MS suffix to time defines
git bisect good 45066c4bbe8ca25f9f282245b84568116c783f1d
# good: [f3d8f29d1f59230b8c2a09e6dee7db7bd295e42c] Merge tag 'backlight-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight
git bisect good f3d8f29d1f59230b8c2a09e6dee7db7bd295e42c
# first bad commit: [f5c31bcf604db54470868f3118a60dc4a9ba8813] Merge tag 'leds-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
[greearb@ben-dt5 linux-2.6]$
Thanks,
Ben
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-04-03 19:35 ` 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related) Ben Greear
@ 2024-04-08 16:35 ` Johannes Berg
2024-04-09 8:47 ` Linux regression tracking (Thorsten Leemhuis)
2024-04-11 7:07 ` Lee Jones
2024-05-06 20:00 ` Heiner Kallweit
2 siblings, 1 reply; 24+ messages in thread
From: Johannes Berg @ 2024-04-08 16:35 UTC (permalink / raw)
To: Ben Greear, LKML, linux-leds
On Wed, 2024-04-03 at 12:35 -0700, Ben Greear wrote:
> >
> > So, deadlock I guess....
> >
> > INFO: task kworker/5:13:648 blocked for more than 180 seconds.
> > Not tainted 6.9.0-rc2+ #23
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > task:kworker/5:13 state:D stack:0 pid:648 tgid:648 ppid:2 flags:0x00004000
> > Workqueue: events deferred_probe_timeout_work_func
> > Call Trace:
> > <TASK>
> > __schedule+0x43d/0xe20
> > schedule+0x31/0x130
> > schedule_timeout+0x1b9/0x1d0
> > ? mark_held_locks+0x49/0x70
> > ? lockdep_hardirqs_on_prepare+0xd6/0x170
> > __wait_for_common+0xb9/0x1d0
> > ? usleep_range_state+0xb0/0xb0
> > ? __flush_work+0x1ff/0x460
> > __flush_work+0x287/0x460
> > ? flush_workqueue_prep_pwqs+0x120/0x120
> > deferred_probe_timeout_work_func+0x2b/0xa0
This seems ... complex.
Does changing iwlwifi's request_module() to request_module_nowait() in
iwl_req_fw_callback() (only) fix it? I think we can do that, we don't
really care at that point, and the issue here seems to be waiting for it
while also other modules are loading and due to the hashing in
idempotent() (kernel/module/main.c) it can't make progress.
johannes
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-04-08 16:35 ` Johannes Berg
@ 2024-04-09 8:47 ` Linux regression tracking (Thorsten Leemhuis)
2024-04-09 8:59 ` Johannes Berg
0 siblings, 1 reply; 24+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-04-09 8:47 UTC (permalink / raw)
To: Johannes Berg, Ben Greear, LKML, linux-leds,
Linux kernel regressions list
On 08.04.24 18:35, Johannes Berg wrote:
> On Wed, 2024-04-03 at 12:35 -0700, Ben Greear wrote:
>>>
>>> So, deadlock I guess....
>>>
>>> INFO: task kworker/5:13:648 blocked for more than 180 seconds.
>>> Not tainted 6.9.0-rc2+ #23
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:kworker/5:13 state:D stack:0 pid:648 tgid:648 ppid:2 flags:0x00004000
>>> Workqueue: events deferred_probe_timeout_work_func
>>> Call Trace:
>>> <TASK>
>>> __schedule+0x43d/0xe20
>>> schedule+0x31/0x130
>>> schedule_timeout+0x1b9/0x1d0
>>> ? mark_held_locks+0x49/0x70
>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>> __wait_for_common+0xb9/0x1d0
>>> ? usleep_range_state+0xb0/0xb0
>>> ? __flush_work+0x1ff/0x460
>>> __flush_work+0x287/0x460
>>> ? flush_workqueue_prep_pwqs+0x120/0x120
>>> deferred_probe_timeout_work_func+0x2b/0xa0
>
> This seems ... complex.
>
> Does changing iwlwifi's request_module() to request_module_nowait() in
> iwl_req_fw_callback() (only) fix it? I think we can do that, we don't
> really care at that point, and the issue here seems to be waiting for it
> while also other modules are loading and due to the hashing in
> idempotent() (kernel/module/main.c) it can't make progress.
Not my area of expertise, so listening to me my send you off-track.
Might this be related to the issue discussed here:
https://lore.kernel.org/all/ZhRD3cOtz5i-61PB@mail-itl/
See also:
https://lore.kernel.org/all/2695e9db-a5a0-4564-9812-a50b91fb1b46@gmail.com/
Ciao, Thorsten
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-04-09 8:47 ` Linux regression tracking (Thorsten Leemhuis)
@ 2024-04-09 8:59 ` Johannes Berg
0 siblings, 0 replies; 24+ messages in thread
From: Johannes Berg @ 2024-04-09 8:59 UTC (permalink / raw)
To: Linux regressions mailing list, Ben Greear, LKML, linux-leds
On Tue, 2024-04-09 at 10:47 +0200, Linux regression tracking (Thorsten
Leemhuis) wrote:
> On 08.04.24 18:35, Johannes Berg wrote:
> > On Wed, 2024-04-03 at 12:35 -0700, Ben Greear wrote:
> > > >
> > > > So, deadlock I guess....
> > > >
> > > > INFO: task kworker/5:13:648 blocked for more than 180 seconds.
> > > > Not tainted 6.9.0-rc2+ #23
> > > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > task:kworker/5:13 state:D stack:0 pid:648 tgid:648 ppid:2 flags:0x00004000
> > > > Workqueue: events deferred_probe_timeout_work_func
> > > > Call Trace:
> > > > <TASK>
> > > > __schedule+0x43d/0xe20
> > > > schedule+0x31/0x130
> > > > schedule_timeout+0x1b9/0x1d0
> > > > ? mark_held_locks+0x49/0x70
> > > > ? lockdep_hardirqs_on_prepare+0xd6/0x170
> > > > __wait_for_common+0xb9/0x1d0
> > > > ? usleep_range_state+0xb0/0xb0
> > > > ? __flush_work+0x1ff/0x460
> > > > __flush_work+0x287/0x460
> > > > ? flush_workqueue_prep_pwqs+0x120/0x120
> > > > deferred_probe_timeout_work_func+0x2b/0xa0
> >
> > This seems ... complex.
> >
> > Does changing iwlwifi's request_module() to request_module_nowait() in
> > iwl_req_fw_callback() (only) fix it? I think we can do that, we don't
> > really care at that point, and the issue here seems to be waiting for it
> > while also other modules are loading and due to the hashing in
> > idempotent() (kernel/module/main.c) it can't make progress.
>
> Not my area of expertise, so listening to me my send you off-track.
>
> Might this be related to the issue discussed here:
> https://lore.kernel.org/all/ZhRD3cOtz5i-61PB@mail-itl/
>
> See also:
> https://lore.kernel.org/all/2695e9db-a5a0-4564-9812-a50b91fb1b46@gmail.com/
Don't think so, that was related to devm and module unload, neither
seems to be the case here.
johannes
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-04-03 19:35 ` 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related) Ben Greear
2024-04-08 16:35 ` Johannes Berg
@ 2024-04-11 7:07 ` Lee Jones
2024-04-15 20:37 ` Ben Greear
2024-05-06 20:00 ` Heiner Kallweit
2 siblings, 1 reply; 24+ messages in thread
From: Lee Jones @ 2024-04-11 7:07 UTC (permalink / raw)
To: Ben Greear; +Cc: LKML, linux-leds, Johannes Berg
On Wed, 03 Apr 2024, Ben Greear wrote:
> On 4/2/24 10:38, Ben Greear wrote:
> > On 4/2/24 09:37, Ben Greear wrote:
> > > Hello,
> > >
> > > Sometime between rc1 and today's rc2, my system quit booting.
> > > I'm not seeing any splats, it just stops. Evidently before
> > > sysrq is enabled.
> > >
> > > [ OK ] Started Flush Journal to Persistent Storage.
> > > [ OK ] Started udev Coldplug all Devices.
> > > Starting udev Wait for Complete Device Initialization...
> > > [ OK ] Listening on Load/Save RF …itch Status /dev/rfkill Watch.
> > > [ OK ] Created slice system-lvm2\x2dpvscan.slice.
> > > Starting LVM2 PV scan on device 8:19...
> > > Starting LVM2 PV scan on device 8:3...
> > > [ OK ] Started Device-mapper event daemon.
> > > iwlwifi 0000:04:00.0: WRT: Invalid buffer destination: 0
> > > sysrq: This sysrq operation is disabled.
> > >
> > > I can start a bisect, but in case anyone knows the answer already, please let me know.
> > >
> > > Thanks,
> > > Ben
> > >
> >
> > So, deadlock I guess....
Does this help you in any way?
https://lore.kernel.org/all/bbcdbc1b-44bc-4cf8-86ef-6e6af2b009c3@gmail.com/
--
Lee Jones [李琼斯]
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-04-11 7:07 ` Lee Jones
@ 2024-04-15 20:37 ` Ben Greear
2024-04-16 6:17 ` Johannes Berg
2024-04-16 12:19 ` Lee Jones
0 siblings, 2 replies; 24+ messages in thread
From: Ben Greear @ 2024-04-15 20:37 UTC (permalink / raw)
To: Lee Jones; +Cc: LKML, linux-leds, Johannes Berg
On 4/11/24 00:07, Lee Jones wrote:
> On Wed, 03 Apr 2024, Ben Greear wrote:
>
>> On 4/2/24 10:38, Ben Greear wrote:
>>> On 4/2/24 09:37, Ben Greear wrote:
>>>> Hello,
>>>>
>>>> Sometime between rc1 and today's rc2, my system quit booting.
>>>> I'm not seeing any splats, it just stops. Evidently before
>>>> sysrq is enabled.
>>>>
>>>> [ OK ] Started Flush Journal to Persistent Storage.
>>>> [ OK ] Started udev Coldplug all Devices.
>>>> Starting udev Wait for Complete Device Initialization...
>>>> [ OK ] Listening on Load/Save RF …itch Status /dev/rfkill Watch.
>>>> [ OK ] Created slice system-lvm2\x2dpvscan.slice.
>>>> Starting LVM2 PV scan on device 8:19...
>>>> Starting LVM2 PV scan on device 8:3...
>>>> [ OK ] Started Device-mapper event daemon.
>>>> iwlwifi 0000:04:00.0: WRT: Invalid buffer destination: 0
>>>> sysrq: This sysrq operation is disabled.
>>>>
>>>> I can start a bisect, but in case anyone knows the answer already, please let me know.
>>>>
>>>> Thanks,
>>>> Ben
>>>>
>>>
>>> So, deadlock I guess....
>
> Does this help you in any way?
>
> https://lore.kernel.org/all/bbcdbc1b-44bc-4cf8-86ef-6e6af2b009c3@gmail.com/
>
Hello Lee,
I cannot see how that patch above would fix my issues since I am not using that driver,
but possibly some similar change needs to be made to iwlwifi.
Johannes, you had another suggestion: changing iwlwifi's request_module() to request_module_nowait() in
iwl_req_fw_callback()
Is that still best thing to try in your opinion?
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-04-15 20:37 ` Ben Greear
@ 2024-04-16 6:17 ` Johannes Berg
2024-04-23 9:00 ` Linux regression tracking (Thorsten Leemhuis)
2024-04-16 12:19 ` Lee Jones
1 sibling, 1 reply; 24+ messages in thread
From: Johannes Berg @ 2024-04-16 6:17 UTC (permalink / raw)
To: Ben Greear, Lee Jones; +Cc: LKML, linux-leds
On Mon, 2024-04-15 at 13:37 -0700, Ben Greear wrote:
>
> Johannes, you had another suggestion: changing iwlwifi's request_module() to request_module_nowait() in
> iwl_req_fw_callback()
>
> Is that still best thing to try in your opinion?
>
I guess so, I don't have any better ideas so far anyway ...
johannes
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-04-15 20:37 ` Ben Greear
2024-04-16 6:17 ` Johannes Berg
@ 2024-04-16 12:19 ` Lee Jones
1 sibling, 0 replies; 24+ messages in thread
From: Lee Jones @ 2024-04-16 12:19 UTC (permalink / raw)
To: Ben Greear; +Cc: LKML, linux-leds, Johannes Berg
On Mon, 15 Apr 2024, Ben Greear wrote:
> On 4/11/24 00:07, Lee Jones wrote:
> > On Wed, 03 Apr 2024, Ben Greear wrote:
> >
> > > On 4/2/24 10:38, Ben Greear wrote:
> > > > On 4/2/24 09:37, Ben Greear wrote:
> > > > > Hello,
> > > > >
> > > > > Sometime between rc1 and today's rc2, my system quit booting.
> > > > > I'm not seeing any splats, it just stops. Evidently before
> > > > > sysrq is enabled.
> > > > >
> > > > > [ OK ] Started Flush Journal to Persistent Storage.
> > > > > [ OK ] Started udev Coldplug all Devices.
> > > > > Starting udev Wait for Complete Device Initialization...
> > > > > [ OK ] Listening on Load/Save RF …itch Status /dev/rfkill Watch.
> > > > > [ OK ] Created slice system-lvm2\x2dpvscan.slice.
> > > > > Starting LVM2 PV scan on device 8:19...
> > > > > Starting LVM2 PV scan on device 8:3...
> > > > > [ OK ] Started Device-mapper event daemon.
> > > > > iwlwifi 0000:04:00.0: WRT: Invalid buffer destination: 0
> > > > > sysrq: This sysrq operation is disabled.
> > > > >
> > > > > I can start a bisect, but in case anyone knows the answer already, please let me know.
> > > > >
> > > > > Thanks,
> > > > > Ben
> > > > >
> > > >
> > > > So, deadlock I guess....
> >
> > Does this help you in any way?
> >
> > https://lore.kernel.org/all/bbcdbc1b-44bc-4cf8-86ef-6e6af2b009c3@gmail.com/
> >
> Hello Lee,
>
> I cannot see how that patch above would fix my issues since I am not using that driver,
> but possibly some similar change needs to be made to iwlwifi.
>
> Johannes, you had another suggestion: changing iwlwifi's request_module() to request_module_nowait() in
> iwl_req_fw_callback()
>
> Is that still best thing to try in your opinion?
I mean the general principle, rather than the exact patch.
Yes, you would need to apply it to your own use-case.
--
Lee Jones [李琼斯]
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-04-16 6:17 ` Johannes Berg
@ 2024-04-23 9:00 ` Linux regression tracking (Thorsten Leemhuis)
2024-04-23 9:06 ` Johannes Berg
0 siblings, 1 reply; 24+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-04-23 9:00 UTC (permalink / raw)
To: Johannes Berg, Miri Korenblit; +Cc: LKML, linux-leds, Ben Greear, Lee Jones
On 16.04.24 08:17, Johannes Berg wrote:
> On Mon, 2024-04-15 at 13:37 -0700, Ben Greear wrote:
>>
>> Johannes, you had another suggestion: changing iwlwifi's request_module() to request_module_nowait() in
>> iwl_req_fw_callback()
>>
>> Is that still best thing to try in your opinion?
>
> I guess so, I don't have any better ideas so far anyway ...
[adding the iwlwifi maintainer; thread starts here:
https://lore.kernel.org/lkml/30f757e3-73c5-5473-c1f8-328bab98fd7d@candelatech.com/
]
Johannes, Miri, what's the status wrt to this regression? From here
things look somewhat stalled -- but maybe there was progress and I just
missed it.
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
#regzbot poke
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-04-23 9:00 ` Linux regression tracking (Thorsten Leemhuis)
@ 2024-04-23 9:06 ` Johannes Berg
2024-04-23 9:29 ` Linux regression tracking (Thorsten Leemhuis)
0 siblings, 1 reply; 24+ messages in thread
From: Johannes Berg @ 2024-04-23 9:06 UTC (permalink / raw)
To: Linux regressions mailing list, Miri Korenblit
Cc: LKML, linux-leds, Ben Greear, Lee Jones
On Tue, 2024-04-23 at 11:00 +0200, Linux regression tracking (Thorsten
Leemhuis) wrote:
> On 16.04.24 08:17, Johannes Berg wrote:
> > On Mon, 2024-04-15 at 13:37 -0700, Ben Greear wrote:
> > >
> > > Johannes, you had another suggestion: changing iwlwifi's request_module() to request_module_nowait() in
> > > iwl_req_fw_callback()
> > >
> > > Is that still best thing to try in your opinion?
> >
> > I guess so, I don't have any better ideas so far anyway ...
>
> [adding the iwlwifi maintainer; thread starts here:
> https://lore.kernel.org/lkml/30f757e3-73c5-5473-c1f8-328bab98fd7d@candelatech.com/
>
> ]
>
> Johannes, Miri, what's the status wrt to this regression? From here
> things look somewhat stalled -- but maybe there was progress and I just
> missed it.
What do you want? It got bisected to an LED merge, but you ping _us_?
Way to go ...
johannes
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-04-23 9:06 ` Johannes Berg
@ 2024-04-23 9:29 ` Linux regression tracking (Thorsten Leemhuis)
2024-04-24 17:26 ` Ben Greear
2024-05-02 7:19 ` Lee Jones
0 siblings, 2 replies; 24+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-04-23 9:29 UTC (permalink / raw)
To: Lee Jones, Ben Greear
Cc: LKML, linux-leds, Miri Korenblit, Johannes Berg,
Linux regressions mailing list
On 23.04.24 11:06, Johannes Berg wrote:
> On Tue, 2024-04-23 at 11:00 +0200, Linux regression tracking (Thorsten
> Leemhuis) wrote:
>> On 16.04.24 08:17, Johannes Berg wrote:
>>> On Mon, 2024-04-15 at 13:37 -0700, Ben Greear wrote:
>>>>
>>>> Johannes, you had another suggestion: changing iwlwifi's request_module() to request_module_nowait() in
>>>> iwl_req_fw_callback()
>>>>
>>>> Is that still best thing to try in your opinion?
>>>
>>> I guess so, I don't have any better ideas so far anyway ...
>>
>> [adding the iwlwifi maintainer; thread starts here:
>> https://lore.kernel.org/lkml/30f757e3-73c5-5473-c1f8-328bab98fd7d@candelatech.com/
>>
>> ]
>>
>> Johannes, Miri, what's the status wrt to this regression? From here
>> things look somewhat stalled -- but maybe there was progress and I just
>> missed it.
>
> What do you want? It got bisected to an LED merge, but you ping _us_?
> Way to go ...
Sorry, to me it sounded a bit like you had an idea for a fix and were
going to give it a try -- similar to how the maintainers for a r8169
driver and the igc driver provided fixes for bugs recent LED changes
exposed.
But sure, you are right, in the end some LED change seems to have cause
this, so the duty to fix it lies in that field. Therefore:
Lee, what's the status here to get this fixed before the final?
Ciao, Thorsten
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-04-23 9:29 ` Linux regression tracking (Thorsten Leemhuis)
@ 2024-04-24 17:26 ` Ben Greear
2024-04-24 17:31 ` Johannes Berg
2024-05-02 7:19 ` Lee Jones
1 sibling, 1 reply; 24+ messages in thread
From: Ben Greear @ 2024-04-24 17:26 UTC (permalink / raw)
To: Linux regressions mailing list, Lee Jones
Cc: LKML, linux-leds, Miri Korenblit, Johannes Berg
On 4/23/24 02:29, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 23.04.24 11:06, Johannes Berg wrote:
>> On Tue, 2024-04-23 at 11:00 +0200, Linux regression tracking (Thorsten
>> Leemhuis) wrote:
>>> On 16.04.24 08:17, Johannes Berg wrote:
>>>> On Mon, 2024-04-15 at 13:37 -0700, Ben Greear wrote:
>>>>>
>>>>> Johannes, you had another suggestion: changing iwlwifi's request_module() to request_module_nowait() in
>>>>> iwl_req_fw_callback()
>>>>>
>>>>> Is that still best thing to try in your opinion?
>>>>
>>>> I guess so, I don't have any better ideas so far anyway ...
>>>
>>> [adding the iwlwifi maintainer; thread starts here:
>>> https://lore.kernel.org/lkml/30f757e3-73c5-5473-c1f8-328bab98fd7d@candelatech.com/
>>>
>>> ]
>>>
>>> Johannes, Miri, what's the status wrt to this regression? From here
>>> things look somewhat stalled -- but maybe there was progress and I just
>>> missed it.
>>
>> What do you want? It got bisected to an LED merge, but you ping _us_?
>> Way to go ...
>
> Sorry, to me it sounded a bit like you had an idea for a fix and were
> going to give it a try -- similar to how the maintainers for a r8169
> driver and the igc driver provided fixes for bugs recent LED changes
> exposed.
>
> But sure, you are right, in the end some LED change seems to have cause
> this, so the duty to fix it lies in that field. Therefore:
>
> Lee, what's the status here to get this fixed before the final?
>
> Ciao, Thorsten
Hello Johannes,
This patch makes the problem go away in my testbed with 24 Intel
iwlwifi radios. My guess is that it is just papering over the problem, but
maybe good enough? Would you like me to submit as official
patch to linux-wireless?
$ git diff
diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-drv.c b/drivers/net/wireless/intel/iwlwifi/iwl-drv.c
index 4696d73c8971..993177e1de27 100644
--- a/drivers/net/wireless/intel/iwlwifi/iwl-drv.c
+++ b/drivers/net/wireless/intel/iwlwifi/iwl-drv.c
@@ -1744,7 +1744,7 @@ static void iwl_req_fw_callback(const struct firmware *ucode_raw, void *context)
* or hangs loading.
*/
if (load_module)
- request_module("%s", op->name);
+ request_module_nowait("%s", op->name);
failure = false;
goto free;
Thanks,
Ben
>
>
>
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-04-24 17:26 ` Ben Greear
@ 2024-04-24 17:31 ` Johannes Berg
0 siblings, 0 replies; 24+ messages in thread
From: Johannes Berg @ 2024-04-24 17:31 UTC (permalink / raw)
To: Ben Greear, Linux regressions mailing list, Lee Jones
Cc: LKML, linux-leds, Miri Korenblit
Hi Ben,
On Wed, 2024-04-24 at 10:26 -0700, Ben Greear wrote:
>
> This patch makes the problem go away in my testbed with 24 Intel
> iwlwifi radios. My guess is that it is just papering over the problem,
Agree, there seems to be some locking issue with LED stuff, but I'm not
sure where, and the driver doesn't even hold any locks here.
> but maybe good enough?
For all I care, yes. We explicitly do this last, from a worker that
holds no locks in the driver ... so it's odd. Looking at the history of
it, it seems that it was originally _nowait(), but then in 3.6 I changed
it because of some backport concerns. Though then I also moved it out of
the locked section.
> Would you like me to submit as official
> patch to linux-wireless?
Sure.
johannes
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-04-23 9:29 ` Linux regression tracking (Thorsten Leemhuis)
2024-04-24 17:26 ` Ben Greear
@ 2024-05-02 7:19 ` Lee Jones
2024-05-02 8:30 ` Linux regression tracking (Thorsten Leemhuis)
2024-05-05 5:48 ` Ben Greear
1 sibling, 2 replies; 24+ messages in thread
From: Lee Jones @ 2024-05-02 7:19 UTC (permalink / raw)
To: Linux regression tracking (Thorsten Leemhuis)
Cc: Ben Greear, LKML, linux-leds, Miri Korenblit, Johannes Berg,
Linux regressions mailing list
On Tue, 23 Apr 2024, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 23.04.24 11:06, Johannes Berg wrote:
> > On Tue, 2024-04-23 at 11:00 +0200, Linux regression tracking (Thorsten
> > Leemhuis) wrote:
> >> On 16.04.24 08:17, Johannes Berg wrote:
> >>> On Mon, 2024-04-15 at 13:37 -0700, Ben Greear wrote:
> >>>>
> >>>> Johannes, you had another suggestion: changing iwlwifi's request_module() to request_module_nowait() in
> >>>> iwl_req_fw_callback()
> >>>>
> >>>> Is that still best thing to try in your opinion?
> >>>
> >>> I guess so, I don't have any better ideas so far anyway ...
> >>
> >> [adding the iwlwifi maintainer; thread starts here:
> >> https://lore.kernel.org/lkml/30f757e3-73c5-5473-c1f8-328bab98fd7d@candelatech.com/
> >>
> >> ]
> >>
> >> Johannes, Miri, what's the status wrt to this regression? From here
> >> things look somewhat stalled -- but maybe there was progress and I just
> >> missed it.
> >
> > What do you want? It got bisected to an LED merge, but you ping _us_?
> > Way to go ...
>
> Sorry, to me it sounded a bit like you had an idea for a fix and were
> going to give it a try -- similar to how the maintainers for a r8169
> driver and the igc driver provided fixes for bugs recent LED changes
> exposed.
>
> But sure, you are right, in the end some LED change seems to have cause
> this, so the duty to fix it lies in that field. Therefore:
>
> Lee, what's the status here to get this fixed before the final?
No idea. Did you send a fix?
--
Lee Jones [李琼斯]
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-05-02 7:19 ` Lee Jones
@ 2024-05-02 8:30 ` Linux regression tracking (Thorsten Leemhuis)
2024-05-05 5:48 ` Ben Greear
1 sibling, 0 replies; 24+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-05-02 8:30 UTC (permalink / raw)
To: Lee Jones
Cc: Ben Greear, LKML, linux-leds, Miri Korenblit, Johannes Berg,
Linux regressions mailing list
On 02.05.24 09:19, Lee Jones wrote:
> On Tue, 23 Apr 2024, Linux regression tracking (Thorsten Leemhuis) wrote:
>
>> On 23.04.24 11:06, Johannes Berg wrote:
>>> On Tue, 2024-04-23 at 11:00 +0200, Linux regression tracking (Thorsten
>>> Leemhuis) wrote:
>>>> On 16.04.24 08:17, Johannes Berg wrote:
>>>>> On Mon, 2024-04-15 at 13:37 -0700, Ben Greear wrote:
>>>>>>
>>>>>> Johannes, you had another suggestion: changing iwlwifi's request_module() to request_module_nowait() in
>>>>>> iwl_req_fw_callback()
>>>>>>
>>>>>> Is that still best thing to try in your opinion?
>>>>>
>>>>> I guess so, I don't have any better ideas so far anyway ...
>>>>
>>>> [adding the iwlwifi maintainer; thread starts here:
>>>> https://lore.kernel.org/lkml/30f757e3-73c5-5473-c1f8-328bab98fd7d@candelatech.com/
>>>> ]
>>>>
>>>> Johannes, Miri, what's the status wrt to this regression? From here
>>>> things look somewhat stalled -- but maybe there was progress and I just
>>>> missed it.
>>>
>>> What do you want? It got bisected to an LED merge, but you ping _us_?
>>> Way to go ...
>>
>> Sorry, to me it sounded a bit like you had an idea for a fix and were
>> going to give it a try -- similar to how the maintainers for a r8169
>> driver and the igc driver provided fixes for bugs recent LED changes
>> exposed.
>>
>> But sure, you are right, in the end some LED change seems to have cause
>> this, so the duty to fix it lies in that field. Therefore:
>>
>> Lee, what's the status here to get this fixed before the final?
>
> No idea. Did you send a fix?
I'm just here to help Linus keeping an eye on regression to ensure they
are handled like he wants them to be handled.
But Ben Greear sent a patch to work around the problem:
https://lore.kernel.org/all/20240430234212.2132958-1-greearb@candelatech.com/
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-05-02 7:19 ` Lee Jones
2024-05-02 8:30 ` Linux regression tracking (Thorsten Leemhuis)
@ 2024-05-05 5:48 ` Ben Greear
2024-05-05 10:55 ` Tetsuo Handa
2024-05-07 8:23 ` Lee Jones
1 sibling, 2 replies; 24+ messages in thread
From: Ben Greear @ 2024-05-05 5:48 UTC (permalink / raw)
To: Lee Jones, Linux regression tracking (Thorsten Leemhuis)
Cc: LKML, linux-leds, Miri Korenblit, Johannes Berg,
Linux regressions mailing list
On 5/2/24 00:19, Lee Jones wrote:
> On Tue, 23 Apr 2024, Linux regression tracking (Thorsten Leemhuis) wrote:
>
>> On 23.04.24 11:06, Johannes Berg wrote:
>>> On Tue, 2024-04-23 at 11:00 +0200, Linux regression tracking (Thorsten
>>> Leemhuis) wrote:
>>>> On 16.04.24 08:17, Johannes Berg wrote:
>>>>> On Mon, 2024-04-15 at 13:37 -0700, Ben Greear wrote:
>>>>>>
>>>>>> Johannes, you had another suggestion: changing iwlwifi's request_module() to request_module_nowait() in
>>>>>> iwl_req_fw_callback()
>>>>>>
>>>>>> Is that still best thing to try in your opinion?
>>>>>
>>>>> I guess so, I don't have any better ideas so far anyway ...
>>>>
>>>> [adding the iwlwifi maintainer; thread starts here:
>>>> https://lore.kernel.org/lkml/30f757e3-73c5-5473-c1f8-328bab98fd7d@candelatech.com/
>>>>
>>>> ]
>>>>
>>>> Johannes, Miri, what's the status wrt to this regression? From here
>>>> things look somewhat stalled -- but maybe there was progress and I just
>>>> missed it.
>>>
>>> What do you want? It got bisected to an LED merge, but you ping _us_?
>>> Way to go ...
>>
>> Sorry, to me it sounded a bit like you had an idea for a fix and were
>> going to give it a try -- similar to how the maintainers for a r8169
>> driver and the igc driver provided fixes for bugs recent LED changes
>> exposed.
>>
>> But sure, you are right, in the end some LED change seems to have cause
>> this, so the duty to fix it lies in that field. Therefore:
>>
>> Lee, what's the status here to get this fixed before the final?
>
> No idea. Did you send a fix?
I sent what is probably just a work-around. I also spent time bisecting and testing.
The problem appears to have come in with the LED related merge. I think it is fair
to ask the LED folks to at least take a look at the lockdep debugging I posted. It is
not fair to expect anyone that manages to find or track a bug to also fix it.
If someone has a different suggested fix than the hack I posted, I will be happy to
test. On my system with lots of radios, it is 100% reproducible.
Maybe email me directly as I don't keep close watch on LKML.
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-05-05 5:48 ` Ben Greear
@ 2024-05-05 10:55 ` Tetsuo Handa
2024-05-07 4:21 ` Ben Greear
2024-05-07 8:23 ` Lee Jones
1 sibling, 1 reply; 24+ messages in thread
From: Tetsuo Handa @ 2024-05-05 10:55 UTC (permalink / raw)
To: Ben Greear, Lee Jones,
Linux regression tracking (Thorsten Leemhuis)
Cc: LKML, linux-leds, Miri Korenblit, Johannes Berg,
Linux regressions mailing list
On 2024/05/05 14:48, Ben Greear wrote:
> If someone has a different suggested fix than the hack I posted, I will be happy to
> test. On my system with lots of radios, it is 100% reproducible.
> Maybe email me directly as I don't keep close watch on LKML.
Please collect stacktraces of all lock holders using
https://lkml.kernel.org/r/77e32fdc-3f63-e124-588e-7d60dd66fc9a@I-love.SAKURA.ne.jp .
Depending on the output, I might ask you to decode addresses using ./scripts/faddr2line .
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-04-03 19:35 ` 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related) Ben Greear
2024-04-08 16:35 ` Johannes Berg
2024-04-11 7:07 ` Lee Jones
@ 2024-05-06 20:00 ` Heiner Kallweit
2024-05-07 4:20 ` Ben Greear
2 siblings, 1 reply; 24+ messages in thread
From: Heiner Kallweit @ 2024-05-06 20:00 UTC (permalink / raw)
To: Ben Greear, LKML, linux-leds, Lee Jones; +Cc: Johannes Berg
On 03.04.2024 21:35, Ben Greear wrote:
> On 4/2/24 10:38, Ben Greear wrote:
>> On 4/2/24 09:37, Ben Greear wrote:
>>> Hello,
>>>
>>> Sometime between rc1 and today's rc2, my system quit booting.
>>> I'm not seeing any splats, it just stops. Evidently before
>>> sysrq is enabled.
>>>
For my understanding:
You say 6.9-rc1 was ok, but 6.9-rc2 is not?
If I look at the diff then I see no LED subsystem changes,
but iwlwifi changes. It's not clear to me why your bisect
points to something outside the diff.
>>> [ OK ] Started Flush Journal to Persistent Storage.
>>> [ OK ] Started udev Coldplug all Devices.
>>> Starting udev Wait for Complete Device Initialization...
>>> [ OK ] Listening on Load/Save RF …itch Status /dev/rfkill Watch.
>>> [ OK ] Created slice system-lvm2\x2dpvscan.slice.
>>> Starting LVM2 PV scan on device 8:19...
>>> Starting LVM2 PV scan on device 8:3...
>>> [ OK ] Started Device-mapper event daemon.
>>> iwlwifi 0000:04:00.0: WRT: Invalid buffer destination: 0
>>> sysrq: This sysrq operation is disabled.
>>>
>>> I can start a bisect, but in case anyone knows the answer already, please let me know.
>>>
>>> Thanks,
>>> Ben
>>>
>>
>> So, deadlock I guess....
>>
>> INFO: task kworker/5:13:648 blocked for more than 180 seconds.
>> Not tainted 6.9.0-rc2+ #23
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:kworker/5:13 state:D stack:0 pid:648 tgid:648 ppid:2 flags:0x00004000
>> Workqueue: events deferred_probe_timeout_work_func
>> Call Trace:
>> <TASK>
>> __schedule+0x43d/0xe20
>> schedule+0x31/0x130
>> schedule_timeout+0x1b9/0x1d0
>> ? mark_held_locks+0x49/0x70
>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>> __wait_for_common+0xb9/0x1d0
>> ? usleep_range_state+0xb0/0xb0
>> ? __flush_work+0x1ff/0x460
>> __flush_work+0x287/0x460
>> ? flush_workqueue_prep_pwqs+0x120/0x120
>> deferred_probe_timeout_work_func+0x2b/0xa0
>> process_one_work+0x212/0x710
>> ? lock_is_held_type+0xa5/0x110
>> worker_thread+0x188/0x340
>> ? rescuer_thread+0x380/0x380
>> kthread+0xd7/0x110
>> ? kthread_complete_and_exit+0x20/0x20
>> ret_from_fork+0x28/0x40
>> ? kthread_complete_and_exit+0x20/0x20
>> ret_from_fork_asm+0x11/0x20
>> </TASK>
>> INFO: task udevadm:763 blocked for more than 180 seconds.
>> Not tainted 6.9.0-rc2+ #23
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:udevadm state:D stack:0 pid:763 tgid:763 ppid:1 flags:0x00000000
>> Call Trace:
>> <TASK>
>> __schedule+0x43d/0xe20
>> schedule+0x31/0x130
>> schedule_timeout+0x1b9/0x1d0
>> ? __wait_for_common+0xb0/0x1d0
>> ? lock_release+0xc6/0x290
>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>> __wait_for_common+0xb9/0x1d0
>> ? usleep_range_state+0xb0/0xb0
>> ? __flush_work+0x1ff/0x460
>> __flush_work+0x287/0x460
>> ? flush_workqueue_prep_pwqs+0x120/0x120
>> fsnotify_destroy_group+0x66/0xf0
>> inotify_release+0x12/0x40
>> __fput+0xa6/0x2d0
>> __x64_sys_close+0x33/0x70
>> do_syscall_64+0x6c/0x170
>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>> RIP: 0033:0x7f744d5bc878
>> RSP: 002b:00007ffcef12f8d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
>> RAX: ffffffffffffffda RBX: 00007f744cd048c0 RCX: 00007f744d5bc878
>> RDX: ffffffffffffff80 RSI: 0000000000000000 RDI: 0000000000000003
>> RBP: 0000000000000003 R08: 000055f9ce349fb0 R09: 0000000000000000
>> R10: 00007ffcef12f8f0 R11: 0000000000000246 R12: 0000000000000002
>> R13: 0000000007270e00 R14: 000055f99670c9b8 R15: 0000000000000002
>> </TASK>
>> INFO: task modprobe:968 blocked for more than 180 seconds.
>> Not tainted 6.9.0-rc2+ #23
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:modprobe state:D stack:0 pid:968 tgid:968 ppid:65 flags:0x00000000
>> Call Trace:
>> <TASK>
>> __schedule+0x43d/0xe20
>> schedule+0x31/0x130
>> schedule_timeout+0x1b9/0x1d0
>> ? __wait_for_common+0xb0/0x1d0
>> ? lock_release+0xc6/0x290
>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>> __wait_for_common+0xb9/0x1d0
>> ? usleep_range_state+0xb0/0xb0
>> idempotent_init_module+0x1ae/0x290
>> __x64_sys_finit_module+0x55/0xb0
>> do_syscall_64+0x6c/0x170
>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>> RIP: 0033:0x7fde25530ddd
>> RSP: 002b:00007fffac078518 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>> RAX: ffffffffffffffda RBX: 0000558758e28ef0 RCX: 00007fde25530ddd
>> RDX: 0000000000000000 RSI: 000055873cebf358 RDI: 0000000000000001
>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055873cebf358
>> R13: 0000000000000000 R14: 0000558758e29020 R15: 0000558758e28ef0
>> </TASK>
>> INFO: task modprobe:969 blocked for more than 180 seconds.
>> Not tainted 6.9.0-rc2+ #23
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:modprobe state:D stack:0 pid:969 tgid:969 ppid:93 flags:0x00000000
>> Call Trace:
>> <TASK>
>> __schedule+0x43d/0xe20
>> schedule+0x31/0x130
>> schedule_timeout+0x1b9/0x1d0
>> ? __wait_for_common+0xb0/0x1d0
>> ? lock_release+0xc6/0x290
>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>> __wait_for_common+0xb9/0x1d0
>> ? usleep_range_state+0xb0/0xb0
>> idempotent_init_module+0x1ae/0x290
>> __x64_sys_finit_module+0x55/0xb0
>> do_syscall_64+0x6c/0x170
>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>> RIP: 0033:0x7f338d516ddd
>> RSP: 002b:00007ffd155cd1e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>> RAX: ffffffffffffffda RBX: 000056092cb0def0 RCX: 00007f338d516ddd
>> RDX: 0000000000000000 RSI: 00005608ecb4a358 RDI: 0000000000000001
>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000001 R11: 0000000000000246 R12: 00005608ecb4a358
>> R13: 0000000000000000 R14: 000056092cb0e020 R15: 000056092cb0def0
>> </TASK>
>> INFO: task modprobe:1044 blocked for more than 180 seconds.
>> Not tainted 6.9.0-rc2+ #23
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:modprobe state:D stack:0 pid:1044 tgid:1044 ppid:10 flags:0x00000000
>> Call Trace:
>> <TASK>
>> __schedule+0x43d/0xe20
>> schedule+0x31/0x130
>> schedule_timeout+0x1b9/0x1d0
>> ? __wait_for_common+0xb0/0x1d0
>> ? lock_release+0xc6/0x290
>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>> __wait_for_common+0xb9/0x1d0
>> ? usleep_range_state+0xb0/0xb0
>> idempotent_init_module+0x1ae/0x290
>> __x64_sys_finit_module+0x55/0xb0
>> do_syscall_64+0x6c/0x170
>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>> RIP: 0033:0x7f7637b30ddd
>> RSP: 002b:00007ffe6251da78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>> RAX: ffffffffffffffda RBX: 000055b889cb3ef0 RCX: 00007f7637b30ddd
>> RDX: 0000000000000000 RSI: 000055b854eea358 RDI: 0000000000000001
>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055b854eea358
>> R13: 0000000000000000 R14: 000055b889cb4020 R15: 000055b889cb3ef0
>> </TASK>
>> INFO: task modprobe:1047 blocked for more than 180 seconds.
>> Not tainted 6.9.0-rc2+ #23
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:modprobe state:D stack:0 pid:1047 tgid:1047 ppid:113 flags:0x00000000
>> Call Trace:
>> <TASK>
>> __schedule+0x43d/0xe20
>> schedule+0x31/0x130
>> schedule_timeout+0x1b9/0x1d0
>> ? __wait_for_common+0xb0/0x1d0
>> ? lock_release+0xc6/0x290
>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>> __wait_for_common+0xb9/0x1d0
>> ? usleep_range_state+0xb0/0xb0
>> idempotent_init_module+0x1ae/0x290
>> __x64_sys_finit_module+0x55/0xb0
>> do_syscall_64+0x6c/0x170
>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>> RIP: 0033:0x7f3907130ddd
>> RSP: 002b:00007ffc36e4eb08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>> RAX: ffffffffffffffda RBX: 000056100a856ef0 RCX: 00007f3907130ddd
>> RDX: 0000000000000000 RSI: 0000560fff0ec358 RDI: 0000000000000001
>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000001 R11: 0000000000000246 R12: 0000560fff0ec358
>> R13: 0000000000000000 R14: 000056100a857020 R15: 000056100a856ef0
>> </TASK>
>> INFO: task modprobe:1056 blocked for more than 180 seconds.
>> Not tainted 6.9.0-rc2+ #23
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:modprobe state:D stack:0 pid:1056 tgid:1056 ppid:1045 flags:0x00000000
>> Call Trace:
>> <TASK>
>> __schedule+0x43d/0xe20
>> schedule+0x31/0x130
>> schedule_timeout+0x1b9/0x1d0
>> ? __wait_for_common+0xb0/0x1d0
>> ? lock_release+0xc6/0x290
>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>> __wait_for_common+0xb9/0x1d0
>> ? usleep_range_state+0xb0/0xb0
>> idempotent_init_module+0x1ae/0x290
>> __x64_sys_finit_module+0x55/0xb0
>> do_syscall_64+0x6c/0x170
>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>> RIP: 0033:0x7fcb1e730ddd
>> RSP: 002b:00007ffc692d0ad8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>> RAX: ffffffffffffffda RBX: 000055f8d8828ef0 RCX: 00007fcb1e730ddd
>> RDX: 0000000000000000 RSI: 000055f8bff36358 RDI: 0000000000000001
>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055f8bff36358
>> R13: 0000000000000000 R14: 000055f8d8829020 R15: 000055f8d8828ef0
>> </TASK>
>> INFO: task modprobe:1058 blocked for more than 180 seconds.
>> Not tainted 6.9.0-rc2+ #23
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:modprobe state:D stack:0 pid:1058 tgid:1058 ppid:1051 flags:0x00000000
>> Call Trace:
>> <TASK>
>> __schedule+0x43d/0xe20
>> schedule+0x31/0x130
>> schedule_timeout+0x1b9/0x1d0
>> ? __wait_for_common+0xb0/0x1d0
>> ? lock_release+0xc6/0x290
>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>> __wait_for_common+0xb9/0x1d0
>> ? usleep_range_state+0xb0/0xb0
>> idempotent_init_module+0x1ae/0x290
>> __x64_sys_finit_module+0x55/0xb0
>> do_syscall_64+0x6c/0x170
>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>> RIP: 0033:0x7f0a17b30ddd
>> RSP: 002b:00007fff56d619e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>> RAX: ffffffffffffffda RBX: 000055abd6741ef0 RCX: 00007f0a17b30ddd
>> RDX: 0000000000000000 RSI: 000055abc6586358 RDI: 0000000000000001
>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055abc6586358
>> R13: 0000000000000000 R14: 000055abd6742020 R15: 000055abd6741ef0
>> </TASK>
>> INFO: task modprobe:1060 blocked for more than 181 seconds.
>> Not tainted 6.9.0-rc2+ #23
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:modprobe state:D stack:0 pid:1060 tgid:1060 ppid:1057 flags:0x00000000
>> Call Trace:
>> <TASK>
>> __schedule+0x43d/0xe20
>> schedule+0x31/0x130
>> schedule_timeout+0x1b9/0x1d0
>> ? __wait_for_common+0xb0/0x1d0
>> ? lock_release+0xc6/0x290
>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>> __wait_for_common+0xb9/0x1d0
>> ? usleep_range_state+0xb0/0xb0
>> idempotent_init_module+0x1ae/0x290
>> __x64_sys_finit_module+0x55/0xb0
>> do_syscall_64+0x6c/0x170
>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>> RIP: 0033:0x7f12c0130ddd
>> RSP: 002b:00007ffccdef0488 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>> RAX: ffffffffffffffda RBX: 000056249db40ef0 RCX: 00007f12c0130ddd
>> RDX: 0000000000000000 RSI: 0000562471e4d358 RDI: 0000000000000001
>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000001 R11: 0000000000000246 R12: 0000562471e4d358
>> R13: 0000000000000000 R14: 000056249db41020 R15: 000056249db40ef0
>> </TASK>
>>
>> Showing all locks held in the system:
>> 2 locks held by systemd/1:
>> #0: ffff88812a7a10a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x1f/0x50
>> #1: ffff88812a7a1130 (&tty->atomic_write_lock){+.+.}-{4:4}, at: file_tty_write.constprop.0+0xab/0x330
>> 2 locks held by kworker/0:1/9:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc900000afe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/u32:0/10:
>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc900000b7e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/3:0/37:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc900001cbe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/7:0/61:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc9000029be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/u32:1/65:
>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc900002bfe50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 1 lock held by khungtaskd/66:
>> #0: ffffffff8296e760 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x32/0x1c0
>> 2 locks held by kworker/1:1/79:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc9000032fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/u32:2/93:
>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc900003d3e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/6:1/94:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc900003dbe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/3:1/96:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc900003ebe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/1:2/102:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc90000eabe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/u32:3/107:
>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc90000ed3e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/u32:4/113:
>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc90000f03e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/6:2/189:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc90000e0fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/6:5/196:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc90000f13e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/6:6/197:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc90000f23e50 ((work_completion)(&(&hda->probe_work)->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/6:8/199:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc90000f53e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/7:2/296:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc9000105be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/7:3/297:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc90001043e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/7:4/298:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc90001063e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/7:5/320:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc90001003e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/2:2/371:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc9000104be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/5:13/648:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc9000198fe50 ((deferred_probe_timeout_work).work){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/5:14/649:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc90001997e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/5:15/650:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc9000199fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/5:16/651:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc900019a7e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/4:3/722:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc90001a27e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/1:4/768:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc900010d7e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/1:5/769:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc900010dfe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/0:2/849:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc90001353e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by lvm/860:
>> #0: ffff8881323c19a8 (&md->type_lock){+.+.}-{4:4}, at: table_load+0xc9/0x400
>> #1: ffff88813200c3b8 (&mddev->reconfig_mutex){+.+.}-{4:4}, at: raid_ctr+0x13b3/0x2860 [dm_raid]
>> 2 locks held by modprobe/1019:
>> #0: ffffffffa0ca7b68 (iwlwifi_opmode_table_mtx){+.+.}-{4:4}, at: iwl_opmode_register+0x27/0xd0 [iwlwifi]
>> #1: ffff888139f88270 (&led_cdev->led_access){+.+.}-{4:4}, at: led_classdev_register_ext+0x195/0x450
>> 2 locks held by kworker/u32:5/1045:
>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc90004367e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/u32:6/1051:
>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc90004703e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/u32:7/1057:
>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc90004a97e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/3:3/1111:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc90005bafe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>> 2 locks held by kworker/3:4/1132:
>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>> #1: ffffc90005e13e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>
>> =============================================
>>
>>
>
> I ran a bisect on this. The tagged bad commit is a LED related merge, but commit
> shows no code changes when I look at it in git. I double checked that the
> merge is bad by manually going to it again at the end of the bisect and
> indeed it fails.
>
> From looking at lockdep, this below may be interesting. I do have 24 intel be200 radios
> in this system, so maybe lots of iwlwifi radios help trigger the problem?
>
>> 2 locks held by modprobe/1019:
>> #0: ffffffffa0ca7b68 (iwlwifi_opmode_table_mtx){+.+.}-{4:4}, at: iwl_opmode_register+0x27/0xd0 [iwlwifi]
>> #1: ffff888139f88270 (&led_cdev->led_access){+.+.}-{4:4}, at: led_classdev_register_ext+0x195/0x450
>
> Please let me know if you have any suggestions for how to debug this further.
>
> [greearb@ben-dt5 linux-2.6]$ git bisect log
> git bisect start
> # status: waiting for both good and bad commits
> # good: [e8f897f4afef0031fe618a8e94127a0934896aba] Linux 6.8
> git bisect good e8f897f4afef0031fe618a8e94127a0934896aba
> # status: waiting for bad commit, 1 good commit known
> # bad: [4cece764965020c22cff7665b18a012006359095] Linux 6.9-rc1
> git bisect bad 4cece764965020c22cff7665b18a012006359095
> # good: [e5e038b7ae9da96b93974bf072ca1876899a01a3] Merge tag 'fs_for_v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
> git bisect good e5e038b7ae9da96b93974bf072ca1876899a01a3
> # bad: [32a50540c3d26341698505998dfca5b0e8fb4fd4] Merge tag 'bcachefs-2024-03-13' of https://evilpiepirate.org/git/bcachefs
> git bisect bad 32a50540c3d26341698505998dfca5b0e8fb4fd4
> # good: [a3df5d5422b4edfcfe658d5057e7e059571e32ce] Merge tag 'pinctrl-v6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
> git bisect good a3df5d5422b4edfcfe658d5057e7e059571e32ce
> # bad: [c0a614e82ece41d15b7a66f43ee79f4dbdbc925a] Merge tag 'lsm-pr-20240314' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
> git bisect bad c0a614e82ece41d15b7a66f43ee79f4dbdbc925a
> # bad: [705c1da8fa4816fb0159b5602fef1df5946a3ee2] Merge tag 'pci-v6.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
> git bisect bad 705c1da8fa4816fb0159b5602fef1df5946a3ee2
> # bad: [f5c31bcf604db54470868f3118a60dc4a9ba8813] Merge tag 'leds-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
> git bisect bad f5c31bcf604db54470868f3118a60dc4a9ba8813
> # good: [8403ce70be339d462892a2b935ae30ee52416f92] Merge tag 'mfd-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
> git bisect good 8403ce70be339d462892a2b935ae30ee52416f92
> # good: [2cd0d1db31e78a63553876f8e6a4c9dcc1f9c061] leds: expresswire: Don't depend on NEW_LEDS
> git bisect good 2cd0d1db31e78a63553876f8e6a4c9dcc1f9c061
> # good: [23749cf3dfff5dcd706183ade1d27198a37b3881] backlight: gpio: Simplify with dev_err_probe()
> git bisect good 23749cf3dfff5dcd706183ade1d27198a37b3881
> # good: [2c7c70f54f791ece44541a9254c1a73790fd4595] dt-bindings: leds: Add NCP5623 multi-LED Controller
> git bisect good 2c7c70f54f791ece44541a9254c1a73790fd4595
> # good: [c9128ed7b9edeb2b6f1faec06d96b2fd5bc72cb8] backlight: lm3630a_bl: Simplify probe return on gpio request error
> git bisect good c9128ed7b9edeb2b6f1faec06d96b2fd5bc72cb8
> # good: [45066c4bbe8ca25f9f282245b84568116c783f1d] leds: ncp5623: Add MS suffix to time defines
> git bisect good 45066c4bbe8ca25f9f282245b84568116c783f1d
> # good: [f3d8f29d1f59230b8c2a09e6dee7db7bd295e42c] Merge tag 'backlight-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight
> git bisect good f3d8f29d1f59230b8c2a09e6dee7db7bd295e42c
> # first bad commit: [f5c31bcf604db54470868f3118a60dc4a9ba8813] Merge tag 'leds-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
> [greearb@ben-dt5 linux-2.6]$
>
> Thanks,
> Ben
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-05-06 20:00 ` Heiner Kallweit
@ 2024-05-07 4:20 ` Ben Greear
2024-05-07 6:27 ` Heiner Kallweit
0 siblings, 1 reply; 24+ messages in thread
From: Ben Greear @ 2024-05-07 4:20 UTC (permalink / raw)
To: Heiner Kallweit, LKML, linux-leds, Lee Jones; +Cc: Johannes Berg
On 5/6/24 13:00, Heiner Kallweit wrote:
> On 03.04.2024 21:35, Ben Greear wrote:
>> On 4/2/24 10:38, Ben Greear wrote:
>>> On 4/2/24 09:37, Ben Greear wrote:
>>>> Hello,
>>>>
>>>> Sometime between rc1 and today's rc2, my system quit booting.
>>>> I'm not seeing any splats, it just stops. Evidently before
>>>> sysrq is enabled.
>>>>
>
> For my understanding:
> You say 6.9-rc1 was ok, but 6.9-rc2 is not?
>
> If I look at the diff then I see no LED subsystem changes,
> but iwlwifi changes. It's not clear to me why your bisect
> points to something outside the diff.
I was incorrect in my early assessment about exactly where
the error came in. I later ran a full bisect to find the commit
that showed the error. The problem only seems to happen when there
are lots of iwlwifi (in my case) radios in a system, so that added
to my initial confusion on the bug.
It is almost certainly LED related, as my initial hack to make the problem
go away was to just comment out the led registration logic in iwlwifi.
Johanne's solution also makes the problem go away.
Thanks,
Ben
>
>
>>>> [ OK ] Started Flush Journal to Persistent Storage.
>>>> [ OK ] Started udev Coldplug all Devices.
>>>> Starting udev Wait for Complete Device Initialization...
>>>> [ OK ] Listening on Load/Save RF …itch Status /dev/rfkill Watch.
>>>> [ OK ] Created slice system-lvm2\x2dpvscan.slice.
>>>> Starting LVM2 PV scan on device 8:19...
>>>> Starting LVM2 PV scan on device 8:3...
>>>> [ OK ] Started Device-mapper event daemon.
>>>> iwlwifi 0000:04:00.0: WRT: Invalid buffer destination: 0
>>>> sysrq: This sysrq operation is disabled.
>>>>
>>>> I can start a bisect, but in case anyone knows the answer already, please let me know.
>>>>
>>>> Thanks,
>>>> Ben
>>>>
>>>
>>> So, deadlock I guess....
>>>
>>> INFO: task kworker/5:13:648 blocked for more than 180 seconds.
>>> Not tainted 6.9.0-rc2+ #23
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:kworker/5:13 state:D stack:0 pid:648 tgid:648 ppid:2 flags:0x00004000
>>> Workqueue: events deferred_probe_timeout_work_func
>>> Call Trace:
>>> <TASK>
>>> __schedule+0x43d/0xe20
>>> schedule+0x31/0x130
>>> schedule_timeout+0x1b9/0x1d0
>>> ? mark_held_locks+0x49/0x70
>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>> __wait_for_common+0xb9/0x1d0
>>> ? usleep_range_state+0xb0/0xb0
>>> ? __flush_work+0x1ff/0x460
>>> __flush_work+0x287/0x460
>>> ? flush_workqueue_prep_pwqs+0x120/0x120
>>> deferred_probe_timeout_work_func+0x2b/0xa0
>>> process_one_work+0x212/0x710
>>> ? lock_is_held_type+0xa5/0x110
>>> worker_thread+0x188/0x340
>>> ? rescuer_thread+0x380/0x380
>>> kthread+0xd7/0x110
>>> ? kthread_complete_and_exit+0x20/0x20
>>> ret_from_fork+0x28/0x40
>>> ? kthread_complete_and_exit+0x20/0x20
>>> ret_from_fork_asm+0x11/0x20
>>> </TASK>
>>> INFO: task udevadm:763 blocked for more than 180 seconds.
>>> Not tainted 6.9.0-rc2+ #23
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:udevadm state:D stack:0 pid:763 tgid:763 ppid:1 flags:0x00000000
>>> Call Trace:
>>> <TASK>
>>> __schedule+0x43d/0xe20
>>> schedule+0x31/0x130
>>> schedule_timeout+0x1b9/0x1d0
>>> ? __wait_for_common+0xb0/0x1d0
>>> ? lock_release+0xc6/0x290
>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>> __wait_for_common+0xb9/0x1d0
>>> ? usleep_range_state+0xb0/0xb0
>>> ? __flush_work+0x1ff/0x460
>>> __flush_work+0x287/0x460
>>> ? flush_workqueue_prep_pwqs+0x120/0x120
>>> fsnotify_destroy_group+0x66/0xf0
>>> inotify_release+0x12/0x40
>>> __fput+0xa6/0x2d0
>>> __x64_sys_close+0x33/0x70
>>> do_syscall_64+0x6c/0x170
>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>> RIP: 0033:0x7f744d5bc878
>>> RSP: 002b:00007ffcef12f8d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
>>> RAX: ffffffffffffffda RBX: 00007f744cd048c0 RCX: 00007f744d5bc878
>>> RDX: ffffffffffffff80 RSI: 0000000000000000 RDI: 0000000000000003
>>> RBP: 0000000000000003 R08: 000055f9ce349fb0 R09: 0000000000000000
>>> R10: 00007ffcef12f8f0 R11: 0000000000000246 R12: 0000000000000002
>>> R13: 0000000007270e00 R14: 000055f99670c9b8 R15: 0000000000000002
>>> </TASK>
>>> INFO: task modprobe:968 blocked for more than 180 seconds.
>>> Not tainted 6.9.0-rc2+ #23
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:modprobe state:D stack:0 pid:968 tgid:968 ppid:65 flags:0x00000000
>>> Call Trace:
>>> <TASK>
>>> __schedule+0x43d/0xe20
>>> schedule+0x31/0x130
>>> schedule_timeout+0x1b9/0x1d0
>>> ? __wait_for_common+0xb0/0x1d0
>>> ? lock_release+0xc6/0x290
>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>> __wait_for_common+0xb9/0x1d0
>>> ? usleep_range_state+0xb0/0xb0
>>> idempotent_init_module+0x1ae/0x290
>>> __x64_sys_finit_module+0x55/0xb0
>>> do_syscall_64+0x6c/0x170
>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>> RIP: 0033:0x7fde25530ddd
>>> RSP: 002b:00007fffac078518 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> RAX: ffffffffffffffda RBX: 0000558758e28ef0 RCX: 00007fde25530ddd
>>> RDX: 0000000000000000 RSI: 000055873cebf358 RDI: 0000000000000001
>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055873cebf358
>>> R13: 0000000000000000 R14: 0000558758e29020 R15: 0000558758e28ef0
>>> </TASK>
>>> INFO: task modprobe:969 blocked for more than 180 seconds.
>>> Not tainted 6.9.0-rc2+ #23
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:modprobe state:D stack:0 pid:969 tgid:969 ppid:93 flags:0x00000000
>>> Call Trace:
>>> <TASK>
>>> __schedule+0x43d/0xe20
>>> schedule+0x31/0x130
>>> schedule_timeout+0x1b9/0x1d0
>>> ? __wait_for_common+0xb0/0x1d0
>>> ? lock_release+0xc6/0x290
>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>> __wait_for_common+0xb9/0x1d0
>>> ? usleep_range_state+0xb0/0xb0
>>> idempotent_init_module+0x1ae/0x290
>>> __x64_sys_finit_module+0x55/0xb0
>>> do_syscall_64+0x6c/0x170
>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>> RIP: 0033:0x7f338d516ddd
>>> RSP: 002b:00007ffd155cd1e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> RAX: ffffffffffffffda RBX: 000056092cb0def0 RCX: 00007f338d516ddd
>>> RDX: 0000000000000000 RSI: 00005608ecb4a358 RDI: 0000000000000001
>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000001 R11: 0000000000000246 R12: 00005608ecb4a358
>>> R13: 0000000000000000 R14: 000056092cb0e020 R15: 000056092cb0def0
>>> </TASK>
>>> INFO: task modprobe:1044 blocked for more than 180 seconds.
>>> Not tainted 6.9.0-rc2+ #23
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:modprobe state:D stack:0 pid:1044 tgid:1044 ppid:10 flags:0x00000000
>>> Call Trace:
>>> <TASK>
>>> __schedule+0x43d/0xe20
>>> schedule+0x31/0x130
>>> schedule_timeout+0x1b9/0x1d0
>>> ? __wait_for_common+0xb0/0x1d0
>>> ? lock_release+0xc6/0x290
>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>> __wait_for_common+0xb9/0x1d0
>>> ? usleep_range_state+0xb0/0xb0
>>> idempotent_init_module+0x1ae/0x290
>>> __x64_sys_finit_module+0x55/0xb0
>>> do_syscall_64+0x6c/0x170
>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>> RIP: 0033:0x7f7637b30ddd
>>> RSP: 002b:00007ffe6251da78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> RAX: ffffffffffffffda RBX: 000055b889cb3ef0 RCX: 00007f7637b30ddd
>>> RDX: 0000000000000000 RSI: 000055b854eea358 RDI: 0000000000000001
>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055b854eea358
>>> R13: 0000000000000000 R14: 000055b889cb4020 R15: 000055b889cb3ef0
>>> </TASK>
>>> INFO: task modprobe:1047 blocked for more than 180 seconds.
>>> Not tainted 6.9.0-rc2+ #23
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:modprobe state:D stack:0 pid:1047 tgid:1047 ppid:113 flags:0x00000000
>>> Call Trace:
>>> <TASK>
>>> __schedule+0x43d/0xe20
>>> schedule+0x31/0x130
>>> schedule_timeout+0x1b9/0x1d0
>>> ? __wait_for_common+0xb0/0x1d0
>>> ? lock_release+0xc6/0x290
>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>> __wait_for_common+0xb9/0x1d0
>>> ? usleep_range_state+0xb0/0xb0
>>> idempotent_init_module+0x1ae/0x290
>>> __x64_sys_finit_module+0x55/0xb0
>>> do_syscall_64+0x6c/0x170
>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>> RIP: 0033:0x7f3907130ddd
>>> RSP: 002b:00007ffc36e4eb08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> RAX: ffffffffffffffda RBX: 000056100a856ef0 RCX: 00007f3907130ddd
>>> RDX: 0000000000000000 RSI: 0000560fff0ec358 RDI: 0000000000000001
>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000001 R11: 0000000000000246 R12: 0000560fff0ec358
>>> R13: 0000000000000000 R14: 000056100a857020 R15: 000056100a856ef0
>>> </TASK>
>>> INFO: task modprobe:1056 blocked for more than 180 seconds.
>>> Not tainted 6.9.0-rc2+ #23
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:modprobe state:D stack:0 pid:1056 tgid:1056 ppid:1045 flags:0x00000000
>>> Call Trace:
>>> <TASK>
>>> __schedule+0x43d/0xe20
>>> schedule+0x31/0x130
>>> schedule_timeout+0x1b9/0x1d0
>>> ? __wait_for_common+0xb0/0x1d0
>>> ? lock_release+0xc6/0x290
>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>> __wait_for_common+0xb9/0x1d0
>>> ? usleep_range_state+0xb0/0xb0
>>> idempotent_init_module+0x1ae/0x290
>>> __x64_sys_finit_module+0x55/0xb0
>>> do_syscall_64+0x6c/0x170
>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>> RIP: 0033:0x7fcb1e730ddd
>>> RSP: 002b:00007ffc692d0ad8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> RAX: ffffffffffffffda RBX: 000055f8d8828ef0 RCX: 00007fcb1e730ddd
>>> RDX: 0000000000000000 RSI: 000055f8bff36358 RDI: 0000000000000001
>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055f8bff36358
>>> R13: 0000000000000000 R14: 000055f8d8829020 R15: 000055f8d8828ef0
>>> </TASK>
>>> INFO: task modprobe:1058 blocked for more than 180 seconds.
>>> Not tainted 6.9.0-rc2+ #23
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:modprobe state:D stack:0 pid:1058 tgid:1058 ppid:1051 flags:0x00000000
>>> Call Trace:
>>> <TASK>
>>> __schedule+0x43d/0xe20
>>> schedule+0x31/0x130
>>> schedule_timeout+0x1b9/0x1d0
>>> ? __wait_for_common+0xb0/0x1d0
>>> ? lock_release+0xc6/0x290
>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>> __wait_for_common+0xb9/0x1d0
>>> ? usleep_range_state+0xb0/0xb0
>>> idempotent_init_module+0x1ae/0x290
>>> __x64_sys_finit_module+0x55/0xb0
>>> do_syscall_64+0x6c/0x170
>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>> RIP: 0033:0x7f0a17b30ddd
>>> RSP: 002b:00007fff56d619e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> RAX: ffffffffffffffda RBX: 000055abd6741ef0 RCX: 00007f0a17b30ddd
>>> RDX: 0000000000000000 RSI: 000055abc6586358 RDI: 0000000000000001
>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055abc6586358
>>> R13: 0000000000000000 R14: 000055abd6742020 R15: 000055abd6741ef0
>>> </TASK>
>>> INFO: task modprobe:1060 blocked for more than 181 seconds.
>>> Not tainted 6.9.0-rc2+ #23
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:modprobe state:D stack:0 pid:1060 tgid:1060 ppid:1057 flags:0x00000000
>>> Call Trace:
>>> <TASK>
>>> __schedule+0x43d/0xe20
>>> schedule+0x31/0x130
>>> schedule_timeout+0x1b9/0x1d0
>>> ? __wait_for_common+0xb0/0x1d0
>>> ? lock_release+0xc6/0x290
>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>> __wait_for_common+0xb9/0x1d0
>>> ? usleep_range_state+0xb0/0xb0
>>> idempotent_init_module+0x1ae/0x290
>>> __x64_sys_finit_module+0x55/0xb0
>>> do_syscall_64+0x6c/0x170
>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>> RIP: 0033:0x7f12c0130ddd
>>> RSP: 002b:00007ffccdef0488 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> RAX: ffffffffffffffda RBX: 000056249db40ef0 RCX: 00007f12c0130ddd
>>> RDX: 0000000000000000 RSI: 0000562471e4d358 RDI: 0000000000000001
>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000001 R11: 0000000000000246 R12: 0000562471e4d358
>>> R13: 0000000000000000 R14: 000056249db41020 R15: 000056249db40ef0
>>> </TASK>
>>>
>>> Showing all locks held in the system:
>>> 2 locks held by systemd/1:
>>> #0: ffff88812a7a10a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x1f/0x50
>>> #1: ffff88812a7a1130 (&tty->atomic_write_lock){+.+.}-{4:4}, at: file_tty_write.constprop.0+0xab/0x330
>>> 2 locks held by kworker/0:1/9:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc900000afe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/u32:0/10:
>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc900000b7e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/3:0/37:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc900001cbe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/7:0/61:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc9000029be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/u32:1/65:
>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc900002bfe50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 1 lock held by khungtaskd/66:
>>> #0: ffffffff8296e760 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x32/0x1c0
>>> 2 locks held by kworker/1:1/79:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc9000032fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/u32:2/93:
>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc900003d3e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/6:1/94:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc900003dbe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/3:1/96:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc900003ebe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/1:2/102:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc90000eabe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/u32:3/107:
>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc90000ed3e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/u32:4/113:
>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc90000f03e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/6:2/189:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc90000e0fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/6:5/196:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc90000f13e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/6:6/197:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc90000f23e50 ((work_completion)(&(&hda->probe_work)->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/6:8/199:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc90000f53e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/7:2/296:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc9000105be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/7:3/297:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc90001043e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/7:4/298:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc90001063e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/7:5/320:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc90001003e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/2:2/371:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc9000104be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/5:13/648:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc9000198fe50 ((deferred_probe_timeout_work).work){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/5:14/649:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc90001997e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/5:15/650:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc9000199fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/5:16/651:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc900019a7e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/4:3/722:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc90001a27e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/1:4/768:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc900010d7e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/1:5/769:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc900010dfe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/0:2/849:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc90001353e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by lvm/860:
>>> #0: ffff8881323c19a8 (&md->type_lock){+.+.}-{4:4}, at: table_load+0xc9/0x400
>>> #1: ffff88813200c3b8 (&mddev->reconfig_mutex){+.+.}-{4:4}, at: raid_ctr+0x13b3/0x2860 [dm_raid]
>>> 2 locks held by modprobe/1019:
>>> #0: ffffffffa0ca7b68 (iwlwifi_opmode_table_mtx){+.+.}-{4:4}, at: iwl_opmode_register+0x27/0xd0 [iwlwifi]
>>> #1: ffff888139f88270 (&led_cdev->led_access){+.+.}-{4:4}, at: led_classdev_register_ext+0x195/0x450
>>> 2 locks held by kworker/u32:5/1045:
>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc90004367e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/u32:6/1051:
>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc90004703e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/u32:7/1057:
>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc90004a97e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/3:3/1111:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc90005bafe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/3:4/1132:
>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>> #1: ffffc90005e13e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>
>>> =============================================
>>>
>>>
>>
>> I ran a bisect on this. The tagged bad commit is a LED related merge, but commit
>> shows no code changes when I look at it in git. I double checked that the
>> merge is bad by manually going to it again at the end of the bisect and
>> indeed it fails.
>>
>> From looking at lockdep, this below may be interesting. I do have 24 intel be200 radios
>> in this system, so maybe lots of iwlwifi radios help trigger the problem?
>>
>>> 2 locks held by modprobe/1019:
>>> #0: ffffffffa0ca7b68 (iwlwifi_opmode_table_mtx){+.+.}-{4:4}, at: iwl_opmode_register+0x27/0xd0 [iwlwifi]
>>> #1: ffff888139f88270 (&led_cdev->led_access){+.+.}-{4:4}, at: led_classdev_register_ext+0x195/0x450
>>
>> Please let me know if you have any suggestions for how to debug this further.
>>
>> [greearb@ben-dt5 linux-2.6]$ git bisect log
>> git bisect start
>> # status: waiting for both good and bad commits
>> # good: [e8f897f4afef0031fe618a8e94127a0934896aba] Linux 6.8
>> git bisect good e8f897f4afef0031fe618a8e94127a0934896aba
>> # status: waiting for bad commit, 1 good commit known
>> # bad: [4cece764965020c22cff7665b18a012006359095] Linux 6.9-rc1
>> git bisect bad 4cece764965020c22cff7665b18a012006359095
>> # good: [e5e038b7ae9da96b93974bf072ca1876899a01a3] Merge tag 'fs_for_v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
>> git bisect good e5e038b7ae9da96b93974bf072ca1876899a01a3
>> # bad: [32a50540c3d26341698505998dfca5b0e8fb4fd4] Merge tag 'bcachefs-2024-03-13' of https://evilpiepirate.org/git/bcachefs
>> git bisect bad 32a50540c3d26341698505998dfca5b0e8fb4fd4
>> # good: [a3df5d5422b4edfcfe658d5057e7e059571e32ce] Merge tag 'pinctrl-v6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
>> git bisect good a3df5d5422b4edfcfe658d5057e7e059571e32ce
>> # bad: [c0a614e82ece41d15b7a66f43ee79f4dbdbc925a] Merge tag 'lsm-pr-20240314' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
>> git bisect bad c0a614e82ece41d15b7a66f43ee79f4dbdbc925a
>> # bad: [705c1da8fa4816fb0159b5602fef1df5946a3ee2] Merge tag 'pci-v6.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
>> git bisect bad 705c1da8fa4816fb0159b5602fef1df5946a3ee2
>> # bad: [f5c31bcf604db54470868f3118a60dc4a9ba8813] Merge tag 'leds-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
>> git bisect bad f5c31bcf604db54470868f3118a60dc4a9ba8813
>> # good: [8403ce70be339d462892a2b935ae30ee52416f92] Merge tag 'mfd-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
>> git bisect good 8403ce70be339d462892a2b935ae30ee52416f92
>> # good: [2cd0d1db31e78a63553876f8e6a4c9dcc1f9c061] leds: expresswire: Don't depend on NEW_LEDS
>> git bisect good 2cd0d1db31e78a63553876f8e6a4c9dcc1f9c061
>> # good: [23749cf3dfff5dcd706183ade1d27198a37b3881] backlight: gpio: Simplify with dev_err_probe()
>> git bisect good 23749cf3dfff5dcd706183ade1d27198a37b3881
>> # good: [2c7c70f54f791ece44541a9254c1a73790fd4595] dt-bindings: leds: Add NCP5623 multi-LED Controller
>> git bisect good 2c7c70f54f791ece44541a9254c1a73790fd4595
>> # good: [c9128ed7b9edeb2b6f1faec06d96b2fd5bc72cb8] backlight: lm3630a_bl: Simplify probe return on gpio request error
>> git bisect good c9128ed7b9edeb2b6f1faec06d96b2fd5bc72cb8
>> # good: [45066c4bbe8ca25f9f282245b84568116c783f1d] leds: ncp5623: Add MS suffix to time defines
>> git bisect good 45066c4bbe8ca25f9f282245b84568116c783f1d
>> # good: [f3d8f29d1f59230b8c2a09e6dee7db7bd295e42c] Merge tag 'backlight-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight
>> git bisect good f3d8f29d1f59230b8c2a09e6dee7db7bd295e42c
>> # first bad commit: [f5c31bcf604db54470868f3118a60dc4a9ba8813] Merge tag 'leds-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
>> [greearb@ben-dt5 linux-2.6]$
>>
>> Thanks,
>> Ben
>>
>
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-05-05 10:55 ` Tetsuo Handa
@ 2024-05-07 4:21 ` Ben Greear
0 siblings, 0 replies; 24+ messages in thread
From: Ben Greear @ 2024-05-07 4:21 UTC (permalink / raw)
To: Tetsuo Handa, Lee Jones,
Linux regression tracking (Thorsten Leemhuis)
Cc: LKML, linux-leds, Miri Korenblit, Johannes Berg,
Linux regressions mailing list
On 5/5/24 03:55, Tetsuo Handa wrote:
> On 2024/05/05 14:48, Ben Greear wrote:
>> If someone has a different suggested fix than the hack I posted, I will be happy to
>> test. On my system with lots of radios, it is 100% reproducible.
>> Maybe email me directly as I don't keep close watch on LKML.
>
> Please collect stacktraces of all lock holders using
> https://lkml.kernel.org/r/77e32fdc-3f63-e124-588e-7d60dd66fc9a@I-love.SAKURA.ne.jp .
>
> Depending on the output, I might ask you to decode addresses using ./scripts/faddr2line .
>
>
I am travelling for next few weeks, but will work on this when I return.
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-05-07 4:20 ` Ben Greear
@ 2024-05-07 6:27 ` Heiner Kallweit
2024-05-07 10:58 ` Heiner Kallweit
0 siblings, 1 reply; 24+ messages in thread
From: Heiner Kallweit @ 2024-05-07 6:27 UTC (permalink / raw)
To: Ben Greear, LKML, linux-leds, Lee Jones; +Cc: Johannes Berg
On 07.05.2024 06:20, Ben Greear wrote:
> On 5/6/24 13:00, Heiner Kallweit wrote:
>> On 03.04.2024 21:35, Ben Greear wrote:
>>> On 4/2/24 10:38, Ben Greear wrote:
>>>> On 4/2/24 09:37, Ben Greear wrote:
>>>>> Hello,
>>>>>
>>>>> Sometime between rc1 and today's rc2, my system quit booting.
>>>>> I'm not seeing any splats, it just stops. Evidently before
>>>>> sysrq is enabled.
>>>>>
>>
>> For my understanding:
>> You say 6.9-rc1 was ok, but 6.9-rc2 is not?
>>
>> If I look at the diff then I see no LED subsystem changes,
>> but iwlwifi changes. It's not clear to me why your bisect
>> points to something outside the diff.
>
> I was incorrect in my early assessment about exactly where
> the error came in. I later ran a full bisect to find the commit
> that showed the error. The problem only seems to happen when there
> are lots of iwlwifi (in my case) radios in a system, so that added
> to my initial confusion on the bug.
>
> It is almost certainly LED related, as my initial hack to make the problem
> go away was to just comment out the led registration logic in iwlwifi.
>
The bisect points to a merge commit, what makes root cause analysis cumbersome.
Could you bisect based on the following? Between for-leds-fixes and leds-next-6.9
Then the bisect result should point to the individual commit.
git.kernel.org/pub/scm/linux/kernel/git/lee/leds.git/log/?h=leds-next-6.9
> Johanne's solution also makes the problem go away.
>
> Thanks,
> Ben
>
>>
>>
>>>>> [ OK ] Started Flush Journal to Persistent Storage.
>>>>> [ OK ] Started udev Coldplug all Devices.
>>>>> Starting udev Wait for Complete Device Initialization...
>>>>> [ OK ] Listening on Load/Save RF …itch Status /dev/rfkill Watch.
>>>>> [ OK ] Created slice system-lvm2\x2dpvscan.slice.
>>>>> Starting LVM2 PV scan on device 8:19...
>>>>> Starting LVM2 PV scan on device 8:3...
>>>>> [ OK ] Started Device-mapper event daemon.
>>>>> iwlwifi 0000:04:00.0: WRT: Invalid buffer destination: 0
>>>>> sysrq: This sysrq operation is disabled.
>>>>>
>>>>> I can start a bisect, but in case anyone knows the answer already, please let me know.
>>>>>
>>>>> Thanks,
>>>>> Ben
>>>>>
>>>>
>>>> So, deadlock I guess....
>>>>
>>>> INFO: task kworker/5:13:648 blocked for more than 180 seconds.
>>>> Not tainted 6.9.0-rc2+ #23
>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> task:kworker/5:13 state:D stack:0 pid:648 tgid:648 ppid:2 flags:0x00004000
>>>> Workqueue: events deferred_probe_timeout_work_func
>>>> Call Trace:
>>>> <TASK>
>>>> __schedule+0x43d/0xe20
>>>> schedule+0x31/0x130
>>>> schedule_timeout+0x1b9/0x1d0
>>>> ? mark_held_locks+0x49/0x70
>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>> __wait_for_common+0xb9/0x1d0
>>>> ? usleep_range_state+0xb0/0xb0
>>>> ? __flush_work+0x1ff/0x460
>>>> __flush_work+0x287/0x460
>>>> ? flush_workqueue_prep_pwqs+0x120/0x120
>>>> deferred_probe_timeout_work_func+0x2b/0xa0
>>>> process_one_work+0x212/0x710
>>>> ? lock_is_held_type+0xa5/0x110
>>>> worker_thread+0x188/0x340
>>>> ? rescuer_thread+0x380/0x380
>>>> kthread+0xd7/0x110
>>>> ? kthread_complete_and_exit+0x20/0x20
>>>> ret_from_fork+0x28/0x40
>>>> ? kthread_complete_and_exit+0x20/0x20
>>>> ret_from_fork_asm+0x11/0x20
>>>> </TASK>
>>>> INFO: task udevadm:763 blocked for more than 180 seconds.
>>>> Not tainted 6.9.0-rc2+ #23
>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> task:udevadm state:D stack:0 pid:763 tgid:763 ppid:1 flags:0x00000000
>>>> Call Trace:
>>>> <TASK>
>>>> __schedule+0x43d/0xe20
>>>> schedule+0x31/0x130
>>>> schedule_timeout+0x1b9/0x1d0
>>>> ? __wait_for_common+0xb0/0x1d0
>>>> ? lock_release+0xc6/0x290
>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>> __wait_for_common+0xb9/0x1d0
>>>> ? usleep_range_state+0xb0/0xb0
>>>> ? __flush_work+0x1ff/0x460
>>>> __flush_work+0x287/0x460
>>>> ? flush_workqueue_prep_pwqs+0x120/0x120
>>>> fsnotify_destroy_group+0x66/0xf0
>>>> inotify_release+0x12/0x40
>>>> __fput+0xa6/0x2d0
>>>> __x64_sys_close+0x33/0x70
>>>> do_syscall_64+0x6c/0x170
>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>> RIP: 0033:0x7f744d5bc878
>>>> RSP: 002b:00007ffcef12f8d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
>>>> RAX: ffffffffffffffda RBX: 00007f744cd048c0 RCX: 00007f744d5bc878
>>>> RDX: ffffffffffffff80 RSI: 0000000000000000 RDI: 0000000000000003
>>>> RBP: 0000000000000003 R08: 000055f9ce349fb0 R09: 0000000000000000
>>>> R10: 00007ffcef12f8f0 R11: 0000000000000246 R12: 0000000000000002
>>>> R13: 0000000007270e00 R14: 000055f99670c9b8 R15: 0000000000000002
>>>> </TASK>
>>>> INFO: task modprobe:968 blocked for more than 180 seconds.
>>>> Not tainted 6.9.0-rc2+ #23
>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> task:modprobe state:D stack:0 pid:968 tgid:968 ppid:65 flags:0x00000000
>>>> Call Trace:
>>>> <TASK>
>>>> __schedule+0x43d/0xe20
>>>> schedule+0x31/0x130
>>>> schedule_timeout+0x1b9/0x1d0
>>>> ? __wait_for_common+0xb0/0x1d0
>>>> ? lock_release+0xc6/0x290
>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>> __wait_for_common+0xb9/0x1d0
>>>> ? usleep_range_state+0xb0/0xb0
>>>> idempotent_init_module+0x1ae/0x290
>>>> __x64_sys_finit_module+0x55/0xb0
>>>> do_syscall_64+0x6c/0x170
>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>> RIP: 0033:0x7fde25530ddd
>>>> RSP: 002b:00007fffac078518 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>> RAX: ffffffffffffffda RBX: 0000558758e28ef0 RCX: 00007fde25530ddd
>>>> RDX: 0000000000000000 RSI: 000055873cebf358 RDI: 0000000000000001
>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055873cebf358
>>>> R13: 0000000000000000 R14: 0000558758e29020 R15: 0000558758e28ef0
>>>> </TASK>
>>>> INFO: task modprobe:969 blocked for more than 180 seconds.
>>>> Not tainted 6.9.0-rc2+ #23
>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> task:modprobe state:D stack:0 pid:969 tgid:969 ppid:93 flags:0x00000000
>>>> Call Trace:
>>>> <TASK>
>>>> __schedule+0x43d/0xe20
>>>> schedule+0x31/0x130
>>>> schedule_timeout+0x1b9/0x1d0
>>>> ? __wait_for_common+0xb0/0x1d0
>>>> ? lock_release+0xc6/0x290
>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>> __wait_for_common+0xb9/0x1d0
>>>> ? usleep_range_state+0xb0/0xb0
>>>> idempotent_init_module+0x1ae/0x290
>>>> __x64_sys_finit_module+0x55/0xb0
>>>> do_syscall_64+0x6c/0x170
>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>> RIP: 0033:0x7f338d516ddd
>>>> RSP: 002b:00007ffd155cd1e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>> RAX: ffffffffffffffda RBX: 000056092cb0def0 RCX: 00007f338d516ddd
>>>> RDX: 0000000000000000 RSI: 00005608ecb4a358 RDI: 0000000000000001
>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 00005608ecb4a358
>>>> R13: 0000000000000000 R14: 000056092cb0e020 R15: 000056092cb0def0
>>>> </TASK>
>>>> INFO: task modprobe:1044 blocked for more than 180 seconds.
>>>> Not tainted 6.9.0-rc2+ #23
>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> task:modprobe state:D stack:0 pid:1044 tgid:1044 ppid:10 flags:0x00000000
>>>> Call Trace:
>>>> <TASK>
>>>> __schedule+0x43d/0xe20
>>>> schedule+0x31/0x130
>>>> schedule_timeout+0x1b9/0x1d0
>>>> ? __wait_for_common+0xb0/0x1d0
>>>> ? lock_release+0xc6/0x290
>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>> __wait_for_common+0xb9/0x1d0
>>>> ? usleep_range_state+0xb0/0xb0
>>>> idempotent_init_module+0x1ae/0x290
>>>> __x64_sys_finit_module+0x55/0xb0
>>>> do_syscall_64+0x6c/0x170
>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>> RIP: 0033:0x7f7637b30ddd
>>>> RSP: 002b:00007ffe6251da78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>> RAX: ffffffffffffffda RBX: 000055b889cb3ef0 RCX: 00007f7637b30ddd
>>>> RDX: 0000000000000000 RSI: 000055b854eea358 RDI: 0000000000000001
>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055b854eea358
>>>> R13: 0000000000000000 R14: 000055b889cb4020 R15: 000055b889cb3ef0
>>>> </TASK>
>>>> INFO: task modprobe:1047 blocked for more than 180 seconds.
>>>> Not tainted 6.9.0-rc2+ #23
>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> task:modprobe state:D stack:0 pid:1047 tgid:1047 ppid:113 flags:0x00000000
>>>> Call Trace:
>>>> <TASK>
>>>> __schedule+0x43d/0xe20
>>>> schedule+0x31/0x130
>>>> schedule_timeout+0x1b9/0x1d0
>>>> ? __wait_for_common+0xb0/0x1d0
>>>> ? lock_release+0xc6/0x290
>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>> __wait_for_common+0xb9/0x1d0
>>>> ? usleep_range_state+0xb0/0xb0
>>>> idempotent_init_module+0x1ae/0x290
>>>> __x64_sys_finit_module+0x55/0xb0
>>>> do_syscall_64+0x6c/0x170
>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>> RIP: 0033:0x7f3907130ddd
>>>> RSP: 002b:00007ffc36e4eb08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>> RAX: ffffffffffffffda RBX: 000056100a856ef0 RCX: 00007f3907130ddd
>>>> RDX: 0000000000000000 RSI: 0000560fff0ec358 RDI: 0000000000000001
>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 0000560fff0ec358
>>>> R13: 0000000000000000 R14: 000056100a857020 R15: 000056100a856ef0
>>>> </TASK>
>>>> INFO: task modprobe:1056 blocked for more than 180 seconds.
>>>> Not tainted 6.9.0-rc2+ #23
>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> task:modprobe state:D stack:0 pid:1056 tgid:1056 ppid:1045 flags:0x00000000
>>>> Call Trace:
>>>> <TASK>
>>>> __schedule+0x43d/0xe20
>>>> schedule+0x31/0x130
>>>> schedule_timeout+0x1b9/0x1d0
>>>> ? __wait_for_common+0xb0/0x1d0
>>>> ? lock_release+0xc6/0x290
>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>> __wait_for_common+0xb9/0x1d0
>>>> ? usleep_range_state+0xb0/0xb0
>>>> idempotent_init_module+0x1ae/0x290
>>>> __x64_sys_finit_module+0x55/0xb0
>>>> do_syscall_64+0x6c/0x170
>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>> RIP: 0033:0x7fcb1e730ddd
>>>> RSP: 002b:00007ffc692d0ad8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>> RAX: ffffffffffffffda RBX: 000055f8d8828ef0 RCX: 00007fcb1e730ddd
>>>> RDX: 0000000000000000 RSI: 000055f8bff36358 RDI: 0000000000000001
>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055f8bff36358
>>>> R13: 0000000000000000 R14: 000055f8d8829020 R15: 000055f8d8828ef0
>>>> </TASK>
>>>> INFO: task modprobe:1058 blocked for more than 180 seconds.
>>>> Not tainted 6.9.0-rc2+ #23
>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> task:modprobe state:D stack:0 pid:1058 tgid:1058 ppid:1051 flags:0x00000000
>>>> Call Trace:
>>>> <TASK>
>>>> __schedule+0x43d/0xe20
>>>> schedule+0x31/0x130
>>>> schedule_timeout+0x1b9/0x1d0
>>>> ? __wait_for_common+0xb0/0x1d0
>>>> ? lock_release+0xc6/0x290
>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>> __wait_for_common+0xb9/0x1d0
>>>> ? usleep_range_state+0xb0/0xb0
>>>> idempotent_init_module+0x1ae/0x290
>>>> __x64_sys_finit_module+0x55/0xb0
>>>> do_syscall_64+0x6c/0x170
>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>> RIP: 0033:0x7f0a17b30ddd
>>>> RSP: 002b:00007fff56d619e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>> RAX: ffffffffffffffda RBX: 000055abd6741ef0 RCX: 00007f0a17b30ddd
>>>> RDX: 0000000000000000 RSI: 000055abc6586358 RDI: 0000000000000001
>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055abc6586358
>>>> R13: 0000000000000000 R14: 000055abd6742020 R15: 000055abd6741ef0
>>>> </TASK>
>>>> INFO: task modprobe:1060 blocked for more than 181 seconds.
>>>> Not tainted 6.9.0-rc2+ #23
>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> task:modprobe state:D stack:0 pid:1060 tgid:1060 ppid:1057 flags:0x00000000
>>>> Call Trace:
>>>> <TASK>
>>>> __schedule+0x43d/0xe20
>>>> schedule+0x31/0x130
>>>> schedule_timeout+0x1b9/0x1d0
>>>> ? __wait_for_common+0xb0/0x1d0
>>>> ? lock_release+0xc6/0x290
>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>> __wait_for_common+0xb9/0x1d0
>>>> ? usleep_range_state+0xb0/0xb0
>>>> idempotent_init_module+0x1ae/0x290
>>>> __x64_sys_finit_module+0x55/0xb0
>>>> do_syscall_64+0x6c/0x170
>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>> RIP: 0033:0x7f12c0130ddd
>>>> RSP: 002b:00007ffccdef0488 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>> RAX: ffffffffffffffda RBX: 000056249db40ef0 RCX: 00007f12c0130ddd
>>>> RDX: 0000000000000000 RSI: 0000562471e4d358 RDI: 0000000000000001
>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 0000562471e4d358
>>>> R13: 0000000000000000 R14: 000056249db41020 R15: 000056249db40ef0
>>>> </TASK>
>>>>
>>>> Showing all locks held in the system:
>>>> 2 locks held by systemd/1:
>>>> #0: ffff88812a7a10a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x1f/0x50
>>>> #1: ffff88812a7a1130 (&tty->atomic_write_lock){+.+.}-{4:4}, at: file_tty_write.constprop.0+0xab/0x330
>>>> 2 locks held by kworker/0:1/9:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc900000afe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/u32:0/10:
>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc900000b7e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/3:0/37:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc900001cbe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/7:0/61:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc9000029be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/u32:1/65:
>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc900002bfe50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 1 lock held by khungtaskd/66:
>>>> #0: ffffffff8296e760 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x32/0x1c0
>>>> 2 locks held by kworker/1:1/79:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc9000032fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/u32:2/93:
>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc900003d3e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/6:1/94:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc900003dbe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/3:1/96:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc900003ebe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/1:2/102:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc90000eabe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/u32:3/107:
>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc90000ed3e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/u32:4/113:
>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc90000f03e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/6:2/189:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc90000e0fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/6:5/196:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc90000f13e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/6:6/197:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc90000f23e50 ((work_completion)(&(&hda->probe_work)->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/6:8/199:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc90000f53e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/7:2/296:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc9000105be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/7:3/297:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc90001043e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/7:4/298:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc90001063e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/7:5/320:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc90001003e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/2:2/371:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc9000104be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/5:13/648:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc9000198fe50 ((deferred_probe_timeout_work).work){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/5:14/649:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc90001997e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/5:15/650:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc9000199fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/5:16/651:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc900019a7e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/4:3/722:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc90001a27e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/1:4/768:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc900010d7e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/1:5/769:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc900010dfe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/0:2/849:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc90001353e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by lvm/860:
>>>> #0: ffff8881323c19a8 (&md->type_lock){+.+.}-{4:4}, at: table_load+0xc9/0x400
>>>> #1: ffff88813200c3b8 (&mddev->reconfig_mutex){+.+.}-{4:4}, at: raid_ctr+0x13b3/0x2860 [dm_raid]
>>>> 2 locks held by modprobe/1019:
>>>> #0: ffffffffa0ca7b68 (iwlwifi_opmode_table_mtx){+.+.}-{4:4}, at: iwl_opmode_register+0x27/0xd0 [iwlwifi]
>>>> #1: ffff888139f88270 (&led_cdev->led_access){+.+.}-{4:4}, at: led_classdev_register_ext+0x195/0x450
>>>> 2 locks held by kworker/u32:5/1045:
>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc90004367e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/u32:6/1051:
>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc90004703e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/u32:7/1057:
>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc90004a97e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/3:3/1111:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc90005bafe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>> 2 locks held by kworker/3:4/1132:
>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>> #1: ffffc90005e13e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>
>>>> =============================================
>>>>
>>>>
>>>
>>> I ran a bisect on this. The tagged bad commit is a LED related merge, but commit
>>> shows no code changes when I look at it in git. I double checked that the
>>> merge is bad by manually going to it again at the end of the bisect and
>>> indeed it fails.
>>>
>>> From looking at lockdep, this below may be interesting. I do have 24 intel be200 radios
>>> in this system, so maybe lots of iwlwifi radios help trigger the problem?
>>>
>>>> 2 locks held by modprobe/1019:
>>>> #0: ffffffffa0ca7b68 (iwlwifi_opmode_table_mtx){+.+.}-{4:4}, at: iwl_opmode_register+0x27/0xd0 [iwlwifi]
>>>> #1: ffff888139f88270 (&led_cdev->led_access){+.+.}-{4:4}, at: led_classdev_register_ext+0x195/0x450
>>>
>>> Please let me know if you have any suggestions for how to debug this further.
>>>
>>> [greearb@ben-dt5 linux-2.6]$ git bisect log
>>> git bisect start
>>> # status: waiting for both good and bad commits
>>> # good: [e8f897f4afef0031fe618a8e94127a0934896aba] Linux 6.8
>>> git bisect good e8f897f4afef0031fe618a8e94127a0934896aba
>>> # status: waiting for bad commit, 1 good commit known
>>> # bad: [4cece764965020c22cff7665b18a012006359095] Linux 6.9-rc1
>>> git bisect bad 4cece764965020c22cff7665b18a012006359095
>>> # good: [e5e038b7ae9da96b93974bf072ca1876899a01a3] Merge tag 'fs_for_v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
>>> git bisect good e5e038b7ae9da96b93974bf072ca1876899a01a3
>>> # bad: [32a50540c3d26341698505998dfca5b0e8fb4fd4] Merge tag 'bcachefs-2024-03-13' of https://evilpiepirate.org/git/bcachefs
>>> git bisect bad 32a50540c3d26341698505998dfca5b0e8fb4fd4
>>> # good: [a3df5d5422b4edfcfe658d5057e7e059571e32ce] Merge tag 'pinctrl-v6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
>>> git bisect good a3df5d5422b4edfcfe658d5057e7e059571e32ce
>>> # bad: [c0a614e82ece41d15b7a66f43ee79f4dbdbc925a] Merge tag 'lsm-pr-20240314' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
>>> git bisect bad c0a614e82ece41d15b7a66f43ee79f4dbdbc925a
>>> # bad: [705c1da8fa4816fb0159b5602fef1df5946a3ee2] Merge tag 'pci-v6.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
>>> git bisect bad 705c1da8fa4816fb0159b5602fef1df5946a3ee2
>>> # bad: [f5c31bcf604db54470868f3118a60dc4a9ba8813] Merge tag 'leds-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
>>> git bisect bad f5c31bcf604db54470868f3118a60dc4a9ba8813
>>> # good: [8403ce70be339d462892a2b935ae30ee52416f92] Merge tag 'mfd-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
>>> git bisect good 8403ce70be339d462892a2b935ae30ee52416f92
>>> # good: [2cd0d1db31e78a63553876f8e6a4c9dcc1f9c061] leds: expresswire: Don't depend on NEW_LEDS
>>> git bisect good 2cd0d1db31e78a63553876f8e6a4c9dcc1f9c061
>>> # good: [23749cf3dfff5dcd706183ade1d27198a37b3881] backlight: gpio: Simplify with dev_err_probe()
>>> git bisect good 23749cf3dfff5dcd706183ade1d27198a37b3881
>>> # good: [2c7c70f54f791ece44541a9254c1a73790fd4595] dt-bindings: leds: Add NCP5623 multi-LED Controller
>>> git bisect good 2c7c70f54f791ece44541a9254c1a73790fd4595
>>> # good: [c9128ed7b9edeb2b6f1faec06d96b2fd5bc72cb8] backlight: lm3630a_bl: Simplify probe return on gpio request error
>>> git bisect good c9128ed7b9edeb2b6f1faec06d96b2fd5bc72cb8
>>> # good: [45066c4bbe8ca25f9f282245b84568116c783f1d] leds: ncp5623: Add MS suffix to time defines
>>> git bisect good 45066c4bbe8ca25f9f282245b84568116c783f1d
>>> # good: [f3d8f29d1f59230b8c2a09e6dee7db7bd295e42c] Merge tag 'backlight-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight
>>> git bisect good f3d8f29d1f59230b8c2a09e6dee7db7bd295e42c
>>> # first bad commit: [f5c31bcf604db54470868f3118a60dc4a9ba8813] Merge tag 'leds-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
>>> [greearb@ben-dt5 linux-2.6]$
>>>
>>> Thanks,
>>> Ben
>>>
>>
>
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-05-05 5:48 ` Ben Greear
2024-05-05 10:55 ` Tetsuo Handa
@ 2024-05-07 8:23 ` Lee Jones
1 sibling, 0 replies; 24+ messages in thread
From: Lee Jones @ 2024-05-07 8:23 UTC (permalink / raw)
To: Ben Greear
Cc: Linux regression tracking (Thorsten Leemhuis), LKML, linux-leds,
Miri Korenblit, Johannes Berg, Linux regressions mailing list
On Sat, 04 May 2024, Ben Greear wrote:
> On 5/2/24 00:19, Lee Jones wrote:
> > On Tue, 23 Apr 2024, Linux regression tracking (Thorsten Leemhuis)
> > wrote:
> >
> > > On 23.04.24 11:06, Johannes Berg wrote:
> > > > On Tue, 2024-04-23 at 11:00 +0200, Linux regression tracking
> > > > (Thorsten Leemhuis) wrote:
> > > > > On 16.04.24 08:17, Johannes Berg wrote:
> > > > > > On Mon, 2024-04-15 at 13:37 -0700, Ben Greear wrote:
> > > > > > >
> > > > > > > Johannes, you had another suggestion: changing iwlwifi's
> > > > > > > request_module() to request_module_nowait() in
> > > > > > > iwl_req_fw_callback()
> > > > > > >
> > > > > > > Is that still best thing to try in your opinion?
> > > > > >
> > > > > > I guess so, I don't have any better ideas so far anyway ...
> > > > >
> > > > > [adding the iwlwifi maintainer; thread starts here:
> > > > > https://lore.kernel.org/lkml/30f757e3-73c5-5473-c1f8-328bab98fd7d@candelatech.com/
> > > > >
> > > > > ]
> > > > >
> > > > > Johannes, Miri, what's the status wrt to this regression? From
> > > > > here things look somewhat stalled -- but maybe there was
> > > > > progress and I just missed it.
> > > >
> > > > What do you want? It got bisected to an LED merge, but you ping
> > > > _us_? Way to go ...
> > >
> > > Sorry, to me it sounded a bit like you had an idea for a fix and
> > > were going to give it a try -- similar to how the maintainers for
> > > a r8169 driver and the igc driver provided fixes for bugs recent
> > > LED changes exposed.
> > >
> > > But sure, you are right, in the end some LED change seems to have
> > > cause this, so the duty to fix it lies in that field. Therefore:
> > >
> > > Lee, what's the status here to get this fixed before the final?
> >
> > No idea. Did you send a fix?
>
> I sent what is probably just a work-around. I also spent time
> bisecting and testing. The problem appears to have come in with the
> LED related merge. I think it is fair to ask the LED folks to at
> least take a look at the lockdep debugging I posted.
I can't speak for Pavel, but I personally have no way of debugging or
reproducing this. The only usefulness I can provide is to review and
apply fixes as and when they appear.
> It is not fair to expect anyone that manages to find or track a bug to
> also fix it.
No such expectation has been felt or communicated.
--
Lee Jones [李琼斯]
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-05-07 6:27 ` Heiner Kallweit
@ 2024-05-07 10:58 ` Heiner Kallweit
2024-05-08 5:51 ` Heiner Kallweit
0 siblings, 1 reply; 24+ messages in thread
From: Heiner Kallweit @ 2024-05-07 10:58 UTC (permalink / raw)
To: Ben Greear, LKML, linux-leds, Lee Jones; +Cc: Johannes Berg
On 07.05.2024 08:27, Heiner Kallweit wrote:
> On 07.05.2024 06:20, Ben Greear wrote:
>> On 5/6/24 13:00, Heiner Kallweit wrote:
>>> On 03.04.2024 21:35, Ben Greear wrote:
>>>> On 4/2/24 10:38, Ben Greear wrote:
>>>>> On 4/2/24 09:37, Ben Greear wrote:
>>>>>> Hello,
>>>>>>
>>>>>> Sometime between rc1 and today's rc2, my system quit booting.
>>>>>> I'm not seeing any splats, it just stops. Evidently before
>>>>>> sysrq is enabled.
>>>>>>
>>>
>>> For my understanding:
>>> You say 6.9-rc1 was ok, but 6.9-rc2 is not?
>>>
>>> If I look at the diff then I see no LED subsystem changes,
>>> but iwlwifi changes. It's not clear to me why your bisect
>>> points to something outside the diff.
>>
>> I was incorrect in my early assessment about exactly where
>> the error came in. I later ran a full bisect to find the commit
>> that showed the error. The problem only seems to happen when there
>> are lots of iwlwifi (in my case) radios in a system, so that added
>> to my initial confusion on the bug.
>>
>> It is almost certainly LED related, as my initial hack to make the problem
>> go away was to just comment out the led registration logic in iwlwifi.
>>
> The bisect points to a merge commit, what makes root cause analysis cumbersome.
> Could you bisect based on the following? Between for-leds-fixes and leds-next-6.9
> Then the bisect result should point to the individual commit.
>
> git.kernel.org/pub/scm/linux/kernel/git/lee/leds.git/log/?h=leds-next-6.9
>
The issue seems to be somewhat related to module loading. So a trace of module
loading activity may help. Could you please enable module activity tracing
and provide the resulting trace file?
/sys/kernel/debug/tracing/events/module/enable
>> Johanne's solution also makes the problem go away.
>>
>> Thanks,
>> Ben
>>
>>>
>>>
>>>>>> [ OK ] Started Flush Journal to Persistent Storage.
>>>>>> [ OK ] Started udev Coldplug all Devices.
>>>>>> Starting udev Wait for Complete Device Initialization...
>>>>>> [ OK ] Listening on Load/Save RF …itch Status /dev/rfkill Watch.
>>>>>> [ OK ] Created slice system-lvm2\x2dpvscan.slice.
>>>>>> Starting LVM2 PV scan on device 8:19...
>>>>>> Starting LVM2 PV scan on device 8:3...
>>>>>> [ OK ] Started Device-mapper event daemon.
>>>>>> iwlwifi 0000:04:00.0: WRT: Invalid buffer destination: 0
>>>>>> sysrq: This sysrq operation is disabled.
>>>>>>
>>>>>> I can start a bisect, but in case anyone knows the answer already, please let me know.
>>>>>>
>>>>>> Thanks,
>>>>>> Ben
>>>>>>
>>>>>
>>>>> So, deadlock I guess....
>>>>>
>>>>> INFO: task kworker/5:13:648 blocked for more than 180 seconds.
>>>>> Not tainted 6.9.0-rc2+ #23
>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>> task:kworker/5:13 state:D stack:0 pid:648 tgid:648 ppid:2 flags:0x00004000
>>>>> Workqueue: events deferred_probe_timeout_work_func
>>>>> Call Trace:
>>>>> <TASK>
>>>>> __schedule+0x43d/0xe20
>>>>> schedule+0x31/0x130
>>>>> schedule_timeout+0x1b9/0x1d0
>>>>> ? mark_held_locks+0x49/0x70
>>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>>> __wait_for_common+0xb9/0x1d0
>>>>> ? usleep_range_state+0xb0/0xb0
>>>>> ? __flush_work+0x1ff/0x460
>>>>> __flush_work+0x287/0x460
>>>>> ? flush_workqueue_prep_pwqs+0x120/0x120
>>>>> deferred_probe_timeout_work_func+0x2b/0xa0
>>>>> process_one_work+0x212/0x710
>>>>> ? lock_is_held_type+0xa5/0x110
>>>>> worker_thread+0x188/0x340
>>>>> ? rescuer_thread+0x380/0x380
>>>>> kthread+0xd7/0x110
>>>>> ? kthread_complete_and_exit+0x20/0x20
>>>>> ret_from_fork+0x28/0x40
>>>>> ? kthread_complete_and_exit+0x20/0x20
>>>>> ret_from_fork_asm+0x11/0x20
>>>>> </TASK>
>>>>> INFO: task udevadm:763 blocked for more than 180 seconds.
>>>>> Not tainted 6.9.0-rc2+ #23
>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>> task:udevadm state:D stack:0 pid:763 tgid:763 ppid:1 flags:0x00000000
>>>>> Call Trace:
>>>>> <TASK>
>>>>> __schedule+0x43d/0xe20
>>>>> schedule+0x31/0x130
>>>>> schedule_timeout+0x1b9/0x1d0
>>>>> ? __wait_for_common+0xb0/0x1d0
>>>>> ? lock_release+0xc6/0x290
>>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>>> __wait_for_common+0xb9/0x1d0
>>>>> ? usleep_range_state+0xb0/0xb0
>>>>> ? __flush_work+0x1ff/0x460
>>>>> __flush_work+0x287/0x460
>>>>> ? flush_workqueue_prep_pwqs+0x120/0x120
>>>>> fsnotify_destroy_group+0x66/0xf0
>>>>> inotify_release+0x12/0x40
>>>>> __fput+0xa6/0x2d0
>>>>> __x64_sys_close+0x33/0x70
>>>>> do_syscall_64+0x6c/0x170
>>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>>> RIP: 0033:0x7f744d5bc878
>>>>> RSP: 002b:00007ffcef12f8d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
>>>>> RAX: ffffffffffffffda RBX: 00007f744cd048c0 RCX: 00007f744d5bc878
>>>>> RDX: ffffffffffffff80 RSI: 0000000000000000 RDI: 0000000000000003
>>>>> RBP: 0000000000000003 R08: 000055f9ce349fb0 R09: 0000000000000000
>>>>> R10: 00007ffcef12f8f0 R11: 0000000000000246 R12: 0000000000000002
>>>>> R13: 0000000007270e00 R14: 000055f99670c9b8 R15: 0000000000000002
>>>>> </TASK>
>>>>> INFO: task modprobe:968 blocked for more than 180 seconds.
>>>>> Not tainted 6.9.0-rc2+ #23
>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>> task:modprobe state:D stack:0 pid:968 tgid:968 ppid:65 flags:0x00000000
>>>>> Call Trace:
>>>>> <TASK>
>>>>> __schedule+0x43d/0xe20
>>>>> schedule+0x31/0x130
>>>>> schedule_timeout+0x1b9/0x1d0
>>>>> ? __wait_for_common+0xb0/0x1d0
>>>>> ? lock_release+0xc6/0x290
>>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>>> __wait_for_common+0xb9/0x1d0
>>>>> ? usleep_range_state+0xb0/0xb0
>>>>> idempotent_init_module+0x1ae/0x290
>>>>> __x64_sys_finit_module+0x55/0xb0
>>>>> do_syscall_64+0x6c/0x170
>>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>>> RIP: 0033:0x7fde25530ddd
>>>>> RSP: 002b:00007fffac078518 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>>> RAX: ffffffffffffffda RBX: 0000558758e28ef0 RCX: 00007fde25530ddd
>>>>> RDX: 0000000000000000 RSI: 000055873cebf358 RDI: 0000000000000001
>>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055873cebf358
>>>>> R13: 0000000000000000 R14: 0000558758e29020 R15: 0000558758e28ef0
>>>>> </TASK>
>>>>> INFO: task modprobe:969 blocked for more than 180 seconds.
>>>>> Not tainted 6.9.0-rc2+ #23
>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>> task:modprobe state:D stack:0 pid:969 tgid:969 ppid:93 flags:0x00000000
>>>>> Call Trace:
>>>>> <TASK>
>>>>> __schedule+0x43d/0xe20
>>>>> schedule+0x31/0x130
>>>>> schedule_timeout+0x1b9/0x1d0
>>>>> ? __wait_for_common+0xb0/0x1d0
>>>>> ? lock_release+0xc6/0x290
>>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>>> __wait_for_common+0xb9/0x1d0
>>>>> ? usleep_range_state+0xb0/0xb0
>>>>> idempotent_init_module+0x1ae/0x290
>>>>> __x64_sys_finit_module+0x55/0xb0
>>>>> do_syscall_64+0x6c/0x170
>>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>>> RIP: 0033:0x7f338d516ddd
>>>>> RSP: 002b:00007ffd155cd1e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>>> RAX: ffffffffffffffda RBX: 000056092cb0def0 RCX: 00007f338d516ddd
>>>>> RDX: 0000000000000000 RSI: 00005608ecb4a358 RDI: 0000000000000001
>>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 00005608ecb4a358
>>>>> R13: 0000000000000000 R14: 000056092cb0e020 R15: 000056092cb0def0
>>>>> </TASK>
>>>>> INFO: task modprobe:1044 blocked for more than 180 seconds.
>>>>> Not tainted 6.9.0-rc2+ #23
>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>> task:modprobe state:D stack:0 pid:1044 tgid:1044 ppid:10 flags:0x00000000
>>>>> Call Trace:
>>>>> <TASK>
>>>>> __schedule+0x43d/0xe20
>>>>> schedule+0x31/0x130
>>>>> schedule_timeout+0x1b9/0x1d0
>>>>> ? __wait_for_common+0xb0/0x1d0
>>>>> ? lock_release+0xc6/0x290
>>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>>> __wait_for_common+0xb9/0x1d0
>>>>> ? usleep_range_state+0xb0/0xb0
>>>>> idempotent_init_module+0x1ae/0x290
>>>>> __x64_sys_finit_module+0x55/0xb0
>>>>> do_syscall_64+0x6c/0x170
>>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>>> RIP: 0033:0x7f7637b30ddd
>>>>> RSP: 002b:00007ffe6251da78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>>> RAX: ffffffffffffffda RBX: 000055b889cb3ef0 RCX: 00007f7637b30ddd
>>>>> RDX: 0000000000000000 RSI: 000055b854eea358 RDI: 0000000000000001
>>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055b854eea358
>>>>> R13: 0000000000000000 R14: 000055b889cb4020 R15: 000055b889cb3ef0
>>>>> </TASK>
>>>>> INFO: task modprobe:1047 blocked for more than 180 seconds.
>>>>> Not tainted 6.9.0-rc2+ #23
>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>> task:modprobe state:D stack:0 pid:1047 tgid:1047 ppid:113 flags:0x00000000
>>>>> Call Trace:
>>>>> <TASK>
>>>>> __schedule+0x43d/0xe20
>>>>> schedule+0x31/0x130
>>>>> schedule_timeout+0x1b9/0x1d0
>>>>> ? __wait_for_common+0xb0/0x1d0
>>>>> ? lock_release+0xc6/0x290
>>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>>> __wait_for_common+0xb9/0x1d0
>>>>> ? usleep_range_state+0xb0/0xb0
>>>>> idempotent_init_module+0x1ae/0x290
>>>>> __x64_sys_finit_module+0x55/0xb0
>>>>> do_syscall_64+0x6c/0x170
>>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>>> RIP: 0033:0x7f3907130ddd
>>>>> RSP: 002b:00007ffc36e4eb08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>>> RAX: ffffffffffffffda RBX: 000056100a856ef0 RCX: 00007f3907130ddd
>>>>> RDX: 0000000000000000 RSI: 0000560fff0ec358 RDI: 0000000000000001
>>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 0000560fff0ec358
>>>>> R13: 0000000000000000 R14: 000056100a857020 R15: 000056100a856ef0
>>>>> </TASK>
>>>>> INFO: task modprobe:1056 blocked for more than 180 seconds.
>>>>> Not tainted 6.9.0-rc2+ #23
>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>> task:modprobe state:D stack:0 pid:1056 tgid:1056 ppid:1045 flags:0x00000000
>>>>> Call Trace:
>>>>> <TASK>
>>>>> __schedule+0x43d/0xe20
>>>>> schedule+0x31/0x130
>>>>> schedule_timeout+0x1b9/0x1d0
>>>>> ? __wait_for_common+0xb0/0x1d0
>>>>> ? lock_release+0xc6/0x290
>>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>>> __wait_for_common+0xb9/0x1d0
>>>>> ? usleep_range_state+0xb0/0xb0
>>>>> idempotent_init_module+0x1ae/0x290
>>>>> __x64_sys_finit_module+0x55/0xb0
>>>>> do_syscall_64+0x6c/0x170
>>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>>> RIP: 0033:0x7fcb1e730ddd
>>>>> RSP: 002b:00007ffc692d0ad8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>>> RAX: ffffffffffffffda RBX: 000055f8d8828ef0 RCX: 00007fcb1e730ddd
>>>>> RDX: 0000000000000000 RSI: 000055f8bff36358 RDI: 0000000000000001
>>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055f8bff36358
>>>>> R13: 0000000000000000 R14: 000055f8d8829020 R15: 000055f8d8828ef0
>>>>> </TASK>
>>>>> INFO: task modprobe:1058 blocked for more than 180 seconds.
>>>>> Not tainted 6.9.0-rc2+ #23
>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>> task:modprobe state:D stack:0 pid:1058 tgid:1058 ppid:1051 flags:0x00000000
>>>>> Call Trace:
>>>>> <TASK>
>>>>> __schedule+0x43d/0xe20
>>>>> schedule+0x31/0x130
>>>>> schedule_timeout+0x1b9/0x1d0
>>>>> ? __wait_for_common+0xb0/0x1d0
>>>>> ? lock_release+0xc6/0x290
>>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>>> __wait_for_common+0xb9/0x1d0
>>>>> ? usleep_range_state+0xb0/0xb0
>>>>> idempotent_init_module+0x1ae/0x290
>>>>> __x64_sys_finit_module+0x55/0xb0
>>>>> do_syscall_64+0x6c/0x170
>>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>>> RIP: 0033:0x7f0a17b30ddd
>>>>> RSP: 002b:00007fff56d619e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>>> RAX: ffffffffffffffda RBX: 000055abd6741ef0 RCX: 00007f0a17b30ddd
>>>>> RDX: 0000000000000000 RSI: 000055abc6586358 RDI: 0000000000000001
>>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055abc6586358
>>>>> R13: 0000000000000000 R14: 000055abd6742020 R15: 000055abd6741ef0
>>>>> </TASK>
>>>>> INFO: task modprobe:1060 blocked for more than 181 seconds.
>>>>> Not tainted 6.9.0-rc2+ #23
>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>> task:modprobe state:D stack:0 pid:1060 tgid:1060 ppid:1057 flags:0x00000000
>>>>> Call Trace:
>>>>> <TASK>
>>>>> __schedule+0x43d/0xe20
>>>>> schedule+0x31/0x130
>>>>> schedule_timeout+0x1b9/0x1d0
>>>>> ? __wait_for_common+0xb0/0x1d0
>>>>> ? lock_release+0xc6/0x290
>>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>>> __wait_for_common+0xb9/0x1d0
>>>>> ? usleep_range_state+0xb0/0xb0
>>>>> idempotent_init_module+0x1ae/0x290
>>>>> __x64_sys_finit_module+0x55/0xb0
>>>>> do_syscall_64+0x6c/0x170
>>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>>> RIP: 0033:0x7f12c0130ddd
>>>>> RSP: 002b:00007ffccdef0488 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>>> RAX: ffffffffffffffda RBX: 000056249db40ef0 RCX: 00007f12c0130ddd
>>>>> RDX: 0000000000000000 RSI: 0000562471e4d358 RDI: 0000000000000001
>>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 0000562471e4d358
>>>>> R13: 0000000000000000 R14: 000056249db41020 R15: 000056249db40ef0
>>>>> </TASK>
>>>>>
>>>>> Showing all locks held in the system:
>>>>> 2 locks held by systemd/1:
>>>>> #0: ffff88812a7a10a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x1f/0x50
>>>>> #1: ffff88812a7a1130 (&tty->atomic_write_lock){+.+.}-{4:4}, at: file_tty_write.constprop.0+0xab/0x330
>>>>> 2 locks held by kworker/0:1/9:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc900000afe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/u32:0/10:
>>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc900000b7e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/3:0/37:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc900001cbe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/7:0/61:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc9000029be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/u32:1/65:
>>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc900002bfe50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 1 lock held by khungtaskd/66:
>>>>> #0: ffffffff8296e760 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x32/0x1c0
>>>>> 2 locks held by kworker/1:1/79:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc9000032fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/u32:2/93:
>>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc900003d3e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/6:1/94:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc900003dbe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/3:1/96:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc900003ebe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/1:2/102:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc90000eabe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/u32:3/107:
>>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc90000ed3e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/u32:4/113:
>>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc90000f03e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/6:2/189:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc90000e0fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/6:5/196:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc90000f13e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/6:6/197:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc90000f23e50 ((work_completion)(&(&hda->probe_work)->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/6:8/199:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc90000f53e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/7:2/296:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc9000105be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/7:3/297:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc90001043e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/7:4/298:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc90001063e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/7:5/320:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc90001003e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/2:2/371:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc9000104be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/5:13/648:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc9000198fe50 ((deferred_probe_timeout_work).work){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/5:14/649:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc90001997e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/5:15/650:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc9000199fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/5:16/651:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc900019a7e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/4:3/722:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc90001a27e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/1:4/768:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc900010d7e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/1:5/769:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc900010dfe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/0:2/849:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc90001353e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by lvm/860:
>>>>> #0: ffff8881323c19a8 (&md->type_lock){+.+.}-{4:4}, at: table_load+0xc9/0x400
>>>>> #1: ffff88813200c3b8 (&mddev->reconfig_mutex){+.+.}-{4:4}, at: raid_ctr+0x13b3/0x2860 [dm_raid]
>>>>> 2 locks held by modprobe/1019:
>>>>> #0: ffffffffa0ca7b68 (iwlwifi_opmode_table_mtx){+.+.}-{4:4}, at: iwl_opmode_register+0x27/0xd0 [iwlwifi]
>>>>> #1: ffff888139f88270 (&led_cdev->led_access){+.+.}-{4:4}, at: led_classdev_register_ext+0x195/0x450
>>>>> 2 locks held by kworker/u32:5/1045:
>>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc90004367e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/u32:6/1051:
>>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc90004703e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/u32:7/1057:
>>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc90004a97e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/3:3/1111:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc90005bafe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>> 2 locks held by kworker/3:4/1132:
>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>> #1: ffffc90005e13e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>
>>>>> =============================================
>>>>>
>>>>>
>>>>
>>>> I ran a bisect on this. The tagged bad commit is a LED related merge, but commit
>>>> shows no code changes when I look at it in git. I double checked that the
>>>> merge is bad by manually going to it again at the end of the bisect and
>>>> indeed it fails.
>>>>
>>>> From looking at lockdep, this below may be interesting. I do have 24 intel be200 radios
>>>> in this system, so maybe lots of iwlwifi radios help trigger the problem?
>>>>
>>>>> 2 locks held by modprobe/1019:
>>>>> #0: ffffffffa0ca7b68 (iwlwifi_opmode_table_mtx){+.+.}-{4:4}, at: iwl_opmode_register+0x27/0xd0 [iwlwifi]
>>>>> #1: ffff888139f88270 (&led_cdev->led_access){+.+.}-{4:4}, at: led_classdev_register_ext+0x195/0x450
>>>>
>>>> Please let me know if you have any suggestions for how to debug this further.
>>>>
>>>> [greearb@ben-dt5 linux-2.6]$ git bisect log
>>>> git bisect start
>>>> # status: waiting for both good and bad commits
>>>> # good: [e8f897f4afef0031fe618a8e94127a0934896aba] Linux 6.8
>>>> git bisect good e8f897f4afef0031fe618a8e94127a0934896aba
>>>> # status: waiting for bad commit, 1 good commit known
>>>> # bad: [4cece764965020c22cff7665b18a012006359095] Linux 6.9-rc1
>>>> git bisect bad 4cece764965020c22cff7665b18a012006359095
>>>> # good: [e5e038b7ae9da96b93974bf072ca1876899a01a3] Merge tag 'fs_for_v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
>>>> git bisect good e5e038b7ae9da96b93974bf072ca1876899a01a3
>>>> # bad: [32a50540c3d26341698505998dfca5b0e8fb4fd4] Merge tag 'bcachefs-2024-03-13' of https://evilpiepirate.org/git/bcachefs
>>>> git bisect bad 32a50540c3d26341698505998dfca5b0e8fb4fd4
>>>> # good: [a3df5d5422b4edfcfe658d5057e7e059571e32ce] Merge tag 'pinctrl-v6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
>>>> git bisect good a3df5d5422b4edfcfe658d5057e7e059571e32ce
>>>> # bad: [c0a614e82ece41d15b7a66f43ee79f4dbdbc925a] Merge tag 'lsm-pr-20240314' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
>>>> git bisect bad c0a614e82ece41d15b7a66f43ee79f4dbdbc925a
>>>> # bad: [705c1da8fa4816fb0159b5602fef1df5946a3ee2] Merge tag 'pci-v6.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
>>>> git bisect bad 705c1da8fa4816fb0159b5602fef1df5946a3ee2
>>>> # bad: [f5c31bcf604db54470868f3118a60dc4a9ba8813] Merge tag 'leds-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
>>>> git bisect bad f5c31bcf604db54470868f3118a60dc4a9ba8813
>>>> # good: [8403ce70be339d462892a2b935ae30ee52416f92] Merge tag 'mfd-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
>>>> git bisect good 8403ce70be339d462892a2b935ae30ee52416f92
>>>> # good: [2cd0d1db31e78a63553876f8e6a4c9dcc1f9c061] leds: expresswire: Don't depend on NEW_LEDS
>>>> git bisect good 2cd0d1db31e78a63553876f8e6a4c9dcc1f9c061
>>>> # good: [23749cf3dfff5dcd706183ade1d27198a37b3881] backlight: gpio: Simplify with dev_err_probe()
>>>> git bisect good 23749cf3dfff5dcd706183ade1d27198a37b3881
>>>> # good: [2c7c70f54f791ece44541a9254c1a73790fd4595] dt-bindings: leds: Add NCP5623 multi-LED Controller
>>>> git bisect good 2c7c70f54f791ece44541a9254c1a73790fd4595
>>>> # good: [c9128ed7b9edeb2b6f1faec06d96b2fd5bc72cb8] backlight: lm3630a_bl: Simplify probe return on gpio request error
>>>> git bisect good c9128ed7b9edeb2b6f1faec06d96b2fd5bc72cb8
>>>> # good: [45066c4bbe8ca25f9f282245b84568116c783f1d] leds: ncp5623: Add MS suffix to time defines
>>>> git bisect good 45066c4bbe8ca25f9f282245b84568116c783f1d
>>>> # good: [f3d8f29d1f59230b8c2a09e6dee7db7bd295e42c] Merge tag 'backlight-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight
>>>> git bisect good f3d8f29d1f59230b8c2a09e6dee7db7bd295e42c
>>>> # first bad commit: [f5c31bcf604db54470868f3118a60dc4a9ba8813] Merge tag 'leds-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
>>>> [greearb@ben-dt5 linux-2.6]$
>>>>
>>>> Thanks,
>>>> Ben
>>>>
>>>
>>
>>
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)
2024-05-07 10:58 ` Heiner Kallweit
@ 2024-05-08 5:51 ` Heiner Kallweit
0 siblings, 0 replies; 24+ messages in thread
From: Heiner Kallweit @ 2024-05-08 5:51 UTC (permalink / raw)
To: Ben Greear, LKML, linux-leds, Lee Jones; +Cc: Johannes Berg
On 07.05.2024 12:58, Heiner Kallweit wrote:
> On 07.05.2024 08:27, Heiner Kallweit wrote:
>> On 07.05.2024 06:20, Ben Greear wrote:
>>> On 5/6/24 13:00, Heiner Kallweit wrote:
>>>> On 03.04.2024 21:35, Ben Greear wrote:
>>>>> On 4/2/24 10:38, Ben Greear wrote:
>>>>>> On 4/2/24 09:37, Ben Greear wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> Sometime between rc1 and today's rc2, my system quit booting.
>>>>>>> I'm not seeing any splats, it just stops. Evidently before
>>>>>>> sysrq is enabled.
>>>>>>>
>>>>
>>>> For my understanding:
>>>> You say 6.9-rc1 was ok, but 6.9-rc2 is not?
>>>>
>>>> If I look at the diff then I see no LED subsystem changes,
>>>> but iwlwifi changes. It's not clear to me why your bisect
>>>> points to something outside the diff.
>>>
>>> I was incorrect in my early assessment about exactly where
>>> the error came in. I later ran a full bisect to find the commit
>>> that showed the error. The problem only seems to happen when there
>>> are lots of iwlwifi (in my case) radios in a system, so that added
>>> to my initial confusion on the bug.
>>>
>>> It is almost certainly LED related, as my initial hack to make the problem
>>> go away was to just comment out the led registration logic in iwlwifi.
>>>
>> The bisect points to a merge commit, what makes root cause analysis cumbersome.
>> Could you bisect based on the following? Between for-leds-fixes and leds-next-6.9
>> Then the bisect result should point to the individual commit.
>>
>> git.kernel.org/pub/scm/linux/kernel/git/lee/leds.git/log/?h=leds-next-6.9
>>
> The issue seems to be somewhat related to module loading. So a trace of module
> loading activity may help. Could you please enable module activity tracing
> and provide the resulting trace file?
>
> /sys/kernel/debug/tracing/events/module/enable
>
So far the only change that seems to be related is adding the following
module loading. Eventhough I don't see anything wrong with it, could you
please test whether removing it makes a difference?
diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c
index b1b323b19..6e45d9c61 100644
--- a/drivers/leds/led-triggers.c
+++ b/drivers/leds/led-triggers.c
@@ -282,8 +282,8 @@ void led_trigger_set_default(struct led_classdev *led_cdev)
* If default trigger wasn't found, maybe trigger module isn't loaded yet.
* Once loaded it will re-probe with all led_cdev's.
*/
- if (!found)
- request_module_nowait("ledtrig:%s", led_cdev->default_trigger);
+// if (!found)
+// request_module_nowait("ledtrig:%s", led_cdev->default_trigger);
}
EXPORT_SYMBOL_GPL(led_trigger_set_default);
--
>>> Johanne's solution also makes the problem go away.
>>>
>>> Thanks,
>>> Ben
>>>
>>>>
>>>>
>>>>>>> [ OK ] Started Flush Journal to Persistent Storage.
>>>>>>> [ OK ] Started udev Coldplug all Devices.
>>>>>>> Starting udev Wait for Complete Device Initialization...
>>>>>>> [ OK ] Listening on Load/Save RF …itch Status /dev/rfkill Watch.
>>>>>>> [ OK ] Created slice system-lvm2\x2dpvscan.slice.
>>>>>>> Starting LVM2 PV scan on device 8:19...
>>>>>>> Starting LVM2 PV scan on device 8:3...
>>>>>>> [ OK ] Started Device-mapper event daemon.
>>>>>>> iwlwifi 0000:04:00.0: WRT: Invalid buffer destination: 0
>>>>>>> sysrq: This sysrq operation is disabled.
>>>>>>>
>>>>>>> I can start a bisect, but in case anyone knows the answer already, please let me know.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Ben
>>>>>>>
>>>>>>
>>>>>> So, deadlock I guess....
>>>>>>
>>>>>> INFO: task kworker/5:13:648 blocked for more than 180 seconds.
>>>>>> Not tainted 6.9.0-rc2+ #23
>>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>>> task:kworker/5:13 state:D stack:0 pid:648 tgid:648 ppid:2 flags:0x00004000
>>>>>> Workqueue: events deferred_probe_timeout_work_func
>>>>>> Call Trace:
>>>>>> <TASK>
>>>>>> __schedule+0x43d/0xe20
>>>>>> schedule+0x31/0x130
>>>>>> schedule_timeout+0x1b9/0x1d0
>>>>>> ? mark_held_locks+0x49/0x70
>>>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>>>> __wait_for_common+0xb9/0x1d0
>>>>>> ? usleep_range_state+0xb0/0xb0
>>>>>> ? __flush_work+0x1ff/0x460
>>>>>> __flush_work+0x287/0x460
>>>>>> ? flush_workqueue_prep_pwqs+0x120/0x120
>>>>>> deferred_probe_timeout_work_func+0x2b/0xa0
>>>>>> process_one_work+0x212/0x710
>>>>>> ? lock_is_held_type+0xa5/0x110
>>>>>> worker_thread+0x188/0x340
>>>>>> ? rescuer_thread+0x380/0x380
>>>>>> kthread+0xd7/0x110
>>>>>> ? kthread_complete_and_exit+0x20/0x20
>>>>>> ret_from_fork+0x28/0x40
>>>>>> ? kthread_complete_and_exit+0x20/0x20
>>>>>> ret_from_fork_asm+0x11/0x20
>>>>>> </TASK>
>>>>>> INFO: task udevadm:763 blocked for more than 180 seconds.
>>>>>> Not tainted 6.9.0-rc2+ #23
>>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>>> task:udevadm state:D stack:0 pid:763 tgid:763 ppid:1 flags:0x00000000
>>>>>> Call Trace:
>>>>>> <TASK>
>>>>>> __schedule+0x43d/0xe20
>>>>>> schedule+0x31/0x130
>>>>>> schedule_timeout+0x1b9/0x1d0
>>>>>> ? __wait_for_common+0xb0/0x1d0
>>>>>> ? lock_release+0xc6/0x290
>>>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>>>> __wait_for_common+0xb9/0x1d0
>>>>>> ? usleep_range_state+0xb0/0xb0
>>>>>> ? __flush_work+0x1ff/0x460
>>>>>> __flush_work+0x287/0x460
>>>>>> ? flush_workqueue_prep_pwqs+0x120/0x120
>>>>>> fsnotify_destroy_group+0x66/0xf0
>>>>>> inotify_release+0x12/0x40
>>>>>> __fput+0xa6/0x2d0
>>>>>> __x64_sys_close+0x33/0x70
>>>>>> do_syscall_64+0x6c/0x170
>>>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>>>> RIP: 0033:0x7f744d5bc878
>>>>>> RSP: 002b:00007ffcef12f8d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
>>>>>> RAX: ffffffffffffffda RBX: 00007f744cd048c0 RCX: 00007f744d5bc878
>>>>>> RDX: ffffffffffffff80 RSI: 0000000000000000 RDI: 0000000000000003
>>>>>> RBP: 0000000000000003 R08: 000055f9ce349fb0 R09: 0000000000000000
>>>>>> R10: 00007ffcef12f8f0 R11: 0000000000000246 R12: 0000000000000002
>>>>>> R13: 0000000007270e00 R14: 000055f99670c9b8 R15: 0000000000000002
>>>>>> </TASK>
>>>>>> INFO: task modprobe:968 blocked for more than 180 seconds.
>>>>>> Not tainted 6.9.0-rc2+ #23
>>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>>> task:modprobe state:D stack:0 pid:968 tgid:968 ppid:65 flags:0x00000000
>>>>>> Call Trace:
>>>>>> <TASK>
>>>>>> __schedule+0x43d/0xe20
>>>>>> schedule+0x31/0x130
>>>>>> schedule_timeout+0x1b9/0x1d0
>>>>>> ? __wait_for_common+0xb0/0x1d0
>>>>>> ? lock_release+0xc6/0x290
>>>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>>>> __wait_for_common+0xb9/0x1d0
>>>>>> ? usleep_range_state+0xb0/0xb0
>>>>>> idempotent_init_module+0x1ae/0x290
>>>>>> __x64_sys_finit_module+0x55/0xb0
>>>>>> do_syscall_64+0x6c/0x170
>>>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>>>> RIP: 0033:0x7fde25530ddd
>>>>>> RSP: 002b:00007fffac078518 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>>>> RAX: ffffffffffffffda RBX: 0000558758e28ef0 RCX: 00007fde25530ddd
>>>>>> RDX: 0000000000000000 RSI: 000055873cebf358 RDI: 0000000000000001
>>>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055873cebf358
>>>>>> R13: 0000000000000000 R14: 0000558758e29020 R15: 0000558758e28ef0
>>>>>> </TASK>
>>>>>> INFO: task modprobe:969 blocked for more than 180 seconds.
>>>>>> Not tainted 6.9.0-rc2+ #23
>>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>>> task:modprobe state:D stack:0 pid:969 tgid:969 ppid:93 flags:0x00000000
>>>>>> Call Trace:
>>>>>> <TASK>
>>>>>> __schedule+0x43d/0xe20
>>>>>> schedule+0x31/0x130
>>>>>> schedule_timeout+0x1b9/0x1d0
>>>>>> ? __wait_for_common+0xb0/0x1d0
>>>>>> ? lock_release+0xc6/0x290
>>>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>>>> __wait_for_common+0xb9/0x1d0
>>>>>> ? usleep_range_state+0xb0/0xb0
>>>>>> idempotent_init_module+0x1ae/0x290
>>>>>> __x64_sys_finit_module+0x55/0xb0
>>>>>> do_syscall_64+0x6c/0x170
>>>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>>>> RIP: 0033:0x7f338d516ddd
>>>>>> RSP: 002b:00007ffd155cd1e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>>>> RAX: ffffffffffffffda RBX: 000056092cb0def0 RCX: 00007f338d516ddd
>>>>>> RDX: 0000000000000000 RSI: 00005608ecb4a358 RDI: 0000000000000001
>>>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 00005608ecb4a358
>>>>>> R13: 0000000000000000 R14: 000056092cb0e020 R15: 000056092cb0def0
>>>>>> </TASK>
>>>>>> INFO: task modprobe:1044 blocked for more than 180 seconds.
>>>>>> Not tainted 6.9.0-rc2+ #23
>>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>>> task:modprobe state:D stack:0 pid:1044 tgid:1044 ppid:10 flags:0x00000000
>>>>>> Call Trace:
>>>>>> <TASK>
>>>>>> __schedule+0x43d/0xe20
>>>>>> schedule+0x31/0x130
>>>>>> schedule_timeout+0x1b9/0x1d0
>>>>>> ? __wait_for_common+0xb0/0x1d0
>>>>>> ? lock_release+0xc6/0x290
>>>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>>>> __wait_for_common+0xb9/0x1d0
>>>>>> ? usleep_range_state+0xb0/0xb0
>>>>>> idempotent_init_module+0x1ae/0x290
>>>>>> __x64_sys_finit_module+0x55/0xb0
>>>>>> do_syscall_64+0x6c/0x170
>>>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>>>> RIP: 0033:0x7f7637b30ddd
>>>>>> RSP: 002b:00007ffe6251da78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>>>> RAX: ffffffffffffffda RBX: 000055b889cb3ef0 RCX: 00007f7637b30ddd
>>>>>> RDX: 0000000000000000 RSI: 000055b854eea358 RDI: 0000000000000001
>>>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055b854eea358
>>>>>> R13: 0000000000000000 R14: 000055b889cb4020 R15: 000055b889cb3ef0
>>>>>> </TASK>
>>>>>> INFO: task modprobe:1047 blocked for more than 180 seconds.
>>>>>> Not tainted 6.9.0-rc2+ #23
>>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>>> task:modprobe state:D stack:0 pid:1047 tgid:1047 ppid:113 flags:0x00000000
>>>>>> Call Trace:
>>>>>> <TASK>
>>>>>> __schedule+0x43d/0xe20
>>>>>> schedule+0x31/0x130
>>>>>> schedule_timeout+0x1b9/0x1d0
>>>>>> ? __wait_for_common+0xb0/0x1d0
>>>>>> ? lock_release+0xc6/0x290
>>>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>>>> __wait_for_common+0xb9/0x1d0
>>>>>> ? usleep_range_state+0xb0/0xb0
>>>>>> idempotent_init_module+0x1ae/0x290
>>>>>> __x64_sys_finit_module+0x55/0xb0
>>>>>> do_syscall_64+0x6c/0x170
>>>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>>>> RIP: 0033:0x7f3907130ddd
>>>>>> RSP: 002b:00007ffc36e4eb08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>>>> RAX: ffffffffffffffda RBX: 000056100a856ef0 RCX: 00007f3907130ddd
>>>>>> RDX: 0000000000000000 RSI: 0000560fff0ec358 RDI: 0000000000000001
>>>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 0000560fff0ec358
>>>>>> R13: 0000000000000000 R14: 000056100a857020 R15: 000056100a856ef0
>>>>>> </TASK>
>>>>>> INFO: task modprobe:1056 blocked for more than 180 seconds.
>>>>>> Not tainted 6.9.0-rc2+ #23
>>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>>> task:modprobe state:D stack:0 pid:1056 tgid:1056 ppid:1045 flags:0x00000000
>>>>>> Call Trace:
>>>>>> <TASK>
>>>>>> __schedule+0x43d/0xe20
>>>>>> schedule+0x31/0x130
>>>>>> schedule_timeout+0x1b9/0x1d0
>>>>>> ? __wait_for_common+0xb0/0x1d0
>>>>>> ? lock_release+0xc6/0x290
>>>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>>>> __wait_for_common+0xb9/0x1d0
>>>>>> ? usleep_range_state+0xb0/0xb0
>>>>>> idempotent_init_module+0x1ae/0x290
>>>>>> __x64_sys_finit_module+0x55/0xb0
>>>>>> do_syscall_64+0x6c/0x170
>>>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>>>> RIP: 0033:0x7fcb1e730ddd
>>>>>> RSP: 002b:00007ffc692d0ad8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>>>> RAX: ffffffffffffffda RBX: 000055f8d8828ef0 RCX: 00007fcb1e730ddd
>>>>>> RDX: 0000000000000000 RSI: 000055f8bff36358 RDI: 0000000000000001
>>>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055f8bff36358
>>>>>> R13: 0000000000000000 R14: 000055f8d8829020 R15: 000055f8d8828ef0
>>>>>> </TASK>
>>>>>> INFO: task modprobe:1058 blocked for more than 180 seconds.
>>>>>> Not tainted 6.9.0-rc2+ #23
>>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>>> task:modprobe state:D stack:0 pid:1058 tgid:1058 ppid:1051 flags:0x00000000
>>>>>> Call Trace:
>>>>>> <TASK>
>>>>>> __schedule+0x43d/0xe20
>>>>>> schedule+0x31/0x130
>>>>>> schedule_timeout+0x1b9/0x1d0
>>>>>> ? __wait_for_common+0xb0/0x1d0
>>>>>> ? lock_release+0xc6/0x290
>>>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>>>> __wait_for_common+0xb9/0x1d0
>>>>>> ? usleep_range_state+0xb0/0xb0
>>>>>> idempotent_init_module+0x1ae/0x290
>>>>>> __x64_sys_finit_module+0x55/0xb0
>>>>>> do_syscall_64+0x6c/0x170
>>>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>>>> RIP: 0033:0x7f0a17b30ddd
>>>>>> RSP: 002b:00007fff56d619e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>>>> RAX: ffffffffffffffda RBX: 000055abd6741ef0 RCX: 00007f0a17b30ddd
>>>>>> RDX: 0000000000000000 RSI: 000055abc6586358 RDI: 0000000000000001
>>>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055abc6586358
>>>>>> R13: 0000000000000000 R14: 000055abd6742020 R15: 000055abd6741ef0
>>>>>> </TASK>
>>>>>> INFO: task modprobe:1060 blocked for more than 181 seconds.
>>>>>> Not tainted 6.9.0-rc2+ #23
>>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>>> task:modprobe state:D stack:0 pid:1060 tgid:1060 ppid:1057 flags:0x00000000
>>>>>> Call Trace:
>>>>>> <TASK>
>>>>>> __schedule+0x43d/0xe20
>>>>>> schedule+0x31/0x130
>>>>>> schedule_timeout+0x1b9/0x1d0
>>>>>> ? __wait_for_common+0xb0/0x1d0
>>>>>> ? lock_release+0xc6/0x290
>>>>>> ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>>>> __wait_for_common+0xb9/0x1d0
>>>>>> ? usleep_range_state+0xb0/0xb0
>>>>>> idempotent_init_module+0x1ae/0x290
>>>>>> __x64_sys_finit_module+0x55/0xb0
>>>>>> do_syscall_64+0x6c/0x170
>>>>>> entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>>>> RIP: 0033:0x7f12c0130ddd
>>>>>> RSP: 002b:00007ffccdef0488 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>>>> RAX: ffffffffffffffda RBX: 000056249db40ef0 RCX: 00007f12c0130ddd
>>>>>> RDX: 0000000000000000 RSI: 0000562471e4d358 RDI: 0000000000000001
>>>>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 0000562471e4d358
>>>>>> R13: 0000000000000000 R14: 000056249db41020 R15: 000056249db40ef0
>>>>>> </TASK>
>>>>>>
>>>>>> Showing all locks held in the system:
>>>>>> 2 locks held by systemd/1:
>>>>>> #0: ffff88812a7a10a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x1f/0x50
>>>>>> #1: ffff88812a7a1130 (&tty->atomic_write_lock){+.+.}-{4:4}, at: file_tty_write.constprop.0+0xab/0x330
>>>>>> 2 locks held by kworker/0:1/9:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc900000afe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/u32:0/10:
>>>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc900000b7e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/3:0/37:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc900001cbe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/7:0/61:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc9000029be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/u32:1/65:
>>>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc900002bfe50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 1 lock held by khungtaskd/66:
>>>>>> #0: ffffffff8296e760 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x32/0x1c0
>>>>>> 2 locks held by kworker/1:1/79:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc9000032fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/u32:2/93:
>>>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc900003d3e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/6:1/94:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc900003dbe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/3:1/96:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc900003ebe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/1:2/102:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc90000eabe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/u32:3/107:
>>>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc90000ed3e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/u32:4/113:
>>>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc90000f03e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/6:2/189:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc90000e0fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/6:5/196:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc90000f13e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/6:6/197:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc90000f23e50 ((work_completion)(&(&hda->probe_work)->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/6:8/199:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc90000f53e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/7:2/296:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc9000105be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/7:3/297:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc90001043e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/7:4/298:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc90001063e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/7:5/320:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc90001003e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/2:2/371:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc9000104be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/5:13/648:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc9000198fe50 ((deferred_probe_timeout_work).work){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/5:14/649:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc90001997e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/5:15/650:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc9000199fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/5:16/651:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc900019a7e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/4:3/722:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc90001a27e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/1:4/768:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc900010d7e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/1:5/769:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc900010dfe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/0:2/849:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc90001353e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by lvm/860:
>>>>>> #0: ffff8881323c19a8 (&md->type_lock){+.+.}-{4:4}, at: table_load+0xc9/0x400
>>>>>> #1: ffff88813200c3b8 (&mddev->reconfig_mutex){+.+.}-{4:4}, at: raid_ctr+0x13b3/0x2860 [dm_raid]
>>>>>> 2 locks held by modprobe/1019:
>>>>>> #0: ffffffffa0ca7b68 (iwlwifi_opmode_table_mtx){+.+.}-{4:4}, at: iwl_opmode_register+0x27/0xd0 [iwlwifi]
>>>>>> #1: ffff888139f88270 (&led_cdev->led_access){+.+.}-{4:4}, at: led_classdev_register_ext+0x195/0x450
>>>>>> 2 locks held by kworker/u32:5/1045:
>>>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc90004367e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/u32:6/1051:
>>>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc90004703e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/u32:7/1057:
>>>>>> #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc90004a97e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/3:3/1111:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc90005bafe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>> 2 locks held by kworker/3:4/1132:
>>>>>> #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>>>> #1: ffffc90005e13e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>>>>
>>>>>> =============================================
>>>>>>
>>>>>>
>>>>>
>>>>> I ran a bisect on this. The tagged bad commit is a LED related merge, but commit
>>>>> shows no code changes when I look at it in git. I double checked that the
>>>>> merge is bad by manually going to it again at the end of the bisect and
>>>>> indeed it fails.
>>>>>
>>>>> From looking at lockdep, this below may be interesting. I do have 24 intel be200 radios
>>>>> in this system, so maybe lots of iwlwifi radios help trigger the problem?
>>>>>
>>>>>> 2 locks held by modprobe/1019:
>>>>>> #0: ffffffffa0ca7b68 (iwlwifi_opmode_table_mtx){+.+.}-{4:4}, at: iwl_opmode_register+0x27/0xd0 [iwlwifi]
>>>>>> #1: ffff888139f88270 (&led_cdev->led_access){+.+.}-{4:4}, at: led_classdev_register_ext+0x195/0x450
>>>>>
>>>>> Please let me know if you have any suggestions for how to debug this further.
>>>>>
>>>>> [greearb@ben-dt5 linux-2.6]$ git bisect log
>>>>> git bisect start
>>>>> # status: waiting for both good and bad commits
>>>>> # good: [e8f897f4afef0031fe618a8e94127a0934896aba] Linux 6.8
>>>>> git bisect good e8f897f4afef0031fe618a8e94127a0934896aba
>>>>> # status: waiting for bad commit, 1 good commit known
>>>>> # bad: [4cece764965020c22cff7665b18a012006359095] Linux 6.9-rc1
>>>>> git bisect bad 4cece764965020c22cff7665b18a012006359095
>>>>> # good: [e5e038b7ae9da96b93974bf072ca1876899a01a3] Merge tag 'fs_for_v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
>>>>> git bisect good e5e038b7ae9da96b93974bf072ca1876899a01a3
>>>>> # bad: [32a50540c3d26341698505998dfca5b0e8fb4fd4] Merge tag 'bcachefs-2024-03-13' of https://evilpiepirate.org/git/bcachefs
>>>>> git bisect bad 32a50540c3d26341698505998dfca5b0e8fb4fd4
>>>>> # good: [a3df5d5422b4edfcfe658d5057e7e059571e32ce] Merge tag 'pinctrl-v6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
>>>>> git bisect good a3df5d5422b4edfcfe658d5057e7e059571e32ce
>>>>> # bad: [c0a614e82ece41d15b7a66f43ee79f4dbdbc925a] Merge tag 'lsm-pr-20240314' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
>>>>> git bisect bad c0a614e82ece41d15b7a66f43ee79f4dbdbc925a
>>>>> # bad: [705c1da8fa4816fb0159b5602fef1df5946a3ee2] Merge tag 'pci-v6.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
>>>>> git bisect bad 705c1da8fa4816fb0159b5602fef1df5946a3ee2
>>>>> # bad: [f5c31bcf604db54470868f3118a60dc4a9ba8813] Merge tag 'leds-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
>>>>> git bisect bad f5c31bcf604db54470868f3118a60dc4a9ba8813
>>>>> # good: [8403ce70be339d462892a2b935ae30ee52416f92] Merge tag 'mfd-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
>>>>> git bisect good 8403ce70be339d462892a2b935ae30ee52416f92
>>>>> # good: [2cd0d1db31e78a63553876f8e6a4c9dcc1f9c061] leds: expresswire: Don't depend on NEW_LEDS
>>>>> git bisect good 2cd0d1db31e78a63553876f8e6a4c9dcc1f9c061
>>>>> # good: [23749cf3dfff5dcd706183ade1d27198a37b3881] backlight: gpio: Simplify with dev_err_probe()
>>>>> git bisect good 23749cf3dfff5dcd706183ade1d27198a37b3881
>>>>> # good: [2c7c70f54f791ece44541a9254c1a73790fd4595] dt-bindings: leds: Add NCP5623 multi-LED Controller
>>>>> git bisect good 2c7c70f54f791ece44541a9254c1a73790fd4595
>>>>> # good: [c9128ed7b9edeb2b6f1faec06d96b2fd5bc72cb8] backlight: lm3630a_bl: Simplify probe return on gpio request error
>>>>> git bisect good c9128ed7b9edeb2b6f1faec06d96b2fd5bc72cb8
>>>>> # good: [45066c4bbe8ca25f9f282245b84568116c783f1d] leds: ncp5623: Add MS suffix to time defines
>>>>> git bisect good 45066c4bbe8ca25f9f282245b84568116c783f1d
>>>>> # good: [f3d8f29d1f59230b8c2a09e6dee7db7bd295e42c] Merge tag 'backlight-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight
>>>>> git bisect good f3d8f29d1f59230b8c2a09e6dee7db7bd295e42c
>>>>> # first bad commit: [f5c31bcf604db54470868f3118a60dc4a9ba8813] Merge tag 'leds-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
>>>>> [greearb@ben-dt5 linux-2.6]$
>>>>>
>>>>> Thanks,
>>>>> Ben
>>>>>
>>>>
>>>
>>>
>>
>
^ permalink raw reply related [flat|nested] 24+ messages in thread
end of thread, other threads:[~2024-05-08 5:51 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <30f757e3-73c5-5473-c1f8-328bab98fd7d@candelatech.com>
[not found] ` <30819e01-43ce-638f-0cc6-067d6a8d03c7@candelatech.com>
2024-04-03 19:35 ` 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related) Ben Greear
2024-04-08 16:35 ` Johannes Berg
2024-04-09 8:47 ` Linux regression tracking (Thorsten Leemhuis)
2024-04-09 8:59 ` Johannes Berg
2024-04-11 7:07 ` Lee Jones
2024-04-15 20:37 ` Ben Greear
2024-04-16 6:17 ` Johannes Berg
2024-04-23 9:00 ` Linux regression tracking (Thorsten Leemhuis)
2024-04-23 9:06 ` Johannes Berg
2024-04-23 9:29 ` Linux regression tracking (Thorsten Leemhuis)
2024-04-24 17:26 ` Ben Greear
2024-04-24 17:31 ` Johannes Berg
2024-05-02 7:19 ` Lee Jones
2024-05-02 8:30 ` Linux regression tracking (Thorsten Leemhuis)
2024-05-05 5:48 ` Ben Greear
2024-05-05 10:55 ` Tetsuo Handa
2024-05-07 4:21 ` Ben Greear
2024-05-07 8:23 ` Lee Jones
2024-04-16 12:19 ` Lee Jones
2024-05-06 20:00 ` Heiner Kallweit
2024-05-07 4:20 ` Ben Greear
2024-05-07 6:27 ` Heiner Kallweit
2024-05-07 10:58 ` Heiner Kallweit
2024-05-08 5:51 ` Heiner Kallweit
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).