* Re: [syzbot] [arm?] WARNING in delayed_work_timer_fn [not found] <00000000000046efb605fec3def0@google.com> @ 2024-12-13 12:30 ` syzbot 2025-02-11 11:10 ` Will Deacon 0 siblings, 1 reply; 5+ messages in thread From: syzbot @ 2024-12-13 12:30 UTC (permalink / raw) To: catalin.marinas, linux-arm-kernel, linux-kernel, syzkaller-bugs, will syzbot has found a reproducer for the following issue on: HEAD commit: 2e7aff49b5da Merge branches 'for-next/core' and 'for-next/.. git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci console output: https://syzkaller.appspot.com/x/log.txt?x=11ed5be8580000 kernel config: https://syzkaller.appspot.com/x/.config?x=696fb014d05da3a3 dashboard link: https://syzkaller.appspot.com/bug?extid=e13e654d315d4da1277c compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 userspace arch: arm64 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10192d44580000 Downloadable assets: disk image: https://storage.googleapis.com/syzbot-assets/ef408f67fde3/disk-2e7aff49.raw.xz vmlinux: https://storage.googleapis.com/syzbot-assets/414ac17a20dc/vmlinux-2e7aff49.xz kernel image: https://storage.googleapis.com/syzbot-assets/a93415d2a7e7/Image-2e7aff49.gz.xz IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+e13e654d315d4da1277c@syzkaller.appspotmail.com ------------[ cut here ]------------ WARNING: CPU: 1 PID: 6023 at kernel/workqueue.c:2257 current_wq_worker kernel/workqueue_internal.h:69 [inline] WARNING: CPU: 1 PID: 6023 at kernel/workqueue.c:2257 is_chained_work kernel/workqueue.c:2199 [inline] WARNING: CPU: 1 PID: 6023 at kernel/workqueue.c:2257 __queue_work+0xe50/0x1308 kernel/workqueue.c:2256 Modules linked in: CPU: 1 UID: 0 PID: 6023 Comm: klogd Not tainted 6.13.0-rc2-syzkaller-g2e7aff49b5da #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 pstate: 404000c5 (nZcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __queue_work+0xe50/0x1308 kernel/workqueue_internal.h:69 lr : current_wq_worker kernel/workqueue_internal.h:69 [inline] lr : is_chained_work kernel/workqueue.c:2199 [inline] lr : __queue_work+0xe50/0x1308 kernel/workqueue.c:2256 sp : ffff800080017b40 x29: ffff800080017b90 x28: dfff800000000000 x27: ffff700010002f88 x26: dfff800000000000 x25: ffff0000d1c63800 x24: 0000000000000008 x23: 1fffe0001b14d791 x22: ffff0000d1c639c0 x21: 0000000000000100 x20: ffff0000d8a6bc80 x19: ffff0000da2f0930 x18: 1fffe000366c5e7e x17: ffff800123d64000 x16: ffff80008326d65c x15: 0000000000000002 x14: 1ffff00011f300ca x13: dfff800000000000 x12: 00000000b1aebe1c x11: 00000000beaa9035 x10: 0000000000ff0100 x9 : 0000000000000000 x8 : ffff0000d8a6bc80 x7 : ffff8000805819c0 x6 : 0000000000000000 x5 : 0000000000000001 x4 : 0000000000000001 x3 : ffff80008df6a320 x2 : ffff0000da2f0930 x1 : 0000000000000100 x0 : 0000000000000000 Call trace: current_wq_worker kernel/workqueue_internal.h:69 [inline] (P) is_chained_work kernel/workqueue.c:2199 [inline] (P) __queue_work+0xe50/0x1308 kernel/workqueue.c:2256 (P) current_wq_worker kernel/workqueue_internal.h:69 [inline] (L) is_chained_work kernel/workqueue.c:2199 [inline] (L) __queue_work+0xe50/0x1308 kernel/workqueue.c:2256 (L) delayed_work_timer_fn+0x74/0x90 kernel/workqueue.c:2485 call_timer_fn+0x1b4/0x8b8 kernel/time/timer.c:1793 expire_timers kernel/time/timer.c:1839 [inline] __run_timers kernel/time/timer.c:2418 [inline] __run_timer_base+0x59c/0x7b4 kernel/time/timer.c:2430 run_timer_base kernel/time/timer.c:2439 [inline] run_timer_softirq+0xcc/0x194 kernel/time/timer.c:2449 handle_softirqs+0x320/0xd34 kernel/softirq.c:554 __do_softirq+0x14/0x20 kernel/softirq.c:588 ____do_softirq+0x14/0x20 arch/arm64/kernel/irq.c:81 call_on_irq_stack+0x24/0x4c arch/arm64/kernel/entry.S:891 do_softirq_own_stack+0x20/0x2c arch/arm64/kernel/irq.c:86 invoke_softirq kernel/softirq.c:435 [inline] __irq_exit_rcu+0x1d8/0x544 kernel/softirq.c:655 irq_exit_rcu+0x14/0x84 kernel/softirq.c:671 __el1_irq arch/arm64/kernel/entry-common.c:561 [inline] el1_interrupt+0x38/0x68 arch/arm64/kernel/entry-common.c:575 el1h_64_irq_handler+0x18/0x24 arch/arm64/kernel/entry-common.c:580 el1h_64_irq+0x6c/0x70 arch/arm64/kernel/entry.S:596 __daif_local_irq_enable arch/arm64/include/asm/irqflags.h:26 [inline] (P) arch_local_irq_enable arch/arm64/include/asm/irqflags.h:48 [inline] (P) raw_spin_rq_unlock_irq kernel/sched/sched.h:1526 [inline] (P) finish_lock_switch+0xc0/0x1e4 kernel/sched/core.c:5123 (P) raw_spin_rq_unlock_irq kernel/sched/sched.h:1526 [inline] (L) finish_lock_switch+0xbc/0x1e4 kernel/sched/core.c:5123 (L) finish_task_switch+0x120/0x608 kernel/sched/core.c:5241 context_switch kernel/sched/core.c:5372 [inline] __schedule+0x1748/0x27f0 kernel/sched/core.c:6756 preempt_schedule_irq+0x80/0x188 kernel/sched/core.c:7078 arm64_preempt_schedule_irq arch/arm64/kernel/entry-common.c:301 [inline] __el1_irq arch/arm64/kernel/entry-common.c:563 [inline] el1_interrupt+0x4c/0x68 arch/arm64/kernel/entry-common.c:575 el1h_64_irq_handler+0x18/0x24 arch/arm64/kernel/entry-common.c:580 el1h_64_irq+0x6c/0x70 arch/arm64/kernel/entry.S:596 __daif_local_irq_enable arch/arm64/include/asm/irqflags.h:26 [inline] (P) arch_local_irq_enable arch/arm64/include/asm/irqflags.h:48 [inline] (P) do_notify_resume+0x5c/0x1f4 arch/arm64/kernel/entry-common.c:133 (P) do_notify_resume+0x58/0x1f4 arch/arm64/kernel/entry-common.c:133 (L) exit_to_user_mode_prepare arch/arm64/kernel/entry-common.c:169 [inline] exit_to_user_mode arch/arm64/kernel/entry-common.c:178 [inline] el0_interrupt+0xc8/0x248 arch/arm64/kernel/entry-common.c:832 __el0_irq_handler_common+0x18/0x24 arch/arm64/kernel/entry-common.c:837 el0t_64_irq_handler+0x10/0x1c arch/arm64/kernel/entry-common.c:842 el0t_64_irq+0x198/0x19c arch/arm64/kernel/entry.S:601 irq event stamp: 35689 hardirqs last enabled at (35688): [<ffff80008b687f2c>] __exit_to_kernel_mode arch/arm64/kernel/entry-common.c:85 [inline] hardirqs last enabled at (35688): [<ffff80008b687f2c>] exit_to_kernel_mode+0xdc/0x10c arch/arm64/kernel/entry-common.c:95 hardirqs last disabled at (35689): [<ffff80008b6a9b44>] __raw_spin_lock_irq include/linux/spinlock_api_smp.h:117 [inline] hardirqs last disabled at (35689): [<ffff80008b6a9b44>] _raw_spin_lock_irq+0x28/0x70 kernel/locking/spinlock.c:170 softirqs last enabled at (34558): [<ffff800080129934>] local_bh_enable+0x10/0x34 include/linux/bottom_half.h:32 softirqs last disabled at (35685): [<ffff800080020db4>] __do_softirq+0x14/0x20 kernel/softirq.c:588 ---[ end trace 0000000000000000 ]--- --- If you want syzbot to run the reproducer, reply with: #syz test: git://repo/address.git branch-or-commit-hash If you attach or paste a git patch, syzbot will apply it before testing. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [syzbot] [arm?] WARNING in delayed_work_timer_fn 2024-12-13 12:30 ` [syzbot] [arm?] WARNING in delayed_work_timer_fn syzbot @ 2025-02-11 11:10 ` Will Deacon 2025-02-11 20:59 ` Tejun Heo 0 siblings, 1 reply; 5+ messages in thread From: Will Deacon @ 2025-02-11 11:10 UTC (permalink / raw) To: syzbot Cc: catalin.marinas, linux-arm-kernel, linux-kernel, syzkaller-bugs, maz, mark.rutland, richard.xnu.clark, jiangshanlai, tj, marcel, johan.hedberg, luiz.dentz, linux-bluetooth [+ some other folks] On Fri, Dec 13, 2024 at 04:30:24AM -0800, syzbot wrote: > syzbot has found a reproducer for the following issue on: > > HEAD commit: 2e7aff49b5da Merge branches 'for-next/core' and 'for-next/.. > git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci > console output: https://syzkaller.appspot.com/x/log.txt?x=11ed5be8580000 > kernel config: https://syzkaller.appspot.com/x/.config?x=696fb014d05da3a3 > dashboard link: https://syzkaller.appspot.com/bug?extid=e13e654d315d4da1277c > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 > userspace arch: arm64 > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10192d44580000 > > Downloadable assets: > disk image: https://storage.googleapis.com/syzbot-assets/ef408f67fde3/disk-2e7aff49.raw.xz > vmlinux: https://storage.googleapis.com/syzbot-assets/414ac17a20dc/vmlinux-2e7aff49.xz > kernel image: https://storage.googleapis.com/syzbot-assets/a93415d2a7e7/Image-2e7aff49.gz.xz > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: syzbot+e13e654d315d4da1277c@syzkaller.appspotmail.com I spent a bit of time looking at the reports in the "dashboard link" above. There are quite a few in there and I ended up suspecting the bluetooth subsystem and drivers using hci_register_dev(). Specifically, it looks like one of the workqueues is serviced in softirq context after its corresponding device has been destroyed. I wasn't able to reproduce it locally (I don't have a setup for running the 'syz' reproducers), but the program is pretty small (see the "syz repro" link above). Given that this seems to explode a few times a month, I wonder if it's worth adding some instrumentation to e.g. dump the name of the workqueue? Will --->8 (mammoth backtrace below) > ------------[ cut here ]------------ > WARNING: CPU: 1 PID: 6023 at kernel/workqueue.c:2257 current_wq_worker kernel/workqueue_internal.h:69 [inline] > WARNING: CPU: 1 PID: 6023 at kernel/workqueue.c:2257 is_chained_work kernel/workqueue.c:2199 [inline] > WARNING: CPU: 1 PID: 6023 at kernel/workqueue.c:2257 __queue_work+0xe50/0x1308 kernel/workqueue.c:2256 > Modules linked in: > CPU: 1 UID: 0 PID: 6023 Comm: klogd Not tainted 6.13.0-rc2-syzkaller-g2e7aff49b5da #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 > pstate: 404000c5 (nZcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) > pc : __queue_work+0xe50/0x1308 kernel/workqueue_internal.h:69 > lr : current_wq_worker kernel/workqueue_internal.h:69 [inline] > lr : is_chained_work kernel/workqueue.c:2199 [inline] > lr : __queue_work+0xe50/0x1308 kernel/workqueue.c:2256 > sp : ffff800080017b40 > x29: ffff800080017b90 x28: dfff800000000000 x27: ffff700010002f88 > x26: dfff800000000000 x25: ffff0000d1c63800 x24: 0000000000000008 > x23: 1fffe0001b14d791 x22: ffff0000d1c639c0 x21: 0000000000000100 > x20: ffff0000d8a6bc80 x19: ffff0000da2f0930 x18: 1fffe000366c5e7e > x17: ffff800123d64000 x16: ffff80008326d65c x15: 0000000000000002 > x14: 1ffff00011f300ca x13: dfff800000000000 x12: 00000000b1aebe1c > x11: 00000000beaa9035 x10: 0000000000ff0100 x9 : 0000000000000000 > x8 : ffff0000d8a6bc80 x7 : ffff8000805819c0 x6 : 0000000000000000 > x5 : 0000000000000001 x4 : 0000000000000001 x3 : ffff80008df6a320 > x2 : ffff0000da2f0930 x1 : 0000000000000100 x0 : 0000000000000000 > Call trace: > current_wq_worker kernel/workqueue_internal.h:69 [inline] (P) > is_chained_work kernel/workqueue.c:2199 [inline] (P) > __queue_work+0xe50/0x1308 kernel/workqueue.c:2256 (P) > current_wq_worker kernel/workqueue_internal.h:69 [inline] (L) > is_chained_work kernel/workqueue.c:2199 [inline] (L) > __queue_work+0xe50/0x1308 kernel/workqueue.c:2256 (L) > delayed_work_timer_fn+0x74/0x90 kernel/workqueue.c:2485 > call_timer_fn+0x1b4/0x8b8 kernel/time/timer.c:1793 > expire_timers kernel/time/timer.c:1839 [inline] > __run_timers kernel/time/timer.c:2418 [inline] > __run_timer_base+0x59c/0x7b4 kernel/time/timer.c:2430 > run_timer_base kernel/time/timer.c:2439 [inline] > run_timer_softirq+0xcc/0x194 kernel/time/timer.c:2449 > handle_softirqs+0x320/0xd34 kernel/softirq.c:554 > __do_softirq+0x14/0x20 kernel/softirq.c:588 > ____do_softirq+0x14/0x20 arch/arm64/kernel/irq.c:81 > call_on_irq_stack+0x24/0x4c arch/arm64/kernel/entry.S:891 > do_softirq_own_stack+0x20/0x2c arch/arm64/kernel/irq.c:86 > invoke_softirq kernel/softirq.c:435 [inline] > __irq_exit_rcu+0x1d8/0x544 kernel/softirq.c:655 > irq_exit_rcu+0x14/0x84 kernel/softirq.c:671 > __el1_irq arch/arm64/kernel/entry-common.c:561 [inline] > el1_interrupt+0x38/0x68 arch/arm64/kernel/entry-common.c:575 > el1h_64_irq_handler+0x18/0x24 arch/arm64/kernel/entry-common.c:580 > el1h_64_irq+0x6c/0x70 arch/arm64/kernel/entry.S:596 > __daif_local_irq_enable arch/arm64/include/asm/irqflags.h:26 [inline] (P) > arch_local_irq_enable arch/arm64/include/asm/irqflags.h:48 [inline] (P) > raw_spin_rq_unlock_irq kernel/sched/sched.h:1526 [inline] (P) > finish_lock_switch+0xc0/0x1e4 kernel/sched/core.c:5123 (P) > raw_spin_rq_unlock_irq kernel/sched/sched.h:1526 [inline] (L) > finish_lock_switch+0xbc/0x1e4 kernel/sched/core.c:5123 (L) > finish_task_switch+0x120/0x608 kernel/sched/core.c:5241 > context_switch kernel/sched/core.c:5372 [inline] > __schedule+0x1748/0x27f0 kernel/sched/core.c:6756 > preempt_schedule_irq+0x80/0x188 kernel/sched/core.c:7078 > arm64_preempt_schedule_irq arch/arm64/kernel/entry-common.c:301 [inline] > __el1_irq arch/arm64/kernel/entry-common.c:563 [inline] > el1_interrupt+0x4c/0x68 arch/arm64/kernel/entry-common.c:575 > el1h_64_irq_handler+0x18/0x24 arch/arm64/kernel/entry-common.c:580 > el1h_64_irq+0x6c/0x70 arch/arm64/kernel/entry.S:596 > __daif_local_irq_enable arch/arm64/include/asm/irqflags.h:26 [inline] (P) > arch_local_irq_enable arch/arm64/include/asm/irqflags.h:48 [inline] (P) > do_notify_resume+0x5c/0x1f4 arch/arm64/kernel/entry-common.c:133 (P) > do_notify_resume+0x58/0x1f4 arch/arm64/kernel/entry-common.c:133 (L) > exit_to_user_mode_prepare arch/arm64/kernel/entry-common.c:169 [inline] > exit_to_user_mode arch/arm64/kernel/entry-common.c:178 [inline] > el0_interrupt+0xc8/0x248 arch/arm64/kernel/entry-common.c:832 > __el0_irq_handler_common+0x18/0x24 arch/arm64/kernel/entry-common.c:837 > el0t_64_irq_handler+0x10/0x1c arch/arm64/kernel/entry-common.c:842 > el0t_64_irq+0x198/0x19c arch/arm64/kernel/entry.S:601 > irq event stamp: 35689 > hardirqs last enabled at (35688): [<ffff80008b687f2c>] __exit_to_kernel_mode arch/arm64/kernel/entry-common.c:85 [inline] > hardirqs last enabled at (35688): [<ffff80008b687f2c>] exit_to_kernel_mode+0xdc/0x10c arch/arm64/kernel/entry-common.c:95 > hardirqs last disabled at (35689): [<ffff80008b6a9b44>] __raw_spin_lock_irq include/linux/spinlock_api_smp.h:117 [inline] > hardirqs last disabled at (35689): [<ffff80008b6a9b44>] _raw_spin_lock_irq+0x28/0x70 kernel/locking/spinlock.c:170 > softirqs last enabled at (34558): [<ffff800080129934>] local_bh_enable+0x10/0x34 include/linux/bottom_half.h:32 > softirqs last disabled at (35685): [<ffff800080020db4>] __do_softirq+0x14/0x20 kernel/softirq.c:588 > ---[ end trace 0000000000000000 ]--- > > > --- > If you want syzbot to run the reproducer, reply with: > #syz test: git://repo/address.git branch-or-commit-hash > If you attach or paste a git patch, syzbot will apply it before testing. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [syzbot] [arm?] WARNING in delayed_work_timer_fn 2025-02-11 11:10 ` Will Deacon @ 2025-02-11 20:59 ` Tejun Heo 2025-02-14 16:45 ` Will Deacon 0 siblings, 1 reply; 5+ messages in thread From: Tejun Heo @ 2025-02-11 20:59 UTC (permalink / raw) To: Will Deacon Cc: syzbot, catalin.marinas, linux-arm-kernel, linux-kernel, syzkaller-bugs, maz, mark.rutland, richard.xnu.clark, jiangshanlai, marcel, johan.hedberg, luiz.dentz, linux-bluetooth On Tue, Feb 11, 2025 at 11:10:46AM +0000, Will Deacon wrote: > Given that this seems to explode a few times a month, I wonder if it's > worth adding some instrumentation to e.g. dump the name of the workqueue? It's usually most useful to print out the work function when identifying the culprit. I'd be happy to take the patch. Thanks. -- tejun ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [syzbot] [arm?] WARNING in delayed_work_timer_fn 2025-02-11 20:59 ` Tejun Heo @ 2025-02-14 16:45 ` Will Deacon 2025-03-13 19:30 ` Will Deacon 0 siblings, 1 reply; 5+ messages in thread From: Will Deacon @ 2025-02-14 16:45 UTC (permalink / raw) To: Tejun Heo Cc: syzbot, catalin.marinas, linux-arm-kernel, linux-kernel, syzkaller-bugs, maz, mark.rutland, richard.xnu.clark, jiangshanlai, marcel, johan.hedberg, luiz.dentz, linux-bluetooth On Tue, Feb 11, 2025 at 10:59:30AM -1000, Tejun Heo wrote: > On Tue, Feb 11, 2025 at 11:10:46AM +0000, Will Deacon wrote: > > Given that this seems to explode a few times a month, I wonder if it's > > worth adding some instrumentation to e.g. dump the name of the workqueue? > > It's usually most useful to print out the work function when identifying the > culprit. I'd be happy to take the patch. Thanks, Tejun. I sent a patch adding some more diagnostics: https://lore.kernel.org/r/20250214164349.13694-1-will@kernel.org Will ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [syzbot] [arm?] WARNING in delayed_work_timer_fn 2025-02-14 16:45 ` Will Deacon @ 2025-03-13 19:30 ` Will Deacon 0 siblings, 0 replies; 5+ messages in thread From: Will Deacon @ 2025-03-13 19:30 UTC (permalink / raw) To: Tejun Heo Cc: syzbot, catalin.marinas, linux-arm-kernel, linux-kernel, syzkaller-bugs, maz, mark.rutland, richard.xnu.clark, jiangshanlai, marcel, johan.hedberg, luiz.dentz, linux-bluetooth On Fri, Feb 14, 2025 at 04:45:57PM +0000, Will Deacon wrote: > On Tue, Feb 11, 2025 at 10:59:30AM -1000, Tejun Heo wrote: > > On Tue, Feb 11, 2025 at 11:10:46AM +0000, Will Deacon wrote: > > > Given that this seems to explode a few times a month, I wonder if it's > > > worth adding some instrumentation to e.g. dump the name of the workqueue? > > > > It's usually most useful to print out the work function when identifying the > > culprit. I'd be happy to take the patch. > > Thanks, Tejun. I sent a patch adding some more diagnostics: > > https://lore.kernel.org/r/20250214164349.13694-1-will@kernel.org Ok, we have our first crash [1] with the new debug information: [ 77.133818][ T10] Bluetooth: hci3: Opcode 0x0c1a failed: -110 [ 77.135329][ T10] Bluetooth: hci3: Error when powering off device on rfkill (-110) [ 77.143656][ C1] ------------[ cut here ]------------ [ 77.145336][ C1] workqueue: cannot queue hci_cmd_timeout on wq hci3 [ 77.147101][ C1] WARNING: CPU: 1 PID: 7433 at kernel/workqueue.c:2258 __queue_work+0xe94/0x1324 [ 77.149301][ C1] Modules linked in: [ 77.150287][ C1] CPU: 1 UID: 0 PID: 7433 Comm: syz.2.193 Not tainted 6.14.0-rc5-syzkaller-g77c95b8c7a16 #0 [ 77.152960][ C1] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 [ 77.155583][ C1] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 77.157546][ C1] pc : __queue_work+0xe94/0x1324 [ 77.158811][ C1] lr : __queue_work+0xe94/0x1324 [ 77.160114][ C1] sp : ffff800080017b40 [ 77.161145][ C1] x29: ffff800080017b90 x28: dfff800000000000 x27: ffff700010002f88 [ 77.163285][ C1] x26: ffff80008fbbd600 x25: ffff0000dacac800 x24: 0000000000000008 [ 77.165326][ C1] x23: ffff0000dacac9c0 x22: 0000000000000100 x21: 1fffe0001b595938 [ 77.167421][ C1] x20: ffff800092f09000 x19: ffff0000ecf20948 x18: 0000000000000008 [ 77.169356][ C1] x17: 0000000000000000 x16: ffff80008b72ce6c x15: ffff700011f87b38 [ 77.171279][ C1] x14: 1ffff00011f87b38 x13: 0000000000000004 x12: ffffffffffffffff [ 77.173271][ C1] x11: 0000000000000103 x10: 0000000000ff0100 x9 : d5bcdc7e4519df00 [ 77.175255][ C1] x8 : d5bcdc7e4519df00 x7 : 0000000000000001 x6 : 0000000000000001 [ 77.177277][ C1] x5 : ffff8000800172f8 x4 : ffff80008fcaf7c0 x3 : ffff8000804a7bb4 [ 77.179372][ C1] x2 : 0000000000000000 x1 : 0000000100000101 x0 : 0000000000000000 [ 77.181374][ C1] Call trace: [ 77.182230][ C1] __queue_work+0xe94/0x1324 (P) [ 77.183476][ C1] delayed_work_timer_fn+0x74/0x90 [ 77.184755][ C1] call_timer_fn+0x1b4/0x8b8 [ 77.185932][ C1] __run_timer_base+0x59c/0x7b4 [ 77.187119][ C1] run_timer_softirq+0xcc/0x194 [ 77.188360][ C1] handle_softirqs+0x320/0xd34 [ 77.189576][ C1] __do_softirq+0x14/0x20 [ 77.190672][ C1] ____do_softirq+0x14/0x20 [ 77.191884][ C1] call_on_irq_stack+0x24/0x4c [ 77.193050][ C1] do_softirq_own_stack+0x20/0x2c [ 77.194311][ C1] __irq_exit_rcu+0x1d8/0x544 [ 77.195550][ C1] irq_exit_rcu+0x14/0x84 [ 77.196669][ C1] el1_interrupt+0x38/0x68 [ 77.197818][ C1] el1h_64_irq_handler+0x18/0x24 [ 77.199152][ C1] el1h_64_irq+0x6c/0x70 [ 77.200245][ C1] lock_acquire+0x278/0x724 (P) [ 77.201520][ C1] rcu_lock_acquire+0x44/0x54 [ 77.202722][ C1] page_ext_get+0x2c/0x2e8 [ 77.203908][ C1] page_table_check_set+0xa0/0x408 [ 77.205117][ C1] __page_table_check_ptes_set+0x2d0/0x398 [ 77.206601][ C1] set_pte_range+0x618/0x644 [ 77.207792][ C1] finish_fault+0x968/0xd6c [ 77.208968][ C1] handle_pte_fault+0x3528/0x57b0 [ 77.210314][ C1] handle_mm_fault+0xfa8/0x188c [ 77.211592][ C1] __get_user_pages+0x1878/0x3400 [ 77.212967][ C1] populate_vma_page_range+0x220/0x2f0 [ 77.214435][ C1] __mm_populate+0x240/0x3d8 [ 77.215600][ C1] vm_mmap_pgoff+0x304/0x3c4 [ 77.216833][ C1] ksys_mmap_pgoff+0xd0/0x5c8 [ 77.218066][ C1] __arm64_sys_mmap+0xf8/0x110 [ 77.219195][ C1] invoke_syscall+0x98/0x2b8 [ 77.220453][ C1] el0_svc_common+0x130/0x23c [ 77.221747][ C1] do_el0_svc+0x48/0x58 [ 77.222881][ C1] el0_svc+0x54/0x168 [ 77.223946][ C1] el0t_64_sync_handler+0x84/0x108 [ 77.225305][ C1] el0t_64_sync+0x198/0x19c [ 77.226425][ C1] irq event stamp: 3967 [ 77.227573][ C1] hardirqs last enabled at (3966): [<ffff80008b7ec538>] _raw_spin_unlock_irqrestore+0x38/0x98 [ 77.230143][ C1] hardirqs last disabled at (3967): [<ffff80008b7ec3e0>] _raw_spin_lock_irq+0x28/0x70 [ 77.232639][ C1] softirqs last enabled at (2318): [<ffff800080311b48>] handle_softirqs+0xb44/0xd34 [ 77.235151][ C1] softirqs last disabled at (3923): [<ffff800080020dbc>] __do_softirq+0x14/0x20 [ 77.237528][ C1] ---[ end trace 0000000000000000 ]--- So it looks like I was right about bluetooth (the work function is "hci_cmd_timeout") but, unfortunately, "hci3" doesn't tell us much about the client. Perhaps the failed power-off right before the warning suggests that we're not tearing down the wq properly on that error path? Let's see... hci_rfkill_set_block() calls hci_dev_do_close() if the power-off fails. That in turn cancels the delayed 'cmd_timer' work but only if HCI_UP is not set in the device flags. But if hci_dev_do_poweroff() failed, HCI_UP may well still be set. Then somehow the workqueue is destroyed, I guess via hci_unregister_dev() setting HCI_UNREGISTER which enables hci_release_dev() but at that point it's hard to know which bluetooth driver is responsible. Assuming hci_unregister_dev() is triggered via hci_dev_do_close() closing the device, then unconditionally cancelling the delayed cmd work when clearing HCI_UP sounds like a reasonable starter for ten? Dunno, this is all guesswork in code that I'm not familiar with, so I'm hoping somebody can tell me why I'm wrong :) Will --->8 diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c index dd770ef5ec36..d2623f6d0593 100644 --- a/net/bluetooth/hci_sync.c +++ b/net/bluetooth/hci_sync.c @@ -5147,11 +5147,9 @@ int hci_dev_close_sync(struct hci_dev *hdev) } err = hci_dev_shutdown(hdev); - - if (!test_and_clear_bit(HCI_UP, &hdev->flags)) { - cancel_delayed_work_sync(&hdev->cmd_timer); + cancel_delayed_work_sync(&hdev->cmd_timer); + if (!test_and_clear_bit(HCI_UP, &hdev->flags)) return err; - } hci_leds_update_powered(hdev, false); ^ permalink raw reply related [flat|nested] 5+ messages in thread
end of thread, other threads:[~2025-03-13 19:33 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <00000000000046efb605fec3def0@google.com>
2024-12-13 12:30 ` [syzbot] [arm?] WARNING in delayed_work_timer_fn syzbot
2025-02-11 11:10 ` Will Deacon
2025-02-11 20:59 ` Tejun Heo
2025-02-14 16:45 ` Will Deacon
2025-03-13 19:30 ` Will Deacon
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox