* [syzbot] [bluetooth?] possible deadlock in __flush_workqueue
From: syzbot @ 2024-01-17 10:03 UTC (permalink / raw)
To: johan.hedberg, linux-bluetooth, linux-kernel, luiz.dentz, marcel,
syzkaller-bugs
Hello,
syzbot found the following issue on:
HEAD commit: 943b9f0ab2cf Add linux-next specific files for 20240117
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=121de2fbe80000
kernel config: https://syzkaller.appspot.com/x/.config?x=12af1d067b6a6d19
dashboard link: https://syzkaller.appspot.com/bug?extid=da0a9c9721e36db712e8
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/9c032ce79e0f/disk-943b9f0a.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/93163e287878/vmlinux-943b9f0a.xz
kernel image: https://storage.googleapis.com/syzbot-assets/512cc2e14a4b/bzImage-943b9f0a.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+da0a9c9721e36db712e8@syzkaller.appspotmail.com
Bluetooth: hci2: Opcode 0x0c03 failed: -110
============================================
WARNING: possible recursive locking detected
6.7.0-next-20240117-syzkaller #0 Not tainted
--------------------------------------------
kworker/u5:1/21244 is trying to acquire lock:
ffff88802e0a2538 ((wq_completion)hci2){+.+.}-{0:0}, at: __flush_workqueue+0x141/0x1340 kernel/workqueue.c:3147
but task is already holding lock:
ffff88802e0a2538 ((wq_completion)hci2){+.+.}-{0:0}, at: process_one_work+0x7ba/0x16e0 kernel/workqueue.c:2608
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock((wq_completion)hci2);
lock((wq_completion)hci2);
*** DEADLOCK ***
May be due to missing lock nesting notation
2 locks held by kworker/u5:1/21244:
#0: ffff88802e0a2538 ((wq_completion)hci2){+.+.}-{0:0}, at: process_one_work+0x7ba/0x16e0 kernel/workqueue.c:2608
#1: ffffc9000dc27d80 ((work_completion)(&hdev->error_reset)){+.+.}-{0:0}, at: process_one_work+0x824/0x16e0 kernel/workqueue.c:2609
stack backtrace:
CPU: 1 PID: 21244 Comm: kworker/u5:1 Not tainted 6.7.0-next-20240117-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
Workqueue: hci2 hci_error_reset
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
check_deadlock kernel/locking/lockdep.c:3062 [inline]
validate_chain kernel/locking/lockdep.c:3856 [inline]
__lock_acquire+0x20e6/0x3b30 kernel/locking/lockdep.c:5137
lock_acquire kernel/locking/lockdep.c:5754 [inline]
lock_acquire+0x1b1/0x540 kernel/locking/lockdep.c:5719
__flush_workqueue+0x14b/0x1340 kernel/workqueue.c:3147
drain_workqueue+0x18f/0x3d0 kernel/workqueue.c:3312
destroy_workqueue+0xc3/0xb10 kernel/workqueue.c:4801
hci_release_dev+0x14e/0x620 net/bluetooth/hci_core.c:2808
bt_host_release+0x6a/0xb0 net/bluetooth/hci_sysfs.c:94
device_release+0xa1/0x240 drivers/base/core.c:2485
kobject_cleanup lib/kobject.c:682 [inline]
kobject_release lib/kobject.c:716 [inline]
kref_put include/linux/kref.h:65 [inline]
kobject_put+0x1d0/0x440 lib/kobject.c:733
put_device+0x1f/0x30 drivers/base/core.c:3733
process_one_work+0x8d5/0x16e0 kernel/workqueue.c:2633
process_scheduled_works kernel/workqueue.c:2707 [inline]
worker_thread+0x8b6/0x1290 kernel/workqueue.c:2788
kthread+0x2c1/0x3a0 kernel/kthread.c:388
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:242
</TASK>
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
* [PATCH] Bluetooth: hci_core: cancel rx_work,cmd_work,tx_work,power_on,error_reset works upon hci_unregister_dev()
From: Tetsuo Handa @ 2024-06-10 11:00 UTC (permalink / raw)
To: Marcel Holtmann, Johan Hedberg, Luiz Augusto von Dentz
Cc: linux-bluetooth@vger.kernel.org, syzbot+da0a9c9721e36db712e8,
syzkaller-bugs
syzbot is reporting that calling hci_release_dev() from hci_error_reset()
due to hci_dev_put() from hci_error_reset() can cause deadlock at
destroy_workqueue(), for hci_error_reset() is called from
hdev->req_workqueue which destroy_workqueue() needs to flush.
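The self-deadlock described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: it is not the kernel workqueue API, every name in it (model_wq, model_flush(), model_destroy(), model_error_reset(), and so on) is hypothetical, and instead of really blocking it records that a flush was requested from inside a work item running on the same queue, which is the situation lockdep flags in the report.

```c
/* User-space model only. This is NOT the kernel workqueue API;
 * every identifier here is hypothetical and chosen for illustration. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_wq {
	const char *name;
	bool running_work;    /* true while one of this queue's items executes */
	bool self_flush_seen; /* set when a flush is attempted from such an item */
};

typedef void (*model_work_fn)(struct model_wq *wq);

/* A flush waits for every in-flight item on the queue. If the caller is
 * itself an in-flight item on that queue, the wait could never finish.
 * The model records that case and returns false instead of blocking. */
static bool model_flush(struct model_wq *wq)
{
	assert(wq != NULL);
	if (wq->running_work) {
		wq->self_flush_seen = true;
		return false;
	}
	return true;
}

/* Destroying a queue has to flush it first. */
static bool model_destroy(struct model_wq *wq)
{
	return model_flush(wq);
}

/* Run one work item on the queue, marking the queue busy meanwhile. */
static void model_run_work(struct model_wq *wq, model_work_fn fn)
{
	wq->running_work = true;
	fn(wq);
	wq->running_work = false;
}

/* Stand-in for a work item whose final reference drop ends up in the
 * release path, which then tries to destroy the queue it runs on. */
static void model_error_reset(struct model_wq *wq)
{
	(void)model_destroy(wq);
}

/* Returns true when the scenario hits the self-flush case. */
static bool model_scenario_self_deadlocks(void)
{
	struct model_wq wq = { .name = "hci2" };

	model_run_work(&wq, model_error_reset);
	return wq.self_flush_seen;
}
```

In this model, model_scenario_self_deadlocks() returns true, while calling model_destroy() on an idle queue from outside any work item returns true for "flush completed". The real kernel would block rather than return; the flag only stands in for that.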
We need to make sure that hdev->{rx_work,cmd_work,tx_work} which are
queued into hdev->workqueue and hdev->{power_on,error_reset} which are
queued into hdev->req_workqueue are no longer running by the moment
  destroy_workqueue(hdev->workqueue);
  destroy_workqueue(hdev->req_workqueue);
are called from hci_release_dev().
Call cancel_work_sync() on these work items from hci_unregister_dev()
as soon as hdev->list is removed from hci_dev_list.
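Why waiting for the work items first helps can be sketched with a second user-space model. It rests on assumptions that are not taken from the patch: the device starts with two references (one held by the unregistering owner, one by an in-flight work item), and model_finish_work() stands in for "wait until the item has finished", which is the part of cancel_work_sync() that matters here. All names are hypothetical. The point it tries to show is only the ordering: if the work item's put happens before the owner's put, the final release no longer runs in work context.

```c
/* User-space model only; all identifiers are hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct model_dev {
	int refs;              /* outstanding references */
	bool work_in_flight;   /* a work item has started and holds one ref */
	bool in_work;          /* currently executing inside that work item */
	bool released;         /* last reference was dropped */
	bool released_in_work; /* ...and it was dropped from work context */
};

/* Drop one reference; the last put triggers the (modelled) release. */
static void model_put(struct model_dev *d)
{
	assert(d->refs > 0);
	if (--d->refs == 0) {
		d->released = true;
		d->released_in_work = d->in_work;
	}
}

/* Let the in-flight work item run to its end, where it drops its
 * reference. Does nothing if the item has already finished. */
static void model_finish_work(struct model_dev *d)
{
	if (!d->work_in_flight)
		return;
	d->work_in_flight = false;
	d->in_work = true;
	model_put(d);
	d->in_work = false;
}

/* Tear the device down, optionally waiting for the work item first.
 * Returns true when the release ended up running in work context. */
static bool model_teardown(bool wait_for_work_first)
{
	struct model_dev d = { .refs = 2, .work_in_flight = true };

	if (wait_for_work_first)
		model_finish_work(&d); /* item's put leaves one reference */
	model_put(&d);                 /* owner drops its reference */
	model_finish_work(&d);         /* no-op if the item already finished */

	assert(d.released);
	return d.released_in_work;
}
```

Here model_teardown(false) returns true (the release runs inside the work item), and model_teardown(true) returns false (the release runs in the owner's context). Whether the set of work items cancelled by the patch is complete is a separate question that this model does not address.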
Reported-by: syzbot <syzbot+da0a9c9721e36db712e8@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=da0a9c9721e36db712e8
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
Completely untested. Please do tests with lockdep enabled before committing.
Maybe it is too early to cancel hdev->{rx_work,cmd_work,tx_work}.
Maybe there are more work items which should be canceled before
hci_unregister_dev() completes. I don't know...
net/bluetooth/hci_core.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index dd3b0f501018..dbbe5e2da210 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -2751,7 +2751,11 @@ void hci_unregister_dev(struct hci_dev *hdev)
 	list_del(&hdev->list);
 	write_unlock(&hci_dev_list_lock);
 
+	cancel_work_sync(&hdev->rx_work);
+	cancel_work_sync(&hdev->cmd_work);
+	cancel_work_sync(&hdev->tx_work);
 	cancel_work_sync(&hdev->power_on);
+	cancel_work_sync(&hdev->error_reset);
 
 	hci_cmd_sync_clear(hdev);
 
--
2.18.4
* RE: Bluetooth: hci_core: cancel rx_work,cmd_work,tx_work,power_on,error_reset works upon hci_unregister_dev()
From: bluez.test.bot @ 2024-06-10 11:32 UTC (permalink / raw)
To: linux-bluetooth, penguin-kernel
This is an automated email; please do not reply to it!
Dear submitter,
Thank you for submitting the patches to the linux bluetooth mailing list.
These are the CI test results for your patch series:
PW Link: https://patchwork.kernel.org/project/bluetooth/list/?series=860391
---Test result---
Test Summary:
CheckPatch                    PASS      0.68 seconds
GitLint                       FAIL      0.50 seconds
SubjectPrefix                 PASS      0.13 seconds
BuildKernel                   PASS      29.41 seconds
CheckAllWarning               PASS      32.19 seconds
CheckSparse                   PASS      37.62 seconds
CheckSmatch                   FAIL      35.74 seconds
BuildKernel32                 PASS      29.74 seconds
TestRunnerSetup               PASS      514.54 seconds
TestRunner_l2cap-tester       PASS      20.03 seconds
TestRunner_iso-tester         PASS      28.23 seconds
TestRunner_bnep-tester        PASS      4.74 seconds
TestRunner_mgmt-tester        PASS      111.15 seconds
TestRunner_rfcomm-tester      PASS      7.28 seconds
TestRunner_sco-tester         PASS      49.37 seconds
TestRunner_ioctl-tester       PASS      7.67 seconds
TestRunner_mesh-tester        PASS      7.84 seconds
TestRunner_smp-tester         PASS      6.87 seconds
TestRunner_userchan-tester    PASS      4.97 seconds
IncrementalBuild              PASS      27.45 seconds
Details
##############################
Test: GitLint - FAIL
Desc: Run gitlint
Output:
Bluetooth: hci_core: cancel rx_work,cmd_work,tx_work,power_on,error_reset works upon hci_unregister_dev()
WARNING: I3 - ignore-body-lines: gitlint will be switching from using Python regex 'match' (match beginning) to 'search' (match anywhere) semantics. Please review your ignore-body-lines.regex option accordingly. To remove this warning, set general.regex-style-search=True. More details: https://jorisroovers.github.io/gitlint/configuration/#regex-style-search
1: T1 Title exceeds max length (105>80): "Bluetooth: hci_core: cancel rx_work,cmd_work,tx_work,power_on,error_reset works upon hci_unregister_dev()"
##############################
Test: CheckSmatch - FAIL
Desc: Run smatch tool with source
Output:
Segmentation fault (core dumped)
make[4]: *** [scripts/Makefile.build:244: net/bluetooth/hci_core.o] Error 139
make[4]: *** Deleting file 'net/bluetooth/hci_core.o'
make[3]: *** [scripts/Makefile.build:485: net/bluetooth] Error 2
make[2]: *** [scripts/Makefile.build:485: net] Error 2
make[2]: *** Waiting for unfinished jobs....
Segmentation fault (core dumped)
make[4]: *** [scripts/Makefile.build:244: drivers/bluetooth/bcm203x.o] Error 139
make[4]: *** Deleting file 'drivers/bluetooth/bcm203x.o'
make[4]: *** Waiting for unfinished jobs....
Segmentation fault (core dumped)
make[4]: *** [scripts/Makefile.build:244: drivers/bluetooth/bpa10x.o] Error 139
make[4]: *** Deleting file 'drivers/bluetooth/bpa10x.o'
make[3]: *** [scripts/Makefile.build:485: drivers/bluetooth] Error 2
make[2]: *** [scripts/Makefile.build:485: drivers] Error 2
make[1]: *** [/github/workspace/src/src/Makefile:1919: .] Error 2
make: *** [Makefile:240: __sub-make] Error 2
---
Regards,
Linux Bluetooth
* Re: [PATCH] Bluetooth: hci_core: cancel rx_work,cmd_work,tx_work,power_on,error_reset works upon hci_unregister_dev()
From: patchwork-bot+bluetooth @ 2024-06-14 15:20 UTC (permalink / raw)
To: Tetsuo Handa
Cc: marcel, johan.hedberg, luiz.dentz, linux-bluetooth,
syzbot+da0a9c9721e36db712e8, syzkaller-bugs
Hello:
This patch was applied to bluetooth/bluetooth-next.git (master)
by Luiz Augusto von Dentz <luiz.von.dentz@intel.com>:
On Mon, 10 Jun 2024 20:00:32 +0900 you wrote:
> syzbot is reporting that calling hci_release_dev() from hci_error_reset()
> due to hci_dev_put() from hci_error_reset() can cause deadlock at
> destroy_workqueue(), for hci_error_reset() is called from
> hdev->req_workqueue which destroy_workqueue() needs to flush.
>
> We need to make sure that hdev->{rx_work,cmd_work,tx_work} which are
> queued into hdev->workqueue and hdev->{power_on,error_reset} which are
> queued into hdev->req_workqueue are no longer running by the moment
>
> [...]
Here is the summary with links:
- Bluetooth: hci_core: cancel rx_work,cmd_work,tx_work,power_on,error_reset works upon hci_unregister_dev()
https://git.kernel.org/bluetooth/bluetooth-next/c/5b41aa213455
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html