* [syzbot] [bluetooth?] possible deadlock in __flush_workqueue
@ 2024-01-17 10:03 syzbot
  2024-06-10 11:00 ` [PATCH] Bluetooth: hci_core: cancel rx_work,cmd_work,tx_work,power_on,error_reset works upon hci_unregister_dev() Tetsuo Handa
  0 siblings, 1 reply; 4+ messages in thread
From: syzbot @ 2024-01-17 10:03 UTC
  To: johan.hedberg, linux-bluetooth, linux-kernel, luiz.dentz, marcel,
	syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    943b9f0ab2cf Add linux-next specific files for 20240117
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=121de2fbe80000
kernel config:  https://syzkaller.appspot.com/x/.config?x=12af1d067b6a6d19
dashboard link: https://syzkaller.appspot.com/bug?extid=da0a9c9721e36db712e8
compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/9c032ce79e0f/disk-943b9f0a.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/93163e287878/vmlinux-943b9f0a.xz
kernel image: https://storage.googleapis.com/syzbot-assets/512cc2e14a4b/bzImage-943b9f0a.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+da0a9c9721e36db712e8@syzkaller.appspotmail.com

Bluetooth: hci2: Opcode 0x0c03 failed: -110
============================================
WARNING: possible recursive locking detected
6.7.0-next-20240117-syzkaller #0 Not tainted
--------------------------------------------
kworker/u5:1/21244 is trying to acquire lock:
ffff88802e0a2538 ((wq_completion)hci2){+.+.}-{0:0}, at: __flush_workqueue+0x141/0x1340 kernel/workqueue.c:3147

but task is already holding lock:
ffff88802e0a2538 ((wq_completion)hci2){+.+.}-{0:0}, at: process_one_work+0x7ba/0x16e0 kernel/workqueue.c:2608

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock((wq_completion)hci2);
  lock((wq_completion)hci2);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

2 locks held by kworker/u5:1/21244:
 #0: ffff88802e0a2538 ((wq_completion)hci2){+.+.}-{0:0}, at: process_one_work+0x7ba/0x16e0 kernel/workqueue.c:2608
 #1: ffffc9000dc27d80 ((work_completion)(&hdev->error_reset)){+.+.}-{0:0}, at: process_one_work+0x824/0x16e0 kernel/workqueue.c:2609

stack backtrace:
CPU: 1 PID: 21244 Comm: kworker/u5:1 Not tainted 6.7.0-next-20240117-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
Workqueue: hci2 hci_error_reset
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
 check_deadlock kernel/locking/lockdep.c:3062 [inline]
 validate_chain kernel/locking/lockdep.c:3856 [inline]
 __lock_acquire+0x20e6/0x3b30 kernel/locking/lockdep.c:5137
 lock_acquire kernel/locking/lockdep.c:5754 [inline]
 lock_acquire+0x1b1/0x540 kernel/locking/lockdep.c:5719
 __flush_workqueue+0x14b/0x1340 kernel/workqueue.c:3147
 drain_workqueue+0x18f/0x3d0 kernel/workqueue.c:3312
 destroy_workqueue+0xc3/0xb10 kernel/workqueue.c:4801
 hci_release_dev+0x14e/0x620 net/bluetooth/hci_core.c:2808
 bt_host_release+0x6a/0xb0 net/bluetooth/hci_sysfs.c:94
 device_release+0xa1/0x240 drivers/base/core.c:2485
 kobject_cleanup lib/kobject.c:682 [inline]
 kobject_release lib/kobject.c:716 [inline]
 kref_put include/linux/kref.h:65 [inline]
 kobject_put+0x1d0/0x440 lib/kobject.c:733
 put_device+0x1f/0x30 drivers/base/core.c:3733
 process_one_work+0x8d5/0x16e0 kernel/workqueue.c:2633
 process_scheduled_works kernel/workqueue.c:2707 [inline]
 worker_thread+0x8b6/0x1290 kernel/workqueue.c:2788
 kthread+0x2c1/0x3a0 kernel/kthread.c:388
 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:242
 </TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup
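
Editor's illustration (not part of the syzbot report): the backtrace above shows a work item running on workqueue hci2 (hci_error_reset) reaching destroy_workqueue() on that same workqueue via put_device() and hci_release_dev(), so the flush would wait on the very item that is calling it. The following is a minimal user-space sketch of that pattern. It is not kernel code; every toy_* name is hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a workqueue; not the kernel API. */
struct toy_wq;

struct toy_work {
	void (*fn)(struct toy_wq *wq);
};

struct toy_wq {
	const struct toy_work *running;	/* item currently executing, or NULL */
	bool self_flush_detected;	/* set when an item flushes its own queue */
};

/*
 * Models a flush: it must wait for the running item to finish.
 * If the caller *is* the running item, it would wait on itself forever,
 * so the model records the problem and reports failure instead of hanging.
 */
bool toy_flush(struct toy_wq *wq)
{
	if (wq->running != NULL) {
		wq->self_flush_detected = true;
		return false;
	}
	return true;
}

/* Runs one work item, marking it as running for the duration. */
void toy_run(struct toy_wq *wq, const struct toy_work *work)
{
	assert(wq->running == NULL);	/* the model is strictly single-threaded */
	wq->running = work;
	work->fn(wq);
	wq->running = NULL;
}

/*
 * Models a work function that drops the last device reference, so the
 * release path (and its flush) runs inside the work item itself.
 */
void toy_error_reset(struct toy_wq *wq)
{
	(void)toy_flush(wq);
}
```

Running toy_error_reset through toy_run() sets self_flush_detected, which corresponds to the recursive (wq_completion)hci2 acquisition that lockdep flags; calling toy_flush() from outside any work item succeeds.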


* [PATCH] Bluetooth: hci_core: cancel rx_work,cmd_work,tx_work,power_on,error_reset works upon hci_unregister_dev()
  2024-01-17 10:03 [syzbot] [bluetooth?] possible deadlock in __flush_workqueue syzbot
@ 2024-06-10 11:00 ` Tetsuo Handa
  2024-06-10 11:32   ` bluez.test.bot
  2024-06-14 15:20   ` [PATCH] " patchwork-bot+bluetooth
  0 siblings, 2 replies; 4+ messages in thread
From: Tetsuo Handa @ 2024-06-10 11:00 UTC
  To: Marcel Holtmann, Johan Hedberg, Luiz Augusto von Dentz
  Cc: linux-bluetooth@vger.kernel.org, syzbot+da0a9c9721e36db712e8,
	syzkaller-bugs

syzbot is reporting that calling hci_release_dev() from hci_error_reset()
due to hci_dev_put() from hci_error_reset() can cause deadlock at
destroy_workqueue(), for hci_error_reset() is called from
hdev->req_workqueue which destroy_workqueue() needs to flush.

We need to make sure that hdev->{rx_work,cmd_work,tx_work} which are
queued into hdev->workqueue and hdev->{power_on,error_reset} which are
queued into hdev->req_workqueue are no longer running by the moment

       destroy_workqueue(hdev->workqueue);
       destroy_workqueue(hdev->req_workqueue);

are called from hci_release_dev().

Call cancel_work_sync() on these work items from hci_unregister_dev()
as soon as hdev->list is removed from hci_dev_list.

Reported-by: syzbot <syzbot+da0a9c9721e36db712e8@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=da0a9c9721e36db712e8
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
Completely untested. Please do tests with lockdep enabled before committing.
Maybe it is too early to cancel hdev->{rx_work,cmd_work,tx_work}.
Maybe there are more work items which should be canceled before
hci_unregister_dev() completes. I don't know...

 net/bluetooth/hci_core.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index dd3b0f501018..dbbe5e2da210 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -2751,7 +2751,11 @@ void hci_unregister_dev(struct hci_dev *hdev)
 	list_del(&hdev->list);
 	write_unlock(&hci_dev_list_lock);
 
+	cancel_work_sync(&hdev->rx_work);
+	cancel_work_sync(&hdev->cmd_work);
+	cancel_work_sync(&hdev->tx_work);
 	cancel_work_sync(&hdev->power_on);
+	cancel_work_sync(&hdev->error_reset);
 
 	hci_cmd_sync_clear(hdev);
 
-- 
2.18.4
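
Editor's illustration (not part of the patch): the ordering the patch aims for is that every work item is cancelled in hci_unregister_dev(), so that nothing is left for destroy_workqueue() to drain later in hci_release_dev(). The sketch below models only that ordering in user space. All toy_* names are hypothetical, and the model is deliberately simplified: the real cancel_work_sync() also waits for an already-running instance to finish, which a single-threaded model cannot show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the ordering; not kernel code. */
enum { TOY_NWORKS = 5 };	/* rx_work, cmd_work, tx_work, power_on, error_reset */

struct toy_dev {
	bool pending[TOY_NWORKS];	/* true while the work item is queued */
	bool registered;
};

/* Models the dequeue half of cancel_work_sync(); returns whether it was pending. */
bool toy_cancel_work_sync(struct toy_dev *d, int i)
{
	bool was_pending = d->pending[i];

	d->pending[i] = false;
	return was_pending;
}

/* Models the patched unregister path: cancel every work item up front. */
void toy_unregister(struct toy_dev *d)
{
	d->registered = false;
	for (int i = 0; i < TOY_NWORKS; i++)
		(void)toy_cancel_work_sync(d, i);
}

/* Counts the items a later workqueue teardown would still have to drain. */
int toy_items_left_to_drain(const struct toy_dev *d)
{
	int n = 0;

	for (int i = 0; i < TOY_NWORKS; i++)
		if (d->pending[i])
			n++;
	return n;
}
```

In this model, a device with two queued items reports two items to drain before toy_unregister() and none afterwards, which is the state the release path is meant to find.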




* RE: Bluetooth: hci_core: cancel rx_work,cmd_work,tx_work,power_on,error_reset works upon hci_unregister_dev()
  2024-06-10 11:00 ` [PATCH] Bluetooth: hci_core: cancel rx_work,cmd_work,tx_work,power_on,error_reset works upon hci_unregister_dev() Tetsuo Handa
@ 2024-06-10 11:32   ` bluez.test.bot
  2024-06-14 15:20   ` [PATCH] " patchwork-bot+bluetooth
  1 sibling, 0 replies; 4+ messages in thread
From: bluez.test.bot @ 2024-06-10 11:32 UTC
  To: linux-bluetooth, penguin-kernel


This is an automated email; please do not reply to it.

Dear submitter,

Thank you for submitting the patches to the linux bluetooth mailing list.
These are the CI test results for your patch series:
PW Link: https://patchwork.kernel.org/project/bluetooth/list/?series=860391

---Test result---

Test Summary:
CheckPatch                    PASS      0.68 seconds
GitLint                       FAIL      0.50 seconds
SubjectPrefix                 PASS      0.13 seconds
BuildKernel                   PASS      29.41 seconds
CheckAllWarning               PASS      32.19 seconds
CheckSparse                   PASS      37.62 seconds
CheckSmatch                   FAIL      35.74 seconds
BuildKernel32                 PASS      29.74 seconds
TestRunnerSetup               PASS      514.54 seconds
TestRunner_l2cap-tester       PASS      20.03 seconds
TestRunner_iso-tester         PASS      28.23 seconds
TestRunner_bnep-tester        PASS      4.74 seconds
TestRunner_mgmt-tester        PASS      111.15 seconds
TestRunner_rfcomm-tester      PASS      7.28 seconds
TestRunner_sco-tester         PASS      49.37 seconds
TestRunner_ioctl-tester       PASS      7.67 seconds
TestRunner_mesh-tester        PASS      7.84 seconds
TestRunner_smp-tester         PASS      6.87 seconds
TestRunner_userchan-tester    PASS      4.97 seconds
IncrementalBuild              PASS      27.45 seconds

Details
##############################
Test: GitLint - FAIL
Desc: Run gitlint
Output:
Bluetooth: hci_core: cancel rx_work,cmd_work,tx_work,power_on,error_reset works upon hci_unregister_dev()

WARNING: I3 - ignore-body-lines: gitlint will be switching from using Python regex 'match' (match beginning) to 'search' (match anywhere) semantics. Please review your ignore-body-lines.regex option accordingly. To remove this warning, set general.regex-style-search=True. More details: https://jorisroovers.github.io/gitlint/configuration/#regex-style-search
1: T1 Title exceeds max length (105>80): "Bluetooth: hci_core: cancel rx_work,cmd_work,tx_work,power_on,error_reset works upon hci_unregister_dev()"
##############################
Test: CheckSmatch - FAIL
Desc: Run smatch tool with source
Output:

Segmentation fault (core dumped)
make[4]: *** [scripts/Makefile.build:244: net/bluetooth/hci_core.o] Error 139
make[4]: *** Deleting file 'net/bluetooth/hci_core.o'
make[3]: *** [scripts/Makefile.build:485: net/bluetooth] Error 2
make[2]: *** [scripts/Makefile.build:485: net] Error 2
make[2]: *** Waiting for unfinished jobs....
Segmentation fault (core dumped)
make[4]: *** [scripts/Makefile.build:244: drivers/bluetooth/bcm203x.o] Error 139
make[4]: *** Deleting file 'drivers/bluetooth/bcm203x.o'
make[4]: *** Waiting for unfinished jobs....
Segmentation fault (core dumped)
make[4]: *** [scripts/Makefile.build:244: drivers/bluetooth/bpa10x.o] Error 139
make[4]: *** Deleting file 'drivers/bluetooth/bpa10x.o'
make[3]: *** [scripts/Makefile.build:485: drivers/bluetooth] Error 2
make[2]: *** [scripts/Makefile.build:485: drivers] Error 2
make[1]: *** [/github/workspace/src/src/Makefile:1919: .] Error 2
make: *** [Makefile:240: __sub-make] Error 2


---
Regards,
Linux Bluetooth



* Re: [PATCH] Bluetooth: hci_core: cancel rx_work,cmd_work,tx_work,power_on,error_reset works upon hci_unregister_dev()
  2024-06-10 11:00 ` [PATCH] Bluetooth: hci_core: cancel rx_work,cmd_work,tx_work,power_on,error_reset works upon hci_unregister_dev() Tetsuo Handa
  2024-06-10 11:32   ` bluez.test.bot
@ 2024-06-14 15:20   ` patchwork-bot+bluetooth
  1 sibling, 0 replies; 4+ messages in thread
From: patchwork-bot+bluetooth @ 2024-06-14 15:20 UTC
  To: Tetsuo Handa
  Cc: marcel, johan.hedberg, luiz.dentz, linux-bluetooth,
	syzbot+da0a9c9721e36db712e8, syzkaller-bugs

Hello:

This patch was applied to bluetooth/bluetooth-next.git (master)
by Luiz Augusto von Dentz <luiz.von.dentz@intel.com>:

On Mon, 10 Jun 2024 20:00:32 +0900 you wrote:
> syzbot is reporting that calling hci_release_dev() from hci_error_reset()
> due to hci_dev_put() from hci_error_reset() can cause deadlock at
> destroy_workqueue(), for hci_error_reset() is called from
> hdev->req_workqueue which destroy_workqueue() needs to flush.
> 
> We need to make sure that hdev->{rx_work,cmd_work,tx_work} which are
> queued into hdev->workqueue and hdev->{power_on,error_reset} which are
> queued into hdev->req_workqueue are no longer running by the moment
> 
> [...]

Here is the summary with links:
  - Bluetooth: hci_core: cancel rx_work,cmd_work,tx_work,power_on,error_reset works upon hci_unregister_dev()
    https://git.kernel.org/bluetooth/bluetooth-next/c/5b41aa213455

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


