[syzbot] [bluetooth?] WARNING in hci_send_cmd (4)

public inbox for linux-bluetooth@vger.kernel.org

* [syzbot] [bluetooth?] WARNING in hci_send_cmd (4)
@ 2026-04-25 23:07 syzbot
  2026-04-26 19:11 ` Arjan van de Ven
From: syzbot @ 2026-04-25 23:07 UTC
  To: linux-bluetooth, linux-kernel, luiz.dentz, marcel, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    b4e07588e743 tracing: tell git to ignore the generated 'un..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1768ac36580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=f17ac3d4b47ac43a
dashboard link: https://syzkaller.appspot.com/bug?extid=00f5a866124dc44cce14
compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-b4e07588.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/a3832abcd2f7/vmlinux-b4e07588.xz
kernel image: https://storage.googleapis.com/syzbot-assets/0b5d6e6e9cbd/bzImage-b4e07588.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+00f5a866124dc44cce14@syzkaller.appspotmail.com

workqueue: cannot queue hci_cmd_work on wq hci0
WARNING: kernel/workqueue.c:2298 at __queue_work+0xd1f/0xfc0 kernel/workqueue.c:2296, CPU#0: kworker/0:3/1378
Modules linked in:
CPU: 0 UID: 0 PID: 1378 Comm: kworker/0:3 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Workqueue: events l2cap_info_timeout
RIP: 0010:__queue_work+0xd4a/0xfc0 kernel/workqueue.c:2296
Code: 83 c5 18 4c 89 e8 48 c1 e8 03 42 80 3c 20 00 74 08 4c 89 ef e8 17 4d a5 00 49 8b 75 00 49 81 c7 70 01 00 00 4c 89 f7 4c 89 fa <67> 48 0f b9 3a 48 83 c4 58 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc
RSP: 0018:ffffc9000257f720 EFLAGS: 00010082
RAX: 1ffff110081cc181 RBX: 0000000000000008 RCX: ffff888000260000
RDX: ffff888040182170 RSI: ffffffff8aa9ccd0 RDI: ffffffff90368d70
RBP: 0000000000000020 R08: ffff888040e60bf7 R09: 1ffff110081cc17e
R10: dffffc0000000000 R11: ffffed10081cc17f R12: dffffc0000000000
R13: ffff888040e60c08 R14: ffffffff90368d70 R15: ffff888040182170
FS:  0000000000000000(0000) GS:ffff88808c80c000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000200000005dc0 CR3: 0000000012a31000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 queue_work_on+0x106/0x1d0 kernel/workqueue.c:2432
 queue_work include/linux/workqueue.h:696 [inline]
 hci_send_cmd+0xb7/0x1a0 net/bluetooth/hci_core.c:3111
 hci_conn_auth net/bluetooth/hci_conn.c:2459 [inline]
 hci_conn_security+0x599/0xa80 net/bluetooth/hci_conn.c:2551
 l2cap_conn_start+0x3bc/0xf20 net/bluetooth/l2cap_core.c:1534
 l2cap_info_timeout+0x68/0xa0 net/bluetooth/l2cap_core.c:1685
 process_one_work kernel/workqueue.c:3302 [inline]
 process_scheduled_works+0xb5d/0x1860 kernel/workqueue.c:3385
 worker_thread+0xa53/0xfc0 kernel/workqueue.c:3466
 kthread+0x388/0x470 kernel/kthread.c:436
 ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
 </TASK>
----------------
Code disassembly (best guess):
   0:	83 c5 18             	add    $0x18,%ebp
   3:	4c 89 e8             	mov    %r13,%rax
   6:	48 c1 e8 03          	shr    $0x3,%rax
   a:	42 80 3c 20 00       	cmpb   $0x0,(%rax,%r12,1)
   f:	74 08                	je     0x19
  11:	4c 89 ef             	mov    %r13,%rdi
  14:	e8 17 4d a5 00       	call   0xa54d30
  19:	49 8b 75 00          	mov    0x0(%r13),%rsi
  1d:	49 81 c7 70 01 00 00 	add    $0x170,%r15
  24:	4c 89 f7             	mov    %r14,%rdi
  27:	4c 89 fa             	mov    %r15,%rdx
* 2a:	67 48 0f b9 3a       	ud1    (%edx),%rdi <-- trapping instruction
  2f:	48 83 c4 58          	add    $0x58,%rsp
  33:	5b                   	pop    %rbx
  34:	41 5c                	pop    %r12
  36:	41 5d                	pop    %r13
  38:	41 5e                	pop    %r14
  3a:	41 5f                	pop    %r15
  3c:	5d                   	pop    %rbp
  3d:	c3                   	ret
  3e:	cc                   	int3
  3f:	cc                   	int3


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* Re: [syzbot] [bluetooth?] WARNING in hci_send_cmd (4)
  2026-04-25 23:07 [syzbot] [bluetooth?] WARNING in hci_send_cmd (4) syzbot
@ 2026-04-26 19:11 ` Arjan van de Ven
From: Arjan van de Ven @ 2026-04-26 19:11 UTC
  To: linux-bluetooth
  Cc: linux-kernel, luiz.dentz, marcel, syzbot+00f5a866124dc44cce14,
	syzkaller-bugs

This email was generated by automation to help kernel developers
deal with a large volume of AI-generated bug reports by decoding
oopses into more actionable information.


Decoded Backtrace

1. __queue_work -- crash site (kernel/workqueue.c:2297)

   2275  static void __queue_work(int cpu, struct workqueue_struct *wq,
   2276                           struct work_struct *work)
   2277  {
   2278      struct pool_workqueue *pwq;
   2279      struct worker_pool *last_pool, *pool;
   2280      unsigned int work_flags;
   2281      unsigned int req_cpu = cpu;
   2289      lockdep_assert_irqs_disabled();
   2291      /*
   2292       * For a draining wq, only works from the same workqueue are
   2293       * allowed. The __WQ_DESTROYING helps to spot the issue that
   2294       * queues a new work item to a wq after destroy_workqueue(wq).
   2295       */
   2296      if (unlikely(wq->flags & (__WQ_DESTROYING | __WQ_DRAINING) &&
-> 2297             WARN_ONCE(!is_chained_work(wq),
   2298                             "workqueue: cannot queue %ps on wq %s\n",
   2299                             work->func, wq->name))) {
   2300          return;
   2301      }

2. queue_work_on (kernel/workqueue.c:2432)

   2422  bool queue_work_on(int cpu, struct workqueue_struct *wq,
   2423                     struct work_struct *work)
   2424  {
   2425      bool ret = false;
   2426      unsigned long irq_flags;
   2428      local_irq_save(irq_flags);
   2430      if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work)) &&
   2431          !clear_pending_if_disabled(work)) {
-> 2432          __queue_work(cpu, wq, work);
   2433          ret = true;
   2434      }
   2436      local_irq_restore(irq_flags);
   2437      return ret;
   2438  }

3. hci_send_cmd (net/bluetooth/hci_core.c:3111)

   3092  int hci_send_cmd(struct hci_dev *hdev, __u16 opcode, __u32 plen,
   3093                   const void *param)
   3094  {
   3095      struct sk_buff *skb;
   3097      BT_DBG("%s opcode 0x%4.4x plen %d", hdev->name, opcode, plen);
   3099      skb = hci_cmd_sync_alloc(hdev, opcode, plen, param, NULL);
   3100      if (!skb) {
   3101          bt_dev_err(hdev, "no memory for command");
   3102          return -ENOMEM;
   3103      }
   3105      /* Stand-alone HCI commands must be flagged as
   3106       * single-command requests.
   3107       */
   3108      bt_cb(skb)->hci.req_flags |= HCI_REQ_START;
   3110      skb_queue_tail(&hdev->cmd_q, skb);
-> 3111      queue_work(hdev->workqueue, &hdev->cmd_work);    // hdev->workqueue is __WQ_DRAINING
   3113      return 0;
   3114  }

   Inlined caller: hci_conn_auth (net/bluetooth/hci_conn.c:2459)

   2438  static int hci_conn_auth(struct hci_conn *conn, __u8 sec_level,
   2439                           __u8 auth_type)
   2440  {
   2454      if (!test_and_set_bit(HCI_CONN_AUTH_PEND, &conn->flags)) {
   2455          struct hci_cp_auth_requested cp;
   2456          cp.handle = cpu_to_le16(conn->handle);
-> 2459          hci_send_cmd(conn->hdev, HCI_OP_AUTH_REQUESTED,
   2460                       sizeof(cp), &cp);
   2461      }
   2464      return 0;
   2465  }

4. hci_conn_security (net/bluetooth/hci_conn.c:2551)

   2487  int hci_conn_security(struct hci_conn *conn, __u8 sec_level,
   2488                        __u8 auth_type, bool initiator)
   2489  {
   2544  auth:
   2548      if (initiator)
   2549          set_bit(HCI_CONN_AUTH_INITIATOR, &conn->flags);
-> 2551      if (!hci_conn_auth(conn, sec_level, auth_type))
   2552          return 0;
   2563  }

5. l2cap_info_timeout (net/bluetooth/l2cap_core.c:1685)
   -- running on the 'events' workqueue, NOT on hdev->workqueue

   1675  static void l2cap_info_timeout(struct work_struct *work)
   1676  {
   1677      struct l2cap_conn *conn = container_of(work, struct l2cap_conn,
   1678                                             info_timer.work);
   1680      conn->info_state |= L2CAP_INFO_FEAT_MASK_REQ_DONE;
   1681      conn->info_ident = 0;
   1683  
   1684      mutex_lock(&conn->lock);
-> 1685      l2cap_conn_start(conn);
   1686      mutex_unlock(&conn->lock);
   1687  }


Tentative Analysis

The WARNING fires in __queue_work() at kernel/workqueue.c:2297 when
hci_send_cmd() attempts to queue hci_cmd_work onto hdev->workqueue
after that workqueue has entered the draining or destroying state.
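
As a rough illustration of the check that produces this WARNING, here is
a minimal userspace sketch. It is not kernel code: struct model_wq, the
MODEL_WQ_* flags and model_queue_work() are hypothetical stand-ins, and
the 'chained' argument stands in for is_chained_work().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace model only; these names are hypothetical, not kernel API. */
enum {
    MODEL_WQ_DESTROYING = 1u << 0,  /* stands in for __WQ_DESTROYING */
    MODEL_WQ_DRAINING   = 1u << 1,  /* stands in for __WQ_DRAINING */
};

struct model_wq {
    const char *name;
    unsigned int flags;
    int queued;     /* number of accepted work items */
    bool warned;    /* WARN_ONCE-style latch: message printed once */
};

/*
 * Returns true if the work item was accepted. A draining or destroying
 * queue only accepts work that is chained from the same queue.
 */
static bool model_queue_work(struct model_wq *wq, const char *func,
                             bool chained)
{
    assert(wq != NULL && func != NULL);

    if ((wq->flags & (MODEL_WQ_DESTROYING | MODEL_WQ_DRAINING)) &&
        !chained) {
        if (!wq->warned) {
            fprintf(stderr, "workqueue: cannot queue %s on wq %s\n",
                    func, wq->name);
            wq->warned = true;
        }
        return false;   /* rejected: the item is never queued */
    }

    wq->queued++;
    return true;
}
```

In this model, a call made from outside the queue (chained == false)
while the draining flag is set is rejected, which corresponds to what
the trace shows for hci_cmd_work on wq hci0.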

The call chain is:
  l2cap_info_timeout (events wq)
    -> l2cap_conn_start
    -> hci_conn_security
    -> hci_conn_auth [inlined]
    -> hci_send_cmd
    -> queue_work(hdev->workqueue, &hdev->cmd_work)   // <- triggers the WARNING

The race arises in hci_dev_close_sync() (net/bluetooth/hci_sync.c):

  1. HCI_UP is cleared with test_and_clear_bit(HCI_UP, ...).
  2. drain_workqueue(hdev->workqueue) is called -- sets __WQ_DRAINING.
  3. hci_conn_hash_flush() -> l2cap_conn_del() ->
     disable_delayed_work_sync(&conn->info_timer) -- cancels the timer.

The l2cap_info_timeout callback runs on the 'events' workqueue, which
is entirely separate from hdev->workqueue.  Draining hdev->workqueue
has no effect on work items pending on the events workqueue.  If the
l2cap_info_timeout work item was already queued there, it can run
after step 2 has set __WQ_DRAINING and before step 3 cancels the
timer, so it reaches hci_send_cmd() while hdev->workqueue is in the
__WQ_DRAINING state.
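
The ordering described above can be replayed deterministically in a
small userspace sketch. This only models the tentative analysis; it
proves nothing about the real kernel. The event names, struct
model_outcome and model_replay() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event names; each stands for one step described above. */
enum model_event {
    EV_CLEAR_UP,      /* step 1: HCI_UP cleared */
    EV_START_DRAIN,   /* step 2: __WQ_DRAINING set on hdev->workqueue */
    EV_CANCEL_TIMER,  /* step 3: info_timer disabled */
    EV_INFO_TIMEOUT,  /* l2cap_info_timeout runs on the 'events' wq */
};

struct model_outcome {
    int rejected;            /* queue attempts that hit a draining wq */
    int rejected_while_down; /* of those, how many saw HCI_UP clear */
};

/* Applies the events in order and reports what the timeout ran into. */
static struct model_outcome model_replay(const enum model_event *ev,
                                         size_t n)
{
    bool up = true, draining = false, timer_armed = true;
    struct model_outcome out = { 0, 0 };

    assert(ev != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        switch (ev[i]) {
        case EV_CLEAR_UP:     up = false;          break;
        case EV_START_DRAIN:  draining = true;     break;
        case EV_CANCEL_TIMER: timer_armed = false; break;
        case EV_INFO_TIMEOUT:
            if (!timer_armed)
                break;          /* cancelled before it could run */
            timer_armed = false;
            if (draining) {     /* hci_send_cmd -> queue_work attempt */
                out.rejected++;
                if (!up)
                    out.rejected_while_down++;
            }
            break;
        }
    }
    return out;
}
```

Replaying { EV_CLEAR_UP, EV_START_DRAIN, EV_INFO_TIMEOUT,
EV_CANCEL_TIMER } yields one rejected attempt, and HCI_UP is already
clear at that point.  Swapping the last two events yields none.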

hci_send_cmd() does not check HCI_UP before calling queue_work(),
unlike hci_recv_frame() and the hci_dev_ioctl() handler, which both
guard with test_bit(HCI_UP, &hdev->flags).


Potential Solution

Add the same HCI_UP guard to hci_send_cmd() that already exists in
hci_recv_frame() and hci_dev_ioctl():

    if (!test_bit(HCI_UP, &hdev->flags))
        return -ENETDOWN;

Place this check at the top of hci_send_cmd(), before the skb
allocation.  Since HCI_UP is cleared before drain_workqueue() in
hci_dev_close_sync(), this guard catches the race without introducing
any resource leak, lock imbalance, or side effects on callers.
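
To show the intended effect of the guard, here is a minimal userspace
sketch, not a patch.  struct model_hdev and model_hci_send_cmd() are
hypothetical stand-ins for the real structures, and the with_guard
switch exists only to compare both behaviours side by side.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; field names are hypothetical. */
struct model_hdev {
    bool up;            /* models HCI_UP */
    bool wq_draining;   /* models __WQ_DRAINING on hdev->workqueue */
    int cmd_q_len;      /* models the length of hdev->cmd_q */
    int warnings;       /* times queue_work would have hit the WARN */
};

/*
 * Models hci_send_cmd(). With the guard enabled, a device that is
 * already down is rejected before anything is allocated or queued.
 */
static int model_hci_send_cmd(struct model_hdev *hdev, bool with_guard)
{
    assert(hdev != NULL);

    if (with_guard && !hdev->up)
        return -ENETDOWN;       /* proposed early exit */

    hdev->cmd_q_len++;          /* models skb_queue_tail() */
    if (hdev->wq_draining)
        hdev->warnings++;       /* models the rejected queue_work() */
    return 0;
}
```

Without the guard, the model returns 0 even though the command is left
on the queue and the work item is rejected.  With the guard, the caller
gets -ENETDOWN and nothing is queued.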


More information

Oops-Analysis: http://oops.fenrus.org/reports/lkml/69ed492c.050a0220.e51af.0005.GAE_google.com/
Assisted-by: GitHub Copilot linux-kernel-oops-x86.
