public inbox for linux-bluetooth@vger.kernel.org
* Bluetooth: btusb: Intel AX211 (8087:0033) permanently lost after suspend/resume
@ 2026-03-20 11:40 Frederik Berg
  2026-03-20 11:44 ` [PATCH] Bluetooth: btusb: cancel pending HCI commands before USB suspend Frederik Berg
  0 siblings, 1 reply; 5+ messages in thread
From: Frederik Berg @ 2026-03-20 11:40 UTC
  To: linux-bluetooth

Hi,

I'm hitting an issue where my Intel AX211 Bluetooth controller (USB 
8087:0033) permanently vanishes after suspend/resume. The device 
disappears from lsusb entirely and doesn't come back, even across 
further suspend/resume cycles. The only way to recover without rebooting
is to reset the xHCI host controller.

I traced this across multiple suspend/resume cycles in the same boot and 
found what looks like a race condition between the PM notifier's HCI 
shutdown and the USB subsystem tearing down the device. I also took a 
look at the btusb source and believe I found the root cause, included as 
a patch in a follow-up.

*Setup:*

  * Lenovo ThinkPad T14 Intel
  * Intel AX211 Bluetooth, USB ID 8087:0033
  * Internal USB port 3-10, xHCI at PCI 0000:00:14.0
  * Firmware: intel/ibt-0040-0041.sfi, version 133-20.25
  * Kernel: 6.17.0-14-generic (Ubuntu)

*What I'm seeing:*

On successful resumes, the HCI power-off command (Write Scan Enable, 
0x0c1a) times out with -ETIMEDOUT. The ACPI reset method then kicks in, 
the device re-enumerates at full-speed, firmware loads, everything works:

  Bluetooth: hci0: command 0x0c1a tx timeout
  Bluetooth: hci0: Initiating acpi reset method
  Bluetooth: hci0: Opcode 0x0c1a failed: -110
  usb 3-10: new full-speed USB device using xhci_hcd
  usb 3-10: New USB device found, idVendor=8087, idProduct=0033
  Bluetooth: hci0: Found device firmware: intel/ibt-0040-0041.sfi
  Bluetooth: hci0: Firmware loaded in 1602079 usecs

On the failed resume, the USB subsystem disconnects the device before
the HCI command can time out. The error is -ENODEV instead of -ETIMEDOUT:

  usb 3-10: USB disconnect, device number 19
  Bluetooth: hci0: Opcode 0x0c1a failed: -19
  Bluetooth: hci0: Error when powering off device on rfkill (-19)
  Bluetooth: hci0: sending frame failed (-19)
  Bluetooth: hci0: HCI reset during shutdown failed
  usb 3-10: new low-speed USB device number 30 using xhci_hcd
  usb 3-10: Device not responding to setup address.
  usb 3-10: device not accepting address 30, error -71
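
The negative numbers in both logs are Linux errno values. As a small
sketch for decoding the three that show up here (bt_errno_name is a
made-up helper name, not an existing tool):

```shell
# Map the errno values seen in the logs above to their symbolic names.
# bt_errno_name is a hypothetical helper written for this illustration.
bt_errno_name() {
    case "${1#-}" in            # accept "19" as well as "-19"
        19)  echo ENODEV ;;     # no such device
        71)  echo EPROTO ;;     # protocol error
        110) echo ETIMEDOUT ;;  # connection timed out
        *)   echo UNKNOWN ;;    # anything this sketch does not cover
    esac
}
```

For example, "bt_errno_name -110" prints ETIMEDOUT (the successful
resume) and "bt_errno_name -19" prints ENODEV (the failed one).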

After this, /sys/bus/usb/devices/3-10/ is gone, bluetoothctl says "No 
default controller available", and btusb is loaded with 0 users. rfkill 
block/unblock does nothing since the USB device itself no longer exists.
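
A quick way to tell the two outcomes apart after resume is to test for
the sysfs node. This is only a sketch: usb_dev_present is a hypothetical
helper, and the optional second argument exists purely so the check can
be tried against a scratch directory instead of the real /sys.

```shell
# usb_dev_present PORT [SYSFS_ROOT]
# Succeeds if the USB device node for PORT exists under sysfs.
# SYSFS_ROOT defaults to /sys; it is a parameter of this sketch only.
usb_dev_present() {
    [ -d "${2:-/sys}/bus/usb/devices/$1" ]
}
```

On the affected machine, "usb_dev_present 3-10 && echo present || echo
gone" would be expected to report "gone" after the failed resume.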

*What I think is happening in the code:*

hci_cmd_timeout() (hci_core.c:1462) is the only path that calls 
hdev->reset(), which for Intel devices triggers 
btintel_acpi_reset_method() via btusb_intel_reset(). But when the 
pending HCI command is canceled with ENODEV (via 
hci_cmd_sync_cancel_sync in hci_dev_close_sync), hci_cmd_timeout never 
fires, so the ACPI reset that would recover the device is bypassed entirely.

The race exists because the PM notifier (hci_suspend_notifier -> 
hci_suspend_dev) and USB suspend (btusb_suspend) run concurrently with 
no synchronization. Whether the device survives depends on whether the 
HCI command times out before or after the USB subsystem tears things down.

*Recovery* (without reboot):

  echo "0000:00:14.0" > /sys/bus/pci/drivers/xhci_hcd/unbind
  sleep 3
  echo "0000:00:14.0" > /sys/bus/pci/drivers/xhci_hcd/bind

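The same steps can be wrapped in a function. This is a rough sketch
under a few assumptions: rebind_xhci is a made-up name, the driver
directory and delay are parameters only so the function can be dry-run
against a scratch directory, and the real thing needs root. Note that
unbinding the host controller disconnects every USB device attached to
it, not just the Bluetooth controller.

```shell
# rebind_xhci PCI_ADDR [DRIVER_DIR] [DELAY]
# Unbind a PCI device from xhci_hcd, wait, then bind it again.
# DRIVER_DIR and DELAY are parameters of this sketch; their defaults
# match the commands shown above.
rebind_xhci() {
    addr=$1
    dir=${2:-/sys/bus/pci/drivers/xhci_hcd}
    delay=${3:-3}
    printf '%s\n' "$addr" > "$dir/unbind" || return 1
    sleep "$delay"
    printf '%s\n' "$addr" > "$dir/bind"
}
```

Usage on this machine would be "rebind_xhci 0000:00:14.0".
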
*Reproduction:*

Intermittent, timing-dependent. In my case, the first two suspend/resume 
cycles in a boot succeeded and the third failed. I haven't bisected to a 
specific commit yet.

Thanks,
Frederik Berg




* [PATCH] Bluetooth: btusb: cancel pending HCI commands before USB suspend
  2026-03-20 11:40 Bluetooth: btusb: Intel AX211 (8087:0033) permanently lost after suspend/resume Frederik Berg
@ 2026-03-20 11:44 ` Frederik Berg
  2026-03-20 12:14   ` bluez.test.bot
  2026-03-20 18:31   ` [PATCH] " Luiz Augusto von Dentz
  0 siblings, 2 replies; 5+ messages in thread
From: Frederik Berg @ 2026-03-20 11:44 UTC
  To: linux-bluetooth; +Cc: Frederik Berg

During system suspend, the PM notifier and USB subsystem suspend run
concurrently without synchronization. If the PM notifier's
hci_power_off_sync() is blocked waiting for an HCI command response
(e.g., Write Scan Enable, opcode 0x0c1a) when the USB subsystem tears
down the device, the command fails with -ENODEV.

Unlike a timeout (-ETIMEDOUT), an ENODEV cancellation does not trigger
hci_cmd_timeout(), which means hdev->reset() is never called and the
Intel ACPI reset method (_PRR/_RST) that would recover the device is
bypassed. The device then fails to re-enumerate on resume (detected as
low-speed instead of full-speed, error -71) and is permanently lost
until the xHCI host controller is reset or the system is rebooted.

Cancel any pending synchronous HCI commands in btusb_suspend() before
btusb_stop_traffic() kills the URBs, but only for non-autosuspend
(system suspend). This matches the pattern used by hci_suspend_dev(),
which calls hci_cmd_sync_cancel_sync(hdev, EHOSTDOWN) before proceeding
with HCI-level suspend.

Signed-off-by: Frederik Berg <fberg@posteo.de>
---
Proposed fix for the race described in the parent email. This cancels
any pending synchronous HCI commands in btusb_suspend() before
btusb_stop_traffic() kills the URBs, preventing the race. Only for
system suspend, not autosuspend. Same pattern used by hci_suspend_dev(),
which calls hci_cmd_sync_cancel_sync(hdev, EHOSTDOWN).

I haven't been able to build-test this against a running kernel yet,
so consider this an RFC.

 drivers/bluetooth/btusb.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
index a1c5eb993e4..acec484b495 100644
--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -4497,6 +4497,9 @@ static int btusb_suspend(struct usb_interface *intf, pm_message_t message)
 
 	cancel_work_sync(&data->work);
 
+	if (!PMSG_IS_AUTO(message))
+		hci_cmd_sync_cancel_sync(data->hdev, EHOSTDOWN);
+
 	if (data->suspend)
 		data->suspend(data->hdev);
 
-- 
2.53.0



* RE: Bluetooth: btusb: cancel pending HCI commands before USB suspend
  2026-03-20 11:44 ` [PATCH] Bluetooth: btusb: cancel pending HCI commands before USB suspend Frederik Berg
@ 2026-03-20 12:14   ` bluez.test.bot
  2026-03-20 18:31   ` [PATCH] " Luiz Augusto von Dentz
  1 sibling, 0 replies; 5+ messages in thread
From: bluez.test.bot @ 2026-03-20 12:14 UTC
  To: linux-bluetooth, fberg


This is an automated email; please do not reply to it.

Dear submitter,

Thank you for submitting the patches to the linux bluetooth mailing list.
These are the CI test results for your patch series:
PW Link: https://patchwork.kernel.org/project/bluetooth/list/?series=1069868

---Test result---

Test Summary:
CheckPatch                    PENDING   0.38 seconds
GitLint                       PENDING   0.23 seconds
SubjectPrefix                 PASS      0.09 seconds
BuildKernel                   PASS      26.69 seconds
CheckAllWarning               PASS      29.06 seconds
CheckSparse                   PASS      28.04 seconds
BuildKernel32                 PASS      25.68 seconds
TestRunnerSetup               PASS      569.56 seconds
TestRunner_l2cap-tester       PASS      28.01 seconds
TestRunner_iso-tester         FAIL      38.59 seconds
TestRunner_bnep-tester        PASS      6.27 seconds
TestRunner_mgmt-tester        FAIL      116.26 seconds
TestRunner_rfcomm-tester      PASS      9.50 seconds
TestRunner_sco-tester         FAIL      14.44 seconds
TestRunner_ioctl-tester       PASS      10.30 seconds
TestRunner_mesh-tester        FAIL      11.49 seconds
TestRunner_smp-tester         PASS      8.55 seconds
TestRunner_userchan-tester    PASS      6.66 seconds
IncrementalBuild              PENDING   0.89 seconds

Details
##############################
Test: CheckPatch - PENDING
Desc: Run checkpatch.pl script
Output:

##############################
Test: GitLint - PENDING
Desc: Run gitlint
Output:

##############################
Test: TestRunner_iso-tester - FAIL
Desc: Run iso-tester with test-runner
Output:
BUG: KASAN: slab-use-after-free in le_read_features_complete+0x7e/0x2b0
Total: 141, Passed: 141 (100.0%), Failed: 0, Not Run: 0
##############################
Test: TestRunner_mgmt-tester - FAIL
Desc: Run mgmt-tester with test-runner
Output:
Total: 494, Passed: 489 (99.0%), Failed: 1, Not Run: 4

Failed Test Cases
Read Exp Feature - Success                           Failed       0.104 seconds
##############################
Test: TestRunner_sco-tester - FAIL
Desc: Run sco-tester with test-runner
Output:
WARNING: possible circular locking dependency detected
BUG: sleeping function called from invalid context at net/core/sock.c:3782
Total: 30, Passed: 30 (100.0%), Failed: 0, Not Run: 0
##############################
Test: TestRunner_mesh-tester - FAIL
Desc: Run mesh-tester with test-runner
Output:
Total: 10, Passed: 8 (80.0%), Failed: 2, Not Run: 0

Failed Test Cases
Mesh - Send cancel - 1                               Timed out    1.845 seconds
Mesh - Send cancel - 2                               Timed out    1.997 seconds
##############################
Test: IncrementalBuild - PENDING
Desc: Incremental build with the patches in the series
Output:



---
Regards,
Linux Bluetooth



* Re: [PATCH] Bluetooth: btusb: cancel pending HCI commands before USB suspend
  2026-03-20 11:44 ` [PATCH] Bluetooth: btusb: cancel pending HCI commands before USB suspend Frederik Berg
  2026-03-20 12:14   ` bluez.test.bot
@ 2026-03-20 18:31   ` Luiz Augusto von Dentz
  2026-03-20 18:48     ` Frederik Berg
  1 sibling, 1 reply; 5+ messages in thread
From: Luiz Augusto von Dentz @ 2026-03-20 18:31 UTC
  To: Frederik Berg; +Cc: linux-bluetooth

Hi Frederik,

On Fri, Mar 20, 2026 at 7:44 AM Frederik Berg <fberg@posteo.de> wrote:
>
> During system suspend, the PM notifier and USB subsystem suspend run
> concurrently without synchronization. If the PM notifier's
> hci_power_off_sync() is blocked waiting for an HCI command response
> (e.g., Write Scan Enable, opcode 0x0c1a) when the USB subsystem tears
> down the device, the command fails with -ENODEV.
>
> Unlike a timeout (-ETIMEDOUT), an ENODEV cancellation does not trigger
> hci_cmd_timeout(), which means hdev->reset() is never called and the
> Intel ACPI reset method (_PRR/_RST) that would recover the device is
> bypassed. The device then fails to re-enumerate on resume (detected as
> low-speed instead of full-speed, error -71) and is permanently lost
> until the xHCI host controller is reset or the system is rebooted.
>
> Cancel any pending synchronous HCI commands in btusb_suspend() before
> btusb_stop_traffic() kills the URBs, but only for non-autosuspend
> (system suspend). This matches the pattern used by hci_suspend_dev()
> which calls hci_cancel_cmd_sync(hdev, EHOSTDOWN) before proceeding
> with HCI-level suspend.
>
> Signed-off-by: Frederik Berg <fberg@posteo.de>
> ---
> Proposed fix for the race described in the parent email. This cancels
> any pending synchronous HCI commands in btusb_suspend() before
> btusb_stop_traffic() kills the URBs, preventing the race. Only for
> system suspend, not autosuspend. Same pattern used by hci_suspend_dev()
> which calls hci_cancel_cmd_sync(hdev, EHOSTDOWN).
>
> I haven't been able to build-test this against a running kernel yet,
> so consider this an RFC.
>
>  drivers/bluetooth/btusb.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
> index a1c5eb993e4..acec484b495 100644
> --- a/drivers/bluetooth/btusb.c
> +++ b/drivers/bluetooth/btusb.c
> @@ -4497,6 +4497,9 @@ static int btusb_suspend(struct usb_interface *intf, pm_message_t message)
>
>         cancel_work_sync(&data->work);
>
> +       if (!PMSG_IS_AUTO(message))
> +               hci_cmd_sync_cancel_sync(data->hdev, EHOSTDOWN);
> +
>         if (data->suspend)
>                 data->suspend(data->hdev);
>
> --
> 2.53.0

Have to agree with a few points over here:

https://sashiko.dev/#/patchset/20260320114347.2340528-1-fberg%40posteo.de

Especially the first point: if hci_suspend_sync (which is what should be
pending, not hci_power_off_sync) had run, then there shouldn't be
anything to cancel at btusb_suspend and hdev should be suspended.
Perhaps something is trying to power off bluetooth (e.g. userspace)?
Anyway, hci_suspend_dev already performs
hci_cmd_sync_cancel_sync(data->hdev, EHOSTDOWN); or do you have
HCI_QUIRK_NO_SUSPEND_NOTIFIER set for it, which is why it doesn't run?

The other option would be to call hci_suspend_dev directly from the
driver, but then again register_pm_notifier guarantees that the
notifier has completed before the driver suspend is called, or am I
missing something?

-- 
Luiz Augusto von Dentz


* Re: [PATCH] Bluetooth: btusb: cancel pending HCI commands before USB suspend
  2026-03-20 18:31   ` [PATCH] " Luiz Augusto von Dentz
@ 2026-03-20 18:48     ` Frederik Berg
  0 siblings, 0 replies; 5+ messages in thread
From: Frederik Berg @ 2026-03-20 18:48 UTC
  To: Luiz Augusto von Dentz; +Cc: linux-bluetooth

Hi Luiz,

Thank you for the quick and thorough feedback.

You're right that HCI_QUIRK_NO_SUSPEND_NOTIFIER is not set for Intel, so 
hci_suspend_dev() should be running and completing before 
btusb_suspend(). I missed that the PM notifier ordering guarantees this.

Looking at the logs again more carefully, the error message is 
specifically "Error when powering off device on rfkill (-19)" which 
points to the hci_rfkill_set_block() -> hci_dev_do_poweroff() path, not 
the PM notifier's hci_suspend_sync(). So you're right that something in 
userspace appears to be triggering rfkill to power off bluetooth around 
the same time as system suspend.
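
To check a saved kernel log for that marker, something like the
following could be used. It is a minimal sketch: poweroff_path is a
hypothetical helper, only the rfkill message string comes from the logs
in this thread, and the "other" label is just this sketch's catch-all.

```shell
# poweroff_path: read kernel log text on stdin and report whether the
# power-off failure came through the rfkill path.
# The matched string is the message quoted above; "other" is a
# catch-all label chosen for this illustration.
poweroff_path() {
    if grep -q 'Error when powering off device on rfkill'; then
        echo rfkill
    else
        echo other
    fi
}
```

For example: "dmesg | poweroff_path".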

I had a custom systemd user service running that did "rfkill block 
bluetooth" followed by "rfkill unblock bluetooth" on resume, which was 
an earlier workaround attempt. That rfkill block could have been racing 
with USB suspend. I've since removed it and will test a few 
suspend/resume cycles without it to see if the issue still reproduces.
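
For anyone else chasing the same symptom, a quick search for leftover
units that call rfkill might look like this. It is only a sketch:
find_rfkill_units is a made-up helper, and the default directory is the
usual per-user systemd unit location, which is an assumption about where
such a unit would live.

```shell
# find_rfkill_units [DIR]
# List files under DIR that mention rfkill.
# DIR defaults to the per-user systemd unit directory (an assumption of
# this sketch); pass another directory to search elsewhere.
find_rfkill_units() {
    grep -rl -- 'rfkill' "${1:-$HOME/.config/systemd/user}" 2>/dev/null
}
```

The function prints one matching file per line and returns non-zero
when nothing matches.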

I'll report back with results. If the issue persists without the rfkill 
workaround, I'll capture btmon traces and dmesg with more detail around 
the suspend path.

Thanks,
Frederik


