From: Desnes Nunes <desnesn@redhat.com>
To: linux-kernel@vger.kernel.org, linux-usb@vger.kernel.org
Cc: gregkh@linuxfoundation.org, mathias.nyman@intel.com,
Desnes Nunes <desnesn@redhat.com>,
stable@vger.kernel.org
Subject: [PATCH] usb: xhci: bound wait command completion to avoid kdump deadlock
Date: Wed, 29 Apr 2026 22:48:17 -0300
Message-ID: <20260430014817.2006885-1-desnesn@redhat.com>
The following deadlock in the USB subsystem can be triggered during kdump:
systemd-udevd[402]: usb3: Worker [419] processing SEQNUM=2194 is taking a long time
dracut-initqueue[432]: Timed out while waiting for udev queue to empty.
systemd-udevd[402]: usb3: Worker [419] processing SEQNUM=2194 killed
systemd-udevd[402]: usb3: Worker [419] terminated by signal 9 (KILL).
...
kdump[720]: saving vmcore complete
...
systemd-shutdown[1]: Rebooting.
INFO: task kworker/0:6:76 blocked for more than 122 seconds.
Not tainted 6.12.0-223.2443_2475543665.el10.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/0:6 state:D stack:0 pid:76 tgid:76 ppid:2 task_flags:0x4208060 flags:0x00004000
Workqueue: usb_hub_wq hub_event
Call Trace:
<TASK>
__schedule+0x2a5/0x630
schedule+0x27/0x80
schedule_timeout+0xbf/0x100
__wait_for_common+0x95/0x1b0
? __pfx_schedule_timeout+0x10/0x10
xhci_alloc_dev+0x9e/0x290
usb_alloc_dev+0x77/0x3a0
hub_port_connect+0x293/0x9a0
hub_port_connect_change+0x94/0x260
port_event+0x4d1/0x7f0
hub_event+0x16f/0x480
process_one_work+0x174/0x330
worker_thread+0x256/0x3a0
? __pfx_worker_thread+0x10/0x10
kthread+0xfa/0x240
? __pfx_kthread+0x10/0x10
ret_from_fork+0x31/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
INFO: task systemd-shutdow:1 blocked for more than 122 seconds.
Not tainted 6.12.0-223.2443_2475543665.el10.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:systemd-shutdow state:D stack:0 pid:1 tgid:1 ppid:0 task_flags:0x400100 flags:0x00000002
Call Trace:
<TASK>
__schedule+0x2a5/0x630
schedule+0x27/0x80
schedule_preempt_disabled+0x15/0x30
__mutex_lock.constprop.0+0x497/0x860
device_shutdown+0xac/0x190
kernel_restart+0x3a/0x70
__do_sys_reboot+0x146/0x240
do_syscall_64+0x7d/0x160
? devkmsg_write.cold+0x24/0x4a
? update_load_avg+0x7f/0x730
? __dequeue_entity+0x3ec/0x4a0
? update_load_avg+0x7f/0x730
? pick_next_task_fair+0x1e6/0x330
? finish_task_switch.isra.0+0x97/0x2a0
? rseq_get_rseq_cs+0x1d/0x220
? rseq_ip_fixup+0x8d/0x1d0
? arch_exit_to_user_mode_prepare.isra.0+0xa5/0xd0
? syscall_exit_to_user_mode+0x32/0x190
? do_syscall_64+0x89/0x160
? handle_mm_fault+0x110/0x370
? do_user_addr_fault+0x606/0x830
? exc_page_fault+0x7f/0x150
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f32517d9917
RSP: 002b:00007ffc018d4fb8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a9
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f32517d9917
RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead
RBP: 00007ffc018d5130 R08: 0000000000000069 R09: 00000000ffffffff
R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
R13: 0000000000000000 R14: 00007ffc018d5258 R15: 0000000000000000
</TASK>

During the crash kernel's boot, hub_event() takes usb_lock_device(hdev) on
the root hub and holds it for the whole hub processing loop, which includes
hub_port_connect() -> usb_alloc_dev() -> xhci_alloc_dev(). If another device
(e.g., a mis-initialized dGPU) hogs interrupts or DMA during kdump, the
TRB_ENABLE_SLOT command may not complete in time and the HC can end up in an
unstable state (e.g., HSE set in USBSTS). xhci_alloc_dev() then sits in
wait_for_completion() with the hub lock held. After the vmcore is captured,
init calls device_shutdown(), which tries to shut down the hub device and
blocks on the same lock still held by the hub kworker task.

Avoid the deadlock by bounding the wait for command completion to twice the
command timeout and calling xhci_hc_died() if that wait expires. This gives
the command enough time before the host is marked as dying, at which point
xhci_hc_died() calls xhci_cleanup_command_queue(). That unblocks the hub
worker so that it releases the lock and device_shutdown() can proceed.

Fixes: c311e391a7efd ("xhci: rework command timeout and cancellation,")
Cc: stable@vger.kernel.org
Signed-off-by: Desnes Nunes <desnesn@redhat.com>
---
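The bounded-wait idea from the commit message can be illustrated outside the
kernel. The following is only a minimal userspace sketch under stated
assumptions: struct fake_cmd, struct fake_host, wait_done_timeout() and
alloc_slot_bounded() are hypothetical names invented for this illustration,
the completion is modelled as a polled atomic flag rather than a real
struct completion, and the early return on timeout is a simplification (the
patch itself falls through to the existing slot_id/status check).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* clock_gettime(), nanosleep() */
#endif
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct fake_cmd {
	atomic_bool done;	/* set once the command has completed */
	int slot_id;		/* 0 means no slot was assigned */
	long timeout_ms;	/* models command->timeout_ms */
};

struct fake_host {
	bool dead;		/* models the state reached via xhci_hc_died() */
};

/* Milliseconds elapsed on the monotonic clock since *start. */
static long elapsed_ms(const struct timespec *start)
{
	struct timespec now;

	clock_gettime(CLOCK_MONOTONIC, &now);
	return (now.tv_sec - start->tv_sec) * 1000L +
	       (now.tv_nsec - start->tv_nsec) / 1000000L;
}

/* Poll until the command completes or timeout_ms elapses; true on completion. */
static bool wait_done_timeout(struct fake_cmd *cmd, long timeout_ms)
{
	const struct timespec tick = { .tv_sec = 0, .tv_nsec = 1000000L };
	struct timespec start;

	clock_gettime(CLOCK_MONOTONIC, &start);
	while (!atomic_load(&cmd->done)) {
		if (elapsed_ms(&start) >= timeout_ms)
			return false;
		nanosleep(&tick, NULL);	/* sleep roughly 1 ms between polls */
	}
	return true;
}

/*
 * Bound the wait to twice the command timeout and mark the host dead if
 * nothing completed in that window. Returns the slot ID, or 0 on failure,
 * so the caller always gets control back and can drop any lock it holds.
 */
static int alloc_slot_bounded(struct fake_host *host, struct fake_cmd *cmd)
{
	assert(cmd->timeout_ms > 0);

	if (!wait_done_timeout(cmd, 2 * cmd->timeout_ms)) {
		host->dead = true;
		return 0;
	}
	return cmd->slot_id;
}
```

With an unbounded wait, a command that never completes would keep the caller
blocked while it holds the hub lock. In this sketch the caller returns after
at most about 2 * timeout_ms, which is the property the patch relies on.
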
drivers/usb/host/xhci.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index a54f5b57f205..55250fe814c9 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -4219,7 +4219,7 @@ int xhci_alloc_dev(struct usb_hcd *hcd, struct usb_device *udev)
 	struct xhci_hcd *xhci = hcd_to_xhci(hcd);
 	struct xhci_virt_device *vdev;
 	struct xhci_slot_ctx *slot_ctx;
-	unsigned long flags;
+	unsigned long flags, tflags;
 	int ret, slot_id;
 	struct xhci_command *command;
 
@@ -4238,9 +4238,14 @@ int xhci_alloc_dev(struct usb_hcd *hcd, struct usb_device *udev)
 	xhci_ring_cmd_db(xhci);
 	spin_unlock_irqrestore(&xhci->lock, flags);
 
-	wait_for_completion(command->completion);
-	slot_id = command->slot_id;
+	if (!wait_for_completion_timeout(command->completion,
+					 msecs_to_jiffies(2 * command->timeout_ms))) {
+		spin_lock_irqsave(&xhci->lock, tflags);
+		xhci_hc_died(xhci);
+		spin_unlock_irqrestore(&xhci->lock, tflags);
+	}
 
+	slot_id = command->slot_id;
 	if (!slot_id || command->status != COMP_SUCCESS) {
 		xhci_err(xhci, "Error while assigning device slot ID: %s\n",
 			 xhci_trb_comp_code_string(command->status));
--
2.51.0