* [PATCH] usb: hub: Don't try to recover devices lost during warm reset.
@ 2025-06-23 13:39 Mathias Nyman
2025-07-15 17:48 ` Greg KH
2025-08-11 6:16 ` Jiri Slaby
0 siblings, 2 replies; 18+ messages in thread
From: Mathias Nyman @ 2025-06-23 13:39 UTC (permalink / raw)
To: gregkh; +Cc: linux-usb, stern, Mathias Nyman, stable, Łukasz Bartosik
Hub driver warm-resets ports in SS.Inactive or Compliance mode to
recover a possible connected device. The port reset code correctly
detects if a connection is lost during reset, but hub driver
port_event() fails to take this into account in some cases.
port_event() ends up using stale values and assumes there is a
connected device, and will try all means to recover it, including
power-cycling the port.
Details:
This case was triggered when xHC host was suspended with DbC (Debug
Capability) enabled and connected. DbC turns one xHC port into a simple
usb debug device, allowing debugging a system with an A-to-A USB debug
cable.
xhci DbC code disables DbC when xHC is system suspended to D3, and
enables it back during resume.
We essentially end up with two hosts connected to each other during
suspend, and, for a short while during resume, until DbC is enabled back.
The suspended xHC host notices some activity on the roothub port, but
can't train the link due to being suspended, so xHC hardware sets a CAS
(Cold Attach Status) flag for this port to inform xhci host driver that
the port needs to be warm reset once xHC resumes.
CAS is xHCI specific, and not part of USB specification, so xhci driver
tells usb core that the port has a connection and link is in compliance
mode. Recovery from compliance mode is similar to CAS recovery.
xhci CAS driver support that fakes a compliance mode connection was added
in commit 8bea2bd37df0 ("usb: Add support for root hub port status CAS")
Once xHCI resumes and DbC is enabled back, all activity on the xHC
roothub host side port disappears. The hub driver will anyway think
port has a connection and link is in compliance mode, and hub driver
will try to recover it.
The port power-cycle during recovery seems to cause issues to the active
DbC connection.
Fix this by clearing connect_change flag if hub_port_reset() returns
-ENOTCONN, thus avoiding the whole unnecessary port recovery and
initialization attempt.
Cc: stable@vger.kernel.org
Fixes: 8bea2bd37df0 ("usb: Add support for root hub port status CAS")
Tested-by: Łukasz Bartosik <ukaszb@chromium.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
---
drivers/usb/core/hub.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 6bb6e92cb0a4..f981e365be36 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -5754,6 +5754,7 @@ static void port_event(struct usb_hub *hub, int port1)
struct usb_device *hdev = hub->hdev;
u16 portstatus, portchange;
int i = 0;
+ int err;
connect_change = test_bit(port1, hub->change_bits);
clear_bit(port1, hub->event_bits);
@@ -5850,8 +5851,11 @@ static void port_event(struct usb_hub *hub, int port1)
} else if (!udev || !(portstatus & USB_PORT_STAT_CONNECTION)
|| udev->state == USB_STATE_NOTATTACHED) {
dev_dbg(&port_dev->dev, "do warm reset, port only\n");
- if (hub_port_reset(hub, port1, NULL,
- HUB_BH_RESET_TIME, true) < 0)
+ err = hub_port_reset(hub, port1, NULL,
+ HUB_BH_RESET_TIME, true);
+ if (!udev && err == -ENOTCONN)
+ connect_change = 0;
+ else if (err < 0)
hub_port_disable(hub, port1, 1);
} else {
dev_dbg(&port_dev->dev, "do warm reset, full device\n");
--
2.43.0
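
As a rough illustration of the branch changed above, the decision can be
modelled in standalone C. The enum and function names are hypothetical and
do not exist in the kernel; only the two conditions mirror the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; not kernel identifiers. */
enum warm_reset_action {
	WR_CONTINUE,      /* reset succeeded, carry on with normal handling */
	WR_CLEAR_CONNECT, /* nothing attached: clear connect_change, skip recovery */
	WR_DISABLE_PORT,  /* reset failed: disable the port, as before the patch */
};

/*
 * Model of the "do warm reset, port only" branch after the patch.
 * have_udev stands in for the kernel's (udev != NULL) check, and
 * err for the return value of hub_port_reset().
 */
static enum warm_reset_action classify_warm_reset(bool have_udev, int err)
{
	if (!have_udev && err == -ENOTCONN)
		return WR_CLEAR_CONNECT;
	if (err < 0)
		return WR_DISABLE_PORT;
	return WR_CONTINUE;
}
```

Note that -ENOTCONN with an existing udev still ends in the port-disable
path in this model, because the patch only clears connect_change when no
usb_device was bound to the port.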
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH] usb: hub: Don't try to recover devices lost during warm reset.
2025-06-23 13:39 [PATCH] usb: hub: Don't try to recover devices lost during warm reset Mathias Nyman
@ 2025-07-15 17:48 ` Greg KH
2025-07-15 18:54 ` Alan Stern
2025-08-11 6:16 ` Jiri Slaby
1 sibling, 1 reply; 18+ messages in thread
From: Greg KH @ 2025-07-15 17:48 UTC (permalink / raw)
To: Mathias Nyman, stern; +Cc: linux-usb, stable, Łukasz Bartosik
On Mon, Jun 23, 2025 at 04:39:47PM +0300, Mathias Nyman wrote:
> Hub driver warm-resets ports in SS.Inactive or Compliance mode to
> recover a possible connected device. The port reset code correctly
> detects if a connection is lost during reset, but hub driver
> port_event() fails to take this into account in some cases.
> port_event() ends up using stale values and assumes there is a
> connected device, and will try all means to recover it, including
> power-cycling the port.
>
> Details:
> This case was triggered when xHC host was suspended with DbC (Debug
> Capability) enabled and connected. DbC turns one xHC port into a simple
> usb debug device, allowing debugging a system with an A-to-A USB debug
> cable.
>
> xhci DbC code disables DbC when xHC is system suspended to D3, and
> enables it back during resume.
> We essentially end up with two hosts connected to each other during
> suspend, and, for a short while during resume, until DbC is enabled back.
> The suspended xHC host notices some activity on the roothub port, but
> can't train the link due to being suspended, so xHC hardware sets a CAS
> (Cold Attach Status) flag for this port to inform xhci host driver that
> the port needs to be warm reset once xHC resumes.
>
> CAS is xHCI specific, and not part of USB specification, so xhci driver
> tells usb core that the port has a connection and link is in compliance
> mode. Recovery from compliance mode is similar to CAS recovery.
>
> xhci CAS driver support that fakes a compliance mode connection was added
> in commit 8bea2bd37df0 ("usb: Add support for root hub port status CAS")
>
> Once xHCI resumes and DbC is enabled back, all activity on the xHC
> roothub host side port disappears. The hub driver will anyway think
> port has a connection and link is in compliance mode, and hub driver
> will try to recover it.
>
> The port power-cycle during recovery seems to cause issues to the active
> DbC connection.
>
> Fix this by clearing connect_change flag if hub_port_reset() returns
> -ENOTCONN, thus avoiding the whole unnecessary port recovery and
> initialization attempt.
>
> Cc: stable@vger.kernel.org
> Fixes: 8bea2bd37df0 ("usb: Add support for root hub port status CAS")
> Tested-by: Łukasz Bartosik <ukaszb@chromium.org>
> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
> ---
> drivers/usb/core/hub.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
Alan, any objection to this?
thanks,
greg k-h
* Re: [PATCH] usb: hub: Don't try to recover devices lost during warm reset.
2025-07-15 17:48 ` Greg KH
@ 2025-07-15 18:54 ` Alan Stern
0 siblings, 0 replies; 18+ messages in thread
From: Alan Stern @ 2025-07-15 18:54 UTC (permalink / raw)
To: Greg KH; +Cc: Mathias Nyman, linux-usb, stable, Łukasz Bartosik
On Tue, Jul 15, 2025 at 07:48:50PM +0200, Greg KH wrote:
> On Mon, Jun 23, 2025 at 04:39:47PM +0300, Mathias Nyman wrote:
> > Hub driver warm-resets ports in SS.Inactive or Compliance mode to
> > recover a possible connected device. The port reset code correctly
> > detects if a connection is lost during reset, but hub driver
> > port_event() fails to take this into account in some cases.
> > port_event() ends up using stale values and assumes there is a
> > connected device, and will try all means to recover it, including
> > power-cycling the port.
> >
> > Details:
> > This case was triggered when xHC host was suspended with DbC (Debug
> > Capability) enabled and connected. DbC turns one xHC port into a simple
> > usb debug device, allowing debugging a system with an A-to-A USB debug
> > cable.
> >
> > xhci DbC code disables DbC when xHC is system suspended to D3, and
> > enables it back during resume.
> > We essentially end up with two hosts connected to each other during
> > suspend, and, for a short while during resume, until DbC is enabled back.
> > The suspended xHC host notices some activity on the roothub port, but
> > can't train the link due to being suspended, so xHC hardware sets a CAS
> > (Cold Attach Status) flag for this port to inform xhci host driver that
> > the port needs to be warm reset once xHC resumes.
> >
> > CAS is xHCI specific, and not part of USB specification, so xhci driver
> > tells usb core that the port has a connection and link is in compliance
> > mode. Recovery from compliance mode is similar to CAS recovery.
> >
> > xhci CAS driver support that fakes a compliance mode connection was added
> > in commit 8bea2bd37df0 ("usb: Add support for root hub port status CAS")
> >
> > Once xHCI resumes and DbC is enabled back, all activity on the xHC
> > roothub host side port disappears. The hub driver will anyway think
> > port has a connection and link is in compliance mode, and hub driver
> > will try to recover it.
> >
> > The port power-cycle during recovery seems to cause issues to the active
> > DbC connection.
> >
> > Fix this by clearing connect_change flag if hub_port_reset() returns
> > -ENOTCONN, thus avoiding the whole unnecessary port recovery and
> > initialization attempt.
> >
> > Cc: stable@vger.kernel.org
> > Fixes: 8bea2bd37df0 ("usb: Add support for root hub port status CAS")
> > Tested-by: Łukasz Bartosik <ukaszb@chromium.org>
> > Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
> > ---
> > drivers/usb/core/hub.c | 8 ++++++--
> > 1 file changed, 6 insertions(+), 2 deletions(-)
>
> Alan, any objection to this?
No objection, it looks okay to me.
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Alan Stern
* Re: [PATCH] usb: hub: Don't try to recover devices lost during warm reset.
2025-06-23 13:39 [PATCH] usb: hub: Don't try to recover devices lost during warm reset Mathias Nyman
2025-07-15 17:48 ` Greg KH
@ 2025-08-11 6:16 ` Jiri Slaby
2025-08-11 11:06 ` Jiri Slaby
` (2 more replies)
1 sibling, 3 replies; 18+ messages in thread
From: Jiri Slaby @ 2025-08-11 6:16 UTC (permalink / raw)
To: Mathias Nyman, gregkh
Cc: linux-usb, stern, stable, Łukasz Bartosik, Oliver Neukum
On 23. 06. 25, 15:39, Mathias Nyman wrote:
> Hub driver warm-resets ports in SS.Inactive or Compliance mode to
> recover a possible connected device. The port reset code correctly
> detects if a connection is lost during reset, but hub driver
> port_event() fails to take this into account in some cases.
> port_event() ends up using stale values and assumes there is a
> connected device, and will try all means to recover it, including
> power-cycling the port.
>
> Details:
> This case was triggered when xHC host was suspended with DbC (Debug
> Capability) enabled and connected. DbC turns one xHC port into a simple
> usb debug device, allowing debugging a system with an A-to-A USB debug
> cable.
>
> xhci DbC code disables DbC when xHC is system suspended to D3, and
> enables it back during resume.
> We essentially end up with two hosts connected to each other during
> suspend, and, for a short while during resume, until DbC is enabled back.
> The suspended xHC host notices some activity on the roothub port, but
> can't train the link due to being suspended, so xHC hardware sets a CAS
> (Cold Attach Status) flag for this port to inform xhci host driver that
> the port needs to be warm reset once xHC resumes.
>
> CAS is xHCI specific, and not part of USB specification, so xhci driver
> tells usb core that the port has a connection and link is in compliance
> mode. Recovery from compliance mode is similar to CAS recovery.
>
> xhci CAS driver support that fakes a compliance mode connection was added
> in commit 8bea2bd37df0 ("usb: Add support for root hub port status CAS")
>
> Once xHCI resumes and DbC is enabled back, all activity on the xHC
> roothub host side port disappears. The hub driver will anyway think
> port has a connection and link is in compliance mode, and hub driver
> will try to recover it.
>
> The port power-cycle during recovery seems to cause issues to the active
> DbC connection.
>
> Fix this by clearing connect_change flag if hub_port_reset() returns
> -ENOTCONN, thus avoiding the whole unnecessary port recovery and
> initialization attempt.
>
> Cc: stable@vger.kernel.org
> Fixes: 8bea2bd37df0 ("usb: Add support for root hub port status CAS")
> Tested-by: Łukasz Bartosik <ukaszb@chromium.org>
> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
> ---
> drivers/usb/core/hub.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
> index 6bb6e92cb0a4..f981e365be36 100644
> --- a/drivers/usb/core/hub.c
> +++ b/drivers/usb/core/hub.c
> @@ -5754,6 +5754,7 @@ static void port_event(struct usb_hub *hub, int port1)
> struct usb_device *hdev = hub->hdev;
> u16 portstatus, portchange;
> int i = 0;
> + int err;
>
> connect_change = test_bit(port1, hub->change_bits);
> clear_bit(port1, hub->event_bits);
> @@ -5850,8 +5851,11 @@ static void port_event(struct usb_hub *hub, int port1)
> } else if (!udev || !(portstatus & USB_PORT_STAT_CONNECTION)
> || udev->state == USB_STATE_NOTATTACHED) {
> dev_dbg(&port_dev->dev, "do warm reset, port only\n");
> - if (hub_port_reset(hub, port1, NULL,
> - HUB_BH_RESET_TIME, true) < 0)
> + err = hub_port_reset(hub, port1, NULL,
> + HUB_BH_RESET_TIME, true);
> + if (!udev && err == -ENOTCONN)
> + connect_change = 0;
> + else if (err < 0)
> hub_port_disable(hub, port1, 1);
This was reported to break the USB on one box:
> [Wed Aug 6 16:51:33 2025] [ T355745] usb 1-2: reset full-speed USB device number 12 using xhci_hcd
> [Wed Aug 6 16:51:34 2025] [ T355745] usb 1-2: device descriptor read/64, error -71
> [Wed Aug 6 16:51:34 2025] [ T355745] usb 1-2: device descriptor read/64, error -71
> [Wed Aug 6 16:51:34 2025] [ T355745] usb 1-2: reset full-speed USB device number 12 using xhci_hcd
> [Wed Aug 6 16:51:34 2025] [ T355745] usb 1-2: device descriptor read/64, error -71
> [Wed Aug 6 16:51:35 2025] [ T355745] usb 1-2: device descriptor read/64, error -71
> [Wed Aug 6 16:51:35 2025] [ T355745] usb 1-2: reset full-speed USB device number 12 using xhci_hcd
> [Wed Aug 6 16:51:35 2025] [ T355745] usb 1-2: Device not responding to setup address.
> [Wed Aug 6 16:51:35 2025] [ T355745] usb 1-2: Device not responding to setup address.
> [Wed Aug 6 16:51:35 2025] [ T355745] usb 1-2: device not accepting address 12, error -71
> [Wed Aug 6 16:51:35 2025] [ T355745] usb 1-2: WARN: invalid context state for evaluate context command.
> [Wed Aug 6 16:51:36 2025] [ T355745] usb 1-2: reset full-speed USB device number 12 using xhci_hcd
> [Wed Aug 6 16:51:36 2025] [ C10] xhci_hcd 0000:0e:00.0: ERROR unknown event type 2
> [Wed Aug 6 16:51:36 2025] [ T355745] usb 1-2: Device not responding to setup address.
> [Wed Aug 6 16:51:37 2025] [ C10] xhci_hcd 0000:0e:00.0: ERROR unknown event type 2
> [Wed Aug 6 16:52:50 2025] [ T362645] xhci_hcd 0000:0e:00.0: Abort failed to stop command ring: -110
> [Wed Aug 6 16:52:50 2025] [ T362645] xhci_hcd 0000:0e:00.0: xHCI host controller not responding, assume dead
> [Wed Aug 6 16:52:50 2025] [ T362645] xhci_hcd 0000:0e:00.0: HC died; cleaning up
> [Wed Aug 6 16:52:50 2025] [ T359046] usb 1-1: USB disconnect, device number 13
> [Wed Aug 6 16:52:50 2025] [ T355745] xhci_hcd 0000:0e:00.0: Timeout while waiting for setup device command
> [Wed Aug 6 16:52:50 2025] [ T362645] usb 2-3: USB disconnect, device number 2
> [Wed Aug 6 16:52:50 2025] [ T362839] cdc_acm 1-5:1.5: acm_port_activate - usb_submit_urb(ctrl irq) failed
> [Wed Aug 6 16:52:50 2025] [ T355745] usb 1-2: device not accepting address 12, error -62
> [Wed Aug 6 16:52:50 2025] [ T359046] usb 1-2: USB disconnect, device number 12
> [Wed Aug 6 16:52:50 2025] [ T359046] usb 1-3: USB disconnect, device number 4
> [Wed Aug 6 16:52:50 2025] [ T359046] usb 1-3.1: USB disconnect, device number 6
> [Wed Aug 6 16:52:50 2025] [ T359046] usb 1-4: USB disconnect, device number 16
> [Wed Aug 6 16:52:50 2025] [ T359046] usb 1-5: USB disconnect, device number 15
> [Wed Aug 6 16:52:50 2025] [ T359046] usb 1-7: USB disconnect, device number 8
Using 6.16 with this commit (2521106fc732b0b) reverted makes it work again.
The same happens with 6.15.8 as this was backported there. (6.15.6 is fine).
lsusb --tree
> /: Bus 001.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/12p, 480M
> |__ Port 003: Dev 006, If 0, Class=Hub, Driver=hub/4p, 480M
> |__ Port 001: Dev 008, If 0, Class=Human Interface Device, Driver=usbhid, 12M
> |__ Port 001: Dev 008, If 1, Class=Human Interface Device, Driver=usbhid, 12M
> |__ Port 001: Dev 008, If 2, Class=Chip/SmartCard, Driver=usbfs, 12M
> |__ Port 004: Dev 007, If 0, Class=Audio, Driver=snd-usb-audio, 480M
> |__ Port 004: Dev 007, If 1, Class=Audio, Driver=snd-usb-audio, 480M
> |__ Port 004: Dev 007, If 2, Class=Application Specific Interface, Driver=[none], 480M
> |__ Port 004: Dev 007, If 3, Class=Communications, Driver=cdc_acm, 480M
> |__ Port 004: Dev 007, If 4, Class=CDC Data, Driver=cdc_acm, 480M
> |__ Port 005: Dev 009, If 0, Class=Audio, Driver=snd-usb-audio, 480M
> |__ Port 005: Dev 009, If 1, Class=Audio, Driver=snd-usb-audio, 480M
> |__ Port 005: Dev 009, If 2, Class=Audio, Driver=snd-usb-audio, 480M
> |__ Port 005: Dev 009, If 3, Class=Audio, Driver=snd-usb-audio, 480M
> |__ Port 005: Dev 009, If 4, Class=Audio, Driver=snd-usb-audio, 480M
> |__ Port 005: Dev 009, If 5, Class=Communications, Driver=cdc_acm, 480M
> |__ Port 005: Dev 009, If 6, Class=CDC Data, Driver=cdc_acm, 480M
> |__ Port 007: Dev 010, If 0, Class=Vendor Specific Class, Driver=[none], 12M
> |__ Port 007: Dev 010, If 2, Class=Human Interface Device, Driver=usbhid, 12M
> /: Bus 002.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/5p, 20000M/x2
> |__ Port 003: Dev 002, If 0, Class=Hub, Driver=hub/4p, 5000M
> /: Bus 003.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/12p, 480M
> |__ Port 003: Dev 002, If 0, Class=Human Interface Device, Driver=usbhid, 12M
> |__ Port 003: Dev 002, If 1, Class=Human Interface Device, Driver=usbhid, 12M
> |__ Port 003: Dev 002, If 2, Class=Human Interface Device, Driver=usbhid, 12M
> /: Bus 004.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/5p, 20000M/x2
> /: Bus 005.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/2p, 480M
> /: Bus 006.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/2p, 10000M
> /: Bus 007.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/2p, 480M
> |__ Port 002: Dev 002, If 0, Class=Human Interface Device, Driver=usbhid, 480M
> |__ Port 002: Dev 002, If 1, Class=Human Interface Device, Driver=usbhid, 480M
> |__ Port 002: Dev 002, If 2, Class=Human Interface Device, Driver=usbhid, 480M
> /: Bus 008.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/2p, 10000M
> /: Bus 009.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/1p, 480M
> /: Bus 010.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/0p, 5000M
Any ideas? What would you need to debug this?
thanks,
--
js
suse labs
* Re: [PATCH] usb: hub: Don't try to recover devices lost during warm reset.
2025-08-11 6:16 ` Jiri Slaby
@ 2025-08-11 11:06 ` Jiri Slaby
2025-08-11 19:24 ` Alan Stern
2025-08-11 21:28 ` Michał Pecio
2025-08-12 10:48 ` Mathias Nyman
2 siblings, 1 reply; 18+ messages in thread
From: Jiri Slaby @ 2025-08-11 11:06 UTC (permalink / raw)
To: Mathias Nyman, gregkh
Cc: linux-usb, stern, stable, Łukasz Bartosik, Oliver Neukum
On 11. 08. 25, 8:16, Jiri Slaby wrote:
>> @@ -5850,8 +5851,11 @@ static void port_event(struct usb_hub *hub, int
>> port1)
>> } else if (!udev || !(portstatus & USB_PORT_STAT_CONNECTION)
>> || udev->state == USB_STATE_NOTATTACHED) {
>> dev_dbg(&port_dev->dev, "do warm reset, port only\n");
>> - if (hub_port_reset(hub, port1, NULL,
>> - HUB_BH_RESET_TIME, true) < 0)
>> + err = hub_port_reset(hub, port1, NULL,
>> + HUB_BH_RESET_TIME, true);
>> + if (!udev && err == -ENOTCONN)
>> + connect_change = 0;
>> + else if (err < 0)
>> hub_port_disable(hub, port1, 1);
FTR this is now tracked downstream as:
https://bugzilla.suse.com/show_bug.cgi?id=1247895
> This was reported to break the USB on one box:
>> [Wed Aug 6 16:51:33 2025] [ T355745] usb 1-2: reset full-speed USB
>> device number 12 using xhci_hcd
>> [Wed Aug 6 16:51:34 2025] [ T355745] usb 1-2: device descriptor
>> read/64, error -71
>> [Wed Aug 6 16:51:34 2025] [ T355745] usb 1-2: device descriptor
>> read/64, error -71
> thanks,
--
js
suse labs
* Re: [PATCH] usb: hub: Don't try to recover devices lost during warm reset.
2025-08-11 11:06 ` Jiri Slaby
@ 2025-08-11 19:24 ` Alan Stern
0 siblings, 0 replies; 18+ messages in thread
From: Alan Stern @ 2025-08-11 19:24 UTC (permalink / raw)
To: Jiri Slaby
Cc: Mathias Nyman, gregkh, linux-usb, stable, Łukasz Bartosik,
Oliver Neukum
On Mon, Aug 11, 2025 at 01:06:03PM +0200, Jiri Slaby wrote:
> On 11. 08. 25, 8:16, Jiri Slaby wrote:
> > > @@ -5850,8 +5851,11 @@ static void port_event(struct usb_hub *hub,
> > > int port1)
> > > } else if (!udev || !(portstatus & USB_PORT_STAT_CONNECTION)
> > > || udev->state == USB_STATE_NOTATTACHED) {
> > > dev_dbg(&port_dev->dev, "do warm reset, port only\n");
> > > - if (hub_port_reset(hub, port1, NULL,
> > > - HUB_BH_RESET_TIME, true) < 0)
> > > + err = hub_port_reset(hub, port1, NULL,
> > > + HUB_BH_RESET_TIME, true);
> > > + if (!udev && err == -ENOTCONN)
> > > + connect_change = 0;
> > > + else if (err < 0)
> > > hub_port_disable(hub, port1, 1);
>
> FTR this is now tracked downstream as:
> https://bugzilla.suse.com/show_bug.cgi?id=1247895
>
> > This was reported to break the USB on one box:
> > > [Wed Aug 6 16:51:33 2025] [ T355745] usb 1-2: reset full-speed USB
> > > device number 12 using xhci_hcd
> > > [Wed Aug 6 16:51:34 2025] [ T355745] usb 1-2: device descriptor
> > > read/64, error -71
> > > [Wed Aug 6 16:51:34 2025] [ T355745] usb 1-2: device descriptor
> > > read/64, error -71
What shows up in the kernel log (with usbcore dynamic debugging enabled)
if the commit is present and if the commit is reverted?
Alan Stern
* Re: [PATCH] usb: hub: Don't try to recover devices lost during warm reset.
2025-08-11 6:16 ` Jiri Slaby
2025-08-11 11:06 ` Jiri Slaby
@ 2025-08-11 21:28 ` Michał Pecio
2025-08-12 10:48 ` Mathias Nyman
2 siblings, 0 replies; 18+ messages in thread
From: Michał Pecio @ 2025-08-11 21:28 UTC (permalink / raw)
To: Jiri Slaby
Cc: Mathias Nyman, gregkh, linux-usb, stern, stable,
Łukasz Bartosik, Oliver Neukum
On Mon, 11 Aug 2025 08:16:06 +0200, Jiri Slaby wrote:
> This was reported to break the USB on one box:
> > [Wed Aug 6 16:51:33 2025] [ T355745] usb 1-2: reset full-speed USB device number 12 using xhci_hcd
> > [Wed Aug 6 16:51:34 2025] [ T355745] usb 1-2: device descriptor read/64, error -71
> > [Wed Aug 6 16:51:34 2025] [ T355745] usb 1-2: device descriptor read/64, error -71
> > [Wed Aug 6 16:51:34 2025] [ T355745] usb 1-2: reset full-speed USB device number 12 using xhci_hcd
> > [Wed Aug 6 16:51:34 2025] [ T355745] usb 1-2: device descriptor read/64, error -71
> > [Wed Aug 6 16:51:35 2025] [ T355745] usb 1-2: device descriptor read/64, error -71
> > [Wed Aug 6 16:51:35 2025] [ T355745] usb 1-2: reset full-speed USB device number 12 using xhci_hcd
> > [Wed Aug 6 16:51:35 2025] [ T355745] usb 1-2: Device not responding to setup address.
> > [Wed Aug 6 16:51:35 2025] [ T355745] usb 1-2: Device not responding to setup address.
> > [Wed Aug 6 16:51:35 2025] [ T355745] usb 1-2: device not accepting address 12, error -71
> > [Wed Aug 6 16:51:35 2025] [ T355745] usb 1-2: WARN: invalid context state for evaluate context command.
> > [Wed Aug 6 16:51:36 2025] [ T355745] usb 1-2: reset full-speed USB device number 12 using xhci_hcd
> > [Wed Aug 6 16:51:36 2025] [ C10] xhci_hcd 0000:0e:00.0: ERROR unknown event type 2
> > [Wed Aug 6 16:51:36 2025] [ T355745] usb 1-2: Device not responding to setup address.
> > [Wed Aug 6 16:51:37 2025] [ C10] xhci_hcd 0000:0e:00.0: ERROR unknown event type 2
> > [Wed Aug 6 16:52:50 2025] [ T362645] xhci_hcd 0000:0e:00.0: Abort failed to stop command ring: -110
> > [Wed Aug 6 16:52:50 2025] [ T362645] xhci_hcd 0000:0e:00.0: xHCI host controller not responding, assume dead
> > [Wed Aug 6 16:52:50 2025] [ T362645] xhci_hcd 0000:0e:00.0: HC died; cleaning up
> > [Wed Aug 6 16:52:50 2025] [ T359046] usb 1-1: USB disconnect, device number 13
> > [Wed Aug 6 16:52:50 2025] [ T355745] xhci_hcd 0000:0e:00.0: Timeout while waiting for setup device command
> > [Wed Aug 6 16:52:50 2025] [ T362645] usb 2-3: USB disconnect, device number 2
> > [Wed Aug 6 16:52:50 2025] [ T362839] cdc_acm 1-5:1.5: acm_port_activate - usb_submit_urb(ctrl irq) failed
> > [Wed Aug 6 16:52:50 2025] [ T355745] usb 1-2: device not accepting address 12, error -62
> > [Wed Aug 6 16:52:50 2025] [ T359046] usb 1-2: USB disconnect, device number 12
> > [Wed Aug 6 16:52:50 2025] [ T359046] usb 1-3: USB disconnect, device number 4
> > [Wed Aug 6 16:52:50 2025] [ T359046] usb 1-3.1: USB disconnect, device number 6
> > [Wed Aug 6 16:52:50 2025] [ T359046] usb 1-4: USB disconnect, device number 16
> > [Wed Aug 6 16:52:50 2025] [ T359046] usb 1-5: USB disconnect, device number 15
> > [Wed Aug 6 16:52:50 2025] [ T359046] usb 1-7: USB disconnect, device number 8
Is the problem that this USB device fails to work, or that it takes
down the whole bus while failing to work as usual?
The latter issue looks like some ASMedia xHCI controller being unhappy
about something. What does 'lspci' say about this 0e:00.0?
So far I failed to repro this on v6.16.0 with a few of my ASMedias and
a dummy device which never responds to any packet.
Can you mount debugfs and get these two files after the HC goes dead?
/sys/kernel/debug/usb/xhci/0000:0e:00.0/command-ring/trbs
/sys/kernel/debug/usb/xhci/0000:0e:00.0/event-ring/trbs
Regards,
Michal
* Re: [PATCH] usb: hub: Don't try to recover devices lost during warm reset.
2025-08-11 6:16 ` Jiri Slaby
2025-08-11 11:06 ` Jiri Slaby
2025-08-11 21:28 ` Michał Pecio
@ 2025-08-12 10:48 ` Mathias Nyman
2025-08-12 18:15 ` Marcus Rückert
2 siblings, 1 reply; 18+ messages in thread
From: Mathias Nyman @ 2025-08-12 10:48 UTC (permalink / raw)
To: Jiri Slaby, gregkh
Cc: linux-usb, stern, stable, Łukasz Bartosik, Oliver Neukum
Hi
>
> This was reported to break the USB on one box:
>> [Wed Aug 6 16:51:33 2025] [ T355745] usb 1-2: reset full-speed USB device number 12 using xhci_hcd
>> [Wed Aug 6 16:51:34 2025] [ T355745] usb 1-2: device descriptor read/64, error -71
>> [Wed Aug 6 16:51:34 2025] [ T355745] usb 1-2: device descriptor read/64, error -71
Protocol error (EPROTO) reading 64 bytes of device descriptor
>> [Wed Aug 6 16:51:34 2025] [ T355745] usb 1-2: reset full-speed USB device number 12 using xhci_hcd
>> [Wed Aug 6 16:51:34 2025] [ T355745] usb 1-2: device descriptor read/64, error -71
>> [Wed Aug 6 16:51:35 2025] [ T355745] usb 1-2: device descriptor read/64, error -71
>> [Wed Aug 6 16:51:35 2025] [ T355745] usb 1-2: reset full-speed USB device number 12 using xhci_hcd
>> [Wed Aug 6 16:51:35 2025] [ T355745] usb 1-2: Device not responding to setup address.
The xhci "address device" command failed with a transaction error
Slot does not reach "addressed" state
>> [Wed Aug 6 16:51:35 2025] [ T355745] usb 1-2: Device not responding to setup address.
>> [Wed Aug 6 16:51:35 2025] [ T355745] usb 1-2: device not accepting address 12, error -71
>> [Wed Aug 6 16:51:35 2025] [ T355745] usb 1-2: WARN: invalid context state for evaluate context command.
xhci evaluate context command failed, probably due to slot not in addressed state
>> [Wed Aug 6 16:51:36 2025] [ T355745] usb 1-2: reset full-speed USB device number 12 using xhci_hcd
>> [Wed Aug 6 16:51:36 2025] [ C10] xhci_hcd 0000:0e:00.0: ERROR unknown event type 2
This is odd.
TRBs of type 2 should not exist on event rings. TRB type ID 2 is supposed to be the
Setup Stage TRB for control transfers, and should only exist on transfer rings.
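
To make the point about TRB types concrete, here is a small standalone
sketch of how the type is extracted from a TRB. It assumes the xHCI layout
where the TRB type occupies bits 15:10 of the fourth (control) dword; the
macro and function names are illustrative and are not the kernel's own.

```c
#include <stdbool.h>
#include <stdint.h>

/* TRB type field: bits 15:10 of the control dword (assumed xHCI layout). */
#define TRB_TYPE_SHIFT 10
#define TRB_TYPE_MASK  0x3fu

/* A few TRB type IDs as defined by the xHCI specification. */
enum {
	TRB_TYPE_SETUP_STAGE        = 2,  /* transfer rings only */
	TRB_TYPE_TRANSFER_EVENT     = 32,
	TRB_TYPE_CMD_COMPLETION     = 33,
	TRB_TYPE_PORT_STATUS_CHANGE = 34,
};

/* Extract the 6-bit TRB type from the control dword. */
static unsigned int trb_type(uint32_t control)
{
	return (control >> TRB_TYPE_SHIFT) & TRB_TYPE_MASK;
}

/*
 * Spec-defined event TRB IDs start at 32, so any lower ID found on an
 * event ring is not a valid event.
 */
static bool trb_type_is_event(unsigned int type)
{
	return type >= TRB_TYPE_TRANSFER_EVENT;
}
```

Under these assumptions, an event-ring entry that decodes to type 2 fails
the trb_type_is_event() check, which is consistent with the driver logging
it as an unknown event.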
>> [Wed Aug 6 16:51:36 2025] [ T355745] usb 1-2: Device not responding to setup address.
>> [Wed Aug 6 16:51:37 2025] [ C10] xhci_hcd 0000:0e:00.0: ERROR unknown event type 2
>> [Wed Aug 6 16:52:50 2025] [ T362645] xhci_hcd 0000:0e:00.0: Abort failed to stop command ring: -110
Aborting command due to driver not seeing command completions.
The missing command completions are probably those mangled "unknown" events
>> [Wed Aug 6 16:52:50 2025] [ T362645] xhci_hcd 0000:0e:00.0: xHCI host controller not responding, assume dead
>> [Wed Aug 6 16:52:50 2025] [ T362645] xhci_hcd 0000:0e:00.0: HC died; cleaning up
Tear down xhci.
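
For reference, the negative numbers in the log are errno values. The sketch
below maps the ones that appear in this thread to their names. It assumes
the generic Linux errno numbering (as on x86 and arm); a few architectures
number these differently. The helper name is hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Map a negative kernel-style error code from the log to its errno name.
 * Returns NULL for codes this sketch does not cover.
 */
static const char *usb_log_errno_name(int err)
{
	switch (-err) {
	case EPROTO:    return "EPROTO";    /* -71: protocol error */
	case ETIME:     return "ETIME";     /* -62: timer expired */
	case ETIMEDOUT: return "ETIMEDOUT"; /* -110: timed out */
	case ENOTCONN:  return "ENOTCONN";  /* -107: tested by the patch */
	default:        return NULL;
	}
}
```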
>
> Any ideas? What would you need to debug this?
Could be that this patch reveals some underlying race in xhci re-enumeration path.
Could also be related to ep0 max packet size setting as this is a full-speed device.
(max packet size is unknown until host reads first 8 bytes of descriptor, then adjusts
it on the fly with an evaluate context command)
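
A minimal sketch of that first step, assuming the standard USB device
descriptor layout in which bMaxPacketSize0 is the byte at offset 7. The
function name is hypothetical and this is not the code path the kernel
uses; it only shows why 8 bytes are enough to learn the ep0 packet size.

```c
#include <stddef.h>
#include <stdint.h>

#define DESC_TYPE_DEVICE   0x01 /* bDescriptorType of a device descriptor */
#define DESC_PREFIX_LEN    8    /* bytes 0..7, ending with bMaxPacketSize0 */

/*
 * Return bMaxPacketSize0 from the first 8 bytes of a device descriptor,
 * or -1 if the buffer is too short, is not a device descriptor, or holds
 * a value that is not legal for a full-speed device (8, 16, 32 or 64).
 */
static int fs_ep0_max_packet(const uint8_t *buf, size_t len)
{
	if (len < DESC_PREFIX_LEN || buf[1] != DESC_TYPE_DEVICE)
		return -1;

	switch (buf[7]) {
	case 8:
	case 16:
	case 32:
	case 64:
		return buf[7];
	default:
		return -1;
	}
}
```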
Appreciated if this could be reproduced with as few usb devices as possible, and with
xhci tracing and dynamic debug enabled:
mount -t debugfs none /sys/kernel/debug
echo 'module xhci_hcd =p' >/sys/kernel/debug/dynamic_debug/control
echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control
echo 81920 > /sys/kernel/debug/tracing/buffer_size_kb
echo 1 > /sys/kernel/debug/tracing/events/xhci-hcd/enable
echo 1 > /sys/kernel/debug/tracing/tracing_on
< Reproduce issue >
Send output of dmesg
Send content of /sys/kernel/debug/tracing/trace
Thanks
Mathias
* Re: [PATCH] usb: hub: Don't try to recover devices lost during warm reset.
2025-08-12 10:48 ` Mathias Nyman
@ 2025-08-12 18:15 ` Marcus Rückert
2025-08-12 22:02 ` Michał Pecio
0 siblings, 1 reply; 18+ messages in thread
From: Marcus Rückert @ 2025-08-12 18:15 UTC (permalink / raw)
To: Mathias Nyman, Jiri Slaby, gregkh
Cc: linux-usb, stern, stable, Łukasz Bartosik, Oliver Neukum,
Michał Pecio
On Tue, 2025-08-12 at 13:48 +0300, Mathias Nyman wrote:
> > > [Wed Aug 6 16:52:50 2025] [ T362645] xhci_hcd 0000:0e:00.0: xHCI
> > > host controller not responding, assume dead
> > > [Wed Aug 6 16:52:50 2025] [ T362645] xhci_hcd 0000:0e:00.0: HC
> > > died; cleaning up
>
> Tear down xhci.
so usb is not dead completely. I can connect my keyboard to the
charging cable of my mouse and it starts working again. but it seems
all my devices hanging on that part of the usb tree are dead
(DAC/keyboard)
lspci is here
https://bugzilla.opensuse.org/show_bug.cgi?id=1247895#c3
Mainboard is an ASUS ProArt X870E-CREATOR WIFI
> >
> > Any ideas? What would you need to debug this?
>
> Could be that this patch reveals some underlying race in xhci re-
> enumeration path.
possible.
> Could also be related to ep0 max packet size setting as this is a
> full-speed device.
> (max packet size is unknown until host reads first 8 bytes of
> descriptor, then adjusts
> it on the fly with an evaluate context command)
>
> Appreciated if this could be reproduced with as few usb devices as
> possible, and with
> xhci tracing and dynamic debug enabled:
sadly this is not really reproducible on command. sometimes it happens
after only a few hours. sometimes it happens after a day or 2.
> mount -t debugfs none /sys/kernel/debug
> echo 'module xhci_hcd =p' >/sys/kernel/debug/dynamic_debug/control
> echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control
> echo 81920 > /sys/kernel/debug/tracing/buffer_size_kb
> echo 1 > /sys/kernel/debug/tracing/events/xhci-hcd/enable
> echo 1 > /sys/kernel/debug/tracing/tracing_on
Running with this now.
> < Reproduce issue >
> Send output of dmesg
> Send content of /sys/kernel/debug/tracing/trace
Will do once it happens again.
darix
--
Always remember:
Never accept the world as it appears to be.
Dare to see it for what it could be.
The world can always use more heroes.
* Re: [PATCH] usb: hub: Don't try to recover devices lost during warm reset.
From: Michał Pecio @ 2025-08-12 22:02 UTC (permalink / raw)
To: Marcus Rückert
Cc: Mathias Nyman, Jiri Slaby, gregkh, linux-usb, stern, stable,
Łukasz Bartosik, Oliver Neukum
On Tue, 12 Aug 2025 20:15:13 +0200, Marcus Rückert wrote:
> On Tue, 2025-08-12 at 13:48 +0300, Mathias Nyman wrote:
> > > > [Wed Aug 6 16:52:50 2025] [ T362645] xhci_hcd 0000:0e:00.0: xHCI
> > > > host controller not responding, assume dead
> > > > [Wed Aug 6 16:52:50 2025] [ T362645] xhci_hcd 0000:0e:00.0: HC
> > > > died; cleaning up
> >
> > Tear down xhci.
>
> so usb is not dead completely. I can connect my keyboard to the
> charging cable of my mouse and it starts working again. but it seems
> all my devices hanging on that part of the usb tree are dead
> (DAC/keyboard)
You have multiple USB buses on multiple xHCI controllers. Controller
responsible for bus 1 goes belly up and its devices are lost, but the
rest keeps working.
It would make sense to figure out what was this device on port 2 of
bus 1 which triggered the failure. Your lsusb output shows no such
device, so it was either disconnected, connected to another port or
it malfunctioned and failed to enumerate at the time. Do you know?
What's the output of these commands right now?
dmesg |grep 'usb 1-2'
dmesg |grep 'descriptor read'
Do you have logs? Can you look at them to see if it was always
"usb 1-2" causing trouble in the past?
> lspci is here
>
> https://bugzilla.opensuse.org/show_bug.cgi?id=1247895#c3
>
> Mainboard is an ASUS ProArt X870E-CREATOR WIFI
Thanks. Unfortunately I don't have this exact chipset, but it's
an AMD chipset made by ASMedia, as suspected.
The situation is somewhat similar (though different) to this bug:
https://bugzilla.kernel.org/show_bug.cgi?id=220069
Random failures for no clear reason, apparently triggered by some
repetitive background activity. Very annoying.
* Re: [PATCH] usb: hub: Don't try to recover devices lost during warm reset.
From: Marcus Rückert @ 2025-08-13 1:58 UTC (permalink / raw)
To: Michał Pecio
Cc: Mathias Nyman, Jiri Slaby, gregkh, linux-usb, stern, stable,
Łukasz Bartosik, Oliver Neukum
On Wed, 2025-08-13 at 00:02 +0200, Michał Pecio wrote:
> On Tue, 12 Aug 2025 20:15:13 +0200, Marcus Rückert wrote:
> > On Tue, 2025-08-12 at 13:48 +0300, Mathias Nyman wrote:
> > > > > [Wed Aug 6 16:52:50 2025] [ T362645] xhci_hcd 0000:0e:00.0: xHCI host controller not responding, assume dead
> > > > > [Wed Aug 6 16:52:50 2025] [ T362645] xhci_hcd 0000:0e:00.0: HC died; cleaning up
> > >
> > > Tear down xhci.
> >
> > so usb is not dead completely. I can connect my keyboard to the
> > charging cable of my mouse and it starts working again. but it
> > seems
> > all my devices hanging on that part of the usb tree are dead
> > (DAC/keyboard)
>
> You have multiple USB buses on multiple xHCI controllers. Controller
> responsible for bus 1 goes belly up and its devices are lost, but the
> rest keeps working.
>
> It would make sense to figure out what was this device on port 2 of
> bus 1 which triggered the failure. Your lsusb output shows no such
> device, so it was either disconnected, connected to another port or
> it malfunctioned and failed to enumerate at the time. Do you know?
>
> What's the output of these commands right now?
> dmesg |grep 'usb 1-2'
> dmesg |grep 'descriptor read'
dmesg |grep 'usb 1-2' ; dmesg |grep 'descriptor read'
[ 2.686292] [ T787] usb 1-2: new full-speed USB device number 3 using xhci_hcd
[ 3.054496] [ T787] usb 1-2: New USB device found, idVendor=31e3, idProduct=1322, bcdDevice= 2.30
[ 3.054499] [ T787] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 3.054500] [ T787] usb 1-2: Product: Wooting 60HE+
[ 3.054501] [ T787] usb 1-2: Manufacturer: Wooting
the device is running firmware 2.11.0b-beta.3
> Do you have logs? Can you look at them to see if it was always
> "usb 1-2" causing trouble in the past?
looks like it according to
journalctl --since 2025-07-01 --grep "reset full-speed USB device number"
Jul 24 15:56:34 kernel: usb 1-2: reset full-speed USB device number 14 using xhci_hcd
Jul 24 15:56:35 kernel: usb 1-2: reset full-speed USB device number 14 using xhci_hcd
Jul 24 15:56:36 kernel: usb 1-2: reset full-speed USB device number 14 using xhci_hcd
Jul 24 15:56:37 kernel: usb 1-2: reset full-speed USB device number 14 using xhci_hcd
Jul 31 19:53:02 kernel: usb 1-2: reset full-speed USB device number 50 using xhci_hcd
Jul 31 19:53:03 kernel: usb 1-2: reset full-speed USB device number 50 using xhci_hcd
Jul 31 19:53:04 kernel: usb 1-2: reset full-speed USB device number 50 using xhci_hcd
Jul 31 19:53:04 kernel: usb 1-2: reset full-speed USB device number 50 using xhci_hcd
Aug 06 16:51:34 kernel: usb 1-2: reset full-speed USB device number 12 using xhci_hcd
Aug 06 16:51:35 kernel: usb 1-2: reset full-speed USB device number 12 using xhci_hcd
Aug 06 16:51:36 kernel: usb 1-2: reset full-speed USB device number 12 using xhci_hcd
Aug 06 16:51:36 kernel: usb 1-2: reset full-speed USB device number 12 using xhci_hcd
> > lspci is here
> >
> > https://bugzilla.opensuse.org/show_bug.cgi?id=1247895#c3
> >
> > Mainboard is an ASUS ProArt X870E-CREATOR WIFI
>
> Thanks. Unfortunately I don't have this exact chipset, but it's
> an AMD chipset made by ASMedia, as suspected.
I will drop Wooting a mail so they are in the loop.
darix
* Re: [PATCH] usb: hub: Don't try to recover devices lost during warm reset.
From: Marcus Rückert @ 2025-08-13 2:11 UTC (permalink / raw)
To: Michał Pecio
Cc: Mathias Nyman, Jiri Slaby, gregkh, linux-usb, stern, stable,
Łukasz Bartosik, Oliver Neukum
On Wed, 2025-08-13 at 00:02 +0200, Michał Pecio wrote:
> It would make sense to figure out what was this device on port 2 of
> bus 1 which triggered the failure. Your lsusb output shows no such
> device, so it was either disconnected, connected to another port or
> it malfunctioned and failed to enumerate at the time. Do you know?
I forgot to answer this part: I am only using that keyboard for gaming,
so it goes into power save mode at some point. Maybe it doesn't properly
unregister for that?
* Re: [PATCH] usb: hub: Don't try to recover devices lost during warm reset.
From: Michał Pecio @ 2025-08-13 6:42 UTC (permalink / raw)
To: Marcus Rückert
Cc: Mathias Nyman, Jiri Slaby, gregkh, linux-usb, stern, stable,
Łukasz Bartosik, Oliver Neukum
On Wed, 13 Aug 2025 03:58:07 +0200, Marcus Rückert wrote:
> dmesg |grep 'usb 1-2' ; dmesg |grep 'descriptor read'
> [ 2.686292] [ T787] usb 1-2: new full-speed USB device number 3
> using xhci_hcd
> [ 3.054496] [ T787] usb 1-2: New USB device found, idVendor=31e3,
> idProduct=1322, bcdDevice= 2.30
> [ 3.054499] [ T787] usb 1-2: New USB device strings: Mfr=1,
> Product=2, SerialNumber=3
> [ 3.054500] [ T787] usb 1-2: Product: Wooting 60HE+
> [ 3.054501] [ T787] usb 1-2: Manufacturer: Wooting
OK, so you had a keyboard in this port during the last boot. Is this
keyboard always connected to the same port? There is no bus 1 port 2
device on your earlier lsusb output, so it was either not connected
there or not detected due to malfunction.
> journalctl --since 2025-07-01 --grep "reset full-speed USB device
> number"
>
> Jul 24 15:56:34 kernel: usb 1-2: reset full-speed USB device number 14
> using xhci_hcd
> Jul 24 15:56:35 kernel: usb 1-2: reset full-speed USB device number 14
> using xhci_hcd
> Jul 24 15:56:36 kernel: usb 1-2: reset full-speed USB device number 14
> using xhci_hcd
> Jul 24 15:56:37 kernel: usb 1-2: reset full-speed USB device number 14
> using xhci_hcd
> Jul 31 19:53:02 kernel: usb 1-2: reset full-speed USB device number 50
> using xhci_hcd
> Jul 31 19:53:03 kernel: usb 1-2: reset full-speed USB device number 50
> using xhci_hcd
> Jul 31 19:53:04 kernel: usb 1-2: reset full-speed USB device number 50
> using xhci_hcd
> Jul 31 19:53:04 kernel: usb 1-2: reset full-speed USB device number 50
> using xhci_hcd
> Aug 06 16:51:34 kernel: usb 1-2: reset full-speed USB device number 12
> using xhci_hcd
> Aug 06 16:51:35 kernel: usb 1-2: reset full-speed USB device number 12
> using xhci_hcd
> Aug 06 16:51:36 kernel: usb 1-2: reset full-speed USB device number 12
> using xhci_hcd
> Aug 06 16:51:36 kernel: usb 1-2: reset full-speed USB device number 12
> using xhci_hcd
So this port was getting reset in the past. Can you also check:
- how many of those resets were followed by "HC died"
- if all "HC died" events were caused by resets of port usb 1-2
(or some other port)
And for the record, what exactly was the original problem which you
reported to Suse and believe to be caused by a kernel upgrade? Was it
"HC died" and loss of multiple devices, or just the keyboard failing
to work and spamming "reset USB device number x", or something else?
Regards,
Michal
* Re: [PATCH] usb: hub: Don't try to recover devices lost during warm reset.
From: Marcus Rückert @ 2025-08-13 9:14 UTC (permalink / raw)
To: Michał Pecio
Cc: Mathias Nyman, Jiri Slaby, gregkh, linux-usb, stern, stable,
Łukasz Bartosik, Oliver Neukum
On Wed, 2025-08-13 at 08:42 +0200, Michał Pecio wrote:
> On Wed, 13 Aug 2025 03:58:07 +0200, Marcus Rückert wrote:
> > dmesg |grep 'usb 1-2' ; dmesg |grep 'descriptor read'
> > [ 2.686292] [ T787] usb 1-2: new full-speed USB device number 3 using xhci_hcd
> > [ 3.054496] [ T787] usb 1-2: New USB device found, idVendor=31e3, idProduct=1322, bcdDevice= 2.30
> > [ 3.054499] [ T787] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
> > [ 3.054500] [ T787] usb 1-2: Product: Wooting 60HE+
> > [ 3.054501] [ T787] usb 1-2: Manufacturer: Wooting
>
> OK, so you had a keyboard in this port during the last boot. Is this
> keyboard always connected to the same port? There is no bus 1 port 2
> device on your earlier lsusb output, so it was either not connected
> there or not detected due to malfunction.
Yes, it is always connected to that port. The setup is quite static.
> So this port was getting reset in the past. Can you also check:
> - how many of those resets were followed by "HC died"
> - if all "HC died" events were caused by resets of port usb 1-2
> (or some other port)
Jul 24 15:56:34 kernel: usb 1-2: reset full-speed USB device number 14 using xhci_hcd
Jul 24 15:56:35 kernel: usb 1-2: reset full-speed USB device number 14 using xhci_hcd
Jul 24 15:56:36 kernel: usb 1-2: reset full-speed USB device number 14 using xhci_hcd
Jul 24 15:56:37 kernel: usb 1-2: reset full-speed USB device number 14 using xhci_hcd
Jul 24 15:57:56 kernel: xhci_hcd 0000:0e:00.0: HC died; cleaning up
Jul 31 19:53:02 kernel: usb 1-2: reset full-speed USB device number 50 using xhci_hcd
Jul 31 19:53:03 kernel: usb 1-2: reset full-speed USB device number 50 using xhci_hcd
Jul 31 19:53:04 kernel: usb 1-2: reset full-speed USB device number 50 using xhci_hcd
Jul 31 19:53:04 kernel: usb 1-2: reset full-speed USB device number 50 using xhci_hcd
Jul 31 19:55:05 kernel: xhci_hcd 0000:0e:00.0: HC died; cleaning up
Aug 06 16:51:34 kernel: usb 1-2: reset full-speed USB device number 12 using xhci_hcd
Aug 06 16:51:35 kernel: usb 1-2: reset full-speed USB device number 12 using xhci_hcd
Aug 06 16:51:36 kernel: usb 1-2: reset full-speed USB device number 12 using xhci_hcd
Aug 06 16:51:36 kernel: usb 1-2: reset full-speed USB device number 12 using xhci_hcd
Aug 06 16:52:50 kernel: xhci_hcd 0000:0e:00.0: HC died; cleaning up
All "HC died" events followed a "reset full-speed" burst on usb 1-2.
> And for the record, what exactly was the original problem which you
> reported to Suse and believe to be caused by a kernel upgrade? Was it
> "HC died" and loss of multiple devices, or just the keyboard failing
> to work and spamming "reset USB device number x", or something else?
The spamming I wouldn't have noticed, but the loss of the other devices
from the "HC died" I did notice. So I asked Jiri if the recent kernel
updates included USB changes and we started debugging :)
darix
* Re: [PATCH] usb: hub: Don't try to recover devices lost during warm reset.
From: Michał Pecio @ 2025-08-13 9:48 UTC (permalink / raw)
To: Marcus Rückert
Cc: Mathias Nyman, Jiri Slaby, gregkh, linux-usb, stern, stable,
Łukasz Bartosik, Oliver Neukum
On Wed, 13 Aug 2025 11:14:04 +0200, Marcus Rückert wrote:
> Jul 24 15:56:34 kernel: usb 1-2: reset full-speed USB device number 14
> using xhci_hcd
> Jul 24 15:56:35 kernel: usb 1-2: reset full-speed USB device number 14
> using xhci_hcd
> Jul 24 15:56:36 kernel: usb 1-2: reset full-speed USB device number 14
> using xhci_hcd
> Jul 24 15:56:37 kernel: usb 1-2: reset full-speed USB device number 14
> using xhci_hcd
> Jul 24 15:57:56 kernel: xhci_hcd 0000:0e:00.0: HC died; cleaning up
> Jul 31 19:53:02 kernel: usb 1-2: reset full-speed USB device number 50
> using xhci_hcd
> Jul 31 19:53:03 kernel: usb 1-2: reset full-speed USB device number 50
> using xhci_hcd
> Jul 31 19:53:04 kernel: usb 1-2: reset full-speed USB device number 50
> using xhci_hcd
> Jul 31 19:53:04 kernel: usb 1-2: reset full-speed USB device number 50
> using xhci_hcd
> Jul 31 19:55:05 kernel: xhci_hcd 0000:0e:00.0: HC died; cleaning up
> Aug 06 16:51:34 kernel: usb 1-2: reset full-speed USB device number 12
> using xhci_hcd
> Aug 06 16:51:35 kernel: usb 1-2: reset full-speed USB device number 12
> using xhci_hcd
> Aug 06 16:51:36 kernel: usb 1-2: reset full-speed USB device number 12
> using xhci_hcd
> Aug 06 16:51:36 kernel: usb 1-2: reset full-speed USB device number 12
> using xhci_hcd
> Aug 06 16:52:50 kernel: xhci_hcd 0000:0e:00.0: HC died; cleaning up
>
>
> all HC died events were connected to reset full-speed.
OK, three reset loops and three HC died in the last month, both at
the same time, about once a week. Possibly not a coincidence ;)
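The gap between the last reset and the controller being declared dead can
be read off the journal lines quoted above. A rough Python sketch,
assuming syslog-style timestamps in an English locale; the year is passed
in by hand because the journal output omits it, and the function name is
made up:

```python
from datetime import datetime


def seconds_from_last_reset_to_death(lines, year=2025):
    """Pair each "HC died" line with the last reset line seen before it."""
    delays = []
    last_reset = None
    for line in lines:
        # Syslog-style prefix, e.g. "Jul 24 15:56:37" (15 characters).
        stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        if "reset full-speed USB device" in line:
            last_reset = stamp
        elif "HC died" in line and last_reset is not None:
            delays.append(int((stamp - last_reset).total_seconds()))
            last_reset = None
    return delays


# Last reset and "HC died" line of each incident, copied from the journal.
journal = [
    "Jul 24 15:56:37 kernel: usb 1-2: reset full-speed USB device number 14 using xhci_hcd",
    "Jul 24 15:57:56 kernel: xhci_hcd 0000:0e:00.0: HC died; cleaning up",
    "Jul 31 19:53:04 kernel: usb 1-2: reset full-speed USB device number 50 using xhci_hcd",
    "Jul 31 19:55:05 kernel: xhci_hcd 0000:0e:00.0: HC died; cleaning up",
    "Aug 06 16:51:36 kernel: usb 1-2: reset full-speed USB device number 12 using xhci_hcd",
    "Aug 06 16:52:50 kernel: xhci_hcd 0000:0e:00.0: HC died; cleaning up",
]
print(seconds_from_last_reset_to_death(journal))
```

For the three incidents this gives delays of 79, 121 and 74 seconds.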
Not sure if we can confidently say that reverting this patch helped,
because a week is just passing today. But the same hardware worked
fine for weeks/months/years? before a recent kernel upgrade, correct?
Random idea: would anything happen if you run 'usbreset' to manually
reset this device? Maybe a few times.
* Re: [PATCH] usb: hub: Don't try to recover devices lost during warm reset.
From: Marcus Rückert @ 2025-08-13 10:05 UTC (permalink / raw)
To: Michał Pecio
Cc: Mathias Nyman, Jiri Slaby, gregkh, linux-usb, stern, stable,
Łukasz Bartosik, Oliver Neukum
On Wed, 2025-08-13 at 11:48 +0200, Michał Pecio wrote:
> OK, three reset loops and three HC died in the last month, both at
> the same time, about once a week. Possibly not a coincidence ;)
>
> Not sure if we can confidently say that reverting this patch helped,
> because a week is just passing today. But the same hardware worked
> fine for weeks/months/years? before a recent kernel upgrade, correct?
From 2024-07 until the end of July this year (when I upgraded to kernel
6.15.7) everything was working fine. Also, since I started running the
kernel with the patch reverted, the issue has not shown up again.
> Random idea: would anything happen if you run 'usbreset' to manually
> reset this device? Maybe a few times.
How do I do that?
darix
* Re: [PATCH] usb: hub: Don't try to recover devices lost during warm reset.
From: Mathias Nyman @ 2025-08-13 10:13 UTC (permalink / raw)
To: Michał Pecio, Marcus Rückert
Cc: Jiri Slaby, gregkh, linux-usb, stern, stable,
Łukasz Bartosik, Oliver Neukum
On 13.8.2025 12.48, Michał Pecio wrote:
> On Wed, 13 Aug 2025 11:14:04 +0200, Marcus Rückert wrote:
>> Jul 24 15:56:34 kernel: usb 1-2: reset full-speed USB device number 14
>> using xhci_hcd
>> Jul 24 15:56:35 kernel: usb 1-2: reset full-speed USB device number 14
>> using xhci_hcd
>> Jul 24 15:56:36 kernel: usb 1-2: reset full-speed USB device number 14
>> using xhci_hcd
>> Jul 24 15:56:37 kernel: usb 1-2: reset full-speed USB device number 14
>> using xhci_hcd
>> Jul 24 15:57:56 kernel: xhci_hcd 0000:0e:00.0: HC died; cleaning up
>> Jul 31 19:53:02 kernel: usb 1-2: reset full-speed USB device number 50
>> using xhci_hcd
>> Jul 31 19:53:03 kernel: usb 1-2: reset full-speed USB device number 50
>> using xhci_hcd
>> Jul 31 19:53:04 kernel: usb 1-2: reset full-speed USB device number 50
>> using xhci_hcd
>> Jul 31 19:53:04 kernel: usb 1-2: reset full-speed USB device number 50
>> using xhci_hcd
>> Jul 31 19:55:05 kernel: xhci_hcd 0000:0e:00.0: HC died; cleaning up
>> Aug 06 16:51:34 kernel: usb 1-2: reset full-speed USB device number 12
>> using xhci_hcd
>> Aug 06 16:51:35 kernel: usb 1-2: reset full-speed USB device number 12
>> using xhci_hcd
>> Aug 06 16:51:36 kernel: usb 1-2: reset full-speed USB device number 12
>> using xhci_hcd
>> Aug 06 16:51:36 kernel: usb 1-2: reset full-speed USB device number 12
>> using xhci_hcd
>> Aug 06 16:52:50 kernel: xhci_hcd 0000:0e:00.0: HC died; cleaning up
>>
>>
>> all HC died events were connected to reset full-speed.
>
> OK, three reset loops and three HC died in the last month, both at
> the same time, about once a week. Possibly not a coincidence ;)
>
> Not sure if we can confidently say that reverting this patch helped,
> because a week is just passing today. But the same hardware worked
> fine for weeks/months/years? before a recent kernel upgrade, correct?
This patch also only concerns SuperSpeed and SuperSpeedPlus (USB 3) devices,
so it's unlikely to be the real cause.
It is possible that it reveals some existing race between the SuperSpeed bus and
the slower High- and Full-speed bus. Both of those buses are handled by the same
xHCI controller.
In this setup usb1 is the High- and Full-speed bus, and usb2 is the SuperSpeed bus.
Thanks
-Mathias
* Re: [PATCH] usb: hub: Don't try to recover devices lost during warm reset.
From: Michał Pecio @ 2025-08-14 5:41 UTC (permalink / raw)
To: Marcus Rückert
Cc: Mathias Nyman, Jiri Slaby, gregkh, linux-usb, stern, stable,
Łukasz Bartosik, Oliver Neukum
On Wed, 13 Aug 2025 12:05:16 +0200, Marcus Rückert wrote:
> On Wed, 2025-08-13 at 11:48 +0200, Michał Pecio wrote:
> > OK, three reset loops and three HC died in the last month, both at
> > the same time, about once a week. Possibly not a coincidence ;)
> >
> > Not sure if we can confidently say that reverting this patch helped,
> > because a week is just passing today. But the same hardware worked
> > fine for weeks/months/years? before a recent kernel upgrade, correct?
>
> From 2024-07 until end of July this year (when I upgraded to kernel
> 6.15.7) everything was working fine. Also since I run with the kernel
> where the patch is reverted the issue has not shown up again.
Considering the rarity of those events, I think you would need to run for
a few weeks to be sure that the problem is gone.
There is also a chance that some hardware change which doesn't involve
the "usb 1-2" keyboard caused it. In bug 220069, another AMD chipset
was dying every few days if and only if two particular devices were
connected to the same USB controller (the chipset had two controllers).
>
> > Random idea: would anything happen if you run 'usbreset' to manually
> > reset this device? Maybe a few times.
>
> How do I do that?
Run usbreset without arguments (as root) and it will print a small help
text and a list of devices it can reset. If you don't have usbreset,
ask Suse. Normally it should be in usbutils package like lsusb.
But I suspect nothing will happen (ie. the device will reset normally).
We tried it in bug 220069 as well.
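For reference, usbreset essentially opens the device node under
/dev/bus/usb and issues the USBDEVFS_RESET ioctl. A rough Python
equivalent is sketched below, assuming Linux and root. The helper names
are made up, and the ioctl value is computed the way _IO() works on x86
and ARM (a few architectures encode it differently). The device number
has to be looked up first (e.g. with lsusb), because it changes on every
re-enumeration, as the logs in this thread show (3, 14, 50, 12).

```python
import fcntl
import os

# _IO('U', 20) from <linux/usbdevice_fs.h>, as encoded on x86/ARM.
USBDEVFS_RESET = (ord("U") << 8) | 20


def device_node(bus: int, dev: int) -> str:
    """Path of the usbfs node for a given bus and device number."""
    return f"/dev/bus/usb/{bus:03d}/{dev:03d}"


def reset_device(bus: int, dev: int) -> None:
    """Ask the kernel to reset one USB device (needs root)."""
    fd = os.open(device_node(bus, dev), os.O_WRONLY)
    try:
        fcntl.ioctl(fd, USBDEVFS_RESET, 0)
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Only prints the node and ioctl number; call reset_device() by hand.
    print(device_node(1, 3), hex(USBDEVFS_RESET))
```

Calling reset_device(1, N) with the keyboard's current device number on
bus 1 would then be the scripted counterpart of a manual usbreset run.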
So it comes down to waiting until it crashes spontaneously again.