* [REGRESSION] Long boot times due to USB enumeration
From: Seïfane Idouchach @ 2025-03-05 16:32 UTC
To: dirk.behme; +Cc: gregkh, rafael, dakr, linux-kernel, regressions, stable
Dear all,
I am reporting what I believe to be a regression introduced by commit
c0a40097f0bc81deafc15f9195d1fb54595cd6d0.
After this change I am experiencing long boot times on a setup that
has what seems to be a bad USB device.
The boot stalls while the kernel retries (and ultimately fails) to
enumerate the USB device, and only continues once enumeration has
been given up.
On Arch Linux this shows up as a systemd message about a job waiting
on journald. journald starts just after the enumeration fails
with "unable to enumerate USB device".
As a result, boot takes on average about 1 minute longer than usual
(it usually takes around 10s).
No stable kernel before this change exhibits the issue; all stable
kernels after this change do.
See the related USB messages below (these messages are contiguous
and have not been snipped):
[...]
[ 9.640854] usb 1-9: device descriptor read/64, error -110
[ 25.147505] usb 1-9: device descriptor read/64, error -110
[ 25.650779] usb 1-9: new high-speed USB device number 5 using xhci_hcd
[ 30.907482] usb 1-9: device descriptor read/64, error -110
[ 46.480900] usb 1-9: device descriptor read/64, error -110
[ 46.589883] usb usb1-port9: attempt power cycle
[ 46.990815] usb 1-9: new high-speed USB device number 6 using xhci_hcd
[ 51.791571] usb 1-9: Device not responding to setup address.
[ 56.801594] usb 1-9: Device not responding to setup address.
[ 57.010803] usb 1-9: device not accepting address 6, error -71
[ 57.137485] usb 1-9: new high-speed USB device number 7 using xhci_hcd
[ 61.937624] usb 1-9: Device not responding to setup address.
[ 66.947485] usb 1-9: Device not responding to setup address.
[ 67.154086] usb 1-9: device not accepting address 7, error -71
[ 67.156426] usb usb1-port9: unable to enumerate USB device
[...]
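The stall visible in the log above can be quantified directly from the kernel timestamps. The following is a minimal sketch, not part of the original report: it parses the quoted lines, measures the time between the first and last quoted message, and counts the error codes. The helper names (`parse`, `stall_seconds`, `error_counts`) are hypothetical, and the errno names assume the generic Linux values (-110 is ETIMEDOUT, -71 is EPROTO). Because the log begins with an elision, the result is only a lower bound on the time spent enumerating.

```python
import re

# Log lines copied from the report (kernel timestamps are in seconds).
LOG = """\
[ 9.640854] usb 1-9: device descriptor read/64, error -110
[ 25.147505] usb 1-9: device descriptor read/64, error -110
[ 25.650779] usb 1-9: new high-speed USB device number 5 using xhci_hcd
[ 30.907482] usb 1-9: device descriptor read/64, error -110
[ 46.480900] usb 1-9: device descriptor read/64, error -110
[ 46.589883] usb usb1-port9: attempt power cycle
[ 46.990815] usb 1-9: new high-speed USB device number 6 using xhci_hcd
[ 51.791571] usb 1-9: Device not responding to setup address.
[ 56.801594] usb 1-9: Device not responding to setup address.
[ 57.010803] usb 1-9: device not accepting address 6, error -71
[ 57.137485] usb 1-9: new high-speed USB device number 7 using xhci_hcd
[ 61.937624] usb 1-9: Device not responding to setup address.
[ 66.947485] usb 1-9: Device not responding to setup address.
[ 67.154086] usb 1-9: device not accepting address 7, error -71
[ 67.156426] usb usb1-port9: unable to enumerate USB device
"""

# Assumption: generic Linux errno numbering for the two codes in the log.
ERRNO_NAMES = {71: "EPROTO", 110: "ETIMEDOUT"}

LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def parse(log):
    """Return a list of (timestamp_seconds, message) tuples."""
    entries = []
    for line in log.splitlines():
        match = LINE_RE.match(line)
        if match:
            entries.append((float(match.group(1)), match.group(2)))
    return entries


def stall_seconds(entries):
    """Time between the first and the last quoted message."""
    return entries[-1][0] - entries[0][0]


def error_counts(entries):
    """Count messages per errno name, e.g. how many reads timed out."""
    counts = {}
    for _, message in entries:
        match = re.search(r"error -(\d+)", message)
        if match:
            name = ERRNO_NAMES.get(int(match.group(1)), "unknown")
            counts[name] = counts.get(name, 0) + 1
    return counts


if __name__ == "__main__":
    entries = parse(LOG)
    print(f"quoted messages span {stall_seconds(entries):.1f} s")
    print(error_counts(entries))
```

The quoted messages span roughly 57.5 seconds, which is consistent with the reported delay of about one minute.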
This issue does not manifest in 44a45be57f85.
I am available to test any patches on my system, since I understand
this could be quite hard to reproduce elsewhere.
I can also provide more information, with guidance if needed, to help
troubleshoot the issue further.
Wishing you all a good day.
#regzbot introduced: c0a40097f0bc81deafc15f9195d1fb54595cd6d0
* Re: [REGRESSION] Long boot times due to USB enumeration
From: Greg KH @ 2025-03-05 18:09 UTC
To: Seïfane Idouchach
Cc: dirk.behme, rafael, dakr, linux-kernel, regressions, stable
On Thu, Mar 06, 2025 at 12:32:59AM +0800, Seïfane Idouchach wrote:
> Dear all,
>
> I am reporting what I believe to be a regression introduced by commit
> c0a40097f0bc81deafc15f9195d1fb54595cd6d0.
>
> After this change I am experiencing long boot times on a setup that
> has what seems to be a bad USB device.
> The boot stalls while the kernel retries (and ultimately fails) to
> enumerate the USB device, and only continues once enumeration has
> been given up.
> On Arch Linux this shows up as a systemd message about a job waiting
> on journald. journald starts just after the enumeration fails
> with "unable to enumerate USB device".
> As a result, boot takes on average about 1 minute longer than usual
> (it usually takes around 10s).
> No stable kernel before this change exhibits the issue; all stable
> kernels after this change do.
>
> See the related USB messages below (these messages are contiguous
> and have not been snipped):
>
> [...]
> [ 9.640854] usb 1-9: device descriptor read/64, error -110
> [ 25.147505] usb 1-9: device descriptor read/64, error -110
> [ 25.650779] usb 1-9: new high-speed USB device number 5 using xhci_hcd
> [ 30.907482] usb 1-9: device descriptor read/64, error -110
> [ 46.480900] usb 1-9: device descriptor read/64, error -110
> [ 46.589883] usb usb1-port9: attempt power cycle
> [ 46.990815] usb 1-9: new high-speed USB device number 6 using xhci_hcd
> [ 51.791571] usb 1-9: Device not responding to setup address.
> [ 56.801594] usb 1-9: Device not responding to setup address.
> [ 57.010803] usb 1-9: device not accepting address 6, error -71
> [ 57.137485] usb 1-9: new high-speed USB device number 7 using xhci_hcd
> [ 61.937624] usb 1-9: Device not responding to setup address.
> [ 66.947485] usb 1-9: Device not responding to setup address.
> [ 67.154086] usb 1-9: device not accepting address 7, error -71
> [ 67.156426] usb usb1-port9: unable to enumerate USB device
That's a real issue, but should not be due to the commit id you
referenced.
> [...]
>
> This issue does not manifest in 44a45be57f85.
What does that commit have to do with this? That's just a build break
fix.
> I am available to test any patches on my system, since I understand
> this could be quite hard to reproduce elsewhere.
> I can also provide more information, with guidance if needed, to help
> troubleshoot the issue further.
>
> Wishing you all a good day.
>
> #regzbot introduced: c0a40097f0bc81deafc15f9195d1fb54595cd6d0
>
We know there are issues here. That commit was "fixed" by commit
15fffc6a5624 ("driver core: Fix uevent_show() vs driver detach race"),
but then that caused a different problem, so it was reverted by commit
9a71892cbcdb ("Revert "driver core: Fix uevent_show() vs driver detach
race"").
There are many discussions about this on the mailing list, with a
proposal to add Dan's "fix" back. If you could try that, it would be
great to see.
I think your USB problem is different here, but if you add 15fffc6a5624
("driver core: Fix uevent_show() vs driver detach race") to your kernel,
that would be great to see.
thanks,
greg k-h
* Re: [REGRESSION] Long boot times due to USB enumeration
From: Seïfane Idouchach @ 2025-03-05 19:57 UTC
To: Greg KH; +Cc: dirk.behme, rafael, dakr, linux-kernel, regressions, stable
Hello Greg,
Thank you for your time.
> What does that commit have to do with this? That's just a build break
> fix.
This commit comes right before what seems to be the bad commit. I got
to the cited (maybe) bad commit after a bisection and wanted to
confirm the results.
> I think your USB problem is different here, but if you add 15fffc6a5624
> ("driver core: Fix uevent_show() vs driver detach race") to your kernel,
> that would be great to see.
After reapplying the patch (15fffc6a5624) on top of v6.13
(ffd294d346d1), the issue is still not resolved.
The behavior is a bit different from that at the reported commit
(c0a40097f0bc): the stall seems to happen earlier in the boot, before
systemd has even started, since there is no mention of a wait job.
However, the end result is the same: the boot only continues
after the "unable to enumerate USB device" message.
I remain available if you need anything else.
* Re: [REGRESSION] Long boot times due to USB enumeration
From: Seïfane Idouchach @ 2025-03-07 12:58 UTC
To: Greg KH; +Cc: dirk.behme, rafael, dakr, linux-kernel, regressions, stable
Dear all,
I continued bisecting while applying Dan's fix (15fffc6a5624) along the way.
While the patch solves the problem for some commits, I seem to be
hitting another commit that brings the issue back
(25f51b76f90f10f9bf2fbc05fc51cf685da7ccad).
I tested on top of v6.14-rc5 (7eb172143d5508), which has the issue;
applying the fix and reverting the bad commit (25f51b76f90f10) fixes
it.
Both applying the fix and the revert are needed to resolve the issue.
Let me know your thoughts on this.
* Re: [REGRESSION] Long boot times due to USB enumeration
From: Greg KH @ 2025-03-07 14:07 UTC
To: Seïfane Idouchach
Cc: dirk.behme, rafael, dakr, linux-kernel, regressions, stable
On Fri, Mar 07, 2025 at 08:58:04PM +0800, Seïfane Idouchach wrote:
> Dear all,
>
> I continued bisecting while applying Dan's fix (15fffc6a5624) along the way.
> While the patch solves the problem for some commits, I seem to be
> hitting another commit that brings the issue back
> (25f51b76f90f10f9bf2fbc05fc51cf685da7ccad).
That is a totally different change, I think you have something odd here
as these bisection points are very confusing.
> I tested on top of v6.14-rc5 (7eb172143d5508), which has the issue;
> applying the fix and reverting the bad commit (25f51b76f90f10) fixes
> it.
> Both applying the fix and the revert are needed to resolve the issue.
>
> Let me know your thoughts on this.
I think you have a mix of problems here. Let's fix up all of those
error messages in the log first. Dan's fix has nothing to do with that
at all, once the USB bus connection stuff is resolved, then it should be
ok.
As that xhci commit you point at is showing an issue, are you sure that
you are properly building the right xhci driver into the system? Do you
have a Renesas xhci controller? What is the output of 'lspci'?
thanks,
greg k-h
* Re: [REGRESSION] Long boot times due to USB enumeration
From: Seïfane Idouchach @ 2025-03-07 15:45 UTC
To: Greg KH; +Cc: dirk.behme, rafael, dakr, linux-kernel, regressions, stable
> That is a totally different change, I think you have something odd here
> as these bisection points are very confusing.
I can only agree. I was skeptical that reverting this commit would fix
the issue but it does.
> I think you have a mix of problems here. Let's fix up all of those
> error messages in the log first. Dan's fix has nothing to do with that
> at all, once the USB bus connection stuff is resolved, then it should be
> ok.
Are you suggesting that those messages should be fixed first? I am
sorry if I was not clear before: those messages are always present,
even on a "good" build.
The issue is that on a "bad" build they hold up the boot process.
USB functionality is never affected.
> As that xhci commit you point at is showing an issue, are you sure that
> you are properly building the right xhci driver into the system? Do you
> have a Renesas xhci controller? What is the output of 'lspci'?
I am building with a config based on that of my current distribution,
Arch Linux, updated with olddefconfig. A quick grep for the options
touched by the commit returns the following:
CONFIG_USB_XHCI_PCI=y
CONFIG_USB_XHCI_PCI_RENESAS=m
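The check described above can also be scripted. The sketch below is illustrative only: `read_kconfig` is a hypothetical helper, and the sample text is limited to the two options quoted in this message. It parses `.config`-style text and reports whether each xHCI option is built in (`y`), built as a module (`m`), or disabled.

```python
def read_kconfig(text):
    """Parse kernel .config text into a {option: value} dictionary."""
    options = {}
    suffix = " is not set"
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            key, value = line.split("=", 1)
            options[key] = value
        elif line.startswith("# CONFIG_") and line.endswith(suffix):
            # Disabled options appear as "# CONFIG_FOO is not set".
            options[line[2:-len(suffix)]] = "n"
    return options


# The two options reported in this message.
SAMPLE = """\
CONFIG_USB_XHCI_PCI=y
CONFIG_USB_XHCI_PCI_RENESAS=m
"""

if __name__ == "__main__":
    config = read_kconfig(SAMPLE)
    for name, value in sorted(config.items()):
        if name.startswith("CONFIG_USB_XHCI"):
            print(name, value)
```

With the values reported here, the xHCI PCI driver is built into the kernel while the Renesas support is a loadable module.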
lspci as requested:
00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD]
Starship/Matisse Root Complex [1022:1480]
00:00.2 IOMMU [0806]: Advanced Micro Devices, Inc. [AMD]
Starship/Matisse IOMMU [1022:1481]
00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD]
Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD]
Starship/Matisse GPP Bridge [1022:1483]
00:01.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD]
Starship/Matisse GPP Bridge [1022:1483]
00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD]
Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD]
Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD]
Starship/Matisse GPP Bridge [1022:1483]
00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD]
Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
00:05.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD]
Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD]
Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD]
Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B] [1022:1484]
00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD]
Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD]
Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B] [1022:1484]
00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus
Controller [1022:790b] (rev 61)
00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC
Bridge [1022:790e] (rev 51)
00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD]
Matisse/Vermeer Data Fabric: Device 18h; Function 0 [1022:1440]
00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD]
Matisse/Vermeer Data Fabric: Device 18h; Function 1 [1022:1441]
00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD]
Matisse/Vermeer Data Fabric: Device 18h; Function 2 [1022:1442]
00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD]
Matisse/Vermeer Data Fabric: Device 18h; Function 3 [1022:1443]
00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD]
Matisse/Vermeer Data Fabric: Device 18h; Function 4 [1022:1444]
00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD]
Matisse/Vermeer Data Fabric: Device 18h; Function 5 [1022:1445]
00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD]
Matisse/Vermeer Data Fabric: Device 18h; Function 6 [1022:1446]
00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD]
Matisse/Vermeer Data Fabric: Device 18h; Function 7 [1022:1447]
01:00.0 Non-Volatile memory controller [0108]: Kingston Technology
Company, Inc. A2000 NVMe SSD [SM2263EN] [2646:2263] (rev 03)
02:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 500
Series Chipset USB 3.1 XHCI Controller [1022:43ee]
02:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 500
Series Chipset SATA Controller [1022:43eb]
02:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 500
Series Chipset Switch Upstream Port [1022:43e9]
03:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43ea]
03:09.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43ea]
04:00.0 Ethernet controller [0200]: Intel Corporation 82599ES
10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
2a:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd.
RTL8125 2.5GbE Controller [10ec:8125] (rev 05)
2b:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU106
[GeForce RTX 2070 Rev. A] [10de:1f07] (rev a1)
2b:00.1 Audio device [0403]: NVIDIA Corporation TU106 High Definition
Audio Controller [10de:10f9] (rev a1)
2b:00.2 USB controller [0c03]: NVIDIA Corporation TU106 USB 3.1 Host
Controller [10de:1ada] (rev a1)
2b:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU106 USB
Type-C UCSI Controller [10de:1adb] (rev a1)
2c:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices,
Inc. [AMD] Starship/Matisse PCIe Dummy Function [1022:148a]
2d:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices,
Inc. [AMD] Starship/Matisse Reserved SPP [1022:1485]
2d:00.1 Encryption controller [1080]: Advanced Micro Devices, Inc.
[AMD] Starship/Matisse Cryptographic Coprocessor PSPCPP [1022:1486]
2d:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD]
Matisse USB 3.0 Host Controller [1022:149c]
2d:00.4 Audio device [0403]: Advanced Micro Devices, Inc. [AMD]
Starship/Matisse HD Audio Controller [1022:1487]
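To answer the Renesas question from a listing like the one above, the USB controllers can be filtered out by PCI class code `[0c03]`. This is a minimal sketch under stated assumptions: the function names are hypothetical, the sample holds only four entries copied from the listing, and `1912` is assumed to be the Renesas PCI vendor ID. Since the mail client wrapped long entries, continuation lines are first joined back onto the entry they belong to.

```python
import re

# Four entries copied from the listing, wrapped as they appear in the mail.
SAMPLE = """\
02:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 500
Series Chipset USB 3.1 XHCI Controller [1022:43ee]
02:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 500
Series Chipset SATA Controller [1022:43eb]
2b:00.2 USB controller [0c03]: NVIDIA Corporation TU106 USB 3.1 Host
Controller [10de:1ada] (rev a1)
2d:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD]
Matisse USB 3.0 Host Controller [1022:149c]
"""

BDF_RE = re.compile(r"^[0-9a-f]{2}:[0-9a-f]{2}\.[0-7] ")  # bus:device.function
ID_RE = re.compile(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]")     # [vendor:device]
USB_CLASS = "[0c03]"
RENESAS_VENDOR = "1912"  # assumption: Renesas PCI vendor ID


def unwrap(text):
    """Join wrapped continuation lines onto the entry that started them."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if BDF_RE.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def usb_controllers(text):
    """Return (address, vendor_id, device_id) for each USB controller."""
    found = []
    for record in unwrap(text):
        if USB_CLASS in record:
            match = ID_RE.search(record)
            if match:
                address = record.split(" ", 1)[0]
                found.append((address, match.group(1), match.group(2)))
    return found


if __name__ == "__main__":
    controllers = usb_controllers(SAMPLE)
    for address, vendor, device in controllers:
        print(address, vendor, device)
    print("Renesas present:", any(v == RENESAS_VENDOR for _, v, _ in controllers))
```

The three USB controllers in the listing carry AMD (`1022`) and NVIDIA (`10de`) vendor IDs; none is described as a Renesas device.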
Thanks for your time
* Re: [REGRESSION] Long boot times due to USB enumeration
From: Seïfane Idouchach @ 2025-03-08 9:06 UTC
To: Greg KH; +Cc: dirk.behme, rafael, dakr, linux-kernel, regressions, stable
A development here:
I noticed today that while applying Dan's patch and reverting the
"bad" commit resolves the issue, it only does so on a reboot. The boot
is still slow on a cold boot.
As you said, this might very well be a mix of different issues. It is
my own fault for not reporting this regression earlier, thinking it
would be fixed.
As a sanity check I retested old LTS releases. I find that v6.1 does
not have the issue on cold boot while v6.6 does. The USB error
messages are there regardless; they just don't slow down the boot
process.
I am about 90% sure that those error messages have always been
present on this system, for what it's worth.
I have gone through the troubleshooting step of unplugging all USB
devices and headers, and the errors are still present.
If I find the time I might run another bisect between v6.1 and v6.6,
doing cold boots instead of reboots, and report back. I am just afraid
I will end up back at the initially reported commit, since this is
what I first did.
Thank you.