* Re: Regression in 5.19.0: USB errors during boot
[not found] <25342.20092.262450.330346@wylie.me.uk>
@ 2022-08-18 14:47 ` Greg Kroah-Hartman
2022-08-18 14:56 ` Alan J. Wylie
2022-08-21 6:23 ` Christoph Hellwig
0 siblings, 2 replies; 7+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-18 14:47 UTC (permalink / raw)
To: Alan J. Wylie, linux-usb
Cc: Christoph Hellwig, Linus Torvalds, linux-kernel, stable
[Adding in linux-usb@vger]
On Thu, Aug 18, 2022 at 03:36:44PM +0100, Alan J. Wylie wrote:
>
> Apologies for the delay in reporting this: I messed up my first attempt at
> bisecting, then I've spent a week going to, enjoying, returning from and
> recovering from a music festival.
>
> Up to and including 5.18.18 things are fine. With 5.19.0 (and .1 and .2) I see
> lots of errors and hangs on the USB2 chipset, e.g.
>
> $ grep "usb 9-4" dmesg.5.19.2
> [ 6.669075] usb 9-4: new full-speed USB device number 2 using ohci-pci
> [ 6.829087] usb 9-4: device descriptor read/64, error -32
> [ 7.097094] usb 9-4: device descriptor read/64, error -32
> [ 7.361087] usb 9-4: new full-speed USB device number 3 using ohci-pci
> [ 7.521152] usb 9-4: device descriptor read/64, error -32
> [ 7.789066] usb 9-4: device descriptor read/64, error -32
> [ 8.081070] usb 9-4: new full-speed USB device number 4 using ohci-pci
> [ 8.497138] usb 9-4: device not accepting address 4, error -32
> [ 8.653140] usb 9-4: new full-speed USB device number 5 using ohci-pci
> [ 9.069141] usb 9-4: device not accepting address 5, error -32
> $
>
> $ grep "usb 1-2" dmesg.5.19.2
> [ 5.917102] usb 1-2: new high-speed USB device number 2 using ehci-pci
> [ 6.277076] usb 1-2: device descriptor read/64, error -71
> [ 6.513143] usb 1-2: device descriptor read/64, error -32
> [ 6.753146] usb 1-2: new high-speed USB device number 3 using ehci-pci
> [ 6.881143] usb 1-2: device descriptor read/64, error -32
> [ 7.117144] usb 1-2: device descriptor read/64, error -32
> [ 7.429141] usb 1-2: new high-speed USB device number 4 using ehci-pci
> [ 7.845134] usb 1-2: device not accepting address 4, error -32
> [ 7.977142] usb 1-2: new high-speed USB device number 5 using ehci-pci
> [ 8.393158] usb 1-2: device not accepting address 5, error -32
> $
>
> the USB port is then no longer usable
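
For readers decoding the log above: on Linux, error -32 is -EPIPE and -71 is -EPROTO. In the USB stack these are usually reported for a stalled endpoint and a low-level protocol failure respectively. Below is a minimal Python sketch that tallies such errors per port from saved dmesg text. The names `ERRNO_NAMES`, `ERROR_LINE` and `summarize_usb_errors` are illustrative only, and the errno table is hard-coded as an assumption rather than read from the running system.

```python
import re
from collections import Counter

# Linux errno numbers that appear in the log; hard-coded (an assumption of
# this sketch) so the result does not depend on the host platform.
ERRNO_NAMES = {32: "EPIPE", 71: "EPROTO"}

# Matches e.g. "usb 1-2: device descriptor read/64, error -71".
ERROR_LINE = re.compile(r"usb ([\d.-]+): .*error -(\d+)")


def summarize_usb_errors(log_text):
    """Count enumeration errors per (port, errno name) in dmesg text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ERROR_LINE.search(line)
        if match:
            port, code = match.group(1), int(match.group(2))
            # Unknown codes are kept as their number in string form.
            counts[(port, ERRNO_NAMES.get(code, str(code)))] += 1
    return counts
```

Feeding it the contents of a file such as dmesg.5.19.2 gives one counter entry per port and error name, which makes it easy to compare a good and a bad kernel.
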
>
> This is not reproducible on the other chipset (USB3) on this machine,
> nor on two other systems. Swapping USB cables doesn't help.
>
> I have bisected it to
>
> $ git bisect bad
> 78013eaadf696d2105982abb4018fbae394ca08f is the first bad commit
> commit 78013eaadf696d2105982abb4018fbae394ca08f
> Author: Christoph Hellwig <hch@lst.de>
> Date: Mon Feb 14 14:11:44 2022 +0100
>
> x86: remove the IOMMU table infrastructure
>
> however it will not easily revert
>
> I'll be more than happy to assist with any debugging/testing.
>
> $ git revert 78013eaadf696d2105982abb4018fbae394ca08f
> Auto-merging arch/x86/include/asm/dma-mapping.h
> CONFLICT (content): Merge conflict in arch/x86/include/asm/dma-mapping.h
> Auto-merging arch/x86/include/asm/iommu.h
> Auto-merging arch/x86/include/asm/xen/swiotlb-xen.h
> Auto-merging arch/x86/kernel/Makefile
> Auto-merging arch/x86/kernel/pci-dma.c
> CONFLICT (content): Merge conflict in arch/x86/kernel/pci-dma.c
> Auto-merging arch/x86/kernel/vmlinux.lds.S
> Auto-merging drivers/iommu/amd/init.c
> Auto-merging drivers/iommu/amd/iommu.c
> CONFLICT (content): Merge conflict in drivers/iommu/amd/iommu.c
> Auto-merging drivers/iommu/intel/dmar.c
> error: could not revert 78013eaadf69... x86: remove the IOMMU table infrastructure
>
> # dmidecode | grep -A2 "^Base Board"
> Base Board Information
> Manufacturer: Gigabyte Technology Co., Ltd.
> Product Name: 970A-DS3P
> #
>
> # lspci -nn | grep -i usb
> 00:12.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
> 00:12.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
> 00:13.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
> 00:13.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
> 00:14.5 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller [1002:4399]
> 00:16.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
> 00:16.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
> 02:00.0 USB controller [0c03]: VIA Technologies, Inc. VL805/806 xHCI USB 3.0 Controller [1106:3483] (rev 01)
So this only happens with the on-board USB 2 controller?
This is odd, I would not expect one PCI controller to work, but the
other one not.
> #
>
> # lspci -v -s 00:12
> 00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller (prog-if 10 [OHCI])
> Subsystem: Gigabyte Technology Co., Ltd GA-880GMA-USB3
> Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 18
> Memory at fe50a000 (32-bit, non-prefetchable) [size=4K]
> Kernel driver in use: ohci-pci
> Kernel modules: ohci_pci
> 00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller (prog-if 20 [EHCI])
> Subsystem: Gigabyte Technology Co., Ltd GA-880GMA-USB3
> Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 17
> Memory at fe509000 (32-bit, non-prefetchable) [size=256]
> Capabilities: [c0] Power Management version 2
> Capabilities: [e4] Debug port: BAR=1 offset=00e0
> Kernel driver in use: ehci-pci
> Kernel modules: ehci_pci
> #
What is the output of the lspci -v for the USB 3 controller?
Christoph, any ideas?
thanks,
greg k-h
* Re: Regression in 5.19.0: USB errors during boot
2022-08-18 14:47 ` Regression in 5.19.0: USB errors during boot Greg Kroah-Hartman
@ 2022-08-18 14:56 ` Alan J. Wylie
2022-08-21 6:23 ` Christoph Hellwig
1 sibling, 0 replies; 7+ messages in thread
From: Alan J. Wylie @ 2022-08-18 14:56 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: linux-usb, Christoph Hellwig, Linus Torvalds, linux-kernel,
stable
at 16:47 on Thu 18-Aug-2022 Greg Kroah-Hartman (gregkh@linuxfoundation.org) wrote:
> [Adding in linux-usb@vger]
>
>
> On Thu, Aug 18, 2022 at 03:36:44PM +0100, Alan J. Wylie wrote:
> >
> > Apologies for the delay in reporting this: I messed up my first attempt at
> > bisecting, then I've spent a week going to, enjoying, returning from and
> > recovering from a music festival.
> >
> > Up to and including 5.18.18 things are fine. With 5.19.0 (and .1 and .2) I see
> > lots of errors and hangs on the USB2 chipset, e.g.
> >
> > $ grep "usb 9-4" dmesg.5.19.2
> > [ 6.669075] usb 9-4: new full-speed USB device number 2 using ohci-pci
> > [ 6.829087] usb 9-4: device descriptor read/64, error -32
> > [ 7.097094] usb 9-4: device descriptor read/64, error -32
> > [ 7.361087] usb 9-4: new full-speed USB device number 3 using ohci-pci
> > [ 7.521152] usb 9-4: device descriptor read/64, error -32
> > [ 7.789066] usb 9-4: device descriptor read/64, error -32
> > [ 8.081070] usb 9-4: new full-speed USB device number 4 using ohci-pci
> > [ 8.497138] usb 9-4: device not accepting address 4, error -32
> > [ 8.653140] usb 9-4: new full-speed USB device number 5 using ohci-pci
> > [ 9.069141] usb 9-4: device not accepting address 5, error -32
> > $
> >
> > $ grep "usb 1-2" dmesg.5.19.2
> > [ 5.917102] usb 1-2: new high-speed USB device number 2 using ehci-pci
> > [ 6.277076] usb 1-2: device descriptor read/64, error -71
> > [ 6.513143] usb 1-2: device descriptor read/64, error -32
> > [ 6.753146] usb 1-2: new high-speed USB device number 3 using ehci-pci
> > [ 6.881143] usb 1-2: device descriptor read/64, error -32
> > [ 7.117144] usb 1-2: device descriptor read/64, error -32
> > [ 7.429141] usb 1-2: new high-speed USB device number 4 using ehci-pci
> > [ 7.845134] usb 1-2: device not accepting address 4, error -32
> > [ 7.977142] usb 1-2: new high-speed USB device number 5 using ehci-pci
> > [ 8.393158] usb 1-2: device not accepting address 5, error -32
> > $
> >
> > the USB port is then no longer usable
> >
> > This is not reproducible on the other chipset (USB3) on this machine,
> > nor on two other systems. Swapping USB cables doesn't help.
> >
> > I have bisected it to
> >
> > $ git bisect bad
> > 78013eaadf696d2105982abb4018fbae394ca08f is the first bad commit
> > commit 78013eaadf696d2105982abb4018fbae394ca08f
> > Author: Christoph Hellwig <hch@lst.de>
> > Date: Mon Feb 14 14:11:44 2022 +0100
> >
> > x86: remove the IOMMU table infrastructure
> >
> > however it will not easily revert
> >
> > I'll be more than happy to assist with any debugging/testing.
> >
> > $ git revert 78013eaadf696d2105982abb4018fbae394ca08f
> > Auto-merging arch/x86/include/asm/dma-mapping.h
> > CONFLICT (content): Merge conflict in arch/x86/include/asm/dma-mapping.h
> > Auto-merging arch/x86/include/asm/iommu.h
> > Auto-merging arch/x86/include/asm/xen/swiotlb-xen.h
> > Auto-merging arch/x86/kernel/Makefile
> > Auto-merging arch/x86/kernel/pci-dma.c
> > CONFLICT (content): Merge conflict in arch/x86/kernel/pci-dma.c
> > Auto-merging arch/x86/kernel/vmlinux.lds.S
> > Auto-merging drivers/iommu/amd/init.c
> > Auto-merging drivers/iommu/amd/iommu.c
> > CONFLICT (content): Merge conflict in drivers/iommu/amd/iommu.c
> > Auto-merging drivers/iommu/intel/dmar.c
> > error: could not revert 78013eaadf69... x86: remove the IOMMU table infrastructure
> >
> > # dmidecode | grep -A2 "^Base Board"
> > Base Board Information
> > Manufacturer: Gigabyte Technology Co., Ltd.
> > Product Name: 970A-DS3P
> > #
> >
> > # lspci -nn | grep -i usb
> > 00:12.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
> > 00:12.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
> > 00:13.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
> > 00:13.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
> > 00:14.5 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller [1002:4399]
> > 00:16.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
> > 00:16.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
> > 02:00.0 USB controller [0c03]: VIA Technologies, Inc. VL805/806 xHCI USB 3.0 Controller [1106:3483] (rev 01)
>
> So this only happens with the on-board USB 2 controller?
That is correct
> This is odd, I would not expect one PCI controller to work, but the
> other one not.
>
>
> > #
> >
> > # lspci -v -s 00:12
> > 00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller (prog-if 10 [OHCI])
> > Subsystem: Gigabyte Technology Co., Ltd GA-880GMA-USB3
> > Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 18
> > Memory at fe50a000 (32-bit, non-prefetchable) [size=4K]
> > Kernel driver in use: ohci-pci
> > Kernel modules: ohci_pci
> > 00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller (prog-if 20 [EHCI])
> > Subsystem: Gigabyte Technology Co., Ltd GA-880GMA-USB3
> > Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 17
> > Memory at fe509000 (32-bit, non-prefetchable) [size=256]
> > Capabilities: [c0] Power Management version 2
> > Capabilities: [e4] Debug port: BAR=1 offset=00e0
> > Kernel driver in use: ehci-pci
> > Kernel modules: ehci_pci
> > #
>
> What is the output of the lspci -v for the USB 3 controller?
# lspci -v -s 02:00
02:00.0 USB controller: VIA Technologies, Inc. VL805/806 xHCI USB 3.0 Controller (rev 01) (prog-if 30 [XHCI])
Subsystem: Gigabyte Technology Co., Ltd VL805/806 xHCI USB 3.0 Controller
Flags: bus master, fast devsel, latency 0, IRQ 36
Memory at fe400000 (64-bit, non-prefetchable) [size=4K]
Capabilities: [80] Power Management version 3
Capabilities: [90] MSI: Enable+ Count=1/4 Maskable- 64bit+
Capabilities: [c4] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
>
> Christoph, any ideas?
>
> thanks,
>
> greg k-h
--
Alan J. Wylie https://www.wylie.me.uk/
Dance like no-one's watching. / Encrypt like everyone is.
Security is inversely proportional to convenience
* Re: Regression in 5.19.0: USB errors during boot
2022-08-18 14:47 ` Regression in 5.19.0: USB errors during boot Greg Kroah-Hartman
2022-08-18 14:56 ` Alan J. Wylie
@ 2022-08-21 6:23 ` Christoph Hellwig
2022-08-21 8:21 ` Alan J. Wylie
1 sibling, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2022-08-21 6:23 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Alan J. Wylie, linux-usb, Christoph Hellwig, Linus Torvalds,
linux-kernel, stable
On Thu, Aug 18, 2022 at 04:47:14PM +0200, Greg Kroah-Hartman wrote:
> What is the output of the lspci -v for the USB 3 controller?
>
> Christoph, any ideas?
Well, with that commit it must be related to dma ops selection.
As this appears to be an AMD system the options here are direct,
amd_iommu and possibly amd_gart as the odd one in the mix.
Alan, can you send me your .config?
* Re: Regression in 5.19.0: USB errors during boot
2022-08-21 6:23 ` Christoph Hellwig
@ 2022-08-21 8:21 ` Alan J. Wylie
2022-08-21 14:26 ` Christoph Hellwig
0 siblings, 1 reply; 7+ messages in thread
From: Alan J. Wylie @ 2022-08-21 8:21 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Greg Kroah-Hartman, linux-usb, Linus Torvalds, linux-kernel,
stable
at 08:23 on Sun 21-Aug-2022 Christoph Hellwig (hch@lst.de) wrote:
> On Thu, Aug 18, 2022 at 04:47:14PM +0200, Greg Kroah-Hartman wrote:
> > What is the output of the lspci -v for the USB 3 controller?
> >
> > Christoph, any ideas?
>
> Well, with that commit it must be related to dma ops selection.
> As this appears to be an AMD system the options here are direct,
> amd_iommu and possibly amd_gart as the odd one in the mix.
>
> Alan, can you send me your .config?
I hope that with the following information there is no need for me to
do so.
It is indeed an old AMD CPU
Model name: AMD FX(tm)-4300 Quad-Core Processor
CPU family: 21
Model: 2
Comparing with another AMD system that doesn't show the problem,
I see that CONFIG_GART_IOMMU is only set on the one with the problem.
The configs have just had "make oldconfig" run on them for years, I
have no idea why one has it set.
Clearing it fixes the problem!
Thanks for the hint, although there is still a wider issue with this
regression.
$ diff .config.old .config
353c353
< CONFIG_GART_IOMMU=y
---
> # CONFIG_GART_IOMMU is not set
4683d4682
< CONFIG_IOMMU_HELPER=y
4987d4985
< # CONFIG_IOMMU_DEBUG is not set
$
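
The comparison above can also be done symbol by symbol rather than line by line, which avoids noise from options that merely moved. The kernel tree ships scripts/diffconfig for this. The following is only a minimal Python sketch of the same idea, with hypothetical helper names `parse_config` and `diff_configs`, treating "is not set" as the value "n".

```python
import re

# "CONFIG_FOO=value" and "# CONFIG_FOO is not set" are the two forms
# a symbol takes in a kernel .config file.
SET_LINE = re.compile(r"^(CONFIG_\w+)=(.*)$")
UNSET_LINE = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return {symbol: value}; symbols marked 'is not set' map to 'n'."""
    symbols = {}
    for line in text.splitlines():
        line = line.strip()
        match = SET_LINE.match(line)
        if match:
            symbols[match.group(1)] = match.group(2)
            continue
        match = UNSET_LINE.match(line)
        if match:
            symbols[match.group(1)] = "n"
    return symbols


def diff_configs(old, new):
    """Return sorted (symbol, old_value, new_value) for differing symbols.

    A value of None means the symbol is absent from that config.
    """
    changes = []
    for symbol in sorted(set(old) | set(new)):
        before, after = old.get(symbol), new.get(symbol)
        if before != after:
            changes.append((symbol, before, after))
    return changes
```

Reading .config.old and .config into `parse_config` and passing both results to `diff_configs` would list CONFIG_GART_IOMMU going from "y" to "n", along with the dependent symbols that disappeared.
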
--
Alan J. Wylie https://www.wylie.me.uk/
Dance like no-one's watching. / Encrypt like everyone is.
Security is inversely proportional to convenience
* Re: Regression in 5.19.0: USB errors during boot
2022-08-21 8:21 ` Alan J. Wylie
@ 2022-08-21 14:26 ` Christoph Hellwig
2022-08-21 16:50 ` Alan J. Wylie
0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2022-08-21 14:26 UTC (permalink / raw)
To: Alan J. Wylie
Cc: Christoph Hellwig, Greg Kroah-Hartman, linux-usb, Linus Torvalds,
linux-kernel, stable
On Sun, Aug 21, 2022 at 09:21:22AM +0100, Alan J. Wylie wrote:
> Comparing with another AMD system that doesn't show the problem,
> I see that CONFIG_GART_IOMMU is only set on the one with the problem.
>
> The configs have just had "make oldconfig" run on them for years, I
> have no idea why one has it set.
>
> Clearing it fixes the problem!
Thanks for confirming my suspicion. I'd still like to fix the issue
with CONFIG_GART_IOMMU enabled once I've tracked it down. Would you
be willing to test patches?
* Re: Regression in 5.19.0: USB errors during boot
2022-08-21 14:26 ` Christoph Hellwig
@ 2022-08-21 16:50 ` Alan J. Wylie
2022-09-08 11:18 ` Regression in 5.19.0: USB errors during boot #forregzbot Thorsten Leemhuis
0 siblings, 1 reply; 7+ messages in thread
From: Alan J. Wylie @ 2022-08-21 16:50 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Greg Kroah-Hartman, linux-usb, Linus Torvalds, linux-kernel,
stable
at 16:26 on Sun 21-Aug-2022 Christoph Hellwig (hch@lst.de) wrote:
> Thanks for confirming my suspicion. I'd still like to fix the issue
> with CONFIG_GART_IOMMU enabled once I've tracked it down. Would you
> be willing to test patches?
I'll be glad to help.
I've also had a look in the loft and my box of bits for an old
Athlon64/Opteron/Turion/Sempron processor, but I'm afraid all I've got
are:
Phenom II X6 1055T
Phenom II X2 545
Athlon 2 x2 270
--
Alan J. Wylie https://www.wylie.me.uk/
Dance like no-one's watching. / Encrypt like everyone is.
Security is inversely proportional to convenience
* Re: Regression in 5.19.0: USB errors during boot #forregzbot
2022-08-21 16:50 ` Alan J. Wylie
@ 2022-09-08 11:18 ` Thorsten Leemhuis
0 siblings, 0 replies; 7+ messages in thread
From: Thorsten Leemhuis @ 2022-09-08 11:18 UTC (permalink / raw)
To: regressions@lists.linux.dev; +Cc: linux-usb, linux-kernel, stable
TWIMC: this mail is primarily sent for documentation purposes and for
regzbot, my Linux kernel regression tracking bot. These mails usually
contain '#forregzbot' in the subject, to make them easy to spot and filter.
On 21.08.22 18:50, Alan J. Wylie wrote:
> at 16:26 on Sun 21-Aug-2022 Christoph Hellwig (hch@lst.de) wrote:
>
>> Thanks for confirming my suspicion. I'd still like to fix the issue
>> with CONFIG_GART_IOMMU enabled once I've tracked it down. Would you
>> be willing to test patches?
>
> I'll be glad to help.
>
> I've also had a look in the loft and my box of bits for an old
> Athlon64/Opteron/Turion/Sempron processor, but I'm afraid all I've got
> are:
>
> Phenom II X6 1055T
> Phenom II X2 545
> Athlon 2 x2 270
#regzbot backburner: unusual config, workaround found, devs still want
to fix it, but apparently not urgent
#regzbot ignore-activity
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply; it's in everyone's interest to set the public record straight.