Linux USB
* Re: Regression in 5.19.0: USB errors during boot
       [not found] <25342.20092.262450.330346@wylie.me.uk>
@ 2022-08-18 14:47 ` Greg Kroah-Hartman
  2022-08-18 14:56   ` Alan J. Wylie
  2022-08-21  6:23   ` Christoph Hellwig
  0 siblings, 2 replies; 7+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-18 14:47 UTC
  To: Alan J. Wylie, linux-usb
  Cc: Christoph Hellwig, Linus Torvalds, linux-kernel, stable

[Adding in linux-usb@vger]


On Thu, Aug 18, 2022 at 03:36:44PM +0100, Alan J. Wylie wrote:
> 
> Apologies for the delay in reporting this: I messed up my first attempt at
> bisecting, then I've spent a week going to, enjoying, returning from and
> recovering from a music festival.
> 
> Up to and including 5.18.18 things are fine. With 5.19.0 (and .1 and .2)  I see
> lots of errors and hangs on the USB2 chipset, e.g.
> 
> $ grep "usb 9-4" dmesg.5.19.2
> [    6.669075] usb 9-4: new full-speed USB device number 2 using ohci-pci
> [    6.829087] usb 9-4: device descriptor read/64, error -32
> [    7.097094] usb 9-4: device descriptor read/64, error -32
> [    7.361087] usb 9-4: new full-speed USB device number 3 using ohci-pci
> [    7.521152] usb 9-4: device descriptor read/64, error -32
> [    7.789066] usb 9-4: device descriptor read/64, error -32
> [    8.081070] usb 9-4: new full-speed USB device number 4 using ohci-pci
> [    8.497138] usb 9-4: device not accepting address 4, error -32
> [    8.653140] usb 9-4: new full-speed USB device number 5 using ohci-pci
> [    9.069141] usb 9-4: device not accepting address 5, error -32
> $
> 
> $ grep "usb 1-2" dmesg.5.19.2
> [    5.917102] usb 1-2: new high-speed USB device number 2 using ehci-pci
> [    6.277076] usb 1-2: device descriptor read/64, error -71
> [    6.513143] usb 1-2: device descriptor read/64, error -32
> [    6.753146] usb 1-2: new high-speed USB device number 3 using ehci-pci
> [    6.881143] usb 1-2: device descriptor read/64, error -32
> [    7.117144] usb 1-2: device descriptor read/64, error -32
> [    7.429141] usb 1-2: new high-speed USB device number 4 using ehci-pci
> [    7.845134] usb 1-2: device not accepting address 4, error -32
> [    7.977142] usb 1-2: new high-speed USB device number 5 using ehci-pci
> [    8.393158] usb 1-2: device not accepting address 5, error -32
> $
> 
> the USB port is then no longer usable
> 
> This is not reproducible on the other chipset (USB3) on this machine,
> nor on two other systems. Swapping USB cables doesn't help.
> 
> I have bisected it to
> 
> $ git bisect bad
> 78013eaadf696d2105982abb4018fbae394ca08f is the first bad commit
> commit 78013eaadf696d2105982abb4018fbae394ca08f
> Author: Christoph Hellwig <hch@lst.de>
> Date:   Mon Feb 14 14:11:44 2022 +0100
> 
>     x86: remove the IOMMU table infrastructure
> 
> however it will not easily revert
> 
> I'll be more than happy to assist with any debugging/testing.
> 
> $ git revert 78013eaadf696d2105982abb4018fbae394ca08f
> Auto-merging arch/x86/include/asm/dma-mapping.h
> CONFLICT (content): Merge conflict in arch/x86/include/asm/dma-mapping.h
> Auto-merging arch/x86/include/asm/iommu.h
> Auto-merging arch/x86/include/asm/xen/swiotlb-xen.h
> Auto-merging arch/x86/kernel/Makefile
> Auto-merging arch/x86/kernel/pci-dma.c
> CONFLICT (content): Merge conflict in arch/x86/kernel/pci-dma.c
> Auto-merging arch/x86/kernel/vmlinux.lds.S
> Auto-merging drivers/iommu/amd/init.c
> Auto-merging drivers/iommu/amd/iommu.c
> CONFLICT (content): Merge conflict in drivers/iommu/amd/iommu.c
> Auto-merging drivers/iommu/intel/dmar.c
> error: could not revert 78013eaadf69... x86: remove the IOMMU table infrastructure
> 
> # dmidecode  | grep -A2 "^Base Board"
> Base Board Information
>      Manufacturer: Gigabyte Technology Co., Ltd.
>      Product Name: 970A-DS3P
> #
> 
> # lspci -nn | grep -i usb
> 00:12.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
> 00:12.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
> 00:13.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
> 00:13.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
> 00:14.5 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller [1002:4399]
> 00:16.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
> 00:16.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
> 02:00.0 USB controller [0c03]: VIA Technologies, Inc. VL805/806 xHCI USB 3.0 Controller [1106:3483] (rev 01)

So this only happens with the on-board USB 2 controller?

This is odd, I would not expect one PCI controller to work, but the
other one not.


> #
> 
> # lspci -v -s 00:12
> 00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller (prog-if 10 [OHCI])
> 	Subsystem: Gigabyte Technology Co., Ltd GA-880GMA-USB3
> 	Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 18
> 	Memory at fe50a000 (32-bit, non-prefetchable) [size=4K]
> 	Kernel driver in use: ohci-pci
> 	Kernel modules: ohci_pci
> 00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller (prog-if 20 [EHCI])
> 	Subsystem: Gigabyte Technology Co., Ltd GA-880GMA-USB3
> 	Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 17
> 	Memory at fe509000 (32-bit, non-prefetchable) [size=256]
> 	Capabilities: [c0] Power Management version 2
> 	Capabilities: [e4] Debug port: BAR=1 offset=00e0
> 	Kernel driver in use: ehci-pci
> 	Kernel modules: ehci_pci
> #

What is the output of the lspci -v for the USB 3 controller?

Christoph, any ideas?

thanks,

greg k-h
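
For reference, the negative numbers in the dmesg excerpts quoted above are
Linux errno values. Below is a minimal sketch for decoding them, assuming the
Linux numbering (32 is EPIPE, 71 is EPROTO). As far as I know, the kernel's
USB error-code documentation describes -EPIPE as a stalled endpoint and
-EPROTO as a low-level protocol error such as a missing response. The regular
expression and the name decode_usb_error are illustrative only and not part of
any kernel tool.

```python
import re

# Linux errno numbers seen in the log (assumed Linux values; other
# operating systems may number these differently).
LINUX_ERRNO = {32: "EPIPE", 71: "EPROTO"}

# Matches lines such as:
# "[    6.277076] usb 1-2: device descriptor read/64, error -71"
ERROR_RE = re.compile(r"usb (\S+): (.*), error -(\d+)")


def decode_usb_error(line):
    """Return (port, action, errno_name) for an error line, else None."""
    match = ERROR_RE.search(line)
    if match is None:
        return None
    port, action, number = match.group(1), match.group(2), int(match.group(3))
    # Fall back to the raw number for codes not in the small table above.
    return port, action, LINUX_ERRNO.get(number, f"errno {number}")
```

Feeding each line of the dmesg output through decode_usb_error() skips the
"new ... USB device number" lines and labels only the failures.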


* Re: Regression in 5.19.0: USB errors during boot
  2022-08-18 14:47 ` Regression in 5.19.0: USB errors during boot Greg Kroah-Hartman
@ 2022-08-18 14:56   ` Alan J. Wylie
  2022-08-21  6:23   ` Christoph Hellwig
  1 sibling, 0 replies; 7+ messages in thread
From: Alan J. Wylie @ 2022-08-18 14:56 UTC
  To: Greg Kroah-Hartman
  Cc: linux-usb, Christoph Hellwig, Linus Torvalds, linux-kernel,
	stable

at 16:47 on Thu 18-Aug-2022 Greg Kroah-Hartman (gregkh@linuxfoundation.org) wrote:
> [Adding in linux-usb@vger]
> 
> 
> On Thu, Aug 18, 2022 at 03:36:44PM +0100, Alan J. Wylie wrote:
> > 
> > Apologies for the delay in reporting this: I messed up my first attempt at
> > bisecting, then I've spent a week going to, enjoying, returning from and
> > recovering from a music festival.
> > 
> > Up to and including 5.18.18 things are fine. With 5.19.0 (and .1 and .2)  I see
> > lots of errors and hangs on the USB2 chipset, e.g.
> > 
> > $ grep "usb 9-4" dmesg.5.19.2
> > [    6.669075] usb 9-4: new full-speed USB device number 2 using ohci-pci
> > [    6.829087] usb 9-4: device descriptor read/64, error -32
> > [    7.097094] usb 9-4: device descriptor read/64, error -32
> > [    7.361087] usb 9-4: new full-speed USB device number 3 using ohci-pci
> > [    7.521152] usb 9-4: device descriptor read/64, error -32
> > [    7.789066] usb 9-4: device descriptor read/64, error -32
> > [    8.081070] usb 9-4: new full-speed USB device number 4 using ohci-pci
> > [    8.497138] usb 9-4: device not accepting address 4, error -32
> > [    8.653140] usb 9-4: new full-speed USB device number 5 using ohci-pci
> > [    9.069141] usb 9-4: device not accepting address 5, error -32
> > $
> > 
> > $ grep "usb 1-2" dmesg.5.19.2
> > [    5.917102] usb 1-2: new high-speed USB device number 2 using ehci-pci
> > [    6.277076] usb 1-2: device descriptor read/64, error -71
> > [    6.513143] usb 1-2: device descriptor read/64, error -32
> > [    6.753146] usb 1-2: new high-speed USB device number 3 using ehci-pci
> > [    6.881143] usb 1-2: device descriptor read/64, error -32
> > [    7.117144] usb 1-2: device descriptor read/64, error -32
> > [    7.429141] usb 1-2: new high-speed USB device number 4 using ehci-pci
> > [    7.845134] usb 1-2: device not accepting address 4, error -32
> > [    7.977142] usb 1-2: new high-speed USB device number 5 using ehci-pci
> > [    8.393158] usb 1-2: device not accepting address 5, error -32
> > $
> > 
> > the USB port is then no longer usable
> > 
> > This is not reproducible on the other chipset (USB3) on this machine,
> > nor on two other systems. Swapping USB cables doesn't help.
> > 
> > I have bisected it to
> > 
> > $ git bisect bad
> > 78013eaadf696d2105982abb4018fbae394ca08f is the first bad commit
> > commit 78013eaadf696d2105982abb4018fbae394ca08f
> > Author: Christoph Hellwig <hch@lst.de>
> > Date:   Mon Feb 14 14:11:44 2022 +0100
> > 
> >     x86: remove the IOMMU table infrastructure
> > 
> > however it will not easily revert
> > 
> > I'll be more than happy to assist with any debugging/testing.
> > 
> > $ git revert 78013eaadf696d2105982abb4018fbae394ca08f
> > Auto-merging arch/x86/include/asm/dma-mapping.h
> > CONFLICT (content): Merge conflict in arch/x86/include/asm/dma-mapping.h
> > Auto-merging arch/x86/include/asm/iommu.h
> > Auto-merging arch/x86/include/asm/xen/swiotlb-xen.h
> > Auto-merging arch/x86/kernel/Makefile
> > Auto-merging arch/x86/kernel/pci-dma.c
> > CONFLICT (content): Merge conflict in arch/x86/kernel/pci-dma.c
> > Auto-merging arch/x86/kernel/vmlinux.lds.S
> > Auto-merging drivers/iommu/amd/init.c
> > Auto-merging drivers/iommu/amd/iommu.c
> > CONFLICT (content): Merge conflict in drivers/iommu/amd/iommu.c
> > Auto-merging drivers/iommu/intel/dmar.c
> > error: could not revert 78013eaadf69... x86: remove the IOMMU table infrastructure
> > 
> > # dmidecode  | grep -A2 "^Base Board"
> > Base Board Information
> >      Manufacturer: Gigabyte Technology Co., Ltd.
> >      Product Name: 970A-DS3P
> > #
> > 
> > # lspci -nn | grep -i usb
> > 00:12.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
> > 00:12.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
> > 00:13.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
> > 00:13.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
> > 00:14.5 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller [1002:4399]
> > 00:16.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
> > 00:16.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
> > 02:00.0 USB controller [0c03]: VIA Technologies, Inc. VL805/806 xHCI USB 3.0 Controller [1106:3483] (rev 01)
> 
> So this only happens with the on-board USB 2 controller?

That is correct

> This is odd, I would not expect one PCI controller to work, but the
> other one not.
> 
> 
> > #
> > 
> > # lspci -v -s 00:12
> > 00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller (prog-if 10 [OHCI])
> > 	Subsystem: Gigabyte Technology Co., Ltd GA-880GMA-USB3
> > 	Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 18
> > 	Memory at fe50a000 (32-bit, non-prefetchable) [size=4K]
> > 	Kernel driver in use: ohci-pci
> > 	Kernel modules: ohci_pci
> > 00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller (prog-if 20 [EHCI])
> > 	Subsystem: Gigabyte Technology Co., Ltd GA-880GMA-USB3
> > 	Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 17
> > 	Memory at fe509000 (32-bit, non-prefetchable) [size=256]
> > 	Capabilities: [c0] Power Management version 2
> > 	Capabilities: [e4] Debug port: BAR=1 offset=00e0
> > 	Kernel driver in use: ehci-pci
> > 	Kernel modules: ehci_pci
> > #
> 
> What is the output of the lspci -v for the USB 3 controller?

# lspci -v -s 02:00
02:00.0 USB controller: VIA Technologies, Inc. VL805/806 xHCI USB 3.0 Controller (rev 01) (prog-if 30 [XHCI])
	Subsystem: Gigabyte Technology Co., Ltd VL805/806 xHCI USB 3.0 Controller
	Flags: bus master, fast devsel, latency 0, IRQ 36
	Memory at fe400000 (64-bit, non-prefetchable) [size=4K]
	Capabilities: [80] Power Management version 3
	Capabilities: [90] MSI: Enable+ Count=1/4 Maskable- 64bit+
	Capabilities: [c4] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Kernel driver in use: xhci_hcd
	Kernel modules: xhci_pci

> 
> Christoph, any ideas?
> 
> thanks,
> 
> greg k-h

-- 
Alan J. Wylie                                          https://www.wylie.me.uk/

Dance like no-one's watching. / Encrypt like everyone is.
Security is inversely proportional to convenience


* Re: Regression in 5.19.0: USB errors during boot
  2022-08-18 14:47 ` Regression in 5.19.0: USB errors during boot Greg Kroah-Hartman
  2022-08-18 14:56   ` Alan J. Wylie
@ 2022-08-21  6:23   ` Christoph Hellwig
  2022-08-21  8:21     ` Alan J. Wylie
  1 sibling, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2022-08-21  6:23 UTC
  To: Greg Kroah-Hartman
  Cc: Alan J. Wylie, linux-usb, Christoph Hellwig, Linus Torvalds,
	linux-kernel, stable

On Thu, Aug 18, 2022 at 04:47:14PM +0200, Greg Kroah-Hartman wrote:
> What is the output of the lspci -v for the USB 3 controller?
> 
> Christoph, any ideas?

Well, with that commit it must be related to dma ops selection.
As this appears to be an AMD system the options here are direct,
amd_iommu and possibly amd_gart as the odd one in the mix.

Alan, can you send me your .config?


* Re: Regression in 5.19.0: USB errors during boot
  2022-08-21  6:23   ` Christoph Hellwig
@ 2022-08-21  8:21     ` Alan J. Wylie
  2022-08-21 14:26       ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Alan J. Wylie @ 2022-08-21  8:21 UTC
  To: Christoph Hellwig
  Cc: Greg Kroah-Hartman, linux-usb, Linus Torvalds, linux-kernel,
	stable

at 08:23 on Sun 21-Aug-2022 Christoph Hellwig (hch@lst.de) wrote:

> On Thu, Aug 18, 2022 at 04:47:14PM +0200, Greg Kroah-Hartman wrote:
> > What is the output of the lspci -v for the USB 3 controller?
> > 
> > Christoph, any ideas?
>
> Well, with that commit it must be related to dma ops selection.
> As this appears to be an AMD system the options here are direct,
> amd_iommu and possibly amd_gart as the odd one in the mix.
>
> Alan, can you send me your .config?

I hope that with the following information there is no need for me to
do so.

It is indeed an old AMD CPU
  Model name:          AMD FX(tm)-4300 Quad-Core Processor
  CPU family:          21
  Model:               2

Comparing with another AMD system that doesn't show the problem,
I see that CONFIG_GART_IOMMU is only set on the one with the problem.

The configs have just had "make oldconfig" run on them for years, I
have no idea why one has it set.

Clearing it fixes the problem!

Thanks for the hint, although there is still a wider issue with this
regression.

$ diff .config.old  .config
353c353
< CONFIG_GART_IOMMU=y
---
> # CONFIG_GART_IOMMU is not set
4683d4682
< CONFIG_IOMMU_HELPER=y
4987d4985
< # CONFIG_IOMMU_DEBUG is not set
$
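
The same comparison can be done on option names rather than line numbers,
which is handy when two machines' configs have drifted apart. This is a
minimal sketch, assuming the usual .config syntax ("CONFIG_FOO=value" and
"# CONFIG_FOO is not set"); parse_config and config_changes are hypothetical
helper names, not an existing kernel script.

```python
import re

SET_RE = re.compile(r"^(CONFIG_\w+)=(.*)$")
UNSET_RE = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map option name to its value; explicitly unset options map to 'n'."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = SET_RE.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = UNSET_RE.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def config_changes(old_text, new_text):
    """Return {name: (old, new)} for options that differ.

    None means the option does not appear in that file at all.
    """
    old, new = parse_config(old_text), parse_config(new_text)
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes
```

Reading both files with open(...).read() and passing the contents to
config_changes() reports CONFIG_GART_IOMMU going from "y" to "n", and shows
the two dependent options as no longer present in the new file.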

-- 
Alan J. Wylie                                          https://www.wylie.me.uk/

Dance like no-one's watching. / Encrypt like everyone is.
Security is inversely proportional to convenience


* Re: Regression in 5.19.0: USB errors during boot
  2022-08-21  8:21     ` Alan J. Wylie
@ 2022-08-21 14:26       ` Christoph Hellwig
  2022-08-21 16:50         ` Alan J. Wylie
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2022-08-21 14:26 UTC
  To: Alan J. Wylie
  Cc: Christoph Hellwig, Greg Kroah-Hartman, linux-usb, Linus Torvalds,
	linux-kernel, stable

On Sun, Aug 21, 2022 at 09:21:22AM +0100, Alan J. Wylie wrote:
> Comparing with another AMD system that doesn't show the problem,
> I see that CONFIG_GART_IOMMU is only set on the one with the problem.
> 
> The configs have just had "make oldconfig" run on them for years, I
> have no idea why one has it set.
> 
> Clearing it fixes the problem!

Thanks for confirming my suspicion.  I'd still like to fix the issue
with CONFIG_GART_IOMMU enabled once I've tracked it down.  Would you
be willing to test patches?


* Re: Regression in 5.19.0: USB errors during boot
  2022-08-21 14:26       ` Christoph Hellwig
@ 2022-08-21 16:50         ` Alan J. Wylie
  2022-09-08 11:18           ` Regression in 5.19.0: USB errors during boot #forregzbot Thorsten Leemhuis
  0 siblings, 1 reply; 7+ messages in thread
From: Alan J. Wylie @ 2022-08-21 16:50 UTC
  To: Christoph Hellwig
  Cc: Greg Kroah-Hartman, linux-usb, Linus Torvalds, linux-kernel,
	stable

at 16:26 on Sun 21-Aug-2022 Christoph Hellwig (hch@lst.de) wrote:

> Thanks for confirming my suspicion.  I'd still like to fix the issue
> with CONFIG_GART_IOMMU enabled once I've tracked it down.  Would you
> be willing to test patches?

I'll be glad to help.

I've also had a look in the loft and my box of bits for an old
Athlon64/Opteron/Turion/Sempron processor, but I'm afraid all I've got
are:

Phenom II X6 1055T
Phenom II X2 545
Athlon 2  x2 270

-- 
Alan J. Wylie                                          https://www.wylie.me.uk/

Dance like no-one's watching. / Encrypt like everyone is.
Security is inversely proportional to convenience


* Re: Regression in 5.19.0: USB errors during boot #forregzbot
  2022-08-21 16:50         ` Alan J. Wylie
@ 2022-09-08 11:18           ` Thorsten Leemhuis
  0 siblings, 0 replies; 7+ messages in thread
From: Thorsten Leemhuis @ 2022-09-08 11:18 UTC
  To: regressions@lists.linux.dev; +Cc: linux-usb, linux-kernel, stable

TWIMC: this mail is primarily sent for documentation purposes and for
regzbot, my Linux kernel regression tracking bot. These mails usually
contain '#forregzbot' in the subject, to make them easy to spot and filter.

On 21.08.22 18:50, Alan J. Wylie wrote:
> at 16:26 on Sun 21-Aug-2022 Christoph Hellwig (hch@lst.de) wrote:
> 
>> Thanks for confirming my suspicion.  I'd still like to fix the issue
>> with CONFIG_GART_IOMMU enabled once I've tracked it down.  Would you
>> be willing to test patches?
> 
> I'll be glad to help.
> 
> I've also had a look in the loft and my box of bits for an old
> Athlon64/Opteron/Turion/Sempron processor, but I'm afraid all I've got
> are:
> 
> Phenom II X6 1055T
> Phenom II X2 545
> Athlon 2  x2 270

#regzbot backburner: unusual config, workaround found, devs still want
to fix it, but apparently not urgent
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.

