* Problem hanging Bulk IN, with USB 3.x, perhaps due to wMaxPacketSize = 1024 and wMaxBurst = 6 (OUT) and 2 (IN), tested and reproduceable with i.MX8MP and Raspberry Pi Compute Module 5 (CM5)
From: Martin Maurer @ 2025-08-08 21:47 UTC
To: linux-usb
Hello,
for some time I have been fighting a USB problem:
I have a Qualcomm radio module (in my case a Quectel RM520N-GL and a
SIMCOM SIM8260G-M2)
connected to a Phytec Pollux board with an NXP i.MX8MP.
I started with Linux 6.6.23. The module communicates over USB 3.x.
I set up an internet connection through this radio module and connected a
notebook (via Wifi, with external hardware converting to Ethernet).
The first test setup was a bit complicated:
Radio Module <-> USB 3.x <-> Phytec Linux Board <-> Ethernet Tunnel <->
Raspberry Pi CM5 <-> Wifi <-> Windows Notebook
I opened Firefox; YouTube worked well, HD video over multiple hours with
no problem.
Then I opened Microsoft Teams instead, and data transfer immediately
stalled. I had a ping running in parallel directly to the radio module; this
also stalled.
With more testing I found that opening Twitch.tv in Firefox also stalled
the connection.
# lsusb -t
/:  Bus 05.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
    |__ Port 1: Dev 2, If 0, Class=Vendor Specific Class, Driver=option, 5000M
    |__ Port 1: Dev 2, If 1, Class=Vendor Specific Class, Driver=option, 5000M
    |__ Port 1: Dev 2, If 2, Class=Vendor Specific Class, Driver=option, 5000M
    |__ Port 1: Dev 2, If 3, Class=Vendor Specific Class, Driver=option, 5000M
    |__ Port 1: Dev 2, If 4, Class=Vendor Specific Class, Driver=qmi_wwan, 5000M
/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/2p, 480M
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
    |__ Port 1: Dev 2, If 0, Class=Communications, Driver=cdc_ncm, 5000M
    |__ Port 1: Dev 2, If 1, Class=CDC Data, Driver=cdc_ncm, 5000M
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/2p, 480M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=dwc2/1p, 480M
#
The device in question is Bus 05, Port 1, Dev 2, with multiple interfaces.
I found a newer Linux version, 6.6.52. The same error occurred.
When the error occurs, I don't see anything in the system logs (e.g. dmesg).
I replaced the Quectel radio module with one from SIMCOM, and the same
problem occurred.
I added my USB 2.0 analyzer (an old Ellisys) and the problem disappeared.
Unfortunately I have no USB 3.x analyzer.
I am still waiting for an original NXP board with an i.MX8MP, on which it
seems a 6.12.x kernel can be used and tested.
I found an erratum for the i.MX8MP: ERR050714 “USB: HOST Stream IN issue if
received short packet”, but it looks like I have no Stream IN in use...?
So perhaps it is something different.
To check whether it could be something i.MX8MP related, today I took a
Raspberry Pi Compute Module 5 (CM5).
It is also ARM64, but otherwise, I assume, it has a completely different
USB 3 peripheral.
And surprise: I was able to reproduce the problem.
The Raspberry Pi uses:
Linux CM5 6.12.34+rpt-rpi-v8 #1 SMP PREEMPT Debian
1:6.12.34-1+rpt1~bookworm (2025-06-26) aarch64 GNU/Linux
I still can't decide which side causes the problem, or whether it is just an
interoperability problem.
I saw that the virtual channel used for data transfer has a
wMaxPacketSize of 1024 bytes (IN and OUT) and a wMaxBurst of 6 (OUT) and
2 (IN).
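For orientation, the burst size implied by these descriptor values can be worked out as below. This is only a rough sketch: it assumes the reported 6 and 2 are the raw bMaxBurst values from the SuperSpeed endpoint companion descriptor, where the field is zero-based (a value of N allows N + 1 packets per burst). The function name is made up for illustration.

```python
def max_burst_bytes(w_max_packet_size: int, b_max_burst: int) -> int:
    """Largest number of bytes an endpoint may move in one burst.

    Assumes b_max_burst is the raw, zero-based bMaxBurst descriptor field
    (0..15), so the endpoint may burst b_max_burst + 1 packets.
    """
    if not 0 <= b_max_burst <= 15:
        raise ValueError("bMaxBurst must be in the range 0..15")
    return w_max_packet_size * (b_max_burst + 1)


# Values reported for the data interface in this thread.
print("OUT burst bytes:", max_burst_bytes(1024, 6))
print("IN burst bytes: ", max_burst_bytes(1024, 2))
```

If the numbers shown by the tool are already "packets per burst" rather than the raw field, the `+ 1` would have to be dropped.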
I have already captured traces with usbmon, which I can share.
I can also read out the USB descriptors (lsusb -v) and share them.
Qualcomm radio modules are widely used because of their high possible
throughput, so I assume other people are also using them over USB 3.x.
I am not sure yet why the error occurs on my side instead of things just working...
Has anyone already heard of such an error? Is there perhaps a workaround?
Can I perhaps patch the wMaxPacketSize or wMaxBurst handling in the kernel,
at least for test purposes?
A side note:
I can query the number of sent and received IP packets. I send pings (1
every second).
After the hang occurred, ping no longer shows any replies.
But the radio module reports 5 sent and 5 received packets when I query
the statistics every 5 packets via QMI.
So I assume sending (Bulk OUT) is working: packets go to the server and back
to the radio module, but the answer is not sent over USB from device to host.
A second, perhaps interesting note:
It looks like the radio module is keeping/storing the received packets.
After a fairly long time (a few hours), all buffers are exhausted.
Then QMI commands are no longer answered.
When I perform some action (I saw it when opening the channel for AT
commands, or when creating log files),
it can happen that all stored packets are then sent in one go, so I get
many QMI packet statistics at once, each with an increase of 5 packets,
in sum the number of packets that took hours to accumulate.
The radio module also seems to be running Linux. Which version, I don't know.
What can I do/test next?
Try again on an AMD x64 controller? Perhaps with main/latest of the Linux kernel?
I thought about getting an i.MX95, but it seems not to be available yet.
Phytec has some engineering samples (?) on boards, but the release notes
say that only USB 2 is working so far.
Can I enable traces in the USB kernel code that could help narrow down the
problem?
Sorry for the long description. I have tested a lot in the meantime...
Best regards,
Martin
* Re: Problem hanging Bulk IN, with USB 3.x, perhaps due to wMaxPacketSize = 1024 and wMaxBurst = 6 (OUT) and 2 (IN), tested and reproduceable with i.MX8MP and Raspberry Pi Compute Module 5 (CM5)
From: Michał Pecio @ 2025-08-14 12:16 UTC
To: Martin Maurer; +Cc: linux-usb
Hi,
On Fri, 8 Aug 2025 23:47:07 +0200, Martin Maurer wrote:
> What can I do/test next?
>
> Try again on an AMD x64 controller? Perhaps with main/latest of the Linux kernel?
You are describing some fairly complex setups, can you confidently
say that the problem is USB and not elsewhere? For example, you have
tcpdump running on the USB host machine and packets go out through
the USB network interface but nothing comes back?
What driver are you using with this USB device? Any errors/diagnostics
from the driver, or from xhci_hcd (I guess that's your host)?
You tried usbmon and what happened? Is the driver submitting IN URBs?
Are they coming back empty? With error status? Not completing at all?
Trying on a PC with newer kernel makes sense, debugging may be easier
that way and lower risk of:
- chasing bugs in downstream kernels that nobody here can help with
- fixing something that has already been figured out and fixed
Regards,
Michal
* Re: Problem hanging Bulk IN, with USB 3.x, perhaps due to wMaxPacketSize = 1024 and wMaxBurst = 6 (OUT) and 2 (IN), tested and reproduceable with i.MX8MP and Raspberry Pi Compute Module 5 (CM5)
From: Mathias Nyman @ 2025-08-14 13:02 UTC
To: Martin Maurer, linux-usb
On 9.8.2025 0.47, Martin Maurer wrote:
> Hello,
>
> for some time I have been fighting a USB problem:
>
...
>
> A side note:
>
> I can query the number of sent and received IP packets. I send pings (1 every second).
>
> After the hang occurred, ping no longer shows any replies.
>
> But the radio module reports 5 sent and 5 received packets when I query the statistics every 5 packets via QMI.
>
> So I assume sending (Bulk OUT) is working: packets go to the server and back to the radio module, but the answer is not sent over USB from device to host.
>
I didn't fully understand the complex setup, but the subject and this section hint that it
could be related to missing zero-length bulk packets.
The receiving side, which on the host side would be the bulk IN endpoint, usually knows a transfer
is complete when it receives the exact amount it requested, or when it receives a packet shorter than
maxPacketSize (a short packet).
But if the sender sends less than expected, and it happens to be exactly maxPacketSize (1024) bytes,
then the sender should send an additional zero-length packet to let the receiver know no more data is coming.
Otherwise the receiving side will be stuck waiting for the next packet.
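The rule described above can be sketched in a few lines. This is only an illustration, not driver code: the function names are made up, and the check is written for any whole multiple of maxPacketSize (the usual general form of the rule), of which "exactly 1024 bytes" is one case.

```python
def needs_zlp(sent_len: int, requested_len: int, max_packet: int) -> bool:
    """True if the sender must append a zero-length packet.

    That is the case when it sends less than the receiver asked for and the
    data ends exactly on a packet boundary, so no short packet marks the end.
    (For sent_len == 0 the zero-length packet is the whole transfer.)
    """
    return sent_len < requested_len and sent_len % max_packet == 0


def receiver_done(packet_lens, requested_len: int, max_packet: int) -> bool:
    """Simulate the receiver: it stops on a short packet or a full buffer."""
    received = 0
    for length in packet_lens:
        received += length
        if length < max_packet or received >= requested_len:
            return True
    return False  # still waiting for more packets


# A 1024-byte answer into a 4096-byte request, maxPacketSize 1024:
print(needs_zlp(1024, 4096, 1024))            # sender should add a ZLP
print(receiver_done([1024], 4096, 1024))      # without it the receiver keeps waiting
print(receiver_done([1024, 0], 4096, 1024))   # with it the transfer completes
```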
Thanks
Mathias
* Re: Problem hanging Bulk IN, with USB 3.x, perhaps due to wMaxPacketSize = 1024 and wMaxBurst = 6 (OUT) and 2 (IN), tested and reproduceable with i.MX8MP and Raspberry Pi Compute Module 5 (CM5)
From: Martin Maurer @ 2025-08-17 14:58 UTC
To: linux-usb, michal.pecio, mathias.nyman
Hello Michał, hello Mathias, hello all,
many thanks for your answers!
I tried to reproduce it with an AMD Linux PC, but
unfortunately I was not able to (though the setup is a bit different).
So I went back to the Raspberry Pi Compute Module 5, where I mainly
connected the radio module (Quectel RM520N-GL) via USB3,
and set up a Wifi access point. All data/all connections from the Wifi
access point are routed directly via wwan0 to the radio module.
This is currently my simplest setup that reproduces the error,
mostly within a few seconds.
My knowledge of the Linux kernel and USB is unfortunately not yet enough
to analyze and fix it by myself.
But I used the help of ChatGPT-5 to create a usbmon trace and an xhci
kernel trace (both recorded in parallel):
https://www.file-upload.net/en/download-15523936/usbmon_bus5_20250817-150158.log.html
https://www.file-upload.net/en/download-15523937/xhci_20250817-150158.trace.html
This was the last output my ping showed in a shell:
64 bytes from 8.8.8.8: icmp_seq=2323 ttl=112 time=26.0 ms
64 bytes from 8.8.8.8: icmp_seq=2324 ttl=112 time=25.0 ms
64 bytes from 8.8.8.8: icmp_seq=2325 ttl=112 time=29.1 ms
64 bytes from 8.8.8.8: icmp_seq=2326 ttl=112 time=37.8 ms
I generated more data traffic in parallel, but ping is where I first see
that the IP data connection is no longer stable.
According to ChatGPT-5, the following places contain errors:
*** USBMON ***
In your usbmon_bus5_20250817-150158.log:
First -71 (EPROTO) on the QMI Bulk-IN (Bi:5:005:14): line 2161,
timestamp 493245744
2161: ffffff8003c8cb40 493245744 C Bi:5:005:14 -71 0
Just before that, there’s a -75 (EOVERFLOW) on the same IN EP, which is
often the first sign of trouble: line 2159, timestamp 493245221
2159: ffffff8003c8cd80 493245221 C Bi:5:005:14 -75 1024 = ...
So the sequence is: several good completions → EOVERFLOW (-75) → then a
stream of EPROTO (-71) errors on Bi:5:005:14, which kills further ping
replies after your last good seq (2326).
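A search like the one described above can be repeated on the raw log with a short script. The following is a minimal sketch, assuming the usbmon text format as it appears in the two lines quoted here (URB tag, timestamp, event type, address, status, length); the class and function names are made up for illustration, the errno names are the Linux values, and the "2161:"-style line-number prefixes are not part of the raw log.

```python
from typing import Iterable, NamedTuple, Optional

# Linux errno values seen in this thread.
LINUX_ERRNO_NAMES = {-71: "EPROTO", -75: "EOVERFLOW"}


class UsbmonEvent(NamedTuple):
    tag: str
    timestamp_us: int
    event: str        # 'S' submission, 'C' callback, 'E' submission error
    xfer_type: str    # 'B' bulk, 'C' control, 'I' interrupt, 'Z' isochronous
    direction: str    # 'i' or 'o'
    bus: int
    device: int
    endpoint: int
    status: Optional[int]
    length: Optional[int]


def parse_usbmon_line(line: str) -> Optional[UsbmonEvent]:
    """Parse one usbmon text line; return None if it does not look like one."""
    fields = line.split()
    if len(fields) < 6:
        return None
    tag, timestamp, event, address, status_word, length_word = fields[:6]
    parts = address.split(":")
    if len(parts) != 4 or len(parts[0]) != 2:
        return None
    try:
        # Interrupt/isochronous status words carry extra ':'-separated fields.
        status = int(status_word.split(":")[0])
        length = int(length_word)
    except ValueError:
        # E.g. control submissions, where a setup packet follows instead.
        status, length = None, None
    try:
        return UsbmonEvent(tag, int(timestamp), event, parts[0][0], parts[0][1],
                           int(parts[1]), int(parts[2]), int(parts[3]),
                           status, length)
    except ValueError:
        return None


def first_error(lines: Iterable[str], endpoint: int,
                direction: str = "i") -> Optional[UsbmonEvent]:
    """Return the first completion with a non-zero status on the endpoint."""
    for line in lines:
        ev = parse_usbmon_line(line)
        if (ev and ev.event == "C" and ev.endpoint == endpoint
                and ev.direction == direction and ev.status not in (None, 0)):
            return ev
    return None


sample = [
    "ffffff8003c8cd80 493245221 C Bi:5:005:14 -75 1024 = ...",
    "ffffff8003c8cb40 493245744 C Bi:5:005:14 -71 0",
]
ev = first_error(sample, endpoint=14)
if ev is not None:
    name = LINUX_ERRNO_NAMES.get(ev.status, "?")
    print(f"{ev.timestamp_us}: status {ev.status} ({name}), {ev.length} bytes")
```

Reading the real log would replace `sample` with an open file object, since `first_error` accepts any iterable of lines.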
*** XHCI TRACE ***
I found the first failure in your xHCI trace.
First error line: line 8216
Timestamp: 758267.000115
Event: xhci_handle_event … type 'Transfer Event' … 'Error' … slot 1 ep
29 … len 1472
Why ep 29? In xHCI, the endpoint context index is ep_index = 2 *
ep_number + (direction), where direction is 0=OUT, 1=IN.
So for Bulk IN ep 14: 2*14+1 = 29 → that’s your IN 0x8E pipe.
Right after that line you can see the driver react:
xhci_handle_transfer … length 1472 … (the failed TD)
xhci_queue_command: Reset Endpoint Command … ep 29 (host tries to recover)
xhci_handle_event: … 'Command Completion Event' (reset completes)
But from this point on, completions for that IN EP correspond to usbmon
-71 (EPROTO), matching what you saw.
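The endpoint numbering used in that analysis can be checked with a small helper. This is a sketch with made-up function names; it assumes the "ep" value printed in the xHCI trace is the Device Context Index (2 * endpoint number + direction) and that the endpoint address is the endpoint number with bit 7 set for IN. For comparison, an address of 0x87 would be endpoint 7 IN, i.e. index 15.

```python
from typing import Tuple


def xhci_dci(ep_number: int, is_in: bool) -> int:
    """Device Context Index for an endpoint (the control endpoint 0 is 1)."""
    if not 0 <= ep_number <= 15:
        raise ValueError("endpoint number must be in the range 0..15")
    if ep_number == 0:
        return 1  # bidirectional default control endpoint
    return 2 * ep_number + (1 if is_in else 0)


def endpoint_address(ep_number: int, is_in: bool) -> int:
    """bEndpointAddress: endpoint number in bits 0..3, direction in bit 7."""
    return (0x80 if is_in else 0x00) | ep_number


def dci_to_endpoint(dci: int) -> Tuple[int, bool]:
    """Inverse mapping for non-control endpoints: (endpoint number, is_in)."""
    if not 2 <= dci <= 31:
        raise ValueError("expected a non-control index in the range 2..31")
    ep_number, direction = divmod(dci, 2)
    return ep_number, direction == 1


ep, is_in = dci_to_endpoint(29)
print(ep, "IN" if is_in else "OUT", hex(endpoint_address(ep, is_in)))
```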
Does this give a clue as to where it could be coming from?
It is 100% reproducible within a few seconds on the Raspberry Pi Compute
Module 5 (and I see the same behaviour with a different kernel on the i.MX8MP).
Could it be a hardware problem? I have already tried different radio modules
(all Qualcomm, X62/X65 and X72/X75),
different cables (all the same length, all from the same source), and a
different eval board for the M.2 radio modules (but from the same source).
Can you give me a hint about what to try next?
ChatGPT-5 points me towards disabling LPM for USB3. Could this be a
next step? Or is it something else?
Many thanks for your help!
Best regards,
Martin
* Re: Problem hanging Bulk IN, with USB 3.x, perhaps due to wMaxPacketSize = 1024 and wMaxBurst = 6 (OUT) and 2 (IN), tested and reproduceable with i.MX8MP and Raspberry Pi Compute Module 5 (CM5)
From: Martin Maurer @ 2025-08-17 15:07 UTC
To: linux-usb, michal.pecio, mathias.nyman
Don't use the links in the original email!
Sorry, they lead to more spam than anything useful. I didn't want to send
big files to the mailing list, but this service is also shit, sorry again.
I have now tried it with WeTransfer:
https://wetransfer.com/downloads/a5ddcb347b80bc0e58413b2053d7aa5d20250817150401/7b0199?t_exp=1755702241&t_lsid=b037582b-3c5f-4b07-abbf-74bf23b4890c&t_network=link&t_rid=YXV0aDB8NjQwOGVjNThhOTFhMzAyZWI3OGU5M2M3&t_s=download_link&t_ts=1755443041&utm_campaign=TRN_TDL_12&utm_source=sendgrid&utm_medium=email&trk=TRN_TDL_12
The download is only valid for 3 days.
* Re: Problem hanging Bulk IN, with USB 3.x, perhaps due to wMaxPacketSize = 1024 and wMaxBurst = 6 (OUT) and 2 (IN), tested and reproduceable with i.MX8MP and Raspberry Pi Compute Module 5 (CM5)
From: Daniele Palmas @ 2025-08-17 15:22 UTC
To: Martin Maurer; +Cc: linux-usb, michal.pecio, mathias.nyman
Hello Martin,
On Sun, 17 Aug 2025 at 17:09, Martin Maurer
<martin.maurer@mmeacs.de> wrote:
>
> Hello Michał, hello Mathias, hello all,
>
> many thanks for your answers!
>
> I tried to reproduce it with an AMD Linux PC, but
> unfortunately I was not able to (though the setup is a bit different).
>
> So I went back to the Raspberry Pi Compute Module 5, where I mainly
> connected the radio module (Quectel RM520N-GL) via USB3,
>
> and set up a Wifi access point. All data/all connections from the Wifi
> access point are routed directly via wwan0 to the radio module.
>
> This is currently my simplest setup that reproduces the error,
> mostly within a few seconds.
>
> My knowledge of the Linux kernel and USB is unfortunately not yet enough
> to analyze and fix it by myself.
>
> But I used the help of ChatGPT-5 to create a usbmon trace and an xhci
> kernel trace (both recorded in parallel):
>
> https://www.file-upload.net/en/download-15523936/usbmon_bus5_20250817-150158.log.html
>
> https://www.file-upload.net/en/download-15523937/xhci_20250817-150158.trace.html
>
> This was the last output my ping showed in a shell:
>
> 64 bytes from 8.8.8.8: icmp_seq=2323 ttl=112 time=26.0 ms
> 64 bytes from 8.8.8.8: icmp_seq=2324 ttl=112 time=25.0 ms
> 64 bytes from 8.8.8.8: icmp_seq=2325 ttl=112 time=29.1 ms
> 64 bytes from 8.8.8.8: icmp_seq=2326 ttl=112 time=37.8 ms
>
> I generated more data traffic in parallel, but ping is where I first see
> that the IP data connection is no longer stable.
>
> According to ChatGPT-5, the following places contain errors:
>
> *** USBMON ***
>
> In your usbmon_bus5_20250817-150158.log:
>
> First -71 (EPROTO) on the QMI Bulk-IN (Bi:5:005:14): line 2161,
> timestamp 493245744
>
> 2161: ffffff8003c8cb40 493245744 C Bi:5:005:14 -71 0
>
> Just before that, there’s a -75 (EOVERFLOW) on the same IN EP, which is
> often the first sign of trouble: line 2159, timestamp 493245221
>
I did not have the chance to look at the usbmon traces, so I'm not sure
that this is really the same scenario, but you could take a look at
the whole thread at
https://www.spinics.net/lists/netdev/msg635944.html
If it is the same issue, basically, if you set up the data connection
with QMAP you should not face the issue.
Regards,
Daniele
* Re: Problem hanging Bulk IN, with USB 3.x, perhaps due to wMaxPacketSize = 1024 and wMaxBurst = 6 (OUT) and 2 (IN), tested and reproduceable with i.MX8MP and Raspberry Pi Compute Module 5 (CM5)
From: Martin Maurer @ 2025-08-17 17:01 UTC
To: Daniele Palmas; +Cc: linux-usb, michal.pecio, mathias.nyman
Hi Daniele,
many thanks for your reply!
I can only partly open
https://www.spinics.net
pages; they often time out...
Have I understood correctly that there is a known bug, but it has not been
fixed (from 2020 until now)?
But enabling qmux/qmimux could work as a workaround?
Best regards,
Martin
* Re: Problem hanging Bulk IN, with USB 3.x, perhaps due to wMaxPacketSize = 1024 and wMaxBurst = 6 (OUT) and 2 (IN), tested and reproduceable with i.MX8MP and Raspberry Pi Compute Module 5 (CM5)
From: Daniele Palmas @ 2025-08-18 19:49 UTC
To: Martin Maurer; +Cc: linux-usb, michal.pecio, mathias.nyman
Hi Martin,
On Sun, 17 Aug 2025 at 19:01, Martin Maurer
<martin.maurer@mmeacs.de> wrote:
>
> Hi Daniele,
>
> many thanks for your reply!
>
> I can only partly open
>
> https://www.spinics.net
>
> pages; they often time out...
>
> Have I understood correctly that there is a known bug, but it has not been
> fixed (from 2020 until now)?
>
> But enabling qmux/qmimux could work as a workaround?
If the problem is the same, it should. If your kernel version supports
the passthrough sysfs file, you can also use the rmnet module (much
better than the inbox qmap implementation).
If enabling QMAP is too complicated in your setup, you can just try to
increase the rx_urb_size by increasing the MTU (to at least 2048).
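To illustrate why a larger MTU might help, here is a rough sketch. It assumes the receive URB size follows usbnet's default of MTU plus link-layer header length, and that a device sending more than the buffer holds is what shows up as -EOVERFLOW. The function names, the 2048-byte example transfer and the header length of 0 (raw-IP mode) are illustrative assumptions, not taken from the traces.

```python
from typing import List


def rx_urb_size(mtu: int, hard_header_len: int = 0) -> int:
    """Assumed receive buffer size: MTU plus link-layer header length.

    hard_header_len is 0 for a raw-IP interface and 14 with Ethernet headers.
    """
    return mtu + hard_header_len


def overflows(transfer_len: int, mtu: int, hard_header_len: int = 0) -> bool:
    """True if the device would send more than one receive URB can hold."""
    return transfer_len > rx_urb_size(mtu, hard_header_len)


def set_mtu_command(ifname: str, mtu: int) -> List[str]:
    """Build (but do not run) the iproute2 command that changes the MTU."""
    return ["ip", "link", "set", "dev", ifname, "mtu", str(mtu)]


# Hypothetical 2048-byte transfer from the modem:
print(overflows(2048, mtu=1500))   # buffer too small with the default MTU
print(overflows(2048, mtu=2048))   # fits after raising the MTU
print(" ".join(set_mtu_command("wwan0", 2048)))
```

The command is only assembled here so the sketch can run without root; it would have to be executed on the host for the change to take effect.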
Regards,
Daniele