* Issue with DWC3 Gadget Driver: Stall After Transmitting Only 2KB Using Scatter-Gather and TRB Chaining
@ 2025-10-30 8:49 Li Haifeng
2025-10-30 9:39 ` Greg KH
0 siblings, 1 reply; 7+ messages in thread
From: Li Haifeng @ 2025-10-30 8:49 UTC (permalink / raw)
To: Felipe.Balbi, linux-usb
Hello Linux USB experts,
I'm encountering an issue with the DWC3 gadget driver on a Rockchip
RK3588 platform running Linux kernel 5.10 (from the
rk3588_linux_release_20230114_v1.0.6 SDK). I'm developing a kernel
module that uses the existing CDC-ECM USB Ethernet gadget to send 16KB
of data from the device to the host via scatter-gather lists,
leveraging TRB chaining for bulk IN transfers.
The module accesses the IN endpoint from the eth_dev structure (via
netdev_priv) and queues a USB request with 4 scatterlist entries (each
4KB, totaling 16KB). The request is set up with req->sg,
req->num_sgs=4, and req->length=16384, then queued using usb_ep_queue.
However, during transmission:
- The device side only transmits the first 2KB of data.
- The host acknowledges this partial transfer (ACK received).
- After the ACK, the device does not respond or continue the
transfer—no further data is sent, and the endpoint appears to stall
without triggering any completion callback or error.
This behavior is reproducible when scatter-gather is enabled on the
controller. Disabling scatter-gather or using non-chained TRBs allows
the full transfer to complete successfully.
Here's a simplified excerpt from the module's send function for reference:
```c
static int send_16kb_sg(struct usb_ep *ep)
{
	struct usb_request *req;
	struct send_context *ctx;
	struct scatterlist *sg;
	int num_sg = 4; // 4 segments of 4KB each = 16KB
	int size_per = 4096;
	int i;
	int status;

	if (!ep->gadget->sg_supported) {
		pr_err("Scatter-gather not supported by USB controller\n");
		return -EOPNOTSUPP;
	}

	req = usb_ep_alloc_request(ep, GFP_KERNEL);
	if (!req)
		return -ENOMEM;

	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
	if (!ctx) {
		usb_ep_free_request(ep, req);
		return -ENOMEM;
	}

	sg = kcalloc(num_sg, sizeof(struct scatterlist), GFP_KERNEL);
	if (!sg) {
		kfree(ctx);
		usb_ep_free_request(ep, req);
		return -ENOMEM;
	}
	sg_init_table(sg, num_sg);

	for (i = 0; i < num_sg; i++) {
		void *buf = kmalloc(size_per, GFP_KERNEL);

		if (!buf) {
			// Cleanup...
			return -ENOMEM;
		}
		ctx->bufs[i] = buf;
		memset(buf, 'A' + i, size_per); // Dummy data
		sg_set_buf(&sg[i], buf, size_per);
	}

	ctx->sg = sg;
	req->sg = sg;
	req->num_sgs = num_sg;
	req->length = num_sg * size_per;
	req->buf = NULL; // SG mode
	req->context = ctx;
	req->complete = send_complete;
	req->zero = 0;

	status = usb_ep_queue(ep, req, GFP_KERNEL);
	if (status) {
		// Cleanup...
	}
	return status;
}
```
The complete callback (send_complete) is never invoked after the
partial transfer, and no errors are logged in dmesg.
Is this potentially due to incorrect usage of TRB chaining in the
scatter-gather setup (e.g., something missing in how the chain bit or
TRB fields are handled in the driver)? Or could this be a known issue
in the DWC3 gadget driver, perhaps related to handling chained TRBs
for larger transfers on certain controllers?
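As a rough illustration of what TRB chaining means for a four-entry scatterlist, the userspace model below marks every TRB except the last as chained and requests a completion event only on the last one. The struct, the flag names, and the bit values are hypothetical and are not the dwc3 driver's definitions; this is a sketch of the idea, not of the driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative flag values only; NOT the dwc3 driver's bit definitions. */
#define TRB_FLAG_CHAIN (1u << 0) /* more TRBs follow in the same transfer */
#define TRB_FLAG_IOC   (1u << 1) /* raise a completion event after this TRB */

/* Hypothetical, simplified stand-in for a hardware TRB. */
struct model_trb {
	uint32_t size;  /* bytes described by this TRB */
	uint32_t flags; /* TRB_FLAG_* bits */
};

/*
 * Describe one transfer made of nsegs segments, one TRB per segment.
 * All TRBs except the last carry the chain flag; the last one carries
 * the interrupt-on-completion flag. Returns the total byte count.
 */
static uint32_t build_chain(struct model_trb *trbs, const uint32_t *seg_len,
			    int nsegs)
{
	uint32_t total = 0;
	int i;

	assert(nsegs > 0);
	for (i = 0; i < nsegs; i++) {
		int last = (i == nsegs - 1);

		trbs[i].size = seg_len[i];
		trbs[i].flags = last ? TRB_FLAG_IOC : TRB_FLAG_CHAIN;
		total += seg_len[i];
	}
	return total;
}
```

With four 4096-byte segments, the model yields three chained TRBs followed by one completion TRB, describing 16384 bytes as a single transfer.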
I've reviewed the DWC3 documentation and some past discussions on
stalls with scatter-gather, but haven't found an exact match. Any
insights, suggestions for debugging, or pointers to relevant patches
would be greatly appreciated.
Thanks in advance for your help!
Best regards,
Haifeng
* Re: Issue with DWC3 Gadget Driver: Stall After Transmitting Only 2KB Using Scatter-Gather and TRB Chaining
2025-10-30 8:49 Issue with DWC3 Gadget Driver: Stall After Transmitting Only 2KB Using Scatter-Gather and TRB Chaining Li Haifeng
@ 2025-10-30 9:39 ` Greg KH
2025-10-30 12:47 ` Li Haifeng
0 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2025-10-30 9:39 UTC (permalink / raw)
To: Li Haifeng; +Cc: Felipe.Balbi, linux-usb
On Thu, Oct 30, 2025 at 04:49:19PM +0800, Li Haifeng wrote:
> Hello Linux USB experts,
>
> I'm encountering an issue with the DWC3 gadget driver on a Rockchip
> RK3588 platform running Linux kernel 5.10 (from the
> rk3588_linux_release_20230114_v1.0.6 SDK). I'm developing a kernel
5.10 is _VERY_ old and obsolete and way behind in new hardware support,
especially for the dwc3 driver. Have you tried the latest kernel
release, which is many years newer? How about 6.17?
If you are stuck with an old release for some reason, please work with
the company that gave it to you as you are paying for support from them
for it, it is their responsibility, not the community's responsibility
to manage that release.
And please, do not release new devices on this old kernel version, you
are about to lose security updates in a year for it, which is not good
for the lifetime of your device.
thanks,
greg k-h
* Re: Issue with DWC3 Gadget Driver: Stall After Transmitting Only 2KB Using Scatter-Gather and TRB Chaining
2025-10-30 9:39 ` Greg KH
@ 2025-10-30 12:47 ` Li Haifeng
2025-10-31 22:41 ` Thinh Nguyen
0 siblings, 1 reply; 7+ messages in thread
From: Li Haifeng @ 2025-10-30 12:47 UTC (permalink / raw)
To: Greg KH; +Cc: Felipe.Balbi, linux-usb
Dear Greg,
Thank you for your prompt response and advice.
I just tested the issue on kernel version 6.1.17, and the problem
persists. I will proceed to try the latest kernel release 6.17, and
report back with the results as soon as possible.
Appreciate your guidance.
Best regards,
Haifeng
Greg KH <gregkh@linuxfoundation.org> wrote on Thu, Oct 30, 2025 at 17:39:
>
> On Thu, Oct 30, 2025 at 04:49:19PM +0800, Li Haifeng wrote:
> > Hello Linux USB experts,
> >
> > I'm encountering an issue with the DWC3 gadget driver on a Rockchip
> > RK3588 platform running Linux kernel 5.10 (from the
> > rk3588_linux_release_20230114_v1.0.6 SDK). I'm developing a kernel
>
> 5.10 is _VERY_ old and obsolete and way behind in new hardware support,
> especially for the dwc3 driver. Have you tried the latest kernel
> release, which is many years newer? How about 6.17?
>
> If you are stuck with an old release for some reason, please work with
> the company that gave it to you as you are paying for support from them
> for it, it is their responsibility, not the community's responsibility
> to manage that release.
>
> And please, do not release new devices on this old kernel version, you
> are about to lose security updates in a year for it, which is not good
> for the lifetime of your device.
>
> thanks,
>
> greg k-h
* Re: Issue with DWC3 Gadget Driver: Stall After Transmitting Only 2KB Using Scatter-Gather and TRB Chaining
2025-10-30 12:47 ` Li Haifeng
@ 2025-10-31 22:41 ` Thinh Nguyen
2025-11-03 6:22 ` Li Haifeng
0 siblings, 1 reply; 7+ messages in thread
From: Thinh Nguyen @ 2025-10-31 22:41 UTC (permalink / raw)
To: Li Haifeng; +Cc: Greg KH, Felipe.Balbi@microsoft.com, linux-usb@vger.kernel.org
Hi,
On Thu, Oct 30, 2025, Li Haifeng wrote:
> Dear Greg,
>
> Thank you for your prompt response and advice.
>
> I just tested the issue on kernel version 6.1.17, and the problem
> persists. I will proceed to try the latest kernel release 6.17, and
> report back with the results as soon as possible.
>
> Appreciate your guidance.
>
> Best regards,
> Haifeng
>
Please avoid top-posting if you can.
As Greg noted, please try a recent kernel release. For now, let's take
a look at what you already have.
This is from your previous email:
- The device side only transmits the first 2KB of data.
- The host acknowledges this partial transfer (ACK received).
- After the ACK, the device does not respond or continue the
transfer—no further data is sent, and the endpoint appears to stall
without triggering any completion callback or error.
You mentioned that the host ACK'ed the IN data, so I assume you are
running in SuperSpeed and have a USB traffic analyzer that lets you see
these packets. Did you see the host requesting the next set of data? My
suspicion is that it did not. There's no mention of NRDY or flow control
here. If the device had not prepared enough TRBs, you'd see flow control.
Check your host.
BR,
Thinh
* Re: Issue with DWC3 Gadget Driver: Stall After Transmitting Only 2KB Using Scatter-Gather and TRB Chaining
2025-10-31 22:41 ` Thinh Nguyen
@ 2025-11-03 6:22 ` Li Haifeng
2025-11-04 1:16 ` Thinh Nguyen
0 siblings, 1 reply; 7+ messages in thread
From: Li Haifeng @ 2025-11-03 6:22 UTC (permalink / raw)
To: Thinh Nguyen
Cc: Greg KH, Felipe.Balbi@microsoft.com, linux-usb@vger.kernel.org
Dear Thinh,
Thank you for your detailed analysis and suggestions.
Using a logic analyzer, I did not observe any NRDY or flow control
events on the device side for the corresponding endpoint. I reviewed
the handling logic in the USB ECM host module from the latest Linux
kernel. It allocates the receive buffer based on wMaxSegmentSize,
which is 1514 bytes, so if the data sent from the device exceeds this
value (as the previous 2KB size did), the host's behavior might become
abnormal. I will conduct an experiment to observe this phenomenon.
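To make the size mismatch concrete, here is a minimal sketch of the length arithmetic only, assuming the 1514-byte wMaxSegmentSize mentioned above is the largest transfer the host will accept. The helper name `split_into_frames` and the constant `ECM_MAX_SEGMENT` are hypothetical; real ECM traffic would also have to consist of valid Ethernet frames, which this sketch does not attempt to build.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed limit, taken from the wMaxSegmentSize value quoted in this thread. */
#define ECM_MAX_SEGMENT 1514u

/*
 * Split `total` bytes into per-transfer lengths of at most `max_seg` bytes.
 * Stores up to `cap` lengths in `out` and returns the number of transfers
 * needed, which may be larger than `cap` (extra lengths are not stored).
 */
static size_t split_into_frames(size_t total, size_t max_seg,
				size_t *out, size_t cap)
{
	size_t n = 0;

	assert(max_seg > 0);
	while (total > 0) {
		size_t len = total < max_seg ? total : max_seg;

		if (n < cap)
			out[n] = len;
		total -= len;
		n++;
	}
	return n;
}
```

Under that assumption, a 16384-byte payload would need eleven transfers: ten of 1514 bytes and a final one of 1244 bytes, instead of one 16KB request.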
What puzzles me is why the device's endpoint does not generate an NRDY
event in this scenario.
Best regards,
Haifeng
* Re: Issue with DWC3 Gadget Driver: Stall After Transmitting Only 2KB Using Scatter-Gather and TRB Chaining
2025-11-03 6:22 ` Li Haifeng
@ 2025-11-04 1:16 ` Thinh Nguyen
2025-11-04 1:20 ` Thinh Nguyen
0 siblings, 1 reply; 7+ messages in thread
From: Thinh Nguyen @ 2025-11-04 1:16 UTC (permalink / raw)
To: Li Haifeng
Cc: Thinh Nguyen, Greg KH, Felipe.Balbi@microsoft.com,
linux-usb@vger.kernel.org
On Mon, Nov 03, 2025, Li Haifeng wrote:
> Dear Thinh,
>
> Thank you for your detailed analysis and suggestions.
>
> Using a logic analyzer, I did not observe any NRDY or flow control
> events on the device side for the corresponding endpoint. I reviewed
> the handling logic in the USB ECM host module from the latest Linux
> kernel. It allocates the receive buffer based on wMaxSegmentSize,
> which is 1514 bytes, so if the data sent from the device exceeds this
> value (as the previous 2KB size did), the host's behavior might become
> abnormal. I will conduct an experiment to observe this phenomenon.
>
> What puzzles me is why the device's endpoint does not generate an NRDY
> event in this scenario.
>
The host drives the device. The device would respond with NRDY if the
host requests data. If it does not, then the host probably didn't
request any. You should be able to confirm that in the USB traffic
trace. This corresponds to what you noted: the host prepared and likely
requested only 1514 bytes. There's a mismatch in the communication at
the gadget driver protocol level. I'm not familiar with the Ethernet
protocol, so check that. The device should not send more than what the
host asks for.
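The reasoning above can be written down as a small decision model. This is only an illustrative sketch: the enum and function names are hypothetical, and it models the rule that a device answers NRDY only when the host asks for IN data and the device has nothing prepared.

```c
#include <stdbool.h>

/* Hypothetical outcomes for one bulk IN opportunity. */
enum in_response {
	IN_NO_TRAFFIC, /* host did not ask, so the device stays silent */
	IN_DATA,       /* host asked and the device has data queued */
	IN_NRDY        /* host asked but the device has nothing prepared */
};

/* The host drives the bus: without a host request there is no response. */
static enum in_response bulk_in_response(bool host_requested,
					  bool device_has_data)
{
	if (!host_requested)
		return IN_NO_TRAFFIC;
	return device_has_data ? IN_DATA : IN_NRDY;
}
```

In the situation described in this thread, the device has data queued but the host is presumably not requesting it, so the model returns `IN_NO_TRAFFIC`: no data, no NRDY, and no completion on the device side.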
BR,
Thinh
* Re: Issue with DWC3 Gadget Driver: Stall After Transmitting Only 2KB Using Scatter-Gather and TRB Chaining
2025-11-04 1:16 ` Thinh Nguyen
@ 2025-11-04 1:20 ` Thinh Nguyen
0 siblings, 0 replies; 7+ messages in thread
From: Thinh Nguyen @ 2025-11-04 1:20 UTC (permalink / raw)
To: Li Haifeng
Cc: Thinh Nguyen, Greg KH, Felipe.Balbi@microsoft.com,
linux-usb@vger.kernel.org
On Mon, Nov 03, 2025, Thinh Nguyen wrote:
> On Mon, Nov 03, 2025, Li Haifeng wrote:
> > Dear Thinh,
> >
> > Thank you for your detailed analysis and suggestions.
> >
> > Using a logic analyzer, I did not observe any NRDY or flow control
> > events on the device side for the corresponding endpoint. I reviewed
> > the handling logic in the USB ECM host module from the latest Linux
> > kernel. It allocates the receive buffer based on wMaxSegmentSize,
> > which is 1514 bytes, so if the data sent from the device exceeds this
> > value (as the previous 2KB size did), the host's behavior might become
> > abnormal. I will conduct an experiment to observe this phenomenon.
> >
> > What puzzles me is why the device's endpoint does not generate an NRDY
> > event in this scenario.
> >
>
> The host drives the device. The device would respond with NRDY if the
> host requests data.
... with NRDY if the host requests data and the device does not have
the data prepared.
(somehow I thought I had written that)
> If it does not, then the host probably didn't request any. You should
> be able to confirm that in the USB traffic trace. This corresponds to
> what you noted: the host prepared and likely requested only 1514
> bytes. There's a mismatch in the communication at the gadget driver
> protocol level. I'm not familiar with the Ethernet protocol, so check
> that. The device should not send more than what the host asks for.
>
> BR,
> Thinh