* [PATCH] USB: EHCI: inflate max_tt_usecs and implement sitd backpointers
@ 2026-05-18 6:42 Brent Page
2026-05-18 18:24 ` Brent Page
0 siblings, 1 reply; 4+ messages in thread
From: Brent Page @ 2026-05-18 6:42 UTC (permalink / raw)
To: linux-usb; +Cc: Alan Stern, Michal Pecio
This is a follow-up on
https://lore.kernel.org/linux-usb/
B66AE752-B09C-49B3-A829-F7ABB36FB250@gmail.com/T/#u.
The goal is to accommodate 1023-byte-endpoint full-speed isochronous-in
transactions on a USB2 bus.
One of the main changes is
- max_tt_usecs[] = { 125, 125, 125, 125, 125, 125, 30, 0 };
+ max_tt_usecs[] = { 145, 145, 145, 145, 145, 145, 35, 0 };
To repeat for documentation's sake, "145 us is longer than a microframe, so
this best-case budget will often not reflect the times when transactions occur.
But, this situation is consistent with the best-case budget in section 11.18.1
of the USB-2 spec, in which 1157 data-bytes are scheduled to occur as though no
bit-stuffing is necessary (i.e., the 188 bytes = 12 megabits/second * (1 byte/8
bits)* 0.125 ms that can run on the full-speed bus in a microframe are all
taken taken be data-bytes). So, after the patch, max_tt_usecs is the same as
the spec's best-case budget but is just scaled by (125 us / 188 bytes) = 1/(12
megabits/s) and by the 7/6 bit-stuffing multiplier to reflect the fact that the
scheduling code works in bit-stuffing-inclusive bus-time instead of
data-bytes." However, as pointed out in the linked email chain, because
ehci-sched.c currently doesn't allow frame-hopping CSPLITS (a CSPLIT for a
given transaction that gets executed in the H-frame following the H-frame in
which the transaction was initiated), "it doesn't support 1023 byte packets at
all. 1023/188=5.4 and if worst case bit stuffing factor is 7/6 then up to 6.3
uframes of transfer time. Completion in [HuFrame 6 or 7] and CSPLIT required in
[HuFrame 0 of the next frame]." So, this patch also adds support for
frame-hopping CSPLITS. Per EHCI-1.0 4.12.3.3.2.1, the sitd containing such
CSPLITS must have a backpointer to the sitd of the previous frame.
The main cases to handle in the backpointer code are period==(1 frame) and
period!=(1 frame), where period is specifically derived from the bInterval of
the endpoint.
Say there's a 3-packet urb for the endpoint (urb0, with labels td0_#) in the
pipeline and another gets added (urb1, with labels td1_#). The result is (the
nodes below are sitds, and "td" stands for "transaction descriptor") ...
if period==(1 frame):
|-----| |-----| |-----| |-----| |-----| |-----|
|td0_0|-->|td0_1|-->|td0_2|-->|td1_0|-->|td1_1|-->|td1_2|
|-----|< |-----|< |-----|< |-----|< |-----|< |-----|
| | | | | | | | | |
|----| |---| |---| |---| |---|
where time increases from left to right, and the lines/arrows on the bottom
represent backpointer relationships. Also, all sitds are initialized in the Do
Complete Split state except for the very first in a stream.
Including a backpointer from the first sitd of urb1 to the last sitd of urb0
(as done here) is not certainly the 'correct' way to handle urb boundaries. An
alternative strategy would be to add an additional sitd, td0_3, to urb0 after
td0_2 that just contains CSPLITS for td0_2. However, td0_3 would have to be in
the same frame as td1_0 (the first sitd of urb1). It's possible that the
HC supports having two sitds for one endpoint in one frame, but it
seems unusual enough that it could produce issues, hence why I avoid this
alternative strategy. Also, the adopted strategy is simpler to code – the only
distinctive sitd is the very first in a stream.
if period!=(1 frame):
|-----| |-----| |-----| |-----| |-----| |-----|
|td0_0|-->|td0_1|-->|td0_2|-->|td0_3|-->|td0_4|-->|td0_5|-->
|-----|< |-----| |-----|< |-----| |-----|< |-----| cont.
| | | | | | below
|----| |----| |----|
|-----| |-----| |-----| |-----| |-----| |-----|
-->|td1_0|-->|td1_1|-->|td1_2|-->|td1_3|-->|td1_4|-->|td1_5|
|-----|< |-----| |-----|< |-----| |-----|< |-----|
| | | | | |
|----| |----| |----|
where the second sitd in each pair is positioned one frame after the first sitd
of the pair, gets initialized in the Do Complete Split state, contains CSPLITS
for the transfer that was started in the first sitd of the pair, and has no
SSPLITS.
I also want to make sure I understand something that line ~2197 of the patched
ehci-sched.c:
sitd_before = list_last_entry(&sched->td_list, struct ehci_sitd, sitd_list);
is based on. Namely, in the current code, am I correct that
insertion of a new urb's sitds into the endpoint's td_list looks like:
td_list initially (say it's loaded up with a 3-packet urb)
|-----| |-----| |-----|
|td0_0|-->|td0_1|-->|td0_2|
|-----| |-----| |-----|
line 2090 in the genuine kernel code:
(executed three times for a new 3-packet urb)
list_add(&sitd->sitd_list, &iso_sched->td_list);
|-----| |-----| |-----| |-----| |-----| |-----|
|td0_0|-->| |-->| |-->| |-->|td0_1|-->|td0_2|
|-----| |-----| |-----| |-----| |-----| |-----|
?
The following later lines in stid_link_urb (again genuine kernel code) then put
these new sitds in their proper place I think:
2177 sitd = list_entry(iso_sched->td_list.next,
2178 struct ehci_itd, sitd_list);
2179 list_move_tail(&sitd->sitd_list, &stream->td_list);
Lastly, I'm still a bit confused by this comment in ehci-sched.c:
/* special case for isoc transfers larger than 125us:
* the first and each subsequent fully used uframe
* must be empty, so as to not illegally delay
* already scheduled transactions
*/
To me, the main issue is that the adopted "carryover" approach to budgeting
doesn't take into account the fact that the TT processes the transactions
sequentially (at least according to all the footprints shown in the figs. in
EHCI1 and USB2) and doesn't, e.g., do part of transaction 1, switch to
transaction 2, then go back to transaction 1. This most obviously is
problematic for long transactions, hence ehci-sched.c handling this case
separately.
I make a point of bringing this up because, given the inflation of the
max_tt_usecs values in the patch, it is actually the case that a newly budgeted
endpoint may "delay already scheduled transactions". For example, say there's a
transaction budgeted for HuFrame 1 with tt_usecs=2 us and a new device gets
plugged in that requires up to tt_usecs=140 us per transaction. The new device
will also get budgeted for HuFrame 1. Then, the device's transactions that
actually do take >125 us (due to unfavorable bit stuffing requirements) will
delay the 2 us transaction (it's maybe worth pointing out that this relies on
the fact that sitd_link in ehci-sched.c puts the newer sitds first in the
periodic queue for each uframe). I don't think there's anything "illegal"
about this though.
---
drivers/usb/host/ehci-sched.c | 132 ++++++++++++++++++++++++++--------
drivers/usb/host/ehci.h | 5 +-
2 files changed, 106 insertions(+), 31 deletions(-)
diff --git a/drivers/usb/host/ehci-sched.c b/drivers/usb/host/ehci-sched.c
index a241337c9..7c2f2125d 100644
--- a/drivers/usb/host/ehci-sched.c
+++ b/drivers/usb/host/ehci-sched.c
@@ -285,13 +285,13 @@ static void compute_tt_budget(u8 budget_table[EHCI_BANDWIDTH_SIZE],
for (uf = ps->phase_uf; uf < 8; ++uf) {
x += budget_line[uf];
- /* Each microframe lasts 125 us */
- if (x <= 125) {
+ /* Cumulative tt_usecs can be as large as 145 us */
+ if (x <= 145) {
budget_line[uf] = x;
break;
}
- budget_line[uf] = 125;
- x -= 125;
+ budget_line[uf] = 145;
+ x -= 145;
}
}
}
@@ -313,7 +313,13 @@ static int __maybe_unused same_tt(struct usb_device *dev1,
#ifdef CONFIG_USB_EHCI_TT_NEWSCHED
static const unsigned char
-max_tt_usecs[] = { 125, 125, 125, 125, 125, 125, 30, 0 };
+/*
+ * tt_usecs includes worst-case bit stuffing time, so the transactions
+ * (complete and/or partial) budgeted for a given 125 us uframe can have a
+ * cumulative tt_usecs as large as 145 us and still possibly fit into the
+ * uframe.
+*/
+max_tt_usecs[] = { 145, 145, 145, 145, 145, 145, 35, 0 };
/* carryover low/fullspeed bandwidth that crosses uframe boundries */
static inline void carryover_tt_bandwidth(unsigned short tt_usecs[8])
@@ -378,13 +384,13 @@ static int tt_available(
if (max_tt_usecs[uframe] <= tt_usecs[uframe])
return 0;
- /* special case for isoc transfers larger than 125us:
+ /* special case for isoc transfers larger than max_tt_usecs:
* the first and each subsequent fully used uframe
* must be empty, so as to not illegally delay
* already scheduled transactions
*/
- if (usecs > 125) {
- int ufs = (usecs / 125);
+ if (usecs > 145) {
+ int ufs = (usecs / 145);
for (i = uframe; i < (uframe + ufs) && i < 8; i++)
if (tt_usecs[i] > 0)
@@ -1388,19 +1394,40 @@ sitd_slot_ok(
struct ehci_tt *tt
)
{
- unsigned mask, tmp;
+ unsigned mask, c_mask2, tmp;
unsigned frame, uf;
mask = stream->ps.cs_mask << (uframe & 7);
+ c_mask2 = mask >> 16;
+ mask = mask & 0xffff;
+ if((c_mask2 & 1) && (c_mask2 & 1<<1) && (c_mask2 & 1<<2)) {
+ /* if BuFrame6 is the last uframe in which a transaction is budgeted,
+ * the transaction will initially be configured to have CSPLITS in
+ * BuFrame7 as well as BuFrames 0 and 1 of the following frame
+ * (HuFrames 0,1,2) */
+ /* below: usb2 spec 11.18.4.3.c paragraph 2 */
+ if(mask & 1) {
+ c_mask2 = 1;
+ } else {
+ c_mask2 = (1<<2)-1;
+ }
+ } else if ((c_mask2 & 1) && (c_mask2 & 1<<1)) {
+ /* if BuFrame5 is the last uframe in which a transaction is budgeted,
+ * the transaction will initially be configured to have CSPLITS in
+ * BuFrames 6 and 7 as well as BuFrame 0 of the following frame
+ * (HuFrames 7,0,1) */
+ /* below: usb2 spec 11.18.4.3.c paragraph 1 */
+ if(mask & 1) {
+ c_mask2 = 1;
+ }
+ }
+ if((c_mask2 & mask & 1<<1))
+ return 0; /* ehci1 4.12.3.1 */
/* for OUT, don't wrap SSPLIT into H-microframe 7 */
if (((stream->ps.cs_mask & 0xff) << (uframe & 7)) >= (1 << 7))
return 0;
- /* for IN, don't wrap CSPLIT into the next frame */
- if (mask & ~0xffff)
- return 0;
-
/* check bandwidth */
uframe &= stream->ps.bw_uperiod - 1;
frame = uframe >> 3;
@@ -1450,8 +1477,10 @@ sitd_slot_ok(
uframe += stream->ps.bw_uperiod;
} while (uframe < EHCI_BANDWIDTH_SIZE);
- stream->ps.cs_mask <<= uframe & 7;
+ stream->ps.cs_mask = mask;
+ stream->ps.c_mask2 = c_mask2;
stream->splits = cpu_to_hc32(ehci, stream->ps.cs_mask);
+ stream->c_splits2 = cpu_to_hc32(ehci, stream->ps.c_mask2);
return 1;
}
@@ -2049,13 +2078,13 @@ sitd_urb_transaction(
/* allocate/init sITDs */
spin_lock_irqsave(&ehci->lock, flags);
- for (i = 0; i < urb->number_of_packets; i++) {
-
- /* NOTE: for now, we don't try to handle wraparound cases
- * for IN (using sitd->hw_backpointer, like a FSTN), which
- * means we never need two sitds for full speed packets.
- */
-
+ for (i = 0; i < 2 * urb->number_of_packets; i++) {
+ /* use 2 * number_of_packets to accommodate frame-hopping CSPLITS. if
+ * there are no such CSPLITS OR if the packet interval is 1 frame
+ * (meaning frame-hopping CSPLITS don't require an extra sitd, ehci
+ * spec 1.0 4.12.3.4), then more sitds than needed are allocated by
+ * this loop. Excess sitds are added to the free list later
+ */
/*
* Use siTDs from the free list, but not siTDs that may
* still be in use by the hardware.
@@ -2107,12 +2136,24 @@ sitd_patch(
{
struct ehci_iso_packet *uf = &iso_sched->packet[index];
u64 bufp;
+ __hc32 transaction;
sitd->hw_next = EHCI_LIST_END(ehci);
sitd->hw_fullspeed_ep = stream->address;
- sitd->hw_uframe = stream->splits;
- sitd->hw_results = uf->transaction;
- sitd->hw_backpointer = EHCI_LIST_END(ehci);
+ sitd->hw_backpointer = cpu_to_hc32(ehci, sitd->backpointer_sitd_dma);
+ transaction = uf->transaction;
+
+ if(sitd->backpointer_sitd_dma==1) { /* null backpointer */
+ sitd->hw_uframe=stream->splits;
+ } else {
+ if(stream->ps.period==1) {
+ sitd->hw_uframe=stream->splits|stream->c_splits2;
+ } else {
+ sitd->hw_uframe=stream->c_splits2;
+ }
+ transaction |= SITD_STS_STS; /* start in Do Complete Split mode, ehci1 4.12.3.3.2.1*/
+ }
+ sitd->hw_results = transaction;
bufp = uf->bufp;
sitd->hw_buf[0] = cpu_to_hc32(ehci, bufp);
@@ -2149,13 +2190,19 @@ static void sitd_link_urb(
unsigned next_uframe;
struct ehci_iso_sched *sched = urb->hcpriv;
struct ehci_sitd *sitd;
+ struct ehci_sitd *sitd_before;
next_uframe = stream->next_uframe;
- if (list_empty(&stream->td_list))
+ if (list_empty(&stream->td_list)) {
/* usbfs ignores TT bandwidth */
ehci_to_hcd(ehci)->self.bandwidth_allocated
+= stream->bandwidth;
+ sitd_before = NULL;
+ } else {
+ sitd_before = list_last_entry(&sched->td_list,
+ struct ehci_sitd, sitd_list);
+ }
if (ehci_to_hcd(ehci)->self.bandwidth_isoc_reqs == 0) {
if (ehci->amd_pll_fix == 1)
@@ -2165,9 +2212,9 @@ static void sitd_link_urb(
ehci_to_hcd(ehci)->self.bandwidth_isoc_reqs++;
/* fill sITDs frame by frame */
- for (packet = sched->first_packet, sitd = NULL;
- packet < urb->number_of_packets;
- packet++) {
+ for (int i = sched->first_packet, sitd = NULL;
+ i < 2 * urb->number_of_packets;
+ i++) {
/* ASSERT: we have all necessary sitds */
BUG_ON(list_empty(&sched->td_list));
@@ -2176,15 +2223,40 @@ static void sitd_link_urb(
sitd = list_entry(sched->td_list.next,
struct ehci_sitd, sitd_list);
+ if((i>=urb->number_of_packets)&&((stream->ps.period==1)||(!(stream->ps.c_mask2)))) {
+ /*
+ * no frame-hopping CSPLITS OR the period is 1, so such CSPLITS are
+ * put into the sitd for the next transfer ; in both cases
+ * #sitds=#transfers, so move the surplus sitd to the free list
+ */
+ list_move_tail(&sitd->sitd_list, &stream->free_list);
+ continue;
+ }
+ if(stream->ps.c_mask2 && !list_empty(&stream->td_list) &&
+ ((stream->ps.period==1)||(i%2==1)) ) {
+ sitd->backpointer_sitd_dma = sitd_before->sitd_dma;
+ } else {
+ sitd->backpointer_sitd_dma = 1;
+ }
list_move_tail(&sitd->sitd_list, &stream->td_list);
sitd->stream = stream;
sitd->urb = urb;
- sitd_patch(ehci, stream, sitd, sched, packet);
+ sitd_patch(ehci, stream, sitd, sched, i);
sitd_link(ehci, (next_uframe >> 3) & (ehci->periodic_size - 1),
sitd);
- next_uframe += stream->uperiod;
+ if(stream->ps.c_mask2 && (stream->ps.period!=1)) {
+ if((i%2)==0) {
+ /* next sitd only has frame-hopping CSPLITS */
+ next_uframe += 8;
+ } else if ((i%2)==1) {
+ next_uframe += stream->uperiod - 8;
+ }
+ } else {
+ next_uframe += stream->uperiod;
+ }
+ sitd_before = sitd;
}
stream->next_uframe = next_uframe & (mod - 1);
diff --git a/drivers/usb/host/ehci.h b/drivers/usb/host/ehci.h
index d7a3c8d13..c1c006bd3 100644
--- a/drivers/usb/host/ehci.h
+++ b/drivers/usb/host/ehci.h
@@ -51,6 +51,7 @@ struct ehci_per_sched {
struct list_head ps_list; /* node on ehci_tt's ps_list */
u16 tt_usecs; /* time on the FS/LS bus */
u16 cs_mask; /* C-mask and S-mask bytes */
+ u8 c_mask2; /* C-mask for frame-crossing TT-iso transfer */
u16 period; /* actual period in frames */
u16 phase; /* actual phase, frame part */
u8 bw_phase; /* same, for bandwidth
@@ -489,7 +490,8 @@ struct ehci_iso_stream {
/* output of (re)scheduling */
struct ehci_per_sched ps; /* scheduling info */
unsigned next_uframe;
- __hc32 splits;
+ __hc32 splits; /* C-mask and S-mask */
+ __hc32 c_splits2; /* C-mask for frame-hopping CSPLITS */
/* the rest is derived from the endpoint descriptor,
* including the extra info for hw_bufp[0..2]
@@ -579,6 +581,7 @@ struct ehci_sitd {
/* the rest is HCD-private */
dma_addr_t sitd_dma;
+ dma_addr_t backpointer_sitd_dma;
union ehci_shadow sitd_next; /* ptr to periodic q entry */
struct urb *urb;
--
2.39.5
^ permalink raw reply related [flat|nested] 4+ messages in thread* Re: [PATCH] USB: EHCI: inflate max_tt_usecs and implement sitd backpointers
2026-05-18 6:42 [PATCH] USB: EHCI: inflate max_tt_usecs and implement sitd backpointers Brent Page
@ 2026-05-18 18:24 ` Brent Page
2026-05-20 17:05 ` Brent Page
0 siblings, 1 reply; 4+ messages in thread
From: Brent Page @ 2026-05-18 18:24 UTC (permalink / raw)
To: linux-usb; +Cc: Alan Stern, Michal Pecio
This revision is necessary in sitd_link_urb:
< sitd_patch(ehci, stream, sitd, sched, i);
---
> if(stream->ps.c_mask2 && (stream->ps.period!=1)) {
> packet = i/2;
> } else {
> packet = i;
> }
> sitd_patch(ehci, stream, sitd, sched, packet);
I didn't catch this before because I haven't yet tested the patch on a ps.period!=1 peripheral.
For such peripherals, I think some more logic will also have to be added to sitd_complete to
ensure that
if ((urb_index + 1) != urb->number_of_packets)
doesn't evaluate to true twice for the same urb (specifically for both sitds of the urb's last
sitd pair)
> On May 17, 2026, at 11:42 PM, Brent Page <brentfpage@gmail.com> wrote:
>
> This is a follow-up on
> https://lore.kernel.org/linux-usb/
> B66AE752-B09C-49B3-A829-F7ABB36FB250@gmail.com/T/#u.
> The goal is to accommodate 1023-byte-endpoint full-speed isochronous-in
> transactions on a USB2 bus.
>
> One of the main changes is
> - max_tt_usecs[] = { 125, 125, 125, 125, 125, 125, 30, 0 };
> + max_tt_usecs[] = { 145, 145, 145, 145, 145, 145, 35, 0 };
> To repeat for documentation's sake, "145 us is longer than a microframe, so
> this best-case budget will often not reflect the times when transactions occur.
> But, this situation is consistent with the best-case budget in section 11.18.1
> of the USB-2 spec, in which 1157 data-bytes are scheduled to occur as though no
> bit-stuffing is necessary (i.e., the 188 bytes = 12 megabits/second * (1 byte/8
> bits)* 0.125 ms that can run on the full-speed bus in a microframe are all
> taken taken be data-bytes). So, after the patch, max_tt_usecs is the same as
> the spec's best-case budget but is just scaled by (125 us / 188 bytes) = 1/(12
> megabits/s) and by the 7/6 bit-stuffing multiplier to reflect the fact that the
> scheduling code works in bit-stuffing-inclusive bus-time instead of
> data-bytes." However, as pointed out in the linked email chain, because
> ehci-sched.c currently doesn't allow frame-hopping CSPLITS (a CSPLIT for a
> given transaction that gets executed in the H-frame following the H-frame in
> which the transaction was initiated), "it doesn't support 1023 byte packets at
> all. 1023/188=5.4 and if worst case bit stuffing factor is 7/6 then up to 6.3
> uframes of transfer time. Completion in [HuFrame 6 or 7] and CSPLIT required in
> [HuFrame 0 of the next frame]." So, this patch also adds support for
> frame-hopping CSPLITS. Per EHCI-1.0 4.12.3.3.2.1, the sitd containing such
> CSPLITS must have a backpointer to the sitd of the previous frame.
>
> The main cases to handle in the backpointer code are period==(1 frame) and
> period!=(1 frame), where period is specifically derived from the bInterval of
> the endpoint.
>
> Say there's a 3-packet urb for the endpoint (urb0, with labels td0_#) in the
> pipeline and another gets added (urb1, with labels td1_#). The result is (the
> nodes below are sitds, and "td" stands for "transaction descriptor") ...
> if period==(1 frame):
> |-----| |-----| |-----| |-----| |-----| |-----|
> |td0_0|-->|td0_1|-->|td0_2|-->|td1_0|-->|td1_1|-->|td1_2|
> |-----|< |-----|< |-----|< |-----|< |-----|< |-----|
> | | | | | | | | | |
> |----| |---| |---| |---| |---|
>
> where time increases from left to right, and the lines/arrows on the bottom
> represent backpointer relationships. Also, all sitds are initialized in the Do
> Complete Split state except for the very first in a stream.
>
> Including a backpointer from the first sitd of urb1 to the last sitd of urb0
> (as done here) is not certainly the 'correct' way to handle urb boundaries. An
> alternative strategy would be to add an additional sitd, td0_3, to urb0 after
> td0_2 that just contains CSPLITS for td0_2. However, td0_3 would have to be in
> the same frame as td1_0 (the first sitd of urb1). It's possible that the
> HC supports having two sitds for one endpoint in one frame, but it
> seems unusual enough that it could produce issues, hence why I avoid this
> alternative strategy. Also, the adopted strategy is simpler to code – the only
> distinctive sitd is the very first in a stream.
>
> if period!=(1 frame):
> |-----| |-----| |-----| |-----| |-----| |-----|
> |td0_0|-->|td0_1|-->|td0_2|-->|td0_3|-->|td0_4|-->|td0_5|-->
> |-----|< |-----| |-----|< |-----| |-----|< |-----| cont.
> | | | | | | below
> |----| |----| |----|
>
> |-----| |-----| |-----| |-----| |-----| |-----|
> -->|td1_0|-->|td1_1|-->|td1_2|-->|td1_3|-->|td1_4|-->|td1_5|
> |-----|< |-----| |-----|< |-----| |-----|< |-----|
> | | | | | |
> |----| |----| |----|
> where the second sitd in each pair is positioned one frame after the first sitd
> of the pair, gets initialized in the Do Complete Split state, contains CSPLITS
> for the transfer that was started in the first sitd of the pair, and has no
> SSPLITS.
>
>
> I also want to make sure I understand something that line ~2197 of the patched
> ehci-sched.c:
> sitd_before = list_last_entry(&sched->td_list, struct ehci_sitd, sitd_list);
> is based on. Namely, in the current code, am I correct that
> insertion of a new urb's sitds into the endpoint's td_list looks like:
>
> td_list initially (say it's loaded up with a 3-packet urb)
> |-----| |-----| |-----|
> |td0_0|-->|td0_1|-->|td0_2|
> |-----| |-----| |-----|
>
> line 2090 in the genuine kernel code:
> (executed three times for a new 3-packet urb)
> list_add(&sitd->sitd_list, &iso_sched->td_list);
>
> |-----| |-----| |-----| |-----| |-----| |-----|
> |td0_0|-->| |-->| |-->| |-->|td0_1|-->|td0_2|
> |-----| |-----| |-----| |-----| |-----| |-----|
>
> ?
>
> The following later lines in stid_link_urb (again genuine kernel code) then put
> these new sitds in their proper place I think:
> 2177 sitd = list_entry(iso_sched->td_list.next,
> 2178 struct ehci_itd, sitd_list);
> 2179 list_move_tail(&sitd->sitd_list, &stream->td_list);
>
> Lastly, I'm still a bit confused by this comment in ehci-sched.c:
> /* special case for isoc transfers larger than 125us:
> * the first and each subsequent fully used uframe
> * must be empty, so as to not illegally delay
> * already scheduled transactions
> */
> To me, the main issue is that the adopted "carryover" approach to budgeting
> doesn't take into account the fact that the TT processes the transactions
> sequentially (at least according to all the footprints shown in the figs. in
> EHCI1 and USB2) and doesn't, e.g., do part of transaction 1, switch to
> transaction 2, then go back to transaction 1. This most obviously is
> problematic for long transactions, hence ehci-sched.c handling this case
> separately.
> I make a point of bringing this up because, given the inflation of the
> max_tt_usecs values in the patch, it is actually the case that a newly budgeted
> endpoint may "delay already scheduled transactions". For example, say there's a
> transaction budgeted for HuFrame 1 with tt_usecs=2 us and a new device gets
> plugged in that requires up to tt_usecs=140 us per transaction. The new device
> will also get budgeted for HuFrame 1. Then, the device's transactions that
> actually do take >125 us (due to unfavorable bit stuffing requirements) will
> delay the 2 us transaction (it's maybe worth pointing out that this relies on
> the fact that sitd_link in ehci-sched.c puts the newer sitds first in the
> periodic queue for each uframe). I don't think there's anything "illegal"
> about this though.
>
> ---
> drivers/usb/host/ehci-sched.c | 132 ++++++++++++++++++++++++++--------
> drivers/usb/host/ehci.h | 5 +-
> 2 files changed, 106 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/usb/host/ehci-sched.c b/drivers/usb/host/ehci-sched.c
> index a241337c9..7c2f2125d 100644
> --- a/drivers/usb/host/ehci-sched.c
> +++ b/drivers/usb/host/ehci-sched.c
> @@ -285,13 +285,13 @@ static void compute_tt_budget(u8 budget_table[EHCI_BANDWIDTH_SIZE],
> for (uf = ps->phase_uf; uf < 8; ++uf) {
> x += budget_line[uf];
> - /* Each microframe lasts 125 us */
> - if (x <= 125) {
> + /* Cumulative tt_usecs can be as large as 145 us */
> + if (x <= 145) {
> budget_line[uf] = x;
> break;
> }
> - budget_line[uf] = 125;
> - x -= 125;
> + budget_line[uf] = 145;
> + x -= 145;
> }
> }
> }
> @@ -313,7 +313,13 @@ static int __maybe_unused same_tt(struct usb_device *dev1,
> #ifdef CONFIG_USB_EHCI_TT_NEWSCHED
> static const unsigned char
> -max_tt_usecs[] = { 125, 125, 125, 125, 125, 125, 30, 0 };
> +/*
> + * tt_usecs includes worst-case bit stuffing time, so the transactions
> + * (complete and/or partial) budgeted for a given 125 us uframe can have a
> + * cumulative tt_usecs as large as 145 us and still possibly fit into the
> + * uframe.
> +*/
> +max_tt_usecs[] = { 145, 145, 145, 145, 145, 145, 35, 0 };
> /* carryover low/fullspeed bandwidth that crosses uframe boundries */
> static inline void carryover_tt_bandwidth(unsigned short tt_usecs[8])
> @@ -378,13 +384,13 @@ static int tt_available(
> if (max_tt_usecs[uframe] <= tt_usecs[uframe])
> return 0;
> - /* special case for isoc transfers larger than 125us:
> + /* special case for isoc transfers larger than max_tt_usecs:
> * the first and each subsequent fully used uframe
> * must be empty, so as to not illegally delay
> * already scheduled transactions
> */
> - if (usecs > 125) {
> - int ufs = (usecs / 125);
> + if (usecs > 145) {
> + int ufs = (usecs / 145);
> for (i = uframe; i < (uframe + ufs) && i < 8; i++)
> if (tt_usecs[i] > 0)
> @@ -1388,19 +1394,40 @@ sitd_slot_ok(
> struct ehci_tt *tt
> )
> {
> - unsigned mask, tmp;
> + unsigned mask, c_mask2, tmp;
> unsigned frame, uf;
> mask = stream->ps.cs_mask << (uframe & 7);
> + c_mask2 = mask >> 16;
> + mask = mask & 0xffff;
> + if((c_mask2 & 1) && (c_mask2 & 1<<1) && (c_mask2 & 1<<2)) {
> + /* if BuFrame6 is the last uframe in which a transaction is budgeted,
> + * the transaction will initially be configured to have CSPLITS in
> + * BuFrame7 as well as BuFrames 0 and 1 of the following frame
> + * (HuFrames 0,1,2) */
> + /* below: usb2 spec 11.18.4.3.c paragraph 2 */
> + if(mask & 1) {
> + c_mask2 = 1;
> + } else {
> + c_mask2 = (1<<2)-1;
> + }
> + } else if ((c_mask2 & 1) && (c_mask2 & 1<<1)) {
> + /* if BuFrame5 is the last uframe in which a transaction is budgeted,
> + * the transaction will initially be configured to have CSPLITS in
> + * BuFrames 6 and 7 as well as BuFrame 0 of the following frame
> + * (HuFrames 7,0,1) */
> + /* below: usb2 spec 11.18.4.3.c paragraph 1 */
> + if(mask & 1) {
> + c_mask2 = 1;
> + }
> + }
> + if((c_mask2 & mask & 1<<1))
> + return 0; /* ehci1 4.12.3.1 */
> /* for OUT, don't wrap SSPLIT into H-microframe 7 */
> if (((stream->ps.cs_mask & 0xff) << (uframe & 7)) >= (1 << 7))
> return 0;
> - /* for IN, don't wrap CSPLIT into the next frame */
> - if (mask & ~0xffff)
> - return 0;
> -
> /* check bandwidth */
> uframe &= stream->ps.bw_uperiod - 1;
> frame = uframe >> 3;
> @@ -1450,8 +1477,10 @@ sitd_slot_ok(
> uframe += stream->ps.bw_uperiod;
> } while (uframe < EHCI_BANDWIDTH_SIZE);
> - stream->ps.cs_mask <<= uframe & 7;
> + stream->ps.cs_mask = mask;
> + stream->ps.c_mask2 = c_mask2;
> stream->splits = cpu_to_hc32(ehci, stream->ps.cs_mask);
> + stream->c_splits2 = cpu_to_hc32(ehci, stream->ps.c_mask2);
> return 1;
> }
> @@ -2049,13 +2078,13 @@ sitd_urb_transaction(
> /* allocate/init sITDs */
> spin_lock_irqsave(&ehci->lock, flags);
> - for (i = 0; i < urb->number_of_packets; i++) {
> -
> - /* NOTE: for now, we don't try to handle wraparound cases
> - * for IN (using sitd->hw_backpointer, like a FSTN), which
> - * means we never need two sitds for full speed packets.
> - */
> -
> + for (i = 0; i < 2 * urb->number_of_packets; i++) {
> + /* use 2 * number_of_packets to accommodate frame-hopping CSPLITS. if
> + * there are no such CSPLITS OR if the packet interval is 1 frame
> + * (meaning frame-hopping CSPLITS don't require an extra sitd, ehci
> + * spec 1.0 4.12.3.4), then more sitds than needed are allocated by
> + * this loop. Excess sitds are added to the free list later
> + */
> /*
> * Use siTDs from the free list, but not siTDs that may
> * still be in use by the hardware.
> @@ -2107,12 +2136,24 @@ sitd_patch(
> {
> struct ehci_iso_packet *uf = &iso_sched->packet[index];
> u64 bufp;
> + __hc32 transaction;
> sitd->hw_next = EHCI_LIST_END(ehci);
> sitd->hw_fullspeed_ep = stream->address;
> - sitd->hw_uframe = stream->splits;
> - sitd->hw_results = uf->transaction;
> - sitd->hw_backpointer = EHCI_LIST_END(ehci);
> + sitd->hw_backpointer = cpu_to_hc32(ehci, sitd->backpointer_sitd_dma);
> + transaction = uf->transaction;
> +
> + if(sitd->backpointer_sitd_dma==1) { /* null backpointer */
> + sitd->hw_uframe=stream->splits;
> + } else {
> + if(stream->ps.period==1) {
> + sitd->hw_uframe=stream->splits|stream->c_splits2;
> + } else {
> + sitd->hw_uframe=stream->c_splits2;
> + }
> + transaction |= SITD_STS_STS; /* start in Do Complete Split mode, ehci1 4.12.3.3.2.1*/
> + }
> + sitd->hw_results = transaction;
> bufp = uf->bufp;
> sitd->hw_buf[0] = cpu_to_hc32(ehci, bufp);
> @@ -2149,13 +2190,19 @@ static void sitd_link_urb(
> unsigned next_uframe;
> struct ehci_iso_sched *sched = urb->hcpriv;
> struct ehci_sitd *sitd;
> + struct ehci_sitd *sitd_before;
> next_uframe = stream->next_uframe;
> - if (list_empty(&stream->td_list))
> + if (list_empty(&stream->td_list)) {
> /* usbfs ignores TT bandwidth */
> ehci_to_hcd(ehci)->self.bandwidth_allocated
> += stream->bandwidth;
> + sitd_before = NULL;
> + } else {
> + sitd_before = list_last_entry(&sched->td_list,
> + struct ehci_sitd, sitd_list);
> + }
> if (ehci_to_hcd(ehci)->self.bandwidth_isoc_reqs == 0) {
> if (ehci->amd_pll_fix == 1)
> @@ -2165,9 +2212,9 @@ static void sitd_link_urb(
> ehci_to_hcd(ehci)->self.bandwidth_isoc_reqs++;
> /* fill sITDs frame by frame */
> - for (packet = sched->first_packet, sitd = NULL;
> - packet < urb->number_of_packets;
> - packet++) {
> + for (int i = sched->first_packet, sitd = NULL;
> + i < 2 * urb->number_of_packets;
> + i++) {
> /* ASSERT: we have all necessary sitds */
> BUG_ON(list_empty(&sched->td_list));
> @@ -2176,15 +2223,40 @@ static void sitd_link_urb(
> sitd = list_entry(sched->td_list.next,
> struct ehci_sitd, sitd_list);
> + if((i>=urb->number_of_packets)&&((stream->ps.period==1)||(!(stream->ps.c_mask2)))) {
> + /*
> + * no frame-hopping CSPLITS OR the period is 1, so such CSPLITS are
> + * put into the sitd for the next transfer ; in both cases
> + * #sitds=#transfers, so move the surplus sitd to the free list
> + */
> + list_move_tail(&sitd->sitd_list, &stream->free_list);
> + continue;
> + }
> + if(stream->ps.c_mask2 && !list_empty(&stream->td_list) &&
> + ((stream->ps.period==1)||(i%2==1)) ) {
> + sitd->backpointer_sitd_dma = sitd_before->sitd_dma;
> + } else {
> + sitd->backpointer_sitd_dma = 1;
> + }
> list_move_tail(&sitd->sitd_list, &stream->td_list);
> sitd->stream = stream;
> sitd->urb = urb;
> - sitd_patch(ehci, stream, sitd, sched, packet);
> + sitd_patch(ehci, stream, sitd, sched, i);
> sitd_link(ehci, (next_uframe >> 3) & (ehci->periodic_size - 1),
> sitd);
> - next_uframe += stream->uperiod;
> + if(stream->ps.c_mask2 && (stream->ps.period!=1)) {
> + if((i%2)==0) {
> + /* next sitd only has frame-hopping CSPLITS */
> + next_uframe += 8;
> + } else if ((i%2)==1) {
> + next_uframe += stream->uperiod - 8;
> + }
> + } else {
> + next_uframe += stream->uperiod;
> + }
> + sitd_before = sitd;
> }
> stream->next_uframe = next_uframe & (mod - 1);
> diff --git a/drivers/usb/host/ehci.h b/drivers/usb/host/ehci.h
> index d7a3c8d13..c1c006bd3 100644
> --- a/drivers/usb/host/ehci.h
> +++ b/drivers/usb/host/ehci.h
> @@ -51,6 +51,7 @@ struct ehci_per_sched {
> struct list_head ps_list; /* node on ehci_tt's ps_list */
> u16 tt_usecs; /* time on the FS/LS bus */
> u16 cs_mask; /* C-mask and S-mask bytes */
> + u8 c_mask2; /* C-mask for frame-crossing TT-iso transfer */
> u16 period; /* actual period in frames */
> u16 phase; /* actual phase, frame part */
> u8 bw_phase; /* same, for bandwidth
> @@ -489,7 +490,8 @@ struct ehci_iso_stream {
> /* output of (re)scheduling */
> struct ehci_per_sched ps; /* scheduling info */
> unsigned next_uframe;
> - __hc32 splits;
> + __hc32 splits; /* C-mask and S-mask */
> + __hc32 c_splits2; /* C-mask for frame-hopping CSPLITS */
> /* the rest is derived from the endpoint descriptor,
> * including the extra info for hw_bufp[0..2]
> @@ -579,6 +581,7 @@ struct ehci_sitd {
> /* the rest is HCD-private */
> dma_addr_t sitd_dma;
> + dma_addr_t backpointer_sitd_dma;
> union ehci_shadow sitd_next; /* ptr to periodic q entry */
> struct urb *urb;
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH] USB: EHCI: inflate max_tt_usecs and implement sitd backpointers
2026-05-18 18:24 ` Brent Page
@ 2026-05-20 17:05 ` Brent Page
2026-05-21 5:52 ` Greg KH
0 siblings, 1 reply; 4+ messages in thread
From: Brent Page @ 2026-05-20 17:05 UTC (permalink / raw)
To: linux-usb; +Cc: Alan Stern, Michal Pecio
OK, a coherent set of additions/edits to the patch:
---
diff --git a/drivers/usb/host/ehci-sched.c b/drivers/usb/host/ehci-sched.c
index 7c2f2125d..04cf55484 100644
--- a/drivers/usb/host/ehci-sched.c
+++ b/drivers/usb/host/ehci-sched.c
@@ -1478,7 +1478,7 @@ sitd_slot_ok(
} while (uframe < EHCI_BANDWIDTH_SIZE);
stream->ps.cs_mask = mask;
- stream->ps.c_mask2 = c_mask2;
+ stream->ps.c_mask2 = c_mask2 << 8;
stream->splits = cpu_to_hc32(ehci, stream->ps.cs_mask);
stream->c_splits2 = cpu_to_hc32(ehci, stream->ps.c_mask2);
return 1;
@@ -2242,7 +2242,14 @@ static void sitd_link_urb(
sitd->stream = stream;
sitd->urb = urb;
- sitd_patch(ehci, stream, sitd, sched, i);
+ if(stream->ps.c_mask2 && (stream->ps.period!=1)) {
+ packet = i/2;
+ sitd->last_in_urb = i==(2*urb->number_of_packets - 1);
+ } else {
+ packet = i;
+ sitd->last_in_urb = i==(urb->number_of_packets-1);
+ }
+ sitd_patch(ehci, stream, sitd, sched, packet);
sitd_link(ehci, (next_uframe >> 3) & (ehci->periodic_size - 1),
sitd);
@@ -2291,13 +2298,17 @@ static bool sitd_complete(struct ehci_hcd *ehci, struct ehci_sitd *sitd)
int urb_index;
struct ehci_iso_stream *stream = sitd->stream;
bool retval = false;
+ bool has_ssplits;
urb_index = sitd->index;
desc = &urb->iso_frame_desc[urb_index];
t = hc32_to_cpup(ehci, &sitd->hw_results);
+ has_ssplits = hc32_to_cpu(ehci, sitd->hw_uframe) & 0x00ff;
/* report transfer status */
- if (unlikely(t & SITD_ERRS)) {
+ if(!has_ssplits) { /* just contains frame-hopping CSPLITS */
+ desc->status = 0; /* actual completion status reported by previous sitd */
+ } else if (unlikely(t & SITD_ERRS)) {
urb->error_count++;
if (t & SITD_STS_DBE)
desc->status = usb_pipein(urb->pipe)
@@ -2317,7 +2328,7 @@ static bool sitd_complete(struct ehci_hcd *ehci, struct ehci_sitd *sitd)
}
/* handle completion now? */
- if ((urb_index + 1) != urb->number_of_packets)
+ if (!sitd->last_in_urb)
goto done;
/*
diff --git a/drivers/usb/host/ehci.h b/drivers/usb/host/ehci.h
index c1c006bd3..6c9b6f390 100644
--- a/drivers/usb/host/ehci.h
+++ b/drivers/usb/host/ehci.h
@@ -583,6 +583,7 @@ struct ehci_sitd {
dma_addr_t sitd_dma;
dma_addr_t backpointer_sitd_dma;
union ehci_shadow sitd_next; /* ptr to periodic q entry */
+ bool last_in_urb;
struct urb *urb;
struct ehci_iso_stream *stream; /* endpoint's queue */
---
I configured my 1023-byte-endpoint peripheral to have a bInterval of 2 so that I could test the patch's period!=1 code. The patch appeared to perform well for this case. That said, the peripheral's endpoint is actually not that demanding because the number of data bytes per frame is only ~750. It just has to set wMaxPacketSize to 1023 due to hardware limitations and the fact that 750>512. So, I plan to conduct some more intensive tests requiring closer to 1023 iso-in data bytes and maybe a few interrupt bytes for good measure.
From,
Brent Page
^ permalink raw reply related [flat|nested] 4+ messages in thread* Re: [PATCH] USB: EHCI: inflate max_tt_usecs and implement sitd backpointers
2026-05-20 17:05 ` Brent Page
@ 2026-05-21 5:52 ` Greg KH
0 siblings, 0 replies; 4+ messages in thread
From: Greg KH @ 2026-05-21 5:52 UTC (permalink / raw)
To: Brent Page; +Cc: linux-usb, Alan Stern, Michal Pecio
On Wed, May 20, 2026 at 10:05:39AM -0700, Brent Page wrote:
>
> OK, a coherent set of additions/edits to the patch:
> ---
> diff --git a/drivers/usb/host/ehci-sched.c b/drivers/usb/host/ehci-sched.c
> index 7c2f2125d..04cf55484 100644
> --- a/drivers/usb/host/ehci-sched.c
> +++ b/drivers/usb/host/ehci-sched.c
> @@ -1478,7 +1478,7 @@ sitd_slot_ok(
> } while (uframe < EHCI_BANDWIDTH_SIZE);
> stream->ps.cs_mask = mask;
> - stream->ps.c_mask2 = c_mask2;
> + stream->ps.c_mask2 = c_mask2 << 8;
> stream->splits = cpu_to_hc32(ehci, stream->ps.cs_mask);
> stream->c_splits2 = cpu_to_hc32(ehci, stream->ps.c_mask2);
> return 1;
Whitespace corrupted and can not be applied :(
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-05-21 5:53 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-18 6:42 [PATCH] USB: EHCI: inflate max_tt_usecs and implement sitd backpointers Brent Page
2026-05-18 18:24 ` Brent Page
2026-05-20 17:05 ` Brent Page
2026-05-21 5:52 ` Greg KH
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox