From: Lukasz Majewski <lukma@denx.de>
To: u-boot@lists.denx.de
Subject: [PATCH] usb: ehci: Fix "EHCI timed out on TD - token=XXXX" error on ehci-hcd
Date: Mon, 2 Mar 2020 14:01:07 +0100 [thread overview]
Message-ID: <20200302140107.38e34657@jawa> (raw)
In-Reply-To: <19bb8417-8aa1-c963-2ef7-61c82f4f2d9b@denx.de>
Hi Marek,
> On 3/1/20 6:59 PM, Lukasz Majewski wrote:
> > Hi Marek,
>
> Hi,
>
> [...]
>
> >>> The description of the fix written by the original author:
> >>> https://forum.doozan.com/read.php?3,35295,35295#msg-35295
> >>>
> >>> states that the reduction of the transfer is done just for "spin
> >>> up"/"detection" of the pen drive. The issue is that not all pen
> >>> drive disks are able to be properly detected in the first place.
> >>>
> >>
> >> But the commit message claims this patch reduces the transfer rate
> >> by 20% always. So that's a NAK right away, sorry,
> >
> > Marek, now we do have from time to time NOT functional USB (once per
> > ten attempts on iMX6Q - tpc70 board). If somebody relies on 'usb
> > start' for recovery/normal operation - it is bad.
> >
> > In such a use case - the 20% performance drop is acceptable.
>
> I'm sure we can do better. 2020.04 is still quite far out.
>
> Can you take a look ?
There are several issues which are fixed by this patch.
I will reply to the actual patch set to point them out in the code, as
this may be more readable.
>
> >> you need to find a
> >> better solution which doesn't have such an enormous impact.
> >>
> >> I'm fine with the spin-up handling, but this auto-detection of
> >> transfer size should be pulled into separate patch and is likely
> >> wrong anyway.
> >
> > I cannot agree that is it "likely wrong anyway" - please correct me
> > if I'm wrong, but up till now we were fiddling with transfer size
> > up till having "good enough" results.
>
> Correction, read the patches I linked above
> (7d6fd7f0ba71cd93d94079132f958d9630f27a89
The above sha1 uses the observation (and heuristics) from other
operating systems to deal with usb pendrives, which controllers cannot
handle more than 256 sectors (as they have 8 bit counter).
> and esp.
> 02b0e1a36c5bc20174299312556ec4e266872bd6).
This patch introduces the optimization - instead of setup/disable async
transmissions - it enables it once and then just adds new tokens to be
transmitted.
Is the transmission error (USB_ST_XACTERR 0x80) handled at this point?
> Those solve the "we were
> fiddling around with transfer size" attempts, because I actually
> invested the time to understand the issue with those devices that
> failed and I was right about never letting any of those "fiddling
> around with transfer size" patches into mainline.
>
> The problem with those devices which needed "fiddling with transfer
> size" was that their internal 16bit counters (or 8bit counters)
> overflew when a very long transfer was started and thus those devices
> failed. The fix there was to transfer less than that amount of data
> which triggered the overflow, which on a wast majority of devices is
> 240 blocks. However , this led to a massive slowdown, which is
> unacceptable. A fix for that slowdown was to avoid turning async
> schedule off-on between each transfer, but rather keep it running,
> that saves A LOT of wasted transfer time. If you build a long
> transfer up front, you would never do that many transfers to notice
> the delays incurred by turning the async schedule off/on, but if you
> do multiple shorter transfers, these delays add up.
>
Thanks for the explanation.
> Yes, these issues are convoluted and take time to solve properly, yet
> the end result might not have such a performance hit.
>
> > The above link describes what needs to be done in case of the "EHCI
> > timed out on TD - token=0x1f8c80" error.
> >
> > Do we handle it now? In my opinion we don't.
> >
> >>
> >> So try to separate this into two patches, one for handling the spin
> >> up, another one for this auto-detection of the transfer size. And I
> >> think you will end up needed only the first one.
> >
> > This is an open question if only the spin up fixes the error. It
> > would be great if you could look for the changes introduced in this
> > patch and compare it with your experience of ehci driver
> > development.
>
> Surely you can separate those two patches and perform such a test ?
I would prefer to first go through the actual patch code, before I
spent time on things which may be not necessary.
>
> >>> Last but not least - the mainline is still broken. The
> >>> "token=0x1f8c80" errors pops up from time to time. However, after
> >>> applying the approach from this fix - the error is gone (and pass
> >>> some acceptance tests).
> >>
> >> It works for me.
> >
> > Also works for Tom.
> >
> >> Do you have a specific device which fails ?
> >
> > Yes. On my desk - tpc70 with i.MX6Q. And hdc with i.MX53. I can
> > confirm that from U-Boot 2018.04 up till now the rate of:
> >
> > "EHCI timed out on TD - token=0x1f8c80" error has been reduced
> > significantly for their use case (reading ~16MiB of binary from
> > several pen drives).
>
> Can you run lsusb (to get IDs) on those pendrives which fail ?
> I would like to add them to a list.
This was working correctly on the beginning, but after some time it
started to show errors in question:
Bus 001 Device 040: ID 0930:6544 Toshiba Corp. TransMemory-Mini /
Kingston DataTraveler 2.0 Stick
This one was causing issues from the very beginning:
Bus 001 Device 041: ID 058f:6387 Alcor Micro Corp. Flash Drive
This one - Verbatim
Bus 001 Device 042: ID 21c4:8005
Never caused issues - worked reliably even with old U-Boot (on which
errors were observed very often).
>
> > Unfortunately the errors are still present from time to time.
> >
> >> Which
> >> device and on which hardware ?
> >>
> >> Also, what is "some acceptance test" ?
> >>
> >
> > The set of automated tests to assess if "usb start" don't break.
>
> The usb start only reads out the partition table and does not do any
> long transfers (above 240 blocks, which usually triggers the failure
> on cheap sticks). So if long transfers fail for you, then this is a
> separate issue from the 'usb start' spin-up problem. And so, if this
> patch makes both work, then this indeed should be two patches.
Best regards,
Lukasz Majewski
--
DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma at denx.de
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20200302/b7a659d3/attachment.sig>
next prev parent reply other threads:[~2020-03-02 13:01 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-26 11:29 [PATCH] usb: ehci: Fix "EHCI timed out on TD - token=XXXX" error on ehci-hcd Lukasz Majewski
2020-02-26 16:05 ` Tom Rini
2020-02-28 23:32 ` Marek Vasut
2020-02-29 3:20 ` Tom Rini
2020-02-29 7:20 ` Marek Vasut
2020-03-01 23:04 ` Tom Rini
2020-03-02 0:39 ` Marek Vasut
2020-03-02 17:00 ` Tom Rini
2020-03-14 18:16 ` Marek Vasut
2020-03-01 17:19 ` Lukasz Majewski
2020-03-01 17:35 ` Marek Vasut
2020-03-01 17:59 ` Lukasz Majewski
2020-03-01 18:39 ` Marek Vasut
2020-03-02 13:01 ` Lukasz Majewski [this message]
2020-03-02 19:08 ` Marek Vasut
2020-03-02 13:21 ` Lukasz Majewski
2020-03-02 19:19 ` Marek Vasut
2020-03-02 19:54 ` Tom Rini
2020-03-02 19:58 ` Marek Vasut
2020-03-02 19:59 ` Tom Rini
2020-03-02 23:25 ` Lukasz Majewski
2020-03-03 12:29 ` Marek Vasut
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200302140107.38e34657@jawa \
--to=lukma@denx.de \
--cc=u-boot@lists.denx.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.