public inbox for linux-usb@vger.kernel.org
 help / color / mirror / Atom feed
From: Michal Pecio <michal.pecio@gmail.com>
To: Mathias Nyman <mathias.nyman@intel.com>
Cc: linux-usb@vger.kernel.org
Subject: [PATCH 0/2] Fix the NEC stop bug workaround
Date: Mon, 14 Oct 2024 21:08:40 +0200	[thread overview]
Message-ID: <20241014210840.5941d336@foxbook> (raw)

Hi,

I found an unfortunate problem with my workaround for this hardware bug.

To recap, Stop Endpoint sometimes fails, the Endpoint Context says the
EP is Stopped, but cancelled TRBs are still executed. I found this bug
earlier this year and submitted a workaround, which retries the command
(sometimes a few times) and all is good.

This works fine for common cases, but what if the endpoint is really
stopped? Then Stop Endpoint is supposed to fail and fail it does. The
workaround code doesn't know that it happened and retries infinitely.

I have never seen it in normal use, but I devised a reliable repro.
The effect isn't pretty - no URBs can be cancelled, device gets stuck,
if unplugged it locks up connections/disconnections on the whole bus.

With some experimentation I found that the bug is a variant of the old
"stop after restart" issue - the doorbell ring is internally reordered
after the subsequent command. By busy-waiting I confirmed that EP state
which is initially seen as Stopped becomes Running some time later.

I came up with a few ways to deal with it, this patch implements #3
which I think is the lowest risk. But for completeness:

0. Do nothing, revert the old patch. Not great, we are back to those
races and DMA-after-free. Seems particularly dangereous on Int IN EPs,
which may take a few ms to start - plenty of time to reuse URB buffers.

1. Ring the doorbell before queuing another Stop Endpoint. This ensures
that the EP will restart even if it wasn't meant to yet. Then a retry
succeeds and we are done. Super-simple and seems to work 100% reliably,
but what if there are still more gotchas in this hardware?

2. Set a flag on doorbell ring, clear on Stop EP or Halt. Look at the
flag to decide if it's the bug or a legitimately stopped EP. But I
didn't like adding overhead to DB ring, even if almost unmeasurable.

3. Set a flag when we know that the command will fail. This appears to
actually be doable. Then we just look at the flag and retry only if it
wasn't supposed to fail.

And some further ideas, considered but not implemented yet:

4. If we know that the command will fail, don't queue it at all. Other
commands are pending in those cases, so modify their handlers to do
our work for us. A little more invasive than the simple fixes 1-3.

5. Maybe it would make sense to ensure that the command can't ever
retry infinitely. Just give up and call hc_died() after 5 seconds.

Regards,
Michal

             reply	other threads:[~2024-10-14 19:08 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-14 19:08 Michal Pecio [this message]
2024-10-14 19:10 ` [PATCH 1/2] usb: xhci: Fix the NEC stop bug workaround Michal Pecio
2024-10-15 10:38   ` Greg KH
2024-10-15 11:05   ` Mathias Nyman
2024-10-15 13:27     ` Michał Pecio
2024-10-14 19:11 ` [PATCH 2/2] usb: xhci: Warn about suspected "start-stop" bugs in HCs Michal Pecio
2024-10-15 10:40   ` Greg KH
2024-10-15 18:52     ` Michał Pecio
2024-10-15 12:23 ` [PATCH 0/2] Fix the NEC stop bug workaround Mathias Nyman
2024-10-15 14:51   ` Alan Stern
2024-10-16  5:47   ` Michał Pecio
2024-10-24 15:29     ` Mathias Nyman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241014210840.5941d336@foxbook \
    --to=michal.pecio@gmail.com \
    --cc=linux-usb@vger.kernel.org \
    --cc=mathias.nyman@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox