From: Mathias Nyman <mathias.nyman@linux.intel.com>
To: Michal Pecio <michal.pecio@gmail.com>,
	Mathias Nyman <mathias.nyman@intel.com>
Cc: linux-usb@vger.kernel.org
Subject: Re: [PATCH 0/2] Fix the NEC stop bug workaround
Date: Tue, 15 Oct 2024 15:23:23 +0300
Message-ID: <e3f8e58d-d132-430f-875f-283d8055b6c0@linux.intel.com>
In-Reply-To: <20241014210840.5941d336@foxbook>

On 14.10.2024 22.08, Michal Pecio wrote:
> Hi,
> 
> I found an unfortunate problem with my workaround for this hardware bug.
> 
> To recap, Stop Endpoint sometimes fails, the Endpoint Context says the
> EP is Stopped, but cancelled TRBs are still executed. I found this bug
> earlier this year and submitted a workaround, which retries the command
> (sometimes a few times) and all is good.
> 
> This works fine for common cases, but what if the endpoint is really
> stopped? Then Stop Endpoint is supposed to fail and fail it does. The
> workaround code doesn't know that it happened and retries infinitely.
> 
> I have never seen it in normal use, but I devised a reliable repro.
> The effect isn't pretty - no URBs can be cancelled, device gets stuck,
> if unplugged it locks up connections/disconnections on the whole bus.
> 
> With some experimentation I found that the bug is a variant of the old
> "stop after restart" issue - the doorbell ring is internally reordered
> after the subsequent command. By busy-waiting I confirmed that EP state
> which is initially seen as Stopped becomes Running some time later.
> 

It seems host controllers aren't designed to stop an endpoint, move its
dequeue pointer, and restart it in quick succession.

In addition to fixing this NEC case, we could think about avoiding these
situations altogether. Some could be avoided by adding a new
".flush_endpoint()" callback to the USB host-side API. usbcore already has
a usb_hcd_flush_endpoint() function that calls .urb_dequeue() in a loop for
each queued URB, causing the host to issue the stop, move the dequeue
pointer, and ring the doorbell for every URB.

If usbcore knows that all URBs will be cancelled, it could let the host do
it in one go, i.e. stop the endpoint once.
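
To illustrate the difference, here is a toy model in plain C. It is not
kernel code: every name in it (toy_ep, toy_urb_dequeue(),
toy_flush_by_dequeue_loop(), toy_flush_endpoint()) is hypothetical, and the
command counts simply follow the description above. It assumes one stop,
one dequeue move and one doorbell ring per cancelled URB, and that a
one-shot flush needs no restart because nothing is left queued.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names are real kernel APIs. */
struct toy_ep {
	unsigned int queued_urbs;  /* URBs still queued on the endpoint */
	unsigned int stop_cmds;    /* Stop Endpoint commands issued */
	unsigned int set_deq_cmds; /* dequeue pointer moves issued */
	unsigned int doorbells;    /* doorbell rings restarting the endpoint */
};

/* Cancel one URB: a full stop / move dequeue / restart cycle (assumed). */
static void toy_urb_dequeue(struct toy_ep *ep)
{
	assert(ep != NULL);
	if (ep->queued_urbs == 0)
		return;
	ep->stop_cmds++;
	ep->set_deq_cmds++;
	ep->queued_urbs--;
	ep->doorbells++;
}

/* Flush by dequeuing each queued URB in a loop, as described above. */
static void toy_flush_by_dequeue_loop(struct toy_ep *ep)
{
	assert(ep != NULL);
	while (ep->queued_urbs > 0)
		toy_urb_dequeue(ep);
}

/* Hypothetical one-shot flush: stop once, then drop everything queued. */
static void toy_flush_endpoint(struct toy_ep *ep)
{
	assert(ep != NULL);
	if (ep->queued_urbs == 0)
		return;
	ep->stop_cmds++;
	ep->set_deq_cmds++;
	ep->queued_urbs = 0;
	/* No doorbell ring: nothing is left to run in this model. */
}
```

In this model, flushing an endpoint with N queued URBs costs N stop/restart
cycles with the loop and a single stop with the one-shot variant. Fewer
cycles means fewer chances to hit the stop-after-restart window, although
it would not remove the need for the NEC fix itself.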

Thanks
Mathias