From: Mathias Nyman <mathias.nyman@intel.com>
To: Michal Pecio <michal.pecio@gmail.com>, linux-usb@vger.kernel.org
Subject: Re: [PATCH 0/2] xhci: Fix the NEC stop bug workaround
Date: Mon, 28 Oct 2024 11:54:39 +0200 [thread overview]
Message-ID: <f6dcf205-e8eb-4ba8-91d9-24fa0f769739@intel.com> (raw)
In-Reply-To: <20241028083351.655d54cf@foxbook>
On 28.10.2024 9.33, Michal Pecio wrote:
> Hi Mathias,
>
> I would be grateful if you could take a look at patch 2/2 and tell if
> there is anything obviously wrong with this approach. As far as I see,
> it should be OK to just call those invalidation and giveback functions
> directly from xhci_urb_dequeue(), and it works for me in practice.
Adding EP_HALTED case to xhci_urb_deqeue() should work fine, we
will both invalidate and give back the invalidated TDs in the completion
handler.
>
> Regarding the probem with xhci_invalidate_cancelled_tds() being called
> while Set TR Dequeue is already pending, I found that it is much easier
> to handle it by looking at SET_DEQ_PENDING and simply setting all TDs
> to TD_CLEARING_CACHE_DEFERRED if that's the case. So this is solved.
>
The SET_DEQ_PENDING case is trickier. We would read the dequeue pointer
from hardware while we know hardware is processing a command to move the
dequeue pointer. Result may be old dequeue, or new dequeue,
possible unknown.
We are turning an already difficult issue even more complex
>
> However, these patches still don't solve the issue of infinite retries
> completely. There is one more annoying case caused by halts. It is very
> poorly defined what happens when a halted EP is hard-reset. Usually Set
> TR Dequeue executes afterwards and it restarts the EP when done. But if
> it doesn't, the EP stays stopped until a new URB is submitted, if ever.
>
> There is the EP_HARD_CLEAR_TOGGLE flag which is set until the class
> driver calls usb_clear_halt(), but it's neither the case that the EP is
> guaranteed to be stopped until usb_clear_halt() is called nor that it
> is guaranteed to restart afterwards.
>
> Actually, I think it might be a bug? What if Set TR Dequeue restarts an
> EP before the class driver clears the device side of the halt?
Ok, I need to take some time to dig into this.
>
>
> I'm starting to think that it may not be realistic to quickly solve all
> those (and possibly other not known yet) problems now. Maybe just slap
> a 100ms timeout on those retries, add quirks for ASMedia/Intel and call
> it a day for now?
There are some mitigations that could be done.
As many of these issues are related to slow enpoint slow start causing
next stop endpoint command to complete with context error as endpoint is
still stopped.
We could ring the doorbell before giving back the invalidated tds in
xhci_handle_cmd_stop_ep(), and possibly xhci_handle_cmd_set_deq().
This gives hardware a bit more time to start the endpoint.
We could also track pending ring starts.
Set a "EP_START_PENDING flag when doorbell is rung on a stopped endpoint.
clear the flag when firt transfer event is received on that endpoint.
If a stop endpoint command fails with context error due to still being
stopped queue a new stop endpoint command, but only if flag was set:
xhci_handle_cmd_stop_ep()
if (comp_code == COMP_CONTEXT_STATE_ERROR) {
switch (GET_EP_CTX_STATE(ep_ctx))
case EP_STATE_STOPPED:
if (!(ep->ep_state & EP_START_PENDING)
break;
ep->ep_state &= ~EP_START_PENDING;
xhci_queue_stop_endpoint();
Thanks
Mathias
next prev parent reply other threads:[~2024-10-28 9:52 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-25 10:18 [PATCH 0/2] xhci: Fix the NEC stop bug workaround Michal Pecio
2024-10-25 10:19 ` [PATCH v2 1/2] usb: " Michal Pecio
2024-10-25 10:20 ` [PATCH v2 2/2 RFC] usb: xhci: Don't queue redundant Stop Endpoint commands Michal Pecio
2024-10-28 7:33 ` [PATCH 0/2] xhci: Fix the NEC stop bug workaround Michal Pecio
2024-10-28 9:54 ` Mathias Nyman [this message]
2024-10-29 8:28 ` Michał Pecio
2024-10-29 9:16 ` Mathias Nyman
2024-10-30 8:29 ` Mathias Nyman
2024-10-31 8:13 ` Michał Pecio
2024-10-31 10:49 ` Michał Pecio
2024-10-31 11:17 ` Michał Pecio
2024-10-31 14:22 ` Mathias Nyman
2024-11-01 9:10 ` Michał Pecio
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f6dcf205-e8eb-4ba8-91d9-24fa0f769739@intel.com \
--to=mathias.nyman@intel.com \
--cc=linux-usb@vger.kernel.org \
--cc=michal.pecio@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.