linux-usb.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Michał Pecio" <michal.pecio@gmail.com>
To: Niklas Neronin <niklas.neronin@linux.intel.com>,
	mathias.nyman@linux.intel.com
Cc: linux-usb@vger.kernel.org
Subject: Re: [PATCH 4/4] usb: xhci: handle Set TR Deq Context State Error due to Endpoint state
Date: Mon, 25 Aug 2025 09:15:52 +0200	[thread overview]
Message-ID: <20250825091552.350d027e.michal.pecio@gmail.com> (raw)
In-Reply-To: <20250822101514.3a730f4f.michal.pecio@gmail.com>

On Fri, 22 Aug 2025 10:15:14 +0200, Michał Pecio wrote:
> Set TR Dequeue code is pretty dumb and it will jump to the first TRB
> after the first cancelled TD (or something like that). If the endpoint
> may have been running when Set TR Dequeue failed, is it guaranteed that
> trying again won't rewind it to some already completed TD?

Answering my own question, rewinding back to completed TDs could only
happen if TD cancel status were TD_HALTED, but it's made TD_DIRTY here.
In such case xhci_invalidate_cancelled_tds() will perform trb_in_td()
check and find that another Set TR Deq is not necessary if the HW has
already moved past the TD we tried to get out of. (It may still issue
the command for some other TD, of course.)

BTW, if the HW completed this TD, the TD *should* have already been
given back and this patch should have no effect then. But this is not
guaranteed, because fe49df60cdb7 ("xhci: Mitigate failed set dequeue
pointer commands") turned the TD into No-Ops. (Note that the contrary
isn't guaranteed either, since the xHC is allowed to cache this TD.)

Perhaps fe49df60cdb7 isn't really right? It's not illegal to complete
an unlinked URB due to race with normal execution. Sure, the condition
discussed here is abnormal, but callers won't know the difference.

And in the ASMedia case mentioned by fe49df60cdb7:

1. The unhandled Babble Error makes no real difference. IME, after
   a TRB Error from Set TR Dequeue, the endpoint stays stuck anyway.
   Some other HCs also get stuck similarly, but without TRB Error.

2. Babble Error wasn't handled because the TD had already been given
   back earlier when Set TR Deq failed with TRB Error. Perhaps that's
   the real bug which deserves fixing?

The ASMedia TRB Error case:
https://lore.kernel.org/linux-usb/ZsjgmCjHdzck9UKd@alphanet.ch/
https://lore.kernel.org/linux-usb/20241113002658.0a031681@foxbook/

> Is it impossible that the TD we tried to skip has been given back and
> the endpoint and/or slot are being disabled right now?

It's not impossible, but if the TD is given back and no other TDs
remain then nothing more will be done, so no problem.


In general, this patch looks like an improvement over the status quo -
pretending that the command succeeded and then unconditionally giving
back the TD, even though the HW may still complete it later.

That being said,

1. I'm not aware of any known cases leading to this situation.

2. A loop which finds and updates the TD_CLEARING_CACHE item already
   exists, so I think it would be better to modify this loop instead
   of adding another one. And the loop prints some xhci_dbg(), so it
   would be nice if they showed up in this case as well.


Regards,
Michal

  reply	other threads:[~2025-08-25  7:15 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-18 12:57 [PATCH 0/4] usb: xhci: improve Set TR Deq completion error handling Niklas Neronin
2025-08-18 12:57 ` [PATCH 1/4] usb: xhci: handle Set TR Deq TRB Error Niklas Neronin
2025-08-21  9:32   ` Michał Pecio
2025-08-22 11:00     ` Neronin, Niklas
2025-08-18 12:57 ` [PATCH 2/4] usb: xhci: handle Set TR Deq Slot Not Enabled Error Niklas Neronin
2025-08-21 10:35   ` Michał Pecio
2025-08-22  7:22     ` Michał Pecio
2025-08-18 12:57 ` [PATCH 3/4] usb: xhci: handle Set TR Deq Context State Error due to Slot state Niklas Neronin
2025-08-22  7:49   ` Michał Pecio
2025-08-18 12:57 ` [PATCH 4/4] usb: xhci: handle Set TR Deq Context State Error due to Endpoint state Niklas Neronin
2025-08-22  8:15   ` Michał Pecio
2025-08-25  7:15     ` Michał Pecio [this message]
2025-08-27 11:53       ` Neronin, Niklas
2025-08-29  8:06         ` Michał Pecio
2025-08-29  8:14           ` Michał Pecio
2025-09-01  6:18           ` Michał Pecio

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250825091552.350d027e.michal.pecio@gmail.com \
    --to=michal.pecio@gmail.com \
    --cc=linux-usb@vger.kernel.org \
    --cc=mathias.nyman@linux.intel.com \
    --cc=niklas.neronin@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).