From: Michal Pecio <michal.pecio@gmail.com>
To: Martin Alderson <martinalderson@gmail.com>
Cc: Mathias Nyman <mathias.nyman@linux.intel.com>, linux-usb@vger.kernel.org
Subject: Re: xhci_hcd: AMD Raphael/Granite Ridge USB 2.0 xHCI [1022:15b8] dies on resume from suspend
Date: Sat, 13 Jun 2026 11:26:48 +0200
Message-ID: <20260613112648.2ea61301.michal.pecio@gmail.com>
In-Reply-To: <CA+_z3hS4sdnZAhAvO+dDTR9hZVwGudDrsw93Do6dZkva8vK=5A@mail.gmail.com>

On Sat, 6 Jun 2026 14:12:22 +0100, Martin Alderson wrote:
> So the ordering that kills the controller is:
> 
> 1. dj work issues a control SET_REPORT on ep0; the URB lands on the ring
> 2. usb_suspend_both() → usb_suspend_device() drives the port to U3
> 3. only afterwards does usb_suspend_both() set udev->can_submit = 0
> and call usb_hcd_flush_endpoint() (drivers/usb/core/driver.c) — and
> that flush unlinks the still-pending ep0 URB
> 4. xhci issues Stop Endpoint to an endpoint on a U3 port → 5s timeout → HC died
> 
> That matches the trace exactly: the "Cancel URB ... ep 0x0" appears
> after "Set port 7-1 link state ... U3", and the debugfs command ring
> shows the single stuck Stop Endpoint TRB (slot 1, ep 1).

Makes sense. And good point about usb_hcd_flush_endpoint() - if this
gets stuck then usb_suspend_both() can't complete, which explains why
the HC still isn't suspended either.

> I had Claude patch the driver and this seems to fix it:
> 
> --- /tmp/hid-logitech-dj.orig.c 2026-06-06 14:08:26.580516662 +0100
> +++ hid-logitech-dj.c 2026-06-06 13:42:15.702948099 +0100

Improving HID drivers is one thing (which should be discussed with HID
maintainers - see Documentation/process/submitting-patches.rst), but
other drivers may behave similarly and it would be nice if this didn't
crash xHCI controllers.

While URBs on a suspended device are weird, it turns out that the xHCI
spec (4.15.1) doesn't prohibit them; it only says that endpoints should
be stopped. Which they are - the problem happens when we try to stop
them again. I'd expect such a command to simply complete with Context
State Error, which it does in my tests on different HW, but yours gets
stuck.
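
To make that expectation concrete, here is a toy user-space model (not
kernel code). Every name in it is made up for illustration, and the
hangs_on_redundant_stop flag only stands in for whatever the failing
chip does; the real trigger is still unknown.

```c
/* Toy model of a redundant Stop Endpoint command; not kernel code. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, chosen for this sketch only. */
enum toy_ep_state { TOY_EP_RUNNING, TOY_EP_STOPPED };
enum toy_cmd_result { TOY_SUCCESS, TOY_CONTEXT_STATE_ERROR, TOY_TIMEOUT };

struct toy_hc {
	/* Assumption: models a chip that never completes the command. */
	bool hangs_on_redundant_stop;
};

/* Issue Stop Endpoint and return the modelled command completion. */
static enum toy_cmd_result toy_stop_endpoint(const struct toy_hc *hc,
					     enum toy_ep_state *ep)
{
	assert(hc && ep);

	if (*ep == TOY_EP_RUNNING) {
		*ep = TOY_EP_STOPPED;
		return TOY_SUCCESS;
	}

	/* Endpoint is already stopped, so the command is redundant. */
	if (hc->hangs_on_redundant_stop)
		return TOY_TIMEOUT;	/* no completion event arrives */

	return TOY_CONTEXT_STATE_ERROR;	/* the expected, harmless outcome */
}
```

On a well-behaved model the second call returns TOY_CONTEXT_STATE_ERROR;
with the flag set it returns TOY_TIMEOUT, which corresponds to the 5s
timeout followed by "HC died" in the report.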

I'm not sure what specifically triggers this failure: is it stopping
an already stopped endpoint in general, or only after using the SP flag,
or only on a device behind a suspended root port, or something else?

Would you mind testing some patches? I'm thinking about reordering
things in usb_suspend_both() so that URBs are flushed before suspending
the port, or investigating what exactly breaks your chip and adding
some workarounds. We don't need to stop a stopped endpoint, we could
proceed immediately to Set TR Dequeue, but it's uncertain if your HW
would accept that.
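
The effect of the reordering can be sketched with a toy user-space
model (not kernel code, and not a proposed patch). All toy_* names are
hypothetical, and the rule "cancelling a URB behind a U3 port kills the
HC" is an assumption that simply encodes the behaviour reported for this
chip.

```c
/* Toy model of the suspend ordering; not kernel code. */
#include <assert.h>
#include <stdbool.h>

struct toy_dev {
	bool port_in_u3;	/* port link state has been set to U3 */
	int  pending_urbs;	/* URBs still queued on ep0 */
	bool hc_dead;		/* controller gave up after a timeout */
};

/*
 * Flushing unlinks pending URBs, which needs a Stop Endpoint command.
 * Assumption: on the affected chip that command hangs once the port
 * is already in U3.
 */
static void toy_flush_endpoint(struct toy_dev *d)
{
	assert(d);
	if (d->pending_urbs > 0 && d->port_in_u3)
		d->hc_dead = true;
	d->pending_urbs = 0;
}

static void toy_suspend_port(struct toy_dev *d)
{
	assert(d);
	d->port_in_u3 = true;
}

/* Order from the report: port to U3 first, flush afterwards. */
static bool toy_suspend_current(struct toy_dev *d)
{
	toy_suspend_port(d);
	toy_flush_endpoint(d);
	return !d->hc_dead;
}

/* Order under consideration: flush first, then suspend the port. */
static bool toy_suspend_reordered(struct toy_dev *d)
{
	toy_flush_endpoint(d);
	toy_suspend_port(d);
	return !d->hc_dead;
}
```

With one pending URB, toy_suspend_current() returns false and
toy_suspend_reordered() returns true. The model ignores the question of
new URBs being submitted between the flush and the port suspend, which a
real patch would have to deal with.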

BTW, you initially stated that this happens on the first suspend after
a fresh boot, but are you sure that it can't happen later? If it can,
patches could be tested by suspending repeatedly until failure.

Regards,
Michal
