From: Mika Westerberg <mika.westerberg@linux.intel.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
Lukas Wunner <lukas@wunner.de>,
Mark Blakeney <mark.blakeney@bullet-systems.net>,
Kamil Paral <kparal@redhat.com>,
Chris Chiu <chris.chiu@canonical.com>,
linux-pci@vger.kernel.org
Subject: Re: [PATCH] PCI/PM: Mark devices disconnected if their upstream PCIe link is down on resume
Date: Fri, 22 Sep 2023 07:42:37 +0300
Message-ID: <20230922044237.GC3208943@black.fi.intel.com>
In-Reply-To: <20230921201945.GA343804@bhelgaas>

Hi Bjorn,

On Thu, Sep 21, 2023 at 03:19:45PM -0500, Bjorn Helgaas wrote:
> [+cc Kamil, Chris]
>
> On Mon, Sep 18, 2023 at 08:30:41AM +0300, Mika Westerberg wrote:
> > Mark Blakeney reported that when suspending a system with a Thunderbolt
> > dock connected and then unplugging the dock before resume (which is a
> > pretty normal flow with laptops), resuming takes a long time.
> >
> > What happens is that the PCIe link from the root port to the PCIe switch
> > inside the Thunderbolt device does not train (as expected, the link is
> > unplugged):
> >
> > [ 34.903158] pcieport 0000:00:07.2: restoring config space at offset 0x24 (was 0x3bf12001, writing 0x3bf12001)
> > [ 34.903231] pcieport 0000:00:07.0: waiting 100 ms for downstream link
> > [ 36.140616] pcieport 0000:01:00.0: not ready 1023ms after resume; giving up
> >
> > However, at this point we still try to resume the devices below that
> > unplugged link:
> >
> > [ 36.140741] pcieport 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
> > ...
> > [ 36.142235] pcieport 0000:01:00.0: restoring config space at offset 0x38 (was 0xffffffff, writing 0x0)
> > ...
> > [ 36.144702] pcieport 0000:02:02.0: waiting 100 ms for downstream link, after activation
> >
> > And this is the link from PCIe switch downstream port to the xHCI on the
> > dock:
> >
> > [ 38.380618] xhci_hcd 0000:03:00.0: not ready 1023ms after resume; waiting
> > [ 39.420587] xhci_hcd 0000:03:00.0: not ready 2047ms after resume; waiting
> > [ 41.527250] xhci_hcd 0000:03:00.0: not ready 4095ms after resume; waiting
> > [ 45.793957] xhci_hcd 0000:03:00.0: not ready 8191ms after resume; waiting
> > [ 54.113950] xhci_hcd 0000:03:00.0: not ready 16383ms after resume; waiting
> > [ 71.180576] xhci_hcd 0000:03:00.0: not ready 32767ms after resume; waiting
> > ...
> > [ 105.313963] xhci_hcd 0000:03:00.0: not ready 65535ms after resume; giving up
> > [ 105.314037] xhci_hcd 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible
> > [ 105.315640] xhci_hcd 0000:03:00.0: restoring config space at offset 0x3c (was 0xffffffff, writing 0x1ff)
> > ...
> >
> > This ends up slowing down the resume time considerably. For this reason
> > mark these devices as disconnected if the link above them did not train
> > properly.
> >
> > Fixes: e8b908146d44 ("PCI/PM: Increase wait time after resume")
> > Reported-by: Mark Blakeney <mark.blakeney@bullet-systems.net>
> > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217915
> > Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
>
> Applied with Lukas' Reviewed-by to pm for v6.7.

Thanks!

> e8b908146d44 appeared in v6.4. Seems like maybe a candidate for
> stable? IIUC, resume actually does work, but takes 65+ seconds longer
> than it should?

Yes, I think it should be tagged for stable.

> Kamil also bisected a 60+ second resume delay to e8b908146d44
> (https://lore.kernel.org/r/CA+cBOTeWrsTyANjLZQ=bGoBQ_yOkkV1juyRvJq-C8GOrbW6t9Q@mail.gmail.com),
> but IIUC at
> https://lore.kernel.org/linux-pci/20230824114300.GU3465@black.fi.intel.com/T/#u
> you concluded that Kamil's issue was related to firmware and actually
> had nothing to do with e8b908146d44.
>
> Do you still think Kamil's issue is unrelated to e8b908146d44 and this
> patch? If so, how do we handle Kamil's issue? An answer like "users
> of v6.4+ must upgrade their Thunderbolt firmware" seems like it would
> be kind of a nightmare for users.

It's a different issue. What happens in his system is that the link went
down even though the dock was still connected, and this should not happen
(the firmware should bring the link up during resume). The delay was
just a "symptom".

What happens here is that the user suspends the device and deliberately
disconnects the dock.
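
To illustrate where the 65+ seconds in the log above come from, here is a
minimal user-space sketch. It is a toy model, not the kernel code: struct
toy_dev, toy_wait_ms() and toy_resume_ms() are hypothetical names, and the
1000 ms and 60000 ms timeouts are assumptions picked so that the doubling
poll interval reproduces the "1023ms" and "65535ms" figures in the log.

```c
#include <stdbool.h>

/* Toy stand-in for a PCI device; not the kernel's struct pci_dev. */
struct toy_dev {
	const char *name;
	bool disconnected;	/* set when the upstream link did not train */
};

/*
 * Model a doubling poll loop for a device that never becomes ready.
 * Returns the milliseconds spent waiting before giving up.
 */
static long toy_wait_ms(const struct toy_dev *dev, long timeout_ms)
{
	long delay = 1, waited = 0;

	/* A device marked disconnected is not waited for at all. */
	if (dev->disconnected)
		return 0;

	while (delay <= timeout_ms) {
		waited += delay;	/* stands in for sleeping 'delay' ms */
		delay *= 2;
	}
	return waited;
}

/*
 * Model resuming the unplugged dock: first the switch upstream port,
 * then the xHCI below it. The timeouts are assumed values.
 */
static long toy_resume_ms(bool mark_disconnected)
{
	struct toy_dev sw = { "0000:01:00.0", false };
	struct toy_dev xhci = { "0000:03:00.0", false };
	long total = toy_wait_ms(&sw, 1000);

	/* The switch never answered, so the link did not train. */
	if (mark_disconnected)
		xhci.disconnected = true;

	return total + toy_wait_ms(&xhci, 60000);
}
```

In this model toy_resume_ms(false) adds up to 1023 + 65535 ms of waiting,
while toy_resume_ms(true) stops after the first 1023 ms because the device
below the dead link is skipped.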