From: Mario Limonciello <mario.limonciello@amd.com>
To: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: "Bjorn Helgaas" <helgaas@kernel.org>, "Gary Li" <Gary.Li@amd.com>,
"Mario Limonciello" <superm1@kernel.org>,
"Bjorn Helgaas" <bhelgaas@google.com>,
"Mathias Nyman" <mathias.nyman@intel.com>,
"open list : PCI SUBSYSTEM" <linux-pci@vger.kernel.org>,
"open list" <linux-kernel@vger.kernel.org>,
"open list : USB XHCI DRIVER" <linux-usb@vger.kernel.org>,
"Daniel Drake" <drake@endlessos.org>,
"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
Subject: Re: [PATCH v5 2/5] PCI: Check PCI_PM_CTRL instead of PCI_COMMAND in pci_dev_wait()
Date: Mon, 9 Sep 2024 15:40:54 -0500
Message-ID: <b4237bef-809f-4d78-8a70-d962e7eb467b@amd.com>
In-Reply-To: <20240905093325.GJ1532424@black.fi.intel.com>

On 9/5/2024 04:33, Mika Westerberg wrote:
> Hi,
>
> On Wed, Sep 04, 2024 at 10:24:26AM -0500, Mario Limonciello wrote:
>> On 9/4/2024 07:05, Mika Westerberg wrote:
>>> Hi,
>>>
>>> On Tue, Sep 03, 2024 at 01:32:30PM -0500, Mario Limonciello wrote:
>>>> On 9/3/2024 13:25, Bjorn Helgaas wrote:
>>>>> On Tue, Sep 03, 2024 at 12:31:00PM -0500, Mario Limonciello wrote:
>>>>>> On 9/3/2024 12:11, Bjorn Helgaas wrote:
>>>>>> ...
>>>>>
>>>>>>> 8) The USB4 stack sees the device and assumes it is in D0, but it
>>>>>>> seems to still be in D3cold. What is this based on? Is there a
>>>>>>> config read that returns ~0 data when it shouldn't?
>>>>>>
>>>>>> Yes there is. From earlier in the thread I have a [log] I shared.
>>>>>>
>>>>>> The message emitted is from ring_interrupt_active():
>>>>>>
>>>>>> "thunderbolt 0000:e5:00.5: interrupt for TX ring 0 is already enabled"
>>>>>
>>>>> Right, that's in the cover letter, but I can't tell from this what the
>>>>> ioread32(ring->nhi->iobase + reg) returned. It looks like this is an
>>>>> MMIO read of BAR 0, not a config read.
>>>>>
>>>>
>>>> Yeah. I suppose another way to approach this problem is to make something
>>>> else in the call chain poll PCI_PM_CTRL.
>>>>
>>>> Polling at the start of nhi_runtime_resume() should also work. For the
>>>> "normal" scenario it would just be a single read to PCI_PM_CTRL.
>>>>
>>>> Mika, thoughts?
>>
>> We did this experiment to throw code to poll PCI_PM_CTRL at the start of
>> nhi_runtime_resume() but this also fails. From that I would hypothesize the
>> device transitioned to D0uninitialized sometime in the middle of
>> pci_pm_runtime_resume() before the call to pm->runtime_resume(dev);
>>
>>>
>>> I'm starting to wonder if we are looking at the correct place ;-) This
>>> reminds me that our PCIe SV people recently reported a couple of Linux
>>> related issues which they recommended to fix, and these are on my list
>>> but I'll share them because maybe they are related?
>>
>> Thanks for sharing those. We had a try with them but sorry to say no
>> improvements to the issue at hand.
>
> Okay, thanks for checking.
>
> Few additional side paths here, though. This is supposed to work so that
> once the host router sleep bit is set the driver is supposed to allow
> the domain to enter sleep (e.g. it should not be woken up before it is
> fully transitioned). That's what we do:
>
> 1. All tunneled PCIe Root/Downstream ports are in D3.
> 2. All tunneled USB 3.x ports are in U3.
> 3. No DisplayPort is tunneled.
> 4. Thunderbolt driver enables wakes.
> 5. Thunderbolt driver writes sleep ready bit of the host router.
> 6. Thunderbolt driver runtime suspend is complete.
> 7. ACPI method is called (_PS3 or _PR3.OFF) that will trigger the "Sleep
> Event".
>
> If between 5 and 7 there is a device connected, it should not "abort" the
> sequence. Unfortunately this is not explicit in the USB4 spec but the
> connection manager guide has similar note. Even if the connect happens
> there the "Sleep Event" should happen but after that it can trigger
> normal wakeup which will then bring everything back.
>
> Would it be possible to enable tracing around these steps so that we
> could see if there is hotplug notification somewhere there that is not
> expected? Here are instructions how to get pretty accurate trace:
>
> https://github.com/intel/tbtools?tab=readme-ov-file#tracing
>
> Please also take full dmesg.

Sure, here is the dmesg with tracing enabled:
https://gist.github.com/superm1/5186e0023c8a5d2ecd75c50fd2168308

>
> It is entirely possible that this has nothing to do with the issue but I
> think it is worth checking.
>
> The second thing we could try is to check the wake status bits after
> this has happened, like:
>
> # tbdump -r 0 -a <ADAPTER> -vv -N 1 PORT_CS_18
>
> (where <ADAPTER> is the lane 0 adapter of the USB4 port the device was
> connected).
>

Unfortunately the adapter is in such a bad state at this time that
tbdump doesn't work.

> The third thing to try is to comment out TB_WAKE_ON_CONNECT in
> tb_switch_suspend(). This should result in no wake even if the device is
> connected. This tells us that it is really the connect on USB4 port that
> triggered the wake.

Yup that's correct; there is no action on the hotplug with this change.

>
> These could (also) explain why the host router appears to be in D3 even
> if it should be in D0 already.
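
For reference, the kind of PCI_PM_CTRL poll discussed earlier in the thread
(at the start of nhi_runtime_resume()) could look roughly like the userspace
sketch below. It is only an illustration of the idea, not the code used in
the experiment and not the patch itself. The register offset and state mask
match include/uapi/linux/pci_regs.h; pmcsr_read_fn, pmcsr_is_d0(),
poll_for_d0() and fake_pmcsr_read() are hypothetical names made up for this
example, and the "device" is simulated.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/pci_regs.h */
#define PCI_PM_CTRL            4       /* PMCSR offset inside the PM capability */
#define PCI_PM_CTRL_STATE_MASK 0x0003  /* 0 = D0, 1 = D1, 2 = D2, 3 = D3hot */

/*
 * Hypothetical accessor: returns the 16-bit PMCSR value seen on the given
 * poll attempt. In the kernel this would be a config read at pm_cap + PCI_PM_CTRL.
 */
typedef uint16_t (*pmcsr_read_fn)(void *ctx, unsigned int attempt);

/* True only if the read succeeded and the power state field says D0. */
static bool pmcsr_is_d0(uint16_t pmcsr)
{
	/* All ones means the config read failed, e.g. device still in D3cold. */
	if (pmcsr == 0xffff)
		return false;

	return (pmcsr & PCI_PM_CTRL_STATE_MASK) == 0;
}

/*
 * Poll until the device reports D0. Returns the number of reads it took,
 * or -1 if max_attempts reads were not enough. In the "normal" case this
 * is a single read.
 */
static int poll_for_d0(pmcsr_read_fn read_pmcsr, void *ctx,
		       unsigned int max_attempts)
{
	for (unsigned int i = 0; i < max_attempts; i++) {
		if (pmcsr_is_d0(read_pmcsr(ctx, i)))
			return (int)i + 1;
		/* Real code would sleep here, typically with a growing delay. */
	}

	return -1;
}

/*
 * Simulated device for the sketch: reads return all ones until attempt
 * number *ctx, then a PMCSR value with the state field at D0.
 */
static uint16_t fake_pmcsr_read(void *ctx, unsigned int attempt)
{
	unsigned int ready_at = *(unsigned int *)ctx;

	return attempt < ready_at ? 0xffff : 0x0008; /* D0, No_Soft_Reset set */
}
```

Note that a check like this only says the function answers config cycles and
reports D0; per the experiment described above, that alone was not enough to
avoid the failure.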