public inbox for linux-kernel@vger.kernel.org
From: Mario Limonciello <mario.limonciello@amd.com>
To: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: "Bjorn Helgaas" <helgaas@kernel.org>, "Gary Li" <Gary.Li@amd.com>,
	"Mario Limonciello" <superm1@kernel.org>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Mathias Nyman" <mathias.nyman@intel.com>,
	"open list : PCI SUBSYSTEM" <linux-pci@vger.kernel.org>,
	"open list" <linux-kernel@vger.kernel.org>,
	"open list : USB XHCI DRIVER" <linux-usb@vger.kernel.org>,
	"Daniel Drake" <drake@endlessos.org>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
Subject: Re: [PATCH v5 2/5] PCI: Check PCI_PM_CTRL instead of PCI_COMMAND in pci_dev_wait()
Date: Wed, 4 Sep 2024 10:24:26 -0500
Message-ID: <2bf715fb-509b-4b00-a28d-1cc83c0bb588@amd.com>
In-Reply-To: <20240904120545.GF1532424@black.fi.intel.com>

On 9/4/2024 07:05, Mika Westerberg wrote:
> Hi,
> 
> On Tue, Sep 03, 2024 at 01:32:30PM -0500, Mario Limonciello wrote:
>> On 9/3/2024 13:25, Bjorn Helgaas wrote:
>>> On Tue, Sep 03, 2024 at 12:31:00PM -0500, Mario Limonciello wrote:
>>>> On 9/3/2024 12:11, Bjorn Helgaas wrote:
>>>> ...
>>>
>>>>>      8) The USB4 stack sees the device and assumes it is in D0, but it
>>>>>      seems to still be in D3cold.  What is this based on?  Is there a
>>>>>      config read that returns ~0 data when it shouldn't?
>>>>
>>>> Yes there is.  From earlier in the thread I have a [log] I shared.
>>>>
>>>> The message emitted is from ring_interrupt_active():
>>>>
>>>> "thunderbolt 0000:e5:00.5: interrupt for TX ring 0 is already enabled"
>>>
>>> Right, that's in the cover letter, but I can't tell from this what the
>>> ioread32(ring->nhi->iobase + reg) returned.  It looks like this is an
>>> MMIO read of BAR 0, not a config read.
>>>
>>
>> Yeah.  I suppose another way to approach this problem is to make something
>> else in the call chain poll PCI_PM_CTRL.
>>
>> Polling at the start of nhi_runtime_resume() should also work.  For the
>> "normal" scenario it would just be a single read to PCI_PM_CTRL.
>>
>> Mika, thoughts?

We ran this experiment: we added code to poll PCI_PM_CTRL at the start
of nhi_runtime_resume(), but the failure still occurs.  From that I would
hypothesize that the device transitioned to D0uninitialized sometime in
the middle of pci_pm_runtime_resume(), before the call to
pm->runtime_resume(dev).
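
For reference, a minimal userspace sketch of the kind of poll described
above.  It is not the code used in the experiment.  poll_for_d0(), the
read_pmcsr_fn callback and the fake device are hypothetical names used
only for illustration; the PCI_PM_CTRL_STATE_MASK value matches
pci_regs.h.  A real version would read PCI_PM_CTRL through the device's
PM capability and sleep between reads.

```c
#include <stdint.h>

/* Low two bits of PCI_PM_CTRL hold the current power state (D0..D3). */
#define PCI_PM_CTRL_STATE_MASK 0x0003
#define PM_STATE_D0            0x0000

/* Hypothetical stand-in for a config-space read of PCI_PM_CTRL. */
typedef uint16_t (*read_pmcsr_fn)(void *ctx);

/*
 * Poll until the power state field reports D0.
 * Returns the number of reads it took, or -1 if max_reads was exhausted.
 */
static int poll_for_d0(read_pmcsr_fn read_pmcsr, void *ctx, int max_reads)
{
	for (int i = 1; i <= max_reads; i++) {
		uint16_t pmcsr = read_pmcsr(ctx);

		/* All ones means the read itself failed (device not accessible). */
		if (pmcsr != 0xffff &&
		    (pmcsr & PCI_PM_CTRL_STATE_MASK) == PM_STATE_D0)
			return i;

		/* A real implementation would sleep with backoff here. */
	}
	return -1;
}

/* Fake device: returns all ones for `not_ready` reads, then reports D0. */
struct fake_dev {
	int not_ready;
};

static uint16_t fake_read_pmcsr(void *ctx)
{
	struct fake_dev *d = ctx;

	if (d->not_ready > 0) {
		d->not_ready--;
		return 0xffff;
	}
	return PM_STATE_D0;
}
```

In the "normal" scenario the first read already reports D0, so the loop
costs a single read.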

> 
> I'm starting to wonder if we are looking at the correct place ;-) This
> reminds me that our PCIe SV people recently reported a couple of Linux
> related issues which they recommended to fix, and these are on my list
> but I'll share them because maybe they are related?

Thanks for sharing those.  We tried both patches, but unfortunately
neither improved the issue at hand.

> 
> First problem, and actually a PCI spec violation, is that Linux does not
> clear Bus Master, MMIO and IO space enables when it programs the device
> to D3 on runtime suspend path. It does so on system sleep path though.
> Something like below (untested) should do that:
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index f412ef73a6e4..79a566376301 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -1332,6 +1332,7 @@ static int pci_pm_runtime_suspend(struct device *dev)
>   
>   	if (!pci_dev->state_saved) {
>   		pci_save_state(pci_dev);
> +		pci_pm_default_suspend(pci_dev);
>   		pci_finish_runtime_suspend(pci_dev);
>   	}
>   
> 
> The second thing is that Thunderbolt driver, for historical reasons,
> leaves the MSI enabled when entering D3. This too might be related. I
> think we can unconditionally disable it so below hack should do that
> (untested as well). I wonder if you could try if any of these or both
> can help here? Both of these issues can result in unwanted events during D3
> entry as far as I understand.
> 
> diff --git a/drivers/thunderbolt/ctl.c b/drivers/thunderbolt/ctl.c
> index dc1f456736dc..73b815fbbceb 100644
> --- a/drivers/thunderbolt/ctl.c
> +++ b/drivers/thunderbolt/ctl.c
> @@ -659,12 +659,11 @@ struct tb_ctl *tb_ctl_alloc(struct tb_nhi *nhi, int index, int timeout_msec,
>   	if (!ctl->frame_pool)
>   		goto err;
>   
> -	ctl->tx = tb_ring_alloc_tx(nhi, 0, 10, RING_FLAG_NO_SUSPEND);
> +	ctl->tx = tb_ring_alloc_tx(nhi, 0, 10, 0);
>   	if (!ctl->tx)
>   		goto err;
>   
> -	ctl->rx = tb_ring_alloc_rx(nhi, 0, 10, RING_FLAG_NO_SUSPEND, 0, 0xffff,
> -				   0xffff, NULL, NULL);
> +	ctl->rx = tb_ring_alloc_rx(nhi, 0, 10, 0, 0, 0xffff, 0xffff, NULL, NULL);
>   	if (!ctl->rx)
>   		goto err;
>   

