Linux PCI subsystem development
* [BUG] Intel Apollolake: PCIe bridge "loses" capabilities after entering D3Cold state
@ 2022-10-21 10:17 Lukasz Majczak
  2022-10-21 11:19 ` Lukas Wunner
  2022-10-21 21:08 ` Bjorn Helgaas
  0 siblings, 2 replies; 5+ messages in thread
From: Lukasz Majczak @ 2022-10-21 10:17 UTC (permalink / raw)
  To: bhelgaas, Rajat Jain, Vidya Sagar; +Cc: upstream, linux-pci, LKML

Hi,

This is a follow-up to the discussion in “[PATCH V2] PCI/ASPM:
Save/restore L1SS Capability for suspend/resume”
(https://lore.kernel.org/lkml/d3228b1f-8d12-bfab-4cba-6d93a6869f20@nvidia.com/t/)

While working with Vidya’s patch I noticed that after a
suspend/resume cycle on my Chromebook (Apollolake) the PCIe bridge
loses its capabilities - the missing part is:

Capabilities: [200 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
  PortCommonModeRestoreTime=40us PortTPowerOnTime=10us
L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
   T_CommonMode=40us LTR1.2_Threshold=98304ns
L1SubCtl2: T_PwrOn=60us

Digging more I’ve found out that entering D3Cold state causes this
issue (D3Hot seems to work fine).
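
One way to cross-check the lspci result is to walk the extended
capability list in the bridge's config space directly. The following is
only a minimal sketch: it assumes the config space is read from the
sysfs "config" file of the device, and find_ext_cap/has_l1ss are
hypothetical helper names, not kernel or pciutils APIs. The ID 0x001E
is the L1 PM Substates extended capability ID.

```python
import struct

PCI_EXT_CAP_START = 0x100      # extended capabilities begin at this offset
PCI_EXT_CAP_ID_L1SS = 0x001E   # L1 PM Substates extended capability ID


def find_ext_cap(config, cap_id):
    """Return the offset of extended capability `cap_id`, or None."""
    offset = PCI_EXT_CAP_START
    seen = set()
    while offset >= PCI_EXT_CAP_START and offset + 4 <= len(config):
        if offset in seen:              # guard against a looping list
            break
        seen.add(offset)
        (header,) = struct.unpack_from("<I", config, offset)
        if header in (0, 0xFFFFFFFF):   # empty list or device not responding
            break
        if header & 0xFFFF == cap_id:   # bits 15:0 hold the capability ID
            return offset
        offset = (header >> 20) & 0xFFC  # bits 31:20: next offset, dword aligned
    return None


def has_l1ss(config_path):
    # Assumption: config_path is something like
    # /sys/bus/pci/devices/<domain:bus:dev.fn>/config. Without root only
    # the first 64 bytes are readable, so this would wrongly report False.
    with open(config_path, "rb") as f:
        return find_ext_cap(f.read(), PCI_EXT_CAP_ID_L1SS) is not None
```

Running has_l1ss() on the bridge before and after a suspend/resume
cycle should show the same change that lspci reports.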

With Vidya’s patch (all versions from V1 to V3) on upstream kernels
5.10/5.15 the underlying device (in my case a WiFi card) became
unavailable. V4 (which was accepted and merged) works fine (I guess
thanks to “PCI/ASPM: Refactor L1 PM Substates Control Register
programming”), but the issue is still there: after suspend/resume the
underlying device now works fine, but the mentioned capabilities are
still missing from the lspci -vvv output.

I think with the current code it does no harm to anyone, but I wanted
to give a heads-up about this.

Best regards,
Lukasz


* Re: [BUG] Intel Apollolake: PCIe bridge "loses" capabilities after entering D3Cold state
  2022-10-21 10:17 [BUG] Intel Apollolake: PCIe bridge "loses" capabilities after entering D3Cold state Lukasz Majczak
@ 2022-10-21 11:19 ` Lukas Wunner
  2022-10-21 12:33   ` Lukasz Majczak
  2022-10-21 21:08 ` Bjorn Helgaas
  1 sibling, 1 reply; 5+ messages in thread
From: Lukas Wunner @ 2022-10-21 11:19 UTC (permalink / raw)
  To: Lukasz Majczak
  Cc: bhelgaas, Rajat Jain, Vidya Sagar, upstream, linux-pci, LKML

On Fri, Oct 21, 2022 at 12:17:35PM +0200, Lukasz Majczak wrote:
> While working with Vidya’s patch I noticed that after a
> suspend/resume cycle on my Chromebook (Apollolake) the PCIe bridge
> loses its capabilities - the missing part is:
> 
> Capabilities: [200 v1] L1 PM Substates
> L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
>   PortCommonModeRestoreTime=40us PortTPowerOnTime=10us
> L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
>    T_CommonMode=40us LTR1.2_Threshold=98304ns
> L1SubCtl2: T_PwrOn=60us
> 
> Digging more I’ve found out that entering D3Cold state causes this

You mean the capability is gone from lspci after D3cold?

My understanding is that BIOS is responsible for populating config space.
So this sounds like a BIOS bug.  What's the BIOS vendor and version?
(dmesg | grep DMI)

Thanks,

Lukas


* Re: [BUG] Intel Apollolake: PCIe bridge "loses" capabilities after entering D3Cold state
  2022-10-21 11:19 ` Lukas Wunner
@ 2022-10-21 12:33   ` Lukasz Majczak
       [not found]     ` <CAOs-w0KRYh-=gTb0Ed5iYAMs92AYtV_oEei5OgezgKGfwfiBYg@mail.gmail.com>
  0 siblings, 1 reply; 5+ messages in thread
From: Lukasz Majczak @ 2022-10-21 12:33 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: bhelgaas, Rajat Jain, Vidya Sagar, upstream, linux-pci, LKML

On Fri, 21 Oct 2022 at 13:19, Lukas Wunner <lukas@wunner.de> wrote:
>
> On Fri, Oct 21, 2022 at 12:17:35PM +0200, Lukasz Majczak wrote:
> > While working with Vidya’s patch I noticed that after a
> > suspend/resume cycle on my Chromebook (Apollolake) the PCIe bridge
> > loses its capabilities - the missing part is:
> >
> > Capabilities: [200 v1] L1 PM Substates
> > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
> >   PortCommonModeRestoreTime=40us PortTPowerOnTime=10us
> > L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
> >    T_CommonMode=40us LTR1.2_Threshold=98304ns
> > L1SubCtl2: T_PwrOn=60us
> >
> > Digging more I’ve found out that entering D3Cold state causes this
>
> You mean the capability is gone from lspci after D3cold?
>
> My understanding is that BIOS is responsible for populating config space.
> So this sounds like a BIOS bug.  What's the BIOS vendor and version?
> (dmesg | grep DMI)
>
> Thanks,
>
> Lukas

Hi Lukas,

here is the DMI

localhost ~ # dmesg | grep DMI
[    0.000000] DMI: Google Coral/Coral, BIOS Google_Coral.10068.81.0 11/27/2018
[    0.155420] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
[    0.447820] [drm] DMI info: DMI_BIOS_VENDOR coreboot
[    0.447828] [drm] DMI info: DMI_BIOS_VERSION Google_Coral.10068.81.0
[    0.447832] [drm] DMI info: DMI_BIOS_DATE 11/27/2018
[    0.447835] [drm] DMI info: DMI_BIOS_RELEASE 4.0
[    0.447838] [drm] DMI info: DMI_SYS_VENDOR Google
[    0.447841] [drm] DMI info: DMI_PRODUCT_NAME Coral
[    0.447844] [drm] DMI info: DMI_PRODUCT_VERSION rev3
[    0.447848] [drm] DMI info: DMI_PRODUCT_FAMILY Google_Coral

Yes, you are right, and in our internal discussion the vendor (Intel)
has proposed a firmware patch. However, I couldn't verify that the
issue is limited to this particular firmware/BIOS, so I decided to
send this email.

Best regards,
Lukasz


* Re: [BUG] Intel Apollolake: PCIe bridge "loses" capabilities after entering D3Cold state
       [not found]     ` <CAOs-w0KRYh-=gTb0Ed5iYAMs92AYtV_oEei5OgezgKGfwfiBYg@mail.gmail.com>
@ 2022-10-21 15:40       ` Radosław Biernacki
  0 siblings, 0 replies; 5+ messages in thread
From: Radosław Biernacki @ 2022-10-21 15:40 UTC (permalink / raw)
  To: Lukasz Majczak, Vidya Sagar
  Cc: Lukas Wunner, bhelgaas, Rajat Jain, upstream, linux-pci, LKML

>> On Fri, 21 Oct 2022 at 13:19, Lukas Wunner <lukas@wunner.de> wrote:
>> >
>> > On Fri, Oct 21, 2022 at 12:17:35PM +0200, Lukasz Majczak wrote:
>> > > While working with Vidya’s patch I noticed that after a
>> > > suspend/resume cycle on my Chromebook (Apollolake) the PCIe bridge
>> > > loses its capabilities - the missing part is:
>> > >
>> > > Capabilities: [200 v1] L1 PM Substates
>> > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
>> > >   PortCommonModeRestoreTime=40us PortTPowerOnTime=10us
>> > > L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
>> > >    T_CommonMode=40us LTR1.2_Threshold=98304ns
>> > > L1SubCtl2: T_PwrOn=60us
>> > >
>> > > Digging more I’ve found out that entering D3Cold state causes this
>> >
>> > You mean the capability is gone from lspci after D3cold?
>> >
>> > My understanding is that BIOS is responsible for populating config space.
>> > So this sounds like a BIOS bug.  What's the BIOS vendor and version?
>> > (dmesg | grep DMI)
>> >
>> > Thanks,
>> >
>> > Lukas
>>
>> Hi Lukas,
>>
>> here is the DMI
>>
>> localhost ~ # dmesg | grep DMI
>> [    0.000000] DMI: Google Coral/Coral, BIOS Google_Coral.10068.81.0 11/27/2018
>> [    0.155420] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
>> [    0.447820] [drm] DMI info: DMI_BIOS_VENDOR coreboot
>> [    0.447828] [drm] DMI info: DMI_BIOS_VERSION Google_Coral.10068.81.0
>> [    0.447832] [drm] DMI info: DMI_BIOS_DATE 11/27/2018
>> [    0.447835] [drm] DMI info: DMI_BIOS_RELEASE 4.0
>> [    0.447838] [drm] DMI info: DMI_SYS_VENDOR Google
>> [    0.447841] [drm] DMI info: DMI_PRODUCT_NAME Coral
>> [    0.447844] [drm] DMI info: DMI_PRODUCT_VERSION rev3
>> [    0.447848] [drm] DMI info: DMI_PRODUCT_FAMILY Google_Coral
>>
>> Yes, you are right, and in our internal discussion the vendor (Intel)
>> has proposed a firmware patch. However, I couldn't verify that the
>> issue is limited to this particular firmware/BIOS, so I decided to
>> send this email.
>>
>> Best regards,
>> Lukasz

Lukasz, Vidya, is the change in behaviour in V4 an intentional fix for
the mentioned problems with missing devices after D3cold, or an
unintentional side effect?
Or, from another angle, can we rely on this behaviour as a hotfix for
the problems with missing devices?

As far as I understand, we should probably still update the FW in the
fleet of devices, right?

ps: Sorry for top-posting in the previous email, I forgot to switch my
gmail client.


* Re: [BUG] Intel Apollolake: PCIe bridge "loses" capabilities after entering D3Cold state
  2022-10-21 10:17 [BUG] Intel Apollolake: PCIe bridge "loses" capabilities after entering D3Cold state Lukasz Majczak
  2022-10-21 11:19 ` Lukas Wunner
@ 2022-10-21 21:08 ` Bjorn Helgaas
  1 sibling, 0 replies; 5+ messages in thread
From: Bjorn Helgaas @ 2022-10-21 21:08 UTC (permalink / raw)
  To: Lukasz Majczak
  Cc: bhelgaas, Rajat Jain, Vidya Sagar, upstream, linux-pci, LKML,
	Radosław Biernacki

[+cc Radosław]

On Fri, Oct 21, 2022 at 12:17:35PM +0200, Lukasz Majczak wrote:
> Hi,
> 
> This is a follow-up to the discussion in “[PATCH V2] PCI/ASPM:
> Save/restore L1SS Capability for suspend/resume”
> (https://lore.kernel.org/lkml/d3228b1f-8d12-bfab-4cba-6d93a6869f20@nvidia.com/t/)
> 
> While working with Vidya’s patch I noticed that after a
> suspend/resume cycle on my Chromebook (Apollolake) the PCIe bridge
> loses its capabilities - the missing part is:
> 
> Capabilities: [200 v1] L1 PM Substates
> L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
>   PortCommonModeRestoreTime=40us PortTPowerOnTime=10us
> L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
>    T_CommonMode=40us LTR1.2_Threshold=98304ns
> L1SubCtl2: T_PwrOn=60us
> 
> Digging more I’ve found out that entering D3Cold state causes this
> issue (D3Hot seems to work fine).
> 
> With Vidya’s patch (all versions from V1 to V3) on upstream kernels
> 5.10/5.15 the underlying device (in my case a WiFi card) became
> unavailable. V4 (which was accepted and merged) works fine (I guess
> thanks to “PCI/ASPM: Refactor L1 PM Substates Control Register
> programming”), but the issue is still there: after suspend/resume the
> underlying device now works fine, but the mentioned capabilities are
> still missing from the lspci -vvv output.
> 
> I think with the current code it does no harm to anyone, but I wanted
> to give a heads-up about this.

Thanks a lot for following up on this!  Tell me if I have this right:

  - After a fresh boot, the Root Port at 00:14.0 [8086:5ad6] has an L1
    PM Substates Capability [per 1,2].

  - You suspend and resume the system.

  - After resume, 00:14.0 no longer has an L1 PM Substates Capability,
    as in [2].

  - The 00:14.0 Root Port leads to an iwlwifi device at 01:00.0, and
    the wifi device works fine after resume.

  - On the 01:00.0 iwlwifi device, lspci -vv still shows L1.1 and L1.2
    enabled after resume, as it did in [2].
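
The before/after comparison in the steps above can be automated by
diffing the "Capabilities:" lines of two saved lspci -vv dumps for the
same device. This is a rough sketch only; capabilities() and
lost_capabilities() are hypothetical helper names, and the regular
expression assumes the "Capabilities: [offset vN] name" layout seen in
the lspci output quoted in this thread.

```python
import re

# Matches e.g. "Capabilities: [200 v1] L1 PM Substates"; the " vN" part
# only appears on extended capabilities.
CAP_RE = re.compile(
    r"^\s*Capabilities: \[([0-9a-f]+)(?: v\d+)?\] (.+?)\s*$", re.M
)


def capabilities(lspci_text):
    """Map capability offset (hex string) to its name for one device dump."""
    return {m.group(1): m.group(2) for m in CAP_RE.finditer(lspci_text)}


def lost_capabilities(before, after):
    """Capabilities present in `before` but absent from `after`."""
    old, new = capabilities(before), capabilities(after)
    return {off: name for off, name in old.items() if off not in new}
```

Feeding it the output of "lspci -vv -s 00:14.0" captured (as root)
after boot and again after resume would list whatever disappeared.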

If substates are enabled at iwlwifi but not at the Root Port, that
would not be a valid scenario per spec.  Per PCIe r6.0, sec 5.5.4:

  An L1 PM Substate enable bit must only be Set in the Upstream and
  Downstream Ports on a Link when the corresponding supported
  capability bit is Set by both the Upstream and Downstream Ports on
  that Link, otherwise the behavior is undefined.

So I don't know whether the L1.x states would still actually work.
(Is there any way to tell whether the iwlwifi power consumption
changes after the suspend/resume?  Maybe powertop?)
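
The rule quoted from sec 5.5.4 can be expressed as a small check over
the raw register values. This is a sketch under stated assumptions:
invalid_enables() is a hypothetical helper, the four low bits are the
supported bits of the L1 PM Substates Capabilities register and the
matching enable bits of Control 1, and a port whose capability has
vanished is modelled by passing 0 for both of its registers.

```python
# Bit positions shared by the L1 PM Substates Capabilities register
# (supported bits) and the Control 1 register (enable bits).
L1SS_BITS = {
    "PCI-PM_L1.2": 0x1,
    "PCI-PM_L1.1": 0x2,
    "ASPM_L1.2": 0x4,
    "ASPM_L1.1": 0x8,
}


def invalid_enables(up_cap, up_ctl1, down_cap, down_ctl1):
    """List substates enabled on either port but not supported by both.

    up_*   : Upstream Port registers (here, the iwlwifi device)
    down_* : Downstream Port registers (here, the Root Port)
    """
    both_support = up_cap & down_cap
    enabled = up_ctl1 | down_ctl1
    return [name for name, bit in L1SS_BITS.items()
            if enabled & bit and not both_support & bit]


# Illustrative values only: endpoint advertises and enables everything,
# Root Port capability is missing after resume.
print(invalid_enables(0x1F, 0x0F, 0x00, 0x00))
```

An empty list means the enable bits are consistent with what both ends
of the Link advertise.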

And ASPM configuration, e.g., disabling/enabling substates via the
sysfs "l1_1_aspm" and "l1_2_aspm" files probably won't work right.
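
To see what the kernel currently exposes for those two files, something
along these lines could be used. It is a sketch only: read_link_attrs()
is a hypothetical helper, and it assumes the files live in the "link"
subdirectory of the downstream device's sysfs directory (for example
/sys/bus/pci/devices/0000:01:00.0) and contain "0" or "1".

```python
from pathlib import Path

LINK_ATTRS = ("l1_1_aspm", "l1_2_aspm")


def read_link_attrs(device_dir):
    """Read the ASPM substate toggles from <device_dir>/link/, if present."""
    link = Path(device_dir) / "link"
    state = {}
    for attr in LINK_ATTRS:
        path = link / attr
        # A missing file is reported as None rather than as "disabled",
        # since the kernel may simply not expose control for this link.
        state[attr] = path.read_text().strip() == "1" if path.is_file() else None
    return state
```

Comparing the result before and after suspend/resume would show whether
the files disappear or change value once the Root Port capability is
gone.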

Bjorn

[1] https://lore.kernel.org/lkml/20220722174212.GA1911979@bhelgaas/
[2] https://gist.github.com/semihalf-majczak-lukasz/fb36dfa2eff22911109dfb91ab0fc0e3

