From: Mario Limonciello <mario.limonciello@amd.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: "Joyful Lee" <joy@joyfullee.me>,
platform-driver-x86@vger.kernel.org,
"Shyam Sundar S K" <Shyam-sundar.S-k@amd.com>,
"Hans de Goede" <hansg@kernel.org>,
"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>,
linux-pci@vger.kernel.org, "Bjorn Helgaas" <bhelgaas@google.com>,
linux-acpi@vger.kernel.org,
"Rafael J. Wysocki" <rafael@kernel.org>,
linux-kernel@vger.kernel.org, "Lukas Wunner" <lukas@wunner.de>
Subject: Re: [BUG] ASUS ProArt PX13 HN7306WU: amd_pmc s2idle S0ix corrupts AMD 1022:150b root port, NVIDIA dGPU returns header type 7f
Date: Fri, 3 Apr 2026 13:41:04 -0500
Message-ID: <a24fd871-2589-46eb-8117-ec8cfa5f720d@amd.com>
In-Reply-To: <20260403180422.GA341023@bhelgaas>
On 4/3/26 1:04 PM, Bjorn Helgaas wrote:
> [+cc Lukas, pciehp expert, beginning of thread with full dmesg/lspci at
> https://lore.kernel.org/all/CADj6jrgK+sRXoNaYH90Rdc8DYEFK2iSF4vkJC=KE4UaZ73y67A@mail.gmail.com]
>
> On Fri, Apr 03, 2026 at 11:48:15AM -0500, Mario Limonciello wrote:
>> On 4/3/26 11:19 AM, Joyful Lee wrote:
>>> On Fri, Apr 3, 2026 at 10:25 AM Mario Limonciello
>>> <mario.limonciello@amd.com> wrote:
>>>> That's really unfortunate to hear. If I was in your shoes - If not
>>>> solved by the end of the return period I would return the machine and
>>>> purchase one from a vendor that has been testing, fixing BIOS issues and
>>>> supporting Linux.
>>>>
>>>> I'm not going to pick favorites, but Dell, Framework, HP, and Lenovo all
>>>> have offerings that they do this.
>>>>
>>>> By chance does the BIOS offer access to "AMD PBS" or "AMD CBS" menus?
>>>> If so, there may be an option nestled in there for dGPU D3 behavior.
>>>
>>> I got it working. My kernel, which I build up from defconfig, was
>>> missing CONFIG_HOTPLUG_PCI_PCIE. It makes sense, but I wish there was
>>> more evidence that pointed me to this option. At least we can close the
>>> loop here in case anyone else runs into this problem. As a side benefit,
>>> enabling this option also got the dGPU to enter D3cold where before the
>>> lowest it would get is D3hot.
>>
>> That's great news! I'll add a check to flag this in amd-debug-tools too to
>> help anyone else in the future.
>
> That is indeed great news.
>
> But as you point out, it doesn't close the issue. Somebody else is
> going to trip over the same issue. Most likely they will not report
> it and have no idea how to fix it. Even if they do report it, we'll
> have to go through this whole debug process again.
>
> The kernel should work correctly (possibly with increased power
> consumption or some other non-functional issue) regardless of whether
> CONFIG_HOTPLUG_PCI_PCIE is enabled.

I do hope as part of this we can reconsider why CONFIG_HOTPLUG_PCI_PCIE
isn't part of the defconfig in the first place.

defconfig doesn't work on any hardware of mine by default and it's too
much work to figure out what to add to it. So I always start at distro
configs and peel back for my own use.

But, if we could actually make defconfig *usable* for general purpose
kernel users maybe more people would use it.

>
> How can we make Linux smart enough that if we're lacking pciehp or
> whatever is necessary, we automatically avoid s2idle or S0ix or
> whatever causes this problem?
I suppose we /could/ have CONFIG_AMD_PMC depend on
CONFIG_HOTPLUG_PCI_PCIE but it feels like using super glue on a wound
until we know why this happens.
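
Short of a Kconfig dependency, a diagnostic tool could simply warn when
the option is missing. Below is a minimal sketch of such a check, not the
actual amd-debug-tools code. The function name kconfig_option_state and
the sample config text are made up for illustration, and it assumes the
usual kernel .config line format ("CONFIG_FOO=y" or "# CONFIG_FOO is not
set").

```python
import re


def kconfig_option_state(config_text: str, option: str) -> str:
    """Return the value assigned to `option`, or "n" if unset or absent.

    Hypothetical helper: expects text in the kernel .config format, e.g.
    the contents of /boot/config-<release> or a decompressed
    /proc/config.gz.
    """
    # Anchor on the full option name followed by "=" so that
    # CONFIG_HOTPLUG_PCI does not match CONFIG_HOTPLUG_PCI_PCIE.
    pattern = re.compile(rf"^{re.escape(option)}=(.*)$", re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1).strip() if match else "n"


# Illustrative input only, not taken from the reporter's machine.
sample = (
    "CONFIG_HOTPLUG_PCI=y\n"
    "# CONFIG_HOTPLUG_PCI_PCIE is not set\n"
    "CONFIG_AMD_PMC=m\n"
)

if kconfig_option_state(sample, "CONFIG_HOTPLUG_PCI_PCIE") != "y":
    print("warning: CONFIG_HOTPLUG_PCI_PCIE is not enabled")
```

A check like this only reports the configuration; it does not explain or
work around the underlying failure.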
>
> Here are the obvious clues in dmesg during resume from S0ix:
>
> pci 0000:00:03.1: [1022:150b] type 01 class 0x060400 PCIe Root Port
> pci 0000:00:03.1: PCI bridge to [bus c4]
> pci 0000:c4:00.0: [10de:28a1] type 00 class 0x030000 PCIe Legacy Endpoint
> pci 0000:c4:00.1: [10de:22be] type 00 class 0x040300 PCIe Endpoint
> pci 0000:c4:00.1: extending delay after power-on from D3hot to 20 msec
> pci 0000:c4:00.1: D0 power state depends on 0000:c4:00.0
> pci 0000:c4:00.0: Unable to change power state from D0 to D0, device inaccessible
> snd_hda_intel 0000:c4:00.1: Unable to change power state from D3hot to D0, device inaccessible
>
> It looks like something is wrong with the 00:03.1 Root Port config
> space after S0ix, e.g., the HwInit Port Number is non-sensical:
>
> 00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Strix/Strix Halo GPP Bridge
> - LnkCap: Port #0, Speed 16GT/s, Width x8, ASPM not supported
> + LnkCap: Port #247, Speed 16GT/s, Width x8, ASPM not supported
>
> but the c4:00 device below it seems completely inaccessible; maybe the
> link is down or the endpoint is in D3cold so config reads return ~0:
>
> c4:00.0 VGA compatible controller [0300]: NVIDIA Corporation AD107M
> + !!! Unknown header type 7f
> + Interrupt: pin ? routed to IRQ 255
>
> I don't know what pciehp is doing that avoids this issue.
> Understanding that seems like the first step in avoiding or fixing the
> problem.

If I had to guess, the firmware* has never been tested with the
PcieHotplug _OSC negotiation failing, and it implicitly assumes that
negotiation succeeds. All Windows testing has that in place, and our
internal Linux testing has always been on kernels with it too (see
defconfig comment above).
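
For anyone following along, the two lspci symptoms quoted above can be
decoded by hand. This is a minimal sketch assuming the standard field
layouts (Header Type: bits 6:0 are the layout, bit 7 is the
multi-function flag; Link Capabilities: Port Number in bits 31:24). The
helper names are hypothetical and the raw values are illustrative, not
read from the affected machine.

```python
def decode_header_type(raw: int) -> tuple[int, bool]:
    """Split a Header Type byte into (layout, multi_function)."""
    return raw & 0x7F, bool(raw & 0x80)


def lnkcap_port_number(lnkcap: int) -> int:
    """Extract the Port Number field (bits 31:24) from Link Capabilities."""
    return (lnkcap >> 24) & 0xFF


# A config read that returns all ones yields a header byte of 0xff,
# which decodes to layout 0x7f: the "Unknown header type 7f" case.
layout, multi_function = decode_header_type(0xFF)
print(f"header layout: {layout:#04x}")  # prints "header layout: 0x7f"

# Port #247 is 0xf7, so that field is not all ones (that would be 255).
print(f"port number field: {247:#04x}")  # prints "port number field: 0xf7"
print(lnkcap_port_number(0xFFFFFFFF))  # prints "255"
```

So the endpoint looks like a plain all-ones read, while the root port's
Port Number changed to a value that is not all ones.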
>
> Joyful, could you collect a dmesg log with pciehp enabled and with
> this kernel parameter (the quotes are a required part of the
> parameter):
>
> dyndbg="file drivers/pci/* +p"
* I don't know if this is an ASUS firmware or AMD (AGESA) firmware issue.