From: "Christian König" <christian.koenig@amd.com>
To: Mario Limonciello <superm1@kernel.org>,
Bert Karwatzki <spasswolf@web.de>,
linux-kernel@vger.kernel.org
Cc: linux-next@vger.kernel.org, regressions@lists.linux.dev,
linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org,
"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>
Subject: Re: [REGRESSION 00/04] Crash during resume of pcie bridge
Date: Tue, 14 Oct 2025 12:50:44 +0200
Message-ID: <25f36fa7-d1d6-4b81-a42f-64c445d6f065@amd.com>
In-Reply-To: <4a8302a0-209f-446a-9825-36cb267c1718@kernel.org>
On 13.10.25 20:51, Mario Limonciello wrote:
> On 10/13/25 11:29 AM, Bert Karwatzki wrote:
>> On Tuesday, 07.10.2025 at 16:33 -0500, Mario Limonciello wrote:
>>>
>>> Can you still reproduce with amd_iommu=off?
>>
>> Reproducing this at all is very difficult, so I'll first try to find the exact spot
>> where things break (i.e. where the PCI bus breaks and no more messages are transmitted
>> via netconsole). The current state of this search is that the crash occurs in
>> pci_pm_runtime_resume(), before pci_fixup_device() is called:
>>
>
> One other (unfortunate) possibility is that the timing of this crash occurring is not deterministic.
Yeah, completely agree.
I think the exact spot where things break is actually pretty uninteresting, because it is most likely not the spot that caused the issue.
What happens instead is that something in the HW times out, and you see a spontaneous reboot because of this.
I would rather try to narrow down which operation or combination of things is causing the issue.
Maybe also double-check whether runtime PM is actually working on the good kernel. The issue might be that somebody fixed runtime PM and you are now seeing problems because you happen to have problematic HW which we need to add to the blacklist.
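One way to do that check, as a minimal sketch: read the standard runtime PM attributes the kernel exposes under the device's power/ directory in sysfs and compare them between the good and the bad kernel. The helper names read_attr() and print_rpm_state() are made up for illustration, and the bridge address 0000:00:01.1 is simply the one from the test quoted below.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read one attribute file "<dir>/<attr>" into buf and strip the trailing
 * newline. Returns 0 on success, -1 if the file cannot be read.
 * (Hypothetical helper, not part of any kernel or library API.)
 */
static int read_attr(const char *dir, const char *attr, char *buf, size_t len)
{
	char path[512];
	FILE *f;
	int n;

	if (len == 0)
		return -1;

	n = snprintf(path, sizeof(path), "%s/%s", dir, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/*
 * Print the runtime PM attributes of the PCI device with the given
 * domain:bus:device.function address, e.g. "0000:00:01.1".
 */
static void print_rpm_state(const char *bdf)
{
	static const char *const attrs[] = {
		"control",			/* "auto" allows runtime PM, "on" forbids it */
		"runtime_status",		/* e.g. "active" or "suspended" */
		"runtime_active_time",
		"runtime_suspended_time",
	};
	char dir[256];
	char val[64];
	size_t i;

	snprintf(dir, sizeof(dir), "/sys/bus/pci/devices/%s/power", bdf);

	for (i = 0; i < sizeof(attrs) / sizeof(attrs[0]); i++) {
		if (read_attr(dir, attrs[i], val, sizeof(val)) == 0)
			printf("%s %s = %s\n", bdf, attrs[i], val);
		else
			printf("%s %s = <unreadable>\n", bdf, attrs[i]);
	}
}
```

Calling print_rpm_state("0000:00:01.1") on both kernels after the system has been idle for a while should show the difference. If control reads "on" or runtime_suspended_time never grows on the good kernel, the bridge probably never runtime suspended there in the first place.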
Regards,
Christian.
>
> As an idea for debugging this issue, do you think maybe using kdumpst [1] might be helpful to get more information on the state during the crash?
>
> Since NVME is missing you might need to boot off of USB or SD though so that kdumpst is able to save the vmcore out of RAM.
>
> Link: https://blogs.igalia.com/gpiccoli/2024/07/presenting-kdumpst-or-how-to-collect-kernel-crash-logs-on-arch-linux/ [1]
>> static int pci_pm_runtime_resume(struct device *dev)
>> {
>> 	struct pci_dev *pci_dev = to_pci_dev(dev);
>> 	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
>> 	pci_power_t prev_state = pci_dev->current_state;
>> 	int error = 0;
>> 	// dev_info(dev, "%s = %px\n", __func__, (void *) pci_pm_runtime_resume); // remove this so we don't get too much delay
>> 	// This was still printed in the case of a crash
>> 	// so the crash must happen below
>>
>> 	/*
>> 	 * Restoring config space is necessary even if the device is not bound
>> 	 * to a driver because although we left it in D0, it may have gone to
>> 	 * D3cold when the bridge above it runtime suspended.
>> 	 */
>> 	pci_pm_default_resume_early(pci_dev);
>> 	if (!strcmp(dev_name(dev), "0000:00:01.1")) // This is the current test.
>> 		dev_info(dev, "%s %d\n", __func__, __LINE__);
>> 	pci_resume_ptm(pci_dev);
>>
>> 	if (!pci_dev->driver)
>> 		return 0;
>>
>> 	//if (!strcmp(dev_name(dev), "0000:00:01.1")) // This was not printed when 6.17.0-rc6-next-20250917-gpudebug-00036-g4f7b4067c9ce
>> 	// dev_info(dev, "%s %d\n", __func__, __LINE__); // crashed, so the crash must happen above
>> 	pci_fixup_device(pci_fixup_resume_early, pci_dev);
>> 	pci_pm_default_resume(pci_dev);
>>
>> 	if (prev_state == PCI_D3cold)
>> 		pci_pm_bridge_power_up_actions(pci_dev);
>>
>> 	if (pm && pm->runtime_resume)
>> 		error = pm->runtime_resume(dev);
>>
>> 	return error;
>> }
>>
>>
>> Bert Karwatzki
>