From: "Christian König" <christian.koenig@amd.com>
To: Thomas Gleixner <tglx@kernel.org>,
Bert Karwatzki <spasswolf@web.de>,
linux-kernel@vger.kernel.org
Cc: linux-next@vger.kernel.org,
Mario Limonciello <mario.limonciello@amd.com>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Clark Williams <clrkwllms@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
regressions@lists.linux.dev, linux-pci@vger.kernel.org,
linux-acpi@vger.kernel.org,
"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
acpica-devel@lists.linux.dev,
Robert Moore <robert.moore@intel.com>,
Saket Dumbre <saket.dumbre@intel.com>,
Bjorn Helgaas <bhelgaas@google.com>,
Clemens Ladisch <clemens@ladisch.de>,
Jinchao Wang <wangjinchao600@gmail.com>,
Yury Norov <yury.norov@gmail.com>,
Anna Schumaker <anna.schumaker@oracle.com>,
Baoquan He <bhe@redhat.com>,
"Darrick J. Wong" <djwong@kernel.org>,
Dave Young <dyoung@redhat.com>,
Doug Anderson <dianders@chromium.org>,
"Guilherme G. Piccoli" <gpiccoli@igalia.com>,
Helge Deller <deller@gmx.de>, Ingo Molnar <mingo@kernel.org>,
Jason Gunthorpe <jgg@ziepe.ca>,
Jonathan Cameron <Jonathan.Cameron@huawei.com>,
Joel Granados <joel.granados@kernel.org>,
John Ogness <john.ogness@linutronix.de>,
Kees Cook <kees@kernel.org>, Li Huafei <lihuafei1@huawei.com>,
"Luck, Tony" <tony.luck@intel.com>,
Luo Gengkun <luogengkun@huaweicloud.com>,
Max Kellermann <max.kellermann@ionos.com>,
Nam Cao <namcao@linutronix.de>,
oushixiong <oushixiong@kylinos.cn>,
Petr Mladek <pmladek@suse.com>,
Qianqiang Liu <qianqiang.liu@163.com>,
Sergey Senozhatsky <senozhatsky@chromium.org>,
Sohil Mehta <sohil.mehta@intel.com>, Tejun Heo <tj@kernel.org>,
Thomas Zimmermann <tzimmermann@suse.de>,
Thorsten Blum <thorsten.blum@linux.dev>,
Ville Syrjala <ville.syrjala@linux.intel.com>,
Vivek Goyal <vgoyal@redhat.com>,
Yunhui Cui <cuiyunhui@bytedance.com>,
Andrew Morton <akpm@linux-foundation.org>,
W_Armin@gmx.de
Subject: Re: crash during resume of PCIe bridge from v5.17 to next-20260130 (v5.16 works)
Date: Mon, 2 Feb 2026 11:37:12 +0100
Message-ID: <1331d331-7056-4a32-a69d-e4556bb117b0@amd.com>
In-Reply-To: <87a4xs2z6i.ffs@tglx>
On 2/1/26 17:42, Thomas Gleixner wrote:
> On Sun, Feb 01 2026 at 01:36, Bert Karwatzki wrote:
>> I found the error, the commit
>> ("drm/amd: Check if ASPM is enabled from PCIe subsystem")
>> has been applied twice, first as cba07cce39ac and a second time
>> as 7294863a6f01, after it had been superseded by commit
>> 0ab5d711ec74 ("drm/amd: Refactor `amdgpu_aspm` to be evaluated per device")
>> This effectively disables ASPM globally after the built-in GPU (which does not
>> support ASPM) is probed. This is the reason for the crashes and loss of devices
>> errors which on average occur after ~1000 resumes of the discrete GPU.
>
> Wow. Nice detective work...
Good catch, indeed.
But it is not clear to me why disabling ASPM causes trouble; usually it is the other way around.
Regards,
Christian.