From: Mario Limonciello <superm1@kernel.org>
To: "Peter Zijlstra" <peterz@infradead.org>,
"Christian König" <christian.koenig@amd.com>
Cc: alexander.deucher@amd.com, Borislav Petkov <bp@alien8.de>,
amd-gfx@lists.freedesktop.org
Subject: Re: amdgpu vs kexec
Date: Tue, 17 Jun 2025 21:12:12 -0500
Message-ID: <2bbcc44d-9079-4a73-ba6c-e93fdcb9cf6f@kernel.org>
In-Reply-To: <20250616145437.GG1613376@noisy.programming.kicks-ass.net>
On 6/16/2025 9:54 AM, Peter Zijlstra wrote:
> On Mon, Jun 16, 2025 at 01:51:21PM +0200, Christian König wrote:
>> Hi Peter,
>>
>> On 6/16/25 11:39, Peter Zijlstra wrote:
>>> Hi guys,
>>>
>>> My (Intel Sapphire Rapids) workstation has a RX 7800 XT and when I kexec
>>> a bunch of times, the amdgpu driver gets upset and barfs on boot.
>>
>> yeah, that is an "intentional" HW feature and yes you're certainly not
>> the first one to complain about it :(
>>
>> The PSP (platform security processor IIRC) is designed in such a way
>> that you can initialize it only once after a power cycle / hard reset
>> for security reasons (e.g. to not leak crypto keys used for digital
>> rights management etc..).
>>
>> On dGPUs we work around that manually by power cycling the ASIC when
>> that situation is detected during amdgpu load, but that unfortunately
>> doesn't work 100% reliable.
>
> Right.. hence the splats.

How about if we reset before the kexec? There is a symbol,
kexec_in_progress, that drivers can check to know they're about to go
through kexec and do $THINGS.

Something like this:

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 0fc0eeedc6461..2b1216b14d618 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -34,6 +34,7 @@
 
 #include <linux/cc_platform.h>
 #include <linux/dynamic_debug.h>
+#include <linux/kexec.h>
 #include <linux/module.h>
 #include <linux/mmu_notifier.h>
 #include <linux/pm_runtime.h>
@@ -2544,6 +2545,9 @@ amdgpu_pci_shutdown(struct pci_dev *pdev)
 	adev->mp1_state = PP_MP1_STATE_UNLOAD;
 	amdgpu_device_ip_suspend(adev);
 	adev->mp1_state = PP_MP1_STATE_NONE;
+
+	if (kexec_in_progress)
+		amdgpu_asic_reset(adev);
 }
 
 static int amdgpu_pmops_prepare(struct device *dev)
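
To make the intended ordering concrete, here is a rough userspace model
of the hunk above. It is only a sketch, not kernel code: every mock_*
name, the MOCK_MP1_* constants and model_shutdown_resets() are
hypothetical stand-ins, and the local kexec_in_progress flag merely
imitates the kernel global of the same name. The only thing it tries to
show is that the reset runs after the IP suspend, and only when a kexec
is pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel global declared in <linux/kexec.h>. */
static bool kexec_in_progress;

/* Counters so a caller can observe what the modelled shutdown did. */
static int suspend_calls;
static int reset_calls;

/* Hypothetical, heavily reduced device state. */
enum { MOCK_MP1_STATE_NONE, MOCK_MP1_STATE_UNLOAD };
struct mock_adev { int mp1_state; };

/* Stand-in for amdgpu_device_ip_suspend(). */
static void mock_ip_suspend(struct mock_adev *adev)
{
	(void)adev;
	suspend_calls++;
}

/* Stand-in for amdgpu_asic_reset(). */
static void mock_asic_reset(struct mock_adev *adev)
{
	(void)adev;
	reset_calls++;
}

/* Mirrors the tail of amdgpu_pci_shutdown() with the proposed change. */
static void mock_pci_shutdown(struct mock_adev *adev)
{
	assert(adev != NULL);

	adev->mp1_state = MOCK_MP1_STATE_UNLOAD;
	mock_ip_suspend(adev);
	adev->mp1_state = MOCK_MP1_STATE_NONE;

	/* Only reset when the shutdown is part of a kexec. */
	if (kexec_in_progress)
		mock_asic_reset(adev);
}

/* Runs one modelled shutdown and reports how many resets it issued. */
static int model_shutdown_resets(bool kexec)
{
	struct mock_adev adev = { .mp1_state = MOCK_MP1_STATE_NONE };

	reset_calls = 0;
	kexec_in_progress = kexec;
	mock_pci_shutdown(&adev);
	return reset_calls;
}
```

Calling model_shutdown_resets(false) models a normal reboot or poweroff
and issues no reset; model_shutdown_resets(true) models the kexec path
and issues exactly one. Whether a real reset at that point is enough for
the PSP is a separate question that this model says nothing about.
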
>
>> On APUs the situation is even worse because the PSP is shared between
>> the GPU and the CPU.
>>
>> We have forwarded such complains internally for years, but there is
>> not much else Alex and I can do about it.
>
> Oh well. Thanks for the info!
>