From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Ahmad Rehman <Ahmad.Rehman@amd.com>,
Felix Kuehling <Felix.Kuehling@amd.com>,
Alex Deucher <alexander.deucher@amd.com>,
Sasha Levin <sashal@kernel.org>,
christian.koenig@amd.com, Xinhui.Pan@amd.com, airlied@gmail.com,
daniel@ffwll.ch, mario.limonciello@amd.com, lijo.lazar@amd.com,
le.ma@amd.com, srinivasan.shanmugam@amd.com,
andrealmeid@igalia.com, Jun.Ma2@amd.com, James.Zhu@amd.com,
hamza.mahfooz@amd.com, aurabindo.pillai@amd.com,
amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: [PATCH AUTOSEL 6.8 28/28] drm/amdgpu: Init zone device and drm client after mode-1 reset on reload
Date: Wed, 3 Apr 2024 13:16:30 -0400
Message-ID: <20240403171656.335224-28-sashal@kernel.org>
In-Reply-To: <20240403171656.335224-1-sashal@kernel.org>
From: Ahmad Rehman <Ahmad.Rehman@amd.com>
[ Upstream commit f679fd6057fbf5ab34aaee28d58b7f81af0cbf48 ]
In a passthrough environment, when amdgpu is reloaded after an unload, a
mode-1 reset is triggered after initializing the necessary IPs. That init
does not include KFD; KFD init waits until the reset is completed. KFD
init is then called in the reset handler, but in this case the zone device
and the drm client are not initialized, causing applications to trigger a
kernel panic.
v2: Remove the init KFD condition from amdgpu_amdkfd_drm_client_create,
as the previous version has the potential to create the DRM client twice.
v3: The v2 patch results in an SDMA engine hang, as DRM open causes a VM
clear to be sent to SDMA before SDMA init. Add the condition in drm client
creation, on top of v1, to guard against drm client creation being called
multiple times.
Signed-off-by: Ahmad Rehman <Ahmad.Rehman@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
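Note (not part of the upstream commit): the guard added to
amdgpu_amdkfd_drm_client_create() can be pictured with the small userspace
sketch below. All mock_* names are hypothetical stand-ins, not amdgpu or DRM
APIs, and the sketch assumes that registering the client sets its dev
pointer, as the patched check on adev->kfd.client.dev implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver state; not the real amdgpu structs. */
struct mock_client {
	void *dev;              /* non-NULL once the client has been registered */
};

struct mock_kfd {
	bool init_complete;     /* set by mock_kfd_device_init() */
	struct mock_client client;
	int client_inits;       /* counts registrations, for illustration only */
};

static void mock_kfd_device_init(struct mock_kfd *kfd)
{
	kfd->init_complete = true;
}

/*
 * Mirrors the shape of the patched check: do nothing if KFD is not
 * ready yet, or if the client has already been registered.
 */
static int mock_drm_client_create(struct mock_kfd *kfd, void *dev)
{
	assert(kfd != NULL && dev != NULL);

	if (!kfd->init_complete || kfd->client.dev)
		return 0;

	kfd->client.dev = dev;
	kfd->client_inits++;
	return 0;
}

/*
 * Replays the reload path: an early call while KFD is not ready, the
 * reset-handler call after KFD init, then a repeated call.
 * Returns how many times the client was actually registered.
 */
static int mock_reload_sequence(void)
{
	struct mock_kfd kfd = { 0 };
	int dummy_dev = 0;

	mock_drm_client_create(&kfd, &dummy_dev);   /* too early: no-op */
	mock_kfd_device_init(&kfd);
	mock_drm_client_create(&kfd, &dummy_dev);   /* registers the client */
	mock_drm_client_create(&kfd, &dummy_dev);   /* guarded: no-op */
	return kfd.client_inits;
}
```

In this model the client is registered once regardless of how many call
sites reach the create function, which is the property the v3 note asks for.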
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 5 ++++-
2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 41db030ddc4ee..131983ed43465 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -146,7 +146,7 @@ int amdgpu_amdkfd_drm_client_create(struct amdgpu_device *adev)
 {
 	int ret;
 
-	if (!adev->kfd.init_complete)
+	if (!adev->kfd.init_complete || adev->kfd.client.dev)
 		return 0;
 
 	ret = drm_client_init(&adev->ddev, &adev->kfd.client, "kfd",
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 586f4d03039df..64b1bb2404242 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2451,8 +2451,11 @@ static void amdgpu_drv_delayed_reset_work_handler(struct work_struct *work)
 	}
 	for (i = 0; i < mgpu_info.num_dgpu; i++) {
 		adev = mgpu_info.gpu_ins[i].adev;
-		if (!adev->kfd.init_complete)
+		if (!adev->kfd.init_complete) {
+			kgd2kfd_init_zone_device(adev);
 			amdgpu_amdkfd_device_init(adev);
+			amdgpu_amdkfd_drm_client_create(adev);
+		}
 		amdgpu_ttm_set_buffer_funcs_status(adev, true);
 	}
 }
--
2.43.0