From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Sasha Levin <sashal@kernel.org>,
katrinzhou@tencent.com, Jack.Gui@amd.com,
Guchun Chen <guchun.chen@amd.com>,
Longlong Yao <Longlong.Yao@amd.com>,
Feifei Xu <Feifei.Xu@amd.com>,
dri-devel@lists.freedesktop.org, Xinhui.Pan@amd.com,
amd-gfx@lists.freedesktop.org, YiPeng.Chai@amd.com,
mario.limonciello@amd.com, daniel@ffwll.ch, Lyndon.Li@amd.com,
Alex Deucher <alexander.deucher@amd.com>,
candice.li@amd.com, airlied@gmail.com, christian.koenig@amd.com,
Hawking.Zhang@amd.com
Subject: [PATCH AUTOSEL 6.1 02/41] drm/amdgpu: fix calltrace warning in amddrm_buddy_fini
Date: Sun, 23 Jul 2023 21:20:35 -0400
Message-ID: <20230724012118.2316073-2-sashal@kernel.org>
In-Reply-To: <20230724012118.2316073-1-sashal@kernel.org>
From: Longlong Yao <Longlong.Yao@amd.com>
[ Upstream commit 01382501509871d0799bab6bd412c228486af5bf ]
The following call trace is observed when removing the amdgpu driver. It
occurs because BOs allocated for psp are still not freed at removal time.
[61811.450562] RIP: 0010:amddrm_buddy_fini.cold+0x29/0x47 [amddrm_buddy]
[61811.450577] Call Trace:
[61811.450577] <TASK>
[61811.450579] amdgpu_vram_mgr_fini+0x135/0x1c0 [amdgpu]
[61811.450728] amdgpu_ttm_fini+0x207/0x290 [amdgpu]
[61811.450870] amdgpu_bo_fini+0x27/0xa0 [amdgpu]
[61811.451012] gmc_v9_0_sw_fini+0x4a/0x60 [amdgpu]
[61811.451166] amdgpu_device_fini_sw+0x117/0x520 [amdgpu]
[61811.451306] amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
[61811.451447] devm_drm_dev_init_release+0x4d/0x80 [drm]
[61811.451466] devm_action_release+0x15/0x20
[61811.451469] release_nodes+0x40/0xb0
[61811.451471] devres_release_all+0x9b/0xd0
[61811.451473] __device_release_driver+0x1bb/0x2a0
[61811.451476] driver_detach+0xf3/0x140
[61811.451479] bus_remove_driver+0x6c/0xf0
[61811.451481] driver_unregister+0x31/0x60
[61811.451483] pci_unregister_driver+0x40/0x90
[61811.451486] amdgpu_exit+0x15/0x447 [amdgpu]
For smu v13_0_2, if the GPU supports xgmi (see commit f5c7e7797060
("drm/amdgpu: Adjust removal control flow for smu v13_0_2")), the driver
runs gpu recover in AMDGPU_RESET_FOR_DEVICE_REMOVE mode when removing.
As a result, all devices in the hive list get a hw reset, but only the
basic ip blocks are resumed. The other ip blocks then do not call .hw_fini,
because that call depends on ip_block.status.hw.
Since psp_free_shared_bufs only performs software operations, move it to
psp_sw_fini.
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Longlong Yao <Longlong.Yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index a3cd816f98a14..9e6719a561587 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -514,6 +514,8 @@ static int psp_sw_fini(void *handle)
 	kfree(cmd);
 	cmd = NULL;
 
+	psp_free_shared_bufs(psp);
+
 	if (psp->km_ring.ring_mem)
 		amdgpu_bo_free_kernel(&adev->firmware.rbuf,
 				      &psp->km_ring.ring_mem_mc_addr,
@@ -2671,8 +2673,6 @@ static int psp_hw_fini(void *handle)
 
 	psp_ring_destroy(psp, PSP_RING_TYPE__KM);
 
-	psp_free_shared_bufs(psp);
-
 	return 0;
 }
 
--
2.39.2