From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: "Zhenneng Li" <lizhenneng@kylinos.cn>,
"Christian König" <christian.koenig@amd.com>,
"Alex Deucher" <alexander.deucher@amd.com>,
"Sasha Levin" <sashal@kernel.org>,
Xinhui.Pan@amd.com, airlied@linux.ie, daniel@ffwll.ch,
amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: [PATCH AUTOSEL 5.19 13/33] drm/radeon: add a force flush to delay work when radeon
Date: Tue, 30 Aug 2022 13:18:04 -0400 [thread overview]
Message-ID: <20220830171825.580603-13-sashal@kernel.org> (raw)
In-Reply-To: <20220830171825.580603-1-sashal@kernel.org>
From: Zhenneng Li <lizhenneng@kylinos.cn>
[ Upstream commit f461950fdc374a3ada5a63c669d997de4600dffe ]
Although radeon card fence and wait for gpu to finish processing current batch rings,
there is still a corner case that radeon lockup work queue may not be fully flushed,
and meanwhile the radeon_suspend_kms() function has called pci_set_power_state() to
put device in D3hot state.
Per PCI spec rev 4.0 on 5.3.1.4.1 D3hot State.
> Configuration and Message requests are the only TLPs accepted by a Function in
> the D3hot state. All other received Requests must be handled as Unsupported Requests,
> and all received Completions may optionally be handled as Unexpected Completions.
This issue will happen in following logs:
Unable to handle kernel paging request at virtual address 00008800e0008010
CPU 0 kworker/0:3(131): Oops 0
pc = [<ffffffff811bea5c>] ra = [<ffffffff81240844>] ps = 0000 Tainted: G W
pc is at si_gpu_check_soft_reset+0x3c/0x240
ra is at si_dma_is_lockup+0x34/0xd0
v0 = 0000000000000000 t0 = fff08800e0008010 t1 = 0000000000010000
t2 = 0000000000008010 t3 = fff00007e3c00000 t4 = fff00007e3c00258
t5 = 000000000000ffff t6 = 0000000000000001 t7 = fff00007ef078000
s0 = fff00007e3c016e8 s1 = fff00007e3c00000 s2 = fff00007e3c00018
s3 = fff00007e3c00000 s4 = fff00007fff59d80 s5 = 0000000000000000
s6 = fff00007ef07bd98
a0 = fff00007e3c00000 a1 = fff00007e3c016e8 a2 = 0000000000000008
a3 = 0000000000000001 a4 = 8f5c28f5c28f5c29 a5 = ffffffff810f4338
t8 = 0000000000000275 t9 = ffffffff809b66f8 t10 = ff6769c5d964b800
t11= 000000000000b886 pv = ffffffff811bea20 at = 0000000000000000
gp = ffffffff81d89690 sp = 00000000aa814126
Disabling lock debugging due to kernel taint
Trace:
[<ffffffff81240844>] si_dma_is_lockup+0x34/0xd0
[<ffffffff81119610>] radeon_fence_check_lockup+0xd0/0x290
[<ffffffff80977010>] process_one_work+0x280/0x550
[<ffffffff80977350>] worker_thread+0x70/0x7c0
[<ffffffff80977410>] worker_thread+0x130/0x7c0
[<ffffffff80982040>] kthread+0x200/0x210
[<ffffffff809772e0>] worker_thread+0x0/0x7c0
[<ffffffff80981f8c>] kthread+0x14c/0x210
[<ffffffff80911658>] ret_from_kernel_thread+0x18/0x20
[<ffffffff80981e40>] kthread+0x0/0x210
Code: ad3e0008 43f0074a ad7e0018 ad9e0020 8c3001e8 40230101
<88210000> 4821ed21
So force lockup work queue flush to fix this problem.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zhenneng Li <lizhenneng@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/gpu/drm/radeon/radeon_device.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index 429644d5ddc69..9fba16cb3f1e7 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -1604,6 +1604,9 @@ int radeon_suspend_kms(struct drm_device *dev, bool suspend,
if (r) {
/* delay GPU reset to resume */
radeon_fence_driver_force_completion(rdev, i);
+ } else {
+ /* finish executing delayed work */
+ flush_delayed_work(&rdev->fence_drv[i].lockup_work);
}
}
--
2.35.1
next prev parent reply other threads:[~2022-08-30 17:20 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-30 17:17 [PATCH AUTOSEL 5.19 01/33] firmware: dmi: Use the proper accessor for the version field Sasha Levin
2022-08-30 17:17 ` [PATCH AUTOSEL 5.19 02/33] scsi: qla2xxx: Disable ATIO interrupt coalesce for quad port ISP27XX Sasha Levin
2022-08-30 17:17 ` [PATCH AUTOSEL 5.19 03/33] scsi: core: Allow the ALUA transitioning state enough time Sasha Levin
2022-08-30 17:17 ` [PATCH AUTOSEL 5.19 04/33] scsi: megaraid_sas: Fix double kfree() Sasha Levin
2022-08-30 17:17 ` [PATCH AUTOSEL 5.19 05/33] drm/vc4: hdmi: Depends on CONFIG_PM Sasha Levin
2022-08-30 17:17 ` [PATCH AUTOSEL 5.19 06/33] drm/vc4: hdmi: Rework power up Sasha Levin
2022-08-30 17:17 ` [PATCH AUTOSEL 5.19 07/33] drm/gem: Fix GEM handle release errors Sasha Levin
2022-08-30 17:17 ` [PATCH AUTOSEL 5.19 08/33] perf/x86/core: Set pebs_capable and PMU_FL_PEBS_ALL for the Baseline Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 09/33] drm/amdgpu: Move psp_xgmi_terminate call from amdgpu_xgmi_remove_device to psp_hw_fini Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 10/33] drm/amdgpu: fix hive reference leak when adding xgmi device Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 11/33] drm/amdgpu: Check num_gfx_rings for gfx v9_0 rb setup Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 12/33] drm/amdgpu: Remove the additional kfd pre reset call for sriov Sasha Levin
2022-08-30 17:18 ` Sasha Levin [this message]
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 14/33] scsi: ufs: core: Reduce the power mode change timeout Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 15/33] Revert "parisc: Show error if wrong 32/64-bit compiler is being used" Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 16/33] parisc: ccio-dma: Handle kmalloc failure in ccio_init_resources() Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 17/33] parisc: Add runtime check to prevent PA2.0 kernels on PA1.x machines Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 18/33] arm64: errata: add detection for AMEVCNTR01 incrementing incorrectly Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 19/33] arm64: cacheinfo: Fix incorrect assignment of signed error value to unsigned fw_level Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 20/33] arm64/signal: Raise limit on stack frames Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 21/33] netfilter: conntrack: work around exceeded receive window Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 22/33] thermal/int340x_thermal: handle data_vault when the value is ZERO_SIZE_PTR Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 23/33] cpufreq: check only freq_table in __resolve_freq() Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 24/33] fec: Restart PPS after link state change Sasha Levin
2022-08-31 13:02 ` Csókás Bence
2022-09-09 1:21 ` Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 25/33] net/core/skbuff: Check the return value of skb_copy_bits() Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 26/33] md: Flush workqueue md_rdev_misc_wq in md_alloc() Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 27/33] fbdev: omapfb: Fix tests for platform_get_irq() failure Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 28/33] fbdev: fb_pm2fb: Avoid potential divide by zero error Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 29/33] fbdev: fbcon: Destroy mutex on freeing struct fb_info Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 30/33] fbdev: chipsfb: Add missing pci_disable_device() in chipsfb_pci_init() Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 31/33] x86/sev: Mark snp_abort() noreturn Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 32/33] drm/amdgpu: add sdma instance check for gfx11 CGCG Sasha Levin
2022-08-30 17:18 ` [PATCH AUTOSEL 5.19 33/33] drm/amdgpu: mmVM_L2_CNTL3 register not initialized correctly Sasha Levin
2022-08-30 21:32 ` [PATCH AUTOSEL 5.19 01/33] firmware: dmi: Use the proper accessor for the version field Jean Delvare
2022-08-31 11:50 ` Andy Shevchenko
2022-09-09 1:22 ` Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220830171825.580603-13-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=Xinhui.Pan@amd.com \
--cc=airlied@linux.ie \
--cc=alexander.deucher@amd.com \
--cc=amd-gfx@lists.freedesktop.org \
--cc=christian.koenig@amd.com \
--cc=daniel@ffwll.ch \
--cc=dri-devel@lists.freedesktop.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lizhenneng@kylinos.cn \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox