* [PATCH 01/11] drm/amdgpu/gfx9: manually control gfxoff for CS on RV
From: Alex Deucher @ 2025-02-03 21:43 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher, Błażej Szczygieł, Sergey Kovalenko

When Mesa started using compute queues more often,
we started seeing additional hangs with compute queues.
Disabling gfxoff seems to mitigate that.  Manually
control gfxoff and gfx pg around command submissions to avoid
any issues related to gfxoff.  KFD already does the same
thing for these chips.

v2: limit to compute
v3: limit to APUs
v4: limit to Raven/PCO
v5: only update the compute ring_funcs
v6: Disable GFX PG
v7: adjust order

Suggested-by: Błażej Szczygieł <mumei6102@gmail.com>
Suggested-by: Sergey Kovalenko <seryoga.engineering@gmail.com>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3861
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-January/119116.html
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 36 +++++++++++++++++++++++++--
 1 file changed, 34 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 6aa713cfa2f3e..a666832ecefea 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -7443,6 +7443,38 @@ static void gfx_v9_0_ring_emit_cleaner_shader(struct amdgpu_ring *ring)
 	amdgpu_ring_write(ring, 0);  /* RESERVED field, programmed to zero */
 }
 
+static void gfx_v9_0_ring_begin_use_compute(struct amdgpu_ring *ring)
+{
+	struct amdgpu_device *adev = ring->adev;
+	struct amdgpu_ip_block *gfx_block =
+		amdgpu_device_ip_get_ip_block(adev, AMD_IP_BLOCK_TYPE_GFX);
+
+	amdgpu_gfx_enforce_isolation_ring_begin_use(ring);
+
+	/* Raven and PCO APUs seem to have stability issues
+	 * with compute and gfxoff and gfx pg.  Disable gfx pg during
+	 * submission and allow again afterwards.
+	 */
+	if (gfx_block && amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 1, 0))
+		gfx_v9_0_set_powergating_state(gfx_block, AMD_PG_STATE_UNGATE);
+}
+
+static void gfx_v9_0_ring_end_use_compute(struct amdgpu_ring *ring)
+{
+	struct amdgpu_device *adev = ring->adev;
+	struct amdgpu_ip_block *gfx_block =
+		amdgpu_device_ip_get_ip_block(adev, AMD_IP_BLOCK_TYPE_GFX);
+
+	/* Raven and PCO APUs seem to have stability issues
+	 * with compute and gfxoff and gfx pg.  Disable gfx pg during
+	 * submission and allow again afterwards.
+	 */
+	if (gfx_block && amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 1, 0))
+		gfx_v9_0_set_powergating_state(gfx_block, AMD_PG_STATE_GATE);
+
+	amdgpu_gfx_enforce_isolation_ring_end_use(ring);
+}
+
 static const struct amd_ip_funcs gfx_v9_0_ip_funcs = {
 	.name = "gfx_v9_0",
 	.early_init = gfx_v9_0_early_init,
@@ -7619,8 +7651,8 @@ static const struct amdgpu_ring_funcs gfx_v9_0_ring_funcs_compute = {
 	.emit_wave_limit = gfx_v9_0_emit_wave_limit,
 	.reset = gfx_v9_0_reset_kcq,
 	.emit_cleaner_shader = gfx_v9_0_ring_emit_cleaner_shader,
-	.begin_use = amdgpu_gfx_enforce_isolation_ring_begin_use,
-	.end_use = amdgpu_gfx_enforce_isolation_ring_end_use,
+	.begin_use = gfx_v9_0_ring_begin_use_compute,
+	.end_use = gfx_v9_0_ring_end_use_compute,
 };
 
 static const struct amdgpu_ring_funcs gfx_v9_0_ring_funcs_kiq = {
-- 
2.48.1
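
For readers who want to see the begin_use/end_use bracketing from the
patch in isolation, here is a minimal userspace sketch.  It is not
driver code: every mock_* name, the needs_workaround flag (standing in
for the GC 9.1.0 check) and the isolation_depth counter (standing in
for the enforce-isolation helpers) are hypothetical and exist only for
illustration.  It only models the ordering: isolation begins first,
power gating is ungated for the submission, then gated again before
isolation ends.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's types; not the real amdgpu structs. */
enum mock_pg_state { MOCK_PG_STATE_GATE, MOCK_PG_STATE_UNGATE };

struct mock_device {
	bool needs_workaround;    /* models the Raven/PCO (GC 9.1.0) check */
	enum mock_pg_state pg;    /* current power-gating state */
	int isolation_depth;      /* models enforce-isolation begin/end pairing */
};

struct mock_ring {
	struct mock_device *dev;
};

static void mock_set_powergating_state(struct mock_device *dev,
				       enum mock_pg_state state)
{
	dev->pg = state;
}

/* Mirrors the order in the patch: isolation first, then ungate. */
static void mock_ring_begin_use_compute(struct mock_ring *ring)
{
	struct mock_device *dev = ring->dev;

	dev->isolation_depth++;
	if (dev->needs_workaround)
		mock_set_powergating_state(dev, MOCK_PG_STATE_UNGATE);
}

/* Mirrors the order in the patch: gate again, then end isolation. */
static void mock_ring_end_use_compute(struct mock_ring *ring)
{
	struct mock_device *dev = ring->dev;

	if (dev->needs_workaround)
		mock_set_powergating_state(dev, MOCK_PG_STATE_GATE);
	dev->isolation_depth--;
}

/* Brackets one pretend submission and reports the PG state seen during it. */
static enum mock_pg_state mock_submit(struct mock_ring *ring)
{
	enum mock_pg_state during;

	mock_ring_begin_use_compute(ring);
	during = ring->dev->pg;   /* the work would be written to the ring here */
	mock_ring_end_use_compute(ring);

	return during;
}
```

Calling mock_submit() on a device with needs_workaround set should
observe the ungated state during the submission and leave the device
gated afterwards; a device without the flag is left untouched.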



Thread overview: 18+ messages
2025-02-03 21:43 [PATCH 01/11] drm/amdgpu/gfx9: manually control gfxoff for CS on RV Alex Deucher
2025-02-03 21:43 ` [PATCH 02/11] drm/amdgpu: bump version for RV/PCO compute fix Alex Deucher
2025-02-03 21:43 ` [PATCH 03/11] drm/amdgpu/gfx: add amdgpu_gfx_off_ctrl_immediate() Alex Deucher
2025-02-03 21:43 ` [PATCH 04/11] drm/amdgpu/gfx9: use amdgpu_gfx_off_ctrl_immediate() for PG Alex Deucher
2025-02-06 14:48   ` Lazar, Lijo
2025-02-03 21:43 ` [PATCH 05/11] drm/amdgpu/sdma5.2: use amdgpu_gfx_off_ctrl_immediate() Alex Deucher
2025-02-05 16:04   ` Alex Deucher
2025-02-06 14:50   ` Lazar, Lijo
2025-02-06 15:25     ` Alex Deucher
2025-02-06 15:36       ` Lazar, Lijo
2025-02-06 15:39         ` Alex Deucher
2025-02-03 21:43 ` [PATCH 06/11] drm/amdgpu/gfx10: manually control gfxoff for CS Alex Deucher
2025-02-04  7:30   ` Lazar, Lijo
2025-02-03 21:43 ` [PATCH 07/11] drm/amdgpu/gfx11: " Alex Deucher
2025-02-03 21:43 ` [PATCH 08/11] drm/amdgpu/gfx12: " Alex Deucher
2025-02-03 21:43 ` [PATCH 09/11] drm/amdgpu/sdma5.0: " Alex Deucher
2025-02-03 21:43 ` [PATCH 10/11] drm/amdgpu/sdma6.0: " Alex Deucher
2025-02-03 21:43 ` [PATCH 11/11] drm/amdgpu/sdma7.0: " Alex Deucher
