From: "Timur Kristóf" <timur.kristof@gmail.com>
To: amd-gfx@lists.freedesktop.org
Cc: Alex Deucher <alexander.deucher@amd.com>,
Jiqian Chen <Jiqian.Chen@amd.com>,
Alex Deucher <alexander.deucher@amd.com>
Subject: Re: [PATCH 5/5] drm/amdgpu/gfx9: Implement KGQ ring reset
Date: Fri, 19 Dec 2025 14:46:02 -0600
Message-ID: <2577545.lZ2vcFHjTE@timur-max>
In-Reply-To: <20251219182201.5722-5-alexander.deucher@amd.com>
On Friday, December 19, 2025, 12:22:00 Central Time, Alex Deucher wrote:
> GFX ring resets work differently on pre-GFX10 hardware since
> there is no MQD managed by the scheduler.
> For ring reset, you need to issue the reset via CP_VMID_RESET
> via KIQ or MMIO and submit the following to the gfx ring to
> complete the reset:
> 1. EOP packet with EXEC bit set
> 2. WAIT_REG_MEM to wait for the fence
> 3. Clear CP_VMID_RESET to 0
> 4. EVENT_WRITE ENABLE_LEGACY_PIPELINE
> 5. EOP packet with EXEC bit set
> 6. WAIT_REG_MEM to wait for the fence
> Once those commands have completed the reset should
> be complete and the ring can accept new packets.
>
> Tested-by: Jiqian Chen <Jiqian.Chen@amd.com> (v1)
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Hi Alex,
Thank you for working on this.
For the entire series,
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
I can't test it at the moment but can give it a try in January or so.
Best regards,
Timur
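
The six steps listed in the commit message correspond to the term-by-term ring allocation in the patch (8 + 7 + 5 + 2 + 8 + 7). As a rough illustration only, and not part of the patch, the budget can be tabulated in standalone C. The struct, array, and function names below are hypothetical; the per-packet dword counts are taken from the comments in the quoted diff.

```c
#include <assert.h>
#include <stddef.h>

/* One packet of the gfx-ring sequence and the dword count the patch comments give it. */
struct kgq_reset_step {
	const char *desc;
	unsigned int ndw;
};

/* Hypothetical table; order follows steps 1-6 of the commit message. */
static const struct kgq_reset_step kgq_reset_steps[] = {
	{ "EOP fence with EXEC bit set",        8 },
	{ "WAIT_REG_MEM on the trailing fence", 7 },
	{ "WRITE_DATA clearing CP_VMID_RESET",  5 },
	{ "EVENT_WRITE ENABLE_LEGACY_PIPELINE", 2 },
	{ "EOP fence with EXEC bit set",        8 },
	{ "WAIT_REG_MEM on the trailing fence", 7 },
};

#define KGQ_RESET_NUM_STEPS \
	(sizeof(kgq_reset_steps) / sizeof(kgq_reset_steps[0]))

/* Sum the per-packet sizes; the patch passes the same sum to amdgpu_ring_alloc(). */
static unsigned int kgq_reset_total_dw(void)
{
	unsigned int total = 0;
	size_t i;

	for (i = 0; i < KGQ_RESET_NUM_STEPS; i++) {
		/* every step emits at least one dword */
		assert(kgq_reset_steps[i].ndw > 0);
		total += kgq_reset_steps[i].ndw;
	}
	return total;
}
```

With these counts kgq_reset_total_dw() returns 37. Keeping the sizes in a table next to the step descriptions makes it easier to spot a mismatch if a packet is added to or dropped from the sequence.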
> ---
> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 92 ++++++++++++++++++++++++++-
> 1 file changed, 89 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> index 0d8e797d59b8a..7e9d753f4a808 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> @@ -2411,8 +2411,10 @@ static int gfx_v9_0_sw_init(struct amdgpu_ip_block *ip_block)
>  		amdgpu_get_soft_full_reset_mask(&adev->gfx.gfx_ring[0]);
>  	adev->gfx.compute_supported_reset =
>  		amdgpu_get_soft_full_reset_mask(&adev->gfx.compute_ring[0]);
> -	if (!amdgpu_sriov_vf(adev) && !adev->debug_disable_gpu_ring_reset)
> +	if (!amdgpu_sriov_vf(adev) && !adev->debug_disable_gpu_ring_reset) {
>  		adev->gfx.compute_supported_reset |= AMDGPU_RESET_TYPE_PER_QUEUE;
> +		adev->gfx.gfx_supported_reset |= AMDGPU_RESET_TYPE_PER_QUEUE;
> +	}
>
>  	r = amdgpu_gfx_kiq_init(adev, GFX9_MEC_HPD_SIZE, 0);
>  	if (r) {
> @@ -7172,6 +7174,91 @@ static void gfx_v9_ring_insert_nop(struct amdgpu_ring *ring, uint32_t num_nop)
>  	amdgpu_ring_insert_nop(ring, num_nop - 1);
>  }
>
> +static void gfx_v9_0_ring_emit_wreg_me(struct amdgpu_ring *ring,
> +				       uint32_t reg,
> +				       uint32_t val)
> +{
> +	uint32_t cmd = 0;
> +
> +	switch (ring->funcs->type) {
> +	case AMDGPU_RING_TYPE_KIQ:
> +		cmd = (1 << 16); /* no inc addr */
> +		break;
> +	default:
> +		cmd = WR_CONFIRM;
> +		break;
> +	}
> +	amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3));
> +	amdgpu_ring_write(ring, cmd);
> +	amdgpu_ring_write(ring, reg);
> +	amdgpu_ring_write(ring, 0);
> +	amdgpu_ring_write(ring, val);
> +}
> +
> +static int gfx_v9_0_reset_kgq(struct amdgpu_ring *ring,
> +			      unsigned int vmid,
> +			      struct amdgpu_fence *timedout_fence)
> +{
> +	struct amdgpu_device *adev = ring->adev;
> +	struct amdgpu_kiq *kiq = &adev->gfx.kiq[0];
> +	struct amdgpu_ring *kiq_ring = &kiq->ring;
> +	unsigned long flags;
> +	u32 tmp;
> +	int r;
> +
> +	amdgpu_ring_reset_helper_begin(ring, timedout_fence);
> +
> +	spin_lock_irqsave(&kiq->ring_lock, flags);
> +
> +	if (amdgpu_ring_alloc(kiq_ring, 5)) {
> +		spin_unlock_irqrestore(&kiq->ring_lock, flags);
> +		return -ENOMEM;
> +	}
> +
> +	/* send the reset - 5 */
> +	tmp = REG_SET_FIELD(0, CP_VMID_RESET, RESET_REQUEST, 1 << vmid);
> +	gfx_v9_0_ring_emit_wreg(kiq_ring,
> +				SOC15_REG_OFFSET(GC, 0, mmCP_VMID_RESET), tmp);
> +	amdgpu_ring_commit(kiq_ring);
> +	r = amdgpu_ring_test_ring(kiq_ring);
> +	spin_unlock_irqrestore(&kiq->ring_lock, flags);
> +	if (r)
> +		return r;
> +
> +	if (amdgpu_ring_alloc(ring, 8 + 7 + 5 + 2 + 8 + 7))
> +		return -ENOMEM;
> +	/* emit the fence to finish the reset - 8 */
> +	ring->trail_seq++;
> +	gfx_v9_0_ring_emit_fence(ring, ring->trail_fence_gpu_addr,
> +				 ring->trail_seq, AMDGPU_FENCE_FLAG_EXEC);
> +	/* wait for the fence - 7 */
> +	gfx_v9_0_wait_reg_mem(ring, 0, 1, 0,
> +			      lower_32_bits(ring->trail_fence_gpu_addr),
> +			      upper_32_bits(ring->trail_fence_gpu_addr),
> +			      ring->trail_seq, 0xffffffff, 4);
> +	/* clear mmCP_VMID_RESET - 5 */
> +	gfx_v9_0_ring_emit_wreg_me(ring,
> +				   SOC15_REG_OFFSET(GC, 0, mmCP_VMID_RESET), 0);
> +	/* event write ENABLE_LEGACY_PIPELINE - 2 */
> +	gfx_v9_0_ring_emit_event_write(ring, ENABLE_LEGACY_PIPELINE, 0);
> +	/* emit a regular fence - 8 */
> +	ring->trail_seq++;
> +	gfx_v9_0_ring_emit_fence(ring, ring->trail_fence_gpu_addr,
> +				 ring->trail_seq, AMDGPU_FENCE_FLAG_EXEC);
> +	/* wait for the fence - 7 */
> +	gfx_v9_0_wait_reg_mem(ring, 1, 1, 0,
> +			      lower_32_bits(ring->trail_fence_gpu_addr),
> +			      upper_32_bits(ring->trail_fence_gpu_addr),
> +			      ring->trail_seq, 0xffffffff, 4);
> +	amdgpu_ring_commit(ring);
> +	/* wait for the commands to complete */
> +	r = amdgpu_ring_test_ring(ring);
> +	if (r)
> +		return r;
> +
> +	return amdgpu_ring_reset_helper_end(ring, timedout_fence);
> +}
> +
>  static int gfx_v9_0_reset_kcq(struct amdgpu_ring *ring,
>  			      unsigned int vmid,
>  			      struct amdgpu_fence *timedout_fence)
> @@ -7450,9 +7537,9 @@ static const struct amdgpu_ring_funcs gfx_v9_0_ring_funcs_gfx = {
>  	.emit_wreg = gfx_v9_0_ring_emit_wreg,
>  	.emit_reg_wait = gfx_v9_0_ring_emit_reg_wait,
>  	.emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait,
> -	.soft_recovery = gfx_v9_0_ring_soft_recovery,
>  	.emit_mem_sync = gfx_v9_0_emit_mem_sync,
>  	.emit_cleaner_shader = gfx_v9_0_ring_emit_cleaner_shader,
> +	.reset = gfx_v9_0_reset_kgq,
>  	.begin_use = amdgpu_gfx_enforce_isolation_ring_begin_use,
>  	.end_use = amdgpu_gfx_enforce_isolation_ring_end_use,
>  };
> @@ -7551,7 +7638,6 @@ static const struct amdgpu_ring_funcs gfx_v9_0_ring_funcs_compute = {
>  	.emit_wreg = gfx_v9_0_ring_emit_wreg,
>  	.emit_reg_wait = gfx_v9_0_ring_emit_reg_wait,
>  	.emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait,
> -	.soft_recovery = gfx_v9_0_ring_soft_recovery,
>  	.emit_mem_sync = gfx_v9_0_emit_mem_sync,
>  	.emit_wave_limit = gfx_v9_0_emit_wave_limit,
>  	.reset = gfx_v9_0_reset_kcq,
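
For readers following the "- 5" size comments in the quoted patch: gfx_v9_0_ring_emit_wreg_me() writes one PM4 type-3 header followed by four body dwords. The standalone sketch below mimics that layout with a plain array in place of the real ring. It is an illustration only; the macro values are reproduced from memory of the amdgpu headers and should be treated as assumptions, and fake_ring, fake_ring_write() and emit_wreg_me() are hypothetical stand-ins, not driver API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, recalled from amdgpu's soc15d.h; verify against the tree. */
#define PACKET_TYPE3       3u
#define PACKET3_WRITE_DATA 0x37u
#define WR_CONFIRM         (1u << 20)

/*
 * PM4 type-3 header: packet type in bits 31:30, count in bits 29:16,
 * opcode in bits 15:8. The count field holds the number of body dwords
 * minus one, so count = 3 means four dwords follow the header.
 */
static uint32_t packet3(uint32_t op, uint32_t count)
{
	return (PACKET_TYPE3 << 30) | ((count & 0x3FFFu) << 16) |
	       ((op & 0xFFu) << 8);
}

/* Hypothetical stand-in for a ring buffer: a fixed array and a write index. */
struct fake_ring {
	uint32_t buf[16];
	size_t wptr;
};

/* Hypothetical stand-in for amdgpu_ring_write(). */
static void fake_ring_write(struct fake_ring *r, uint32_t v)
{
	assert(r->wptr < sizeof(r->buf) / sizeof(r->buf[0]));
	r->buf[r->wptr++] = v;
}

/* Mirrors the five writes of the non-KIQ path; returns dwords emitted. */
static size_t emit_wreg_me(struct fake_ring *r, uint32_t reg, uint32_t val)
{
	size_t start = r->wptr;

	fake_ring_write(r, packet3(PACKET3_WRITE_DATA, 3)); /* header */
	fake_ring_write(r, WR_CONFIRM);                     /* control word */
	fake_ring_write(r, reg);                            /* register offset */
	fake_ring_write(r, 0);                              /* written as 0 in the patch */
	fake_ring_write(r, val);                            /* value to write */
	return r->wptr - start;
}
```

Under these assumptions one call emits five dwords, which matches the 5 that the patch reserves for clearing mmCP_VMID_RESET on the gfx ring.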