* [PATCH 1/6] drm/amdgpu: don't reemit ring contents more than once
@ 2025-12-15 16:07 Alex Deucher
2025-12-15 16:07 ` [PATCH 2/6] drm/amdgpu: always backup and reemit fences Alex Deucher
` (5 more replies)
0 siblings, 6 replies; 19+ messages in thread
From: Alex Deucher @ 2025-12-15 16:07 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher
If we cancel a bad job and reemit the ring contents, and
we get another timeout, cancel everything rather than reemitting.
The wptr markers are only relevant for the original emit. If
we reemit, the wptr markers are no longer correct.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 22 +++++++++++++++++-----
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 ++
2 files changed, 19 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 1fe31d2f27060..334ddd6e48c06 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -710,6 +710,7 @@ void amdgpu_fence_driver_guilty_force_completion(struct amdgpu_fence *af)
struct amdgpu_ring *ring = af->ring;
unsigned long flags;
u32 seq, last_seq;
+ bool reemitted = false;
last_seq = amdgpu_fence_read(ring) & ring->fence_drv.num_fences_mask;
seq = ring->fence_drv.sync_seq & ring->fence_drv.num_fences_mask;
@@ -727,7 +728,9 @@ void amdgpu_fence_driver_guilty_force_completion(struct amdgpu_fence *af)
if (unprocessed && !dma_fence_is_signaled_locked(unprocessed)) {
fence = container_of(unprocessed, struct amdgpu_fence, base);
- if (fence == af)
+ if (fence->reemitted > 1)
+ reemitted = true;
+ else if (fence == af)
dma_fence_set_error(&fence->base, -ETIME);
else if (fence->context == af->context)
dma_fence_set_error(&fence->base, -ECANCELED);
@@ -735,9 +738,16 @@ void amdgpu_fence_driver_guilty_force_completion(struct amdgpu_fence *af)
rcu_read_unlock();
} while (last_seq != seq);
spin_unlock_irqrestore(&ring->fence_drv.lock, flags);
- /* signal the guilty fence */
- amdgpu_fence_write(ring, (u32)af->base.seqno);
- amdgpu_fence_process(ring);
+
+ if (reemitted) {
+ /* if we've already reemitted once then just cancel everything */
+ amdgpu_fence_driver_force_completion(af->ring);
+ af->ring->ring_backup_entries_to_copy = 0;
+ } else {
+ /* signal the guilty fence */
+ amdgpu_fence_write(ring, (u32)af->base.seqno);
+ amdgpu_fence_process(ring);
+ }
}
void amdgpu_fence_save_wptr(struct amdgpu_fence *af)
@@ -785,10 +795,12 @@ void amdgpu_ring_backup_unprocessed_commands(struct amdgpu_ring *ring,
/* save everything if the ring is not guilty, otherwise
* just save the content from other contexts.
*/
- if (!guilty_fence || (fence->context != guilty_fence->context))
+ if (!fence->reemitted &&
+ (!guilty_fence || (fence->context != guilty_fence->context)))
amdgpu_ring_backup_unprocessed_command(ring, wptr,
fence->wptr);
wptr = fence->wptr;
+ fence->reemitted++;
}
rcu_read_unlock();
} while (last_seq != seq);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index a1fb0fadb6eab..d881829528976 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -150,6 +150,8 @@ struct amdgpu_fence {
u64 wptr;
/* fence context for resets */
u64 context;
+ /* has this fence been reemitted */
+ unsigned int reemitted;
};
extern const struct drm_sched_backend_ops amdgpu_sched_ops;
--
2.52.0
^ permalink raw reply related [flat|nested] 19+ messages in thread* [PATCH 2/6] drm/amdgpu: always backup and reemit fences 2025-12-15 16:07 [PATCH 1/6] drm/amdgpu: don't reemit ring contents more than once Alex Deucher @ 2025-12-15 16:07 ` Alex Deucher 2025-12-18 5:17 ` Timur Kristóf 2025-12-15 16:07 ` [PATCH 3/6] drm/amdgpu: use dma_fence_get_status() for adapter reset Alex Deucher ` (4 subsequent siblings) 5 siblings, 1 reply; 19+ messages in thread From: Alex Deucher @ 2025-12-15 16:07 UTC (permalink / raw) To: amd-gfx; +Cc: Alex Deucher If when we backup the ring contents for reemit before a ring reset, we skip jobs associated with the bad context, however, we need to make sure the fences are reemited as unprocessed submissions may depend on them. v2: clean up fence handling, make helpers static Signed-off-by: Alex Deucher <alexander.deucher@amd.com> --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 24 ++++++++++++++++++----- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 5 ++++- 2 files changed, 23 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index 334ddd6e48c06..3a23cce5f769a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -89,6 +89,16 @@ static u32 amdgpu_fence_read(struct amdgpu_ring *ring) return seq; } +static void amdgpu_fence_save_fence_wptr_start(struct amdgpu_fence *af) +{ + af->fence_wptr_start = af->ring->wptr; +} + +static void amdgpu_fence_save_fence_wptr_end(struct amdgpu_fence *af) +{ + af->fence_wptr_end = af->ring->wptr; +} + /** * amdgpu_fence_emit - emit a fence on the requested ring * @@ -116,8 +126,10 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, struct amdgpu_fence *af, &ring->fence_drv.lock, adev->fence_context + ring->idx, seq); + amdgpu_fence_save_fence_wptr_start(af); amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr, seq, flags | AMDGPU_FENCE_FLAG_INT); + amdgpu_fence_save_fence_wptr_end(af); amdgpu_fence_save_wptr(af); pm_runtime_get_noresume(adev_to_drm(adev)->dev); ptr = &ring->fence_drv.fences[seq & ring->fence_drv.num_fences_mask]; @@ -743,10 +755,6 @@ void amdgpu_fence_driver_guilty_force_completion(struct amdgpu_fence *af) /* if we've already reemitted once then just cancel everything */ amdgpu_fence_driver_force_completion(af->ring); af->ring->ring_backup_entries_to_copy = 0; - } else { - /* signal the guilty fence */ - amdgpu_fence_write(ring, (u32)af->base.seqno); - amdgpu_fence_process(ring); } } @@ -796,9 +804,15 @@ void amdgpu_ring_backup_unprocessed_commands(struct amdgpu_ring *ring, * just save the content from other contexts. */ if (!fence->reemitted && - (!guilty_fence || (fence->context != guilty_fence->context))) + (!guilty_fence || (fence->context != guilty_fence->context))) { amdgpu_ring_backup_unprocessed_command(ring, wptr, fence->wptr); + } else if (!fence->reemitted) { + /* always save the fence */ + amdgpu_ring_backup_unprocessed_command(ring, + fence->fence_wptr_start, + fence->fence_wptr_end); + } wptr = fence->wptr; fence->reemitted++; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h index d881829528976..87c9df6c2ecfe 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h @@ -146,12 +146,15 @@ struct amdgpu_fence { struct amdgpu_ring *ring; ktime_t start_timestamp; - /* wptr for the fence for resets */ + /* wptr for the total submission for resets */ u64 wptr; /* fence context for resets */ u64 context; /* has this fence been reemitted */ unsigned int reemitted; + /* wptr for the fence for the submission */ + u64 fence_wptr_start; + u64 fence_wptr_end; }; extern const struct drm_sched_backend_ops amdgpu_sched_ops; -- 2.52.0 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 2/6] drm/amdgpu: always backup and reemit fences 2025-12-15 16:07 ` [PATCH 2/6] drm/amdgpu: always backup and reemit fences Alex Deucher @ 2025-12-18 5:17 ` Timur Kristóf 0 siblings, 0 replies; 19+ messages in thread From: Timur Kristóf @ 2025-12-18 5:17 UTC (permalink / raw) To: amd-gfx; +Cc: Alex Deucher, Alex Deucher On 2025. december 15., hétfő 10:07:07 középső államokbeli zónaidő Alex Deucher wrote: > If when we backup the ring contents for reemit before a > ring reset, we skip jobs associated with the bad > context, however, we need to make sure the fences > are reemited as unprocessed submissions may depend on > them. > > v2: clean up fence handling, make helpers static Nice work! We definitely need this when amdgpu_sched_hw_submission>2. This patch is: Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 24 ++++++++++++++++++----- > drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 5 ++++- > 2 files changed, 23 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index > 334ddd6e48c06..3a23cce5f769a 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > @@ -89,6 +89,16 @@ static u32 amdgpu_fence_read(struct amdgpu_ring *ring) > return seq; > } > > +static void amdgpu_fence_save_fence_wptr_start(struct amdgpu_fence *af) > +{ > + af->fence_wptr_start = af->ring->wptr; > +} > + > +static void amdgpu_fence_save_fence_wptr_end(struct amdgpu_fence *af) > +{ > + af->fence_wptr_end = af->ring->wptr; > +} > + > /** > * amdgpu_fence_emit - emit a fence on the requested ring > * > @@ -116,8 +126,10 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, struct > amdgpu_fence *af, &ring->fence_drv.lock, > adev->fence_context + ring->idx, seq); > > + amdgpu_fence_save_fence_wptr_start(af); > amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr, > seq, flags | AMDGPU_FENCE_FLAG_INT); > + amdgpu_fence_save_fence_wptr_end(af); > amdgpu_fence_save_wptr(af); > pm_runtime_get_noresume(adev_to_drm(adev)->dev); > ptr = &ring->fence_drv.fences[seq & ring- >fence_drv.num_fences_mask]; > @@ -743,10 +755,6 @@ void amdgpu_fence_driver_guilty_force_completion(struct > amdgpu_fence *af) /* if we've already reemitted once then just cancel > everything */ amdgpu_fence_driver_force_completion(af->ring); > af->ring->ring_backup_entries_to_copy = 0; > - } else { > - /* signal the guilty fence */ > - amdgpu_fence_write(ring, (u32)af->base.seqno); > - amdgpu_fence_process(ring); > } > } > > @@ -796,9 +804,15 @@ void amdgpu_ring_backup_unprocessed_commands(struct > amdgpu_ring *ring, * just save the content from other contexts. > */ > if (!fence->reemitted && > - (!guilty_fence || (fence->context != guilty_fence->context))) > + (!guilty_fence || (fence->context != guilty_fence->context))) { > amdgpu_ring_backup_unprocessed_command(ring, wptr, > fence->wptr); > + } else if (!fence->reemitted) { > + /* always save the fence */ > + amdgpu_ring_backup_unprocessed_command(ring, > + fence->fence_wptr_start, > + fence->fence_wptr_end); > + } > wptr = fence->wptr; > fence->reemitted++; > } > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h index > d881829528976..87c9df6c2ecfe 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > @@ -146,12 +146,15 @@ struct amdgpu_fence { > struct amdgpu_ring *ring; > ktime_t start_timestamp; > > - /* wptr for the fence for resets */ > + /* wptr for the total submission for resets */ > u64 wptr; > /* fence context for resets */ > u64 context; > /* has this fence been reemitted */ > unsigned int reemitted; > + /* wptr for the fence for the submission */ > + u64 fence_wptr_start; > + u64 fence_wptr_end; > }; > > extern const struct drm_sched_backend_ops amdgpu_sched_ops; ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 3/6] drm/amdgpu: use dma_fence_get_status() for adapter reset 2025-12-15 16:07 [PATCH 1/6] drm/amdgpu: don't reemit ring contents more than once Alex Deucher 2025-12-15 16:07 ` [PATCH 2/6] drm/amdgpu: always backup and reemit fences Alex Deucher @ 2025-12-15 16:07 ` Alex Deucher 2025-12-15 16:07 ` [PATCH 4/6] drm/amdgpu: avoid a warning in timedout job handler Alex Deucher ` (3 subsequent siblings) 5 siblings, 0 replies; 19+ messages in thread From: Alex Deucher @ 2025-12-15 16:07 UTC (permalink / raw) To: amd-gfx; +Cc: Alex Deucher We need to check if the fence was signaled without an error as the per queue resets may have signalled the fence while attempting to reset the queue. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 0c314d278b699..d6c44ecfc8356 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -6367,7 +6367,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, * * job->base holds a reference to parent fence */ - if (job && dma_fence_is_signaled(&job->hw_fence->base)) { + if (job && (dma_fence_get_status(&job->hw_fence->base) > 0)) { job_signaled = true; dev_info(adev->dev, "Guilty job already signaled, skipping HW reset"); goto skip_hw_reset; -- 2.52.0 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 4/6] drm/amdgpu: avoid a warning in timedout job handler 2025-12-15 16:07 [PATCH 1/6] drm/amdgpu: don't reemit ring contents more than once Alex Deucher 2025-12-15 16:07 ` [PATCH 2/6] drm/amdgpu: always backup and reemit fences Alex Deucher 2025-12-15 16:07 ` [PATCH 3/6] drm/amdgpu: use dma_fence_get_status() for adapter reset Alex Deucher @ 2025-12-15 16:07 ` Alex Deucher 2025-12-18 5:15 ` Timur Kristóf 2025-12-15 16:07 ` [PATCH 5/6] drm/amdgpu: mark fences with errors before ring reset Alex Deucher ` (2 subsequent siblings) 5 siblings, 1 reply; 19+ messages in thread From: Alex Deucher @ 2025-12-15 16:07 UTC (permalink / raw) To: amd-gfx; +Cc: Alex Deucher Only set an error on the fence if the fence is not signalled. We can end up with a warning if the per queue reset path signals the fence and sets an error as part of the reset, but fails to recover. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> --- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 67fde99724bad..7f5d01164897f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -147,7 +147,8 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job) dev_err(adev->dev, "Ring %s reset failed\n", ring->sched.name); } - dma_fence_set_error(&s_job->s_fence->finished, -ETIME); + if (dma_fence_get_status(&s_job->s_fence->finished) == 0) + dma_fence_set_error(&s_job->s_fence->finished, -ETIME); amdgpu_vm_put_task_info(ti); -- 2.52.0 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 4/6] drm/amdgpu: avoid a warning in timedout job handler 2025-12-15 16:07 ` [PATCH 4/6] drm/amdgpu: avoid a warning in timedout job handler Alex Deucher @ 2025-12-18 5:15 ` Timur Kristóf 2025-12-18 15:41 ` Alex Deucher 0 siblings, 1 reply; 19+ messages in thread From: Timur Kristóf @ 2025-12-18 5:15 UTC (permalink / raw) To: amd-gfx; +Cc: Alex Deucher, Alex Deucher On 2025. december 15., hétfő 10:07:09 középső államokbeli zónaidő Alex Deucher wrote: > Only set an error on the fence if the fence is not > signalled. We can end up with a warning if the > per queue reset path signals the fence and sets an error > as part of the reset, but fails to recover. Can you please elaborate why this is necessary? I don't entirely see the point of this patch. Why don't want to set an error on the fence when it was signalled by the per queue reset? I would have thought that the next patch does that, and also fixes the warning mentioned in the commit message here. > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index > 67fde99724bad..7f5d01164897f 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > @@ -147,7 +147,8 @@ static enum drm_gpu_sched_stat > amdgpu_job_timedout(struct drm_sched_job *s_job) dev_err(adev->dev, "Ring > %s reset failed\n", ring->sched.name); } > > - dma_fence_set_error(&s_job->s_fence->finished, -ETIME); > + if (dma_fence_get_status(&s_job->s_fence->finished) == 0) > + dma_fence_set_error(&s_job->s_fence->finished, -ETIME); > > amdgpu_vm_put_task_info(ti); ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 4/6] drm/amdgpu: avoid a warning in timedout job handler 2025-12-18 5:15 ` Timur Kristóf @ 2025-12-18 15:41 ` Alex Deucher 2025-12-18 15:49 ` Timur Kristóf 0 siblings, 1 reply; 19+ messages in thread From: Alex Deucher @ 2025-12-18 15:41 UTC (permalink / raw) To: Timur Kristóf; +Cc: amd-gfx, Alex Deucher On Thu, Dec 18, 2025 at 12:21 AM Timur Kristóf <timur.kristof@gmail.com> wrote: > > On 2025. december 15., hétfő 10:07:09 középső államokbeli zónaidő Alex Deucher > wrote: > > Only set an error on the fence if the fence is not > > signalled. We can end up with a warning if the > > per queue reset path signals the fence and sets an error > > as part of the reset, but fails to recover. > > Can you please elaborate why this is necessary? > I don't entirely see the point of this patch. Why don't want to set an error > on the fence when it was signalled by the per queue reset? I would have > thought that the next patch does that, and also fixes the warning mentioned in > the commit message here. If you call dma_fence_set_error() on a fence that has already signaled it triggers a warning. What could happen is that the queue reset sets the error on the fence and then signals the fence as part of the reset sequence. However if the queue reset ultimately fails, the fence is already signaled and then we try and set an error again here as we fall back to adapter reset, triggering the warning. Alex > > > > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index > > 67fde99724bad..7f5d01164897f 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > > @@ -147,7 +147,8 @@ static enum drm_gpu_sched_stat > > amdgpu_job_timedout(struct drm_sched_job *s_job) dev_err(adev->dev, "Ring > > %s reset failed\n", ring->sched.name); } > > > > - dma_fence_set_error(&s_job->s_fence->finished, -ETIME); > > + if (dma_fence_get_status(&s_job->s_fence->finished) == 0) > > + dma_fence_set_error(&s_job->s_fence->finished, -ETIME); > > > > amdgpu_vm_put_task_info(ti); > > > > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 4/6] drm/amdgpu: avoid a warning in timedout job handler 2025-12-18 15:41 ` Alex Deucher @ 2025-12-18 15:49 ` Timur Kristóf 0 siblings, 0 replies; 19+ messages in thread From: Timur Kristóf @ 2025-12-18 15:49 UTC (permalink / raw) To: Alex Deucher; +Cc: amd-gfx, Alex Deucher On 2025. december 18., csütörtök 9:41:41 középső államokbeli zónaidő Alex Deucher wrote: > On Thu, Dec 18, 2025 at 12:21 AM Timur Kristóf <timur.kristof@gmail.com> wrote: > > On 2025. december 15., hétfő 10:07:09 középső államokbeli zónaidő Alex > > Deucher> > > wrote: > > > Only set an error on the fence if the fence is not > > > signalled. We can end up with a warning if the > > > per queue reset path signals the fence and sets an error > > > as part of the reset, but fails to recover. > > > > Can you please elaborate why this is necessary? > > I don't entirely see the point of this patch. Why don't want to set an > > error on the fence when it was signalled by the per queue reset? I would > > have thought that the next patch does that, and also fixes the warning > > mentioned in the commit message here. > > If you call dma_fence_set_error() on a fence that has already signaled > it triggers a warning. What could happen is that the queue reset sets > the error on the fence and then signals the fence as part of the reset > sequence. However if the queue reset ultimately fails, the fence is > already signaled and then we try and set an error again here as we > fall back to adapter reset, triggering the warning. > > Alex I would have thought that the next patch in the series would take care of this problem by itself. Thanks for the explanation. The patch is: Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> > > > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > > > --- > > > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 3 ++- > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index > > > 67fde99724bad..7f5d01164897f 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > > > @@ -147,7 +147,8 @@ static enum drm_gpu_sched_stat > > > amdgpu_job_timedout(struct drm_sched_job *s_job) dev_err(adev->dev, > > > "Ring > > > %s reset failed\n", ring->sched.name); } > > > > > > - dma_fence_set_error(&s_job->s_fence->finished, -ETIME); > > > + if (dma_fence_get_status(&s_job->s_fence->finished) == 0) > > > + dma_fence_set_error(&s_job->s_fence->finished, -ETIME); > > > > > > amdgpu_vm_put_task_info(ti); ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 5/6] drm/amdgpu: mark fences with errors before ring reset 2025-12-15 16:07 [PATCH 1/6] drm/amdgpu: don't reemit ring contents more than once Alex Deucher ` (2 preceding siblings ...) 2025-12-15 16:07 ` [PATCH 4/6] drm/amdgpu: avoid a warning in timedout job handler Alex Deucher @ 2025-12-15 16:07 ` Alex Deucher 2025-12-15 16:07 ` [PATCH 6/6] drm/amdgpu/gfx9: Implement KGQ " Alex Deucher 2025-12-18 5:20 ` [PATCH 1/6] drm/amdgpu: don't reemit ring contents more than once Timur Kristóf 5 siblings, 0 replies; 19+ messages in thread From: Alex Deucher @ 2025-12-15 16:07 UTC (permalink / raw) To: amd-gfx; +Cc: Alex Deucher Mark fences with errors before we reset the rings as we may end up signalling fences as part of the reset sequence. The error needs to be set before the fence is signalled. On GC10 and newer, this isn't a problem since we don't signal any fences. On GC9, we need to signal the fence after the reset to unblock the recovery sequence. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> --- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c index 1aee207b13524..fd7b0f955e4df 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c @@ -872,6 +872,10 @@ void amdgpu_ring_reset_helper_begin(struct amdgpu_ring *ring, drm_sched_wqueue_stop(&ring->sched); /* back up the non-guilty commands */ amdgpu_ring_backup_unprocessed_commands(ring, guilty_fence); + /* signal the guilty fence and set an error on all fences from the context */ + if (guilty_fence) + amdgpu_fence_driver_guilty_force_completion(guilty_fence); + } int amdgpu_ring_reset_helper_end(struct amdgpu_ring *ring, @@ -885,9 +889,6 @@ int amdgpu_ring_reset_helper_end(struct amdgpu_ring *ring, if (r) return r; - /* signal the guilty fence and set an error on all fences from the context */ - if (guilty_fence) - amdgpu_fence_driver_guilty_force_completion(guilty_fence); /* Re-emit the non-guilty commands */ if (ring->ring_backup_entries_to_copy) { amdgpu_ring_alloc_reemit(ring, ring->ring_backup_entries_to_copy); -- 2.52.0 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 6/6] drm/amdgpu/gfx9: Implement KGQ ring reset 2025-12-15 16:07 [PATCH 1/6] drm/amdgpu: don't reemit ring contents more than once Alex Deucher ` (3 preceding siblings ...) 2025-12-15 16:07 ` [PATCH 5/6] drm/amdgpu: mark fences with errors before ring reset Alex Deucher @ 2025-12-15 16:07 ` Alex Deucher 2025-12-18 5:11 ` Timur Kristóf 2025-12-18 5:20 ` [PATCH 1/6] drm/amdgpu: don't reemit ring contents more than once Timur Kristóf 5 siblings, 1 reply; 19+ messages in thread From: Alex Deucher @ 2025-12-15 16:07 UTC (permalink / raw) To: amd-gfx; +Cc: Alex Deucher GFX ring resets work differently on pre-GFX10 hardware since there is no MQD managed by the scheduler. For ring reset, you need issue the reset via CP_VMID_RESET via KIQ or MMIO and submit the following to the gfx ring to complete the reset: 1. EOP packet with EXEC bit set 2. WAIT_REG_MEM to wait for the fence 3. Clear CP_VMID_RESET to 0 4. EVENT_WRITE ENABLE_LEGACY_PIPELINE 5. EOP packet with EXEC bit set 6. WAIT_REG_MEM to wait for the fence Once those commands have completed the reset should be complete and the ring can accept new packets. However, because we have a pipeline sync between jobs, the PFP is waiting on the fence from the bad job to signal so it can't process any of the packets in the reset sequence until that pipeline sync clears. To unblock the PFP, we use the KIQ to signal the fence after we reset the queue. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> --- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 104 +++++++++++++++++++++++++- 1 file changed, 101 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index bb1465a98c7ca..9b7073650315e 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c @@ -2410,8 +2410,10 @@ static int gfx_v9_0_sw_init(struct amdgpu_ip_block *ip_block) amdgpu_get_soft_full_reset_mask(&adev->gfx.gfx_ring[0]); adev->gfx.compute_supported_reset = amdgpu_get_soft_full_reset_mask(&adev->gfx.compute_ring[0]); - if (!amdgpu_sriov_vf(adev) && !adev->debug_disable_gpu_ring_reset) + if (!amdgpu_sriov_vf(adev) && !adev->debug_disable_gpu_ring_reset) { adev->gfx.compute_supported_reset |= AMDGPU_RESET_TYPE_PER_QUEUE; + adev->gfx.gfx_supported_reset |= AMDGPU_RESET_TYPE_PER_QUEUE; + } r = amdgpu_gfx_kiq_init(adev, GFX9_MEC_HPD_SIZE, 0); if (r) { @@ -7163,6 +7165,103 @@ static void gfx_v9_ring_insert_nop(struct amdgpu_ring *ring, uint32_t num_nop) amdgpu_ring_insert_nop(ring, num_nop - 1); } +static void gfx_v9_0_ring_emit_wreg_me(struct amdgpu_ring *ring, + uint32_t reg, + uint32_t val) +{ + uint32_t cmd = 0; + + switch (ring->funcs->type) { + case AMDGPU_RING_TYPE_KIQ: + cmd = (1 << 16); /* no inc addr */ + break; + default: + cmd = WR_CONFIRM; + break; + } + amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3)); + amdgpu_ring_write(ring, cmd); + amdgpu_ring_write(ring, reg); + amdgpu_ring_write(ring, 0); + amdgpu_ring_write(ring, val); +} + +static void gfx_v9_0_ring_emit_event_write(struct amdgpu_ring *ring, + uint32_t event_type, + uint32_t event_index) +{ + amdgpu_ring_write(ring, PACKET3(PACKET3_EVENT_WRITE, 0)); + amdgpu_ring_write(ring, EVENT_TYPE(event_type) | + EVENT_INDEX(event_index)); +} + +static int gfx_v9_0_reset_kgq(struct amdgpu_ring *ring, + unsigned int vmid, + struct amdgpu_fence *timedout_fence) +{ + struct amdgpu_device *adev = ring->adev; + struct amdgpu_kiq *kiq = &adev->gfx.kiq[0]; + struct amdgpu_ring *kiq_ring = &kiq->ring; + unsigned long flags; + u32 tmp; + int r; + + amdgpu_ring_reset_helper_begin(ring, timedout_fence); + + spin_lock_irqsave(&kiq->ring_lock, flags); + + if (amdgpu_ring_alloc(kiq_ring, 5 + 5)) { + spin_unlock_irqrestore(&kiq->ring_lock, flags); + return -ENOMEM; + } + + /* send the reset - 5 */ + tmp = REG_SET_FIELD(0, CP_VMID_RESET, RESET_REQUEST, 1 << vmid); + gfx_v9_0_ring_emit_wreg(kiq_ring, + SOC15_REG_OFFSET(GC, 0, mmCP_VMID_RESET), tmp); + /* emit the fence to clear the pipeline sync - 5 */ + gfx_v9_0_ring_emit_fence_kiq(kiq_ring, ring->fence_drv.gpu_addr, + timedout_fence->base.seqno, 0); + amdgpu_ring_commit(kiq_ring); + r = amdgpu_ring_test_ring(kiq_ring); + spin_unlock_irqrestore(&kiq->ring_lock, flags); + if (r) + return r; + + if (amdgpu_ring_alloc(ring, 8 + 7 + 5 + 2 + 8 + 7)) + return -ENOMEM; + /* emit the fence to finish the reset - 8 */ + ring->trail_seq++; + gfx_v9_0_ring_emit_fence(ring, ring->trail_fence_gpu_addr, + ring->trail_seq, AMDGPU_FENCE_FLAG_EXEC); + /* wait for the fence - 7 */ + gfx_v9_0_wait_reg_mem(ring, 0, 1, 0, + lower_32_bits(ring->trail_fence_gpu_addr), + upper_32_bits(ring->trail_fence_gpu_addr), + ring->trail_seq, 0xffffffff, 4); + /* clear mmCP_VMID_RESET - 5 */ + gfx_v9_0_ring_emit_wreg_me(ring, + SOC15_REG_OFFSET(GC, 0, mmCP_VMID_RESET), 0); + /* event write ENABLE_LEGACY_PIPELINE - 2 */ + gfx_v9_0_ring_emit_event_write(ring, ENABLE_LEGACY_PIPELINE, 0); + /* emit a regular fence - 8 */ + ring->trail_seq++; + gfx_v9_0_ring_emit_fence(ring, ring->trail_fence_gpu_addr, + ring->trail_seq, AMDGPU_FENCE_FLAG_EXEC); + /* wait for the fence - 7 */ + gfx_v9_0_wait_reg_mem(ring, 1, 1, 0, + lower_32_bits(ring->trail_fence_gpu_addr), + upper_32_bits(ring->trail_fence_gpu_addr), + ring->trail_seq, 0xffffffff, 4); + amdgpu_ring_commit(ring); + /* wait for the commands to complete */ + r = amdgpu_ring_test_ring(ring); + if (r) + return r; + + return amdgpu_ring_reset_helper_end(ring, timedout_fence); +} + static int gfx_v9_0_reset_kcq(struct amdgpu_ring *ring, unsigned int vmid, struct amdgpu_fence *timedout_fence) @@ -7441,9 +7540,9 @@ static const struct amdgpu_ring_funcs gfx_v9_0_ring_funcs_gfx = { .emit_wreg = gfx_v9_0_ring_emit_wreg, .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait, .emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait, - .soft_recovery = gfx_v9_0_ring_soft_recovery, .emit_mem_sync = gfx_v9_0_emit_mem_sync, .emit_cleaner_shader = gfx_v9_0_ring_emit_cleaner_shader, + .reset = gfx_v9_0_reset_kgq, .begin_use = amdgpu_gfx_enforce_isolation_ring_begin_use, .end_use = amdgpu_gfx_enforce_isolation_ring_end_use, }; @@ -7542,7 +7641,6 @@ static const struct amdgpu_ring_funcs gfx_v9_0_ring_funcs_compute = { .emit_wreg = gfx_v9_0_ring_emit_wreg, .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait, .emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait, - .soft_recovery = gfx_v9_0_ring_soft_recovery, .emit_mem_sync = gfx_v9_0_emit_mem_sync, .emit_wave_limit = gfx_v9_0_emit_wave_limit, .reset = gfx_v9_0_reset_kcq, -- 2.52.0 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 6/6] drm/amdgpu/gfx9: Implement KGQ ring reset 2025-12-15 16:07 ` [PATCH 6/6] drm/amdgpu/gfx9: Implement KGQ " Alex Deucher @ 2025-12-18 5:11 ` Timur Kristóf 2025-12-18 5:28 ` Timur Kristóf 2025-12-18 15:58 ` Alex Deucher 0 siblings, 2 replies; 19+ messages in thread From: Timur Kristóf @ 2025-12-18 5:11 UTC (permalink / raw) To: amd-gfx; +Cc: Alex Deucher, Alex Deucher On 2025. december 15., hétfő 10:07:11 középső államokbeli zónaidő Alex Deucher wrote: > GFX ring resets work differently on pre-GFX10 hardware since > there is no MQD managed by the scheduler. > For ring reset, you need issue the reset via CP_VMID_RESET > via KIQ or MMIO and submit the following to the gfx ring to > complete the reset: > 1. EOP packet with EXEC bit set > 2. WAIT_REG_MEM to wait for the fence > 3. Clear CP_VMID_RESET to 0 > 4. EVENT_WRITE ENABLE_LEGACY_PIPELINE > 5. EOP packet with EXEC bit set > 6. WAIT_REG_MEM to wait for the fence > Once those commands have completed the reset should > be complete and the ring can accept new packets. > > However, because we have a pipeline sync between jobs, > the PFP is waiting on the fence from the bad job to signal so > it can't process any of the packets in the reset sequence > until that pipeline sync clears. To unblock the PFP, we > use the KIQ to signal the fence after we reset the queue. > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 104 +++++++++++++++++++++++++- > 1 file changed, 101 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index bb1465a98c7ca..9b7073650315e > 100644 > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > @@ -2410,8 +2410,10 @@ static int gfx_v9_0_sw_init(struct amdgpu_ip_block > *ip_block) amdgpu_get_soft_full_reset_mask(&adev->gfx.gfx_ring[0]); > adev->gfx.compute_supported_reset = > amdgpu_get_soft_full_reset_mask(&adev- >gfx.compute_ring[0]); > - if (!amdgpu_sriov_vf(adev) && !adev->debug_disable_gpu_ring_reset) > + if (!amdgpu_sriov_vf(adev) && !adev->debug_disable_gpu_ring_reset) { > adev->gfx.compute_supported_reset |= AMDGPU_RESET_TYPE_PER_QUEUE; > + adev->gfx.gfx_supported_reset |= AMDGPU_RESET_TYPE_PER_QUEUE; > + } > > r = amdgpu_gfx_kiq_init(adev, GFX9_MEC_HPD_SIZE, 0); > if (r) { > @@ -7163,6 +7165,103 @@ static void gfx_v9_ring_insert_nop(struct > amdgpu_ring *ring, uint32_t num_nop) amdgpu_ring_insert_nop(ring, num_nop - > 1); > } > > +static void gfx_v9_0_ring_emit_wreg_me(struct amdgpu_ring *ring, > + uint32_t reg, > + uint32_t val) > +{ > + uint32_t cmd = 0; > + > + switch (ring->funcs->type) { > + case AMDGPU_RING_TYPE_KIQ: > + cmd = (1 << 16); /* no inc addr */ What do you mean by "inc addr" in this context? > + break; > + default: > + cmd = WR_CONFIRM; > + break; > + } > + amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3)); > + amdgpu_ring_write(ring, cmd); > + amdgpu_ring_write(ring, reg); > + amdgpu_ring_write(ring, 0); > + amdgpu_ring_write(ring, val); > +} > + > +static void gfx_v9_0_ring_emit_event_write(struct amdgpu_ring *ring, > + uint32_t event_type, > + uint32_t event_index) > +{ > + amdgpu_ring_write(ring, PACKET3(PACKET3_EVENT_WRITE, 0)); > + amdgpu_ring_write(ring, EVENT_TYPE(event_type) | > + EVENT_INDEX(event_index)); > +} > + > +static int gfx_v9_0_reset_kgq(struct amdgpu_ring *ring, > + unsigned int vmid, > + struct amdgpu_fence *timedout_fence) > +{ > + struct amdgpu_device *adev = ring->adev; > + struct amdgpu_kiq *kiq = &adev->gfx.kiq[0]; > + struct amdgpu_ring *kiq_ring = &kiq->ring; > + unsigned long flags; > + u32 tmp; > + int r; > + > + amdgpu_ring_reset_helper_begin(ring, timedout_fence); > + > + spin_lock_irqsave(&kiq->ring_lock, flags); > + > + if (amdgpu_ring_alloc(kiq_ring, 5 + 5)) { > + spin_unlock_irqrestore(&kiq->ring_lock, flags); > + return -ENOMEM; > + } > + > + /* send the reset - 5 */ > + tmp = REG_SET_FIELD(0, CP_VMID_RESET, RESET_REQUEST, 1 << vmid); > + gfx_v9_0_ring_emit_wreg(kiq_ring, > + SOC15_REG_OFFSET(GC, 0, mmCP_VMID_RESET), tmp); > + /* emit the fence to clear the pipeline sync - 5 */ > + gfx_v9_0_ring_emit_fence_kiq(kiq_ring, ring->fence_drv.gpu_addr, > + timedout_fence->base.seqno, 0); As far as I see, this isn't going to work when sched_hw_submission > 2 and there are more than two jobs (from various different userspace processes) emitted in the ring. I can think of two possible solutons: - Emit each fence individually, with a short delay in between to give a chance to the GFX ring to catch up with the KIQ. - Change the wait_reg_mem command used for the pipeline sync to allow greater than equal instead of just equal. Then it's enough to signal just the last fence on the KIQ ring. > + amdgpu_ring_commit(kiq_ring); > + r = amdgpu_ring_test_ring(kiq_ring); > + spin_unlock_irqrestore(&kiq->ring_lock, flags); > + if (r) > + return r; > + > + if (amdgpu_ring_alloc(ring, 8 + 7 + 5 + 2 + 8 + 7)) > + return -ENOMEM; > + /* emit the fence to finish the reset - 8 */ > + ring->trail_seq++; > + gfx_v9_0_ring_emit_fence(ring, ring->trail_fence_gpu_addr, > + ring->trail_seq, AMDGPU_FENCE_FLAG_EXEC); > + /* wait for the fence - 7 */ > + gfx_v9_0_wait_reg_mem(ring, 0, 1, 0, > + lower_32_bits(ring- >trail_fence_gpu_addr), > + upper_32_bits(ring- >trail_fence_gpu_addr), > + ring->trail_seq, 0xffffffff, 4); > + /* clear mmCP_VMID_RESET - 5 */ > + gfx_v9_0_ring_emit_wreg_me(ring, > + SOC15_REG_OFFSET(GC, 0, mmCP_VMID_RESET), 0); > + /* event write ENABLE_LEGACY_PIPELINE - 2 */ > + gfx_v9_0_ring_emit_event_write(ring, ENABLE_LEGACY_PIPELINE, 0); > + /* emit a regular fence - 8 */ > + ring->trail_seq++; > + gfx_v9_0_ring_emit_fence(ring, ring->trail_fence_gpu_addr, > + ring->trail_seq, AMDGPU_FENCE_FLAG_EXEC); > + /* wait for the fence - 7 */ > + gfx_v9_0_wait_reg_mem(ring, 1, 1, 0, > + lower_32_bits(ring- >trail_fence_gpu_addr), > + upper_32_bits(ring- >trail_fence_gpu_addr), > + ring->trail_seq, 0xffffffff, 4); Why is it necessary to emit (and wait for) a regular fence here? I'm not against it, just curious why it's needed. > + amdgpu_ring_commit(ring); > + /* wait for the commands to complete */ > + r = amdgpu_ring_test_ring(ring); > + if (r) > + return r; > + > + return amdgpu_ring_reset_helper_end(ring, timedout_fence); > +} > + > static int gfx_v9_0_reset_kcq(struct amdgpu_ring *ring, > unsigned int vmid, > struct amdgpu_fence *timedout_fence) > @@ -7441,9 +7540,9 @@ static const struct amdgpu_ring_funcs > gfx_v9_0_ring_funcs_gfx = { .emit_wreg = gfx_v9_0_ring_emit_wreg, > .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait, > .emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait, > - .soft_recovery = gfx_v9_0_ring_soft_recovery, Can you please split removing the soft recovery into a separate patch? Can we talk about removing the soft recovery? For the other chips where it has already been removed, it is percieved by users as a regression. > .emit_mem_sync = gfx_v9_0_emit_mem_sync, > .emit_cleaner_shader = gfx_v9_0_ring_emit_cleaner_shader, > + .reset = gfx_v9_0_reset_kgq, > .begin_use = amdgpu_gfx_enforce_isolation_ring_begin_use, > .end_use = amdgpu_gfx_enforce_isolation_ring_end_use, > }; > @@ -7542,7 +7641,6 @@ static const struct amdgpu_ring_funcs > gfx_v9_0_ring_funcs_compute = { .emit_wreg = gfx_v9_0_ring_emit_wreg, > .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait, > .emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait, > - .soft_recovery = gfx_v9_0_ring_soft_recovery, > .emit_mem_sync = gfx_v9_0_emit_mem_sync, > .emit_wave_limit = gfx_v9_0_emit_wave_limit, > .reset = gfx_v9_0_reset_kcq, ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 6/6] drm/amdgpu/gfx9: Implement KGQ ring reset 2025-12-18 5:11 ` Timur Kristóf @ 2025-12-18 5:28 ` Timur Kristóf 2025-12-18 15:58 ` Alex Deucher 1 sibling, 0 replies; 19+ messages in thread From: Timur Kristóf @ 2025-12-18 5:28 UTC (permalink / raw) To: amd-gfx; +Cc: Alex Deucher, Alex Deucher On 2025. december 17., szerda 23:11:45 középső államokbeli zónaidő Timur Kristóf wrote: > On 2025. december 15., hétfő 10:07:11 középső államokbeli zónaidő Alex > Deucher > wrote: > > GFX ring resets work differently on pre-GFX10 hardware since > > there is no MQD managed by the scheduler. > > For ring reset, you need issue the reset via CP_VMID_RESET > > via KIQ or MMIO and submit the following to the gfx ring to > > complete the reset: > > 1. EOP packet with EXEC bit set > > 2. WAIT_REG_MEM to wait for the fence > > 3. Clear CP_VMID_RESET to 0 > > 4. EVENT_WRITE ENABLE_LEGACY_PIPELINE > > 5. EOP packet with EXEC bit set > > 6. WAIT_REG_MEM to wait for the fence > > Once those commands have completed the reset should > > be complete and the ring can accept new packets. > > > > However, because we have a pipeline sync between jobs, > > the PFP is waiting on the fence from the bad job to signal so > > it can't process any of the packets in the reset sequence > > until that pipeline sync clears. To unblock the PFP, we > > use the KIQ to signal the fence after we reset the queue. > > > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > > --- > > > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 104 +++++++++++++++++++++++++- > > 1 file changed, 101 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index bb1465a98c7ca..9b7073650315e > > 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > @@ -2410,8 +2410,10 @@ static int gfx_v9_0_sw_init(struct amdgpu_ip_block > > *ip_block) amdgpu_get_soft_full_reset_mask(&adev->gfx.gfx_ring[0]); > > > > adev->gfx.compute_supported_reset = > > > > amdgpu_get_soft_full_reset_mask(&adev- > > > >gfx.compute_ring[0]); > > > > - if (!amdgpu_sriov_vf(adev) && !adev->debug_disable_gpu_ring_reset) > > + if (!amdgpu_sriov_vf(adev) && !adev->debug_disable_gpu_ring_reset) > > { > > > adev->gfx.compute_supported_reset |= > > AMDGPU_RESET_TYPE_PER_QUEUE; > > > + adev->gfx.gfx_supported_reset |= > > AMDGPU_RESET_TYPE_PER_QUEUE; > > > + } > > > > r = amdgpu_gfx_kiq_init(adev, GFX9_MEC_HPD_SIZE, 0); > > if (r) { > > > > @@ -7163,6 +7165,103 @@ static void gfx_v9_ring_insert_nop(struct > > amdgpu_ring *ring, uint32_t num_nop) amdgpu_ring_insert_nop(ring, num_nop > > - > > 1); > > > > } > > > > +static void gfx_v9_0_ring_emit_wreg_me(struct amdgpu_ring *ring, > > + uint32_t reg, > > + uint32_t val) > > +{ > > + uint32_t cmd = 0; > > + > > + switch (ring->funcs->type) { > > + case AMDGPU_RING_TYPE_KIQ: > > + cmd = (1 << 16); /* no inc addr */ > > What do you mean by "inc addr" in this context? > > > + break; > > + default: > > + cmd = WR_CONFIRM; > > + break; > > + } > > + amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3)); > > + amdgpu_ring_write(ring, cmd); > > + amdgpu_ring_write(ring, reg); > > + amdgpu_ring_write(ring, 0); > > + amdgpu_ring_write(ring, val); > > +} > > + > > +static void gfx_v9_0_ring_emit_event_write(struct amdgpu_ring *ring, > > + uint32_t event_type, > > + uint32_t > > event_index) > > > +{ > > + amdgpu_ring_write(ring, PACKET3(PACKET3_EVENT_WRITE, 0)); > > + amdgpu_ring_write(ring, EVENT_TYPE(event_type) | > > + EVENT_INDEX(event_index)); > > +} > > + > > +static int gfx_v9_0_reset_kgq(struct amdgpu_ring *ring, > > + unsigned int vmid, > > + struct amdgpu_fence *timedout_fence) > > +{ > > + struct amdgpu_device *adev = ring->adev; > > + struct amdgpu_kiq *kiq = &adev->gfx.kiq[0]; > > + struct amdgpu_ring *kiq_ring = &kiq->ring; > > + unsigned long flags; > > + u32 tmp; > > + int r; > > + > > + amdgpu_ring_reset_helper_begin(ring, timedout_fence); > > + > > + spin_lock_irqsave(&kiq->ring_lock, flags); > > + > > + if (amdgpu_ring_alloc(kiq_ring, 5 + 5)) { > > + spin_unlock_irqrestore(&kiq->ring_lock, flags); > > + return -ENOMEM; > > + } > > + > > + /* send the reset - 5 */ > > + tmp = REG_SET_FIELD(0, CP_VMID_RESET, RESET_REQUEST, 1 << vmid); > > + gfx_v9_0_ring_emit_wreg(kiq_ring, > > + SOC15_REG_OFFSET(GC, 0, > > mmCP_VMID_RESET), tmp); > > > + /* emit the fence to clear the pipeline sync - 5 */ > > + gfx_v9_0_ring_emit_fence_kiq(kiq_ring, ring->fence_drv.gpu_addr, > > + timedout_fence->base.seqno, > > 0); > > As far as I see, this isn't going to work when sched_hw_submission > 2 and > there are more than two jobs (from various different userspace processes) > emitted in the ring. > > I can think of two possible solutons: > - Emit each fence individually, with a short delay in between to give a > chance to the GFX ring to catch up with the KIQ. > - Change the wait_reg_mem command used for the pipeline sync to allow > greater than equal instead of just equal. Then it's enough to signal just > the last fence on the KIQ ring. On a second thought, both of my suggestions are wrong because they would cause the fences of non-guilty jobs to be emitted before the jobs are actually complete. It occurs to me that we could change the gfx pipeline sync to not rely on the same fences that signal job completion. Instead we could emit a different fence for just the pipeline sync. Then it is no longer a problem to signal those during recovery. > > > + amdgpu_ring_commit(kiq_ring); > > + r = amdgpu_ring_test_ring(kiq_ring); > > + spin_unlock_irqrestore(&kiq->ring_lock, flags); > > + if (r) > > + return r; > > + > > + if (amdgpu_ring_alloc(ring, 8 + 7 + 5 + 2 + 8 + 7)) > > + return -ENOMEM; > > + /* emit the fence to finish the reset - 8 */ > > + ring->trail_seq++; > > + gfx_v9_0_ring_emit_fence(ring, ring->trail_fence_gpu_addr, > > + ring->trail_seq, > > AMDGPU_FENCE_FLAG_EXEC); > > > + /* wait for the fence - 7 */ > > + gfx_v9_0_wait_reg_mem(ring, 0, 1, 0, > > + lower_32_bits(ring- > > > >trail_fence_gpu_addr), > > > > + upper_32_bits(ring- > > > >trail_fence_gpu_addr), > > > > + ring->trail_seq, 0xffffffff, 4); > > + /* clear mmCP_VMID_RESET - 5 */ > > + gfx_v9_0_ring_emit_wreg_me(ring, > > + SOC15_REG_OFFSET(GC, 0, > > mmCP_VMID_RESET), 0); > > > + /* event write ENABLE_LEGACY_PIPELINE - 2 */ > > + gfx_v9_0_ring_emit_event_write(ring, ENABLE_LEGACY_PIPELINE, 0); > > + /* emit a regular fence - 8 */ > > + ring->trail_seq++; > > + gfx_v9_0_ring_emit_fence(ring, ring->trail_fence_gpu_addr, > > + ring->trail_seq, > > AMDGPU_FENCE_FLAG_EXEC); > > > + /* wait for the fence - 7 */ > > + gfx_v9_0_wait_reg_mem(ring, 1, 1, 0, > > + lower_32_bits(ring- > > > >trail_fence_gpu_addr), > > > > + upper_32_bits(ring- > > > >trail_fence_gpu_addr), > > > > + ring->trail_seq, 0xffffffff, 4); > > Why is it necessary to emit (and wait for) a regular fence here? > I'm not against it, just curious why it's needed. > > > + amdgpu_ring_commit(ring); > > + /* wait for the commands to complete */ > > + r = amdgpu_ring_test_ring(ring); > > + if (r) > > + return r; > > + > > + return amdgpu_ring_reset_helper_end(ring, timedout_fence); > > +} > > + > > > > static int gfx_v9_0_reset_kcq(struct amdgpu_ring *ring, > > > > unsigned int vmid, > > struct amdgpu_fence *timedout_fence) > > > > @@ -7441,9 +7540,9 @@ static const struct amdgpu_ring_funcs > > gfx_v9_0_ring_funcs_gfx = { .emit_wreg = gfx_v9_0_ring_emit_wreg, > > > > .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait, > > .emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait, > > > > - .soft_recovery = gfx_v9_0_ring_soft_recovery, > > Can you please split removing the soft recovery into a separate patch? > > Can we talk about removing the soft recovery? For the other chips where it > has already been removed, it is percieved by users as a regression. > > > .emit_mem_sync = gfx_v9_0_emit_mem_sync, > > .emit_cleaner_shader = gfx_v9_0_ring_emit_cleaner_shader, > > > > + .reset = gfx_v9_0_reset_kgq, > > > > .begin_use = amdgpu_gfx_enforce_isolation_ring_begin_use, > > .end_use = amdgpu_gfx_enforce_isolation_ring_end_use, > > > > }; > > > > @@ -7542,7 +7641,6 @@ static const struct amdgpu_ring_funcs > > gfx_v9_0_ring_funcs_compute = { .emit_wreg = gfx_v9_0_ring_emit_wreg, > > > > .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait, > > .emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait, > > > > - .soft_recovery = gfx_v9_0_ring_soft_recovery, > > > > .emit_mem_sync = gfx_v9_0_emit_mem_sync, > > .emit_wave_limit = gfx_v9_0_emit_wave_limit, > > .reset = gfx_v9_0_reset_kcq, ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 6/6] drm/amdgpu/gfx9: Implement KGQ ring reset 2025-12-18 5:11 ` Timur Kristóf 2025-12-18 5:28 ` Timur Kristóf @ 2025-12-18 15:58 ` Alex Deucher 2025-12-18 17:03 ` Timur Kristóf 1 sibling, 1 reply; 19+ messages in thread From: Alex Deucher @ 2025-12-18 15:58 UTC (permalink / raw) To: Timur Kristóf; +Cc: amd-gfx, Alex Deucher On Thu, Dec 18, 2025 at 12:21 AM Timur Kristóf <timur.kristof@gmail.com> wrote: > > On 2025. december 15., hétfő 10:07:11 középső államokbeli zónaidő Alex Deucher > wrote: > > GFX ring resets work differently on pre-GFX10 hardware since > > there is no MQD managed by the scheduler. > > For ring reset, you need issue the reset via CP_VMID_RESET > > via KIQ or MMIO and submit the following to the gfx ring to > > complete the reset: > > 1. EOP packet with EXEC bit set > > 2. WAIT_REG_MEM to wait for the fence > > 3. Clear CP_VMID_RESET to 0 > > 4. EVENT_WRITE ENABLE_LEGACY_PIPELINE > > 5. EOP packet with EXEC bit set > > 6. WAIT_REG_MEM to wait for the fence > > Once those commands have completed the reset should > > be complete and the ring can accept new packets. > > > > However, because we have a pipeline sync between jobs, > > the PFP is waiting on the fence from the bad job to signal so > > it can't process any of the packets in the reset sequence > > until that pipeline sync clears. To unblock the PFP, we > > use the KIQ to signal the fence after we reset the queue. > > > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > > --- > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 104 +++++++++++++++++++++++++- > > 1 file changed, 101 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index bb1465a98c7ca..9b7073650315e > > 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > @@ -2410,8 +2410,10 @@ static int gfx_v9_0_sw_init(struct amdgpu_ip_block > > *ip_block) amdgpu_get_soft_full_reset_mask(&adev->gfx.gfx_ring[0]); > > adev->gfx.compute_supported_reset = > > amdgpu_get_soft_full_reset_mask(&adev- > >gfx.compute_ring[0]); > > - if (!amdgpu_sriov_vf(adev) && !adev->debug_disable_gpu_ring_reset) > > + if (!amdgpu_sriov_vf(adev) && !adev->debug_disable_gpu_ring_reset) > { > > adev->gfx.compute_supported_reset |= > AMDGPU_RESET_TYPE_PER_QUEUE; > > + adev->gfx.gfx_supported_reset |= > AMDGPU_RESET_TYPE_PER_QUEUE; > > + } > > > > r = amdgpu_gfx_kiq_init(adev, GFX9_MEC_HPD_SIZE, 0); > > if (r) { > > @@ -7163,6 +7165,103 @@ static void gfx_v9_ring_insert_nop(struct > > amdgpu_ring *ring, uint32_t num_nop) amdgpu_ring_insert_nop(ring, num_nop - > > 1); > > } > > > > +static void gfx_v9_0_ring_emit_wreg_me(struct amdgpu_ring *ring, > > + uint32_t reg, > > + uint32_t val) > > +{ > > + uint32_t cmd = 0; > > + > > + switch (ring->funcs->type) { > > + case AMDGPU_RING_TYPE_KIQ: > > + cmd = (1 << 16); /* no inc addr */ > > What do you mean by "inc addr" in this context? It's part of the packet. bit 16 controls whether the address is incremented or not. This function is basically the same as gfx_v9_0_ring_emit_wreg(), but uses the ME to do the wait rather than the PFP. I could have alternatively added a new parameter to gfx_v9_0_ring_emit_wreg() to select between PFP and ME. > > > + break; > > + default: > > + cmd = WR_CONFIRM; > > + break; > > + } > > + amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3)); > > + amdgpu_ring_write(ring, cmd); > > + amdgpu_ring_write(ring, reg); > > + amdgpu_ring_write(ring, 0); > > + amdgpu_ring_write(ring, val); > > +} > > + > > +static void gfx_v9_0_ring_emit_event_write(struct amdgpu_ring *ring, > > + uint32_t event_type, > > + uint32_t > event_index) > > +{ > > + amdgpu_ring_write(ring, PACKET3(PACKET3_EVENT_WRITE, 0)); > > + amdgpu_ring_write(ring, EVENT_TYPE(event_type) | > > + EVENT_INDEX(event_index)); > > +} > > + > > +static int gfx_v9_0_reset_kgq(struct amdgpu_ring *ring, > > + unsigned int vmid, > > + struct amdgpu_fence *timedout_fence) > > +{ > > + struct amdgpu_device *adev = ring->adev; > > + struct amdgpu_kiq *kiq = &adev->gfx.kiq[0]; > > + struct amdgpu_ring *kiq_ring = &kiq->ring; > > + unsigned long flags; > > + u32 tmp; > > + int r; > > + > > + amdgpu_ring_reset_helper_begin(ring, timedout_fence); > > + > > + spin_lock_irqsave(&kiq->ring_lock, flags); > > + > > + if (amdgpu_ring_alloc(kiq_ring, 5 + 5)) { > > + spin_unlock_irqrestore(&kiq->ring_lock, flags); > > + return -ENOMEM; > > + } > > + > > + /* send the reset - 5 */ > > + tmp = REG_SET_FIELD(0, CP_VMID_RESET, RESET_REQUEST, 1 << vmid); > > + gfx_v9_0_ring_emit_wreg(kiq_ring, > > + SOC15_REG_OFFSET(GC, 0, > mmCP_VMID_RESET), tmp); > > + /* emit the fence to clear the pipeline sync - 5 */ > > + gfx_v9_0_ring_emit_fence_kiq(kiq_ring, ring->fence_drv.gpu_addr, > > + timedout_fence->base.seqno, > 0); > > As far as I see, this isn't going to work when sched_hw_submission > 2 and > there are more than two jobs (from various different userspace processes) > emitted in the ring. > > I can think of two possible solutons: > - Emit each fence individually, with a short delay in between to give a chance > to the GFX ring to catch up with the KIQ. > - Change the wait_reg_mem command used for the pipeline sync to allow greater > than equal instead of just equal. Then it's enough to signal just the last > fence on the KIQ ring. That won't work. The signalling patch is asynchronous so if we signal additional fences other than the bad one, those jobs will end up being seen as successfully completed. That said, there is a change coming for the firmware to fix this. I'd suggest we just limit the queue depth to 2 until the new firmware is available. > > > > > + amdgpu_ring_commit(kiq_ring); > > + r = amdgpu_ring_test_ring(kiq_ring); > > + spin_unlock_irqrestore(&kiq->ring_lock, flags); > > + if (r) > > + return r; > > + > > + if (amdgpu_ring_alloc(ring, 8 + 7 + 5 + 2 + 8 + 7)) > > + return -ENOMEM; > > + /* emit the fence to finish the reset - 8 */ > > + ring->trail_seq++; > > + gfx_v9_0_ring_emit_fence(ring, ring->trail_fence_gpu_addr, > > + ring->trail_seq, > AMDGPU_FENCE_FLAG_EXEC); > > + /* wait for the fence - 7 */ > > + gfx_v9_0_wait_reg_mem(ring, 0, 1, 0, > > + lower_32_bits(ring- > >trail_fence_gpu_addr), > > + upper_32_bits(ring- > >trail_fence_gpu_addr), > > + ring->trail_seq, 0xffffffff, 4); > > + /* clear mmCP_VMID_RESET - 5 */ > > + gfx_v9_0_ring_emit_wreg_me(ring, > > + SOC15_REG_OFFSET(GC, 0, > mmCP_VMID_RESET), 0); > > + /* event write ENABLE_LEGACY_PIPELINE - 2 */ > > + gfx_v9_0_ring_emit_event_write(ring, ENABLE_LEGACY_PIPELINE, 0); > > + /* emit a regular fence - 8 */ > > + ring->trail_seq++; > > + gfx_v9_0_ring_emit_fence(ring, ring->trail_fence_gpu_addr, > > + ring->trail_seq, > AMDGPU_FENCE_FLAG_EXEC); > > + /* wait for the fence - 7 */ > > + gfx_v9_0_wait_reg_mem(ring, 1, 1, 0, > > + lower_32_bits(ring- > >trail_fence_gpu_addr), > > + upper_32_bits(ring- > >trail_fence_gpu_addr), > > + ring->trail_seq, 0xffffffff, 4); > > Why is it necessary to emit (and wait for) a regular fence here? > I'm not against it, just curious why it's needed. It's part of the recovery sequence to make sure the ENABLE_LEGACY_PIPELINE event has completed successfully. > > > + amdgpu_ring_commit(ring); > > + /* wait for the commands to complete */ > > + r = amdgpu_ring_test_ring(ring); > > + if (r) > > + return r; > > + > > + return amdgpu_ring_reset_helper_end(ring, timedout_fence); > > +} > > + > > static int gfx_v9_0_reset_kcq(struct amdgpu_ring *ring, > > unsigned int vmid, > > struct amdgpu_fence *timedout_fence) > > @@ -7441,9 +7540,9 @@ static const struct amdgpu_ring_funcs > > gfx_v9_0_ring_funcs_gfx = { .emit_wreg = gfx_v9_0_ring_emit_wreg, > > .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait, > > .emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait, > > - .soft_recovery = gfx_v9_0_ring_soft_recovery, > > Can you please split removing the soft recovery into a separate patch? > > Can we talk about removing the soft recovery? For the other chips where it has > already been removed, it is percieved by users as a regression. Queue reset is superset of soft recovery. There's no need for soft recovery when queue reset is available. Alex > > > .emit_mem_sync = gfx_v9_0_emit_mem_sync, > > .emit_cleaner_shader = gfx_v9_0_ring_emit_cleaner_shader, > > + .reset = gfx_v9_0_reset_kgq, > > .begin_use = amdgpu_gfx_enforce_isolation_ring_begin_use, > > .end_use = amdgpu_gfx_enforce_isolation_ring_end_use, > > }; > > @@ -7542,7 +7641,6 @@ static const struct amdgpu_ring_funcs > > gfx_v9_0_ring_funcs_compute = { .emit_wreg = gfx_v9_0_ring_emit_wreg, > > .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait, > > .emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait, > > - .soft_recovery = gfx_v9_0_ring_soft_recovery, > > .emit_mem_sync = gfx_v9_0_emit_mem_sync, > > .emit_wave_limit = gfx_v9_0_emit_wave_limit, > > .reset = gfx_v9_0_reset_kcq, > > > > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 6/6] drm/amdgpu/gfx9: Implement KGQ ring reset 2025-12-18 15:58 ` Alex Deucher @ 2025-12-18 17:03 ` Timur Kristóf 2025-12-18 19:02 ` Alex Deucher 0 siblings, 1 reply; 19+ messages in thread From: Timur Kristóf @ 2025-12-18 17:03 UTC (permalink / raw) To: Alex Deucher; +Cc: amd-gfx, Alex Deucher On 2025. december 18., csütörtök 9:58:45 középső államokbeli zónaidő Alex Deucher wrote: > On Thu, Dec 18, 2025 at 12:21 AM Timur Kristóf <timur.kristof@gmail.com> wrote: > > On 2025. december 15., hétfő 10:07:11 középső államokbeli zónaidő Alex > > Deucher> > > wrote: > > > GFX ring resets work differently on pre-GFX10 hardware since > > > there is no MQD managed by the scheduler. > > > For ring reset, you need issue the reset via CP_VMID_RESET > > > via KIQ or MMIO and submit the following to the gfx ring to > > > complete the reset: > > > 1. EOP packet with EXEC bit set > > > 2. WAIT_REG_MEM to wait for the fence > > > 3. Clear CP_VMID_RESET to 0 > > > 4. EVENT_WRITE ENABLE_LEGACY_PIPELINE > > > 5. EOP packet with EXEC bit set > > > 6. WAIT_REG_MEM to wait for the fence > > > Once those commands have completed the reset should > > > be complete and the ring can accept new packets. > > > > > > However, because we have a pipeline sync between jobs, > > > the PFP is waiting on the fence from the bad job to signal so > > > it can't process any of the packets in the reset sequence > > > until that pipeline sync clears. To unblock the PFP, we > > > use the KIQ to signal the fence after we reset the queue. > > > > > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > > > --- > > > > > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 104 +++++++++++++++++++++++++- > > > 1 file changed, 101 insertions(+), 3 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index > > > bb1465a98c7ca..9b7073650315e > > > 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > @@ -2410,8 +2410,10 @@ static int gfx_v9_0_sw_init(struct > > > amdgpu_ip_block > > > *ip_block) amdgpu_get_soft_full_reset_mask(&adev->gfx.gfx_ring[0]); > > > > > > adev->gfx.compute_supported_reset = > > > > > > amdgpu_get_soft_full_reset_mask(&adev- > > > > > >gfx.compute_ring[0]); > > > > > > - if (!amdgpu_sriov_vf(adev) && !adev->debug_disable_gpu_ring_reset) > > > + if (!amdgpu_sriov_vf(adev) && !adev->debug_disable_gpu_ring_reset) > > > > { > > > > > adev->gfx.compute_supported_reset |= > > > > AMDGPU_RESET_TYPE_PER_QUEUE; > > > > > + adev->gfx.gfx_supported_reset |= > > > > AMDGPU_RESET_TYPE_PER_QUEUE; > > > > > + } > > > > > > r = amdgpu_gfx_kiq_init(adev, GFX9_MEC_HPD_SIZE, 0); > > > if (r) { > > > > > > @@ -7163,6 +7165,103 @@ static void gfx_v9_ring_insert_nop(struct > > > amdgpu_ring *ring, uint32_t num_nop) amdgpu_ring_insert_nop(ring, > > > num_nop - > > > 1); > > > > > > } > > > > > > +static void gfx_v9_0_ring_emit_wreg_me(struct amdgpu_ring *ring, > > > + uint32_t reg, > > > + uint32_t val) > > > +{ > > > + uint32_t cmd = 0; > > > + > > > + switch (ring->funcs->type) { > > > + case AMDGPU_RING_TYPE_KIQ: > > > + cmd = (1 << 16); /* no inc addr */ > > > > What do you mean by "inc addr" in this context? > > It's part of the packet. bit 16 controls whether the address is > incremented or not. This function is basically the same as > gfx_v9_0_ring_emit_wreg(), but uses the ME to do the wait rather than > the PFP. I could have alternatively added a new parameter to > gfx_v9_0_ring_emit_wreg() to select between PFP and ME. Thanks. I was just not familiar with this field of the packet. > > > > + break; > > > + default: > > > + cmd = WR_CONFIRM; > > > + break; > > > + } > > > + amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3)); > > > + amdgpu_ring_write(ring, cmd); > > > + amdgpu_ring_write(ring, reg); > > > + amdgpu_ring_write(ring, 0); > > > + amdgpu_ring_write(ring, val); > > > +} > > > + > > > +static void gfx_v9_0_ring_emit_event_write(struct amdgpu_ring *ring, > > > + uint32_t event_type, > > > + uint32_t > > > > event_index) > > > > > +{ > > > + amdgpu_ring_write(ring, PACKET3(PACKET3_EVENT_WRITE, 0)); > > > + amdgpu_ring_write(ring, EVENT_TYPE(event_type) | > > > + EVENT_INDEX(event_index)); > > > +} > > > + > > > +static int gfx_v9_0_reset_kgq(struct amdgpu_ring *ring, > > > + unsigned int vmid, > > > + struct amdgpu_fence *timedout_fence) > > > +{ > > > + struct amdgpu_device *adev = ring->adev; > > > + struct amdgpu_kiq *kiq = &adev->gfx.kiq[0]; > > > + struct amdgpu_ring *kiq_ring = &kiq->ring; > > > + unsigned long flags; > > > + u32 tmp; > > > + int r; > > > + > > > + amdgpu_ring_reset_helper_begin(ring, timedout_fence); > > > + > > > + spin_lock_irqsave(&kiq->ring_lock, flags); > > > + > > > + if (amdgpu_ring_alloc(kiq_ring, 5 + 5)) { > > > + spin_unlock_irqrestore(&kiq->ring_lock, flags); > > > + return -ENOMEM; > > > + } > > > + > > > + /* send the reset - 5 */ > > > + tmp = REG_SET_FIELD(0, CP_VMID_RESET, RESET_REQUEST, 1 << vmid); > > > + gfx_v9_0_ring_emit_wreg(kiq_ring, > > > + SOC15_REG_OFFSET(GC, 0, > > > > mmCP_VMID_RESET), tmp); > > > > > + /* emit the fence to clear the pipeline sync - 5 */ > > > + gfx_v9_0_ring_emit_fence_kiq(kiq_ring, ring->fence_drv.gpu_addr, > > > + timedout_fence->base.seqno, > > > > 0); > > > > As far as I see, this isn't going to work when sched_hw_submission > 2 and > > there are more than two jobs (from various different userspace processes) > > emitted in the ring. > > > > I can think of two possible solutons: > > - Emit each fence individually, with a short delay in between to give a > > chance to the GFX ring to catch up with the KIQ. > > - Change the wait_reg_mem command used for the pipeline sync to allow > > greater than equal instead of just equal. Then it's enough to signal just > > the last fence on the KIQ ring. > > That won't work. The signalling patch is asynchronous so if we signal > additional fences other than the bad one, those jobs will end up being > seen as successfully completed. Yeah, I already realized that after I sent the email, and sent a follow-up email with a different suggestion, that is to change the pipeline sync to not rely on the fence of the previous submission. eg. the pipeline sync could emit a separate fence, or even simpler, an ACQUIRE_MEM (gfx9+) / SURFACE_SYNC (gfx6-8). Then this becomes a non-issue. What do you think about that? > That said, there is a change coming > for the firmware to fix this. That's very nice to hear that AMD is still willing to fix firmware for old chips like Vega. However, - Would be nice to solve the problem for users who are still running old firmware, as it doesn't seem too difficult to do so. - AFAIK gfx7-8 also use the same queue reset mechanism so I think they may be susceptible to the same issue (and I don't think anyone's gonna release new FW for those). > I'd suggest we just limit the queue > depth to 2 until the new firmware is available. Last I heard of it, the SteamOS kernel set it to 4 due to "CPU bubbles" that they observed with the default setting of 2 on the Steam Deck. I think on the long term we should seriously consider increasing the default in upstream as well. > > > > + amdgpu_ring_commit(kiq_ring); > > > + r = amdgpu_ring_test_ring(kiq_ring); > > > + spin_unlock_irqrestore(&kiq->ring_lock, flags); > > > + if (r) > > > + return r; > > > + > > > + if (amdgpu_ring_alloc(ring, 8 + 7 + 5 + 2 + 8 + 7)) > > > + return -ENOMEM; > > > + /* emit the fence to finish the reset - 8 */ > > > + ring->trail_seq++; > > > + gfx_v9_0_ring_emit_fence(ring, ring->trail_fence_gpu_addr, > > > + ring->trail_seq, > > > > AMDGPU_FENCE_FLAG_EXEC); > > > > > + /* wait for the fence - 7 */ > > > + gfx_v9_0_wait_reg_mem(ring, 0, 1, 0, > > > + lower_32_bits(ring- > > > > > >trail_fence_gpu_addr), > > > > > > + upper_32_bits(ring- > > > > > >trail_fence_gpu_addr), > > > > > > + ring->trail_seq, 0xffffffff, 4); > > > + /* clear mmCP_VMID_RESET - 5 */ > > > + gfx_v9_0_ring_emit_wreg_me(ring, > > > + SOC15_REG_OFFSET(GC, 0, > > > > mmCP_VMID_RESET), 0); > > > > > + /* event write ENABLE_LEGACY_PIPELINE - 2 */ > > > + gfx_v9_0_ring_emit_event_write(ring, ENABLE_LEGACY_PIPELINE, 0); > > > + /* emit a regular fence - 8 */ > > > + ring->trail_seq++; > > > + gfx_v9_0_ring_emit_fence(ring, ring->trail_fence_gpu_addr, > > > + ring->trail_seq, > > > > AMDGPU_FENCE_FLAG_EXEC); > > > > > + /* wait for the fence - 7 */ > > > + gfx_v9_0_wait_reg_mem(ring, 1, 1, 0, > > > + lower_32_bits(ring- > > > > > >trail_fence_gpu_addr), > > > > > > + upper_32_bits(ring- > > > > > >trail_fence_gpu_addr), > > > > > > + ring->trail_seq, 0xffffffff, 4); > > > > Why is it necessary to emit (and wait for) a regular fence here? > > I'm not against it, just curious why it's needed. > > It's part of the recovery sequence to make sure the > ENABLE_LEGACY_PIPELINE event has completed successfully. Ah, okay. I didn't realize that we needed fences after EVENT_WRITE. > > > > + amdgpu_ring_commit(ring); > > > + /* wait for the commands to complete */ > > > + r = amdgpu_ring_test_ring(ring); > > > + if (r) > > > + return r; > > > + > > > + return amdgpu_ring_reset_helper_end(ring, timedout_fence); > > > +} > > > + > > > > > > static int gfx_v9_0_reset_kcq(struct amdgpu_ring *ring, > > > > > > unsigned int vmid, > > > struct amdgpu_fence *timedout_fence) > > > > > > @@ -7441,9 +7540,9 @@ static const struct amdgpu_ring_funcs > > > gfx_v9_0_ring_funcs_gfx = { .emit_wreg = gfx_v9_0_ring_emit_wreg, > > > > > > .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait, > > > .emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait, > > > > > > - .soft_recovery = gfx_v9_0_ring_soft_recovery, > > > > Can you please split removing the soft recovery into a separate patch? > > > > Can we talk about removing the soft recovery? For the other chips where it > > has already been removed, it is percieved by users as a regression. > > Queue reset is superset of soft recovery. There's no need for soft > recovery when queue reset is available. > I'm not arguing for or against either approach, just relaying how the change is percieved by users who tell their experience. From a user's perspective: - queue reset just kills the entire job and the offending process, ie. the user sees the game crash - soft recovery sometimes allows the guilty process to move on intact, ie. the user would see a "hitch" in the game but then it would keep working > > > > .emit_mem_sync = gfx_v9_0_emit_mem_sync, > > > .emit_cleaner_shader = gfx_v9_0_ring_emit_cleaner_shader, > > > > > > + .reset = gfx_v9_0_reset_kgq, > > > > > > .begin_use = amdgpu_gfx_enforce_isolation_ring_begin_use, > > > .end_use = amdgpu_gfx_enforce_isolation_ring_end_use, > > > > > > }; > > > > > > @@ -7542,7 +7641,6 @@ static const struct amdgpu_ring_funcs > > > gfx_v9_0_ring_funcs_compute = { .emit_wreg = gfx_v9_0_ring_emit_wreg, > > > > > > .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait, > > > .emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait, > > > > > > - .soft_recovery = gfx_v9_0_ring_soft_recovery, > > > > > > .emit_mem_sync = gfx_v9_0_emit_mem_sync, > > > .emit_wave_limit = gfx_v9_0_emit_wave_limit, > > > .reset = gfx_v9_0_reset_kcq, ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 6/6] drm/amdgpu/gfx9: Implement KGQ ring reset 2025-12-18 17:03 ` Timur Kristóf @ 2025-12-18 19:02 ` Alex Deucher 0 siblings, 0 replies; 19+ messages in thread From: Alex Deucher @ 2025-12-18 19:02 UTC (permalink / raw) To: Timur Kristóf; +Cc: amd-gfx, Alex Deucher On Thu, Dec 18, 2025 at 12:03 PM Timur Kristóf <timur.kristof@gmail.com> wrote: > > On 2025. december 18., csütörtök 9:58:45 középső államokbeli zónaidő Alex > Deucher wrote: > > On Thu, Dec 18, 2025 at 12:21 AM Timur Kristóf <timur.kristof@gmail.com> > wrote: > > > On 2025. december 15., hétfő 10:07:11 középső államokbeli zónaidő Alex > > > Deucher> > > > wrote: > > > > GFX ring resets work differently on pre-GFX10 hardware since > > > > there is no MQD managed by the scheduler. > > > > For ring reset, you need issue the reset via CP_VMID_RESET > > > > via KIQ or MMIO and submit the following to the gfx ring to > > > > complete the reset: > > > > 1. EOP packet with EXEC bit set > > > > 2. WAIT_REG_MEM to wait for the fence > > > > 3. Clear CP_VMID_RESET to 0 > > > > 4. EVENT_WRITE ENABLE_LEGACY_PIPELINE > > > > 5. EOP packet with EXEC bit set > > > > 6. WAIT_REG_MEM to wait for the fence > > > > Once those commands have completed the reset should > > > > be complete and the ring can accept new packets. > > > > > > > > However, because we have a pipeline sync between jobs, > > > > the PFP is waiting on the fence from the bad job to signal so > > > > it can't process any of the packets in the reset sequence > > > > until that pipeline sync clears. To unblock the PFP, we > > > > use the KIQ to signal the fence after we reset the queue. > > > > > > > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > > > > --- > > > > > > > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 104 +++++++++++++++++++++++++- > > > > 1 file changed, 101 insertions(+), 3 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index > > > > bb1465a98c7ca..9b7073650315e > > > > 100644 > > > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > > @@ -2410,8 +2410,10 @@ static int gfx_v9_0_sw_init(struct > > > > amdgpu_ip_block > > > > *ip_block) amdgpu_get_soft_full_reset_mask(&adev->gfx.gfx_ring[0]); > > > > > > > > adev->gfx.compute_supported_reset = > > > > > > > > amdgpu_get_soft_full_reset_mask(&adev- > > > > > > > >gfx.compute_ring[0]); > > > > > > > > - if (!amdgpu_sriov_vf(adev) && !adev->debug_disable_gpu_ring_reset) > > > > + if (!amdgpu_sriov_vf(adev) && !adev->debug_disable_gpu_ring_reset) > > > > > > { > > > > > > > adev->gfx.compute_supported_reset |= > > > > > > AMDGPU_RESET_TYPE_PER_QUEUE; > > > > > > > + adev->gfx.gfx_supported_reset |= > > > > > > AMDGPU_RESET_TYPE_PER_QUEUE; > > > > > > > + } > > > > > > > > r = amdgpu_gfx_kiq_init(adev, GFX9_MEC_HPD_SIZE, 0); > > > > if (r) { > > > > > > > > @@ -7163,6 +7165,103 @@ static void gfx_v9_ring_insert_nop(struct > > > > amdgpu_ring *ring, uint32_t num_nop) amdgpu_ring_insert_nop(ring, > > > > num_nop - > > > > 1); > > > > > > > > } > > > > > > > > +static void gfx_v9_0_ring_emit_wreg_me(struct amdgpu_ring *ring, > > > > + uint32_t reg, > > > > + uint32_t val) > > > > +{ > > > > + uint32_t cmd = 0; > > > > + > > > > + switch (ring->funcs->type) { > > > > + case AMDGPU_RING_TYPE_KIQ: > > > > + cmd = (1 << 16); /* no inc addr */ > > > > > > What do you mean by "inc addr" in this context? > > > > It's part of the packet. bit 16 controls whether the address is > > incremented or not. This function is basically the same as > > gfx_v9_0_ring_emit_wreg(), but uses the ME to do the wait rather than > > the PFP. I could have alternatively added a new parameter to > > gfx_v9_0_ring_emit_wreg() to select between PFP and ME. > > Thanks. I was just not familiar with this field of the packet. > > > > > > > + break; > > > > + default: > > > > + cmd = WR_CONFIRM; > > > > + break; > > > > + } > > > > + amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3)); > > > > + amdgpu_ring_write(ring, cmd); > > > > + amdgpu_ring_write(ring, reg); > > > > + amdgpu_ring_write(ring, 0); > > > > + amdgpu_ring_write(ring, val); > > > > +} > > > > + > > > > +static void gfx_v9_0_ring_emit_event_write(struct amdgpu_ring *ring, > > > > + uint32_t event_type, > > > > + uint32_t > > > > > > event_index) > > > > > > > +{ > > > > + amdgpu_ring_write(ring, PACKET3(PACKET3_EVENT_WRITE, 0)); > > > > + amdgpu_ring_write(ring, EVENT_TYPE(event_type) | > > > > + EVENT_INDEX(event_index)); > > > > +} > > > > + > > > > +static int gfx_v9_0_reset_kgq(struct amdgpu_ring *ring, > > > > + unsigned int vmid, > > > > + struct amdgpu_fence *timedout_fence) > > > > +{ > > > > + struct amdgpu_device *adev = ring->adev; > > > > + struct amdgpu_kiq *kiq = &adev->gfx.kiq[0]; > > > > + struct amdgpu_ring *kiq_ring = &kiq->ring; > > > > + unsigned long flags; > > > > + u32 tmp; > > > > + int r; > > > > + > > > > + amdgpu_ring_reset_helper_begin(ring, timedout_fence); > > > > + > > > > + spin_lock_irqsave(&kiq->ring_lock, flags); > > > > + > > > > + if (amdgpu_ring_alloc(kiq_ring, 5 + 5)) { > > > > + spin_unlock_irqrestore(&kiq->ring_lock, flags); > > > > + return -ENOMEM; > > > > + } > > > > + > > > > + /* send the reset - 5 */ > > > > + tmp = REG_SET_FIELD(0, CP_VMID_RESET, RESET_REQUEST, 1 << vmid); > > > > + gfx_v9_0_ring_emit_wreg(kiq_ring, > > > > + SOC15_REG_OFFSET(GC, 0, > > > > > > mmCP_VMID_RESET), tmp); > > > > > > > + /* emit the fence to clear the pipeline sync - 5 */ > > > > + gfx_v9_0_ring_emit_fence_kiq(kiq_ring, ring->fence_drv.gpu_addr, > > > > + timedout_fence->base.seqno, > > > > > > 0); > > > > > > As far as I see, this isn't going to work when sched_hw_submission > 2 and > > > there are more than two jobs (from various different userspace processes) > > > emitted in the ring. > > > > > > I can think of two possible solutons: > > > - Emit each fence individually, with a short delay in between to give a > > > chance to the GFX ring to catch up with the KIQ. > > > - Change the wait_reg_mem command used for the pipeline sync to allow > > > greater than equal instead of just equal. Then it's enough to signal just > > > the last fence on the KIQ ring. > > > > That won't work. The signalling patch is asynchronous so if we signal > > additional fences other than the bad one, those jobs will end up being > > seen as successfully completed. > > Yeah, I already realized that after I sent the email, and sent a follow-up > email with a different suggestion, that is to change the pipeline sync to not > rely on the fence of the previous submission. eg. the pipeline sync could emit > a separate fence, or even simpler, an ACQUIRE_MEM (gfx9+) / SURFACE_SYNC > (gfx6-8). Then this becomes a non-issue. > > What do you think about that? We could try it. We need to verify that the ACQUIRE_MEM semantics match what we need. I think the following would also work and not require a WAIT_REG_MEM: EVENT_WRITE(VS done) EVENT_WRITE(PS done) EVENT_WRITE(CS done) PFP_SYNC_ME I'll give it a try. > > > That said, there is a change coming > > for the firmware to fix this. > > That's very nice to hear that AMD is still willing to fix firmware for old chips > like Vega. However, > > - Would be nice to solve the problem for users who are still running old > firmware, as it doesn't seem too difficult to do so. > - AFAIK gfx7-8 also use the same queue reset mechanism so I think they may be > susceptible to the same issue (and I don't think anyone's gonna release new FW > for those). > > > I'd suggest we just limit the queue > > depth to 2 until the new firmware is available. > > Last I heard of it, the SteamOS kernel set it to 4 due to "CPU bubbles" that > they observed with the default setting of 2 on the Steam Deck. I think on the > long term we should seriously consider increasing the default in upstream as > well. Ok. > > > > > > > + amdgpu_ring_commit(kiq_ring); > > > > + r = amdgpu_ring_test_ring(kiq_ring); > > > > + spin_unlock_irqrestore(&kiq->ring_lock, flags); > > > > + if (r) > > > > + return r; > > > > + > > > > + if (amdgpu_ring_alloc(ring, 8 + 7 + 5 + 2 + 8 + 7)) > > > > + return -ENOMEM; > > > > + /* emit the fence to finish the reset - 8 */ > > > > + ring->trail_seq++; > > > > + gfx_v9_0_ring_emit_fence(ring, ring->trail_fence_gpu_addr, > > > > + ring->trail_seq, > > > > > > AMDGPU_FENCE_FLAG_EXEC); > > > > > > > + /* wait for the fence - 7 */ > > > > + gfx_v9_0_wait_reg_mem(ring, 0, 1, 0, > > > > + lower_32_bits(ring- > > > > > > > >trail_fence_gpu_addr), > > > > > > > > + upper_32_bits(ring- > > > > > > > >trail_fence_gpu_addr), > > > > > > > > + ring->trail_seq, 0xffffffff, 4); > > > > + /* clear mmCP_VMID_RESET - 5 */ > > > > + gfx_v9_0_ring_emit_wreg_me(ring, > > > > + SOC15_REG_OFFSET(GC, 0, > > > > > > mmCP_VMID_RESET), 0); > > > > > > > + /* event write ENABLE_LEGACY_PIPELINE - 2 */ > > > > + gfx_v9_0_ring_emit_event_write(ring, ENABLE_LEGACY_PIPELINE, 0); > > > > + /* emit a regular fence - 8 */ > > > > + ring->trail_seq++; > > > > + gfx_v9_0_ring_emit_fence(ring, ring->trail_fence_gpu_addr, > > > > + ring->trail_seq, > > > > > > AMDGPU_FENCE_FLAG_EXEC); > > > > > > > + /* wait for the fence - 7 */ > > > > + gfx_v9_0_wait_reg_mem(ring, 1, 1, 0, > > > > + lower_32_bits(ring- > > > > > > > >trail_fence_gpu_addr), > > > > > > > > + upper_32_bits(ring- > > > > > > > >trail_fence_gpu_addr), > > > > > > > > + ring->trail_seq, 0xffffffff, 4); > > > > > > Why is it necessary to emit (and wait for) a regular fence here? > > > I'm not against it, just curious why it's needed. > > > > It's part of the recovery sequence to make sure the > > ENABLE_LEGACY_PIPELINE event has completed successfully. > > Ah, okay. I didn't realize that we needed fences after EVENT_WRITE. > > > > > > > + amdgpu_ring_commit(ring); > > > > + /* wait for the commands to complete */ > > > > + r = amdgpu_ring_test_ring(ring); > > > > + if (r) > > > > + return r; > > > > + > > > > + return amdgpu_ring_reset_helper_end(ring, timedout_fence); > > > > +} > > > > + > > > > > > > > static int gfx_v9_0_reset_kcq(struct amdgpu_ring *ring, > > > > > > > > unsigned int vmid, > > > > struct amdgpu_fence *timedout_fence) > > > > > > > > @@ -7441,9 +7540,9 @@ static const struct amdgpu_ring_funcs > > > > gfx_v9_0_ring_funcs_gfx = { .emit_wreg = gfx_v9_0_ring_emit_wreg, > > > > > > > > .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait, > > > > .emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait, > > > > > > > > - .soft_recovery = gfx_v9_0_ring_soft_recovery, > > > > > > Can you please split removing the soft recovery into a separate patch? > > > > > > Can we talk about removing the soft recovery? For the other chips where it > > > has already been removed, it is percieved by users as a regression. > > > > Queue reset is superset of soft recovery. There's no need for soft > > recovery when queue reset is available. > > > > I'm not arguing for or against either approach, just relaying how the change > is percieved by users who tell their experience. > > From a user's perspective: > - queue reset just kills the entire job and the offending process, ie. the user > sees the game crash > - soft recovery sometimes allows the guilty process to move on intact, ie. the > user would see a "hitch" in the game but then it would keep working > Ah, I see what you mean. We do set the error on the fence in that case, but I guess mesa treats it differently (-ENODATA vs. -ETIME). We can think about bringing it back assuming it doesn't negatively impact the per queue resets. Alex > > > > > > > .emit_mem_sync = gfx_v9_0_emit_mem_sync, > > > > .emit_cleaner_shader = gfx_v9_0_ring_emit_cleaner_shader, > > > > > > > > + .reset = gfx_v9_0_reset_kgq, > > > > > > > > .begin_use = amdgpu_gfx_enforce_isolation_ring_begin_use, > > > > .end_use = amdgpu_gfx_enforce_isolation_ring_end_use, > > > > > > > > }; > > > > > > > > @@ -7542,7 +7641,6 @@ static const struct amdgpu_ring_funcs > > > > gfx_v9_0_ring_funcs_compute = { .emit_wreg = gfx_v9_0_ring_emit_wreg, > > > > > > > > .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait, > > > > .emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait, > > > > > > > > - .soft_recovery = gfx_v9_0_ring_soft_recovery, > > > > > > > > .emit_mem_sync = gfx_v9_0_emit_mem_sync, > > > > .emit_wave_limit = gfx_v9_0_emit_wave_limit, > > > > .reset = gfx_v9_0_reset_kcq, > > > > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/6] drm/amdgpu: don't reemit ring contents more than once 2025-12-15 16:07 [PATCH 1/6] drm/amdgpu: don't reemit ring contents more than once Alex Deucher ` (4 preceding siblings ...) 2025-12-15 16:07 ` [PATCH 6/6] drm/amdgpu/gfx9: Implement KGQ " Alex Deucher @ 2025-12-18 5:20 ` Timur Kristóf 2025-12-18 15:36 ` Alex Deucher 5 siblings, 1 reply; 19+ messages in thread From: Timur Kristóf @ 2025-12-18 5:20 UTC (permalink / raw) To: amd-gfx; +Cc: Alex Deucher, Alex Deucher On 2025. december 15., hétfő 10:07:06 középső államokbeli zónaidő Alex Deucher wrote: > If we cancel a bad job and reemit the ring contents, and > we get another timeout, cancel everything rather than reemitting. > The wptr markers are only relevant for the original emit. If > we reemit, the wptr markers are no longer correct. I see the point of not reemitting, considering the wptrs are no longer correct. However, would it be possible to correct the wptrs instead? As it is, this sounds like it would make the reset less reliable when there is more than one job emitted that can cause hangs. > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 22 +++++++++++++++++----- > drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 ++ > 2 files changed, 19 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index > 1fe31d2f27060..334ddd6e48c06 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > @@ -710,6 +710,7 @@ void amdgpu_fence_driver_guilty_force_completion(struct > amdgpu_fence *af) struct amdgpu_ring *ring = af->ring; > unsigned long flags; > u32 seq, last_seq; > + bool reemitted = false; > > last_seq = amdgpu_fence_read(ring) & ring- >fence_drv.num_fences_mask; > seq = ring->fence_drv.sync_seq & ring->fence_drv.num_fences_mask; > @@ -727,7 +728,9 @@ void amdgpu_fence_driver_guilty_force_completion(struct > amdgpu_fence *af) if (unprocessed && > !dma_fence_is_signaled_locked(unprocessed)) { fence = > container_of(unprocessed, struct amdgpu_fence, base); > > - if (fence == af) > + if (fence->reemitted > 1) > + reemitted = true; > + else if (fence == af) > dma_fence_set_error(&fence->base, -ETIME); > else if (fence->context == af->context) > dma_fence_set_error(&fence->base, -ECANCELED); > @@ -735,9 +738,16 @@ void amdgpu_fence_driver_guilty_force_completion(struct > amdgpu_fence *af) rcu_read_unlock(); > } while (last_seq != seq); > spin_unlock_irqrestore(&ring->fence_drv.lock, flags); > - /* signal the guilty fence */ > - amdgpu_fence_write(ring, (u32)af->base.seqno); > - amdgpu_fence_process(ring); > + > + if (reemitted) { > + /* if we've already reemitted once then just cancel everything */ > + amdgpu_fence_driver_force_completion(af->ring); > + af->ring->ring_backup_entries_to_copy = 0; > + } else { > + /* signal the guilty fence */ > + amdgpu_fence_write(ring, (u32)af->base.seqno); > + amdgpu_fence_process(ring); > + } > } > > void amdgpu_fence_save_wptr(struct amdgpu_fence *af) > @@ -785,10 +795,12 @@ void amdgpu_ring_backup_unprocessed_commands(struct > amdgpu_ring *ring, /* save everything if the ring is not guilty, otherwise > * just save the content from other contexts. > */ > - if (!guilty_fence || (fence->context != guilty_fence->context)) > + if (!fence->reemitted && > + (!guilty_fence || (fence->context != guilty_fence->context))) > amdgpu_ring_backup_unprocessed_command(ring, wptr, > fence->wptr); > wptr = fence->wptr; > + fence->reemitted++; > } > rcu_read_unlock(); > } while (last_seq != seq); > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h index > a1fb0fadb6eab..d881829528976 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > @@ -150,6 +150,8 @@ struct amdgpu_fence { > u64 wptr; > /* fence context for resets */ > u64 context; > + /* has this fence been reemitted */ > + unsigned int reemitted; > }; > > extern const struct drm_sched_backend_ops amdgpu_sched_ops; ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/6] drm/amdgpu: don't reemit ring contents more than once 2025-12-18 5:20 ` [PATCH 1/6] drm/amdgpu: don't reemit ring contents more than once Timur Kristóf @ 2025-12-18 15:36 ` Alex Deucher 2025-12-18 15:47 ` Timur Kristóf 0 siblings, 1 reply; 19+ messages in thread From: Alex Deucher @ 2025-12-18 15:36 UTC (permalink / raw) To: Timur Kristóf; +Cc: amd-gfx, Alex Deucher On Thu, Dec 18, 2025 at 12:36 AM Timur Kristóf <timur.kristof@gmail.com> wrote: > > On 2025. december 15., hétfő 10:07:06 középső államokbeli zónaidő Alex Deucher > wrote: > > If we cancel a bad job and reemit the ring contents, and > > we get another timeout, cancel everything rather than reemitting. > > The wptr markers are only relevant for the original emit. If > > we reemit, the wptr markers are no longer correct. > > I see the point of not reemitting, considering the wptrs are no longer > correct. However, would it be possible to correct the wptrs instead? > Yes, that could be done, but it's more complicated. This is just a short term fix until that is implemented. Alex > As it is, this sounds like it would make the reset less reliable when there is > more than one job emitted that can cause hangs. > > > > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 22 +++++++++++++++++----- > > drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 ++ > > 2 files changed, 19 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index > > 1fe31d2f27060..334ddd6e48c06 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > > @@ -710,6 +710,7 @@ void amdgpu_fence_driver_guilty_force_completion(struct > > amdgpu_fence *af) struct amdgpu_ring *ring = af->ring; > > unsigned long flags; > > u32 seq, last_seq; > > + bool reemitted = false; > > > > last_seq = amdgpu_fence_read(ring) & ring- > >fence_drv.num_fences_mask; > > seq = ring->fence_drv.sync_seq & ring->fence_drv.num_fences_mask; > > @@ -727,7 +728,9 @@ void amdgpu_fence_driver_guilty_force_completion(struct > > amdgpu_fence *af) if (unprocessed && > > !dma_fence_is_signaled_locked(unprocessed)) { fence = > > container_of(unprocessed, struct amdgpu_fence, base); > > > > - if (fence == af) > > + if (fence->reemitted > 1) > > + reemitted = true; > > + else if (fence == af) > > dma_fence_set_error(&fence->base, > -ETIME); > > else if (fence->context == af->context) > > dma_fence_set_error(&fence->base, > -ECANCELED); > > @@ -735,9 +738,16 @@ void amdgpu_fence_driver_guilty_force_completion(struct > > amdgpu_fence *af) rcu_read_unlock(); > > } while (last_seq != seq); > > spin_unlock_irqrestore(&ring->fence_drv.lock, flags); > > - /* signal the guilty fence */ > > - amdgpu_fence_write(ring, (u32)af->base.seqno); > > - amdgpu_fence_process(ring); > > + > > + if (reemitted) { > > + /* if we've already reemitted once then just cancel > everything */ > > + amdgpu_fence_driver_force_completion(af->ring); > > + af->ring->ring_backup_entries_to_copy = 0; > > + } else { > > + /* signal the guilty fence */ > > + amdgpu_fence_write(ring, (u32)af->base.seqno); > > + amdgpu_fence_process(ring); > > + } > > } > > > > void amdgpu_fence_save_wptr(struct amdgpu_fence *af) > > @@ -785,10 +795,12 @@ void amdgpu_ring_backup_unprocessed_commands(struct > > amdgpu_ring *ring, /* save everything if the ring is not guilty, otherwise > > * just save the content from other > contexts. > > */ > > - if (!guilty_fence || (fence->context != > guilty_fence->context)) > > + if (!fence->reemitted && > > + (!guilty_fence || (fence->context != > guilty_fence->context))) > > > amdgpu_ring_backup_unprocessed_command(ring, wptr, > > > fence->wptr); > > wptr = fence->wptr; > > + fence->reemitted++; > > } > > rcu_read_unlock(); > > } while (last_seq != seq); > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h index > > a1fb0fadb6eab..d881829528976 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > > @@ -150,6 +150,8 @@ struct amdgpu_fence { > > u64 wptr; > > /* fence context for resets */ > > u64 context; > > + /* has this fence been reemitted */ > > + unsigned int reemitted; > > }; > > > > extern const struct drm_sched_backend_ops amdgpu_sched_ops; > > > > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/6] drm/amdgpu: don't reemit ring contents more than once 2025-12-18 15:36 ` Alex Deucher @ 2025-12-18 15:47 ` Timur Kristóf 0 siblings, 0 replies; 19+ messages in thread From: Timur Kristóf @ 2025-12-18 15:47 UTC (permalink / raw) To: Alex Deucher; +Cc: amd-gfx, Alex Deucher On 2025. december 18., csütörtök 9:36:57 középső államokbeli zónaidő Alex Deucher wrote: > On Thu, Dec 18, 2025 at 12:36 AM Timur Kristóf <timur.kristof@gmail.com> wrote: > > On 2025. december 15., hétfő 10:07:06 középső államokbeli zónaidő Alex > > Deucher> > > wrote: > > > If we cancel a bad job and reemit the ring contents, and > > > we get another timeout, cancel everything rather than reemitting. > > > The wptr markers are only relevant for the original emit. If > > > we reemit, the wptr markers are no longer correct. > > > > I see the point of not reemitting, considering the wptrs are no longer > > correct. However, would it be possible to correct the wptrs instead? > > Yes, that could be done, but it's more complicated. This is just a > short term fix until that is implemented. > > Alex I see. The patch is: Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> > > > As it is, this sounds like it would make the reset less reliable when > > there is more than one job emitted that can cause hangs. > > > > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > > > --- > > > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 22 +++++++++++++++++----- > > > drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 ++ > > > 2 files changed, 19 insertions(+), 5 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index > > > 1fe31d2f27060..334ddd6e48c06 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > > > @@ -710,6 +710,7 @@ void > > > amdgpu_fence_driver_guilty_force_completion(struct > > > amdgpu_fence *af) struct amdgpu_ring *ring = af->ring; > > > > > > unsigned long flags; > > > u32 seq, last_seq; > > > > > > + bool reemitted = false; > > > > > > last_seq = amdgpu_fence_read(ring) & ring- > > > > > >fence_drv.num_fences_mask; > > > > > > seq = ring->fence_drv.sync_seq & ring->fence_drv.num_fences_mask; > > > > > > @@ -727,7 +728,9 @@ void > > > amdgpu_fence_driver_guilty_force_completion(struct > > > amdgpu_fence *af) if (unprocessed && > > > !dma_fence_is_signaled_locked(unprocessed)) { fence = > > > container_of(unprocessed, struct amdgpu_fence, base); > > > > > > - if (fence == af) > > > + if (fence->reemitted > 1) > > > + reemitted = true; > > > + else if (fence == af) > > > > > > dma_fence_set_error(&fence->base, > > > > -ETIME); > > > > > else if (fence->context == af->context) > > > > > > dma_fence_set_error(&fence->base, > > > > -ECANCELED); > > > > > @@ -735,9 +738,16 @@ void > > > amdgpu_fence_driver_guilty_force_completion(struct amdgpu_fence *af) > > > rcu_read_unlock(); > > > > > > } while (last_seq != seq); > > > spin_unlock_irqrestore(&ring->fence_drv.lock, flags); > > > > > > - /* signal the guilty fence */ > > > - amdgpu_fence_write(ring, (u32)af->base.seqno); > > > - amdgpu_fence_process(ring); > > > + > > > + if (reemitted) { > > > + /* if we've already reemitted once then just cancel > > > > everything */ > > > > > + amdgpu_fence_driver_force_completion(af->ring); > > > + af->ring->ring_backup_entries_to_copy = 0; > > > + } else { > > > + /* signal the guilty fence */ > > > + amdgpu_fence_write(ring, (u32)af->base.seqno); > > > + amdgpu_fence_process(ring); > > > + } > > > > > > } > > > > > > void amdgpu_fence_save_wptr(struct amdgpu_fence *af) > > > > > > @@ -785,10 +795,12 @@ void > > > amdgpu_ring_backup_unprocessed_commands(struct > > > amdgpu_ring *ring, /* save everything if the ring is not guilty, > > > otherwise > > > > > > * just save the content from other > > > > contexts. > > > > > */ > > > > > > - if (!guilty_fence || (fence->context != > > > > guilty_fence->context)) > > > > > + if (!fence->reemitted && > > > + (!guilty_fence || (fence->context != > > > > guilty_fence->context))) > > > > amdgpu_ring_backup_unprocessed_command(ring, wptr, > > > > fence->wptr); > > > > > wptr = fence->wptr; > > > > > > + fence->reemitted++; > > > > > > } > > > rcu_read_unlock(); > > > > > > } while (last_seq != seq); > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h index > > > a1fb0fadb6eab..d881829528976 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > > > @@ -150,6 +150,8 @@ struct amdgpu_fence { > > > > > > u64 wptr; > > > /* fence context for resets */ > > > u64 context; > > > > > > + /* has this fence been reemitted */ > > > + unsigned int reemitted; > > > > > > }; > > > > > > extern const struct drm_sched_backend_ops amdgpu_sched_ops; ^ permalink raw reply [flat|nested] 19+ messages in thread
[parent not found: <BL1PR12MB5849EF718ACDD3FB753F8D1BE7A8A@BL1PR12MB5849.namprd12.prod.outlook.com>]
* Re: [PATCH 6/6] drm/amdgpu/gfx9: Implement KGQ ring reset [not found] <BL1PR12MB5849EF718ACDD3FB753F8D1BE7A8A@BL1PR12MB5849.namprd12.prod.outlook.com> @ 2025-12-18 2:56 ` Chen, Jiqian 0 siblings, 0 replies; 19+ messages in thread From: Chen, Jiqian @ 2025-12-18 2:56 UTC (permalink / raw) To: Deucher, Alexander; +Cc: amd-gfx@lists.freedesktop.org > GFX ring resets work differently on pre-GFX10 hardware since > > there is no MQD managed by the scheduler. > > For ring reset, you need issue the reset via CP_VMID_RESET > > via KIQ or MMIO and submit the following to the gfx ring to > > complete the reset: > > 1. EOP packet with EXEC bit set > > 2. WAIT_REG_MEM to wait for the fence > > 3. Clear CP_VMID_RESET to 0 > > 4. EVENT_WRITE ENABLE_LEGACY_PIPELINE > > 5. EOP packet with EXEC bit set > > 6. WAIT_REG_MEM to wait for the fence > > Once those commands have completed the reset should > > be complete and the ring can accept new packets. > > > > However, because we have a pipeline sync between jobs, > > the PFP is waiting on the fence from the bad job to signal so > > it can't process any of the packets in the reset sequence > > until that pipeline sync clears. To unblock the PFP, we > > use the KIQ to signal the fence after we reset the queue. > > > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > > --- > > Tested-by: Jiqian Chen <Jiqian.Chen@amd.com> > > -- Best regards, Jiqian Chen. ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2025-12-18 19:02 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-15 16:07 [PATCH 1/6] drm/amdgpu: don't reemit ring contents more than once Alex Deucher
2025-12-15 16:07 ` [PATCH 2/6] drm/amdgpu: always backup and reemit fences Alex Deucher
2025-12-18 5:17 ` Timur Kristóf
2025-12-15 16:07 ` [PATCH 3/6] drm/amdgpu: use dma_fence_get_status() for adapter reset Alex Deucher
2025-12-15 16:07 ` [PATCH 4/6] drm/amdgpu: avoid a warning in timedout job handler Alex Deucher
2025-12-18 5:15 ` Timur Kristóf
2025-12-18 15:41 ` Alex Deucher
2025-12-18 15:49 ` Timur Kristóf
2025-12-15 16:07 ` [PATCH 5/6] drm/amdgpu: mark fences with errors before ring reset Alex Deucher
2025-12-15 16:07 ` [PATCH 6/6] drm/amdgpu/gfx9: Implement KGQ " Alex Deucher
2025-12-18 5:11 ` Timur Kristóf
2025-12-18 5:28 ` Timur Kristóf
2025-12-18 15:58 ` Alex Deucher
2025-12-18 17:03 ` Timur Kristóf
2025-12-18 19:02 ` Alex Deucher
2025-12-18 5:20 ` [PATCH 1/6] drm/amdgpu: don't reemit ring contents more than once Timur Kristóf
2025-12-18 15:36 ` Alex Deucher
2025-12-18 15:47 ` Timur Kristóf
[not found] <BL1PR12MB5849EF718ACDD3FB753F8D1BE7A8A@BL1PR12MB5849.namprd12.prod.outlook.com>
2025-12-18 2:56 ` [PATCH 6/6] drm/amdgpu/gfx9: Implement KGQ ring reset Chen, Jiqian
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.